File-CountLines-v0.0.3000755001750001750 011433221300 14660 5ustar00moritzmoritz000000000000File-CountLines-v0.0.3/Build.PL000444001750001750 132611433221300 16313 0ustar00moritzmoritz000000000000use strict; use warnings; use Module::Build; my $build = Module::Build->new( create_readme => 1, create_makefile_pl => 'traditional', license => 'perl', module_name => 'File::CountLines', dist_author => 'Moritz Lenz', dist_abstract => 'Efficiently count the number of line breaks in a file', dist_version_from => 'lib/File/CountLines.pm', requires => { 'Exporter' => '5.57', 'Carp' => 0, 'charnames' => 1.01, 'warnings' => 0, 'strict' => 0, }, recommends => {}, sign => 0, ); $build->create_build_script; # vim: sw=4 ts=4 expandtab File-CountLines-v0.0.3/Changes000444001750001750 42511433221277 16306 0ustar00moritzmoritz000000000000Revision History for Perl module File::CountLines 0.0.3 Thu Aug 19 14:24:04 CEST 2010 - documentation fixes 0.0.2 Wed Nov 12 20:01:34 CET 2008 - require a sufficiently new version of `charnames' - Small documentation fix 0.0.1 Sun Nov 9 23:29:53 CET 2008 - Initial release File-CountLines-v0.0.3/MANIFEST000444001750001750 16711433221277 16147 0ustar00moritzmoritz000000000000Build.PL Changes lib/File/CountLines.pm Makefile.PL MANIFEST This list of files META.yml README t/01-pod.t t/basic.t File-CountLines-v0.0.3/README000444001750001750 736311433221300 15706 0ustar00moritzmoritz000000000000NAME File::CountLines - efficiently count the number of line breaks in a file. SYNOPSIS use File::CountLines qw(count_lines); my $no_of_lines = count_lines('/etc/passwd'); # other uses my $carriage_returns = count_lines( 'path/to/file.txt', style => 'cr', ); # possible styles are 'native' (the default), 'cr', 'lf' DESCRIPTION perlfaq5 answers the question on how to count the number of lines in a file. This module is a convenient wrapper around that method, with additional options. More specifically, it counts the number of *line breaks* rather than lines. 
On Unix systems nearly all text files end with a newline (by convention), so usually the number of lines and the number of line breaks are equal. Since different operating systems have different ideas of what a newline is, you can specify a "style" option, which can be one of the following values: "native" This takes Perl's "\n" as the line separator, which should be the right thing in most cases. See perlport for details. This is the default. "cr" Take a carriage return as line separator (classic Mac OS style) "lf" Take a line feed as line separator (Unix style) "crlf" Take a carriage return followed by a line feed as separator (Microsoft Windows style) Alternatively you can specify an arbitrary separator like this: my $lists = count_lines($file, separator => '\end{itemize}'); It is taken verbatim and searched for in the file. The file is read in equally sized blocks. The size of the blocks can be supplied with the "blocksize" option. The default is 4096, and can be changed by setting $File::CountLines::BlockSize. Do not use a block size smaller than the length of the separator; doing so might produce wrong results. (In general there's no reason to choose a smaller block size at all. Depending on the size of your file, a larger block size might speed things up a bit.) Character Encodings If you supply a separator yourself, it should not be a decoded string. The file is read in binary mode, which implies that this module works fine for text files in ASCII-compatible encodings, including ASCII itself, UTF-8 and all the ISO-8859-* encodings (aka Latin-1, Latin-2, ...). Note that multi-byte encodings like UTF-32, UTF-16le, UTF-16be and UCS-2 encode a line feed character in such a way that the 0x0A byte is a substring of the encoded character, but if you search blindly for that byte you will get false positives. For example, the *LATIN CAPITAL LETTER C WITH DOT ABOVE*, U+010A, has the byte sequence "0x0A 0x01" when encoded as UTF-16le, so it would be counted as a newline. 
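The U+010A case can be reproduced with a few lines of Perl. This is only an illustrative sketch: it uses the core Encode module (which File::CountLines itself does not use) and a plain tr/// count standing in for a byte-oriented scan; the variable names are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(encode);

# One line of text: U+010A followed by a single line feed.
my $text  = "\x{010A}\n";

# Encode to UTF-16LE; the result is a byte string.
my $bytes = encode('UTF-16LE', $text);

# Count raw 0x0A bytes, the way a byte-oriented scan would.
# tr/// with an empty replacement only counts, it does not modify.
my $count = ($bytes =~ tr/\x0A//);

# Show the encoded bytes in hex, then the (misleading) count.
printf "bytes: %s\n", join ' ', map { sprintf '%02X', ord } split //, $bytes;
printf "0x0A bytes found: %d, real line feeds: 1\n", $count;
```

The scan sees one 0x0A byte from U+010A and one from the genuine line feed, so it overcounts.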
Even searching for "0x0A 0x00" might give false positives. So the summary is that for now you can't use this module in a meaningful way to count lines of text files in encodings that are not ASCII-compatible. If there's demand for it, I can implement that though. Extending You can add your own EOL styles by adding them to the %File::CountLines::StyleMap hash, with the name of the style as hash key and the separator as the value. AUTHOR Moritz Lenz , COPYRIGHT AND LICENSE Copyright (C) 2008 by Moritz A. Lenz. This module is free software. You may use, redistribute and modify it under the same terms as perl itself. Example code included in this package may be used as if it were in the Public Domain. DEVELOPMENT You can obtain the latest development version from : git clone git://github.com/moritz/File-CountLines.git File-CountLines-v0.0.3/Makefile.PL000444001750001750 112211433221300 16763 0ustar00moritzmoritz000000000000# Note: this file was auto-generated by Module::Build::Compat version 0.3603 use ExtUtils::MakeMaker; WriteMakefile ( 'NAME' => 'File::CountLines', 'VERSION_FROM' => 'lib/File/CountLines.pm', 'PREREQ_PM' => { 'Carp' => 0, 'Exporter' => '5.57', 'charnames' => '1.01', 'strict' => 0, 'warnings' => 0 }, 'INSTALLDIRS' => 'site', 'EXE_FILES' => [], 'PL_FILES' => {} ) ; File-CountLines-v0.0.3/META.yml000444001750001750 103411433221277 16301 0ustar00moritzmoritz000000000000--- abstract: 'Efficiently count the number of line breaks in a file' author: - 'Moritz Lenz' configure_requires: Module::Build: 0.36 generated_by: 'Module::Build version 0.3603' license: perl meta-spec: url: http://module-build.sourceforge.net/META-spec-v1.4.html version: 1.4 name: File-CountLines provides: File::CountLines: file: lib/File/CountLines.pm version: v0.0.3 requires: Carp: 0 Exporter: 5.57 charnames: 1.01 strict: 0 warnings: 0 resources: license: http://dev.perl.org/licenses/ version: v0.0.3 File-CountLines-v0.0.3/lib000755001750001750 011433221277 15443 
5ustar00moritzmoritz000000000000File-CountLines-v0.0.3/lib/File000755001750001750 011433221277 16322 5ustar00moritzmoritz000000000000File-CountLines-v0.0.3/lib/File/CountLines.pm000444001750001750 1437111433221277 21126 0ustar00moritzmoritz000000000000package File::CountLines; use strict; use warnings; our $VERSION = '0.0.3'; our @EXPORT_OK = qw(count_lines); use Exporter 5.057; Exporter->import('import'); use Carp qw(croak); use charnames qw(:full); our %StyleMap = ( 'cr' => "\N{CARRIAGE RETURN}", 'lf' => "\N{LINE FEED}", 'crlf' => "\N{CARRIAGE RETURN}\N{LINE FEED}", 'native' => "\n", ); our $BlockSize = 4096; sub count_lines { my $filename = shift; croak 'expected filename in call to count_lines()' unless defined $filename; my %options = @_; my $sep = $options{separator}; unless (defined $sep) { my $style = exists $options{style} ? $options{style} : 'native'; $sep = $StyleMap{$style}; die "Don't know how to map style '$style'" unless defined $sep; } if (length($sep) > 1) { return _cl_sysread_multiple_chars( $filename, $sep, $options{blocksize} || $BlockSize, ); } else { return _cl_sysread_one_char( $filename, $sep, $options{blocksize} || $BlockSize, ); } } sub _cl_sysread_one_char { my ($filename, $sep, $blocksize) = @_; local $Carp::CarpLevel = 1; open my $handle, '<:raw', $filename or croak "Can't open file `$filename' for reading: $!"; binmode $handle; my $lines = 0; $sep =~ s/([\\{}])/\\$1/g; # need eval here because tr/// doesn't interpolate my $sysread_status; eval qq[ while (\$sysread_status = sysread \$handle, my \$buffer, $blocksize) { \$lines += (\$buffer =~ tr{$sep}{}); } ]; die "Can't sysread() from file `$filename': $!" 
unless defined ($sysread_status); close $handle or croak "Can't close file `$filename': $!"; return $lines; } sub _cl_sysread_multiple_chars { my ($filename, $sep, $blocksize) = @_; local $Carp::CarpLevel = 1; open my $handle, '<:raw', $filename or croak "Can't open file `$filename' for reading: $!"; binmode $handle; my $len = length($sep); my $lines = 0; my $buffer = ''; my $sysread_status; while ($sysread_status = sysread $handle, $buffer, $blocksize, length($buffer)) { my $offset = -$len; while (-1 != ($offset = index $buffer, $sep, $offset + $len)) { $lines++; } # we assume $len >= 2; otherwise use _cl_sysread_one_char() $buffer = substr $buffer, 1 - $len; } die "Can't sysread() from file `$filename': $!" unless defined ($sysread_status); close $handle or croak "Can't close file `$filename': $!"; return $lines; } 1; __END__ =head1 NAME File::CountLines - efficiently count the number of line breaks in a file. =head1 SYNOPSIS use File::CountLines qw(count_lines); my $no_of_lines = count_lines('/etc/passwd'); # other uses my $carriage_returns = count_lines( 'path/to/file.txt', style => 'cr', ); # possible styles are 'native' (the default), 'cr', 'lf' =head1 DESCRIPTION L<perlfaq5> answers the question on how to count the number of lines in a file. This module is a convenient wrapper around that method, with additional options. More specifically, it counts the number of I<line breaks> rather than lines. On Unix systems nearly all text files end with a newline (by convention), so usually the number of lines and the number of line breaks are equal. Since different operating systems have different ideas of what a newline is, you can specify a C