HTTP-Async-0.30/MANIFEST

.travis.yml
Changes
lib/HTTP/Async.pm
lib/HTTP/Async/Polite.pm
Makefile.PL
MANIFEST                 This list of files
README.md
t/bad-connections.t
t/bad-headers.t
t/bad-hosts.t
t/cookies.t
t/dead-connection.t
t/headers.t
t/invalid-options.t
t/key_aliases.t
t/local-addr.t
t/make-url-absolute.t
t/not-modified.t
t/peer-addr.t
t/pod-coverage.t
t/pod.t
t/polite.t
t/poll-interval.t
t/proxy-with-https.t
t/proxy.t
t/real-servers.t
t/redirects.t
t/release-cpan-changes.t
t/remove.t
t/setup.t
t/strip-host-from-uri.t
t/template.t
t/test-utils.pl
t/TestServer.pm
t/timeout.t
TODO
META.yml                 Module YAML meta-data (added by MakeMaker)
META.json                Module JSON meta-data (added by MakeMaker)

HTTP-Async-0.30/Changes

CHANGES to HTTP::Async

0.30 2015/05/30
    * Allow max_redirect or max_redirects, to be consistent with LWP::UserAgent
      Thanks Vincent Lequertier (SkySymbol)!

0.29 2015/05/30
    * Make add_with_opts throw an error on invalid options
      Thanks Tom Grimwood-Taylor (tgt)!

0.28 2015/03/09
    * Allow manual override of PeerAddr via peer_addr (rt #102634)
    * Switch from print() to note() in TestServer for test suite

0.27 2014/11/17
    * Github user acferen finally patched the long-standing timeout bug
      Thanks acferen!

0.26 2014/06/06
    * Daniel Lintott of the Debian Perl Group reported that the HTTP::Async
      proxy tests were broken with a development version of
      HTTP-Server-Simple (0.45_1). I fixed the test, or rather fixed
      t/TestServer.pm, so that it would work.
      Thanks Daniel!
    * While I was in there, I replaced some warn() calls in the tests with
      diag() calls, to be a better TAP citizen

0.25 2014/03/20
    * Added remove($id) and remove_all() methods
      Thanks go to rt.cpan.org user Ikegami
    * Added support for forwarding headers on redirect
      Thanks to Github users kloevschall and kaol
    * Added support for having an HTTP::Cookies cookie jar object
      Thanks again to Github user kaol
    * Use Net::EmptyPort for the TestServer in the tests
      Thanks to Github user and all around great Perl Monger DrHyde

0.24 2014/03/19
    * Better POD docs for the counting methods
      - Requested by Dave Hodgkinson via rt.cpan.org

0.23 2013/11/03
    * Added REAL_SERVERS check to t/proxy-with-https.t
      - Thanks to Gregor Herrmann, Debian Perl Group, for the patch

0.22 2013/09/12
    * Added repository cpan metadata to Makefile.PL
      - Thanks to David Steinbrunner for the patch

0.21 2013/08/29
    * Updated Changes file to meet CPAN::Changes::Spec
    * Fixed unparseable date for version 0.02

0.20 2013/07/18
    * Updated Changes file to meet CPAN::Changes::Spec
    * Changed and standardized date formats
    * Changed name from CHANGES to Changes
    * Added author/release test to check this going forward

0.19 2013/07/17
    * Added ssl_options support
    * Increased Net::HTTPS::NB requirement to 0.13
      - Thanks to Heikki Vatiainen for the patch

0.18 2013/05/27
    * Fixed typo in POD
      - Added THANKS for Florian (fschlich)

0.17 2013/04/20
    * Added local_addr and local_port support
    * Standardised test names
    * Added THANKS for github user c00ler-

0.16 2013/04/04
    * Fixed CPAN Testers bug in bad-hosts.t

0.15 2013/04/04
    * Two bug fixes provided by Josef Toman:
      * Fixed header handling to use header_field_names()
      * Replaced _make_url_absolute with URI::new_abs()

0.14 2013/04/01
    * More diagnostics in bad-hosts.t on failure

0.13 2013/03/29
    * Fixed t/real-servers.t to work whether or not Net::HTTPS::NB is available

0.12 2013/03/29
    * New logic for making https requests through a proxy
    * Made tests run ok in parallel by using different ports per test
    * Set explicit SSL_verify_mode in real-servers.t
    * Minor update to code comment about is_proxy mode

0.11 2012/11/13
    * Use high ports to prevent test failure when 8080 is already used
    * Travis config

0.10 2012/03/08
    * added support for https requests - thanks Naveed Massjouni

0.09 2007/09/13
    * added requirement for Pod::Coverage >= 0.19 if perl >= 5.9.0
    * moved polite.t test into t/ so that it gets run by the makefile.

0.08 2007/09/12
    * Deleted Module::Build
    * Removed test in bad-hosts.t that was unreliable. I think that it was
      failing under certain proxy configs.

0.07 2007/02/18
    * Added proper handling of 304 responses based on code patch and test by
      Tomohiro Ikebe from livedoor.jp

0.06 2007/02/06
    * Changed the request uri that is used so that it has the host in for
      proxy requests and does not otherwise. This is to comply with the RFC
      for HTTP ( http://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html#sec5.1.2 ).

0.05 2006/11/17
    * Added ability to pass arguments to new to configure the async object.

0.04 2006/09/28
    * Fixed stupid bug that caused the polite module to crash if the numbers
      of requests per domain were not the same.

0.03 2006/09/27
    * Created HTTP::Async::Polite that adds limits to the scraping to avoid
      over stretching the domain being scraped.
    * Increased the delay in poll-interval tests to stop them failing on slow
      machines.
    * Added pod tests, README and Makefile.PL in an attempt to achieve
      kwalitee.

0.02 2006/09/06
    * Changed the timeout to be an inactivity timeout and added a
      max_request_length to limit the amount of time that a request can be
      running for.
    * Added more diagnostics to the tests to try to find the bug that is
      causing MIYAGAWA issues.
    * Created TODO and CHANGES docs.
    * Added error checking to catch connections that fail before the headers
      are sent. (patch submitted by Egor Egorov)
    * Added ability to specify proxy to use. (based on patch from Egor Egorov)
    * Added 'add_with_opts' method that lets you override the default options
      for this request.

0.01 2006/08/21
    * Initial release onto CPAN.
HTTP-Async-0.30/Makefile.PL

use strict;
use warnings;
use ExtUtils::MakeMaker;

WriteMakefile(
    'NAME'         => 'HTTP::Async',
    'VERSION_FROM' => 'lib/HTTP/Async.pm',
    LICENSE        => 'perl',
    'PREREQ_PM'    => {
        'Carp'                       => 0,
        'Data::Dumper'               => 0,
        'HTTP::Request'              => 0,
        'HTTP::Response'             => 0,
        'HTTP::Server::Simple::CGI'  => 0,
        'HTTP::Status'               => 0,
        'IO::Select'                 => 0,
        'LWP::UserAgent'             => 0,
        'Net::HTTP'                  => 0,
        'Net::HTTP::NB'              => 0,
        'Net::HTTPS::NB'             => 0.13,
        'Test::HTTP::Server::Simple' => 0,
        'Test::More'                 => 0,
        'Test::Fatal'                => 0,
        'Time::HiRes'                => 0,
        'URI'                        => 0,
        'URI::Escape'                => 0,
        'Net::EmptyPort'             => 0,
    },
    META_MERGE => {
        resources => {
            repository => 'https://github.com/evdb/HTTP-Async',
        },
    },
);

HTTP-Async-0.30/META.json

{
   "abstract" : "unknown",
   "author" : [
      "unknown"
   ],
   "dynamic_config" : 1,
   "generated_by" : "ExtUtils::MakeMaker version 6.66, CPAN::Meta::Converter version 2.142690",
   "license" : [
      "perl_5"
   ],
   "meta-spec" : {
      "url" : "http://search.cpan.org/perldoc?CPAN::Meta::Spec",
      "version" : "2"
   },
   "name" : "HTTP-Async",
   "no_index" : {
      "directory" : [
         "t",
         "inc"
      ]
   },
   "prereqs" : {
      "build" : {
         "requires" : {
            "ExtUtils::MakeMaker" : "0"
         }
      },
      "configure" : {
         "requires" : {
            "ExtUtils::MakeMaker" : "0"
         }
      },
      "runtime" : {
         "requires" : {
            "Carp" : "0",
            "Data::Dumper" : "0",
            "HTTP::Request" : "0",
            "HTTP::Response" : "0",
            "HTTP::Server::Simple::CGI" : "0",
            "HTTP::Status" : "0",
            "IO::Select" : "0",
            "LWP::UserAgent" : "0",
            "Net::EmptyPort" : "0",
            "Net::HTTP" : "0",
            "Net::HTTP::NB" : "0",
            "Net::HTTPS::NB" : "0.13",
            "Test::Fatal" : "0",
            "Test::HTTP::Server::Simple" : "0",
            "Test::More" : "0",
            "Time::HiRes" : "0",
            "URI" : "0",
            "URI::Escape" : "0"
         }
      }
   },
   "release_status" : "stable",
   "resources" : {
      "repository" : {
         "url" : "https://github.com/evdb/HTTP-Async"
      }
   },
   "version" : "0.30"
}

HTTP-Async-0.30/README.md

# HTTP::Async

[![Build Status](https://secure.travis-ci.org/evdb/HTTP-Async.png?branch=master)](https://travis-ci.org/evdb/HTTP-Async)

This module lets you process several HTTP connections at once, in parallel
and without blocking.

For docs please see
[the HTTP::Async page on search.cpan.org](http://search.cpan.org/dist/HTTP-Async/lib/HTTP/Async.pm).

## INSTALLATION

To install you can use the following commands:

    perl Makefile.PL
    make
    make test
    make install

## COPYRIGHT AND LICENCE

Copyright (C) 2006-2012, Edmund von der Burg

This library is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.

HTTP-Async-0.30/lib/HTTP/Async/Polite.pm

use strict;
use warnings;

package HTTP::Async::Polite;
use base 'HTTP::Async';

our $VERSION = '0.05';

use Carp;
use Data::Dumper;
use Time::HiRes qw( time sleep );
use URI;

=head1 NAME

HTTP::Async::Polite - politely process multiple HTTP requests

=head1 SYNOPSIS

See L<HTTP::Async> - the usage is unchanged.

=head1 DESCRIPTION

The L<HTTP::Async> module allows you to have many requests going on at once.
This can be very rude if you are fetching several pages from the same domain.
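For orientation, here is a rough usage sketch of this subclass. It is not
taken from the distribution's own tests; the URLs and the 10-second figure
are illustrative placeholders only (C<send_interval> is documented below):

    use HTTP::Async::Polite;
    use HTTP::Request;

    my $async = HTTP::Async::Polite->new;
    $async->send_interval(10);    # leave at least 10s between hits on a domain

    # Several requests to the same domain are queued and spaced out politely.
    $async->add( HTTP::Request->new( GET => 'http://www.example.com/page1' ) );
    $async->add( HTTP::Request->new( GET => 'http://www.example.com/page2' ) );

    while ( my $response = $async->wait_for_next_response ) {
        # process $response exactly as you would with plain HTTP::Async
    }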
This module adds limits to the number of simultaneous requests to a given
domain and adds an interval between the requests. In all other ways it is
identical in use to the original L<HTTP::Async>.

=head1 NEW METHODS

=head2 send_interval

Getter and setter for the C<send_interval> - the time in seconds to leave
between each request for a given domain. By default this is set to 5 seconds.

=cut

sub send_interval {
    my $self = shift;
    return scalar @_
      ? $self->_set_opt( 'send_interval', @_ )
      : $self->_get_opt('send_interval');
}

=head1 OVERLOADED METHODS

These methods are overloaded but otherwise work exactly as the original
methods did. The docs here just describe what they do differently.

=head2 new

Sets the C<send_interval> value to the default of 5 seconds.

=cut

sub new {
    my $class = shift;
    my $self  = $class->SUPER::new;

    # Set the interval between sends.
    $self->{opts}{send_interval} = 5;    # seconds
    $class->_add_get_set_key('send_interval');

    $self->_init(@_);

    return $self;
}

=head2 add_with_opts

Adds the request to the correct queue depending on the domain.

=cut

sub add_with_opts {
    my $self = shift;
    my $req  = shift;
    my $opts = shift;
    my $id   = $self->_next_id;

    # Instead of putting this request and opts directly onto the to_send
    # array, get the domain and add it to the domain's queue. Store this
    # domain with the opts so that it is easy to get at.
    my $uri    = URI->new( $req->uri );
    my $host   = $uri->host;
    my $port   = $uri->port;
    my $domain = "$host:$port";
    $opts->{_domain} = $domain;

    # Get the domain array - create it if needed.
    my $domain_arrayref = $self->{domain_stats}{$domain}{to_send} ||= [];

    push @{$domain_arrayref}, [ $req, $id ];
    $self->{id_opts}{$id} = $opts;

    $self->poke;

    return $id;
}

=head2 to_send_count

Returns the number of requests waiting to be sent. This is the number in the
actual queue plus the number in each domain-specific queue.

=cut

sub to_send_count {
    my $self = shift;
    $self->poke;

    my $count = scalar @{ $$self{to_send} };

    $count += scalar @{ $self->{domain_stats}{$_}{to_send} }
      for keys %{ $self->{domain_stats} };

    return $count;
}

sub _process_to_send {
    my $self = shift;

    # Go through the domain specific queues and add all requests that we can
    # to the real queue.
    foreach my $domain ( keys %{ $self->{domain_stats} } ) {

        my $domain_stats = $self->{domain_stats}{$domain};
        next unless scalar @{ $domain_stats->{to_send} };

        # warn "TRYING TO ADD REQUEST FOR $domain";
        # warn sleep 5;

        # Check that this request is good to go.
        next if $domain_stats->{count};
        next unless time > ( $domain_stats->{next_send} || 0 );

        # We can add this request.
        $domain_stats->{count}++;
        push @{ $self->{to_send} }, shift @{ $domain_stats->{to_send} };
    }

    # Use the original to send the requests on the queue.
    return $self->SUPER::_process_to_send;
}

sub _add_to_return_queue {
    my $self       = shift;
    my $req_and_id = shift;

    # Decrement the count for this domain so that another request can start.
    # Also set the interval so that we don't scrape too fast.
    my $id          = $req_and_id->[1];
    my $domain      = $self->{id_opts}{$id}{_domain};
    my $domain_stat = $self->{domain_stats}{$domain};
    my $interval    = $self->_get_opt( 'send_interval', $id );

    $domain_stat->{count}--;
    $domain_stat->{next_send} = time + $interval;

    return $self->SUPER::_add_to_return_queue($req_and_id);
}

=head1 SEE ALSO

L<HTTP::Async> - the module that this one is based on.

=head1 AUTHOR

Edmund von der Burg C<< <evdb@ecclestoad.co.uk> >>.

L<http://www.ecclestoad.co.uk/>

=head1 LICENCE AND COPYRIGHT

Copyright (c) 2006, Edmund von der Burg C<< <evdb@ecclestoad.co.uk> >>.
All rights reserved.

This module is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.
=head1 DISCLAIMER OF WARRANTY

BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR
THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO
THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU. SHOULD THE SOFTWARE
PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR, OR
CORRECTION.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL
ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE
THE SOFTWARE AS PERMITTED BY THE ABOVE LICENCE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED TO
LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR
THIRD PARTIES OR A FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER
SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.

=cut

1;

HTTP-Async-0.30/lib/HTTP/Async.pm

use strict;
use warnings;

package HTTP::Async;

our $VERSION = '0.30';

use Carp;
use Data::Dumper;
use HTTP::Response;
use IO::Select;
use Net::HTTP::NB;
use Net::HTTP;
use URI;
use Time::HiRes qw( time sleep );

=head1 NAME

HTTP::Async - process multiple HTTP requests in parallel without blocking.

=head1 SYNOPSIS

Create an object and add some requests to it:

    use HTTP::Async;
    my $async = HTTP::Async->new;

    # create some requests and add them to the queue.
    $async->add( HTTP::Request->new( GET => 'http://www.perl.org/' ) );
    $async->add( HTTP::Request->new( GET => 'http://www.ecclestoad.co.uk/' ) );

and then EITHER process the responses as they come back:

    while ( my $response = $async->wait_for_next_response ) {
        # Do some processing with $response
    }

OR do something else if there is no response ready:

    while ( $async->not_empty ) {
        if ( my $response = $async->next_response ) {
            # deal with $response
        }
        else {
            # do something else
        }
    }

OR just use the async object to fetch stuff in the background and deal with
the responses at the end.

    # Do some long code...
    for ( 1 .. 100 ) {
        some_function();
        $async->poke;    # lets it check for incoming data.
    }

    while ( my $response = $async->wait_for_next_response ) {
        # Do some processing with $response
    }

=head1 DESCRIPTION

Although using the conventional C<LWP::UserAgent> is fast and easy, it does
have some drawbacks - the code execution blocks until the request has been
completed and it is only possible to process one request at a time.
C<HTTP::Async> attempts to address these limitations.

It gives you an 'Async' object that you can add requests to, and then get the
requests off as they finish. The actual sending and receiving of the requests
is abstracted. As soon as you add a request it is transmitted; if there are
too many requests in progress at the moment, they are queued. There is no
concept of starting or stopping - it runs continuously.

Whilst it is waiting to receive data it returns control to the code that
called it, meaning that you can carry out processing whilst fetching data
from the network. All without forking or threading - it is actually done
using C<select> lists.
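As an illustration of the queueing model described above, here is a hedged
sketch of adding requests with per-request options and withdrawing one of
them by its id. The C<add_with_opts> and C<remove> methods are mentioned in
the Changes file above; the C<timeout> option key and the URLs are
assumptions made for this example, so check the full method documentation
before relying on them:

    use HTTP::Async;
    use HTTP::Request;

    my $async = HTTP::Async->new;

    # add_with_opts overrides the default options for a single request and
    # returns an id; 'timeout' is an assumed example of such an option.
    my $id_a = $async->add_with_opts(
        HTTP::Request->new( GET => 'http://www.example.com/a' ),
        { timeout => 10 },
    );
    my $id_b = $async->add_with_opts(
        HTTP::Request->new( GET => 'http://www.example.com/b' ),
        {},
    );

    # A request that is no longer wanted can be withdrawn by its id.
    $async->remove($id_b);

    while ( my $response = $async->wait_for_next_response ) {
        # only responses for requests still known to the object arrive here
    }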