Web-Query-1.01000755001750001750 014550254663 13204 5ustar00yanickyanick000000000000INSTALL100644001750001750 452614550254663 14325 0ustar00yanickyanick000000000000Web-Query-1.01This is the Perl distribution Web-Query. Installing Web-Query is straightforward. ## Installation with cpanm If you have cpanm, you only need one line: % cpanm Web::Query If it does not have permission to install modules to the current perl, cpanm will automatically set up and install to a local::lib in your home directory. See the local::lib documentation (https://metacpan.org/pod/local::lib) for details on enabling it in your environment. ## Installing with the CPAN shell Alternatively, if your CPAN shell is set up, you should just be able to do: % cpan Web::Query ## Manual installation As a last resort, you can manually install it. If you have not already downloaded the release tarball, you can find the download link on the module's MetaCPAN page: https://metacpan.org/pod/Web::Query Untar the tarball, install configure prerequisites (see below), then build it: % perl Makefile.PL % make && make test Then install it: % make install On Windows platforms, you should use `dmake` or `nmake`, instead of `make`. If your perl is system-managed, you can create a local::lib in your home directory to install modules to. For details, see the local::lib documentation: https://metacpan.org/pod/local::lib The prerequisites of this distribution will also have to be installed manually. The prerequisites are listed in one of the files: `MYMETA.yml` or `MYMETA.json` generated by running the manual build process described above. ## Configure Prerequisites This distribution requires other modules to be installed before this distribution's installer can be run. They can be found under the "configure_requires" key of META.yml or the "{prereqs}{configure}{requires}" key of META.json. ## Other Prerequisites This distribution may require additional modules to be installed after running Makefile.PL. 
Look for prerequisites in the following phases: * to run make, PHASE = build * to use the module code itself, PHASE = runtime * to run tests, PHASE = test They can all be found in the "PHASE_requires" key of MYMETA.yml or the "{prereqs}{PHASE}{requires}" key of MYMETA.json. ## Documentation Web-Query documentation is available as POD. You can run `perldoc` from a shell to read the documentation: % perldoc Web::Query For more information on installing Perl modules via CPAN, please see: https://www.cpan.org/modules/INSTALL.html Changes100644001750001750 1636314550254663 14611 0ustar00yanickyanick000000000000Web-Query-1.01Revision history for Perl extension Web::Query 1.01 2024-01-12 [BUG FIXES] - Fix tests to work with new version of HTML::TreeBuilder::LibXML. (GH#57) [DOCUMENTATION] - Fix documentation typos. (GH#56, esabol) [ENHANCEMENTS] - Move tests to Test2::V0. [STATISTICS] - code churn: 48 files changed, 229 insertions(+), 210 deletions(-) 1.00 2023-09-06 [API CHANGES] - Web::Query will now throw when failing to retrieve a URL, instead of silently returning 'undef'. (GH#55) [STATISTICS] - code churn: 8 files changed, 56 insertions(+), 32 deletions(-) 0.39 2018-08-21 [BUG FIXES] - localize $@ in destructor to prevent clobbering. (GH#51, Maurice Aubrey) [STATISTICS] - code churn: 6 files changed, 81 insertions(+), 4 deletions(-) 0.38 2016-07-03 [BUG FIXES] - HTML::Selector::XPath 0.19 has a bug regarding '//b' expressions. [STATISTICS] - code churn: 2 files changed, 9 insertions(+), 2 deletions(-) 0.37 2016-07-02 [BUG FIXES] - Require List::Util 1.44+ (for 'uniq') [STATISTICS] - code churn: 2 files changed, 19 insertions(+), 7 deletions(-) 0.36 2016-06-30 [BUG FIXES] - `->text()` doesn't break on text nodes. (GH#47, reported by Gabor Szabo) [DOCUMENTATION] - Add mention of a way to get PIs of XML documents (GH#49). [ENHANCEMENTS] - `wq()` can now create an empty document. - Add 'join' argument to `as_html`. - Add 'match' function. - Add 'split' function. 
(GH#45) [STATISTICS] - code churn: 11 files changed, 322 insertions(+), 46 deletions(-) 0.35 2016-05-31 [DOCUMENTATION] - Add troubleshooting entry for 'script' elements. [GH#8] [ENHANCEMENTS] - 'attr' method now accepts many attributes and code refs in setter mode. [STATISTICS] - code churn: 6 files changed, 104 insertions(+), 33 deletions(-) 0.34 2015-09-23 [BUG FIXES] - 'filter' was exploding on text nodes. [GH#44] [STATISTICS] - code churn: 4 files changed, 24 insertions(+), 4 deletions(-) 0.33 2015-09-23 [BUG FIXES] - Make sure we use XML::LibXML > 2.0107 for `unique_keys`. [GH#43] - 'filter' with coderef was not generating a sub-WQ object. [ENHANCEMENTS] - Be more resilient to #text nodes. (GH#42) [STATISTICS] - code churn: 6 files changed, 101 insertions(+), 34 deletions(-) 0.32 2015-08-29 [ENHANCEMENTS] - add id() as a shortcut method for `->attr('id')`. [GH#38] - add 'name()' as a shortcut method for `->attr('name')`. [GH#39] - add 'data()' as a shortcut method for `->attr('data-*foo*')`. [GH#40] - add `toggle_class()` method. [GH#41] [STATISTICS] - code churn: 5 files changed, 394 insertions(+), 172 deletions(-) 0.31 2015-08-25 - each() would skip nodes if its subref was calling remove(). [yanick] - remove duplicate code for duplicate(). [yanick] [STATISTICS] - code churn: 5 files changed, 46 insertions(+), 25 deletions(-) 0.30 2015-08-23 - next_until.t was failing if XML::LibXML isn't installed. [yanick] 0.29 2015-08-21 - add() now returns a new element (instead of modifying $self). [yanick] - added 'not()'. [yanick] - added 'and_back'. [yanick] - added 'next_until()'. [yanick] 0.28 2015-06-30 - new_from_html with options was breaking 'end()'. (yanick) 0.27 2014-12-24T00:52:33Z - new() with a bad url wasn't returning 'undef' when options were given. (yanick) - Add 'no_space_compacting' option. #33 (yanick) - Add 'tagname' to query/modify tag names. #34 (yanick) - XPath expressions can now be used as well. 
#35 (yanick) 0.26 2014-03-31T08:23:34Z - impl prev() and next() method #31 (xaicron) 0.25 2014-02-13T01:26:42Z - re-packaging (no feature changes) 0.24 2014-02-12T05:34:09Z - replace_with: Can't call method "clone" on an undefined value #24 (Reported by @daxim++, Fixed by @yanick++) 0.23 2013-05-30T16:09:03Z - improved find() documentation - fixed cpanfile min perl version - modified tests to use the expression form of eval to try to load Web::Query::LibXML - the block form of eval is not working as expected on some perl versions on i386-freebsd (cafe01) 0.22 2013-05-15T23:36:38Z - added new module: Web::Query::LibXML - modified test files to also test Web::Query::LibXML (if it loads). 0.21 2013-05-15T14:36:11Z - new jQuery-compatible method: add() - fixed filter() that relied on wrong find() behavior - fixed two t/03_traverse.t tests that were expecting wrong behavior from filter() 0.20 2013-05-13T22:51:02Z - improved documentation - fixed find() to match only descendant elements This is the correct jQuery compatible implementation, which I had changed in 0.14 to also match root nodes, my bad. - fixed tests that relied on that wrong find() behavior. (cafe01) 0.19 2013-05-12T18:19:57Z - implemented contents() jQuery-compatible method - new() now accepts another Web::Query object (cafe01) 0.18 2013-05-09T19:40:40Z - fixed html() method, now using $self->_build_tree - calling parent() instead of undocumented getParentNode() - calling disembowel() instead of guts() Needed for Web::Query::LibXML, so nodes get detached from old document and returned each as root of a new document. (Carlos Fernando Avila Gratz) 0.17 2013-05-08T01:18:36Z - new_from_file() now calling guts() instead of elementify() So the file can contain a document fragment (multiple root nodes) instead of a full document (single root). Also, now all new_from_* methods behave the same. 
(Carlos Fernando Avila Gratz) 0.16 2013-04-22T14:26:44Z - modified new_from_element() to ignore non-blessed items (Carlos Fernando Avila Gratz) - created _build_tree() method (Carlos Fernando Avila Gratz) 0.15 2013-04-09T00:29:48Z - added clone() method (Carlos Fernando Avila Gratz) - now storing comments from parsed html (Carlos Fernando Avila Gratz) - fixed remove() to get rid of removed element refs; removes them from $self and from all $self->{before}. Also modified how each() instantiates the objects, so $_->end works in the callback, which is needed for $_->remove() to work in the callback. (Carlos Fernando Avila Gratz) 0.14 2013-04-07T02:22:25Z - new jQuery compatible methods, and related tests * append * prepend * before * after * insert_before * insert_after * detach * add_class * remove_class * has_class (Carlos Fernando Avila Gratz) 0.13 2013-04-05T06:37:27Z - fixed find() bug: it was calling selector_to_xpath() in the loop, breaking the selector after the second call. (Carlos Fernando Avila Gratz) - Search from '//' when the node was created from HTML. (tokuhirom) 0.12 2013-04-03T20:24:49Z - Make subclass friendly (Carlos Fernando Avila Gratz) 0.11 - Implement a remove method that affects the html results. (gugod++) 0.10 [INCOMPATIBLE CHANGES] - new_from_url() no longer throws an exception on a bad response from the HTTP server. https://rt.cpan.org/Ticket/Display.html?id=76187 (oleg++) 0.09 - Switch to Module::Build - first() and last() should construct new object, but not modify self (Oleg++) 0.08 - added ->map and ->filter methods (Hiroki Honda) - fixed so that (empty)->first->size and (empty)->last->size return 0 (Hiroki Honda) 0.07 - HTML5 support 0.06 - added first, last methods (akiym) 0.05 - added docs for 'how do i customize useragent'. 0.04 - added ->size and ->parent method. 0.03 - fix fucking win32 new line issue. (it may work, I hope.) 0.02 - added docs for find method (reported by kan++). 
0.01 2011-02-19T10:38:22Z - original version LICENSE100644001750001750 4406214550254663 14320 0ustar00yanickyanick000000000000Web-Query-1.01This software is copyright (c) 2012 by Tokuhiro Matsuno Etokuhirom AAJKLFJEF@ GMAIL COME. This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself. Terms of the Perl programming language system itself a) the GNU General Public License as published by the Free Software Foundation; either version 1, or (at your option) any later version, or b) the "Artistic License" --- The GNU General Public License, Version 1, February 1989 --- This software is Copyright (c) 2012 by Tokuhiro Matsuno Etokuhirom AAJKLFJEF@ GMAIL COME. This is free software, licensed under: The GNU General Public License, Version 1, February 1989 GNU GENERAL PUBLIC LICENSE Version 1, February 1989 Copyright (C) 1989 Free Software Foundation, Inc. 51 Franklin St, Suite 500, Boston, MA 02110-1335 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The license agreements of most software companies try to keep users at the mercy of those companies. By contrast, our General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. The General Public License applies to the Free Software Foundation's software and to any other program whose authors commit to using it. You can use it for your programs, too. When we speak of free software, we are referring to freedom, not price. Specifically, the General Public License is designed to make sure that you have the freedom to give away or sell copies of free software, that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. 
To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of a such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must tell them their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. The precise terms and conditions for copying, distribution and modification follow. GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License Agreement applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any work containing the Program or a portion of it, either verbatim or with modifications. Each licensee is addressed as "you". 1. 
You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this General Public License and to the absence of any warranty; and give any other recipients of the Program a copy of this General Public License along with the Program. You may charge a fee for the physical act of transferring a copy. 2. You may modify your copy or copies of the Program or any portion of it, and copy and distribute such modifications under the terms of Paragraph 1 above, provided that you also do the following: a) cause the modified files to carry prominent notices stating that you changed the files and the date of any change; and b) cause the whole of any work that you distribute or publish, that in whole or in part contains the Program or any part thereof, either with or without modifications, to be licensed at no charge to all third parties under the terms of this General Public License (except that you may choose to grant warranty protection to some or all third parties, at your option). c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the simplest and most usual way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this General Public License. d) You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 
Mere aggregation of another independent work with the Program (or its derivative) on a volume of a storage or distribution medium does not bring the other work under the scope of these terms. 3. You may copy and distribute the Program (or a portion or derivative of it, under Paragraph 2) in object code or executable form under the terms of Paragraphs 1 and 2 above provided that you also do one of the following: a) accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Paragraphs 1 and 2 above; or, b) accompany it with a written offer, valid for at least three years, to give any third party free (except for a nominal charge for the cost of distribution) a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Paragraphs 1 and 2 above; or, c) accompany it with the information you received as to where the corresponding source code may be obtained. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form alone.) Source code for a work means the preferred form of the work for making modifications to it. For an executable file, complete source code means all the source code for all modules it contains; but, as a special exception, it need not include source code for modules which are standard libraries that accompany the operating system on which the executable file runs, or for standard header files or definitions files that accompany that operating system. 4. You may not copy, modify, sublicense, distribute or transfer the Program except as expressly provided under this General Public License. Any attempt otherwise to copy, modify, sublicense, distribute or transfer the Program is void, and will automatically terminate your rights to use the Program under this License. 
However, parties who have received copies, or rights to use copies, from you under this General Public License will not have their licenses terminated so long as such parties remain in full compliance. 5. By copying, distributing or modifying the Program (or any work based on the Program) you indicate your acceptance of this license to do so, and all its terms and conditions. 6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. 7. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of the license which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the license, you may choose any version ever published by the Free Software Foundation. 8. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 9. 
BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 10. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS Appendix: How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to humanity, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. 
Copyright (C) 19yy This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 1, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA 02110-1301 USA Also add information on how to contact you by electronic and paper mail. If the program is interactive, make it output a short notice like this when it starts in an interactive mode: Gnomovision version 69, Copyright (C) 19xx name of author Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than `show w' and `show c'; they could even be mouse-clicks or menu items--whatever suits your program. You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary. Here a sample; alter the names: Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (a program to direct compilers to make passes at assemblers) written by James Hacker. , 1 April 1989 Ty Coon, President of Vice That's all there is to it! --- The Artistic License 1.0 --- This software is Copyright (c) 2012 by Tokuhiro Matsuno Etokuhirom AAJKLFJEF@ GMAIL COME. 
This is free software, licensed under: The Artistic License 1.0 The Artistic License Preamble The intent of this document is to state the conditions under which a Package may be copied, such that the Copyright Holder maintains some semblance of artistic control over the development of the package, while giving the users of the package the right to use and distribute the Package in a more-or-less customary fashion, plus the right to make reasonable modifications. Definitions: - "Package" refers to the collection of files distributed by the Copyright Holder, and derivatives of that collection of files created through textual modification. - "Standard Version" refers to such a Package if it has not been modified, or has been modified in accordance with the wishes of the Copyright Holder. - "Copyright Holder" is whoever is named in the copyright or copyrights for the package. - "You" is you, if you're thinking about copying or distributing this Package. - "Reasonable copying fee" is whatever you can justify on the basis of media cost, duplication charges, time of people involved, and so on. (You will not be required to justify it to the Copyright Holder, but only to the computing community at large as a market that must bear the fee.) - "Freely Available" means that no fee is charged for the item itself, though there may be fees involved in handling the item. It also means that recipients of the item may redistribute it under the same conditions they received it. 1. You may make and give away verbatim copies of the source form of the Standard Version of this Package without restriction, provided that you duplicate all of the original copyright notices and associated disclaimers. 2. You may apply bug fixes, portability fixes and other modifications derived from the Public Domain or from the Copyright Holder. A Package modified in such a way shall still be considered the Standard Version. 3. 
You may otherwise modify your copy of this Package in any way, provided that you insert a prominent notice in each changed file stating how and when you changed that file, and provided that you do at least ONE of the following: a) place your modifications in the Public Domain or otherwise make them Freely Available, such as by posting said modifications to Usenet or an equivalent medium, or placing the modifications on a major archive site such as ftp.uu.net, or by allowing the Copyright Holder to include your modifications in the Standard Version of the Package. b) use the modified Package only within your corporation or organization. c) rename any non-standard executables so the names do not conflict with standard executables, which must also be provided, and provide a separate manual page for each non-standard executable that clearly documents how it differs from the Standard Version. d) make other distribution arrangements with the Copyright Holder. 4. You may distribute the programs of this Package in object code or executable form, provided that you do at least ONE of the following: a) distribute a Standard Version of the executables and library files, together with instructions (in the manual page or equivalent) on where to get the Standard Version. b) accompany the distribution with the machine-readable source of the Package with your modifications. c) accompany any non-standard executables with their corresponding Standard Version executables, giving the non-standard executables non-standard names, and clearly documenting the differences in manual pages (or equivalent), together with instructions on where to get the Standard Version. d) make other distribution arrangements with the Copyright Holder. 5. You may charge a reasonable copying fee for any distribution of this Package. You may charge any fee you choose for support of this Package. You may not charge a fee for this Package itself. 
However, you may distribute this Package in aggregate with other (possibly commercial) programs as part of a larger (possibly commercial) software distribution provided that you do not advertise this Package as a product of your own. 6. The scripts and library files supplied as input to or produced as output from the programs of this Package do not automatically fall under the copyright of this Package, but belong to whomever generated them, and may be sold commercially, and may be aggregated with this Package. 7. C or perl subroutines supplied by you and linked into this Package shall not be considered part of this Package. 8. The name of the Copyright Holder may not be used to endorse or promote products derived from this software without specific prior written permission. 9. THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE. The End t000755001750001750 014550254663 13370 5ustar00yanickyanick000000000000Web-Query-1.01add.t100644001750001750 333714550254663 14453 0ustar00yanickyanick000000000000Web-Query-1.01/t#!/usr/bin/env perl use strict; use warnings; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; my $html = <
Foo
Bar
HTML # add($object) is join('|', wq($html)->find('.foo')->add(wq($html)->find('.bar'))->as_html) => '
Foo
|
Bar
', 'add($object)'; # add($html) is join('|', wq($html)->find('.foo')->add('
Bar
')->as_html) => '
Foo
|
Bar
', 'add($html)'; # add(@elements) is join('|', wq($html)->find('.foo')->add(@{ wq($html)->find('div')->{trees}})->as_html) => '
Foo
|
Foo
|
Bar
', 'add(@elements)'; # add($selector, $xpath_context) is join('|', wq($html)->find('.foo')->add('.bar', wq($html)->{trees}->[0] )->as_html) => '
Foo
|
Bar
', 'add($selector, $xpath_context)'; subtest "add() create new object" => sub { my $wq = wq($html); my $x = $wq->find('.foo'); my $y = $x->add( $wq->find('.bar') ); is $x->size => 1, "original object"; is $y->size => 2, "new object"; }; subtest "add() doesn't add the same node twice" => sub { my $wq = wq($html); my $x = $wq->find('.foo')->add( $wq->find('.foo') ); is $x->size => 1, "only one node"; }; } new.t100644001750001750 44514550254663 14471 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use Test2::V0; use lib 't/lib'; use WQTest; WQTest::test { my $class = shift; subtest 'create an empty $q' => sub { my $new = $class->new; $new = $new->add( '

something

' ); is $new->as_html => '

something

'; }; } META.yml100644001750001750 364114550254663 14542 0ustar00yanickyanick000000000000Web-Query-1.01--- abstract: 'Yet another scraping library like jQuery' author: - 'Tokuhiro Matsuno ' build_requires: Cwd: '0' ExtUtils::MakeMaker: '0' File::Spec: '0' FindBin: '0' IO::Handle: '0' IPC::Open3: '0' Test2::Tools::Exception: '0' Test2::V0: '0' Test::Exception: '0' Test::More: '0' lib: '0' utf8: '0' configure_requires: ExtUtils::MakeMaker: '0' dynamic_config: 0 generated_by: 'Dist::Zilla version 6.030, CPAN::Meta::Converter version 2.150010' license: perl meta-spec: url: http://module-build.sourceforge.net/META-spec-v1.4.html version: '1.4' name: Web-Query provides: Web::Query: file: lib/Web/Query.pm version: '1.01' Web::Query::LibXML: file: lib/Web/Query/LibXML.pm version: '1.01' requires: Exporter: '0' HTML::Entities: '0' HTML::Selector::XPath: '0.20' HTML::TreeBuilder::LibXML: '0' HTML::TreeBuilder::XPath: '0' LWP::UserAgent: '0' List::Util: '1.44' Scalar::Util: '0' parent: '0' perl: '5.008005' strict: '0' warnings: '0' resources: bugtracker: https://github.com/tokuhirom/Web-Query/issues homepage: https://github.com/tokuhirom/Web-Query repository: https://github.com/tokuhirom/Web-Query.git version: '1.01' x_authority: cpan:TOKUHIROM x_contributor_covenant: version: 0.02 x_contributors: - 'Carlos Fernando Avila Gratz ' - 'DQNEO ' - 'Ed Sabol ' - 'Hiroki Honda ' - 'Kang-min Liu ' - 'Maurice Aubrey ' - 'moznion ' - 'Oleg ' - 'Tokuhiro Matsuno ' - 'xaicron ' - 'Yanick Champoux ' - 'Yanick Champoux ' x_generated_by_perl: v5.38.0 x_serialization_backend: 'YAML::Tiny version 1.74' x_spdx_expression: 'Artistic-1.0-Perl OR GPL-1.0-or-later' MANIFEST100644001750001750 203414550254663 14415 0ustar00yanickyanick000000000000Web-Query-1.01CODE_OF_CONDUCT.md CONTRIBUTORS Changes INSTALL LICENSE MANIFEST META.json META.yml Makefile.PL README.mkdn SIGNATURE cpanfile doap.xml lib/Web/Query.pm lib/Web/Query/LibXML.pm t/00-compile.t t/00-report-prereqs.dd t/00-report-prereqs.t 
t/00_compile.t t/01_src.t t/02_op.t t/03_traverse.t t/04_element.t t/05_html5.t t/06_new_from_url_error_handling.t t/07_remove.t t/08_indent.t t/09_as_html.t t/10_subclass.t t/11_get_eq.t t/add.t t/after.t t/append.t t/attr.t t/bad-url-with-options.t t/before.t t/bug-text-contents.t t/class.t t/clone.t t/contents.t t/data/foo.html t/data/html5_snippet.html t/destroy.t t/detach.t t/filter.t t/find.t t/has_class.t t/insert_after.t t/insert_before.t t/lib/My/TreeBuilder.pm t/lib/My/Web/Query.pm t/lib/WQTest.pm t/match_and_not.t t/new.t t/next.t t/next_until.t t/no_space_compacting.t t/node-types.t t/prepend.t t/prev.t t/processing-instructions.t t/remove.t t/remove_class.t t/replace_with.t t/special-attributes.t t/split.t t/store_comments.t t/tagname.t t/xpath.t xt/live/01_simple.t xt/release/unused-vars.t attr.t100644001750001750 153714550254663 14675 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use utf8; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; subtest 'set many attrs at the same time' => sub { my $doc = wq( '
hi
' ); $doc->attr( foo => 1, bar => 'baz', ); is $doc->attr('foo') => 1, 'foo is set'; is $doc->attr('bar') => 'baz', 'bar is set'; }; subtest 'code ref as setter' => sub { my $doc = wq( '
kitten
' ); $doc->find('img')->attr(alt => sub{ $_ ||= 'A picture' }); is [ $doc->find('img')->attr('alt') ], [ 'A picture', 'kitten' ]; } } find.t100644001750001750 133514550254663 14637 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; my $wq = wq(<
Hello
Hello
HTML is $wq->find('.inner')->size, 2, 'find() on multiple tree object'; is wq('1')->find('html')->size, 0, 'find() does not include root elements'; is(wq('
foo
bar
')->find('div')->size, 0); } next.t100644001750001750 234314550254663 14675 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq"}; my $wq = wq(<
Hello
World
Hello
World
HTML my $elem = $wq->find('.d1')->next; is $elem->size, 2; is $elem->attr('class'), 'd2', 'next'; subtest 'next->as_html' => sub { plan tests => 6; $wq = wq( q{
one two three
} ); my @expected = ( [ b => qr/one/ ], [ '#text' => qr/two/ ], [ 'i' => qr/three/ ], ); my $next = $wq->find('b'); while( $next->size ) { my $exp = shift @expected; is $next->tagname => $exp->[0], 'tagname'; like $next->as_html => $exp->[1], 'as_html'; $next = $next->next; }; }; } prev.t100644001750001750 120114550254663 14663 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq"}; my $wq = wq(<
Hello
World
Hello
World
HTML my $elem = $wq->find('.d2')->prev; is $elem->size, 2; is $elem->attr('class'), 'd1', 'previous'; } doap.xml100644001750001750 2665114550254663 14764 0ustar00yanickyanick000000000000Web-Query-1.01 Web-Query Yet another scraping library like jQuery Tokuhiro Matsuno Carlos Fernando Avila Gratz DQNEO Ed Sabol Hiroki Honda Kang-min Liu Maurice Aubrey moznion Oleg Tokuhiro Matsuno xaicron Yanick Champoux Yanick Champoux 0.01 2011-02-19T10:38:22Z 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 2013-04-03T20:24:49Z 0.13 2013-04-05T06:37:27Z 0.14 2013-04-07T02:22:25Z 0.15 2013-04-09T00:29:48Z 0.16 2013-04-22T14:26:44Z 0.17 2013-05-08T01:18:36Z 0.18 2013-05-09T19:40:40Z 0.19 2013-05-12T18:19:57Z 0.20 2013-05-13T22:51:02Z 0.21 2013-05-15T14:36:11Z 0.22 2013-05-15T23:36:38Z 0.23 2013-05-30T16:09:03Z 0.24 2014-02-12T05:34:09Z 0.25 2014-02-13T01:26:42Z 0.26 2014-03-31T08:23:34Z 0.27 2014-12-24T00:52:33Z 0.28 2015-06-30 0.29 2015-08-21 0.30 2015-08-23 0.31 2015-08-25 0.32 2015-08-29 0.33 2015-09-23 0.34 2015-09-23 0.35 2016-05-31 0.36 2016-06-30 0.37 2016-07-02 0.38 2016-07-03 0.39 2018-08-21 1.00 2023-09-06 Perl cpanfile100644001750001750 227414550254663 14776 0ustar00yanickyanick000000000000Web-Query-1.01# This file is generated by Dist::Zilla::Plugin::CPANFile v6.030 # Do not edit this file directly. To change prereqs, edit the `dist.ini` file. 
requires "Exporter" => "0"; requires "HTML::Entities" => "0"; requires "HTML::Selector::XPath" => "0.20"; requires "HTML::TreeBuilder::LibXML" => "0"; requires "HTML::TreeBuilder::XPath" => "0"; requires "LWP::UserAgent" => "0"; requires "List::Util" => "1.44"; requires "Scalar::Util" => "0"; requires "parent" => "0"; requires "perl" => "5.008005"; requires "strict" => "0"; requires "warnings" => "0"; on 'test' => sub { requires "Cwd" => "0"; requires "ExtUtils::MakeMaker" => "0"; requires "File::Spec" => "0"; requires "FindBin" => "0"; requires "IO::Handle" => "0"; requires "IPC::Open3" => "0"; requires "Test2::Tools::Exception" => "0"; requires "Test2::V0" => "0"; requires "Test::Exception" => "0"; requires "Test::More" => "0"; requires "lib" => "0"; requires "utf8" => "0"; }; on 'test' => sub { recommends "CPAN::Meta" => "2.120900"; }; on 'configure' => sub { requires "ExtUtils::MakeMaker" => "0"; }; on 'develop' => sub { requires "Test::More" => "0.96"; requires "Test::Vars" => "0"; requires "utf8" => "0"; }; META.json100644001750001750 620314550254663 14707 0ustar00yanickyanick000000000000Web-Query-1.01{ "abstract" : "Yet another scraping library like jQuery", "author" : [ "Tokuhiro Matsuno " ], "dynamic_config" : 0, "generated_by" : "Dist::Zilla version 6.030, CPAN::Meta::Converter version 2.150010", "license" : [ "perl_5" ], "meta-spec" : { "url" : "http://search.cpan.org/perldoc?CPAN::Meta::Spec", "version" : 2 }, "name" : "Web-Query", "prereqs" : { "configure" : { "requires" : { "ExtUtils::MakeMaker" : "0" } }, "develop" : { "requires" : { "Test::More" : "0.96", "Test::Vars" : "0", "utf8" : "0" } }, "runtime" : { "requires" : { "Exporter" : "0", "HTML::Entities" : "0", "HTML::Selector::XPath" : "0.20", "HTML::TreeBuilder::LibXML" : "0", "HTML::TreeBuilder::XPath" : "0", "LWP::UserAgent" : "0", "List::Util" : "1.44", "Scalar::Util" : "0", "parent" : "0", "perl" : "5.008005", "strict" : "0", "warnings" : "0" } }, "test" : { "recommends" : { "CPAN::Meta" : 
"2.120900" }, "requires" : { "Cwd" : "0", "ExtUtils::MakeMaker" : "0", "File::Spec" : "0", "FindBin" : "0", "IO::Handle" : "0", "IPC::Open3" : "0", "Test2::Tools::Exception" : "0", "Test2::V0" : "0", "Test::Exception" : "0", "Test::More" : "0", "lib" : "0", "utf8" : "0" } } }, "provides" : { "Web::Query" : { "file" : "lib/Web/Query.pm", "version" : "1.01" }, "Web::Query::LibXML" : { "file" : "lib/Web/Query/LibXML.pm", "version" : "1.01" } }, "release_status" : "stable", "resources" : { "bugtracker" : { "web" : "https://github.com/tokuhirom/Web-Query/issues" }, "homepage" : "https://github.com/tokuhirom/Web-Query", "repository" : { "type" : "git", "url" : "https://github.com/tokuhirom/Web-Query.git", "web" : "https://github.com/tokuhirom/Web-Query" } }, "version" : "1.01", "x_authority" : "cpan:TOKUHIROM", "x_contributor_covenant" : { "version" : 0.02 }, "x_contributors" : [ "Carlos Fernando Avila Gratz ", "DQNEO ", "Ed Sabol ", "Hiroki Honda ", "Kang-min Liu ", "Maurice Aubrey ", "moznion ", "Oleg ", "Tokuhiro Matsuno ", "xaicron ", "Yanick Champoux ", "Yanick Champoux " ], "x_generated_by_perl" : "v5.38.0", "x_serialization_backend" : "Cpanel::JSON::XS version 4.37", "x_spdx_expression" : "Artistic-1.0-Perl OR GPL-1.0-or-later" } 02_op.t100644001750001750 143214550254663 14634 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use utf8; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . 
"::wq" }; subtest 'get/set text' => sub { my $q = wq('t/data/foo.html'); $q->find('.foo a')->text('> ok'); is trim($q->find('.foo a')->text()), '> ok'; is trim($q->find('.foo a')->html()), '> ok'; }; subtest 'get/set html' => sub { my $q = wq('t/data/foo.html'); $q->find('.foo')->html('ok'); is trim($q->find('.foo')->html()), 'ok'; }; } sub trim { local $_ = shift; $_ =~ s/[\r\n]+$//; $_ } after.t100644001750001750 117014550254663 15015 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use lib 'lib'; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; my $html = '
Hello
Goodbye
'; is wq($html)->find('.inner')->after('

Test

')->end->as_html, '
Hello

Test

Goodbye

Test

', 'after'; } class.t100644001750001750 321014550254663 15016 0ustar00yanickyanick000000000000Web-Query-1.01/t#!/usr/bin/env perl use strict; use warnings; use Test2::V0; use lib 't/lib'; use WQTest; WQTest::test { my $class = shift; subtest 'toggle_class' => sub { test_toggle_class($class) }; subtest 'add_class' => sub { test_add_class($class) }; }; sub test_toggle_class { my $class = shift; my $q = $class->new(q{ })->find('a'); $q->toggle_class( 'foo' ); is $q->map( sub { $_->has_class('foo') } ), [ undef, 1, undef ]; $q->toggle_class( 'foo', 'bar' ); is $q->map( sub { $_->has_class('foo') } ), [ 1, undef, 1 ]; is $q->map( sub { $_->has_class('bar') } ), [ undef, 1, 1 ]; subtest "double toggling" => sub { $q->toggle_class( 'foo', 'foo' ); is $q->map( sub { $_->has_class('foo') } ), [ undef, 1, undef ]; }; } sub test_add_class { my $class = shift; my $html = '
Hello
Goodbye
'; my $wq = $class->new($html); $wq->find('.inner')->add_class('foo bar inner'); is $wq->as_html, '
Hello
Goodbye
', 'add_class("foo bar inner")'; # add_class(CODE) $wq = $class->new($html); $wq->find('.inner')->add_class(sub{ my ($i, $current, $el) = @_; return "foo-$i bar"; }); is $wq->as_html, '
Hello
Goodbye
', 'add_class(CODE)'; } clone.t100644001750001750 63114550254663 14775 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; my $html = '

Hithereworld

'; is wq($html)->clone->as_html, $html, 'clone'; } split.t100644001750001750 262014550254663 15050 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use lib 't/lib'; use Test2::V0; use Web::Query; use WQTest; my $doc = <<'END';

stuff

alpha

aaa

beta>

gamma

bbb

ccc

END WQTest::test { my $class = shift; subtest 'straight split' => sub { my @splitted = $class->new($doc)->split( 'h1' ); is scalar @splitted => 4; like $splitted[0]->as_html(join => ''), qr/stuff/; like $splitted[1]->as_html(join => ''), qr/alpha.*aaa/s; like $splitted[2]->as_html(join => ''), qr/beta/; like $splitted[3]->as_html(join => ''), qr/gamma.*ccc/s; }; subtest 'split in pairs' => sub { my @splitted = $class->new($doc)->split( 'h1', pairs => 1 ); is scalar @splitted => 4; like $splitted[0][1]->as_html(join => ''), qr/stuff/; like $splitted[1][0]->as_html(join => ''), qr/alpha/; like $splitted[1][1]->as_html(join => ''), qr/aaa/; }; subtest 'skip leading' => sub { my @splitted = $class->new($doc)->split( 'h1', pairs => 1, skip_leading => 1 ); is scalar @splitted => 3; like $splitted[0][0]->as_html( join => '' ), qr/alpha/; like $splitted[0][1]->as_html( join => '' ), qr/aaa/; }; } xpath.t100644001750001750 76414550254663 15030 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use Test2::V0; my @modules = qw/ Web::Query Web::Query::LibXML /; plan tests => scalar @modules; for my $module ( @modules ) { subtest $module => sub { eval "require $module; 1" or plan skip_all => "couldn't load $module"; my $wq = $module->new_from_html(<<'END');

hello

there

END is $wq->find('b')->html => 'hello', 'css'; is $wq->find('//b')->text => 'hello', 'xpath'; }; } SIGNATURE100644001750001750 1524014550254663 14573 0ustar00yanickyanick000000000000Web-Query-1.01This file contains message digests of all files listed in MANIFEST, signed via the Module::Signature module, version 0.88. To verify the content in this distribution, first make sure you have Module::Signature installed, then type: % cpansign -v It will check each file's integrity, as well as the signature's validity. If "==> Signature verified OK! <==" is not displayed, the distribution may already have been compromised, and you should not run its Makefile.PL or Build.PL. -----BEGIN PGP SIGNED MESSAGE----- Hash: RIPEMD160 SHA256 4953b895d5e03b9f89aab99e0879c21e159378b2014c1708823dd3d49c809fbf CODE_OF_CONDUCT.md SHA256 42c0f4cd5024bd7ce6f6771240cc0e73825ca21e1322e05a02d96125953bd680 CONTRIBUTORS SHA256 fefb64383dd553c03b0e8db6b4bf9506ab48f96ba603f70597814e2570372480 Changes SHA256 25d649e2072feec11a92062b8607ae149cfd2cd41a2fcc9342388ffa28c67b52 INSTALL SHA256 c5100e5146ea8a8fb71b163fcdb078e2cc6c00f6bea47b70e087a5c0e2c31856 LICENSE SHA256 dfdb52c6fab4fc8c2c3d509a3c5c89a12afec2a8e30a87e90707237ee7b70fc8 MANIFEST SHA256 1f4ce5df29e7ab583a361471ba29b321f6bd560b0a9505f158f84349809f2fda META.json SHA256 15227f914990b77aa42a91671fda8e1ddfda5346e26e1862cedda1bcad3f4d1d META.yml SHA256 2b8146da7b6db690ace3fad85fffd003c7bf9d18ad9a461df8bad2a49f7b14a1 Makefile.PL SHA256 12faf6f89ee0e12ce4df4c8b98c7b36a4ebe81a04376aee836ee1c531c4710fa README.mkdn SHA256 9a3c8d8b2d2713a87a54a4f11eb94d414225b7123f2fa3de91772f9e7d4f11c4 cpanfile SHA256 1c2f26aede7b0ee23013cbdc8d87e612ed1f8acb56be23e90f2c9c2b469341da doap.xml SHA256 765a3dd84e88dac6a8df228b6163a61206e63ea8b519e1f00011ed94f2060ea3 lib/Web/Query.pm SHA256 c2868bf4b409919c9f0b05b745c29418fca23c9c7ea20e203110ecbc81be9395 lib/Web/Query/LibXML.pm SHA256 34d97f11053e1e5054565fe37ebafc339718f5e10fcb624dba2ef62f2ece8ec6 t/00-compile.t SHA256 
d3b40201a2f7695ceeed9cc7e2dc32dff611251dc3fa879f974771c94851c5e0 t/00-report-prereqs.dd SHA256 5996417e8ae9973f82860dcf6d5f180d285d5c4b046e0346d06ed22802a3452b t/00-report-prereqs.t SHA256 0e9fbb92090bd0df7a20da47f7079aa24834465cee601c31d46c40026c872233 t/00_compile.t SHA256 57a9df21b88b775b6875178eec1cbab9b6d221d991547af91d4cc6ed6f927887 t/01_src.t SHA256 92b93b31a87de9d388cd12e92d7d4dd3042dfb5746a65c0bf86324e7fa30667b t/02_op.t SHA256 c139102cee2be9701b818913d1a4482fcfa9bc612088eefa6947280d2d2f3d5a t/03_traverse.t SHA256 d811a6f4069b91eb863f0ae59e204bc33dedf90d7e453454ca5c5aa57f38700f t/04_element.t SHA256 dd2e704866fb6ea4ccb7acc542a08d1da7aad25eee088dadc617e21e67287b02 t/05_html5.t SHA256 53adb741b2d2eb1c589a013e2851addd3028c94c9db201f9364bfbe21afdbd40 t/06_new_from_url_error_handling.t SHA256 eaa771d6fa5be582f2d00adf168ee19365516f153bf2c9f72ad23177b052ce30 t/07_remove.t SHA256 728966c628579a553960ffdd74d27331cfe2650941e4b3c4417743911c6986df t/08_indent.t SHA256 3fdffb16161cbf75a0e5dd93d446e5de2d1b7eeb5338d10b6ecc32df6d20633c t/09_as_html.t SHA256 36a2f7c9723335b1a8c52f58ea8cb528e675a6bf215b4d987d53fb4b70d7d957 t/10_subclass.t SHA256 47fbd77abb12ad87aa5ace266f8aa68e94ee7d14fdd3c32ab51e151fa1475297 t/11_get_eq.t SHA256 39e986347852575afd76697bc19c11068cb152d073dfcdcb93538aff99c2ddcb t/add.t SHA256 5634c5620e4a961d96ac36ff29a07d50b5b7c128e17b40937ba906d0bab1915d t/after.t SHA256 1a012475ad5589fcbb755b05847193ab0b9b1ea34af55ac3a7d17acede2f63ef t/append.t SHA256 d42aadd39a3403b180ad91f90dbb65062d72398cf3e84ea237960aae876e80e4 t/attr.t SHA256 a09f575e0364a40aa95b6abe61af4e679f5cb497ddde6891adc259c00c1a7a6a t/bad-url-with-options.t SHA256 1e45295ca961ce5d8ed1df1eb153d858adb7a2895b394888b9fe4c70a7c4dccf t/before.t SHA256 43875fde5a056b42189fad8bb60d9eaba9443197692eb14913ecea26f8c1124c t/bug-text-contents.t SHA256 fa277ef23c5d0db7b0d1a701d701a0ae83a7dac4f93ce16c42c6a4d731abd46f t/class.t SHA256 c8a7159f03175830412da8def86ca34f992ab6be73b9757867c09bfd589542b9 t/clone.t 
SHA256 6f2a1cd41dd8d90dfdb87c37b8079fbea17647361b3c4e651f4785fe54490800 t/contents.t SHA256 52cf9de96d6db2848e4fbbc4e61ba0e0e0cc30a32a1dc25c2d9fdbeff2631a02 t/data/foo.html SHA256 afad79a2bcd7b17fc44ed19644cec46bdf4a8f8fe2500380c22945bf911393c6 t/data/html5_snippet.html SHA256 d192cd11dfaebc29c38250ae2e1c099dafb75bd1ad02af9d66421254dc22147e t/destroy.t SHA256 175726e726b623ab803c7d5769fcdb566fda7641332b2e7fa9d571c83815beac t/detach.t SHA256 3b118eb840aec89ffd95ca159d41aaca069f762eca21605d35647d6910a80b28 t/filter.t SHA256 05102f4a7ea07f107dba7749601ac6d832c0d3604d06385a6652d464d5917fc6 t/find.t SHA256 d9e9d4b27e4262f8888fa62a0d7d9276ede1891df31ca5e80c275cdb8d633938 t/has_class.t SHA256 b2fe3dee358f7f7c515d7a257995c1d1102aa1b98391c61664ede13f26f170b6 t/insert_after.t SHA256 62192f496708ba4c9a156fb0d0bc24ebaf72d69b539af956485c6cb990c6b518 t/insert_before.t SHA256 7ce0d93221f4ffd8a768fbcc786cebdc26c784d88f206a623892e79159fa62a2 t/lib/My/TreeBuilder.pm SHA256 945fbf16cce0db8de2f976e7f2c6458f6d5bb633fae8c45d8767801dbaa50a0f t/lib/My/Web/Query.pm SHA256 6ad87e131a9ed004c4ce56e397364fd6bb25888ec3dd156c031dedacd8098ec1 t/lib/WQTest.pm SHA256 b17774390903352f1eb88eb07f16a468a796c75888749eb4d998d970307ba017 t/match_and_not.t SHA256 d9d1876dab017d7abef7d388c4020e840c9214e697107b1aecb50b5b400be051 t/new.t SHA256 cb5dad345b79bcceccebe822f30e9759f191886964ff61d11f03f206b329843b t/next.t SHA256 1da754f1b9ca40b0b3960173024e8a8b66e30e6aae6e586cdf8aee02a8d101cb t/next_until.t SHA256 4edb8d48932cf19dcaab0c5e1a899b7a07fe7041997555f1f8452097a4af7dbf t/no_space_compacting.t SHA256 38b3a00c2ce73b304a0ad7c4bac8a26004af3d76c2bd46a0bce306c59de84ef7 t/node-types.t SHA256 b4b5e499a7e30d77829b48a7d5dd4c2c27aeacacd6cf90c345239b283c3cd691 t/prepend.t SHA256 3c154fa9ea2089320a8263c226ab53d7a7fd3689ac56b56c14a56a60cb0d6574 t/prev.t SHA256 3388c6c1f0a3e1a73f6b860baa9f35ba80c7feb58517b41f4c86daa6885a49b7 t/processing-instructions.t SHA256 
3e7f4ef578718421e57baa5ce739bb8d9f8448c0d9143386aa60667384fe04ff t/remove.t SHA256 c9a238c9fc5bf0b634bdca366f70352af4528baee26783b63bb398cddc6f9c63 t/remove_class.t SHA256 86d424b5b226329016be9a6720bf35e6a76798771db6cab7fd0990c498573c4c t/replace_with.t SHA256 794efeb6385dcae9b83f04099d1a48c1d3c1c91cf688927f48d84ff815d90c5d t/special-attributes.t SHA256 7ad00cc5d22713fcd8c4e1047c31a353b7d5d107e4aa5b5e0d4487fde8c7dd90 t/split.t SHA256 6be3113f6ef331115275528c5d5ac0439f352fb1358b58d2581520004446a4f3 t/store_comments.t SHA256 cd1dd204f3f430e06b99eaca66386113ecc19806dfd729dc6ba35fcf869eedb7 t/tagname.t SHA256 1bd74319a4463d69a344c80d6821af2ef4578d12fd3286df2b035fb6e1b3af33 t/xpath.t SHA256 73c975b52de8cc03f2dba445072bdb0f78ae0e9f8fa23e2b5d98a7d403f94425 xt/live/01_simple.t SHA256 d34f839d8340478663bd6ae59c72760d7e1a2b7a2d4879ad1ec97db1a2271b2f xt/release/unused-vars.t -----BEGIN PGP SIGNATURE----- iF0EAREDAB0WIQS4RMr2LZlyA/IlbHHfgfB/4bALjAUCZaFZswAKCRDfgfB/4bAL jEnjAKDidGFUHWxF3YRhFig5XPZxbsc46wCaA/qAV7ZnSd8LANMh8UEvL2IUcdg= =V1Bq -----END PGP SIGNATURE----- 01_src.t100644001750001750 376414550254663 15016 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use utf8; use Test2::V0; use Cwd (); use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; subtest 'from file' => sub { plan tests => 5; run_tests(wq('t/data/foo.html')); }; is wq('t/data/html5_snippet.html')->size, 3, 'snippet from file'; subtest 'from url' => sub { plan tests => 5; run_tests(wq('file://' . 
Cwd::abs_path('t/data/foo.html'))); }; subtest 'from treebuilder' => sub { plan tests => 5; my $tree = HTML::TreeBuilder::XPath->new_from_file('t/data/foo.html'); run_tests(wq($tree)); }; subtest 'from Array[treebuilder]' => sub { plan tests => 5; my $tree = HTML::TreeBuilder::XPath->new_from_file('t/data/foo.html'); run_tests(wq([$tree])); }; subtest 'from html' => sub { plan tests => 5; open my $fh, '<', 't/data/foo.html'; my $html = do { local $/; <$fh> }; run_tests(wq($html)); }; subtest 'from Web::Query object' => sub { plan tests => 5; my $tree = HTML::TreeBuilder::XPath->new_from_file('t/data/foo.html'); run_tests(wq(wq($tree))); }; if (eval "require URI; 1;") { subtest 'from URI' => sub { plan tests => 5; run_tests(wq(URI->new('file://' . Cwd::abs_path('t/data/foo.html')))); }; } } sub run_tests { $_[0]->find('.foo')->find('a')->each(sub { is $_->text, 'foo!'; is $_->attr('href'), '/foo'; }) ->end()->end() ->find('.bar')->find('a')->each(sub { is $_->text, 'bar!'; is $_->attr('href'), '/bar'; $_->attr('href' => '/bar2'); note $_->html; }); like $_[0]->html, qr{href="/bar2"}; } append.t100644001750001750 121314550254663 15161 0ustar00yanickyanick000000000000Web-Query-1.01/t#!/usr/bin/env perl use strict; use warnings; use lib 'lib'; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; my $html = '
Hello
Goodbye
'; is wq($html)->find('.inner')->append('

Test

')->end->as_html, '
Hello

Test

Goodbye

Test

', 'append'; }before.t100644001750001750 121714550254663 15160 0ustar00yanickyanick000000000000Web-Query-1.01/t#!/usr/bin/env perl use strict; use warnings; use lib 'lib'; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; my $html = '
Hello
Goodbye
'; is wq($html)->find('.inner')->before('

Test

')->end->as_html, '

Test

Hello

Test

Goodbye
', 'before'; }detach.t100644001750001750 145414550254663 15151 0ustar00yanickyanick000000000000Web-Query-1.01/t#!/usr/bin/env perl use strict; use warnings; use lib 'lib'; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; my $wq = wq('

Hello

Goodbye

'); my $detached = $wq->find('.inner')->detach; is join('', $detached->as_html), '

Hello

Goodbye

', 'detach - retval'; is $wq->as_html, '
', 'detach - original object modified'; is $detached->find('p')->size, 2, 'find() works on detached elements'; } filter.t100644001750001750 176614550254663 15214 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use Test2::V0; use Test::Exception; use lib 't/lib'; use WQTest; WQTest::test { my $class = shift; my $html = <
Hello
Hello
HTML my $q = $class->new($html); subtest "selector" => sub { is $q->filter('span')->size, 0; is $q->filter('div.container')->size, 1; is $q->filter('div')->size, 2; }; subtest coderef => sub { is $q->size, 2; is $q->filter(sub { $_->has_class( 'container' ) } )->size, 1; # 'filter' on a coderef was modifying the parent element is $q->size, 2, 'still two elements'; }; subtest on_text => sub { on_text($class) }; }; sub on_text { my $class = shift; my $wq = $class->new('

bar

Standalone Text'); lives_ok { $wq->filter('.foo') }, "doesn't explode on text nodes"; } remove.t100644001750001750 142114550254663 15210 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use Test2::V0; use lib 't/lib'; use WQTest; WQTest::test { my $class = shift; my $wq = $class->new_from_html( '
', { indent => "\t" } ); $wq->find('foo')->remove; is $wq->as_html => '
'; for my $method ( qw/ each map / ) { subtest $method => sub { plan tests => 5; my $wq = new_wq($class); $wq->find('p')->$method(sub{ pass "deleting " . $_->text; $_->remove; }); is $wq->find('p')->size => 0, "all deleted"; }; } }; sub new_wq { shift->new(<<'END');

one

two

three

four

END } Makefile.PL100644001750001750 404414550254663 15241 0ustar00yanickyanick000000000000Web-Query-1.01# This file was automatically generated by Dist::Zilla::Plugin::MakeMaker v6.030. use strict; use warnings; use 5.008005; use ExtUtils::MakeMaker; my %WriteMakefileArgs = ( "ABSTRACT" => "Yet another scraping library like jQuery", "AUTHOR" => "Tokuhiro Matsuno ", "CONFIGURE_REQUIRES" => { "ExtUtils::MakeMaker" => 0 }, "DISTNAME" => "Web-Query", "LICENSE" => "perl", "MIN_PERL_VERSION" => "5.008005", "NAME" => "Web::Query", "PREREQ_PM" => { "Exporter" => 0, "HTML::Entities" => 0, "HTML::Selector::XPath" => "0.20", "HTML::TreeBuilder::LibXML" => 0, "HTML::TreeBuilder::XPath" => 0, "LWP::UserAgent" => 0, "List::Util" => "1.44", "Scalar::Util" => 0, "parent" => 0, "strict" => 0, "warnings" => 0 }, "TEST_REQUIRES" => { "Cwd" => 0, "ExtUtils::MakeMaker" => 0, "File::Spec" => 0, "FindBin" => 0, "IO::Handle" => 0, "IPC::Open3" => 0, "Test2::Tools::Exception" => 0, "Test2::V0" => 0, "Test::Exception" => 0, "Test::More" => 0, "lib" => 0, "utf8" => 0 }, "VERSION" => "1.01", "test" => { "TESTS" => "t/*.t" } ); my %FallbackPrereqs = ( "Cwd" => 0, "Exporter" => 0, "ExtUtils::MakeMaker" => 0, "File::Spec" => 0, "FindBin" => 0, "HTML::Entities" => 0, "HTML::Selector::XPath" => "0.20", "HTML::TreeBuilder::LibXML" => 0, "HTML::TreeBuilder::XPath" => 0, "IO::Handle" => 0, "IPC::Open3" => 0, "LWP::UserAgent" => 0, "List::Util" => "1.44", "Scalar::Util" => 0, "Test2::Tools::Exception" => 0, "Test2::V0" => 0, "Test::Exception" => 0, "Test::More" => 0, "lib" => 0, "parent" => 0, "strict" => 0, "utf8" => 0, "warnings" => 0 ); unless ( eval { ExtUtils::MakeMaker->VERSION(6.63_03) } ) { delete $WriteMakefileArgs{TEST_REQUIRES}; delete $WriteMakefileArgs{BUILD_REQUIRES}; $WriteMakefileArgs{PREREQ_PM} = \%FallbackPrereqs; } delete $WriteMakefileArgs{CONFIGURE_REQUIRES} unless eval { ExtUtils::MakeMaker->VERSION(6.52) }; WriteMakefile(%WriteMakefileArgs); README.mkdn100644001750001750 
3577314550254663 15134 0ustar00yanickyanick000000000000Web-Query-1.01# NAME Web::Query - Yet another scraping library like jQuery # VERSION version 1.01 # SYNOPSIS ```perl use Web::Query; wq('http://www.w3.org/TR/html401/') ->find('div.head dt') ->each(sub { my $i = shift; printf("%d %s\n", $i+1, $_->text); }); ``` # DESCRIPTION Web::Query is yet another scraping framework, with a jQuery-like interface. Yes, I know Ingy's [pQuery](https://metacpan.org/pod/pQuery). But it's just alpha quality. It doesn't work. Web::Query is built on top of the CPAN modules [HTML::TreeBuilder::XPath](https://metacpan.org/pod/HTML%3A%3ATreeBuilder%3A%3AXPath), [LWP::UserAgent](https://metacpan.org/pod/LWP%3A%3AUserAgent), and [HTML::Selector::XPath](https://metacpan.org/pod/HTML%3A%3ASelector%3A%3AXPath). So, this module uses [HTML::Selector::XPath](https://metacpan.org/pod/HTML%3A%3ASelector%3A%3AXPath) and only supports the CSS 3 selectors supported by that module. Web::Query doesn't support jQuery's extended queries (yet?). If a selector is passed as a scalar ref, it'll be taken as a straight XPath expression. ``` wq( '

hello

there

' )->find( 'p' ); # css selector wq( '

hello

there

' )->find( \'/div/p' ); # xpath selector ``` **THIS LIBRARY IS UNDER DEVELOPMENT. ANY API MAY CHANGE WITHOUT NOTICE**. # FUNCTIONS - `wq($stuff)` This is a shortcut for `Web::Query->new($stuff)`. This function is exported by default. # METHODS ## CONSTRUCTORS - my $q = Web::Query->new($stuff, \\%options ) Create a new instance of Web::Query. You can make the instance from a URL (http, https, or file scheme), an HTML string, a URL string, a [URI](https://metacpan.org/pod/URI) object, `undef`, or either one [HTML::Element](https://metacpan.org/pod/HTML%3A%3AElement) object or an array ref of them. ``` # all valid creators $q = Web::Query->new( 'http://techblog.babyl.ca' ); $q = Web::Query->new( '

foo

' ); $q = Web::Query->new( undef ); ``` This method throws an exception on unknown $stuff. As of version 1.00 (see Changes), it also throws when it fails to retrieve a URL; earlier versions returned an undefined value on a non-successful response. Currently, the only two valid options are _indent_, which will be used as the indentation string if the object is printed, and _no\_space\_compacting_, which will prevent the compaction of whitespace characters in text blocks. - my $q = Web::Query->new\_from\_element($element: HTML::Element) Create a new instance of Web::Query from an instance of [HTML::Element](https://metacpan.org/pod/HTML%3A%3AElement). - `my $q = Web::Query->new_from_html($html: Str)` Create a new instance of Web::Query from HTML. - my $q = Web::Query->new\_from\_url($url: Str) Create a new instance of Web::Query from a URL. If the response is not a success (success meaning a status code matching /^20\[0-9\]$/), this method returned an undefined value before version 1.00; per the Changes entry for 1.00, a failed retrieval now throws instead. You can get the last response via `$Web::Query::RESPONSE`. Here is a best-practice example: ```perl my $url = 'http://example.com/'; my $q = Web::Query->new_from_url($url) or die "Cannot get a resource from $url: " . Web::Query->last_response()->status_line; ``` - my $q = Web::Query->new\_from\_file($file\_name: Str) Create a new instance of Web::Query from a file name. ## TRAVERSING ### add Returns a new object augmented with the new element(s). - add($html) An HTML fragment to add to the set of matched elements. - add(@elements) One or more @elements to add to the set of matched elements. @elements that are already part of the set are not added a second time. ```perl my $group = $wq->find('#foo'); # collection has 1 element $group = $group->add( '#bar', $wq ); # 2 elements $group->add( '#foo', $wq ); # still 2 elements ``` - add($wq) An existing Web::Query object to add to the set of matched elements. - add($selector, $context) $selector is a string representing a selector expression to find additional elements to add to the set of matched elements.
$context is the point in the document at which the selector should begin matching. ### contents Get the immediate children of each element in the set of matched elements, including text and comment nodes. ### each Visit each node. `$i` is a zero-based counter. `$elem` is the current item. `$_` is localized to `$elem`. ```perl $q->each(sub { my ($i, $elem) = @_; ... }) ``` ### end Return to the previous context, like jQuery. ### filter Reduce the elements to those that pass the function's test. ```perl $q->filter(sub { my ($i, $elem) = @_; ... }) ``` ### find Get the descendants of each element in the current set of matched elements, filtered by a selector. ```perl my $q2 = $q->find($selector); # $selector is a CSS3 selector. ``` **NOTE** If you want to match the element itself, use ["filter"](#filter). **INCOMPATIBLE CHANGE** From v0.14 to v0.19 (inclusive) find() also matched the element itself, which is not jQuery compatible. You can achieve that result using `filter()`, `add()` and `find()`: ```perl my $wq = wq('

bar

'); # needed because we don't have a global document like jQuery does print $wq->filter('.foo')->add($wq->find('.foo'))->as_html; #

bar

bar

``` ### first Return the first matching element. This method constructs a new Web::Query object from the first matching element. ### last Return the last matching element. This method constructs a new Web::Query object from the last matching element. ### match($selector) Returns a boolean indicating whether the elements match the `$selector`. In scalar context, returns only the boolean for the first element. For the reverse of `not()`, see `filter()`. ### not($selector) Returns all the elements not matching the `$selector`. ```perl # $do_for_love will be every thing, except #that my $do_for_love = $wq->find('thing')->not('#that'); ``` ### and\_back Add the previous set of elements to the current one. ``` # get the h1 plus everything until the next h1 $wq->find('h1')->next_until('h1')->and_back; ``` ### map Creates a new array with the results of calling a provided function on every element. ```perl $q->map(sub { my ($i, $elem) = @_; ... }) ``` ### parent Get the parent of each element in the current set of matched elements. ### prev Get the previous node of each element in the current set of matched elements. ```perl my $prev = $q->prev; ``` ### next Get the next node of each element in the current set of matched elements. ```perl my $next = $q->next; ``` ### next\_until( $selector ) Get all subsequent siblings, up to (but not including) the next node matching `$selector`. ## MANIPULATION ### add\_class Adds the specified class(es) to each of the set of matched elements. ``` # add class 'foo' to

elements wq('

foo

bar

')->find('p')->add_class('foo'); ``` ### toggle\_class( @classes ) Toggles the given class or classes on each of the elements. That is, if an element has the class, it is removed, and if it doesn't, it is added. Classes are toggled once, no matter how many times they appear in the argument list. ``` $q->toggle_class( 'foo', 'foo', 'bar' ); # equivalent to $q->toggle_class('foo')->toggle_class('bar'); # and not $q->toggle_class('foo')->toggle_class('foo')->toggle_class('bar'); ``` ### after Insert content, specified by the parameter, after each element in the set of matched elements. ``` wq('

<div><p>foo</p></div>')->find('p') ->after('<b>bar</b>') ->end ->as_html; # <div><p>foo</p><b>bar</b></div>
``` The content can be anything accepted by ["new"](#new). ### append Insert content, specified by the parameter, to the end of each element in the set of matched elements. ``` wq('
<div></div>')->append('<p>foo</p>')->as_html; # <div><p>foo</p></div>
``` The content can be anything accepted by ["new"](#new). ### as\_html Returns the string representations of either the first or all elements, depending if called in list or scalar context. If given an argument `join`, the string representations of the elements will be concatenated with the given string. ```perl wq( '

<div><p>foo</p><p>bar</p></div>' ) ->find('p') ->as_html( join => '!' ); # <p>foo</p>!<p>bar</p>

``` ### ` attr ` Get/set attribute values. In getter mode, it'll return either the values of the attribute for all elements of the set, or only the first one, depending on the calling context. ```perl my @values = $q->attr('style'); # style of all elements my $first_value = $q->attr('style'); # style of first element ``` In setter mode, it'll set the attribute values for all elements, and return the original object for easy chaining. ```perl $q->attr( 'alt' => 'a picture' )->find( ... ); # can pass more than one attribute too $q->attr( alt => 'a picture', src => 'file:///...' ); ``` The value passed for an attribute can be a code ref. In that case, the code will be called with `$_` set to the current attribute value. If the code modifies `$_`, the attribute will be updated with the new value. ```perl $q->attr( alt => sub { $_ ||= 'A picture' } ); ``` ### ` id ` Get/set the elements' id attribute. In getter mode, it behaves just like `attr()`. In setter mode, it behaves like `attr()`, but with the following exceptions. If the attribute value is a scalar, it'll only be assigned to the first element of the set (as ids are supposed to be unique), and the returned object will only contain that first element. ```perl my $first_element = $q->id('the_one'); ``` It's possible to set the ids of all the elements by passing a sub to `id()`. The sub is given the same arguments as for `each()`, and its return value is taken to be the new id of the elements. ```perl $q->id( sub { my $i = shift; 'foo_' . $i } ); ``` ### ` name ` Get/set the elements' 'name' attribute. ```perl my $name = $q->name; # equivalent to $q->attr( 'name' ); $q->name( 'foo' ); # equivalent to $q->attr( name => 'foo' ); ``` ### ` data ` Get/set the elements' 'data-\*name\*' attributes. ```perl my $data = $q->data('foo'); # equivalent to $q->attr( 'data-foo' ); $q->data( 'foo' => 'bar' ); # equivalent to $q->attr( 'data-foo' => 'bar' ); ``` ### tagname Get/Set the tag name of elements.
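
As a rough sketch of both modes (assuming Web::Query is installed; the markup and variable names below are illustrative and not taken from the original documentation), the getter returns one tag name per matched element in list context, and the setter renames elements in place:

```perl
use strict;
use warnings;
use Web::Query;

# Illustrative markup, not taken from the distribution's docs.
my $q = wq('<div><p>hello</p><p>there</p></div>');

# Getter, list context: one tag name per matched element.
my @names = $q->find('p')->tagname;

# Getter, scalar context: tag name of the first matched element only.
my $first = $q->find('p')->tagname;

# Setter: rename every matched <p> element to <q>.
$q->find('p')->each( sub { $_->tagname('q') } );

# Serialize the modified tree.
my $html = $q->as_html;
print "$html\n";
```

Calling `tagname($new_name)` directly on the whole set should have the same effect, since the setter is applied to every element; the `each` form simply mirrors the distribution's own `tagname.t` test.
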
```perl my $name = $q->tagname; $q->tagname($new_name); ``` ### before Insert content, specified by the parameter, before each element in the set of matched elements. ``` wq('

<div><p>foo</p></div>')->find('p') ->before('<b>bar</b>') ->end ->as_html; # <div><b>bar</b><p>foo</p></div>

``` The content can be anything accepted by ["new"](#new). ### clone Create a deep copy of the set of matched elements. ### detach Remove the set of matched elements from the DOM. ### has\_class Determine whether any of the matched elements are assigned the given class. ### ` html ` Get/Set the innerHTML. ```perl my @html = $q->html(); my $html = $q->html(); # 1st matching element only $q->html('

<p>foo</p>

'); ``` ### insert\_before Insert every element in the set of matched elements before the target. ### insert\_after Insert every element in the set of matched elements after the target. ### ` prepend ` Insert content, specified by the parameter, to the beginning of each element in the set of matched elements. ### remove Delete the elements associated with the object from the DOM. ``` # remove all <blink> tags from the document $q->find('blink')->remove; ``` ### remove\_class Remove a single class, multiple classes, or all classes from each element in the set of matched elements. ### replace\_with Replace the elements of the object with the provided replacement. The replacement can be a string, a `Web::Query` object or an anonymous function. The anonymous function is passed the index of the current node and the node itself (which is also localized as `$_`). ```perl my $q = wq( '<p><b>Abra</b><i>cada</i><u>bra</u></p>' ); $q->find('b')->replace_with('<a>Ocus</a>'); # <p><a>Ocus</a><i>cada</i><u>bra</u></p> $q->find('u')->replace_with($q->find('b')); # <p><i>cada</i><b>Abra</b></p> $q->find('i')->replace_with(sub{ my $name = $_->text; return "<$name></$name>"; }); # <p><b>Abra</b><cada></cada><u>bra</u></p>
``` ### size Return the number of elements in the Web::Query object. ``` wq('

<div><p>foo</p><p>bar</p></div>

')->find('p')->size; # 2 ``` ### text Get/Set the text. ```perl my @text = $q->text(); my $text = $q->text(); # 1st matching element only $q->text('text'); ``` If called in scalar context, it only returns the text of the first element. ## OTHERS - Web::Query->last\_response() Returns the last HTTP response generated by `new_from_url()`. # HOW DO I CUSTOMIZE USER AGENT? You can specify your own instance of [LWP::UserAgent](https://metacpan.org/pod/LWP%3A%3AUserAgent). ```perl $Web::Query::UserAgent = LWP::UserAgent->new( agent => 'Mozilla/5.0' ); ``` # FAQ AND TROUBLESHOOTING ## How to find XML processing instructions in a document? It's possible with [Web::Query::LibXML](https://metacpan.org/pod/Web%3A%3AQuery%3A%3ALibXML) and by using an XPath expression with `find()`: ``` # find the xml-stylesheet processing instructions $q->find(\"//processing-instruction('xml-stylesheet')"); ``` However, note that the support for processing instructions in [HTML::TreeBuilder::LibXML::Node](https://metacpan.org/pod/HTML%3A%3ATreeBuilder%3A%3ALibXML%3A%3ANode) is sketchy, so there are methods like `attr()` that won't work. ## Can't get the content of script elements The <script> tag is treated differently by [HTML::TreeBuilder](https://metacpan.org/pod/HTML%3A%3ATreeBuilder), the parser used by Web::Query. To retrieve the content, you can use either the method `html()` (with the caveat that the content will be escaped), or use [Web::Query::LibXML](https://metacpan.org/pod/Web%3A%3AQuery%3A%3ALibXML), which parses the 'script' element differently. ```perl my $node = "<script>var x = '<p>foo</p>';</script>"; say Web::Query::wq( $node )->text; # nothing is printed! say Web::Query::wq( $node )->html; # var x = '<p>foo</p>'; say Web::Query::LibXML::wq( $node )->text; # var x = '

<p>foo</p>

'; say Web::Query::LibXML::wq( $node )->html; # var x = '<p>foo</p>'; ``` # INCOMPATIBLE CHANGES - 0.10 new\_from\_url() no longer throws an exception on a bad response from the HTTP server. # AUTHOR Tokuhiro Matsuno <tokuhirom AAJKLFJEF@ GMAIL COM> # SEE ALSO - [pQuery](https://metacpan.org/pod/pQuery) - [XML::LibXML::jQuery](https://metacpan.org/pod/XML%3A%3ALibXML%3A%3AjQuery) # LICENSE Copyright (C) Tokuhiro Matsuno This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself. # BUGS Please report any bugs or feature requests on the bugtracker website [https://github.com/tokuhirom/Web-Query/issues](https://github.com/tokuhirom/Web-Query/issues) When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature. destroy.t100644001750001750 40314550254663 15363 0ustar00yanickyanick000000000000Web-Query-1.01/t#!/usr/bin/env perl use strict; use warnings; use lib 'lib'; use Test2::V0; use Web::Query (); my $wq = Web::Query->new(''); local $@ = 'foo'; $wq->DESTROY; is $@, 'foo', 'eval error string should not be clobbered'; done_testing; prepend.t100644001750001750 121714550254663 15353 0ustar00yanickyanick000000000000Web-Query-1.01/t#!/usr/bin/env perl use strict; use warnings; use lib 'lib'; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; my $html = '
Hello
Goodbye
'; is wq($html)->find('.inner')->prepend('

Test

')->end->as_html, '

Test

Hello

Test

Goodbye
', 'prepend'; }tagname.t100644001750001750 102314550254663 15325 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use Test2::V0; my @modules = qw/ Web::Query Web::Query::LibXML /; plan tests => scalar @modules; for my $module ( @modules ) { subtest $module => sub { eval "require $module; 1" or plan skip_all => "couldn't load $module"; my $wq = $module->new_from_html(<<'END');

hello

there

END $wq->find('p')->each(sub{ $_->tagname('q') }); is $wq->as_html, '
hellothere
', 'p -> q'; }; } CONTRIBUTORS100644001750001750 73514550254663 15132 0ustar00yanickyanick000000000000Web-Query-1.01 # WEB-QUERY CONTRIBUTORS # This is the (likely incomplete) list of people who have helped make this distribution what it is, either via code contributions, patches, bug reports, help with troubleshooting, etc. A huge 'thank you' to all of them. * Carlos Fernando Avila Gratz * DQNEO * Ed Sabol * Hiroki Honda * Kang-min Liu * Maurice Aubrey * moznion * Oleg * Tokuhiro Matsuno * xaicron * Yanick Champoux * Yanick Champoux 05_html5.t100644001750001750 61014550254663 15227 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use utf8; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; is(wq('
foo
')->find('header')->first->text, 'foo'); } contents.t100644001750001750 121214550254663 15546 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; my $html = "

foo

bar

baz
"; is join('|', wq($html)->contents->as_html), '

foo

|

bar

|baz', 'contents()'; is join('|', wq($html)->contents('p')->as_html), '

foo

|

bar

', 'contents("p")'; is wq('

foo

')->contents->as_html => 'foo'; } 07_remove.t100644001750001750 323114550254663 15517 0ustar00yanickyanick000000000000Web-Query-1.01/t# -*- perl -*- use strict; use warnings; use utf8; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; subtest "remove and size" => sub { my $q = wq('t/data/foo.html'); $q->find('.foo')->remove(); is $q->find('.foo')->size() => 0, "all .foo are removed and cannot be found."; }; subtest "remove and html" => sub { my $q = wq('t/data/foo.html'); $q->find('.foo, .bar')->remove(); my $result = $q->html; $result =~ s/\s//g; like $result, qr{^test1\s*$}, ".foo and .bar are removed and not showing in html"; }; subtest "\$q->remove->end->html" => sub { my $q = wq('t/data/foo.html'); my $result = $q->find('.foo, .bar')->remove->end->html; $result =~ s/\s//g; like( $result, qr{^test1$}, "The chaining works." ); }; subtest "remove root elements" => sub { my $q = wq('t/data/foo.html'); $q->remove; is $q->size, 0, "size 0 after remove"; is join('', $q->as_html), '', "html '' after remove"; # not '<>' }; subtest "remove elements via each()" => sub { my $q = wq('t/data/foo.html'); $q->each(sub{ $_->remove }); is $q->size, 0, "size 0 after remove"; is join('', $q->as_html), '', "html '' after remove"; # not '<>' }; } 08_indent.t100644001750001750 46714550254663 15474 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use Test2::V0; use Web::Query; plan tests => 2; my $inner = "

Hi there

"; my $html = "$inner"; is( Web::Query->new($html)->html => $inner, "no indent" ); like( Web::Query->new($html, { indent => "\t" } )->html => qr/\t/, "indented" ); 11_get_eq.t100644001750001750 306714550254663 15470 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use utf8; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; my $html = '
  • A
  • B
  • C
  • D
  • E
  • F
'; subtest 'get first' => sub { my $q = wq($html)->find('#foo li'); my $elm = $q->get(0); isa_ok $elm, 'HTML::Element'; is wq($elm)->text(), 'A'; }; subtest 'get second' => sub { my $q = wq($html)->find('#foo li'); my $elm = $q->get(1); isa_ok $elm, 'HTML::Element'; is wq($elm)->text(), 'B'; }; subtest 'get last' => sub { my $q = wq($html)->find('#foo li'); my $elm = $q->get(-1); isa_ok $elm, 'HTML::Element'; is wq($elm)->text(), 'F'; }; subtest 'get before last' => sub { my $q = wq($html)->find('#foo li'); my $elm = $q->get(-2); isa_ok $elm, 'HTML::Element'; is wq($elm)->text(), 'E'; }; subtest 'eq first' => sub { is wq($html)->find('#foo li')->eq(0)->text(), 'A'; }; subtest 'eq second' => sub { is wq($html)->find('#foo li')->eq(1)->text(), 'B'; }; subtest 'eq last' => sub { is wq($html)->find('#foo li')->eq(-1)->text(), 'F'; }; subtest 'eq before last' => sub { is wq($html)->find('#foo li')->eq(-2)->text(), 'E'; }; } has_class.t100644001750001750 113414550254663 15654 0ustar00yanickyanick000000000000Web-Query-1.01/t#!/usr/bin/env perl use strict; use warnings; use lib 'lib'; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; my $wq = wq('
Hello
Goodbye
'); is $wq->find('.inner')->has_class('inner'), 1, 'has_class - positive'; is $wq->find('.inner')->has_class('nahh'), undef, 'has_class - negative'; } 00-compile.t100644001750001750 266514550254663 15573 0ustar00yanickyanick000000000000Web-Query-1.01/tuse 5.006; use strict; use warnings; # this test was generated with Dist::Zilla::Plugin::Test::Compile 2.058 use Test::More; plan tests => 2 + ($ENV{AUTHOR_TESTING} ? 1 : 0); my @module_files = ( 'Web/Query.pm', 'Web/Query/LibXML.pm' ); # no fake home requested my @switches = ( -d 'blib' ? '-Mblib' : '-Ilib', ); use File::Spec; use IPC::Open3; use IO::Handle; open my $stdin, '<', File::Spec->devnull or die "can't open devnull: $!"; my @warnings; for my $lib (@module_files) { # see L my $stderr = IO::Handle->new; diag('Running: ', join(', ', map { my $str = $_; $str =~ s/'/\\'/g; q{'} . $str . q{'} } $^X, @switches, '-e', "require q[$lib]")) if $ENV{PERL_COMPILE_TEST_DEBUG}; my $pid = open3($stdin, '>&STDERR', $stderr, $^X, @switches, '-e', "require q[$lib]"); binmode $stderr, ':crlf' if $^O eq 'MSWin32'; my @_warnings = <$stderr>; waitpid($pid, 0); is($?, 0, "$lib loaded ok"); shift @_warnings if @_warnings and $_warnings[0] =~ /^Using .*\bblib/ and not eval { +require blib; blib->VERSION('1.01') }; if (@_warnings) { warn @_warnings; push @warnings, @_warnings; } } is(scalar(@warnings), 0, 'no warnings found') or diag 'got warnings: ', ( Test::More->can('explain') ? 
Test::More::explain(\@warnings) : join("\n", '', @warnings) ) if $ENV{AUTHOR_TESTING}; 00_compile.t100644001750001750 12014550254663 15615 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use Test2::V0; use Web::Query; pass "it compiles"; done_testing; 04_element.t100644001750001750 116614550254663 15655 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use utf8; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; my $html = '
  • A
  • B
  • C
  • D
  • E
  • F
'; subtest 'first' => sub { is wq($html)->find('#foo li')->first()->text(), 'A'; }; subtest 'last' => sub { is wq($html)->find('#foo li')->last()->text(), 'F'; }; } 09_as_html.t100644001750001750 172114550254663 15655 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; my $inner = "

Hi there

How is life?

"; my $html = "$inner"; my $q = Web::Query->new($html); is $q->html => $inner, "html() returns inner html"; is $q->as_html => $html, "as_html() returns element itself"; my $scalar = $q->find('p')->as_html; my @array = $q->find('p')->as_html; is $scalar => '

Hi there

', 'called in scalar context'; is \@array => [ '

Hi there

', q{

How is life?

} ], 'called in list context'; subtest 'join' => sub { is $q->find('p')->as_html(join => '!') => '

Hi there

!

How is life?

'; }; } next_until.t100644001750001750 153114550254663 16106 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use Test2::V0; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq"}; my $wq = wq(q{

one

two

three

four

five

six

}); for my $id ( qw/ first second / ) { my $next = $wq->find('#'.$id)->next_until('h1'); is $next->size => 2; } is $wq->find('#first')->next_until('h1')->and_back->size => 3, "and_back"; is $wq->find('h1')->next_until('h1')->size => 4; is $wq->find('h1')->next_until('foo')->size => 5; } node-types.t100644001750001750 126114550254663 16004 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use Test2::V0; use lib 't/lib'; use WQTest; WQTest::test { my $class = shift; my $q = $class->new_from_html(<<'END'); one

two

END my $contents = $q->find('x')->contents; is $contents->find('p')->html => 'two', 'skip over text and comments'; like $contents->filter(sub{ $_->tagname eq '#text' })->as_html => qr'one', '#text'; like $contents->filter(sub{ $_->tagname eq '#comment' })->as_html => qr'three', '#comment'; } 03_traverse.t100644001750001750 363014550254663 16054 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use utf8; use Test2::V0; use Scalar::Util qw/refaddr/; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; my $html = '
'; subtest 'parent' => sub { is wq($html)->find('#baz')->parent()->attr('id'), 'bar'; is wq($html)->find('#bar')->parent()->attr('id'), 'foo'; }; subtest 'first/last return new instance' => sub { subtest 'first' => sub { my $q = wq($html)->find('div'); my $first = $q->first; isnt(refaddr($first), refaddr($q)); }; subtest 'last' => sub { my $q = wq($html)->find('div'); my $last = $q->last; isnt(refaddr($last), refaddr($q)); }; }; subtest 'size' => sub { is wq($html)->find('div')->size, 3; is wq($html)->find('body')->size, 1; is wq($html)->find('li')->size, 0; is wq($html)->find('.null')->first->size, 0; is wq($html)->find('.null')->last->size, 0; }; subtest 'map' => sub { is wq($html)->find('div')->map(sub {$_[0]}), [0, 1, 2]; is wq($html)->find('div')->map(sub {$_->attr('id')}), [qw/foo bar baz/]; }; subtest 'filter' => sub { is wq($html)->filter('div')->size, 0; is wq($html)->filter('body')->size, 0; is wq($html)->filter('li')->size, 0; is wq($html)->find('div')->filter(sub {$_->attr('id') =~ /ba/})->size, 2; is wq($html)->find('div')->filter(sub {my $i = shift; $i % 2 == 0})->size, 2; }; } 10_subclass.t100644001750001750 54414550254663 16017 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use Test2::V0; plan tests => 3; use FindBin; use lib 'lib'; use lib "$FindBin::Bin/lib"; use My::Web::Query; # web::query is a child class friendly my $query = wq('
foo
'); isa_ok $query, 'My::Web::Query'; $query->each(sub{ isa_ok $_[1], 'My::Web::Query'; }); isa_ok $query->_build_tree, 'My::TreeBuilder'; data000755001750001750 014550254663 14301 5ustar00yanickyanick000000000000Web-Query-1.01/tfoo.html100644001750001750 40414550254663 16070 0ustar00yanickyanick000000000000Web-Query-1.01/t/data test1 lib000755001750001750 014550254663 14136 5ustar00yanickyanick000000000000Web-Query-1.01/tWQTest.pm100644001750001750 67114550254663 16007 0ustar00yanickyanick000000000000Web-Query-1.01/t/libpackage WQTest; use strict; use warnings; use Test::More; sub test(&) { my $code = shift; plan tests => 3; use_ok 'Web::Query'; for my $class ( qw/ Web::Query Web::Query::LibXML / ) { subtest $class => sub { if( $class =~ /LibXML/ ) { plan skip_all => "can't load $class" unless eval "use $class; 1"; } $code->($class, \&{$class . "::wq" }); }; } } 1; Web000755001750001750 014550254663 14410 5ustar00yanickyanick000000000000Web-Query-1.01/libQuery.pm100644001750001750 7616614550254663 16253 0ustar00yanickyanick000000000000Web-Query-1.01/lib/Webpackage Web::Query; our $AUTHORITY = 'cpan:TOKUHIROM'; # ABSTRACT: Yet another scraping library like jQuery $Web::Query::VERSION = '1.01'; use strict; use warnings; use 5.008001; use parent qw/Exporter/; use HTML::TreeBuilder::XPath; use LWP::UserAgent; use HTML::Selector::XPath 0.20 qw/selector_to_xpath/; use Scalar::Util qw/blessed refaddr/; use HTML::Entities qw/encode_entities/; use List::Util 1.44 qw/ reduce uniq /; use Scalar::Util qw/ refaddr /; our @EXPORT = qw/wq/; our $RESPONSE; sub wq { Web::Query->new(@_) } our $UserAgent; sub __ua { $UserAgent ||= LWP::UserAgent->new } sub _build_tree { my( $self, $options ) = @_; my $no_space_compacting = ref $self ? $self->{no_space_compacting} : ref $options eq 'HASH' ? 
$options->{no_space_compacting} : 0; my $tree = HTML::TreeBuilder::XPath->new( no_space_compacting => $no_space_compacting ); $tree->ignore_unknown(0); $tree->store_comments(1); $tree; } sub new { my ($class, $stuff, $options) = @_; my $self = $class->_resolve_new($stuff,$options) or return undef; $self->{indent} = $options->{indent} if $options->{indent}; $self->{no_space_compacting} = $options->{no_space_compacting}; return $self; } sub _resolve_new { my( $class, $stuff, $options) = @_; return $class->new_from_element([],undef,$options) unless defined $stuff; if (blessed $stuff) { return $class->new_from_element([$stuff],undef,$options) if $stuff->isa('HTML::Element'); return $class->new_from_url($stuff->as_string,$options) if $stuff->isa('URI'); return $class->new_from_element($stuff->{trees}, undef, $options) if $stuff->isa($class); die "Unknown source type: $stuff"; } return $class->new_from_element($stuff,undef,$options) if ref $stuff eq 'ARRAY'; return $class->new_from_url($stuff,$options) if $stuff =~ m{^(?:https?|file)://}; return $class->new_from_html($stuff,$options) if $stuff =~ /<.*?>/; return $class->new_from_file($stuff,$options) if $stuff !~ /\n/ && -f $stuff; die "Unknown source type: $stuff"; } sub new_from_url { my ($class, $url,$options) = @_; $RESPONSE = __ua()->get($url); no warnings 'uninitialized'; unless( $RESPONSE->is_success ) { die "failed to retrieve '$url', " . $RESPONSE->code. " " . $RESPONSE->message."\n"; }; return $class->new_from_html($RESPONSE->decoded_content,$options); } sub new_from_file { my ($class, $fname, $options) = @_; my $tree = $class->_build_tree($options); $tree->parse_file($fname); my $self = $class->new_from_element([$tree->disembowel],undef,$options); $self->{need_delete}++; return $self; } sub new_from_html { my ($class, $html,$options) = @_; my $tree = $class->_build_tree($options); $tree->parse_content($html); my $self = $class->new_from_element([ map { ref $_ ? 
$_ : bless { _content => $_ }, 'HTML::TreeBuilder::XPath::TextNode' } $tree->disembowel ],undef,$options); $self->{need_delete}++; return $self; } sub new_from_element { my $self_or_class = shift; my $trees = ref $_[0] eq 'ARRAY' ? $_[0] : [$_[0]]; return bless { trees => [ @$trees ], before => $_[1] }, ref $self_or_class || $self_or_class; } sub end { my $self = shift; return $self->{before}; } sub size { my $self = shift; return scalar(@{$self->{trees}}); } sub parent { my $self = shift; my @new = map { $_->parent } @{$self->{trees}}; return (ref $self || $self)->new_from_element(\@new, $self); } sub first { my $self = shift; return $self->eq(0); } sub last { my $self = shift; return $self->eq(-1); } sub get { my ($self, $index) = @_; return $self->{trees}[$index]; } sub eq { my ($self, $index) = @_; return (ref $self || $self)->new_from_element([$self->{trees}[$index] || ()], $self); } sub find { my ($self, $selector) = @_; my $xpath = ref $selector ? $$selector : selector_to_xpath($selector, root => './'); my @new = map { eval{ $_->findnodes($xpath) } } @{$self->{trees}}; return (ref $self || $self)->new_from_element(\@new, $self); } sub contents { my ($self, $selector) = @_; my @new = map { $_->content_list } @{$self->{trees}}; if ($selector) { my $xpath = ref $selector ? $$selector : selector_to_xpath($selector); @new = grep { $_->matches($xpath) } @new; } return (ref $self || $self)->new_from_element(\@new, $self); } sub as_html { my $self = shift; my %args = @_; my @html = map { ref $_ ? ( $_->isa('HTML::TreeBuilder::XPath::TextNode') || $_->isa('HTML::TreeBuilder::XPath::CommentNode' ) ) ? $_->getValue : $_->as_HTML( q{&<>'"}, $self->{indent}, {} ) : $_ } @{$self->{trees}}; return join $args{join}, @html if defined $args{join}; return wantarray ? 
@html : $html[0]; } sub html { my $self = shift; if (@_) { map { $_->delete_content; my $tree = $self->_build_tree; $tree->parse_content($_[0]); $_->push_content($tree->disembowel); } @{$self->{trees}}; return $self; } my @html; for my $t ( @{$self->{trees}} ) { push @html, join '', map { ref $_ ? $_->as_HTML( q{&<>'"}, $self->{indent}, {}) : encode_entities($_) } eval { $t->content_list }; } return wantarray ? @html : $html[0]; } sub text { my $self = shift; if (@_) { map { $_->delete_content; $_->push_content($_[0]) } @{$self->{trees}}; return $self; } my @html = map { ref $_ ? $_->as_text : $_ } @{$self->{trees}}; return wantarray ? @html : $html[0]; } sub attr { my $self = shift; if ( @_ == 1 ) { # getter return wantarray ? map { $_->attr(@_) } @{$self->{trees}} : eval { $self->{trees}[0]->attr(@_) } ; } while( my( $attr, $value ) = splice @_, 0, 2 ) { my $code = ref $value eq 'CODE' ? $value : undef; for my $t ( @{$self->{trees}} ) { if ( $code ) { no warnings 'uninitialized'; my $orig = $_ = $t->attr($attr); $code->(); next if $orig eq $_; $value = $_; } $t->attr($attr => $value); } } return $self; } sub id { my $self = shift; if ( @_ ) { # setter my $new_id = shift; return $self if $self->size == 0; return $self->each(sub{ $_->attr( id => $new_id->(@_) ) }) if ref $new_id eq 'CODE'; if ( $self->size == 1 ) { $self->attr( id => $new_id ); } else { return $self->first->attr( id => $new_id ); } } else { # getter # the eval is there in case there is no tree return wantarray ? map { $_->attr('id') } @{$self->{trees}} : eval { $self->{trees}[0]->attr('id') } ; } } sub name { my $self = shift; $self->attr( 'name', @_ ); } sub data { my $self = shift; my $name = shift; $self->attr( join( '-', 'data', $name ), @_ ); } sub tagname { my $self = shift; my @retval = map { $_ eq '~comment' ? '#comment' : $_ } map { ref $_ eq 'HTML::TreeBuilder::XPath::TextNode' ? '#text' : ref $_ eq 'HTML::TreeBuilder::XPath::CommentNode' ? '#comment' : ref $_ ? 
$_->tag(@_) : '#text' ; } @{$self->{trees}}; return wantarray ? @retval : $retval[0]; } sub each { my ($self, $code) = @_; my $i = 0; # make a copy such that if we modify the list via 'delete', # it won't change from under our feet (see t/each-and-delete.t # for a case where it can) my @trees = @{ $self->{trees} }; for my $tree ( @trees ) { local $_ = (ref $self || $self)->new_from_element([$tree], $self); $code->($i++, $_); } return $self; } sub map { my ($self, $code) = @_; my $i = 0; return +[map { my $tree = $_; local $_ = (ref $self || $self)->new($tree); $code->($i++, $_); } @{$self->{trees}}]; } sub filter { my $self = shift; my @new; if (ref($_[0]) eq 'CODE') { my $code = $_[0]; my $i = 0; @new = grep { my $tree = $_; local $_ = (ref $self || $self)->new_from_element($tree); $code->($i++, $_); } @{$self->{trees}}; } else { my $xpath = ref $_[0] ? ${$_[0]} : selector_to_xpath($_[0]); @new = grep { eval { $_->matches($xpath) } } @{$self->{trees}}; } return (ref $self || $self)->new_from_element(\@new, $self); } sub _is_same_node { refaddr($_[1]) == refaddr($_[2]); } sub remove { my $self = shift; my $before = $self->end; while (defined $before) { @{$before->{trees}} = grep { my $el = $_; not grep { $self->_is_same_node( $el, $_ ) } @{$self->{trees}}; } @{$before->{trees}}; $before = $before->end; } $_->delete for @{$self->{trees}}; @{$self->{trees}} = (); $self; } sub replace_with { my ( $self, $replacement ) = @_; my $i = 0; for my $node ( @{ $self->{trees} } ) { my $rep = $replacement; if ( ref $rep eq 'CODE' ) { local $_ = (ref $self || $self)->new($node); $rep = $rep->( $i++ => $_ ); } $rep = (ref $self || $self)->new_from_html( $rep ) unless ref $rep; my $r = $rep->{trees}->[0]; { no warnings; $r = $r->clone if ref $r; } $r->parent( $node->parent ) if ref $r and $node->parent; $node->replace_with( $r ); } $replacement->remove if ref $replacement eq (ref $self || $self); return $self; } sub append { my ($self, $stuff) = @_; $stuff = (ref $self || 
$self)->new($stuff); foreach my $t (@{$self->{trees}}) { $t->push_content($_) for ref($t)->clone_list(@{$stuff->{trees}}); } $self; } sub prepend { my ($self, $stuff) = @_; $stuff = (ref $self || $self)->new($stuff); foreach my $t (@{$self->{trees}}) { $t->unshift_content($_) for ref($t)->clone_list(@{$stuff->{trees}}); } $self; } sub before { my ($self, $stuff) = @_; $stuff = (ref $self || $self)->new($stuff); foreach my $t (@{$self->{trees}}) { $t->preinsert(ref($t)->clone_list(@{$stuff->{trees}})); } $self; } sub after { my ($self, $stuff) = @_; $stuff = (ref $self || $self)->new($stuff); foreach my $t (@{$self->{trees}}) { $t->postinsert(ref($t)->clone_list(@{$stuff->{trees}})); } $self; } sub insert_before { my ($self, $target) = @_; foreach my $t (@{$target->{trees}}) { $t->preinsert(ref($t)->clone_list(@{$self->{trees}})); } $self; } sub insert_after { my ($self, $target) = @_; foreach my $t (@{$target->{trees}}) { $t->postinsert(ref($t)->clone_list(@{$self->{trees}})); } $self; } sub detach { my ($self) = @_; $_->detach for @{$self->{trees}}; $self; } sub add_class { my ($self, $class) = @_; for (my $i = 0; $i < @{$self->{trees}}; $i++) { my $t = $self->{trees}->[$i]; my $current_class = $t->attr('class') || ''; my $classes = ref $class eq 'CODE' ? $class->($i, $current_class, $t) : $class; my @classes = split /\s+/, $classes; foreach (@classes) { $current_class .= " $_" unless $current_class =~ /(?:^|\s)$_(?:\s|$)/; } $current_class =~ s/(?:^\s*|\s*$)//g; $current_class =~ s/\s\s+/ /g; $t->attr('class', $current_class); } $self; } sub remove_class { my ($self, $class) = @_; for (my $i = 0; $i < @{$self->{trees}}; $i++) { my $t = $self->{trees}->[$i]; my $current_class = $t->attr('class'); next unless defined $current_class; my $classes = ref $class eq 'CODE' ? 
$class->($i, $current_class, $t) : $class; my @remove_classes = split /\s+/, $classes; my @final = grep { my $existing_class = $_; not grep { $existing_class eq $_} @remove_classes; } split /\s+/, $current_class; $t->attr('class', join ' ', @final); } $self; } sub toggle_class { my $self = shift; my @classes = uniq @_; $self->each(sub{ for my $class ( @classes ) { my $method = $_->has_class($class) ? 'remove_class' : 'add_class'; $_->$method($class); } }); } sub has_class { my ($self, $class) = @_; foreach my $t (@{$self->{trees}}) { no warnings 'uninitialized'; return 1 if $t->attr('class') =~ /(?:^|\s)$class(?:\s|$)/; } return undef; } sub clone { my ($self) = @_; my @clones = map { $_->clone } @{$self->{trees}}; return (ref $self || $self)->new_from_element(\@clones); } sub add { my ($self, @stuff) = @_; my @nodes; # add(selector, context) if (@stuff == 2 && !ref $stuff[0] && $stuff[1]->isa('HTML::Element')) { my $xpath = ref $stuff[0] ? ${$stuff[0]} : selector_to_xpath($stuff[0]); push @nodes, $stuff[1]->findnodes( $xpath, root => './'); } else { # handle any combination of html string, element object and web::query object push @nodes, map { $self->{need_delete} = 1 if $_->{need_delete}; delete $_->{need_delete}; @{$_->{trees}}; } map { (ref $self || $self)->new($_) } @stuff; } my %ids = map { $self->_node_id($_) => 1 } @{ $self->{trees} }; $self->new_from_element( [ @{$self->{trees}}, grep { ! $ids{ $self->_node_id($_) } } @nodes ], $self ); } sub _node_id { my( undef, $node ) = @_; refaddr $node; } sub prev { my $self = shift; my @new; for my $tree (@{$self->{trees}}) { push @new, $tree->getPreviousSibling; } return (ref $self || $self)->new_from_element(\@new, $self); } sub next { my $self = shift; my @new = grep { $_ } map { $_->getNextSibling } @{ $self->{trees} }; return (ref $self || $self)->new_from_element(\@new, $self); } sub match { my( $self, $selector ) = @_; my $xpath = ref $selector ? 
        $$selector : selector_to_xpath($selector);

    my $results = $self->map(sub{
        my(undef,$e) = @_;
        return 0 unless ref $e; # it's a string
        return !!$e->get(0)->matches($xpath);
    });

    return wantarray ? @$results : $results->[0];
}

sub not {
    my( $self, $selector ) = @_;

    my $class = ref $self;

    my $xpath = ref $selector ? $$selector : selector_to_xpath($selector);
    $self->filter(sub { ! grep { $_->matches($xpath) } grep { ref $_ } $class->new($_)->{trees}[0] } );
}

sub and_back {
    my $self = shift;

    $self->add( $self->end );
}

sub next_until {
    my( $self, $selector ) = @_;

    my $class = ref $self;
    my $collection = $class->new_from_element([],$self);

    my $next = $self->next->not($selector);
    while( $next->size ) {
        $collection = $collection->add($next);
        $next = $next->next->not( $selector );
    }

    # hide the loop from the outside world
    $collection->{before} = $self;

    return $collection;
}

sub split {
    my( $self, $selector, %args ) = @_;

    my @current;
    my @list;

    $self->contents->each(sub{
        my(undef,$e)=@_;

        if( $e->match($selector) ) {
            push @list, [ @current ];
            @current = ( $e );
        }
        else {
            if ( $current[1] ) {
                $current[1] = $current[1]->add($e);
            }
            else {
                $current[1] = $e;
            }
        }
    });
    push @list, [ @current ];

    if( $args{skip_leading} ) {
        @list = grep { $_->[0] } @list;
    }

    unless ( $args{pairs} ) {
        @list = map { reduce { $a->add($b) } grep { $_ } @$_ } @list;
    }

    return @list;
}

sub last_response {
    return $RESPONSE;
}

sub DESTROY {
    return unless $_[0]->{need_delete};

    # avoid memory leaks
    local $@;
    eval { $_->delete } for @{$_[0]->{trees}};
}

1;

__END__

=pod

=encoding UTF-8

=head1 NAME

Web::Query - Yet another scraping library like jQuery

=head1 VERSION

version 1.01

=head1 SYNOPSIS

    use Web::Query;

    wq('http://www.w3.org/TR/html401/')
          ->find('div.head dt')
          ->each(sub {
              my $i = shift;
              printf("%d %s\n", $i+1, $_->text);
          });

=head1 DESCRIPTION

Web::Query is yet another scraping framework, with a jQuery-like interface.

Yes, I know Ingy's L<pQuery>. But it's just alpha quality. It doesn't work.
Web::Query is built on top of the CPAN modules L<HTML::TreeBuilder::XPath>, L<LWP::UserAgent>, and L<HTML::Selector::XPath>.

Because this module uses L<HTML::Selector::XPath>, it only supports the CSS 3 selectors supported by that module.
Web::Query doesn't support jQuery's extended queries (yet?).

If a selector is passed as a scalar ref, it'll be taken as a straight XPath expression.

    wq( '<div><p>hello</p><p>there</p></div>' )->find( 'p' );       # css selector
    wq( '<div><p>hello</p><p>there</p></div>' )->find( \'/div/p' ); # xpath selector

B.

=for stopwords prev

=head1 FUNCTIONS

=over 4

=item C<< wq($stuff) >>

This is a shortcut for C<< Web::Query->new($stuff) >>. This function is exported by default.

=back

=head1 METHODS

=head2 CONSTRUCTORS

=over 4

=item my $q = Web::Query->new($stuff, \%options )

Create a new instance of Web::Query. You can make the instance from a URL (http, https, or file scheme), HTML in a string, a URL in a string, a L<URI> object, C<undef>, and either one L<HTML::Element> object or an array ref of them.

    # all valid creators
    $q = Web::Query->new( 'http://techblog.babyl.ca' );
    $q = Web::Query->new( '<p>foo</p>' );
    $q = Web::Query->new( undef );

This method throws an exception on unknown $stuff.

This method returns an undefined value if the request for a URL is not successful.

Currently, the only two valid options are I<indent>, which will be used as the indentation string if the object is printed, and I<no_space_compacting>, which will prevent the compaction of whitespace characters in text blocks.

=item my $q = Web::Query->new_from_element($element: HTML::Element)

Create a new instance of Web::Query from an instance of L<HTML::Element>.

=item C<< my $q = Web::Query->new_from_html($html: Str) >>

Create a new instance of Web::Query from HTML.

=item my $q = Web::Query->new_from_url($url: Str)

Create a new instance of Web::Query from a URL.

If the response is not a success (that is, the status code does not match /^20[0-9]$/), this method returns an undefined value.

You can get the last response from C<< $Web::Query::RESPONSE >>.

Note that the Changes entry for 1.00 states that Web::Query now throws when it fails to retrieve a URL instead of silently returning, so callers on 1.00 or later should be prepared for an exception as well.

Here is the recommended pattern:

    my $url = 'http://example.com/';
    my $q = Web::Query->new_from_url($url)
        or die "Cannot get a resource from $url: " . Web::Query->last_response()->status_line;

=item my $q = Web::Query->new_from_file($file_name: Str)

Create a new instance of Web::Query from a file name.

=back

=head2 TRAVERSING

=head3 add

Returns a new object augmented with the new element(s).

=over 4

=item add($html)

An HTML fragment to add to the set of matched elements.

=item add(@elements)

One or more @elements to add to the set of matched elements.

@elements that are already part of the set are not added a second time.

    my $group = $wq->find('#foo');         # collection has 1 element
    $group = $group->add( '#bar', $wq );   # 2 elements
    $group->add( '#foo', $wq );            # still 2 elements

=item add($wq)

An existing Web::Query object to add to the set of matched elements.

=item add($selector, $context)

$selector is a string representing a selector expression to find additional elements to add to the set of matched elements.
$context is the point in the document at which the selector should begin matching.

=back

=head3 contents

Get the immediate children of each element in the set of matched elements, including text and comment nodes.

=head3 each

Visit each node. C<< $i >> is a counter starting at 0. C<< $elem >> is the current item. C<< $_ >> is localized to C<< $elem >>.

    $q->each(sub { my ($i, $elem) = @_; ... })

=head3 end

Go back to the previous context, as in jQuery.

=head3 filter

Reduce the elements to those that pass the function's test.

    $q->filter(sub { my ($i, $elem) = @_; ... })

=head3 find

Get the descendants of each element in the current set of matched elements, filtered by a selector.

    my $q2 = $q->find($selector); # $selector is a CSS3 selector.

B<Note> If you want to match the element itself, use L</filter>.

B<Note> From v0.14 to v0.19 (inclusive) find() also matched the element itself, which is not jQuery compatible.
You can achieve that result using C<filter()>, C<add()> and C<find()>:

    my $wq = wq('<div class="foo"><p class="foo">bar</p></div>'); # needed because we don't have a global document like jQuery does
    print $wq->filter('.foo')->add($wq->find('.foo'))->as_html; # <div class="foo"><p class="foo">bar</p></div><p class="foo">bar</p>
=head3 first

Return the first matching element.

This method constructs a new Web::Query object from the first matching element.

=head3 last

Return the last matching element.

This method constructs a new Web::Query object from the last matching element.

=head3 match($selector)

Returns a boolean indicating if the elements match the C<$selector>.

In scalar context, returns only the boolean for the first element.

For the reverse of C<not()>, see C<filter()>.

=head3 not($selector)

Returns all the elements not matching the C<$selector>.

    # $do_for_love will be every thing, except #that
    my $do_for_love = $wq->find('thing')->not('#that');

=head3 and_back

Add the previous set of elements to the current one.

    # get the h1 plus everything until the next h1
    $wq->find('h1')->next_until('h1')->and_back;

=head3 map

Creates a new array with the results of calling a provided function on every element.

    $q->map(sub { my ($i, $elem) = @_; ... })

=head3 parent

Get the parent of each element in the current set of matched elements.

=head3 prev

Get the previous node of each element in the current set of matched elements.

    my $prev = $q->prev;

=head3 next

Get the next node of each element in the current set of matched elements.

    my $next = $q->next;

=head3 next_until( $selector )

Get all subsequent siblings, up to (but not including) the next node matching C<$selector>.

=head2 MANIPULATION

=head3 add_class

Adds the specified class(es) to each of the set of matched elements.

    # add class 'foo' to <p> elements
    wq('<div><p>foo</p><p>bar</p></div>')->find('p')->add_class('foo');

=head3 toggle_class( @classes )

Toggles the given class or classes on each of the elements. I.e., if the element had the class, it'll be removed, and if it hadn't, it'll be added.

Classes are toggled once, no matter how many times they appear in the argument list.

    $q->toggle_class( 'foo', 'foo', 'bar' );

    # equivalent to

    $q->toggle_class('foo')->toggle_class('bar');

    # and not

    $q->toggle_class('foo')->toggle_class('foo')->toggle_class('bar');

=head3 after

Insert content, specified by the parameter, after each element in the set of matched elements.

    wq('<div><p>foo</p></div>')->find('p')
                                ->after('<b>bar</b>')
                                ->end
                                ->as_html; # <div><p>foo</p><b>bar</b></div>

The content can be anything accepted by L</new>.

=head3 append

Insert content, specified by the parameter, at the end of each element in the set of matched elements.

    wq('<div></div>')->append('<p>foo</p>')->as_html; # <div><p>foo</p></div>

The content can be anything accepted by L</new>.

=head3 as_html

Returns the string representation of either all elements (list context) or only the first element (scalar context).

If given an argument C<join>, the string representations of the elements will be concatenated with the given string.

    wq( '<div><p>foo</p><p>bar</p></div>' )
        ->find('p')
        ->as_html( join => '!' );
    # <p>foo</p>!<p>bar</p>

=head3 C< attr >

Get/set attribute values.

In getter mode, it'll return either the values of the attribute for all elements of the set, or only the first one, depending on the calling context.

    my @values = $q->attr('style');      # style of all elements
    my $first_value = $q->attr('style'); # style of first element

In setter mode, it'll set the attribute values for all elements, and return the original object for easy chaining.

    $q->attr( 'alt' => 'a picture' )->find( ... );

    # can pass more than 1 attribute too
    $q->attr( alt => 'a picture', src => 'file:///...' );

The value passed for an attribute can be a code ref. In that case, the code will be called with C<$_> set to the current attribute value. If the code modifies C<$_>, the attribute will be updated with the new value.

    $q->attr( alt => sub { $_ ||= 'A picture' } );

=head3 C< id >

Get/set the elements' id attribute.

In getter mode, it behaves just like C<attr()>.

In setter mode, it behaves like C<attr()>, but with the following exceptions.

If the attribute value is a scalar, it'll only be assigned to the first element of the set (as ids are supposed to be unique), and the returned object will only contain that first element.

    my $first_element = $q->id('the_one');

It's possible to set the ids of all the elements by passing a sub to C<id()>. The sub is given the same arguments as for C<each()>, and its return value is taken to be the new id of the elements.

    $q->id( sub { my $i = shift; 'foo_' . $i } );

=head3 C< name >

Get/set the elements' 'name' attribute.

    my $name = $q->name;  # equivalent to $q->attr( 'name' );

    $q->name( 'foo' );    # equivalent to $q->attr( name => 'foo' );

=head3 C< data >

Get/set the elements' 'data-*name*' attributes.

    my $data = $q->data('foo');  # equivalent to $q->attr( 'data-foo' );

    $q->data( 'foo' => 'bar' );  # equivalent to $q->attr( 'data-foo' => 'bar' );

=head3 tagname

Get/Set the tag name of elements.

    my $name = $q->tagname;

    $q->tagname($new_name);

=head3 before

Insert content, specified by the parameter, before each element in the set of matched elements.

    wq('<div><p>foo</p></div>')->find('p')
                                ->before('<b>bar</b>')
                                ->end
                                ->as_html; # <div><b>bar</b><p>foo</p></div>

The content can be anything accepted by L</new>.

=head3 clone

Create a deep copy of the set of matched elements.

=head3 detach

Remove the set of matched elements from the DOM.

=head3 has_class

Determine whether any of the matched elements are assigned the given class.

=head3 C< html >

Get/Set the innerHTML.

    my @html = $q->html();

    my $html = $q->html(); # 1st matching element only

    $q->html('<p>foo</p>');

=head3 insert_before

Insert every element in the set of matched elements before the target.

=head3 insert_after

Insert every element in the set of matched elements after the target.

=head3 C< prepend >

Insert content, specified by the parameter, at the beginning of each element in the set of matched elements.

=head3 remove

Delete the elements associated with the object from the DOM.

    # remove all <blink> tags from the document
    $q->find('blink')->remove;

=head3 remove_class

Remove a single class or multiple classes from each element in the set of matched elements. The classes are given as a whitespace-separated string, or as a code ref returning such a string.

=head3 replace_with

Replace the elements of the object with the provided replacement.
The replacement can be a string, a C<Web::Query> object or an anonymous function.
The anonymous function is passed the index of the current node and the node itself (which is also localized as C<$_>).

    my $q = wq( '<p><b>Abra</b><i>cada</i><u>bra</u></p>' );

    $q->find('b')->replace_with('<a>Ocus</a>');
        # <p><a>Ocus</a><i>cada</i><u>bra</u></p>

    $q->find('u')->replace_with($q->find('b'));
        # <p><i>cada</i><b>Abra</b></p>

    $q->find('i')->replace_with(sub{
        my $name = $_->text;
        return "<$name></$name>";
    });
        # <p><b>Abra</b><cada></cada><u>bra</u></p>

=head3 size

Return the number of elements in the Web::Query object.

    wq('<div><p>foo</p><p>bar</p></div>
')->find('p')->size; # 2 =head3 text Get/Set the text. my @text = $q->text(); my $text = $q->text(); # 1st matching element only $q->text('text'); If called in a scalar context, only return the string representation of the first element =head2 OTHERS =over 4 =item Web::Query->last_response() Returns last HTTP response status that generated by C. =back =head1 HOW DO I CUSTOMIZE USER AGENT? You can specify your own instance of L. $Web::Query::UserAgent = LWP::UserAgent->new( agent => 'Mozilla/5.0' ); =head1 FAQ AND TROUBLESHOOTING =head2 How to find XML processing instructions in a document? It's possible with L and by using an xpath expression with C: # find $q->find(\"//processing-instruction('xml-stylesheet')"); However, note that the support for processing instructions in L is sketchy, so there are methods like C that won't work. =head2 Can't get the content of script elements The "; say Web::Query::wq( $node )->text; # nothing is printed! say Web::Query::wq( $node )->html; # var x = '<p>foo</p>'; say Web::Query::LibXML::wq( $node )->text; # var x = '

foo

'; say Web::Query::LibXML::wq( $node )->html; # var x = '<p>foo</p>'; =head1 INCOMPATIBLE CHANGES =over 4 =item 0.10 new_from_url() is no longer throws exception on bad response from HTTP server. =back =head1 AUTHOR Tokuhiro Matsuno Etokuhirom AAJKLFJEF@ GMAIL COME =head1 SEE ALSO =over =item L =item L =back =head1 LICENSE Copyright (C) Tokuhiro Matsuno This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself. =head1 BUGS Please report any bugs or feature requests on the bugtracker website L When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature. =cut insert_after.t100644001750001750 123614550254663 16404 0ustar00yanickyanick000000000000Web-Query-1.01/t#!/usr/bin/env perl use strict; use warnings; use lib 'lib'; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; my $wq = wq('
Hello
Goodbye
'); wq('

Test

')->insert_after($wq->find('.inner')); is $wq->as_html, '
Hello

Test

Goodbye

Test

', 'insert_after'; }remove_class.t100644001750001750 204514550254663 16400 0ustar00yanickyanick000000000000Web-Query-1.01/t#!/usr/bin/env perl use strict; use warnings; use lib 'lib'; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; my $wq = wq('
Hello
Goodbye
'); my $rv = $wq->find('.inner')->remove_class('foo bar'); isa_ok $rv, ['Web::Query'], 'remove_class returned'; is $wq->as_html, '
Hello
Goodbye
', 'remove_class("foo bar")'; $wq = wq('
Hello
Goodbye
'); $wq->find('.inner')->remove_class(sub{ 'foo bar' }); is $wq->as_html, '
Hello
Goodbye
', 'remove_class(CODE)'; } replace_with.t100644001750001750 222714550254663 16366 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use Test2::V0; my @modules = qw/ Web::Query Web::Query::LibXML /; plan tests => scalar @modules; subtest $_ => sub { test($_) } for @modules; sub test { my $class = shift; eval "require $class; 1" or plan skip_all => "couldn't load $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; my $html = '

Hithereworld

'; is wq($html)->find('b')->replace_with('Hello')->end->as_html => '

Hellothereworld

'; my $q = wq( $html ); is $q->find('u')->replace_with($q->find('b'))->end->as_html => '

thereHi

'; is wq($html)->find('*')->replace_with(sub { my $i = $_->text; return "<$i>"; } )->end->as_html => '

'; is wq($html)->find('*')->replace_with( '' )->end->as_html => '

'; is wq('

foo

')->find('span') ->replace_with(sub { $_->contents }) ->end->as_html => '

foo

'; } insert_before.t100644001750001750 124014550254663 16540 0ustar00yanickyanick000000000000Web-Query-1.01/t#!/usr/bin/env perl use strict; use warnings; use lib 'lib'; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; my $wq = wq('
Hello
Goodbye
'); wq('

Test

')->insert_before($wq->find('.inner')); is $wq->as_html, '

Test

Hello

Test

Goodbye
', 'insert_before'; }match_and_not.t100644001750001750 114214550254663 16511 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use Test2::V0; use lib 't/lib'; use WQTest; WQTest::test { my $class = shift; my $wq = $class->new(<

one

two

three

HTML is $wq->find('p')->not( '#second' )->size => 2, 'not'; is $wq->find('p')->filter( '#second' )->size => 1, 'filter'; subtest 'match' => sub { is [ $wq->find('p')->match( '.foo' ) ], [ 1, '', 1 ], "list context"; is scalar $wq->find('p')->match( '.foo' ) => 1, "scalar context"; }; } CODE_OF_CONDUCT.md100644001750001750 1216014550254663 16104 0ustar00yanickyanick000000000000Web-Query-1.01# Contributor Covenant Code of Conduct ## Our Pledge We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation. We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community. ## Our Standards Examples of behavior that contributes to a positive environment for our community include: * Demonstrating empathy and kindness toward other people * Being respectful of differing opinions, viewpoints, and experiences * Giving and gracefully accepting constructive feedback * Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience * Focusing on what is best not just for us as individuals, but for the overall community Examples of unacceptable behavior include: * The use of sexualized language or imagery, and sexual attention or advances of any kind * Trolling, insulting or derogatory comments, and personal or political attacks * Public or private harassment * Publishing others' private information, such as a physical or email address, without their explicit permission * Other conduct which could reasonably be considered inappropriate in a professional setting ## Enforcement Responsibilities Community leaders are responsible for clarifying and enforcing 
our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful. Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate. ## Scope This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. ## Enforcement Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at tokuhirom AAJKLFJEF@ GMAIL COM. All complaints will be reviewed and investigated promptly and fairly. All community leaders are obligated to respect the privacy and security of the reporter of any incident. ## Enforcement Guidelines Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct: ### 1. Correction **Community Impact**: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community. **Consequence**: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested. ### 2. Warning **Community Impact**: A violation through a single incident or series of actions. **Consequence**: A warning with consequences for continued behavior. 
No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban. ### 3. Temporary Ban **Community Impact**: A serious violation of community standards, including sustained inappropriate behavior. **Consequence**: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban. ### 4. Permanent Ban **Community Impact**: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals. **Consequence**: A permanent ban from any sort of public interaction within the community. ## Attribution This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 2.0, available at https://www.contributor-covenant.org/version/2/0/code_of_conduct.html. Community Impact Guidelines were inspired by [Mozilla's code of conduct enforcement ladder](https://github.com/mozilla/diversity). [homepage]: https://www.contributor-covenant.org For answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations. 
store_comments.t100644001750001750 113114550254663 16752 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use Test2::V0; use Web::Query; test('Web::Query'); test('Web::Query::LibXML') if eval "require Web::Query::LibXML; 1"; done_testing; sub test { my $class = shift; diag "testing $class"; no warnings 'redefine'; *wq = \&{$class . "::wq" }; my $source = '
'; is join('', wq($source)->as_html), $source, 'constructor stores comments'; is wq($source)->find('header')->html('

')->as_html, '

', 'html() stores comments'; }live000755001750001750 014550254663 14517 5ustar00yanickyanick000000000000Web-Query-1.01/xt01_simple.t100644001750001750 57614550254663 16625 0ustar00yanickyanick000000000000Web-Query-1.01/xt/liveuse strict; use warnings; use utf8; use Test::More; use Web::Query; binmode Test::More->builder->$_, ":utf8" for qw/output failure_output todo_output/; my @res; wq('https://techblog.babyl.ca/') ->find('div') ->each(sub { my $i = shift; push @res, $_->text; note(sprintf "%d) %s\n", $i+1, $_->text) }); ok @res; done_testing; bug-text-contents.t100644001750001750 123414550254663 17307 0ustar00yanickyanick000000000000Web-Query-1.01/t# see https://github.com/tokuhirom/Web-Query/issues/47 use strict; use warnings; use Test2::V0; use lib 't/lib'; use WQTest; my $html = <<'HTML';

Hello

World

HTML WQTest::test { my $q = $_[0]->new($html); isa_ok $q, 'Web::Query'; my @text; my @contents; $q->find('p')->each(sub { my ($i, $elem) = @_; push @text, $elem->text; push @contents, $elem->contents; }); is \@text, [qw/ Hello World /], 'elements'; is @contents, 2, 'two contents'; isa_ok $_, 'Web::Query' for @contents;; is $contents[0]->text => 'Hello'; is $contents[1]->text => 'World'; }; Web000755001750001750 014550254663 15240 5ustar00yanickyanick000000000000Web-Query-1.01/t/lib/MyQuery.pm100644001750001750 52014550254663 17020 0ustar00yanickyanick000000000000Web-Query-1.01/t/lib/My/Webpackage My::Web::Query; use strict; use warnings; use parent qw/Web::Query Exporter/; use My::TreeBuilder; our @EXPORT = qw/wq/; sub wq { My::Web::Query->new(@_) } sub _build_tree { my ($self, $content) = @_; my $tree = My::TreeBuilder->new(); $tree->ignore_unknown(0); $tree->store_comments(1); $tree; }00-report-prereqs.t100644001750001750 1360114550254663 17145 0ustar00yanickyanick000000000000Web-Query-1.01/t#!perl use strict; use warnings; # This test was generated by Dist::Zilla::Plugin::Test::ReportPrereqs 0.029 use Test::More tests => 1; use ExtUtils::MakeMaker; use File::Spec; # from $version::LAX my $lax_version_re = qr/(?: undef | (?: (?:[0-9]+) (?: \. | (?:\.[0-9]+) (?:_[0-9]+)? )? | (?:\.[0-9]+) (?:_[0-9]+)? ) | (?: v (?:[0-9]+) (?: (?:\.[0-9]+)+ (?:_[0-9]+)? )? | (?:[0-9]+)? (?:\.[0-9]+){2,} (?:_[0-9]+)? ) )/x; # hide optional CPAN::Meta modules from prereq scanner # and check if they are available my $cpan_meta = "CPAN::Meta"; my $cpan_meta_pre = "CPAN::Meta::Prereqs"; my $HAS_CPAN_META = eval "require $cpan_meta; $cpan_meta->VERSION('2.120900')" && eval "require $cpan_meta_pre"; ## no critic # Verify requirements? my $DO_VERIFY_PREREQS = 1; sub _max { my $max = shift; $max = ( $_ > $max ) ? 
$_ : $max for @_; return $max; } sub _merge_prereqs { my ($collector, $prereqs) = @_; # CPAN::Meta::Prereqs object if (ref $collector eq $cpan_meta_pre) { return $collector->with_merged_prereqs( CPAN::Meta::Prereqs->new( $prereqs ) ); } # Raw hashrefs for my $phase ( keys %$prereqs ) { for my $type ( keys %{ $prereqs->{$phase} } ) { for my $module ( keys %{ $prereqs->{$phase}{$type} } ) { $collector->{$phase}{$type}{$module} = $prereqs->{$phase}{$type}{$module}; } } } return $collector; } my @include = qw( ); my @exclude = qw( ); # Add static prereqs to the included modules list my $static_prereqs = do './t/00-report-prereqs.dd'; # Merge all prereqs (either with ::Prereqs or a hashref) my $full_prereqs = _merge_prereqs( ( $HAS_CPAN_META ? $cpan_meta_pre->new : {} ), $static_prereqs ); # Add dynamic prereqs to the included modules list (if we can) my ($source) = grep { -f } 'MYMETA.json', 'MYMETA.yml'; my $cpan_meta_error; if ( $source && $HAS_CPAN_META && (my $meta = eval { CPAN::Meta->load_file($source) } ) ) { $full_prereqs = _merge_prereqs($full_prereqs, $meta->prereqs); } else { $cpan_meta_error = $@; # capture error from CPAN::Meta->load_file($source) $source = 'static metadata'; } my @full_reports; my @dep_errors; my $req_hash = $HAS_CPAN_META ? $full_prereqs->as_string_hash : $full_prereqs; # Add static includes into a fake section for my $mod (@include) { $req_hash->{other}{modules}{$mod} = 0; } for my $phase ( qw(configure build test runtime develop other) ) { next unless $req_hash->{$phase}; next if ($phase eq 'develop' and not $ENV{AUTHOR_TESTING}); for my $type ( qw(requires recommends suggests conflicts modules) ) { next unless $req_hash->{$phase}{$type}; my $title = ucfirst($phase).' 
'.ucfirst($type); my @reports = [qw/Module Want Have/]; for my $mod ( sort keys %{ $req_hash->{$phase}{$type} } ) { next if grep { $_ eq $mod } @exclude; my $want = $req_hash->{$phase}{$type}{$mod}; $want = "undef" unless defined $want; $want = "any" if !$want && $want == 0; if ($mod eq 'perl') { push @reports, ['perl', $want, $]]; next; } my $req_string = $want eq 'any' ? 'any version required' : "version '$want' required"; my $file = $mod; $file =~ s{::}{/}g; $file .= ".pm"; my ($prefix) = grep { -e File::Spec->catfile($_, $file) } @INC; if ($prefix) { my $have = MM->parse_version( File::Spec->catfile($prefix, $file) ); $have = "undef" unless defined $have; push @reports, [$mod, $want, $have]; if ( $DO_VERIFY_PREREQS && $HAS_CPAN_META && $type eq 'requires' ) { if ( $have !~ /\A$lax_version_re\z/ ) { push @dep_errors, "$mod version '$have' cannot be parsed ($req_string)"; } elsif ( ! $full_prereqs->requirements_for( $phase, $type )->accepts_module( $mod => $have ) ) { push @dep_errors, "$mod version '$have' is not in required range '$want'"; } } } else { push @reports, [$mod, $want, "missing"]; if ( $DO_VERIFY_PREREQS && $type eq 'requires' ) { push @dep_errors, "$mod is not installed ($req_string)"; } } } if ( @reports ) { push @full_reports, "=== $title ===\n\n"; my $ml = _max( map { length $_->[0] } @reports ); my $wl = _max( map { length $_->[1] } @reports ); my $hl = _max( map { length $_->[2] } @reports ); if ($type eq 'modules') { splice @reports, 1, 0, ["-" x $ml, "", "-" x $hl]; push @full_reports, map { sprintf(" %*s %*s\n", -$ml, $_->[0], $hl, $_->[2]) } @reports; } else { splice @reports, 1, 0, ["-" x $ml, "-" x $wl, "-" x $hl]; push @full_reports, map { sprintf(" %*s %*s %*s\n", -$ml, $_->[0], $wl, $_->[1], $hl, $_->[2]) } @reports; } push @full_reports, "\n"; } } } if ( @full_reports ) { diag "\nVersions for all modules listed in $source (including optional ones):\n\n", @full_reports; } if ( $cpan_meta_error || @dep_errors ) { diag "\n*** WARNING 
WARNING WARNING WARNING WARNING WARNING WARNING WARNING ***\n"; } if ( $cpan_meta_error ) { my ($orig_source) = grep { -f } 'MYMETA.json', 'MYMETA.yml'; diag "\nCPAN::Meta->load_file('$orig_source') failed with: $cpan_meta_error\n"; } if ( @dep_errors ) { diag join("\n", "\nThe following REQUIRED prerequisites were not satisfied:\n", @dep_errors, "\n" ); } pass('Reported prereqs'); # vim: ts=4 sts=4 sw=4 et: special-attributes.t100644001750001750 417614550254663 17531 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use Test2::V0; use lib 't/lib'; use WQTest; WQTest::test { my $class = shift; subtest 'data()' => sub { test_data($class) }; subtest 'name()' => sub { test_name($class) }; subtest 'id()' => sub { test_id($class) }; }; sub test_data { my $class = shift; my $wq = $class->new_from_html(q{
}); subtest setter => sub { $wq->find('a')->data( foo => 'bar' ); pass; }; subtest 'getter' => sub { is $wq->find('a')->data('foo') => 'bar'; }; } sub test_name { my $class = shift; my $wq = $class->new_from_html(q{
}); subtest 'getter' => sub { is [ $wq->find('a,b,c')->name ], [ 'foo', undef, 'bar' ], "getter, list context"; is scalar $wq->find('a,b,c')->name, 'foo', "getter, scalar context"; }; subtest setter => sub { $wq->find('a,b,c')->name( 'quux' ); is $wq->find($_)->name => 'quux' for 'a'..'c'; } } sub test_id { my $class = shift; my $wq = $class->new_from_html(q{
1 2 3
}); is [ $wq->find('a')->id ] => [ undef ], "no id, list context"; is scalar $wq->find('a')->id => undef, "no id, scalar context"; is $wq->find('#foo')->id => 'foo', 'single element'; is scalar($wq->find('#foo')->id) => 'foo', 'single element, scalar context'; is [ $wq->find('c')->id ], [ 'bar', undef, 'baz' ], 'many elements, list context'; is scalar $wq->find('c')->id, 'bar', 'many elements, scalar context'; $wq->find('b')->id('fool'); is $wq->find('#fool')->tagname => 'b', 'change id, scalar'; isa_ok $wq->find('c')->id('buz'), 'Web::Query'; is $wq->find('c')->id('buz')->size => 1, 'only the first element'; is $wq->find('#buz')->text => 1, "change first element"; my $i = 0; $wq->find('c')->id(sub{ 'new_'.$i++ }); is $wq->find('#new_'.$_)->size => 1 for 0..2; } 00-report-prereqs.dd100644001750001750 441514550254663 17254 0ustar00yanickyanick000000000000Web-Query-1.01/tdo { my $x = { 'configure' => { 'requires' => { 'ExtUtils::MakeMaker' => '0' } }, 'develop' => { 'requires' => { 'Test::More' => '0.96', 'Test::Vars' => '0', 'utf8' => '0' } }, 'runtime' => { 'requires' => { 'Exporter' => '0', 'HTML::Entities' => '0', 'HTML::Selector::XPath' => '0.20', 'HTML::TreeBuilder::LibXML' => '0', 'HTML::TreeBuilder::XPath' => '0', 'LWP::UserAgent' => '0', 'List::Util' => '1.44', 'Scalar::Util' => '0', 'parent' => '0', 'perl' => '5.008005', 'strict' => '0', 'warnings' => '0' } }, 'test' => { 'recommends' => { 'CPAN::Meta' => '2.120900' }, 'requires' => { 'Cwd' => '0', 'ExtUtils::MakeMaker' => '0', 'File::Spec' => '0', 'FindBin' => '0', 'IO::Handle' => '0', 'IPC::Open3' => '0', 'Test2::Tools::Exception' => '0', 'Test2::V0' => '0', 'Test::Exception' => '0', 'Test::More' => '0', 'lib' => '0', 'utf8' => '0' } } }; $x; }Query000755001750001750 014550254663 15515 5ustar00yanickyanick000000000000Web-Query-1.01/lib/WebLibXML.pm100644001750001750 516114550254663 17305 0ustar00yanickyanick000000000000Web-Query-1.01/lib/Web/Querypackage Web::Query::LibXML; our $AUTHORITY = 
'cpan:TOKUHIROM'; # ABSTRACT: fast, drop-in replacement for Web::Query $Web::Query::LibXML::VERSION = '1.01'; use 5.008005; use strict; use warnings; use parent qw/Web::Query Exporter/; use HTML::TreeBuilder::LibXML; # version required for unique_key use XML::LibXML 2.0107; our @EXPORT = qw/wq/; sub wq { Web::Query::LibXML->new(@_) } sub _build_tree { my $tree = HTML::TreeBuilder::LibXML->new(); $tree->ignore_unknown(0); $tree->store_comments(1); $tree; } sub _is_same_node { $_[1]->{node}->isSameNode($_[2]->{node}); } sub prev { my $self = shift; my @new; for my $tree (@{$self->{trees}}) { push @new, $tree->left; } return (ref $self || $self)->new_from_element(\@new, $self); } sub next { my $self = shift; my @new; for my $tree (@{$self->{trees}}) { push @new, grep { $_ } $tree->right; } return (ref $self || $self)->new_from_element(\@new, $self); } sub tagname { my $self = shift; my $method = @_ ? 'setNodeName' : 'nodeName'; my @retval = map { $_->{node}->$method(@_) } @{$self->{trees}}; return wantarray ? @retval : $retval[0]; } sub _node_id { $_[1]{node}->unique_key } 1; __END__ =pod =encoding utf-8 =head1 NAME Web::Query::LibXML - fast, drop-in replacement for Web::Query =head1 VERSION version 1.01 =head1 SYNOPSIS use Web::Query::LibXML; # imports wq() # all methods inherited from Web::Query # see Web::Query for documentation =head1 DESCRIPTION Web::Query::LibXML is a Web::Query subclass that overrides the _build_tree() method to use HTML::TreeBuilder::LibXML instead of HTML::TreeBuilder::XPath. It's a lot faster than its superclass. Use this module unless you can't install (or depend on) L<XML::LibXML> on your system. =head1 FUNCTIONS =over 4 =item C<< wq($stuff) >> This is a shortcut for C<< Web::Query::LibXML->new($stuff) >>. This function is exported by default. =back =head1 METHODS All public methods are inherited from L<Web::Query>. =head1 LICENSE Copyright (C) Carlos Fernando Avila Gratz. 
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself. =head1 AUTHOR Carlos Fernando Avila Gratz E<lt>cafe@q1software.comE<gt> =head1 SEE ALSO L, L, L =head1 BUGS Please report any bugs or feature requests on the bugtracker website L When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature. =cut My000755001750001750 014550254663 14523 5ustar00yanickyanick000000000000Web-Query-1.01/t/libTreeBuilder.pm100644001750001750 10414550254663 17402 0ustar00yanickyanick000000000000Web-Query-1.01/t/lib/Mypackage My::TreeBuilder; use parent qw/HTML::TreeBuilder::XPath/; 1;no_space_compacting.t100644001750001750 174014550254663 17712 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use Test2::V0; plan tests => 3; use Web::Query; is( Web::Query->new_from_html(<<'END')->as_html, '

hello there

', 'spaces trimmed' );

hello there

END is( Web::Query->new_from_html(<<'END', {no_space_compacting => 1})->as_html, '

hello there

', 'spaces left' );

hello there

END subtest 'LibXML' => sub { eval "require Web::Query::LibXML; 1" or plan skip_all => "couldn't load Web::Query::LibXML"; # LibXML doesn't trim by default is( Web::Query::LibXML->new_from_html(<<'END')->as_html, '

hello there

' );

hello there

END is( Web::Query::LibXML->new_from_html(<<'END', {no_space_compacting => 1})->as_html, '

hello there

' );

hello there

END }; bad-url-with-options.t100644001750001750 104714550254663 17707 0ustar00yanickyanick000000000000Web-Query-1.01/tuse Test2::V0; use Test2::Tools::Exception qw/dies/; use strict; use warnings; use utf8; use LWP::UserAgent; use Web::Query; my $ua = $Web::Query::UserAgent = LWP::UserAgent->new( agent => 'Mozilla/5.0' ); $ua->add_handler(request_send => sub { my ($request) = @_; my $code = $request->uri->host eq 'bad.com' ? 500 : 200; return HTTP::Response->new($code); }); plan tests => 2; ok dies { Web::Query->new('http://bad.com/'); }, "without options"; ok dies { Web::Query->new('http://bad.com/',{indent=>3}); }, "with options"; release000755001750001750 014550254663 15200 5ustar00yanickyanick000000000000Web-Query-1.01/xtunused-vars.t100644001750001750 14214550254663 17756 0ustar00yanickyanick000000000000Web-Query-1.01/xt/releaseuse Test::More 0.96 tests => 1; use Test::Vars; subtest 'unused vars' => sub { all_vars_ok(); }; html5_snippet.html100644001750001750 5514550254663 20062 0ustar00yanickyanick000000000000Web-Query-1.01/t/datafoo
bar
baz
processing-instructions.t100644001750001750 71514550254663 20616 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use lib 't/lib'; use Test2::V0; use WQTest; my $doc = <<'END';

stuff

alpha

aaa

END WQTest::test { my $class = shift; plan skip_all => "not working for $class" if $class eq 'Web::Query'; like $class->new($doc)->find(\"//processing-instruction('xml-stylesheet')")->as_html => qr/style.css/; } 06_new_from_url_error_handling.t100644001750001750 175714550254663 22007 0ustar00yanickyanick000000000000Web-Query-1.01/tuse strict; use warnings; use utf8; use Test2::V0; use LWP::UserAgent; use Web::Query; my $ua = LWP::UserAgent->new( agent => 'Mozilla/5.0' ); $Web::Query::UserAgent = $ua; $ua->add_handler(request_send => sub { my ($request, $ua, $h) = @_; if ($request->uri->host eq 'bad.com') { return HTTP::Response->new(500); } else { return HTTP::Response->new(200); } }); subtest 'bad url' => sub { my $q = eval { wq('http://bad.com/') }; is($q, undef); ok $@; isa_ok($Web::Query::RESPONSE, 'HTTP::Response'); is($Web::Query::RESPONSE->code, 500); isa_ok(Web::Query->last_response, 'HTTP::Response'); is(Web::Query::last_response->code, 500); }; subtest 'good status code' => sub { my $q = wq('http://good.com/'); ok($q); isa_ok($Web::Query::RESPONSE, 'HTTP::Response'); is($Web::Query::RESPONSE->code, 200); isa_ok(Web::Query->last_response, 'HTTP::Response'); is(Web::Query::last_response->code, 200); }; done_testing;