AnyData-0.12/000755 000765 000024 00000000000 12462242165 012723 5ustar00snostaff000000 000000 AnyData-0.12/Changes000644 000765 000024 00000006515 12462240547 014227 0ustar00snostaff000000 000000 
AnyData - easy access to data in many formats

Revision history for Perl extension AnyData.

0.12 2015-01-28
    * fix various packaging issues (RT#101768, RT#95908, RT#99763, RT#82862)
    * add missing prereq on CGI (RT#99916)
    * officially marked as deprecated
    * tidy up distribution
    * reformat Changes according to rough spec
    * fix some meta-data #toolchain pointed out

0.11 2012-12-14
    * new maintainer Sven Dowideit
    * use Test::More
    * Fails to retrieve XML data over HTTP (debian#421884) Bart Martens
    * Fix syntax error in POD documentation (debian) Ansgar Burchardt
    * Fix spelling errors (debian) Ansgar Burchardt
    * adColumn $distinct_flag not handled (RT#6248 & RT#6251) John D. Lima
    * writing fields containing 0 with AnyData::Format::Fixed (RT#8671)
    * weblog request and referer regexes were too greedy, and the referer and client regexes were in the wrong order (RT#34063) and (RT#72334) Wes Brown and pawal [...] blipp.com
    * adDump writes out fields containing 0 as empty strings (RT#28006) xcaron [...] gmail.com
    * A bug in AnyData::Format::Weblog, SQL execution result is not correct (RT#56962) youngyik@gmail.com

0.10 2004-04-19
    * really fixed adConvert, thanks for bug reports - Dan Wright, Scott Godin

0.09 2004-04-18
    * fixed adConvert - many thanks to Dan Wright for a patch

0.08 2003-10-25
    * Storage/RAM.pm : added version number

0.07 2003-10-25
    * Storage/PassThru.pm : added version number (thanks Randal Schwartz)

0.06 2003-10-25
    * test.pl : omitted XML & HTMLtable tests

0.05 2001-07-17
    There are major changes in the way the tied hash interface does deletions and exporting and in the XML and Weblog format parsers. If you use the tied hash interface or XML or Weblog (in either interface), please update to this version.

    The Big Stuff:
    * XML now accepts user-supplied tag-to-column mappings.
    * Import now supports cross and outer joins by importing more than one table into the same in-memory table.

    Many minor fixes:
    * fixed adExport to allow flags as documented (thx Matthew Wickline)
    * greatly improved and speeded up deletions from tied hashes and documented how they work (thx *alot* Matthew Wickline)
    * fixed weblog to handle embedded double quotes (thx Bob O'Neill)
    * added documentation to DBD::AnyData on cross joins and full outer joins as part of the import statement
    * fixed tests so they skip XML and HTMLtable if the required extra modules (e.g.
XML::Twig) aren't available (thx Malcolm Cook) * fixed XML to be able to work with XML::Twig 3.00 as well as earlier versions (meant to do that earlier) * fixed README to mention that DBD::AnyData replaces DBD::RAM (thx Mark Whittiker) * fixed weblog README to use current method and column names (thx Bob O'Neill) * fixed HTMLtable so that it can be used to export even if HTML::TableExtract and HTML::Parser aren't installed (thx Matthew Wickline) * fixed bad version number on DBD::AnyData - it reported 0.03 instead of 0.04 (hey I caught this one all by myself) * added version numbers to all of the submodules and standardized the docs 0.04 2001-06-26 * Beta release 0.03 2001-03-17 * second Alpha release 0.02 2001-02-04 * first Alpha release 0.01 2001-01-17 * pre Alpha release AnyData-0.12/lib/000755 000765 000024 00000000000 12462242164 013470 5ustar00snostaff000000 000000 AnyData-0.12/LICENSE000644 000765 000024 00000043774 12423400733 013741 0ustar00snostaff000000 000000 This software is copyright (c) 2012 by Sven Dowideit . This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself. Terms of the Perl programming language system itself a) the GNU General Public License as published by the Free Software Foundation; either version 1, or (at your option) any later version, or b) the "Artistic License" --- The GNU General Public License, Version 1, February 1989 --- This software is Copyright (c) 2012 by Sven Dowideit . This is free software, licensed under: The GNU General Public License, Version 1, February 1989 GNU GENERAL PUBLIC LICENSE Version 1, February 1989 Copyright (C) 1989 Free Software Foundation, Inc. 51 Franklin St, Suite 500, Boston, MA 02110-1335 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The license agreements of most software companies try to keep users at the mercy of those companies. By contrast, our General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. The General Public License applies to the Free Software Foundation's software and to any other program whose authors commit to using it. You can use it for your programs, too. When we speak of free software, we are referring to freedom, not price. Specifically, the General Public License is designed to make sure that you have the freedom to give away or sell copies of free software, that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of a such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must tell them their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. 
Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. The precise terms and conditions for copying, distribution and modification follow. GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License Agreement applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any work containing the Program or a portion of it, either verbatim or with modifications. Each licensee is addressed as "you". 1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this General Public License and to the absence of any warranty; and give any other recipients of the Program a copy of this General Public License along with the Program. You may charge a fee for the physical act of transferring a copy. 2. You may modify your copy or copies of the Program or any portion of it, and copy and distribute such modifications under the terms of Paragraph 1 above, provided that you also do the following: a) cause the modified files to carry prominent notices stating that you changed the files and the date of any change; and b) cause the whole of any work that you distribute or publish, that in whole or in part contains the Program or any part thereof, either with or without modifications, to be licensed at no charge to all third parties under the terms of this General Public License (except that you may choose to grant warranty protection to some or all third parties, at your option). c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the simplest and most usual way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this General Public License. d) You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. Mere aggregation of another independent work with the Program (or its derivative) on a volume of a storage or distribution medium does not bring the other work under the scope of these terms. 3. 
You may copy and distribute the Program (or a portion or derivative of it, under Paragraph 2) in object code or executable form under the terms of Paragraphs 1 and 2 above provided that you also do one of the following: a) accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Paragraphs 1 and 2 above; or, b) accompany it with a written offer, valid for at least three years, to give any third party free (except for a nominal charge for the cost of distribution) a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Paragraphs 1 and 2 above; or, c) accompany it with the information you received as to where the corresponding source code may be obtained. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form alone.) Source code for a work means the preferred form of the work for making modifications to it. For an executable file, complete source code means all the source code for all modules it contains; but, as a special exception, it need not include source code for modules which are standard libraries that accompany the operating system on which the executable file runs, or for standard header files or definitions files that accompany that operating system. 4. You may not copy, modify, sublicense, distribute or transfer the Program except as expressly provided under this General Public License. Any attempt otherwise to copy, modify, sublicense, distribute or transfer the Program is void, and will automatically terminate your rights to use the Program under this License. However, parties who have received copies, or rights to use copies, from you under this General Public License will not have their licenses terminated so long as such parties remain in full compliance. 5. By copying, distributing or modifying the Program (or any work based on the Program) you indicate your acceptance of this license to do so, and all its terms and conditions. 6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. 7. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of the license which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the license, you may choose any version ever published by the Free Software Foundation. 8. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. 
Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 9. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 10. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS Appendix: How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to humanity, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) 19yy This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 1, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA 02110-1301 USA Also add information on how to contact you by electronic and paper mail. If the program is interactive, make it output a short notice like this when it starts in an interactive mode: Gnomovision version 69, Copyright (C) 19xx name of author Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than `show w' and `show c'; they could even be mouse-clicks or menu items--whatever suits your program. You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary. 
Here a sample; alter the names: Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (a program to direct compilers to make passes at assemblers) written by James Hacker. , 1 April 1989 Ty Coon, President of Vice That's all there is to it! --- The Artistic License 1.0 --- This software is Copyright (c) 2012 by Sven Dowideit . This is free software, licensed under: The Artistic License 1.0 The Artistic License Preamble The intent of this document is to state the conditions under which a Package may be copied, such that the Copyright Holder maintains some semblance of artistic control over the development of the package, while giving the users of the package the right to use and distribute the Package in a more-or-less customary fashion, plus the right to make reasonable modifications. Definitions: - "Package" refers to the collection of files distributed by the Copyright Holder, and derivatives of that collection of files created through textual modification. - "Standard Version" refers to such a Package if it has not been modified, or has been modified in accordance with the wishes of the Copyright Holder. - "Copyright Holder" is whoever is named in the copyright or copyrights for the package. - "You" is you, if you're thinking about copying or distributing this Package. - "Reasonable copying fee" is whatever you can justify on the basis of media cost, duplication charges, time of people involved, and so on. (You will not be required to justify it to the Copyright Holder, but only to the computing community at large as a market that must bear the fee.) - "Freely Available" means that no fee is charged for the item itself, though there may be fees involved in handling the item. It also means that recipients of the item may redistribute it under the same conditions they received it. 1. You may make and give away verbatim copies of the source form of the Standard Version of this Package without restriction, provided that you duplicate all of the original copyright notices and associated disclaimers. 2. You may apply bug fixes, portability fixes and other modifications derived from the Public Domain or from the Copyright Holder. A Package modified in such a way shall still be considered the Standard Version. 3. You may otherwise modify your copy of this Package in any way, provided that you insert a prominent notice in each changed file stating how and when you changed that file, and provided that you do at least ONE of the following: a) place your modifications in the Public Domain or otherwise make them Freely Available, such as by posting said modifications to Usenet or an equivalent medium, or placing the modifications on a major archive site such as ftp.uu.net, or by allowing the Copyright Holder to include your modifications in the Standard Version of the Package. b) use the modified Package only within your corporation or organization. c) rename any non-standard executables so the names do not conflict with standard executables, which must also be provided, and provide a separate manual page for each non-standard executable that clearly documents how it differs from the Standard Version. d) make other distribution arrangements with the Copyright Holder. 4. 
You may distribute the programs of this Package in object code or executable form, provided that you do at least ONE of the following: a) distribute a Standard Version of the executables and library files, together with instructions (in the manual page or equivalent) on where to get the Standard Version. b) accompany the distribution with the machine-readable source of the Package with your modifications. c) accompany any non-standard executables with their corresponding Standard Version executables, giving the non-standard executables non-standard names, and clearly documenting the differences in manual pages (or equivalent), together with instructions on where to get the Standard Version. d) make other distribution arrangements with the Copyright Holder. 5. You may charge a reasonable copying fee for any distribution of this Package. You may charge any fee you choose for support of this Package. You may not charge a fee for this Package itself. However, you may distribute this Package in aggregate with other (possibly commercial) programs as part of a larger (possibly commercial) software distribution provided that you do not advertise this Package as a product of your own. 6. The scripts and library files supplied as input to or produced as output from the programs of this Package do not automatically fall under the copyright of this Package, but belong to whomever generated them, and may be sold commercially, and may be aggregated with this Package. 7. C or perl subroutines supplied by you and linked into this Package shall not be considered part of this Package. 8. The name of the Copyright Holder may not be used to endorse or promote products derived from this software without specific prior written permission. 9. THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE. 
The End AnyData-0.12/Makefile.PL000644 000765 000024 00000011014 12462241137 014670 0ustar00snostaff000000 000000 # -*- perl -*- use strict; use 5.008001; use ExtUtils::MakeMaker; my %RUN_DEPS = ( 'CGI' => '0' ); my %CONFIGURE_DEPS = ( 'ExtUtils::MakeMaker' => 0, ); my %BUILD_DEPS = (); my %TEST_DEPS = ( 'Test::Output' => '0', 'Test::More' => '0', ); my @AUTHORS = ( 'Jeff Zucker (jeff@vpservices.com)', 'Sven Dowideit (SvenDowideit@fosiki.com)', ); WriteMakefile1( MIN_PERL_VERSION => '5.008001', META_ADD => { 'meta-spec' => { version => 2 }, dynamic_config => 0, resources => { homepage => 'https://metacpan.org/release/AnyData', repository => { url => 'https://github.com/perl5-dbi/AnyData.git', web => 'https://github.com/perl5-dbi/AnyData', type => 'git', }, bugtracker => { web => 'http://rt.cpan.org/Public/Dist/Display.html?Name=AnyData', mailto => 'bug-AnyData@rt.cpan.org', }, license => 'http://dev.perl.org/licenses/', x_deprecated => 1, x_IRC => "irc://irc.perl.org/#dbi", x_MailingList => "mailto:dbi-dev\@perl.org'", }, prereqs => { develop => { requires => { 'Test::CPAN::Changes' => 0, 'Test::CheckManifest' => 0, 'Module::CPANTS::Analyse' => '0.96', 'Test::Kwalitee' => 0, 'Test::Pod' => 0, 'Test::Pod::Coverage' => 0, 'Test::Pod::Spelling::CommonMistakes' => 0, 'Test::Spelling' => 0, }, }, configure => { requires => {%CONFIGURE_DEPS}, }, build => { requires => {%BUILD_DEPS} }, test => { requires => {%TEST_DEPS} }, runtime => { requires => { %RUN_DEPS, }, }, }, }, NAME => 'AnyData', VERSION_FROM => 'lib/AnyData.pm', ABSTRACT_FROM => 'lib/AnyData.pm', LICENSE => "perl", AUTHOR => \@AUTHORS, CONFIGURE_REQUIRES => \%CONFIGURE_DEPS, BUILD_REQUIRES => \%BUILD_DEPS, PREREQ_PM => \%RUN_DEPS, TEST_REQUIRES => \%TEST_DEPS, test => { TESTS => 't/*.t xt/*.t' }, ); sub WriteMakefile1 { # originally written by Alexandr Ciornii, version 0.21. Added by eumm-upgrade. 
my %params = @_; my $eumm_version = $ExtUtils::MakeMaker::VERSION; $eumm_version = eval $eumm_version; die "EXTRA_META is deprecated" if ( exists( $params{EXTRA_META} ) ); die "License not specified" if ( !exists( $params{LICENSE} ) ); $params{TEST_REQUIRES} and $eumm_version < 6.6303 and $params{BUILD_REQUIRES} = { %{ $params{BUILD_REQUIRES} || {} }, %{ delete $params{TEST_REQUIRES} } }; #EUMM 6.5502 has problems with BUILD_REQUIRES $params{BUILD_REQUIRES} and $eumm_version < 6.5503 and $params{PREREQ_PM} = { %{ $params{PREREQ_PM} || {} }, %{ delete $params{BUILD_REQUIRES} } }; ref $params{AUTHOR} and "ARRAY" eq ref $params{AUTHOR} and $eumm_version < 6.5702 and $params{AUTHOR} = join( ", ", @{ $params{AUTHOR} } ); delete $params{CONFIGURE_REQUIRES} if ( $eumm_version < 6.52 ); delete $params{MIN_PERL_VERSION} if ( $eumm_version < 6.48 ); delete $params{META_MERGE} if ( $eumm_version < 6.46 ); delete $params{META_ADD}{prereqs} if ( $eumm_version < 6.58 ); delete $params{META_ADD}{'meta-spec'} if ( $eumm_version < 6.58 ); delete $params{META_ADD} if ( $eumm_version < 6.46 ); delete $params{LICENSE} if ( $eumm_version < 6.31 ); delete $params{AUTHOR} if ( $] < 5.005 ); delete $params{ABSTRACT_FROM} if ( $] < 5.005 ); delete $params{BINARY_LOCATION} if ( $] < 5.005 ); # more or less taken from Moose' Makefile.PL if ( $params{CONFLICTS} ) { my $ok = CheckConflicts(%params); exit(0) if ( $params{PREREQ_FATAL} and not $ok ); my $cpan_smoker = grep { $_ =~ m/(?:CR_SMOKER|CPAN_REPORTER|AUTOMATED_TESTING)/ } keys %ENV; unless ( $cpan_smoker || $ENV{PERL_MM_USE_DEFAULT} ) { sleep 4 unless ($ok); } delete $params{CONFLICTS}; } WriteMakefile(%params); } AnyData-0.12/MANIFEST000644 000765 000024 00000001653 12462242165 014061 0ustar00snostaff000000 000000 Changes lib/AnyData.pm lib/AnyData/Format/Base.pm lib/AnyData/Format/CSV.pm lib/AnyData/Format/FileSys.pm lib/AnyData/Format/Fixed.pm lib/AnyData/Format/HTMLtable.pm lib/AnyData/Format/Ini.pm lib/AnyData/Format/Mp3.pm lib/AnyData/Format/Paragraph.pm lib/AnyData/Format/Passwd.pm lib/AnyData/Format/Pipe.pm lib/AnyData/Format/Tab.pm lib/AnyData/Format/Text.pm lib/AnyData/Format/Weblog.pm lib/AnyData/Format/XML.pm lib/AnyData/Storage/File.pm lib/AnyData/Storage/File.pod lib/AnyData/Storage/FileSys.pm lib/AnyData/Storage/PassThru.pm lib/AnyData/Storage/RAM.pm lib/AnyData/Storage/TiedHash.pm LICENSE Makefile.PL MANIFEST This list of files MANIFEST.SKIP README t/56962.log t/56962.t t/api.t t/fixed.t t/fixed.tbl t/htmltable.t t/test.t t/weblog.t t/weblog.tbl t/xml.t META.yml Module YAML meta-data (added by MakeMaker) META.json Module JSON meta-data (added by MakeMaker) AnyData-0.12/MANIFEST.SKIP000644 000765 000024 00000000510 12462135307 014614 0ustar00snostaff000000 000000 #!include_default \B\.svn\b \.[Bb][Aa][Kk]$ \.orig$ \.old$ \.tdy$ \.tmp$ \..*swp ^Makefile$ ^Build$ CVS/.* \.svn/.* \.git \.git/.* \.cvsignore$ \.Inline/.* _Inline/.* \.bak$ \.tar$ \.tgz$ \.tar\.gz$ ^mess/ ^tmp/ ^testdata/ ^blib/ ^pm_to_blib$ ^_build/.* ~$ ^MYMETA.*$ AnyData-.* /t/htmltable.out \.[co] \.bs \bxt\b/ .travis.yml AnyData-0.12/META.json000644 000765 000024 00000003727 12462242165 014355 0ustar00snostaff000000 000000 { "abstract" : "(DEPRECATED) easy access to data in many formats", "author" : [ "Jeff Zucker (jeff@vpservices.com)", "Sven Dowideit (SvenDowideit@fosiki.com)" ], "dynamic_config" : 0, "generated_by" : "ExtUtils::MakeMaker version 7.04, CPAN::Meta::Converter version 2.143240", "license" : [ "perl_5" ], "meta-spec" : { "url" : 
"http://search.cpan.org/perldoc?CPAN::Meta::Spec", "version" : "2" }, "name" : "AnyData", "no_index" : { "directory" : [ "t", "inc" ] }, "prereqs" : { "build" : { "requires" : {} }, "configure" : { "requires" : { "ExtUtils::MakeMaker" : "0" } }, "develop" : { "requires" : { "Module::CPANTS::Analyse" : "0.96", "Test::CPAN::Changes" : "0", "Test::CheckManifest" : "0", "Test::Kwalitee" : "0", "Test::Pod" : "0", "Test::Pod::Coverage" : "0", "Test::Pod::Spelling::CommonMistakes" : "0", "Test::Spelling" : "0" } }, "runtime" : { "requires" : { "CGI" : "0" } }, "test" : { "requires" : { "Test::More" : "0", "Test::Output" : "0" } } }, "release_status" : "stable", "resources" : { "bugtracker" : { "mailto" : "bug-AnyData@rt.cpan.org", "web" : "http://rt.cpan.org/Public/Dist/Display.html?Name=AnyData" }, "homepage" : "https://metacpan.org/release/AnyData", "license" : [ "http://dev.perl.org/licenses/" ], "repository" : { "type" : "git", "url" : "https://github.com/perl5-dbi/AnyData.git", "web" : "https://github.com/perl5-dbi/AnyData" }, "x_IRC" : "irc://irc.perl.org/#dbi", "x_MailingList" : "mailto:dbi-dev@perl.org'", "x_deprecated" : 1 }, "version" : "0.12" } AnyData-0.12/META.yml000644 000765 000024 00000001560 12462242165 014176 0ustar00snostaff000000 000000 --- abstract: '(DEPRECATED) easy access to data in many formats' author: - 'Jeff Zucker (jeff@vpservices.com)' - 'Sven Dowideit (SvenDowideit@fosiki.com)' build_requires: Test::More: '0' Test::Output: '0' configure_requires: ExtUtils::MakeMaker: '0' dynamic_config: 0 generated_by: 'ExtUtils::MakeMaker version 7.04, CPAN::Meta::Converter version 2.143240' license: perl meta-spec: url: http://module-build.sourceforge.net/META-spec-v1.4.html version: '1.4' name: AnyData no_index: directory: - t - inc requires: CGI: '0' resources: Deprecated: 1 IRC: irc://irc.perl.org/#dbi MailingList: "mailto:dbi-dev@perl.org'" bugtracker: http://rt.cpan.org/Public/Dist/Display.html?Name=AnyData homepage: https://metacpan.org/release/AnyData license: http://dev.perl.org/licenses/ repository: https://github.com/perl5-dbi/AnyData.git version: '0.12' AnyData-0.12/README000644 000765 000024 00000007040 12423400733 013576 0ustar00snostaff000000 000000 README FILE FOR PERL MODULE -- AnyData WHY USE IT? The AnyData modules provide simple and uniform access to data from many sources -- perl arrays, local files, remote files retrievable via http or ftp -- and in many formats including flat files (CSV, Fixed Length, Tab Delimited, etc), standard format files (Web Logs, Passwd files, etc.), structured files (XML, HTML Tables) and binary files with parseable headers (mp3s, jpgs, pngs, etc). There are two separate modules, each providing a different interface: AnyData.pm provides a simple tied hash interface and DBD::AnyData provides a DBI/SQL interface. You can use either or both depending on your needs. 
Here are a few examples of the tied hash interface: # FIND A USER'S HOME DIRECTORY IN A PASSWD FILE # my $users = adTie( 'Passwd', '/etc/passwd' ); print $users->{jdoe}->{homedir}; # DELETE A PLAYER FROM A PIPE DELIMITED GAMES DATABASE # my $players = adTie( 'Pipe', 'games.db', 'u' ); delete $players->{jdoe}; # RECURSIVELY LIST THE ARTISTS FOR ALL REGGAE MP3s # IN A SPECIFIED DIRECTORY TREE # my $music = adTie( 'Mp3', ['c:/My Music/'] ); while ( my $song = each %$music ) { print $song->{artist},"\n" if $song->{genre} eq 'Reggae'; } # RETRIEVE A CSV FILE FROM AN FTP SERVER # AND PRINT IT TO THE SCREEN AS AN HTML TABLE # # print adConvert( 'CSV', 'ftp://foo.edu/pub/bar.csv', 'HTMLtable' ); # COUNT THE NUMBER OF HITS FOR A SPECIFIED PAGE IN A WEB LOG # my $hits = adTie( 'Weblog', 'access.log'); print adCount( $hits , request => 'mypage.html' ); # CREATE A CGI POP-UP MENU FROM A LISTING # OF THE VALUES OF A TABLE COLUMN # my $game = adTie( 'Pipe','games.db' ); my @players = adColumn( $game, 'player' ); print CGI::popup_menu( 'players', \@players ); # SELECT OR MODIFY MULTIPLE ROWS BASED ON COMPLEX CRITERIA # (this deletes all North American males over age 30) # my $data = adTie( 'Tab', 'mydb.tab'); delete $data->{{ country => qr/us|mx|ca/, gender => 'eq m', age => '> 30', }}; WHAT ELSE DO I NEED? * Perl * Additional modules are required for some advanced features, see 'perldoc AnyData'. HOW DO I INSTALL IT? 1. Install Perl if not already installed 2. Unpack the compressed files. (AnyData-version.tar.gz or AnyData-version.zip) 3a. If you are not familiar with the standard Perl makefile method, you can simply copy the files 3b. If you are familiar with the standard Perl make installation, just do as always (perl Makefile.PL; make; make test; make install) this should also work with dmake or nmake. HOW DO I USE IT? First you might like to try this simple script which creates a database and inserts the string "hello new world" into a record and then retrieves the record and prints it: #!perl -w use strict; use AnyData; my $table = adTie ('CSV','test.db','o',{cols=>'id,phrase'}); $table->{1} = {phrase=>'hello new world'}; print $table->{1}->{phrase}. WHERE CAN I GET MORE INFO? After installing the module, type "perldoc AnyData" at the command prompt, or just read the documentation at the bottom of the AnyData.pm file. WHO DUNNIT? Jeff Zucker Feel free to email me comments and suggestions, but please post questions requiring a response to the comp.lang.perl.modules newsgroup. READ MORE AND GRAB THE MODULE AT http://www.vpservices.com/jeff/programs/AnyData/ Enjoy! AnyData-0.12/t/000755 000765 000024 00000000000 12462242164 013165 5ustar00snostaff000000 000000 AnyData-0.12/t/56962.log000644 000765 000024 00000102453 12423400733 014363 0ustar00snostaff000000 000000 192.168.192.149 - - [31/Mar/2010:00:00:00 +0800] "GET /cgi-bin/john/logoutjohn.fcgi HTTP/1.0" 200 7599 125.79.143.144 192.168.192.149 - - [31/Mar/2010:00:00:00 +0800] "GET /john/job_sy.html? HTTP/1.0" 200 10763 125.79.143.144 192.168.208.93 - - [31/Mar/2010:00:00:00 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7151 61.138.125.11 192.168.208.92 - - [31/Mar/2010:00:00:02 +0800] "GET /bill/job_sy.html? HTTP/1.0" 200 10603 219.229.173.58 192.168.192.149 - - [31/Mar/2010:00:00:02 +0800] "GET /cgi-bin/john/flowjohn.fcgi?tm=3 HTTP/1.0" 200 1483 120.68.47.26 192.168.208.93 - - [31/Mar/2010:00:00:03 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7149 60.28.129.250 192.168.208.93 - - [31/Mar/2010:00:00:04 +0800] "GET /john/job_gz_2010.html? 
HTTP/1.0" 200 16738 61.138.125.11 192.168.208.92 - - [31/Mar/2010:00:00:05 +0800] "GET /bill/job_yjhd.html?dselkn8 HTTP/1.0" 200 16742 58.251.166.6 192.168.192.150 - - [31/Mar/2010:00:00:05 +0800] "GET /cgi-bin/bill/newusr.fcgi HTTP/1.0" 200 1587 58.51.82.14 192.168.192.149 - - [31/Mar/2010:00:00:05 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7147 125.79.143.144 192.168.192.149 - - [31/Mar/2010:00:00:05 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=3 HTTP/1.0" 200 1481 222.76.51.189 192.168.192.150 - - [31/Mar/2010:00:00:07 +0800] "GET /cgi-bin/bill/logoutbill.fcgi HTTP/1.0" 200 9191 218.80.214.142 192.168.192.150 - - [31/Mar/2010:00:00:07 +0800] "GET /bill/job_sy.html? HTTP/1.0" 200 10603 218.80.214.142 192.168.208.93 - - [31/Mar/2010:00:00:08 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=3 HTTP/1.0" 200 1485 112.93.87.177 192.168.208.92 - - [31/Mar/2010:00:00:08 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7142 110.243.49.79 192.168.192.150 - - [31/Mar/2010:00:00:08 +0800] "GET /cgi-bin/bill/newusr.fcgi HTTP/1.0" 200 1598 58.51.82.14 192.168.208.93 - - [31/Mar/2010:00:00:10 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=3 HTTP/1.0" 200 1484 119.97.202.145 192.168.192.149 - - [31/Mar/2010:00:00:10 +0800] "GET /cgi-bin/john/flowjohn.fcgi?tm=0 HTTP/1.0" 200 1197 125.79.143.144 192.168.208.92 - - [31/Mar/2010:00:00:10 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8805 60.23.181.42 192.168.192.150 - - [31/Mar/2010:00:00:11 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8803 116.226.80.131 192.168.208.93 - - [31/Mar/2010:00:00:12 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=2 HTTP/1.0" 200 1485 119.97.202.145 192.168.192.148 - - [31/Mar/2010:00:00:12 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8803 59.39.159.221 192.168.208.93 - - [31/Mar/2010:00:00:13 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=0 HTTP/1.0" 200 1195 119.5.51.215 192.168.208.93 - - [31/Mar/2010:00:00:13 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=0 HTTP/1.0" 200 1197 112.93.87.177 192.168.192.150 - - [31/Mar/2010:00:00:13 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8797 218.80.214.142 192.168.208.92 - - [31/Mar/2010:00:00:14 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8802 124.160.46.39 192.168.192.148 - - [31/Mar/2010:00:00:14 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 9191 113.15.27.49 192.168.208.92 - - [31/Mar/2010:00:00:14 +0800] "GET /john/job_gz_2010.html? HTTP/1.0" 200 16738 119.109.23.248 192.168.192.149 - - [31/Mar/2010:00:00:14 +0800] "GET /bill/job_sy.html?sdf2n0 HTTP/1.0" 200 10603 222.76.51.189 192.168.192.149 - - [31/Mar/2010:00:00:15 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7152 116.52.145.132 192.168.192.148 - - [31/Mar/2010:00:00:15 +0800] "GET /bill/job_sy.html? 
HTTP/1.0" 200 10603 113.15.27.49 192.168.208.93 - - [31/Mar/2010:00:00:15 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=1 HTTP/1.0" 200 398 119.97.202.145 192.168.192.150 - - [31/Mar/2010:00:00:16 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8810 113.140.9.242 192.168.208.93 - - [31/Mar/2010:00:00:16 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8795 58.20.153.180 192.168.192.150 - - [31/Mar/2010:00:00:19 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8807 122.231.3.44 192.168.208.93 - - [31/Mar/2010:00:00:19 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=1 HTTP/1.0" 200 1484 119.97.202.145 192.168.208.93 - - [31/Mar/2010:00:00:19 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=1 HTTP/1.0" 200 1481 119.5.51.215 192.168.208.92 - - [31/Mar/2010:00:00:20 +0800] "GET /cgi-bin/bill/gift.fcgi?type=2 HTTP/1.0" 302 235 112.240.bill.63 192.168.192.149 - - [31/Mar/2010:00:00:21 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8801 116.232.143.236 192.168.192.148 - - [31/Mar/2010:00:00:21 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7152 113.89.210.81 192.168.208.93 - - [31/Mar/2010:00:00:21 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7153 221.216.165.92 192.168.208.93 - - [31/Mar/2010:00:00:21 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=2 HTTP/1.0" 200 1481 119.5.51.215 192.168.208.93 - - [31/Mar/2010:00:00:22 +0800] "GET /cgi-bin/bill/logoutbill.fcgi HTTP/1.0" 200 9191 112.93.87.177 192.168.208.93 - - [31/Mar/2010:00:00:22 +0800] "GET /bill/job_sy.html? HTTP/1.0" 304 - 112.93.87.177 192.168.208.93 - - [31/Mar/2010:00:00:22 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=0 HTTP/1.0" 200 1197 119.97.202.145 192.168.192.149 - - [31/Mar/2010:00:00:24 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8799 121.12.58.95 192.168.208.93 - - [31/Mar/2010:00:00:24 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=3 HTTP/1.0" 200 1483 119.5.51.215 192.168.192.149 - - [31/Mar/2010:00:00:26 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7147 59.62.10.131 192.168.208.92 - - [31/Mar/2010:00:00:27 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8803 117.136.9.58 192.168.192.148 - - [31/Mar/2010:00:00:27 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8804 218.80.212.186 192.168.192.148 - - [31/Mar/2010:00:00:29 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8809 117.39.184.151 192.168.208.92 - - [31/Mar/2010:00:00:30 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8798 222.35.87.158 192.168.192.148 - - [31/Mar/2010:00:00:30 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8802 117.34.216.122 192.168.192.150 - - [31/Mar/2010:00:00:30 +0800] "GET /bill/job_yjhd.html?dselkn8 HTTP/1.0" 304 - 124.40.191.61 192.168.192.150 - - [31/Mar/2010:00:00:31 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7148 117.26.154.97 192.168.192.150 - - [31/Mar/2010:00:00:31 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8800 113.88.187.61 192.168.192.149 - - [31/Mar/2010:00:00:31 +0800] "GET /cgi-bin/john/flowjohn.fcgi?tm=0 HTTP/1.0" 200 1197 59.62.10.131 192.168.192.148 - - [31/Mar/2010:00:00:32 +0800] "GET /cgi-bin/bill/gift.fcgi?type=1 HTTP/1.0" 302 235 222.84.234.13 192.168.192.148 - - [31/Mar/2010:00:00:32 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8802 220.189.245.194 192.168.208.93 - - [31/Mar/2010:00:00:33 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8803 121.138.98.19 192.168.208.93 - - [31/Mar/2010:00:00:33 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8803 61.158.169.253 192.168.192.148 - - [31/Mar/2010:00:00:34 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8804 218.86.103.165 192.168.192.150 - - 
[31/Mar/2010:00:00:34 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7145 218.5.227.106 192.168.192.148 - - [31/Mar/2010:00:00:34 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8803 220.169.14.112 192.168.208.93 - - [31/Mar/2010:00:00:35 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8800 58.19.198.7 192.168.192.148 - - [31/Mar/2010:00:00:35 +0800] "GET /cgi-bin/bill/logoutbill.fcgi HTTP/1.0" 200 9191 122.243.245.135 192.168.192.148 - - [31/Mar/2010:00:00:36 +0800] "GET /bill/job_sy.html? HTTP/1.0" 200 10603 122.243.245.135 192.168.192.150 - - [31/Mar/2010:00:00:36 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8800 113.88.187.61 192.168.208.92 - - [31/Mar/2010:00:00:38 +0800] "GET /bill/job_yjhd.html?dselkn8 HTTP/1.0" 200 16742 58.251.242.52 192.168.192.150 - - [31/Mar/2010:00:00:38 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8804 118.116.23.8 192.168.192.149 - - [31/Mar/2010:00:00:38 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8805 222.216.69.87 192.168.208.92 - - [31/Mar/2010:00:00:38 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8805 119.4.120.117 192.168.208.93 - - [31/Mar/2010:00:00:39 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=0 HTTP/1.0" 200 1197 58.19.198.7 192.168.208.93 - - [31/Mar/2010:00:00:40 +0800] "GET /bill/job_sy.html?sdf2n0 HTTP/1.0" 200 10603 119.5.51.215 192.168.192.148 - - [31/Mar/2010:00:00:40 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7149 117.36.32.175 192.168.192.150 - - [31/Mar/2010:00:00:41 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8802 59.37.241.218 192.168.208.92 - - [31/Mar/2010:00:00:42 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=0 HTTP/1.0" 200 1197 119.4.120.117 192.168.192.149 - - [31/Mar/2010:00:00:42 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8803 116.28.234.73 192.168.192.150 - - [31/Mar/2010:00:00:43 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8801 121.32.2.188 192.168.192.149 - - [31/Mar/2010:00:00:43 +0800] "GET /cgi-bin/bill/gift.fcgi?type=4 HTTP/1.0" 302 235 113.106.209.100 192.168.208.93 - - [31/Mar/2010:00:00:44 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=0 HTTP/1.0" 200 1197 121.138.98.19 192.168.208.93 - - [31/Mar/2010:00:00:45 +0800] "GET /bill/job_yjhd.html?dselkn8 HTTP/1.0" 200 16742 61.158.169.253 192.168.208.92 - - [31/Mar/2010:00:00:45 +0800] "GET /cgi-bin/bill/logoutbill.fcgi HTTP/1.0" 200 9191 222.35.87.158 192.168.208.92 - - [31/Mar/2010:00:00:45 +0800] "GET /bill/job_sy.html? HTTP/1.0" 200 10603 222.35.87.158 192.168.208.93 - - [31/Mar/2010:00:00:45 +0800] "GET /cgi-bin/bill/gift.fcgi?type=1 HTTP/1.0" 302 235 124.160.98.2 192.168.208.93 - - [31/Mar/2010:00:00:46 +0800] "GET /john/job_sy.html?ckjx9ca HTTP/1.0" 200 10763 60.243.172.211 192.168.192.149 - - [31/Mar/2010:00:00:47 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8802 125.75.232.87 192.168.192.150 - - [31/Mar/2010:00:00:47 +0800] "GET /bill/job_sy.html?sdf2n0 HTTP/1.0" 304 - 113.88.187.61 192.168.208.93 - - [31/Mar/2010:00:00:47 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7151 119.182.155.56 192.168.192.148 - - [31/Mar/2010:00:00:47 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8805 113.106.209.99 192.168.208.92 - - [31/Mar/2010:00:00:48 +0800] "GET /cgi-bin/bill/newusr.fcgi HTTP/1.0" 200 1587 60.29.168.46 192.168.192.148 - - [31/Mar/2010:00:00:48 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8804 222.213.206.72 192.168.192.148 - - [31/Mar/2010:00:00:49 +0800] "GET /john/job_gz_2010.html? HTTP/1.0" 200 16738 117.36.32.175 192.168.192.148 - - [31/Mar/2010:00:00:49 +0800] "GET /bill/job_gz_2010.html? 
HTTP/1.0" 200 17987 117.34.216.122 192.168.208.92 - - [31/Mar/2010:00:00:50 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8802 119.183.42.3 192.168.192.148 - - [31/Mar/2010:00:00:50 +0800] "GET /cgi-bin/bill/gift.fcgi?type=1 HTTP/1.0" 302 235 222.84.234.13 192.168.192.148 - - [31/Mar/2010:00:00:50 +0800] "GET /john/job_sy.html?ckjx9ca HTTP/1.0" 200 10763 125.72.139.34 192.168.208.92 - - [31/Mar/2010:00:00:50 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8801 60.29.168.46 192.168.192.148 - - [31/Mar/2010:00:00:52 +0800] "GET /bill/job_yjhd.html?dselkn8 HTTP/1.0" 200 16742 183.5.5.50 192.168.192.148 - - [31/Mar/2010:00:00:52 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8801 222.84.234.13 192.168.192.150 - - [31/Mar/2010:00:00:52 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8807 116.53.69.133 192.168.208.93 - - [31/Mar/2010:00:00:53 +0800] "GET /john/job_yjhd.html?sdkj989 HTTP/1.0" 200 14065 60.243.172.211 192.168.208.93 - - [31/Mar/2010:00:00:53 +0800] "GET /cgi-bin/bill/gift.fcgi?type=1 HTTP/1.0" 302 235 110.251.143.243 192.168.208.92 - - [31/Mar/2010:00:00:53 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=0 HTTP/1.0" 200 1195 61.149.133.120 192.168.208.93 - - [31/Mar/2010:00:00:54 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8799 116.117.57.99 192.168.192.148 - - [31/Mar/2010:00:00:54 +0800] "GET /cgi-bin/bill/loginbill.fcgi?type=3&username=chengling201314 HTTP/1.0" 200 8802 113.15.27.49 192.168.192.150 - - [31/Mar/2010:00:00:54 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8802 114.138.22.48 192.168.192.148 - - [31/Mar/2010:00:00:54 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=3 HTTP/1.0" 200 1483 218.86.103.165 192.168.192.149 - - [31/Mar/2010:00:00:55 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=0 HTTP/1.0" 200 1197 222.216.69.87 192.168.192.148 - - [31/Mar/2010:00:00:56 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8801 113.133.74.135 192.168.208.93 - - [31/Mar/2010:00:00:56 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8801 202.108.50.70 192.168.208.93 - - [31/Mar/2010:00:00:56 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8800 124.160.98.2 192.168.192.148 - - [31/Mar/2010:00:00:58 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7146 61.152.251.88 192.168.192.150 - - [31/Mar/2010:00:00:58 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7145 222.208.224.113 192.168.192.148 - - [31/Mar/2010:00:01:00 +0800] "GET /cgi-bin/bill/logoutbill.fcgi HTTP/1.0" 200 9191 113.15.27.49 192.168.208.93 - - [31/Mar/2010:00:01:00 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8803 121.138.98.19 192.168.208.92 - - [31/Mar/2010:00:01:01 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8800 113.229.91.227 192.168.192.150 - - [31/Mar/2010:00:01:01 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7149 121.35.115.74 192.168.192.148 - - [31/Mar/2010:00:01:01 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=2 HTTP/1.0" 200 1485 218.86.103.165 192.168.208.92 - - [31/Mar/2010:00:01:01 +0800] "GET /cgi-bin/bill/bill.fcgi?tm=2 HTTP/1.0" 200 890 117.136.9.58 192.168.208.92 - - [31/Mar/2010:00:01:02 +0800] "GET /cgi-bin/bill/newusr.fcgi HTTP/1.0" 200 1598 60.29.168.46 192.168.192.148 - - [31/Mar/2010:00:01:02 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=0 HTTP/1.0" 200 1197 113.133.74.135 192.168.208.92 - - [31/Mar/2010:00:01:03 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8801 60.29.168.46 192.168.192.150 - - [31/Mar/2010:00:01:03 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8800 113.88.187.61 192.168.208.92 - - [31/Mar/2010:00:01:04 +0800] "GET /cgi-bin/bill/bill.fcgi?tm=1 
HTTP/1.0" 200 890 117.136.9.58 192.168.208.92 - - [31/Mar/2010:00:01:05 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8798 222.35.87.158 192.168.192.148 - - [31/Mar/2010:00:01:05 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=1 HTTP/1.0" 200 1481 218.86.103.165 192.168.192.148 - - [31/Mar/2010:00:01:06 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8796 59.40.132.41 192.168.192.149 - - [31/Mar/2010:00:01:06 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8798 117.95.37.227 192.168.192.149 - - [31/Mar/2010:00:01:06 +0800] "GET /cgi-bin/john/billjohn.fcgi?tm=2 HTTP/1.0" 200 890 116.52.145.132 192.168.208.92 - - [31/Mar/2010:00:01:07 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8798 123.15.45.242 192.168.192.148 - - [31/Mar/2010:00:01:07 +0800] "GET /bill/job_yjhd.html?dselkn8 HTTP/1.0" 200 16742 117.39.184.151 192.168.208.93 - - [31/Mar/2010:00:01:08 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8808 221.4.154.10 192.168.192.149 - - [31/Mar/2010:00:01:08 +0800] "GET /cgi-bin/john/billjohn.fcgi?tm=1 HTTP/1.0" 200 890 116.52.145.132 192.168.208.92 - - [31/Mar/2010:00:01:10 +0800] "GET /cgi-bin/bill/bill.fcgi?tm=0 HTTP/1.0" 200 890 117.136.9.58 192.168.192.149 - - [31/Mar/2010:00:01:10 +0800] "GET /cgi-bin/john/billjohn.fcgi?tm=0 HTTP/1.0" 200 890 116.52.145.132 192.168.192.149 - - [31/Mar/2010:00:01:10 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=3 HTTP/1.0" 200 1486 219.133.179.203 192.168.208.92 - - [31/Mar/2010:00:01:10 +0800] "GET /bill/job_sy.html?sdf2n0 HTTP/1.0" 200 10603 60.29.168.46 192.168.192.148 - - [31/Mar/2010:00:01:11 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=0 HTTP/1.0" 200 1193 218.86.103.165 192.168.192.148 - - [31/Mar/2010:00:01:12 +0800] "GET /bill/job_yjhd.html?dselkn8 HTTP/1.0" 200 16742 113.106.209.99 192.168.192.149 - - [31/Mar/2010:00:01:12 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7150 58.34.84.149 192.168.208.92 - - [31/Mar/2010:00:01:12 +0800] "GET /cgi-bin/bill/bill.fcgi?tm=1 HTTP/1.0" 200 890 117.136.9.58 192.168.192.149 - - [31/Mar/2010:00:01:12 +0800] "GET /cgi-bin/john/flowjohn.fcgi?tm=3 HTTP/1.0" 200 1481 116.52.145.132 192.168.192.148 - - [31/Mar/2010:00:01:12 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8805 119.john.197.17 192.168.192.148 - - [31/Mar/2010:00:01:13 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8805 116.23.13.232 192.168.208.92 - - [31/Mar/2010:00:01:13 +0800] "GET /_vti_bin/owssvr.dll?UL=1&ACT=4&BUILD=8164&STRMVER=4&CAPREQ=0 HTTP/1.0" 404 217 58.255.32.1 192.168.208.92 - - [31/Mar/2010:00:01:14 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7145 58.255.32.1 192.168.208.92 - - [31/Mar/2010:00:01:14 +0800] "GET /bill/job_gz_2010.html? HTTP/1.0" 200 17987 60.29.168.46 192.168.192.149 - - [31/Mar/2010:00:01:15 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8803 113.115.183.77 192.168.208.92 - - [31/Mar/2010:00:01:15 +0800] "GET /MSOffice/cltreq.asp?UL=1&ACT=4&BUILD=8164&STRMVER=4&CAPREQ=0 HTTP/1.0" 404 217 58.255.32.1 192.168.192.149 - - [31/Mar/2010:00:01:15 +0800] "GET /cgi-bin/john/flowjohn.fcgi?tm=2 HTTP/1.0" 200 1483 116.52.145.132 192.168.192.148 - - [31/Mar/2010:00:01:15 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8809 58.214.34.75 192.168.208.93 - - [31/Mar/2010:00:01:16 +0800] "GET /cgi-bin/bill/logoutbill.fcgi HTTP/1.0" 200 9191 218.200.118.202 192.168.208.93 - - [31/Mar/2010:00:01:16 +0800] "GET /bill/job_sy.html? 
HTTP/1.0" 200 10603 218.200.118.202 192.168.208.93 - - [31/Mar/2010:00:01:18 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=0 HTTP/1.0" 200 1196 221.4.154.10 192.168.192.149 - - [31/Mar/2010:00:01:18 +0800] "GET /cgi-bin/bill/bill.fcgi?tm=2 HTTP/1.0" 200 890 219.133.179.203 192.168.192.148 - - [31/Mar/2010:00:01:19 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8800 113.92.116.64 192.168.192.149 - - [31/Mar/2010:00:01:19 +0800] "GET /cgi-bin/bill/bill.fcgi?tm=1 HTTP/1.0" 200 890 219.133.179.203 192.168.192.148 - - [31/Mar/2010:00:01:19 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7149 118.249.118.101 192.168.208.92 - - [31/Mar/2010:00:01:20 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=3 HTTP/1.0" 200 1486 117.136.9.58 192.168.192.149 - - [31/Mar/2010:00:01:20 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8805 117.22.14.167 192.168.192.149 - - [31/Mar/2010:00:01:21 +0800] "GET /cgi-bin/bill/bill.fcgi?tm=0 HTTP/1.0" 200 890 219.133.179.203 192.168.192.148 - - [31/Mar/2010:00:01:21 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7146 114.106.204.183 192.168.192.149 - - [31/Mar/2010:00:01:22 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8805 222.242.106.86 192.168.192.149 - - [31/Mar/2010:00:01:24 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7147 125.79.143.144 192.168.192.148 - - [31/Mar/2010:00:01:24 +0800] "GET /cgi-bin/john/flowjohn.fcgi?tm=0 HTTP/1.0" 200 1197 118.249.118.101 192.168.208.93 - - [31/Mar/2010:00:01:24 +0800] "GET /cgi-bin/john/flowjohn.fcgi?tm=3 HTTP/1.0" 200 447 58.255.141.58 192.168.192.149 - - [31/Mar/2010:00:01:25 +0800] "GET /cgi-bin/john/flowjohn.fcgi?tm=1 HTTP/1.0" 200 1483 116.52.145.132 192.168.208.92 - - [31/Mar/2010:00:01:25 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=2 HTTP/1.0" 200 1485 117.136.9.58 192.168.208.93 - - [31/Mar/2010:00:01:27 +0800] "GET /john/job_yjhd.html?sdkj989 HTTP/1.0" 200 14065 60.28.129.250 192.168.192.149 - - [31/Mar/2010:00:01:27 +0800] "GET /cgi-bin/john/logoutjohn.fcgi HTTP/1.0" 200 7599 125.79.143.144 192.168.208.92 - - [31/Mar/2010:00:01:27 +0800] "GET /bill/job_sy.html?sdf2n0 HTTP/1.0" 200 10603 123.128.64.131 192.168.192.149 - - [31/Mar/2010:00:01:27 +0800] "GET /john/job_sy.html? 
HTTP/1.0" 304 - 125.79.143.144 192.168.192.150 - - [31/Mar/2010:00:01:28 +0800] "GET /bill/job_yjhd.html?dselkn8 HTTP/1.0" 200 16742 121.32.2.188 192.168.192.148 - - [31/Mar/2010:00:01:28 +0800] "GET /bill/job_yjhd.html?dselkn8 HTTP/1.0" 200 16742 218.86.103.165 192.168.192.148 - - [31/Mar/2010:00:01:28 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8809 218.85.148.22 192.168.208.93 - - [31/Mar/2010:00:01:28 +0800] "GET /cgi-bin/bill/gift.fcgi?type=3 HTTP/1.0" 302 235 124.160.98.2 192.168.192.149 - - [31/Mar/2010:00:01:28 +0800] "GET /cgi-bin/john/flowjohn.fcgi?tm=0 HTTP/1.0" 200 1193 116.52.145.132 192.168.208.92 - - [31/Mar/2010:00:01:29 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=1 HTTP/1.0" 200 1485 117.136.9.58 192.168.192.148 - - [31/Mar/2010:00:01:29 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7149 218.78.209.20 192.168.208.92 - - [31/Mar/2010:00:01:30 +0800] "GET /bill/job_yjhd.html?dselkn8 HTTP/1.0" 200 16742 119.183.42.3 192.168.208.92 - - [31/Mar/2010:00:01:32 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=0 HTTP/1.0" 200 1196 117.136.9.58 192.168.208.93 - - [31/Mar/2010:00:01:33 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8798 117.136.9.65 192.168.208.92 - - [31/Mar/2010:00:01:33 +0800] "GET /bill/job_yjhd.html?dselkn8 HTTP/1.0" 200 16742 60.29.168.46 192.168.192.148 - - [31/Mar/2010:00:01:34 +0800] "GET /john/notice_newrule.html? HTTP/1.0" 200 3277 61.152.251.88 192.168.192.149 - - [31/Mar/2010:00:01:34 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8804 113.15.23.197 192.168.192.148 - - [31/Mar/2010:00:01:34 +0800] "GET /cgi-bin/bill/gift.fcgi?type=1 HTTP/1.0" 302 235 118.122.85.157 192.168.192.149 - - [31/Mar/2010:00:01:35 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8803 113.90.184.254 192.168.192.149 - - [31/Mar/2010:00:01:35 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7150 119.60.199.119 192.168.192.148 - - [31/Mar/2010:00:01:35 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8799 113.104.0.85 192.168.192.148 - - [31/Mar/2010:00:01:36 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8810 59.62.83.174 192.168.208.92 - - [31/Mar/2010:00:01:38 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8806 122.95.114.63 192.168.208.93 - - [31/Mar/2010:00:01:38 +0800] "GET /john/job_sy.html?ckjx9ca HTTP/1.0" 200 10763 60.28.129.250 192.168.192.148 - - [31/Mar/2010:00:01:38 +0800] "GET /cgi-bin/john/index.fcgi?index=rule HTTP/1.0" 200 7151 61.152.251.88 192.168.208.92 - - [31/Mar/2010:00:01:39 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8804 116.77.229.25 192.168.192.148 - - [31/Mar/2010:00:01:40 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8804 218.17.234.92 192.168.208.92 - - [31/Mar/2010:00:01:41 +0800] "GET /cgi-bin/john/flowjohn.fcgi?tm=3 HTTP/1.0" 200 1481 119.109.23.248 192.168.192.150 - - [31/Mar/2010:00:01:42 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8798 117.36.88.27 192.168.192.150 - - [31/Mar/2010:00:01:42 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7145 119.144.233.182 192.168.192.148 - - [31/Mar/2010:00:01:42 +0800] "GET /john/job_gz_2010.html? 
HTTP/1.0" 200 16738 218.78.209.20 192.168.208.92 - - [31/Mar/2010:00:01:42 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8798 222.35.87.158 192.168.192.149 - - [31/Mar/2010:00:01:42 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8793 219.133.114.244 192.168.192.150 - - [31/Mar/2010:00:01:42 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8800 123.97.143.102 192.168.208.92 - - [31/Mar/2010:00:01:43 +0800] "GET /cgi-bin/john/flowjohn.fcgi?tm=2 HTTP/1.0" 200 1483 119.109.23.248 192.168.192.148 - - [31/Mar/2010:00:01:44 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8803 113.142.17.64 192.168.208.92 - - [31/Mar/2010:00:01:46 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7146 119.6.16.114 192.168.192.148 - - [31/Mar/2010:00:01:46 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8802 119.41.200.143 192.168.192.149 - - [31/Mar/2010:00:01:46 +0800] "GET /bill/job_yjhd.html?dselkn8 HTTP/1.0" 304 - 219.133.179.203 192.168.192.150 - - [31/Mar/2010:00:01:49 +0800] "GET /john/job_sy.html?ckjx9ca HTTP/1.0" 200 10763 121.35.115.74 192.168.192.150 - - [31/Mar/2010:00:01:50 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7147 58.49.233.156 192.168.208.93 - - [31/Mar/2010:00:01:50 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7149 219.150.john.218 192.168.192.148 - - [31/Mar/2010:00:01:51 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8800 221.224.131.210 192.168.192.148 - - [31/Mar/2010:00:01:51 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8801 118.122.85.157 192.168.192.149 - - [31/Mar/2010:00:01:51 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7152 117.25.176.98 192.168.208.92 - - [31/Mar/2010:00:01:51 +0800] "GET /cgi-bin/john/flowjohn.fcgi?tm=1 HTTP/1.0" 200 1481 119.109.23.248 192.168.208.92 - - [31/Mar/2010:00:01:52 +0800] "GET /cgi-bin/bill/gift.fcgi?type=2 HTTP/1.0" 302 235 110.252.199.206 192.168.192.149 - - [31/Mar/2010:00:01:53 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8803 116.52.48.183 192.168.208.92 - - [31/Mar/2010:00:01:54 +0800] "GET /cgi-bin/john/flowjohn.fcgi?tm=0 HTTP/1.0" 200 1195 119.109.23.248 192.168.192.149 - - [31/Mar/2010:00:01:55 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8803 122.4.32.188 192.168.192.150 - - [31/Mar/2010:00:01:56 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7150 219.150.142.227 192.168.192.149 - - [31/Mar/2010:00:01:56 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 9191 119.144.58.33 192.168.192.148 - - [31/Mar/2010:00:01:56 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7147 123.54.174.131 192.168.192.149 - - [31/Mar/2010:00:01:57 +0800] "GET /bill/job_sy.html? HTTP/1.0" 304 - 119.144.58.33 192.168.192.149 - - [31/Mar/2010:00:01:57 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7144 218.22.151.227 192.168.192.148 - - [31/Mar/2010:00:01:57 +0800] "GET /cgi-bin/bill/logoutbill.fcgi HTTP/1.0" 200 9191 117.34.216.122 192.168.192.148 - - [31/Mar/2010:00:01:58 +0800] "GET /bill/job_sy.html? 
HTTP/1.0" 200 10603 117.34.216.122 192.168.192.149 - - [31/Mar/2010:00:01:58 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7144 117.40.28.107 192.168.192.148 - - [31/Mar/2010:00:01:59 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8801 116.228.153.94 192.168.208.93 - - [31/Mar/2010:00:01:59 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8802 124.165.230.222 192.168.192.149 - - [31/Mar/2010:00:02:00 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8798 183.1.23.206 192.168.208.92 - - [31/Mar/2010:00:02:00 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=3 HTTP/1.0" 200 1486 117.136.9.58 192.168.208.92 - - [31/Mar/2010:00:02:01 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7148 58.248.149.114 192.168.192.149 - - [31/Mar/2010:00:02:00 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=3 HTTP/1.0" 200 1481 116.28.234.73 192.168.192.150 - - [31/Mar/2010:00:02:01 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8801 222.77.19.219 192.168.192.149 - - [31/Mar/2010:00:02:01 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8804 118.116.137.72 192.168.208.93 - - [31/Mar/2010:00:02:01 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8796 119.112.191.63 192.168.208.92 - - [31/Mar/2010:00:02:02 +0800] "GET /cgi-bin/bill/newusr.fcgi HTTP/1.0" 200 1598 60.29.168.46 192.168.192.148 - - [31/Mar/2010:00:02:02 +0800] "GET /bill/job_sy.html?sdf2n0 HTTP/1.0" 200 10603 113.92.116.64 192.168.208.92 - - [31/Mar/2010:00:02:04 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8801 60.29.168.46 192.168.192.149 - - [31/Mar/2010:00:02:04 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=2 HTTP/1.0" 200 1481 116.28.234.73 192.168.192.150 - - [31/Mar/2010:00:02:05 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8801 114.229.232.27 192.168.208.93 - - [31/Mar/2010:00:02:05 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8799 119.248.110.130 192.168.192.149 - - [31/Mar/2010:00:02:06 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=1 HTTP/1.0" 200 1481 116.28.234.73 192.168.208.92 - - [31/Mar/2010:00:02:06 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8799 219.238.42.200 192.168.208.92 - - [31/Mar/2010:00:02:06 +0800] "GET /john/job_sy.html?ckjx9ca HTTP/1.0" 200 10763 119.109.23.248 192.168.208.92 - - [31/Mar/2010:00:02:07 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8805 222.175.165.26 192.168.208.93 - - [31/Mar/2010:00:02:07 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8799 119.248.110.130 192.168.192.149 - - [31/Mar/2010:00:02:08 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=0 HTTP/1.0" 200 1195 116.28.234.73 192.168.192.149 - - [31/Mar/2010:00:02:08 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7152 59.37.162.249 192.168.208.93 - - [31/Mar/2010:00:02:08 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8799 119.248.110.130 192.168.208.93 - - [31/Mar/2010:00:02:08 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8799 119.248.110.130 192.168.208.92 - - [31/Mar/2010:00:02:09 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7145 58.255.32.1 192.168.208.92 - - [31/Mar/2010:00:02:09 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8801 113.1.87.189 192.168.208.93 - - [31/Mar/2010:00:02:09 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8799 119.248.110.130 192.168.208.93 - - [31/Mar/2010:00:02:10 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8809 119.51.190.50 192.168.208.92 - - [31/Mar/2010:00:02:10 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8797 123.15.45.242 192.168.208.93 - - [31/Mar/2010:00:02:11 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8799 119.248.110.130 192.168.192.150 - - 
[31/Mar/2010:00:02:12 +0800] "GET /bill/job_sy.html?sdf2n0 HTTP/1.0" 200 10603 121.32.2.188 192.168.208.93 - - [31/Mar/2010:00:02:12 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8799 119.248.110.130 192.168.208.93 - - [31/Mar/2010:00:02:12 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8799 119.248.110.130 192.168.192.148 - - [31/Mar/2010:00:02:13 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8810 61.178.131.103 192.168.208.92 - - [31/Mar/2010:00:02:13 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8807 123.123.106.93 192.168.208.93 - - [31/Mar/2010:00:02:13 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8810 61.49.64.204 192.168.192.149 - - [31/Mar/2010:00:02:13 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8803 183.2.137.136 192.168.192.150 - - [31/Mar/2010:00:02:13 +0800] "GET /cgi-bin/john/logoutjohn.fcgi HTTP/1.0" 200 7599 121.35.115.74 192.168.192.150 - - [31/Mar/2010:00:02:14 +0800] "GET /john/job_sy.html? HTTP/1.0" 200 10763 121.35.115.74 192.168.192.150 - - [31/Mar/2010:00:02:14 +0800] "GET /bill/job_yjhd.html?dselkn8 HTTP/1.0" 304 - 121.32.2.188 192.168.192.149 - - [31/Mar/2010:00:02:15 +0800] "GET /bill/job_yjhd.html?dselkn8 HTTP/1.0" 200 16742 116.28.234.73 192.168.208.93 - - [31/Mar/2010:00:02:15 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=0 HTTP/1.0" 200 1197 119.51.190.50 192.168.208.92 - - [31/Mar/2010:00:02:15 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8799 124.134.190.107 192.168.192.148 - - [31/Mar/2010:00:02:17 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8807 116.207.94.1 192.168.208.92 - - [31/Mar/2010:00:02:18 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=0 HTTP/1.0" 200 1195 124.134.190.107 192.168.192.150 - - [31/Mar/2010:00:02:18 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=3 HTTP/1.0" 200 1483 59.40.134.15 192.168.208.93 - - [31/Mar/2010:00:02:19 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8808 219.221.108.157 192.168.192.150 - - [31/Mar/2010:00:02:20 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=2 HTTP/1.0" 200 1481 59.40.134.15 192.168.192.149 - - [31/Mar/2010:00:02:20 +0800] "GET /cgi-bin/bill/logoutbill.fcgi HTTP/1.0" 200 9191 117.95.37.227 192.168.192.149 - - [31/Mar/2010:00:02:21 +0800] "GET /bill/job_sy.html? 
HTTP/1.0" 200 10603 117.95.37.227 192.168.192.150 - - [31/Mar/2010:00:02:21 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=1 HTTP/1.0" 200 1481 113.90.84.3 192.168.192.148 - - [31/Mar/2010:00:02:22 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8804 113.89.75.233 192.168.208.93 - - [31/Mar/2010:00:02:22 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7153 123.115.149.124 192.168.192.150 - - [31/Mar/2010:00:02:22 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8800 219.133.249.56 192.168.208.93 - - [31/Mar/2010:00:02:22 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8802 218.28.96.17 192.168.192.150 - - [31/Mar/2010:00:02:22 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=0 HTTP/1.0" 200 1193 113.90.84.3 192.168.208.93 - - [31/Mar/2010:00:02:23 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7153 218.29.88.62 192.168.208.92 - - [31/Mar/2010:00:02:24 +0800] "GET /cgi-bin/john/index.fcgi HTTP/1.0" 200 7150 221.207.145.180 192.168.192.149 - - [31/Mar/2010:00:02:25 +0800] "GET /bill/job_yjhd.html?dselkn8 HTTP/1.0" 200 16742 122.4.32.188 192.168.192.150 - - [31/Mar/2010:00:02:26 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8798 59.57.157.182 192.168.208.92 - - [31/Mar/2010:00:02:26 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 9191 114.241.187.229 192.168.192.149 - - [31/Mar/2010:00:02:26 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8798 117.95.37.227 192.168.208.92 - - [31/Mar/2010:00:02:26 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=3 HTTP/1.0" 200 1481 113.229.91.227 192.168.208.92 - - [31/Mar/2010:00:02:27 +0800] "GET /bill/job_sy.html? HTTP/1.0" 200 10603 114.241.187.229 192.168.192.149 - - [31/Mar/2010:00:02:27 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=3 HTTP/1.0" 200 1483 222.242.106.86 192.168.192.150 - - [31/Mar/2010:00:02:28 +0800] "GET /bill/job_yjhd.html?dselkn8 HTTP/1.0" 200 16742 219.133.249.56 192.168.208.92 - - [31/Mar/2010:00:02:28 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=2 HTTP/1.0" 200 1481 113.229.91.227 192.168.192.148 - - [31/Mar/2010:00:02:29 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8807 117.35.91.238 192.168.208.92 - - [31/Mar/2010:00:02:29 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=1 HTTP/1.0" 200 1481 113.229.91.227 192.168.192.149 - - [31/Mar/2010:00:02:29 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=0 HTTP/1.0" 200 1195 222.242.106.86 192.168.192.148 - - [31/Mar/2010:00:02:29 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8807 116.207.94.1 192.168.208.93 - - [31/Mar/2010:00:02:30 +0800] "GET /cgi-bin/bill/index.fcgi HTTP/1.0" 200 8803 121.138.98.19 192.168.208.92 - - [31/Mar/2010:00:02:30 +0800] "GET /cgi-bin/bill/flow.fcgi?tm=0 HTTP/1.0" 200 1193 113.229.91.227 AnyData-0.12/t/56962.t000644 000765 000024 00000001533 12423400733 014042 0ustar00snostaff000000 000000 #!perl use strict; use warnings; use Test::More; eval { require DBI; require DBD::AnyData; }; plan skip_all => 'extra test for RT#56962 - needs DBI & DBD::AnyData' if ($@); plan tests => 1; my $file = shift; $file ||= 't/56962.log'; my $sql = qq{ select remotehost,count(remotehost) as crh from accesslog group by remotehost order by crh desc }; my $dbh = DBI->connect("dbi:AnyData(RaiseError=>1):"); $dbh->func( 'accesslog', 'Weblog', $file, 'ad_catalog' ); my $sth = $dbh->prepare($sql); $sth->execute(); my $test_output = ''; while ( my @res = $sth->fetchrow_array ) { $test_output = $test_output . join( '|', @res ) . "\n"; } $sth->finish(); $dbh->disconnect(); ok( $test_output eq <<'HERE', "sort test: \n" . 
$test_output ); 192.168.208.92|68 192.168.192.148|65 192.168.192.149|63 192.168.208.93|62 192.168.192.150|42 HERE AnyData-0.12/t/api.t000644 000765 000024 00000001053 12423400733 014115 0ustar00snostaff000000 000000 #!/usr/local/bin/perl -wT use strict; use warnings; use AnyData; use Test::More; my $table = adTie( 'CSV', ["word,number\none,1\ntwo,2\nthree,3\nunknown\nunknowncomma,\nzero,0"] ); eval { require Test::Output; Test::Output->import(); }; if ($@) { plan tests => 1; } else { plan tests => 2; #adDump prints to SDTOUT :/ stdout_is( sub { adDump($table) }, <<'HERE', 'export fixed format' ); [one][1] [two][2] [three][3] [unknown][] [unknowncomma][] [zero][0] HERE } ok( 6 == adRows($table), "Failed rows" ); __END__ AnyData-0.12/t/fixed.t000644 000765 000024 00000001541 12423400733 014445 0ustar00snostaff000000 000000 #!/usr/local/bin/perl -wT use strict; use warnings; use Test::More; plan tests => 6; use AnyData; my $table = adTie( 'Fixed', 't/fixed.tbl', 'r', { pattern => 'A11 A2' } ); ok( 6 == adRows($table), "Failed rows" ); ok( 'au' eq $table->{'australia'}->{code}, 'select one' ); ok( 'ch' eq $table->{'switzerland'}->{code}, 'select another' ); ok( '0' eq $table->{'broken'}->{code}, 'select another' ); ok( ' 0' eq $table->{'broken2'}->{code}, 'select another' ); #write test ok( <<'HERE' eq adExport( $table, 'Fixed', undef, { pattern => 'A11 A2' } ), 'export fixed format' ); country co australia au germany de france fr switzerlandch broken 0 broken2 0 HERE #TODO: note that the docco says the column names need to be comma separated, and the input file has 'country,code', thus the written file would be busted too __END__ AnyData-0.12/t/fixed.tbl000644 000765 000024 00000000140 12423400733 014755 0ustar00snostaff000000 000000 country,code australia au germany de france fr switzerlandch broken 0 broken2 0 AnyData-0.12/t/htmltable.t000644 000765 000024 00000001474 12423400733 015327 0ustar00snostaff000000 000000 #!/usr/local/bin/perl -w use strict; use warnings; use Test::More; plan tests => 3; use AnyData; my $table = adTie( 'CSV', ["word,number\none,1\ntwo,2\nthree,3\nunknown\nunknowncomma,\nzero,0"] ); ok( 6 == adRows($table), "Failed rows" ); adExport( $table, "HTMLtable", 't/htmltable.out' ); ok( open( my $fh, '<', 't/htmltable.out' ), 'open file' ); local $\ = ''; my $result = <$fh>; #print STDERR "\n---\n"; #print STDERR "$result"; #print STDERR "\n---\n"; ok( $result eq 't/htmltable.out
word number
one 1
two 2
three 3
unknown
unknowncomma  
zero  
', 'xml export ok' ); __END__ AnyData-0.12/t/test.t000755 000765 000024 00000002764 12423400733 014340 0ustar00snostaff000000 000000 #!/usr/local/bin/perl -wT use strict; use warnings; #the original tests that came with AnyData 0.10 my @formats = qw(CSV Pipe Tab Fixed Paragraph ARRAY); use Test::More; plan tests => ( 1 + $#formats ) * 6; use AnyData; for my $format (@formats) { test_ad($format); } sub test_ad { my $file = []; my $format = shift; my $mode = 'o'; my $flags = { cols => 'name,country,sex', pattern => 'A5 A8 A3' }; my $table = adTie( $format, $file, $mode, $flags ); # create a table $table->{Sue} = { country => 'fr', sex => 'f' }; # insert rows $table->{Tom} = { country => 'fr', sex => 'f' }; $table->{Bev} = { country => 'en', sex => 'f' }; $table->{Nel} = { country => 'en', sex => 'f' }; $table->{Pam} = { country => 'au', sex => 'f' }; $table->{ { name => 'Tom' } } = { sex => 'm' }; # update a row delete $table->{Bev}; # delete a row $flags = { pattern => 'A5 A8 A3' }; ok( 'f' eq $table->{Sue}->{sex}, "Failed single select" ); my $tstr; while ( my $person = each %$table ) { # select mulitple rows $tstr .= $person->{name} if $person->{country} eq 'fr'; } ok( 'SueTom' eq $tstr, "Failed multiple select" ); ok( 'namecountrysex' eq join( '', adNames($table) ), "Failed names" ); ok( 4 == adRows($table), "Failed rows" ); ok( 4 == adColumn( $table, 'country' ), "total number of rows" ); ok( 3 == adColumn( $table, 'country', 1 ), "distinct countries" ); } __END__ AnyData-0.12/t/weblog.t000644 000765 000024 00000002750 12423400733 014630 0ustar00snostaff000000 000000 #!/usr/local/bin/perl -wT use strict; use warnings; use Test::More; plan tests => 10; use AnyData; my $table = adTie( 'Weblog', 't/weblog.tbl', 'r', {} ); ok( 1 == adRows($table), "Failed rows" ); #remotehost,username,authuser,date,request,status,bytes,client,referer #12.34.56.78 - - [13/Mar/2008:07:38:53 +0100] "GET /creeper/image HTTP/1.1" 200 252 "http://www.example.com/" "Mozilla/5.0 (Windows; U; Windows NT 6.0; sv-SE; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12" my $row = each %$table; ok( '12.34.56.78' eq $row->{remotehost}, 'remotehost' ); ok( '-' eq $row->{username}, 'username' ); ok( '-' eq $row->{authuser}, 'authuser' ); ok( '13/Mar/2008:07:38:53 +0100' eq $row->{date}, 'date' ); ok( 'GET /creeper/image HTTP/1.1' eq $row->{request}, 'request' ); ok( '200' eq $row->{status}, 'status' ); ok( '252' eq $row->{bytes}, 'bytes' ); ok( '"Mozilla/5.0 (Windows; U; Windows NT 6.0; sv-SE; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"' eq $row->{client}, 'client ' . $row->{client} ); ok( '"http://www.example.com/"' eq $row->{referer}, 'referer: ' . 
$row->{referer} ); #write test #TODO: looks like writing a weblog is broken #print STDERR "\n---\n"; #print STDERR adExport( $table, 'Weblog', undef, { } ); #print STDERR "\n---\n"; #ok( # <<'HERE' eq adExport( $table, 'Weblog', undef, { } ), 'export weblog format' ); #HERE __END__ AnyData-0.12/t/weblog.tbl000644 000765 000024 00000000314 12423400733 015140 0ustar00snostaff000000 000000 12.34.56.78 - - [13/Mar/2008:07:38:53 +0100] "GET /creeper/image HTTP/1.1" 200 252 "http://www.example.com/" "Mozilla/5.0 (Windows; U; Windows NT 6.0; sv-SE; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12" AnyData-0.12/t/xml.t000644 000765 000024 00000001540 12423400733 014145 0ustar00snostaff000000 000000 #!/usr/local/bin/perl -w use strict; use warnings; use Test::More; eval 'use XML::Twig;'; plan( skip_all => 'XML::Twig not installed; skipping' ) if $@; plan tests => 3; use AnyData; my $table = adTie( 'CSV', ["word,number\none,1\ntwo,2\nthree,3\nunknown\nunknowncomma,\nzero,0"] ); ok( 6 == adRows($table), "Failed rows" ); adExport( $table, "XML", 't/xml.out' ); ok( open( my $fh, '<', 't/xml.out' ), 'open file' ); local $\ = ''; my $result = <$fh>; #print STDERR "\n---\n"; #print STDERR "$result"; #print STDERR "\n---\n"; ok( $result eq 'one1two2three3unknownunknowncommazero0
', 'xml export ok' ); __END__ AnyData-0.12/lib/AnyData/000755 000765 000024 00000000000 12462242164 015011 5ustar00snostaff000000 000000 AnyData-0.12/lib/AnyData.pm000755 000765 000024 00000150633 12462133234 015357 0ustar00snostaff000000 000000 ################################################################## package AnyData; ################################################################### # # This module is copyright (c), 2000 by Jeff Zucker # All rights reserved. # ################################################################### use strict; use warnings; require Exporter; use AnyData::Storage::TiedHash; use vars qw( @ISA @EXPORT $VERSION ); @ISA = qw(Exporter); @EXPORT = qw( adConvert adTie adRows adColumn adExport adDump adNames adFormats); #@EXPORT = qw( ad_fields adTable adErr adArray); $VERSION = '0.12'; sub new { my $class = shift; my $format = shift; my $flags = shift || {}; my $del_marker = "\0"; $format = 'CSV' if $format eq 'ARRAY'; my $parser_name = 'AnyData/Format/' . $format . '.pm'; eval { require $parser_name; }; die "Error Opening File-Parser: $@" if $@; $parser_name =~ s#/#::#g; $parser_name =~ s#\.pm$##g; my $col_names = $flags->{col_names} || undef; if ($col_names) { my @cols; @cols = ref $col_names eq 'ARRAY' ? @$col_names : split ',',$col_names; $flags->{col_names} = \@cols; } $flags->{del_marker} = $del_marker; $flags->{records} ||= $flags->{data}; $flags->{field_sep} ||= $flags->{sep_char} ||= $flags->{ad_sep_char}; $flags->{quote} ||= $flags->{quote_char} ||= $flags->{ad_quote_char}; $flags->{escape} ||= $flags->{escape_char}||= $flags->{ad_escape_char}; $flags->{record_sep}||= $flags->{eol} ||= $flags->{ad_eol}; # $flags->{skip_first_row} my $parser = $parser_name->new ($flags); if ($parser->{col_names} && !$col_names) { my @cols; @cols = ref $parser->{col_names} eq 'ARRAY' ? 
@{$parser->{col_names}} : split ',',$parser->{col_names}; $flags->{col_names} = \@cols; $parser->{col_names} = \@cols; } my $storage_name = $flags->{storage} || $parser->storage_type() || 'File'; $storage_name = "AnyData/Storage/$storage_name.pm"; eval { require $storage_name; }; die "Error Opening Storage Module: $@" if $@; $storage_name =~ s#/#::#g; $storage_name =~ s#\.pm$##g; my $storage = new $storage_name({del_marker=>$del_marker,%$flags}); if ($storage_name =~ 'PassThru') { $storage->{parser} = $parser; $parser->{del_marker} = "\0"; $parser->{url} = $flags->{file} if $flags->{file} and $flags->{file} =~ /http:|ftp:/; } my $self = { storage => $storage, parser => $parser, }; return( bless($self,$class) ); } sub adFormats { my @formats; for my $dir(@INC) { my $format_dir = "$dir/AnyData/Format"; if ( -d $format_dir ) { local *D; opendir(D,$format_dir); @formats = grep {/\.pm$/} readdir(D); last; } } unshift @formats,'ARRAY'; @formats = map {s/^(.*)\.pm$/$1/;$_} @formats; return @formats; } sub export { my $self=shift; my $fh = $self->{storage}->{fh}; my $mode = $self->{storage}->{open_mode} || 'r'; # if ( $self->{parser}->{export_on_close} # && $self->{storage}->{fh} # && $mode ne 'r' # ){ return $self->{parser}->export( $self->{storage}, @_ ); # } } sub DESTROY { my $self=shift; # $self->export; $self->zpack; #print "AD DESTROYED "; } ########################################## # DBD STUFF ########################################## # required only for DBD-AnyData ########################################## sub prep_dbd_table { my $self = shift; my $tname = shift; my $createMode = shift; my $col_names; my $col_nums; my $first_row_pos; if (!$createMode) { $col_names = $self->{storage}->get_col_names($self->{parser}); $col_nums = $self->{storage}->set_col_nums(); $first_row_pos = $self->{storage}->{first_row_pos}; } die "ERROR: No Column Names!:", $self->{storage}->{open_mode} if (!$col_names || !scalar @$col_names) && 'ru' =~ $self->{storage}->{open_mode} && !$createMode eq 'o'; my $table = { NAME => $tname, DATA => [], CURRENT_ROW => 0, col_names => $col_names, col_nums => $col_nums, first_row_pos => $first_row_pos, fh => $self->{storage}->get_file_handle, file => $self->{storage}->get_file_name, ad => $self, }; #use Data::Dumper; print Dumper $table; return $table; } sub fetch_row { my $self = shift; my $requested_cols = shift || []; my $rec; if ( $self->{parser}->{skip_pattern} ) { my $found; while (!$found) { $rec = $self->{storage}->file2str($self->{parser},$requested_cols); last if !defined $rec; next if $rec =~ $self->{parser}->{skip_pattern}; last; } } else { $rec = $self->{storage}->file2str($self->{parser},$requested_cols); } return $rec if ref $rec eq 'ARRAY'; return unless $rec; my @fields = $self->{parser}->read_fields($rec); return undef if scalar @fields == 1 and !defined $fields[0]; return \@fields; } sub fetch_rowNEW { my $self = shift; my $requested_cols = shift || []; my $rec = $self->{storage}->file2str($self->{parser},$requested_cols); my @fields; if (ref $rec eq 'ARRAY') { @fields = @$rec; } else { return unless defined $rec; my @fields = $self->{parser}->read_fields($rec); return undef if scalar @fields == 1 and !defined $fields[0]; } if ( my $subs = $self->{parser}->{read_sub} ) { for (@$subs) { my($col,$sub) = @$_; next unless defined $col; my $col_num = $self->{storage}->{col_nums}->{$col}; next unless defined $col_num; $fields[$col_num] = &$sub($fields[$col_num]); } } return \@fields; } sub push_names { my $self = shift; my $col_names = shift || undef; #print 
"Can't find column names!" unless scalar @$col_names; $self->{storage}->print_col_names( $self->{parser}, $col_names ) unless $self->{parser}->{col_names} && $self->parser_type ne 'XML'; # $self->set_col_nums; $self->{parser}->{key} ||= $col_names->[0]; #use Data::Dumper; print Dumper $self; exit; } sub drop { shift->{storage}->drop(@_); } sub truncate { shift->{storage}->truncate(@_) } ################################################################## # END OF DBD STUFF ################################################################## ################################################################## # REQUIRED BY BOTH DBD AND TIEDHASH ################################################################## sub push_row { my $self = shift; die "ERROR: No Column Names!" unless scalar @{$self->col_names}; my $requested_cols = []; my @row = @_; if (ref($row[0]) eq 'ARRAY') { $requested_cols = shift @row; } my $rec = $self->{parser}->write_fields(@row) or return undef; return $self->{storage}->push_row( $rec, $self->{parser}, $requested_cols); } sub push_rowNEW { my $self = shift; #print "PUSHING... "; die "ERROR: No Column Names!" unless scalar @{$self->col_names}; my $requested_cols = []; my @row = @_; use Data::Dumper; #print "PUSHING ", Dumper \@row; if (ref($row[0]) eq 'ARRAY') { $requested_cols = shift @row; } my $rec = $self->{parser}->write_fields(@row) or return undef; return $self->{storage}->push_row( $rec, $self->{parser}, $requested_cols); } sub seek { shift->{storage}->seek(@_); } sub seek_first_record { my $self=shift; $self->{storage}->seek_first_record($self->{parser}); } sub col_names { my $self = shift; my $c = $self->{storage}->{col_names}; $c = $self->{parser}->{col_names} unless (ref $c eq 'ARRAY') and scalar @$c; $c ||= []; } sub is_url { my $file = shift; return $file if $file and $file =~ m"^http://|ftp://"; } sub adTable { ########################################################### # Patch from Wes Hardaker ########################################################### # my($formatref,$file,$read_mode,$lockMode,$othflags)=@_; my($formatref,$file,$read_mode,$lockMode,$othflags,$tname)=@_; ########################################################### #use Data::Dumper; print Dumper \@_; my($format,$flags); $file ||= ''; my $url = is_url($file); $flags = {}; $othflags ||= {}; if ( ref $formatref eq 'HASH' or $othflags->{data}) { $format = 'Base'; $flags = $othflags; if (ref $formatref eq 'HASH') { %$flags = (%$formatref,%$othflags); } } else { ($format,$flags) = split_params($formatref); $othflags ||= {}; %$flags = (%$flags,%$othflags); } if ( $flags->{cols} ) { $flags->{col_names} = $flags->{cols}; delete $flags->{cols}; } if (ref($file) eq 'ARRAY') { if ($format eq 'Mp3' or $format eq 'FileSys') { $flags->{dirs} = $file; } else { $flags->{recs} = join '',@$file; $flags->{recs} = $file if $format =~ /ARRAY/i; $flags->{storage} = 'RAM' unless $format eq 'XML'; $read_mode = 'u'; } } else { $flags->{file} = $file; } if ($format ne 'XML' and ($format eq 'Base' or $url) ) { my $x; $flags->{storage} = 'RAM'; delete $flags->{recs}; my $ad = AnyData->new( $format, $flags); $format eq 'Base' ? 
$ad->open_table( $file ) : $ad->open_table( $file, 'r', $ad->{storage}->get_remote_data($file) ); return $ad; } my $ad = AnyData->new( $format, $flags); my $createMode = 0; $createMode = $read_mode if defined $lockMode; $read_mode = 'c' if $createMode and $lockMode; $read_mode = 'u' if !$createMode and $lockMode; $read_mode ||= 'r'; $ad->{parser}->{keep_first_line} = 1 if $flags->{col_names} and 'ru' =~ /$read_mode/; ##################################################### # Patch from Wes Hardaker ##################################################### # $ad->open_table( $file, $read_mode ); ## $ad->open_table( $file, $read_mode, $tname ); $ad->open_table( $file, $read_mode, $tname ); # use Data::Dumper; my $x = $ad; delete $x->{parser}->{twig}; delete $x->{parser}->{record_tag}; delete $x->{parser}->{current_element}; print Dumper $x; ##################################################### return $ad; } sub open_table { my $self = shift; $self->{storage}->open_table( $self->{parser}, @_ ); my $col_names = $self->col_names(); $self->{parser}->{key} ||= ''; $self->{parser}->{key} ||= $col_names->[0] if $col_names->[0]; } ################################################################## ################################################################## # TIEDHASH STUFF ################################################################## sub key_col { shift->{parser}->{key} } sub fetchrow_hashref { my $self = shift; my $rec = $self->get_undeleted_record or return undef; my @fields = ref $rec eq 'ARRAY' ? @$rec : $self->{parser}->read_fields($rec); my $col_names = $self->col_names(); return undef unless scalar @fields; return undef if scalar @fields == 1 and !defined $fields[0]; my $rowhash; @{$rowhash}{@$col_names} = @fields; return ( $rowhash ); } sub get_undeleted_record { my $self = shift; my $rec; my $found=0; return $self->fetch_row if $self->parser_type eq 'XML'; while (!$found) { my $test = $rec = $self->{storage}->file2str($self->{parser}); return if !defined $rec; next if $self->{storage}->is_deleted($self->{parser}); next if $self->{parser}->{skip_pattern} and $rec =~ $self->{parser}->{skip_pattern}; last; } return $rec; # return $rec if ref $rec eq 'ARRAY'; # return unless $rec; # my @fields = $self->{parser}->read_fields($rec); # return undef if scalar @fields == 1 and !defined $fields[0]; # return \@fields; } sub update_single_row { my $self = shift; my $oldrow = shift; my $newvals = shift; my @colnames = @{ $self->col_names }; my @newrow; my $requested_cols = []; for my $i(0..$#colnames) { push @$requested_cols, $colnames[$i] if defined $newvals->{$colnames[$i]}; $newrow[$i] = $newvals->{$colnames[$i]}; $newrow[$i] = $oldrow->{$colnames[$i]} unless defined $newrow[$i]; } unshift @newrow, $requested_cols; $self->{storage}->seek(0,2); $self->push_row( @newrow ); return \@newrow; } sub update_multiple_rows { my $self = shift; my $key = shift; my $values = shift; $self->seek_first_record; my @rows_to_update; while (my $row = $self->fetchrow_hashref) { next unless $self->match($row,$key); $self->{parser}->{has_update_function} ? $self->update_single_row($row,$values) : $self->delete_single_row(); $self->{parser}->{has_update_function} ? 
push @rows_to_update,1 : push @rows_to_update,$row; } if (!$self->{parser}->{has_update_function}) { for (@rows_to_update) { $self->update_single_row($_,$values); } } return scalar @rows_to_update; } sub match { my($self,$row,$key) = @_; if ( ref $key ne 'HASH') { return 0 if !$row->{$self->key_col} or $row->{$self->key_col} ne $key; return 1; } my $found = 0; while (my($col,$re)=each %$key) { next unless defined $row->{$col} and is_matched($row->{$col},$re); $found++; } return 1 if $found == scalar keys %$key; } sub is_matched { my($str,$re)=@_; if (ref $re eq 'Regexp') { return $str =~ /$re/ ? 1 : 0; } my($op,$val); if ( $re and $re =~/^(\S*)\s+(.*)/ ) { $op = $1; $val = $2; } elsif ($re) { return $str =~ /$re/ ? 1 : 0; } else { return $str eq '' ? 1 : 0; } my $numop = '< > == != <= >='; my $chrop = 'lt gt eq ne le ge'; if (!($numop =~ /$op/) and !($chrop =~ /$op/)) { return $str =~ /$re/ ? 1 : 0; } if ($op eq '<' ) { return $str < $val; } if ($op eq '>' ) { return $str > $val; } if ($op eq '==') { return $str == $val; } if ($op eq '!=') { return $str != $val; } if ($op eq '<=') { return $str <= $val; } if ($op eq '>=') { return $str >= $val; } if ($op eq 'lt') { return $str lt $val; } if ($op eq 'gt') { return $str gt $val; } if ($op eq 'eq') { return $str eq $val; } if ($op eq 'ne') { return $str ne $val; } if ($op eq 'le') { return $str le $val; } if ($op eq 'ge') { return $str ge $val; } } sub delete_single_row { my $self = shift; # my $curpos = $self->{storage}->get_pos; $self->{storage}->delete_record($self->{parser}); # $self->{storage}->go_pos($curpos); $self->{needs_packing}++; } sub delete_multiple_rows { my $self = shift; my $key = shift; $self->seek_first_record; my $rows_deleted =0; while (my $row = $self->fetchrow_hashref) { next unless $self->match($row,$key); $self->delete_single_row; $rows_deleted++; } return $rows_deleted; } sub adNames { @{ shift->{__colnames}} } sub adDump { my $table = shift; my $pat = shift; die "No table defined" unless $table; my $ad = tied(%$table)->{ad}; my @cols = @{ $ad->col_names }; print "<",join(":", @cols), ">\n"; while (my $row = each %$table) { my @row = map {defined $row->{$_} ? $row->{$_} : ''} @cols; for (@row) { print "[$_]"; } print "\n"; } } sub adRows { my $thash = shift; my %keys = @_; my $obj = tied(%$thash); return $obj->adRows(\%keys); } sub adColumn { my $thash = shift; my $column = shift; my $flags = shift; my $obj = tied(%$thash); return $obj->adColumn($column, $flags); } sub adArray { my($format,$data)=@_; my $t = adTie( $format, $data ); my $t1 = tied(%$t); my $ad = $t1->{ad}; my $arrayref = $ad->{storage}->{records}; unshift @$arrayref, $ad->{storage}->{col_names}; return $arrayref; } ################################################################## # END OF TIEDHASH STUFF ################################################################## sub parser_type { my $type = ref shift->{parser}; $type =~ s/AnyData::Format::(.*)/$1/; return $type; } sub zpack { my $self = shift; return if $self->{storage}->{no_pack}; return if (ref $self->{storage} ) !~ /File$/; # return unless $self->{needs_packing}; # $self->{needs_packing} = 0; return unless scalar(keys %{ $self->{storage}->{deleted} } ); $self->{needs_packing} = 0; # my @callA = caller 2; # my @callB = caller 3; # return if $callA[3] =~ /DBD/; # return if $callB[3] and $callB[3] =~ /SQL::Statement/; # return if $self->{parser}->{export_on_close}; #print "PACKING"; my $bak_file = $self->{storage}->get_file_name . 
'.bak'; my $bak = adTable( 'Text', $bak_file, 'o' ); my $bak_fh = $bak->{storage}->get_file_handle; my $fh = $self->{storage}->get_file_handle; die "Can't pack to backup $!" unless $fh and $bak_fh; # $self->seek_first_record; $fh->seek(0,0) || die $!; #$bak_fh->seek(0,0) || die $!; # while (my $line = $self->get_record) { # next if $self->is_deleted($line); while (my $line = $self->get_undeleted_record) { my $tmpstr = $bak->{parser}->write_fields($line) . $self->{parser}->{record_sep}; $bak_fh->write($tmpstr,length $tmpstr); } $fh->seek(0,0); $fh->truncate(0) || die $!; $bak->seek_first_record; while (<$bak_fh>) { $fh->write($_,length $_); } $fh->close; $bak_fh->close; $self->{doing_pack} = 0; undef $self->{storage}->{deleted}; } ########################################################## # FUNCTION CALL INTERFACE ########################################################## sub adTie { my($format,$file,$read_mode,$flags)=@_; my $data; if (ref $file eq 'ARRAY' && !$read_mode ) { $read_mode = 'u'; } # ARRAY only {data=>[]}; if (scalar @_ == 1){ $read_mode = 'o'; tie %$data, 'AnyData::Storage::TiedHash', adTable($format), $read_mode; return $data; } tie %$data, 'AnyData::Storage::TiedHash', adTable($format,$file,$read_mode,undef,$flags), $read_mode; return $data; } sub adErr { my $hash = shift; my $t = tied(%$hash); my $errstr = $t->{ad}->{parser}->{errstr} || $t->{ad}->{storage}->{errstr}; print $errstr if $errstr; return $errstr; } sub adExport { my $tiedhash = shift; my($tformat,$tfile,$tflags)=@_; my $ad = tied(%$tiedhash)->{ad}; my $sformat = ref $ad->{parser}; $sformat =~ s/AnyData::Format:://; $tformat ||= $sformat; if ($tformat eq $sformat and $tformat eq 'XML') { return $ad->{parser}->export($ad->{storage},$tfile,$tflags); } return adConvert('adHash',$ad,$tformat,$tfile,undef,$tflags); } sub adConvert { my( $source_format, $source_data, $target_format,$target_file_name, $source_flags,$target_flags )=@_; my $target_type = 'STRING'; $target_type = 'FILE' if defined $target_file_name; $target_type = 'ARRAY' if $target_format eq 'ARRAY'; my $data_type = 'AD-OBJECT'; $data_type = 'ARRAY' if ref $source_data eq 'ARRAY' and ref $source_data->[0] eq 'ARRAY'; # INIT SOURCE OBJECT my $source_ad; if ($source_format eq 'adHash') { $source_ad = $source_data; undef $source_data; } else { $source_format = 'CSV' if $source_format =~ /ARRAY/i; $source_ad = adTable( $source_format,$source_data,'r',undef,$source_flags ); } # GET COLUMN NAMES my @cols; if ( $data_type eq 'ARRAY') { @cols = @{ shift @{ $source_data } }; } else { @cols = @{ $source_ad->col_names }; } # insert storable here if ('XML HTMLtable' =~ /$target_format/) { $target_flags->{col_names} = join ',',@cols; my $target_ad = adTable( $target_format,$target_file_name,'o',undef,$target_flags ); if ($data_type eq 'ARRAY' ) { for my $row(@$source_data) { my @fields=$source_ad->str2ary($row); $target_ad->push_row( $source_ad->str2ary(\@fields) ); } unshift @$source_data, \@cols; return $target_ad->export($target_file_name); } $source_ad->seek_first_record; while (my $row = $source_ad->get_undeleted_record) { $target_ad->push_row( $source_ad->str2ary($row) ); } return $target_ad->export($target_file_name); } my($target_ad,$fh); ### INIT TARGET OBJECT if ($target_type eq 'FILE') { $target_ad = adTable( $target_format,$target_file_name,'c',undef,$target_flags ); $fh = $target_ad->{storage}->get_file_handle; } elsif ($target_type eq 'STRING') { $target_ad = AnyData->new( $target_format,$target_flags); } my($str,$aryref); ### GET COLUMN NAMES if ( 
!$target_ad->{parser}->{no_col_print} ) { if ($target_type eq 'ARRAY') { push @$aryref, \@cols; } else { $str = $target_ad->{parser}->write_fields(@cols); $str =~ s/ /,/g if $target_format eq 'Fixed'; if ($target_type eq 'FILE') { $fh->write($str,length $str); } if ($target_type eq 'STRING') { $str = $target_ad->{parser}->write_fields(@cols); } } } # GET DATA if ($data_type eq 'ARRAY') { for my $row(@$source_data) { my @fields = $source_ad->str2ary($row); my $tmpstr = $target_ad->{parser}->write_fields(@fields); # print $tmpstr if $check; $fh->write($tmpstr,length $tmpstr) if $target_type eq 'FILE'; $str .= $tmpstr if $target_type eq 'STRING'; } unshift @$source_data, \@cols; return $str if $target_format ne 'ARRAY'; return $aryref; } $source_ad->seek_first_record; # unless $source_format eq 'XML'; while (my $row = $source_ad->get_undeleted_record) { if ($target_format eq 'ARRAY') { push @$aryref,$row if $target_format eq 'ARRAY'; next; } my @fields = $source_ad->str2ary($row); my $tmpstr = $target_ad->{parser}->write_fields(@fields); $str .= $target_type eq 'FILE' ? $fh->write($tmpstr,length $tmpstr) : $tmpstr; } return $str if $target_format ne 'ARRAY'; return $aryref; } # if ('Storable' =~ /$target_format/) { # $target_flags->{col_names} = join ',',@cols; # $target_ad = adTable( # $target_format,$target_file_name,'c',undef,$target_flags # ); # if (ref $source_data && !$data) { # for my $row(@$source_data) { # push @$data,$row; # } # } # elsif (!$data) { # $source_ad->seek_first_record; # while (my $row = $source_ad->fetch_row) { # push @$data, $row; # } # } # unshift @$data, \@cols; # return $target_ad->{parser}->export($data,$target_file_name); # } sub str2ary { my($ad,$row) = @_; return @$row if ref $row eq 'ARRAY'; return $ad->{parser}->read_fields($row); } sub ad_string { my($formatref,@fields) = @_; my($format,$flags) = split_params($formatref); # &dump($formatref); print "<$format>"; &dump($flags) if $flags; #$formatref =~ s/(.*)/$1/; my $ad = AnyData->new( $format, $flags ); return $ad->{parser}->write_fields(@fields); # return $ad->write_fields(@fields); } sub ad_fields { my($formatref,$str,$flags) = @_; # my($format,$flags) = split_params($formatref); # my $ad = AnyData::new( $format, $flags ); my $ad = AnyData->new( $formatref, $flags ); return $ad->{parser}->read_fields($str); } sub ad_convert_str { my($source_formatref,$target_formatref,$str) = @_; my($source_format,$source_flags) = split_params($source_formatref); my($target_format,$target_flags) = split_params($target_formatref); my $source_ad = AnyData->new( $source_format,$source_flags); my $target_ad = AnyData->new( $target_format,$target_flags); my @fields = $source_ad->read_fields($str); return $target_ad->write_fields( @fields ); } ######################################################### # UTILITY METHODS ######################################################### # # For all methods that have $format as a parameter, # $format can be either a string name of a format e.g. 'CSV' # or a hashref of the format and flags for that format e.g. 
# { format => 'FixedWidth', pattern=>'A1 A3 A2' } # # given this parameter, this method returns $format and $flags # setting $flags to {} if none are given # sub split_params { my $source_formatref = shift; my $source_flags = {}; my $source_format = $source_formatref; if (ref $source_formatref eq 'HASH') { while (my($k,$v)=each %$source_formatref) { ($source_format,$source_flags) = ($k,$v); } } #use Data::Dumper; return( $source_format, $source_flags); } sub dump { my $var = shift; my $name = ref($var); #use Data::Dumper; $Data::Dumper::Indent = 1; $Data::Dumper::Useqq = 0; print Data::Dumper->new([$var],[$name])->Dump(); } ########################################################################### # START OF DOCUMENTATION ########################################################################### =pod =head1 NAME AnyData - (DEPRECATED) easy access to data in many formats =head1 SYNOPSIS use AnyData; my $table = adTie( 'CSV','my_db.csv','o', # create a table {col_names=>'name,country,sex'} ); $table->{Sue} = {country=>'de',sex=>'f'}; # insert a row delete $table->{Tom}; # delete a single row $str = $table->{Sue}->{country}; # select a single value while ( my $row = each %$table ) { # loop through table print $row->{name} if $row->{sex} eq 'f'; } $rows = $table->{{age=>'> 25'}}; # select multiple rows delete $table->{{country=>qr/us|mx|ca/}}; # delete multiple rows $table->{{country=>'Nz'}}={country=>'nz'}; # update multiple rows my $num = adRows( $table, age=>'< 25' ); # count matching rows my @names = adNames( $table ); # get column names my @cars = adColumn( $table, 'cars' ); # group a column my @formats = adFormats(); # list available parsers adExport( $table, $format, $file, $flags ); # save in specified format print adExport( $table, $format, $flags ); # print to screen in format print adDump($table); # dump table to screen undef $table; # close the table #adConvert( $format1, $file1, $format2, $file2 ); # convert btwn formats #print adConvert( $format1, $file1, $format2 ); # convert to screen =head1 DESCRIPTION The rather wacky idea behind this module and its sister module DBD::AnyData is that any data, regardless of source or format should be accessible and modifiable with the same simple set of methods. This module provides a multidimensional tied hash interface to data in a dozen different formats. The DBD::AnyData module adds a DBI/SQL interface for those same formats. Both modules provide built-in protections including appropriate flocking() for all I/O and (in most cases) record-at-a-time access to files rather than slurping of entire files. Currently supported formats include general format flat files (CSV, Fixed Length, etc.), specific formats (passwd files, httpd logs, etc.), and a variety of other kinds of formats (XML, Mp3, HTML tables). The number of supported formats will continue to grow rapidly since there is an open API making it easy for any author to create additional format parsers which can be plugged in to AnyData itself and thereby be accessible by either the tiedhash or DBI/SQL interface. =head1 PREREQUISITES The AnyData.pm module itself is pure Perl and does not depend on anything other than modules that come standard with Perl. 
Some formats and some advanced features require additional modules: to use the remote ftp/http features, you must have the LWP bundle installed; to use the XML format, you must have XML::Parser and XML::Twig installed; to use the HTMLtable format for reading, you must have HTML::Parser and HTML::TableExtract installed but you can use the HTMLtable for writing with just the standard CGI module. To use DBI/SQL commands, you must have DBI, DBD::AnyData, SQL::Statement and DBD::File installed. =head1 USAGE The AnyData module imports eight methods (functions): =for test ignore adTie() -- create a new table or open an existing table adExport() -- save an existing table in a specified format adConvert() -- convert data in one format into another format adFormats() -- list available formats adNames() -- get the column names of a table adRows() -- get the number of rows in a table or query adDump() -- display the data formatted as an array of rows adColumn() -- group values in a single column The adTie() command returns a special tied hash. The tied hash can then be used to access and/or modify data. See below for details With the exception of the XML, HTMLtable, and ARRAY formats, the adTie() command saves all modifications of the data directly to file as they are made. With XML and HTMLtable, you must make your modifications in memory and then explicitly save them to file with adExport(). =head2 adTie() my $table = adTie( $format, $data, $open_mode, $flags ); The adTie() command creates a reference to a multidimensional tied hash. In its simplest form, it simply reads a file in a specified format into the tied hash: my $table = adTie( $format, $file ); $format is the name of any supported format 'CSV','Fixed','Passwd', etc. $file is the name of a relative or absolute path to a local file e.g. my $table = adTie( 'CSV', '/usr/me/myfile.csv' ); this creates a tied hash called $table by reading data in the CSV (comma separated values) format from the file 'myfile.csv'. The hash reference resulting from adTie() can be accessed and modified as follows: use AnyData; my $table = adTie( $format, $file ); $table->{$key}->{$column}; # select a value $table->{$key} = {$col1=>$val1,$col2=>$val2...}; # update a row delete $table->{$key}; # delete a row while(my $row = each %$table) { # loop through rows print $row->{$col1} if $row->{$col2} ne 'baz'; } The thing returned by adTie ($table in the example) is not an object, it is a reference to a tied hash. This means that hash operations such as exists, values, keys, may be used, keeping in mind that this is a *reference* to a tied hash so the syntax would be for( keys %$table ) {...} for( values %$table ) {...} Also keep in mind that if the table is really large, you probably do not want to use keys and values because they create arrays in memory containing data from every row in the table. Instead use 'each' as shown above since that cycles through the file one record at a time and never puts the entire table into memory. It is also possible to use more advanced searching on the hash, see "Multiple Row Operations" below. In addition to the simple adTie($format,$file), there are other ways to specify additional information in the adTie() command. The full syntax is: my $table = adTie( $format, $data, $open_mode, $flags ); The $data parameter allows you to read data from remote files accessible by http or ftp, see "Using Remote Files" below. 
It also allows you to treat strings and arrays as data sources without needing a file at all, see "Working with Strings and Arrays" below. The optional $mode parameter defaults to 'r' if none is supplied or must be one of 'r' read # read only access 'u' update # read/write access 'c' create # create a new file unless it already exists 'o' overwrite # create a new file, overwriting any that already exist The $flags parameter allows you to specify additional information such as column names. See the sections in "Further Details" below. With the exception of the XML, HTMLtable, and ARRAY formats, the adTie() command saves all modifications of the data directly to file as they are made. With XML and HTMLtable, you must make your modifications in memory and then explicitly save them to file with adExport(). =head2 adConvert() adConvert( $format1, $data1, $format2, $file2, $flags1, $flags2 ); or print adConvert( $format1, $data1, $format2, undef, $flags1, $flags2 ); or my $aryref = adConvert( $format1, $data1, 'ARRAY', undef, $flags1 ); This method converts data in any supported format into any other supported format. The resulting data may either be saved to a file (if $file2 is supplied as a parameter) or sent back as a string to e.g. print the data to the screen in the new format (if no $file2 is supplied), or sent back as an array reference if $format2 is 'ARRAY'. Some examples: # convert a CSV file into an XML file # adConvert('CSV','foo.csv','XML','foo.xml'); # convert a CSV file into an HTML table and print it to the screen # print adConvert('CSV','foo.csv','HTMLtable'); # convert an XML string into a CSV file # adConvert('XML', ["TIMTOWTDI"], 'CSV','foo.csv' ); # convert an array reference into an XML file # adConvert('ARRAY', [['id','motto'],['perl','TIMTOWTDI']], 'XML','foo.xml' ); # convert an XML file into an array reference # my $aryref = adConvert('XML','foo.xml','ARRAY'); See section below "Using strings and arrays" for details. =head2 adExport() adExport( $table, $format, $file, $flags ); or print adExport( $table, $format ); or my $aryref = adExport( $table, 'ARRAY' ); This method converts an existing tied hash into another format and/or saves the tied hash as a file in the specified format. Some examples: all assume a previous call to my $table= adTie(...); # export table to an XML file # adExport($table','XML','foo.xml'); # export table to an HTML string and print it to the screen # print adExport($table,'HTMLtable'); # export the table to an array reference # my $aryref = adExport($table,'ARRAY'); See section below "Using strings and arrays" for details. =head2 adNames() my $table = adTie(...); my @column_names = adNames($table); This method returns an array of the column names for the specified table. =head2 adRows() my $table = adTie(...); adRows( $table, %search_hash ); This method takes an AnyData tied hash created with adTie() and counts the rows in the table that match the search hash. For example, this snippet returns a count of the rows in the file that contain the specified page in the request column my $hits = adTie( 'Weblog', 'access.log'); print adRows( $hits , request => 'mypage.html' ); The search hash may contain multiple search criteria, see the section on multiple row operations below. If the search_hash is omitted, it returns a count of all rows. =head2 adColumn() my @col_vals = adColumn( $table, $column_name, $distinct_flag ); This method returns an array of values taken from the specified column. 
If there is a distinct_flag parameter, duplicates will be eliminated from the list. For example, this snippet returns a unique list of the values in the 'player' column of the table. my $game = adTie( 'Pipe','games.db' ); my @players = adColumn( $game, 'player', 1 ); =head2 adDump() my $table = adTie(...); print adDump($table); This method prints the raw data in the table. Column names are printed inside angle brackets and separated by colons on the first line, then each row is printed as a list of values inside square brackets. =head2 adFormats() print "$_\n" for adFormats(); This method shows the available format parsers, e.g. 'CSV', 'XML', etc. It looks in your @INC for the .../AnyData/Format directory and prints the names of format parsing files there. If the parser requires further modules (e.g. XML requires XML::Parser) and you do not have the additional modules installed, the format will not work even if listed by this command. Otherwise, all formats should work as described in this documentation. =head1 FURTHER DETAILS =head2 Column Names Column names may be assigned in three ways: * pre -- The format parser preassigns column names (e.g. Passwd files automatically have columns named 'username', 'homedir', 'GID', etc.). * user -- The user specifies the column names as a comma-separated string associated with the key 'cols': my $table = adTie( $format, $file, $mode, {cols=>'name,age,gender'} ); * auto -- If there is no preassigned list of column names and none defined by the user, the first line of the file is treated as a list of column names; the line is parsed according to the specific format (e.g. CSV column names are a comma-separated list, Tab column names are a tab-separated list). When creating a new file in a format that does not preassign column names, the user *must* manually assign them as shown above. Some formats have special rules for assigning column names (XML, Fixed, HTMLtable), see the sections below on those formats. =head2 Key Columns The AnyData modules support tables that have a single key column that uniquely identifies each row as well as tables that do not have such keys. For tables where there is a unique key, that key may be assigned in three ways: * pre -- The format parser automatically preassigns the key column name, e.g. Passwd files automatically have 'username' as the key column. * user -- The user specifies the key column name: my $table = adTie( $format, $file, $mode, {key=>'country'} ); * auto -- If there is no preassigned key column and the user does not define one, the first column becomes the default key column. =head2 Format Specific Details For full details, see the documentation for AnyData::Format::Foo where Foo is any of the formats listed in the adFormats() command e.g. 'CSV', 'XML', etc. Included below are only some of the more important details of the specific parsers. =over =item Fixed Format When using the Fixed format for fixed length records you must always specify a pattern indicating the lengths of the fields. This should be a string as would be passed to the unpack() function to unpack the records in your Fixed length definition: my $t = adTie( 'Fixed', $file, 'r', {pattern=>'A3 A7 A9'} ); If you want the column names to appear on the first line of a Fixed file, they should be in comma-separated format, not in Fixed format. This is different from other formats which use their own format to display the column names on the first line. This is necessary because the name of the column might be longer than the length of the column.
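For instance, here is a minimal sketch based on the t/fixed.tbl and t/fixed.t files that ship with this distribution: a fixed-width file whose first column is 11 characters wide and whose second column is 2 characters wide, with a comma-separated header line followed by fixed-width data rows:

 country,code
 australia  au
 germany    de
 france     fr

 # the pattern describes the fixed-width data rows, not the header line
 my $t = adTie( 'Fixed', 't/fixed.tbl', 'r', { pattern => 'A11 A2' } );
 print $t->{australia}->{code};   # prints 'au'
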
=item XML Format The XML format does not allow you to specify column names as a flag, rather you specify a "record_tag" and the column names are determined from the contents of the tag. If no record_tag is specified, the record tag will be assumed to be the first child of the root of the XML tree. That child and its structure will be determined from the DTD if there is one, or from the first occurring record if there is no DTD. For simple XML, no flags are necessary:

   <table>
     <row row_id="1"><name>Joe</name><location>Seattle</location></row>
     <row row_id="2"><name>Sue</name><location>Portland</location></row>
   </table>

The record_tag will default to the first child, namely "row". The column names will be generated from the attributes of the record tag and all of the tags included under the record tag, so the column names in this example will be "row_id","name","location". If the record_tag is not the first child, you will need to specify it. For example:

   <database>
     <table table_id="1">
       <row row_id="1"><name>Joe</name><location>Seattle</location></row>
       <row row_id="2"><name>Sue</name><location>Portland</location></row>
     </table>
     <table table_id="2">
       <row row_id="1"><name>Bob</name><location>Boise</location></row>
       <row row_id="2"><name>Bev</name><location>Billings</location></row>
     </table>
   </database>
In this case you will need to specify "row" as the record_tag since it is not the first child of the tree. The column names will be generated from the attributes of row's parent (if the parent is not the root), from row's attributes and sub tags, i.e. "table_id","row_id","name","location". When exporting XML, you can specify a DTD to control the output. For example, if you import a table from CSV or from an Array, you can output as XML and specify which of the columns become tags and which become attributes and also specify the nesting of the tags in your DTD. The XML format parser is built on top of Michel Rodriguez's excellent XML::Twig which is itself based on XML::Parser. Parameters to either of those modules may be passed in the flags for adTie() and the other commands including the "prettyPrint" flag to specify how the output XML is displayed and things like ProtocolEncoding. ProtocolEncoding defaults to 'ISO-8859-1'; all other flags keep the defaults of XML::Twig and XML::Parser. See the documentation of those modules for details. CAUTION: Unlike other formats, the XML format does not save changes to the file as they are entered, but only saves the changes when you explicitly request them to be saved with the adExport() command. =item HTMLtable Format This format is based on Matt Sisk's excellent HTML::TableExtract. It can be used to read an existing table from an HTML page, or to create a new HTML table from any data source. You may control which table in an HTML page is used with the column_names, depth and count flags. If a column_names flag is passed, the first table that contains those names as the cells in a row will be selected. If depth and/or count parameters are passed, it will look for tables as specified in the HTML::TableExtract documentation. If none of the column_names, depth, or count flags are passed, the first table encountered in the file will be the table selected and its first row will be used to determine the column names for the table. When exporting to an HTMLtable, you may pass flags to specify properties of the whole table (table_flags), the top row containing the column names (top_row_flags), and the data rows (data_row_flags). These flags follow the syntax of CGI.pm table constructors, e.g.: print adExport( $table, 'HTMLtable', { table_flags => {Border=>3,bgColor=>'blue'}, top_row_flags => {bgColor=>'red'}, data_row_flags => {valign=>'top'} } ); The table_flags will default to {Border=>1,bgColor=>'white'} if none are specified. The top_row_flags will default to {bgColor=>'#c0c0c0'} if none are specified. The data_row_flags will be empty if none are specified. In other words, if no flags are specified the table will print out with a border of 1, the column headings in gray, and the data rows in white. CAUTION: This module will *not* preserve anything in the HTML file except the selected table, so if your file contains more than the selected table, you will want to use adTie() to read the table and then adExport() to write the table to a different file. When using the HTMLtable format, this is the only way to preserve changes to the data; the adTie() command will *not* write to a file. =back =head2 Multiple Row Operations The AnyData hash returned by adTie() may use either single values as keys, or a reference to a hash of comparisons as a key. If the key to the hash is a single value, the hash operates on a single row, but if the key to the hash is itself a hash reference, the hash operates on a group of rows.
my $num_deleted = delete $table->{Sue}; This example deletes a single row where the key column has the value 'Sue'. If multiple rows have the value 'Sue' in that column, only the first is deleted. It uses a simple string as a key, therefore it operates on only a single row. my $num_deleted = delete $table->{ {name=>'Sue'} }; This example deletes all rows where the column 'name' is equal to 'Sue'. It uses a hashref as a key and therefore operates on multiple rows. The hashref used in this example is a single column comparison but the hashref could also include multiple column comparisons. This deletes all rows where the values listed for the country, gender, and age columns are equal to those specified: my $num_deleted = delete $table->{{ country => 'us', gender => 'm', age => '25' }}; In addition to simple strings, the values may be specified as regular expressions or as numeric or alphabetic comparisons. This will delete all North American males under the age of 25: my $num_deleted = delete $table->{{ country => qr/mx|us|ca/, gender => 'm', age => '< 25' }}; If numeric or alphabetic comparisons are used, they should be a string with the comparison operator separated from the value by a space, e.g. '> 4' or 'lt b'. This kind of search hashref can be used not only to delete multiple rows, but also to update rows. In fact you *must* use a hashref key in order to update your table. Updating is the only operation that cannot be done with a single string key. The search hashref can be used with a select statement, in which case it returns a reference to an array of rows matching the criteria: my $male_players = $table->{{gender=>'m'}}; for my $player( @$male_players ) { print $player->{name},"\n" } This should be used with caution with a large table since it gathers all of the selected rows into an array in memory. Again, 'each' is a much better way for large tables. This accomplishes the same thing as the example above, but without ever pulling more than a row into memory at a time: while( my $row = each %$table ) { print $row->{name}, "\n" if $row->{gender} eq 'm'; } Search criteria for multiple rows can also be used with the adRows() function: my $num_of_women = adRows( $table, gender => 'w' ); That does *not* pull the entire table into memory; it counts the rows a record at a time. =head2 Using Remote Files If the first file parameter of adTie() or adConvert() begins with "http://" or "ftp://", the file is treated as a remote URL and the LWP module is called behind the scenes to fetch the file. If the files are in an area that requires authentication, that may be supplied in the $flags parameter. For example: # read a remote file and access it via a tied hash # my $table = adTie( 'XML', 'http://www.foo.edu/bar.xml' ); # same with username/password # my $table = adTie( 'XML', 'ftp://www.foo.edu/pub/bar.xml', 'r', { user => 'me', pass => 'x7dy4' } ); # read a remote file, convert it to an HTML table, and print it # print adConvert( 'XML', 'ftp://www.foo.edu/pub/bar.xml', 'HTMLtable' ); =head2 Using Strings and Arrays Strings and arrays may be used as either the source of data input or as the target of data output. Strings should be passed as the only element of an array reference (in other words, inside square brackets). Arrays should be a reference to an array whose first element is a reference to an array of column names and whose succeeding elements are references to arrays of row values.
For example:

 my $table = adTie( 'XML', ["<x><motto id='perl'>TIMTOWTDI</motto></x>"] );

This uses the XML format to parse the supplied string and returns a tied hash to the resulting table.

 my $table = adTie( 'ARRAY', [['id','motto'],['perl','TIMTOWTDI']] );

This uses the column names "id" and "motto" and the supplied row values and returns a tied hash to the resulting table.

It is also possible to use an empty array to create a new empty tied hash in any format, for example:

 my $table = adTie('XML',[],'c');

creates a new empty tied hash. See adConvert() and adExport() for further examples of using strings and arrays.

=head2 Ties, Flocks, I/O, and Atomicity

AnyData provides flocking which works under the limitations of flock -- that it only works if other processes accessing the files are also using flock and only on platforms that support flock. See the flock() man page for details.

Here is what the user-supplied open modes actually do:

 r = read only  (LOCK_SH)  O_RDONLY
 u = update     (LOCK_EX)  O_RDWR
 c = create     (LOCK_EX)  O_CREAT | O_RDWR | O_EXCL
 o = overwrite  (LOCK_EX)  O_CREAT | O_RDWR | O_TRUNC

When you use something like "my $table = adTie(...)", it opens the file with a lock and leaves the file and lock open until 1) the hash variable ($table) goes out of scope, or 2) the hash is undefined (e.g. "undef $table"), or 3) the hash is re-assigned to another tie. In all cases the file is closed and the lock released. If adTie() is called without creating a tied hash variable, the file is closed and the lock released immediately after the call to adTie(). For example:

 print adTie('XML','foo.xml')->{main_office}->{phone};

That obtains a shared lock, opens the file, retrieves the one value requested, closes the file, and releases the lock.

These two examples accomplish the same thing, but the first example opens the file once, does all of the deletions while keeping the exclusive lock in place until they are all done, then closes the file. The second example opens and closes the file four times, once for each deletion, and releases the exclusive lock between each deletion:

 1. my $t = adTie('Pipe','games.db','u');
    delete $t->{"user$_"} for (0..3);
    undef $t;   # closes file and releases lock

 2. delete adTie('Pipe','games.db','u')->{"user$_"} for (0..3);
    # no undef needed since no hash variable created

=head2 Deletions and Packing

In order to save time and to prevent having to do writes anywhere except at the end of the file, deletions and updates are *not* done at the time of issuing a delete command. Rather, when the user does a delete, the position of the deleted record is stored in a hash, and when the file is saved to disk the deletions are physically removed by packing the entire database. Updates are done by inserting the new record at the end of the file and marking the old record for deletion. In the normal course of events, all of this should be transparent and you'll never need to worry about it. However, if your server goes down after you've made updates or deletions but before you've saved the file, then the deleted rows will remain in the database and for updates there will be duplicate rows -- the old non-updated row and the new updated row. If you are worried about this kind of event, then use atomic deletes and updates as shown in the section above (an atomic update is sketched below). There's still a very small possibility of a crash in between the deletion and the save, but in this case it should impact at most a single row.
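An atomic update works the same way as the one-line atomic delete shown earlier: build the tie, apply the change through a hashref key, and let the tied hash go out of scope in a single statement. A minimal sketch, assuming games.db has a 'user' key column and a 'score' column (those column names are illustrative assumptions, not columns defined elsewhere in this document):

 # open games.db for update, change the score of the row(s) whose
 # user column is 'user1', then close the file and release the
 # exclusive lock -- all in one statement
 adTie('Pipe','games.db','u')->{ { user => 'user1' } } = { score => 99 };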
(BIG thanks to Matthew Wickline for suggestions on handling deletes) =head1 MORE HELP See the README file and the test.pl included with the module for further examples. See the AnyData/Format/*.pm PODs for further details of specific formats. For further support, please use comp.lang.perl.modules =head1 ACKNOWLEDGEMENTS Special thanks to Andy Duncan, Tom Lowery, Randal Schwartz, Michel Rodriguez, Jochen Wiedmann, Tim Bunce, Alligator Descartes, Mathew Persico, Chris Nandor, Malcom Cook and to many others on the DBI mailing lists and the clp* newsgroups. =head1 AUTHOR & COPYRIGHT Jeff Zucker This module is copyright (c), 2000 by Jeff Zucker. Some changes (c) 2012 Sven Dowideit L It may be freely distributed under the same terms as Perl itself. =cut ################################ # END OF AnyData ################################ 1; AnyData-0.12/lib/AnyData/Format/000755 000765 000024 00000000000 12462242164 016241 5ustar00snostaff000000 000000 AnyData-0.12/lib/AnyData/Storage/000755 000765 000024 00000000000 12462242164 016415 5ustar00snostaff000000 000000 AnyData-0.12/lib/AnyData/Storage/File.pm000644 000765 000024 00000020400 12423400733 017621 0ustar00snostaff000000 000000 package AnyData::Storage::File; use strict; use warnings; use IO::File; use Fcntl qw(:flock); use File::Basename; use constant HAS_FLOCK => eval { flock STDOUT, 0; 1 }; use constant HAS_FILE_SPEC => eval { require File::Spec }; use vars qw($DEBUG); $DEBUG = 0; sub new { my $class = shift; my $self = shift || {}; #$self->{f_dir} ||= './'; return bless $self, $class; } sub seek_first_record { my $self = shift; my $fh = $self->{fh}; my $start = $self->{first_row_pos}; $start ? $fh->seek($start,0) || die $! : $fh->seek(0,0) || die $!; } sub get_pos { return shift->{fh}->tell } sub go_pos { my($s,$pos)=@_; $s->{fh}->seek($pos,0); } my $open_table_re = HAS_FILE_SPEC ? sprintf('(?:%s|%s|%s)', quotemeta(File::Spec->curdir()), quotemeta(File::Spec->updir()), quotemeta(File::Spec->rootdir())) : '(?:\.?\.)?\/'; sub open_local_file { my( $self,$file, $open_mode ) = @_; my $dir = $self->{f_dir} || './'; my($fname,$path) = fileparse($file); my($foo2,$os_cur_dir) = fileparse(''); my $haspath = 1 if $path and $path ne $os_cur_dir; if (!$haspath && $file !~ /^$open_table_re/o) { $file = HAS_FILE_SPEC ? File::Spec->catfile($dir, $file) : $dir . "/$file"; } my $fh; $open_mode ||= 'r'; my %valid_mode = ( r => q/read read an existing file, fail if already exists/, u => q/update read & modify an existing file, fail if already exists/, c => q/create create a new file, fail if it already exists/, o => q/overwrite create a new file, overwrite if it already exists/, ); my %mode = ( r => O_RDONLY, u => O_RDWR, c => O_CREAT | O_RDWR | O_EXCL, o => O_CREAT | O_RDWR | O_TRUNC ); my $help = qq( r if file exists, get shared lock u if file exists, get exclusive lock c if file doesn't exist, get exclusive lock o truncate if file exists, else create; get exclusive lock ); if ( !$valid_mode{$open_mode} ) { print "\nBad open_mode '$open_mode'\nValid modes are :\n"; for ('r','u','c','o'){ print " $_ = $valid_mode{$_}\n"; } exit; } if ($open_mode eq 'c') { if (-f $file) { die "Cannot create '$file': Already exists"; } } if ($open_mode =~ /[co]/ ) { if (!($fh = IO::File->new( $file, $mode{$open_mode} ))) { die "Cannot open '$file': $!"; } if (!$fh->seek(0, 0)) { die " Error while seeking back: $!"; } } if ($open_mode =~ /[ru]/) { die "Cannot read file '$file': doesn't exist!" 
unless -f $file; if (!($fh = IO::File->new($file, $mode{$open_mode}))) { die " Cannot open '$file': $!"; } } binmode($fh); $fh->autoflush(1); if ( HAS_FLOCK ) { if ( $open_mode eq 'r') { if (!flock($fh, LOCK_SH)) { die "Cannot obtain shared lock on '$file': $!"; } } else { if (!flock($fh, LOCK_EX)) { die " Cannot obtain exclusive lock on '$file': $!"; } } } print "OPENING $file, mode = '$open_mode'\n" if $DEBUG; return( $file, $fh, $open_mode) if wantarray; return( $fh ); } sub print_col_names { my($self,$parser,$col_names) = @_; my $fields = $col_names || $self->{col_names} || $parser->{col_names}; return undef unless scalar @$fields; $self->{col_names} = $fields; return $fields if $parser->{keep_first_line}; my $first_line = $self->get_record(); my $fh = $self->{fh}; $self->seek_first_record; my $end = $parser->{record_sep} || "\n"; my $colStr = $parser->write_fields(@$fields); $colStr = join( ',',@$fields) . $end if ref($parser) =~ /Fixed/; $fh->write($colStr,length $colStr); $self->{first_row_pos} = $fh->tell(); } sub get_col_names { my($self,$parser) = @_; my @fields = (); if ($parser->{keep_first_line}) { my $cols = $parser->{col_names}; return undef unless $cols; return $cols if ref $cols eq 'ARRAY'; @fields = split ',',$cols; #die "@fields"; return scalar @fields ? \@fields : undef; } my $fh = $self->{fh}; $fh->seek(0,0) if $fh; my $first_line = $self->get_record($parser); #print $first_line; if ( $first_line ) { @fields = ref($parser) =~ /Fixed/ ? split /,/,$first_line : $parser->read_fields($first_line); } # my @fields = $first_line # ? $parser->read_fields($first_line) # : (); #print "<$_>" for @fields; print "\n"; return "CAN'T FIND COLUMN NAMES ON FIRST LINE OF '" . $self->{file} . "' : '@fields'" if "@fields" =~ /[^ a-zA-Z0-9_]/; $parser->{col_names} = \@fields; $self->{col_names} = \@fields; $self->{col_nums} = $self->set_col_nums; $self->{first_row_pos} = $fh->tell(); return( \@fields); } sub open_table { my( $self, $parser, $file, $open_mode ) = @_; my($newfile, $fh); $file ||= ''; if ( $file =~ m'http://|ftp://' ) { # die "wrong storage!"; $newfile = $file; } else { ($newfile,$fh) = $self->open_local_file($file,$open_mode) if $file && !(ref $file); } $newfile ||= $file; #die AnyData::dump($parser); my $col_names = $parser->{col_names} || ''; # my @array = split(/,/,$col_names); my @array; @array = ref $col_names eq 'ARRAY' ? @$col_names : split ',',$col_names; my $pos = $fh->tell() if $fh; my %table = ( file => $newfile, open_mode => $open_mode, fh => $fh, col_nums => {}, col_names => \@array, first_row_pos => $pos ); for my $key(keys %table) { $self->{$key}=$table{$key}; } my $skip = $parser->init_parser($self); if (!$skip && defined $newfile) { $open_mode =~ /[co]/ ? 
$self->print_col_names($parser) : $self->get_col_names($parser); } $self->{col_nums} = $self->set_col_nums(); # use Data::Dumper; die Dumper $self; } sub get_file_handle { return shift->{fh} } sub get_file_name { return shift->{file} } sub get_file_open_mode { return shift->{open_mode} } sub file2str { return shift->get_record(@_) } sub get_record { my($self,$parser)=@_; local $/ = $parser->{record_sep} || "\n"; my $fh = $self->{fh} ; my $record = $fh->getline || return undef; $record =~ s/\015$//g; $record =~ s/\012$//g; return $record; } sub set_col_nums { my $self = shift; my $col_names = $self->{col_names}; return {} unless $col_names; my $col_nums={}; my $i=0; for (@$col_names) { next unless $_; $col_nums->{$col_names->[$i]} = $i; $i++; } return $col_nums; } sub truncate { my $self = shift; if (!$self->{fh}->truncate($self->{fh}->tell())) { die "Error while truncating " . $self->{file} . ": $!"; } } sub drop ($) { my($self) = @_; # We have to close the file before unlinking it: Some OS'es will # refuse the unlink otherwise. $self->{'fh'}->close() || die $!; unlink($self->{'file'}) || die $!; return 1; } sub close{ shift->{'fh'}->close() || die $!; } sub push_row { my $self = shift; my $rec = shift; my $fh = $self->{fh}; #####!!!! DON'T USE THIS #### $fh->seek(0,2) or die $!; $fh->write($rec,length $rec) || die "Couldn't write to file: $!\n"; } sub delete_record { my $self = shift; my $parser = shift || {}; my $fh = $self->{fh}; my $travel = length($parser->{record_sep}) || 0; my $pos = $fh->tell - $travel; $self->{deleted}->{$pos}++; } sub is_deleted { my $self = shift; my $parser = shift || {}; my $fh = $self->{fh}; my $travel = length($parser->{record_sep}) || 0; my $pos = $fh->tell - $travel; return $self->{deleted}->{$pos}; } sub seek { my($self, $pos, $whence) = @_; if ($whence == 0 && $pos == 0) { $pos = $self->{first_row_pos}; } elsif ($whence != 2 || $pos != 0) { die "Illegal seek position: pos = $pos, whence = $whence"; } if (!$self->{fh}->seek($pos, $whence)) { die "Error while seeking in " . $self->{'file'} . ": $!"; } #print "<$pos-$whence>"; } sub DESTROY { my $self = shift; my $fh = $self->{fh}; print "CLOSING ", $self->get_file_name, "\n" if $fh && $DEBUG; $fh->close if $fh; } __END__ AnyData-0.12/lib/AnyData/Storage/File.pod000644 000765 000024 00000003616 12423400733 020001 0ustar00snostaff000000 000000 =pod =head1 NAME AnyData::Storage::File -- manipulate files with rich warnings =head1 DESCRIPTION Opens, reads from, writes to, creates and destroys files with numerous options for error handling, flocking, binmode, etc. 
The simplest form is the equivalent of my $f = AnyData::Storage::File->new(dirs=>\@dirs,flock=>1); my $str1 = $f->adSlurp($file); for( @dirs ) { open(IN,$file) or die $!; } sub slurp { local *IN; local $/ = undef; } But, depending on how you set the default behaviour =head1 SYNOPSIS use AnyData; my $f = AnyData::Storage::File->new; $f->set('binmode',1|0); $f->set('PrintError',1|0); $f->set('RaiseError',1|0); $f->set('Trace',1|0); $f->set('f_dir',$dir|$dir_array) < input, fail if doesn't exist > output, truncate if exists, create if doesn't >> append, create if doesn't exist +< read/write, fail if doesn't exist r = < r+ = =head1 new() my $f = AnyData::Storage::File->new; or my $f = AnyData::Storage::File->new( %flags ); %flags is a hash which can contain any or all of: f_dir => $directory, # defaults to './' ( binmode => $binmode, # defaults to 0 (doesn't binmode files) printError => $warnings, # defaults to 1 (print warning on errors) =head1 open_local_file( $fname, $mode ); Mode is one of a = append open for reading & writing, create if doesn't exist r = read open for reading, fail if doesn't exist u = open open for reading & writing, fail if doesn't exist c = create open for reading & writing, fail if it already exists o = overwrite open for reading & writing, overwrite if it already exists Additionally, all modes fail if the file can't be opened. On systems that support flock, 'r' fails if a shared lock can not be obtained; the other modes fail if an exclusive lock can't be obtained. =cut AnyData-0.12/lib/AnyData/Storage/FileSys.pm000644 000765 000024 00000005621 12423400733 020330 0ustar00snostaff000000 000000 package AnyData::Storage::FileSys; use strict; use warnings; use File::Find; use File::Basename; use vars qw( @ISA @files $wanted_part $wanted_re ); use AnyData::Storage::File; @ISA = qw( AnyData::Storage::File ); use Data::Dumper; sub open_table {} sub new { my $class = shift; my $self = shift || {}; $self->{col_names} = ['fullpath','path','name','ext','size','content' ]; bless $self, $class; my $exts = $self->{exts}; if ($exts) { $self->{wanted_part} = 'ext'; $self->{wanted_re} = qr/\.$exts$/; } $self->{records} = $self->get_data; $self->{index} = 0; return $self; } sub is_deleted {} sub get_data { my $self = shift; my $dirs = shift || $self->{dirs}; my @col_names = @{ $self->{col_names} }; my $table = []; my @files = $self->get_filename_parts; for my $file_info(@files) { my $file = $file_info->[0]; # my $cols = get_mp3_tag($file) || next; #my $filesize = -s $file; #my @row = (@$file_info,$filesize); my @row = ( $file_info->[0], $file_info->[2], $file_info->[1], $file_info->[3], ); push @$table, \@row; # 'fullpath,path,name,ext,size,content'; # 'fullpath,file_name,path,ext,size,' # 'name,artist,album,year,comment,genre'; } #use Data::Dumper; print "!",Dumper $table; exit; return $table; } sub seek_first_record { my $self = shift; $self->{index} = 0; } sub file2str { my $self = shift; my $curindex = $self->{index}; return undef if $curindex >= scalar @{$self->{records}}; $self->{index}++; my $rec = $self->{records}->[$curindex]; my $file = $rec->[0]; push @$rec, -s $file; local $/; undef $/; my $fh = $self->open_local_file( $file, $self->{open_mode}); my $str = <$fh>; undef $fh; push @$rec, $str; return $rec; } sub col_names { shift->{col_names} } sub get_filename_parts { my $self = shift; my %flags; %flags = @_ if scalar @_; #use Data::Dumper; print "!",Dumper \%flags; exit; $wanted_part = $flags{part} || $self->{wanted_part} || ''; $wanted_re = $flags{re} || $self->{wanted_re} 
|| ''; my $dirs = $flags{dirs} || $self->{dirs} || []; my $wanted_sub = $flags{sub} || \&wanted; @files = (); find { no_chdir => 1, wanted => $wanted_sub, }, @$dirs; ; my @results = @files; @files = (); return @results; } sub wanted { my @info = fileparse($_,'\.[^\.]*$'); my($name,$path,$ext) = map{$_ || ''} @info; if (!$name && $ext) { $name = $ext; $ext = ''; } unshift @info,$File::Find::name; my $cols; @{$cols}{('fullpath','filename','path','ext')} = @info; if ($wanted_part && $wanted_re) { return unless $cols->{$wanted_part} =~ $wanted_re; } push @files, \@info; } 1; AnyData-0.12/lib/AnyData/Storage/PassThru.pm000644 000765 000024 00000003322 12462127544 020530 0ustar00snostaff000000 000000 ######################################################################### package AnyData::Storage::PassThru; ######################################################################### # This module is copyright (c), 2000 by Jeff Zucker # All rights reserved. ######################################################################### # Nothing of interest here, it just passes the storage duties to the # parser for formats like XML that do both format and storage ######################################################################### use strict; use warnings; use vars qw($VERSION @ISA); $VERSION = '0.12'; use AnyData::Storage::File; @ISA = qw( AnyData::Storage::File); sub file2str { 1 } sub push_row { my($s,$f)=@_;$s->{parser}->push_row(@$f) } sub seek_first_record { shift->{parser}->seek_first_record } sub get_col_names { shift->{col_names} } sub delete_record { shift->{parser}->delete_record } sub truncate { shift->{parser}->truncate(@_) } sub drop { shift->{parser}->drop(@_)} sub close_table { shift->{parser}->close_table(@_)} sub get_pos { shift->{parser}->get_pos(@_)} sub go_pos { shift->{parser}->go_pos(@_)} sub seek { shift->{parser}->seek(@_)} sub export { shift->{parser}->export(@_)} sub DESTROY { #print "PASSTHRU DESTROYED"; } ##################################### # push_names() ##################################### sub print_col_names { my($self, $parser, $names) = @_; $names = $parser->push_names($names); $self->{col_names} = $names; my($col_nums) = {}; for (my $i = 0; $i < @$names; $i++) { $col_nums->{$names->[$i]} = $i; } $self->{col_nums} = $col_nums; } 1; AnyData-0.12/lib/AnyData/Storage/RAM.pm000644 000765 000024 00000025623 12462127544 017406 0ustar00snostaff000000 000000 ######################################################################### package AnyData::Storage::RAM; ######################################################################### # # This module is copyright (c), 2000 by Jeff Zucker # All rights reserved. 
# ######################################################################### use strict; use warnings; use vars qw($VERSION $DEBUG); $VERSION = '0.12'; $DEBUG = 1; use Data::Dumper; use AnyData::Storage::File; sub new { my $class = shift; my $self = shift || {}; return bless $self, $class; } ######## # MOVE set_col_nums and open_table to Storage/Base.pm # # ALSO make DBD::AnyData::Statement and DBD::Table simple @ISA for AnyData sub set_col_nums { my $self = shift; my $col_names = $self->{col_names}; return {} unless $col_names ; return {} unless ref $col_names eq 'ARRAY'; return {} unless scalar @$col_names; my $col_nums={}; my $i=0; for (@$col_names) { next unless $_; $col_nums->{$_} = $i; $i++; } #use Data::Dumper; die Dumper $col_names; $self->{col_nums}=$col_nums; return $col_nums; } sub open_table { my( $self, $parser, $file, $read_mode, $data ) = @_; $data = $self->{recs} if $self->{recs}; #$data ||= $parser->{recs}; #$data = $file if ref $file eq 'ARRAY' and !$data; #use Data::Dumper; print Dumper $data; #print ref $parser; my $rec_sep = $parser->{record_sep};# || "\n"; my $table_ary = []; my $col_names = $parser->{col_names} || $self->{col_names}; my $cols_supplied = $col_names; my $url = $file if $file =~ m"^http://|^ftp://"; $self->{open_mode} = $read_mode || 'r'; my $data_type; $data_type='ARY-ARY' if ref $data eq 'ARRAY' and ref $data->[0] eq 'ARRAY'; $data_type='ARY-HSH' if ref $data eq 'ARRAY' and ref $data->[0] eq 'HASH'; $data_type='ARY-STR' if ref $data eq 'ARRAY' and !$data_type; $data_type ||= 'STR'; # print "[$data_type]" . ref $data if $data; # MP3 and ARRAY if ( $self->{records} && !$data ) { $table_ary = $self->{records}; $col_names ||= shift @$table_ary; } # REMOTE elsif ( $data ) { if ($parser->{slurp_mode}) { ($table_ary,$col_names) = $parser->import($data,$self); shift @$table_ary if (ref $parser) =~ /HTMLtable/ && $url && $cols_supplied; } else { if ($data_type eq 'ARY-STR') { $data = join '', @$data; } if ($data_type eq 'ARY-ARY') { $table_ary = $data; } elsif ($data_type eq 'ARY-HSH') { print "IMPORT OF HASHES NOT YET IMPLEMENTED!\n"; exit; } else { $data =~ s/\015$//gsm; # ^M = CR from DOS #use Data::Dumper; print Dumper $data; my @tmp = split /$rec_sep/, $data; #use Data::Dumper; print ref $parser, Dumper \@tmp; if ((ref $parser) =~ /Fixed/ && (!$col_names or !scalar @$col_names)) { my $colstr = shift @tmp; # $colstr =~ s/\015$//g; # ^M = CR from DOS @$col_names = split ',',$colstr; } if ((ref $parser) =~ /Paragraph/) { my $colstr = shift @tmp; @$col_names = $parser->read_fields($colstr); #print "@$col_names"; } for my $line( @tmp ) { # for (split /$rec_sep/, $data) { # s/\015$//g; # ^M = CR from DOS next if $parser->{skip_pattern} and $line =~ $parser->{skip_pattern}; my @row = $parser->read_fields($line); #print $_; #use Data::Dumper; print Dumper \@row; ###z MOD # next unless scalar @row; # push @$table_ary, \@row; push @$table_ary, \@row # unless $parser->{skip_mark} # and $row[0] eq $parser->{skip_mark}; # } } if ((ref $parser) !~ /Fixed|Paragraph/ && !$parser->{keep_first_line} && !$parser->{col_names} ) { $col_names = shift @$table_ary; } #use Data::Dumper; die Dumper $table_ary; } } # if ($file and !(ref $file eq 'ARRAY') and $file !~ m'^http://|ftp://' and !(scalar @$table_ary) ) { if ((ref $parser) !~ /XML/ ) { my $size = scalar @$table_ary if defined $table_ary; if ($file and !(ref $file eq 'ARRAY') and !$size ) { if ($file =~ m'^http://|ftp://') { # ($table_ary,$col_names) = # $self->get_remote_data($file,$parser); } else { 
($table_ary,$col_names) = $self->get_local_data($file,$parser,$read_mode); } } } my @array = @$col_names if ref $col_names eq 'ARRAY'; #print "@array" if @array; if ($col_names && scalar @array == 0 ) { @array = (ref $parser =~ /Fixed/) ? split ',', $col_names : $parser->read_fields($col_names); } my $col_nums; $col_nums = $self->set_col_nums() if $col_names; my %table = ( index => 0, file => $file, records => $table_ary, col_nums => $col_nums, col_names => \@array, ); for my $key(keys %table) { $self->{$key}=$table{$key}; } #use Data::Dumper; print Dumper $self; exit; #use Data::Dumper; print Dumper $table_ary; #use Data::Dumper; print Dumper $self->{records} if (ref $parser) =~ /Weblog/; } sub close { my $s = shift; undef $s } sub get_remote_data { my $self = shift; my $file = shift; my $parser = shift; $ENV = {} unless defined $ENV; $^W = 0; undef $@; my $user = $self->{user} || $self->{username}; my $pass = $self->{pass} || $self->{password}; eval{ require 'LWP/UserAgent.pm'; }; # eval{ require 'File/DosGlob.pm'; }; die "LWP module not found! $@" if $@; my $ua = LWP::UserAgent->new; my $req = HTTP::Request->new(GET => $file); $req->authorization_basic($user, $pass) if $user and $pass; my $res = $ua->request($req); die "[$file] : " . $res->message if !$res->is_success; $^W = 1; return $res->content; # return $parser->get_data($res->content,$self->{col_names}); } sub export { my $self = shift; my $parser = shift; print "##"; return unless $parser->{export_on_close} && $self->{open_mode} ne 'r'; # return $parser->export( $self->{records}, $self->{col_names}, $self->{deleted} ); #$self->{file_manager}->str2file($str); } sub DESTROY { #shift->export; #print "DESTROY"; } sub get_local_data { my $self = shift; my $file = shift; my $parser = shift; my $open_mode = shift || 'r'; my $adf = AnyData::Storage::File->new; # $adf->open_table($parser,$file,'r'); my $fh = $adf->open_local_file($file,$open_mode); #print Dumper $file,$adf; exit; $self->{file_manager} = $adf; $self->{fh} = $fh; #use Data::Dumper; print Dumper $self; # my $fh = $adf->{fh}; return([],$self->{col_names}) if 'co' =~ /$open_mode/; # if ((ref $parser) =~ /HTML/) { # print "[[$file]]"; # for (<$fh>) { print; } # } local $/ = undef; my $str = <$fh>; # $fh->close; #print $str if (ref $parser) =~ /HTML/; return $self->{col_names} unless $str; return $parser->get_data($str,$self->{col_names}); } sub dump { my $self = shift; print "\nTotal Rows = ", scalar @{ $self->{records} }, "\nCurrent Row = ", $self->{index}, "\nData = ", Dumper $self->{records}, ; } sub col_names { shift->{col_names} } sub get_col_names { my $self=shift; my $parser=shift; my $c = $self->{col_names} || $parser->{col_names}; #print "###@$c"; #return $c; # if (!scalar @$c and $self->{data}) { # $c = shift @{$self->{data}}; # } # return $c; } sub get_file_handle {''} sub get_file_name {''} sub seek_first_record { shift->{index}=0 } sub get_pos { my $s=shift; $s->{CUR}= $s->{index}} sub go_pos {my $s=shift;$s->{index}=$s->{CUR}} sub is_deleted { my $s=shift; return $s->{deleted}->{$s->{index}-1} }; sub delete_record { my $self = shift; # $self->{records}->[ $self->{index}-1 ]->[-1] = $self->{del_marker}; $self->{deleted}->{ $self->{index}-1 }++; } ################################## # fetch_row() ################################## sub get_record { my($self,$parser) = @_; my $currentRow = $self->{index}; return undef unless $self->{records} ; return undef if $currentRow >= @{ $self->{records} }; $self->{index} = $currentRow+1; $self->get_pos($self->{index}); #print @{ 
$self->{records}->[ $currentRow ] }; return $self->{records}->[ $currentRow ]; } *file2str = \&get_record; *write_fields = \&push_row; #################################### # push_row() #################################### sub push_row { my($self, $fields, $parser) = @_; if (! ref $fields) { $fields =~ s/\012$//; #chomp $fields; my @rec = $parser->read_fields($fields); $fields = \@rec; } #use Data::Dumper; print Dumper $fields; my $currentRow = $self->{index}; $self->{index} = $currentRow+1; $self->{records}->[$currentRow] = $fields; return 1; } ################################## # truncate() ################################## sub truncate { my $self = shift; return splice @{$self->{records}}, $self->{index},1; } ##################################### # push_names() ##################################### sub print_col_names { my($self, $parser, $names) = @_; $self->{col_names} = $names; $self->{parser}->{col_names} = $names; my($col_nums) = {}; for (my $i = 0; $i < @$names; $i++) { $col_nums->{$names->[$i]} = $i; } $self->{col_nums} = $col_nums; } sub drop {1;} sub close_table {1;} sub seek { my($self, $pos, $whence) = @_; return unless defined $self->{records}; my($currentRow) = $self->{index}; if ($whence == 0) { $currentRow = $pos; } elsif ($whence == 1) { $currentRow += $pos; } elsif ($whence == 2) { $currentRow = @{$self->{records}} + $pos; } else { die $self . "->seek: Illegal whence argument ($whence)"; } if ($currentRow < 0) { die "Illegal row number: $currentRow"; } $self->{index} = $currentRow; } ############################################################################ 1; __END__ sub str2file { my($self,$rec)=@_; my @c = caller 3; if ($c[3] =~ /DELETE/ or $c[3] =~ /UPDATE/) { $self->delete_record($rec); return undef if $c[3] =~ /DELETE/; } push @{ $self->{table} }, $rec; # $self->{index}++; return $rec; } sub delete_record{my $self=shift;use Data::Dumper; print Dumper @_} sub close {1;} sub seek { my($self,$pos,$whence) = @_; if ($pos == 0 && $whence == 0) { $self->{index}=0; return $self->{index}; } if ($pos == 0 && $whence == 2) { return $self->{index}; } } sub truncate {}#use Data::Dumper; print Dumper \@_;} 1; __END__ AnyData-0.12/lib/AnyData/Storage/TiedHash.pm000644 000765 000024 00000010364 12423400733 020443 0ustar00snostaff000000 000000 ###################################### package AnyData::Storage::TiedHash; ###################################### use strict; use warnings; sub FETCH { my($self,$key) = @_; my(@rows,$row,$found); return $self->{ad}->col_names if($key eq '__colnames'); return $self->{ad}->key_col if $key eq '__key'; my $ismultiple = ref $key; $self->{ad}->seek_first_record; while ($row = $self->{ad}->fetchrow_hashref) { if ( $self->{ad}->match($row,$key) ) { $found++; last unless $ismultiple; push @rows, $row; } } return \@rows if $ismultiple; return $found ? $row : undef; } sub TIEHASH { my $class = shift; my $ad = shift; my $perms = shift || 'r'; my $records = shift || {}; my $self = { INDEX => 0, RECORDS => $records, ad => $ad, del_marker => "\0", needs_packing => 0, PERMS => $perms, }; return bless $self, $class; } sub verify_columns { my $col_names = shift; my $val = shift; my %is_col = map {$_ => 1} @$col_names; my $errstr = "ERROR: XXX is not a column in the table!\n"; $errstr .= scalar @$col_names ? " columns are: " . join "~",@$col_names,"\n" : " couldn't find any column names\n"; if (ref $val eq 'HASH') { for (keys %$val) { $errstr =~ s/XXX/$_/; die $errstr if !$is_col{$_}; } } else { $errstr =~ s/XXX/$val/; $is_col{$val} ? 
return 1 : die $errstr; } } sub STORE { my($self,$key,$value) = @_; #my @c = caller 1; $self->{errstr} = "Can't store: file is opened in 'r' read-only mode!" if $self->{PERMS} eq 'r'; return undef if $self->{errstr}; my @colnames = @{ $self->{ad}->col_names }; verify_columns(\@colnames,$value); return $self->{ad}->update_multiple_rows($key,$value) if ref $key eq 'HASH'; $self->{ad}->seek(0,2); my @newrow; for my $i(0..$#colnames) { $newrow[$i] = $value->{$colnames[$i]}; next if defined $newrow[$i]; $newrow[$i] = $key if $colnames[$i] eq $self->{ad}->key_col; $newrow[$i] = undef unless $newrow[$i]; } return $self->{ad}->push_row(@newrow); } sub DELETE { my($self,$key)=@_; die "Can't delete: file is opened in 'r' read-only mode!" if $self->{PERMS} eq 'r'; my $row; my $count; return $self->{ad}->delete_multiple_rows($key) if ref $key; if ($row = $self->FETCH($key) ) { $self->{ad}->delete_single_row; $self->{needs_packing}++; $count++; } #return $row; return $count; } sub EXISTS { my($self,$key)=@_; return $self->FETCH($key); } sub FIRSTKEY { my $self = shift; $self->{ad}->seek_first_record(); my $found =0; my $row; while (!$found) { $row = $self->{ad}->fetchrow_hashref() or last; $found++; last; } return $found ? $row : undef; } sub NEXTKEY { my $self = shift; my $row; my $lastcol=0; my $found=0; while (!$found) { $row = $self->{ad}->fetchrow_hashref() or last; $found++; last; } return $found ? $row : undef; } sub adRows { my $self = shift; my $key = shift; my $count=0; $self->{ad}->seek_first_record; if (!$key) { while (my $row = $self->{ad}->fetchrow_hashref) { $count++; } } else { while (my $row = $self->{ad}->fetchrow_hashref) { $count++ if $self->{ad}->match($row,$key); } } return $count; } sub adColumn { my($self,$column,$flags)=@_; $flags ||= ''; my @results=(); $self->{ad}->seek_first_record; while (my $row = $self->{ad}->fetchrow_hashref) { push @results, $row->{$column} } my %is_member; @results = grep(!$is_member{$_}++, @results) if $flags; $flags =~ /u/i; # @results = sort @results if $flags =~ /a/i; # @results = reverse sort @results if $flags =~ /d/i; return @results; } sub DESTROY { #my $self=shift; #undef $self->{ad}; #print "HASH DESTROYED"; } ############################## # END OF AnyData::Tiedhash ############################## 1; AnyData-0.12/lib/AnyData/Format/Base.pm000644 000765 000024 00000001541 12423400733 017445 0ustar00snostaff000000 000000 ######################################################### package AnyData::Format::Base; ######################################################### # AnyData driver for plain text files # copyright (c) 2000, Jeff Zucker ######################################################### use strict; use warnings; use vars qw( @ISA $DEBUG ); $DEBUG = 0; sub new { my $class = shift; my $self = shift || {}; $self->{record_sep} ||= "\n"; ### $self->{slurp_mode} = 1 unless defined $self->{slurp_mode}; return bless $self, $class; } sub DESTROY { # print "PARSER DESTROYED" } sub get_data { undef } sub storage_type { undef } sub init_parser { undef } sub write_fields { my $self = shift; my @ary = @_; return \@ary; } sub read_fields { my $self = shift; my $aryref = shift; return @$aryref; } 1; AnyData-0.12/lib/AnyData/Format/CSV.pm000644 000765 000024 00000006674 12462127544 017253 0ustar00snostaff000000 000000 ######################################################### package AnyData::Format::CSV; ######################################################### # copyright (c) 2000, Jeff Zucker ######################################################### 
=head1 NAME AnyData::Format::CSV - tiedhash & DBI/SQL access to CSV data =head1 SYNOPSIS use AnyData; my $table = adTable( 'CSV', $filename,'r',$flags ); while (my $row = each %$table) { print $row->{name},"\n" if $row->{country} =~ /us|mx|ca/; } # ... other tied hash operations OR use DBI my $dbh = DBI->connect('dbi:AnyData:'); $dbh->func('table1','CSV', $filename,$flags,'ad_catalog'); my $hits = $dbh->selectall_arrayref( qq{ SELECT name FROM table1 WHERE country = 'us' }); # ... other DBI/SQL operations =head1 DESCRIPTION This is a plug-in format parser for the AnyData and DBD::AnyData modules. It will read column names from the first row of the file, or accept names passed by the user. In addition to column names, the user may set other options as follows: col_names : a comma separated list of column names eol : the end of record mark, \n by default quote_char : the character used to quote fields " by default escape_char : the character used to escape the quote char, " by default If you are using this with DBD::AnyData, put ad_ in front of the flags, e.g. ad_eol. Please refer to the documentation for AnyData.pm and DBD::AnyData.pm for further details. =head1 AUTHOR & COPYRIGHT copyright 2000, Jeff Zucker all rights reserved =cut use strict; use warnings; use AnyData::Format::Base; use vars qw( @ISA $VERSION); @AnyData::Format::CSV::ISA = qw( AnyData::Format::Base ); $VERSION = '0.12'; sub new { my $class = shift; my $self = shift || {}; my $s = ${self}->{field_rsep} || ${self}->{field_sep} || q(,); my $s1 = $s; #$s1 =~ s/\\/\\\\/ if $s1 =~ /\+$/; #$s1 =~ s/\+$//; #die $s1; ${self}->{field_sep} ||= q(,); my $q = ${self}->{quote} ||= q("); my $e = ${self}->{escape} ||= q("); ${self}->{record_sep} ||= qq(\n); $self->{regex} = [ qr/$q((?:(?:$e$q)|[^$q])*)$q$s?|([^$s1]+)$s?|$s/, "$e$q", $q ]; return bless $self, $class; } sub read_fields { my $self = shift; my $str = shift || return undef; my @fields = (); my $captured; my $field_wsep = $self->{field_wsep} || $self->{field_sep}; if ($self->{trim}) { $str =~ s/\s*($field_wsep)\s*/$1/g; } while ($str =~ m#$self->{regex}->[0]#g) { $captured = $+; $captured =~ s/$self->{regex}[1]/$self->{regex}[2]/g if $captured; last if $captured && $captured eq "\n"; push(@fields,$captured); }; push(@fields, undef) if substr($str,-1,1) eq $field_wsep; return @fields; } sub write_fields { my $self = shift; my @fields = @_; my $str = ''; my $field_rsep = $self->{field_rsep} || $self->{field_sep}; $field_rsep = quotemeta($field_rsep); my $field_wsep = $self->{field_sep}; $field_wsep =~ s/\\//g; # if ($self->{ChopBlanks}) { # $field_wsep =~ " $field_wsep "; # } for (@fields) { $_ = '' if !defined $_; if ($self->{field_sep} eq ',') { s/"/""/g; s/^(.*)$/"$1"/s if /,/ or /\n/s or /"/; } $str .= $_ . 
$field_wsep; } $str =~ s/$self->{field_sep}$/$self->{record_sep}/; return $str; } 1; AnyData-0.12/lib/AnyData/Format/FileSys.pm000644 000765 000024 00000001225 12423400733 020150 0ustar00snostaff000000 000000 ######################################################### package AnyData::Format::FileSys; ######################################################### # AnyData driver for plain text files # copyright (c) 2000, Jeff Zucker ######################################################### use strict; use warnings; use AnyData::Format::Base; use vars qw( @ISA $DEBUG ); @AnyData::Format::FileSys::ISA = qw( AnyData::Format::Base ); $DEBUG = 0; sub new { my $class = shift; my $self = shift || {}; $self->{rec_sep} ||= "\n"; $self->{keep_first_line} = 1; $self->{storage} = 'FileSys'; return bless $self, $class; } 1; AnyData-0.12/lib/AnyData/Format/Fixed.pm000644 000765 000024 00000005124 12462127544 017644 0ustar00snostaff000000 000000 ######################################################### package AnyData::Format::Fixed; ######################################################### # copyright (c) 2000, Jeff Zucker ######################################################### =head1 NAME AnyData::Format::Fixed - tiedhash & DBI/SQL access to Fixed length data =head1 SYNOPSIS use AnyData; my $table = adHash( 'Fixed', $filename,'r',{pattern=>'A20 A2'} ); while (my $row = each %$table) { print $row->{name},"\n" if $row->{country} =~ /us|mx|ca/; } # ... other tied hash operations OR use DBI my $dbh = DBI->connect('dbi:AnyData:'); $dbh->func('table1','Fixed', $filename, {pattern=>'A20 A2'},'ad_catalog'); my $hits = $dbh->selectall_arrayref( qq{ SELECT name FROM table1 WHERE country = 'us' }); # ... other DBI/SQL operations =head1 DESCRIPTION This is a parser for fixed length record files. You must specify an unpack pattern listing the widths of the fields e.g. {pattern=>'A3 A7 A20'}. You can either supply the column names or let the module get them for you from the first line of the file. In either case, they should be a comma separated string. Refer to L for the formatting of the pattern. Please refer to the documentation for AnyData.pm and DBD::AnyData.pm for further details. 
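For example, here is a minimal sketch of reading a fixed-width file with the tied hash interface (the file name and field widths are illustrative assumptions; the first line of the file is assumed to hold the comma-separated column names "name,country"):

 use AnyData;

 # each record is a 20-character name followed by a 2-character country code
 my $table = adTie( 'Fixed', 'people.fixed', 'r', { pattern => 'A20 A2' } );

 while ( my $row = each %$table ) {
     print $row->{name}, "\n" if $row->{country} eq 'us';
 }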
=head1 AUTHOR & COPYRIGHT copyright 2000, Jeff Zucker all rights reserved =cut use strict; use warnings; use AnyData::Format::Base; use vars qw( @ISA $VERSION); @AnyData::Format::Fixed::ISA = qw( AnyData::Format::Base ); $VERSION = '0.12'; sub read_fields { my $self = shift; my $str = shift; if (!$self->{pattern}) { print "NO UNPACK PATTERN SPECIFIED!"; exit; } my @fields = unpack $self->{pattern}, $str; if ($self->{trim}) { @fields = map {s/^\s+//; s/\s+$//; $_} @fields; } return @fields; } sub write_fields { my $self = shift; my @fields = @_; my $fieldNum =0; my $patternStr = $self->{pattern} || ''; $patternStr =~ s/[a-zA-Z]//gi; my @fieldLengths = split /\s+/, $patternStr; my $fieldStr = ''; for(@fields) { next unless defined $_; # PAD OR TRUNCATE DATA TO FIT WITHIN FIELD LENGTHS my $oldLen = length $_ || 0; my $newLen = $fieldLengths[$fieldNum] || 0; if ($oldLen < $newLen) { $_ = sprintf "%-${newLen}s",$_; } if ($oldLen > $newLen) { $_ = substr $_, 0, $newLen; } $fieldNum++; $fieldStr .= $_; } $fieldStr .= $self->{record_sep}; #print "<$fieldStr>"; return $fieldStr; } 1; AnyData-0.12/lib/AnyData/Format/HTMLtable.pm000644 000765 000024 00000013502 12462127544 020360 0ustar00snostaff000000 000000 ###################################################################### package AnyData::Format::HTMLtable; ###################################################################### # by Jeff Zucker # copyright 2000 all rights reserved ###################################################################### =head1 NAME HTMLtable - tied hash and DBI/SQL access to HTML tables =head1 SYNOPSIS use AnyData; my $table = adHash( 'HTMLtable', $filename ); while (my $row = each %$table) { print $row->{name},"\n" if $row->{country} =~ /us|mx|ca/; } # ... other tied hash operations OR use DBI my $dbh = DBI->connect('dbi:AnyData:'); $dbh->func('table1','HTMLtable', $filename,'ad_catalog'); my $hits = $dbh->selectall_arrayref( qq{ SELECT name FROM table1 WHERE country = 'us' }); # ... other DBI/SQL operations =head1 DESCRIPTION This module allows one to treat the data contained in an HTML table as a tied hash (using AnyData.pm) or as a DBI/SQL accessible database (using DBD::AnyData.pm). Both the tiedhash and DBI interfaces allow one to read, modify, and create HTML tables from perl data or from local or remote files. The module requires that CGI, HTML::Parser and HTML::TableExtract are installed. When reading the HTML table, this module is essentially just a pass through to Matt Sisk's excellent HTML::TableExtract module. If no flags are specified in the adTie() or ad_catalog() calls, then TableExtract is called with depth=0 and count=0, in other words it finds the first row of the first table and treats that as the column names for the entire table. If a flag for 'cols' (column names) is specified in the adTie() or ad_catalog() calls, that list of column names is passed to TableExtract as a headers parameter. If the user specifies flags for headers, depth, or count, those are passed directly to TableExtract. When exporting to an HTMLtable, you may pass flags to specify properties of the whole table (table_flags), the top row containing the column names (top_row_flags), and the data rows (data_row_flags). These flags follow the syntax of CGI.pm table constructors, e.g.: print adExport( $table, 'HTMLtable', { table_flags => {Border=>3,bgColor=>'blue'}; top_row_flags => {bgColor=>'red'}; data_row_flags => {valign='top'}; }); The table_flags will default to {Border=>1,bgColor=>'white'} if none are specified. 
The top_row_flags will default to {bgColor=>'#c0c0c0'} if none are specified; The data_row_flags will be empty if none are specified. In other words, if no flags are specified the table will print out with a border of 1, the column headings in gray, and the data rows in white. CAUTION: This module will *not* preserve anything in the html file except the selected table so if your file contains more than the selected table, you will want to use adTie() or $dbh->func(...,'ad_import') to read the table and then adExport() or $dbh->func(...,'ad_export') to write the table to a different file. When using the HTMLtable format, this is the only way to preserve changes to the data, the adTie() command will *not* write to a file. =head1 AUTHOR & COPYRIGHT copyright 2000, Jeff Zucker all rights reserved =cut use strict; use warnings; use AnyData::Format::Base; use AnyData::Storage::File; use vars qw( @ISA $VERSION); @AnyData::Format::HTMLtable::ISA = qw( AnyData::Format::Base ); $VERSION = '0.12'; sub new { my $class = shift; my $self = shift || {}; $self->{export_on_close} = 1; $self->{slurp_mode} = 1; return bless $self, $class; } sub storage_type { 'RAM'; } sub import { my $self = shift; my $data = shift; my $storage = shift; return $self->get_data($data,$self->{col_names}); } sub get_data { my $self = shift; my $str = shift or return undef; my $col_names = shift; require HTML::TableExtract; my $count = $self->{count} || 0; my $depth = $self->{depth} || 0; my $headers = $self->{headers} || $self->{col_names} || undef; my %flags; if (defined $count or defined $depth or defined $headers) { $flags{count} = $count if defined $count; $flags{depth} = $depth if defined $depth; $flags{headers} = $headers if defined $headers; } else { %flags = $col_names ? ( headers => $col_names ) : (count=>$count,depth=>$depth); } my $te = new HTML::TableExtract( %flags ); $te->parse($str); my $table; @$table = $te->rows; $self->{col_names} = shift @$table if !$col_names; return $table, $self->{col_names}; } sub export { #print "EXPORTING!"; my $self = shift; my $storage = shift; my $col_names = $storage->{col_names}; my $table = $storage->{records}; #use Data::Dumper; print Dumper $table; print "###"; exit; my $fh = $storage->{fh}; use CGI; my $table_flags = shift || {Border=>1,bgColor=>'white'}; my $top_row_flags = shift || {bgColor=>'#c0c0c0'}; my $data_row_flags = shift || {}; @$table = map { my $row = $_; @$row = map { $_ || ' ' } @$row; $row; } @$table; my $str = CGI::table( $table_flags, CGI::Tr( $top_row_flags, CGI::th($col_names) ), map CGI::Tr( $data_row_flags, CGI::td($_) ), @$table ); $fh->write($str,length $str) if $fh; return $str; } sub exportOLD { my $self = shift; my $table = shift; my $col_names = shift; use CGI; my $table_flags = shift || {Border=>1,bgColor=>'white'}; my $top_row_flags = shift || {bgColor=>'#c0c0c0'}; my $data_row_flags = shift || {}; return CGI::table( $table_flags, CGI::Tr( $top_row_flags, CGI::th($col_names) ), map CGI::Tr( $data_row_flags, CGI::td($_) ), @$table ); } 1; AnyData-0.12/lib/AnyData/Format/Ini.pm000644 000765 000024 00000003567 12462127544 017335 0ustar00snostaff000000 000000 ######################################################### package AnyData::Format::Ini; ######################################################### # copyright (c) 2000, Jeff Zucker # all rights reserved ######################################################### =head1 NAME AnyData::Format::Ini - tiedhash & DBI/SQL access to ini files =head1 SYNOPSIS use AnyData; my $table = adHash( 'Ini', 
$filename,'r',$flags ); while (my $row = each %$table) { print $row->{name},"\n" if $row->{country} =~ /us|mx|ca/; } # ... other tied hash operations OR use DBI my $dbh = DBI->connect('dbi:AnyData:'); $dbh->func('table1','Init', $filename,$flags,'ad_catalog'); my $hits = $dbh->selectall_arrayref( qq{ SELECT name FROM table1 WHERE country = 'us' }); # ... other DBI/SQL operations =head1 DESCRIPTION This is a parser for simple name=value style Ini files. Soon it will also handle files with sections. Please refer to the documentation for AnyData.pm and DBD::AnyData.pm for further details. =head1 AUTHOR & COPYRIGHT copyright 2000, Jeff Zucker all rights reserved =cut use AnyData::Format::CSV; use strict; use warnings; use vars qw/@ISA $VERSION/; @ISA = qw(AnyData::Format::CSV); $VERSION = '0.12'; sub new { my $class = shift; my $flags = shift || {}; $flags->{field_sep} ||= '='; my $self = AnyData::Format::CSV::->new( $flags ); return bless $self, $class; } sub write_fields { my($self,$key,$value) = @_; return undef unless $key; $value ||= ''; return "$key = $value" . $self->{record_sep}; } sub read_fields { my $self = shift; my $str = shift || return undef; $str =~ s/^\s+//; $str =~ s/\s+$//; return undef unless $str; my @fields = $str =~ /^([^=]*?)\s*=\s*(.*)/; die "Couldn't parse line '$str'\n" unless defined $fields[0]; return( @fields ); } AnyData-0.12/lib/AnyData/Format/Mp3.pm000644 000765 000024 00000012752 12462127544 017251 0ustar00snostaff000000 000000 ###################################################################### package AnyData::Format::Mp3; ###################################################################### # # copyright 2000 by Jeff Zucker # all rights reserved # ###################################################################### =head1 NAME AnyData::Format::Mp3 - tied hash and DBI access to Mp3 files =head1 SYNOPSIS use AnyData; my $playlist = adTie( 'Passwd', ['c:/My Music/'] ); while (my $song = each %$playlist){ print $song->{artist} if $song->{genre} eq 'Reggae' } OR use DBI my $dbh = DBI->connect('dbi:AnyData:'); $dbh->func('playlist','Mp3,['c:/My Music'],'ad_catalog'); my $playlist = $dbh->selectall_arrayref( qq{ SELECT artist, title FROM playlist WHERE genre = 'Reggae' }); # ... other DBI/SQL operations =head1 DESCRIPTION This module provides a tied hash interface and a DBI/SQL interface to MP files. It creates an in-memory database or hash from the Mp3 files themselves without actually creating a separate database file. This means that the database is automatically updated just by moving files in or out of the directories. Many mp3 (mpeg three) music files contain a header describing the song name, artist, and other information about the music. Simply choose 'Mp3' as the format and give a reference to an array of directories containing mp3 files. Each file in those directories will become a record containing the fields: song artist album year genre filename filesize This module is a submodule of the AnyData.pm and DBD::AnyData.pm modules. Refer to their documentation for further details. 
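For example, here is a minimal sketch that lists each artist found in a directory of mp3 files exactly once (the directory path is an illustrative assumption):

 use AnyData;

 my $playlist = adTie( 'Mp3', ['./music'] );

 # adColumn( $table, $column_name, $distinct_flag ) returns the values
 # of a single column; a true third argument returns each value only once
 for my $artist ( adColumn( $playlist, 'artist', 1 ) ) {
     print "$artist\n";
 }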
=head1 AUTHOR & COPYRIGHT copyright 2000, Jeff Zucker all rights reserved =cut use strict; use warnings; use AnyData::Format::Base; use AnyData::Storage::FileSys; use AnyData::Storage::File; use vars qw( @ISA $VERSION); @AnyData::Format::Mp3::ISA = qw( AnyData::Format::Base ); $VERSION = '0.12'; sub new { my $class = shift; my $self = shift || {}; #use Data::Dumper; die Dumper $self; my $dirs = $self->{dirs} || $self->{file_name} || $self->{recs}; $self->{col_names} = 'song,artist,album,year,genre,filename,filesize'; $self->{recs} = $self->{records} = get_data( $dirs ); return bless $self, $class; } sub storage_type { 'RAM'; } sub read_fields { my $self = shift; my $thing = shift; return @$thing if ref $thing eq 'ARRAY'; return split ',', $thing; } sub write_fields { die "WRITING NOT IMPLEMENTED FOR FORMAT Mp3"; } sub get_data { my $dirs = shift; my $table = []; my @files = AnyData::Storage::FileSys::get_filename_parts( {}, part => 'ext', re => 'mp3', dirs => $dirs ); for my $file_info(@files) { my $file = $file_info->[0]; my $cols = get_mp3_tag($file) || next; my $filesize = -s $file; $filesize = sprintf "%1.fmb", $filesize/1000000; my @row = (@$cols,$file,$filesize); push @$table, \@row; # 'file_name,path,ext,fullpath,size,' # 'name,artist,album,year,comment,genre'; } return $table; } sub get_mp3_tag { my($file) = shift; my $adf = AnyData::Storage::File->new; my(undef,$fh,undef) = $adf->open_local_file($file,'r'); local $/ = ''; $fh->seek(-128,2); my $str = <$fh> || ''; $fh->close; return undef if !($str =~ /^TAG/); #$file = sprintf("%-255s",$file); #$str =~ s/^TAG(.*)/$file$1/; $str =~ s/^TAG(.*)/$1/; my $genre = $str; $genre =~ s/^.*(.)$/$1/g; $str =~ s/(.)$//g; $genre = unpack( 'C', $genre ); my @genres =("Blues", "Classic Rock", "Country", "Dance", "Disco", "Funk", "Grunge", "Hip-Hop", "Jazz", "Metal", "New Age", "Oldies", "Other", "Pop", "R&B", "Rap", "Reggae", "Rock", "Techno", "Industrial", "Alternative", "Ska", "Death Metal", "Pranks", "Soundtrack", "Eurotechno", "Ambient", "Trip-Hop", "Vocal", "Jazz+Funk", "Fusion", "Trance", "Classical", "Instrumental", "Acid", "House", "Game", "Sound Clip", "Gospel", "Noise", "Alternative Rock", "Bass", "Soul", "Punk", "Space", "Meditative", "Instrumental Pop", "Instrumental Rock", "Ethnic", "Gothic", "Darkwave", "Techno-Industrial", "Electronic", "Pop-Folk", "Eurodance", "Dream", "Southern Rock", "Comedy", "Cult", "Gangsta", "Top 40", "Christian Rap", "Pop/Funk", "Jungle", "Native American", "Cabaret", "New Wave", "Psychadelic", "Rave", "Show Tunes", "Trailer", "Lo-Fi", "Tribal", "Acid Punk", "Acid Jazz", "Polka", "Retro", "Musical", "Rock & Roll", "Hard Rock", "Folk", "Folk/Rock", "National Folk", "Swing", "Fast-Fusion", "Bebop", "Latin", "Revival", "Celtic", "Bluegrass", "Avantgarde", "Gothic Rock", "Progressive Rock", "Psychedelic Rock", "Symphonic Rock", "Slow Rock", "Big Band", "Chorus", "Easy Listening", "Acoustic", "Humour", "Speech", "Chanson", "Opera", "Chamber Music", "Sonata", "Symphony", "Booty Bass", "Primus", "Porn Groove", "Satire", "Slow Jam", "Club", "Tango", "Samba", "Folklore", "Ballad", "Power Ballad", "Rhytmic Soul", "Freestyle", "Duet", "Punk Rock", "Drum Solo", "Acapella", "Euro-House", "Dance Hall", "Goa", "Drum & Bass", "Club-House", "Hardcore", "Terror", "Indie", "BritPop", "Negerpunk", "Polsk Punk", "Beat", "Christian Gangsta Rap", "Heavy Metal", "Black Metal", "Crossover", "Contemporary Christian", "Christian Rock", "Unknown"); $genre = $genres[$genre] || ''; my @cols = unpack 'A30 A30 A30 A4 A30', $str; my 
$comment = pop @cols; #print $comment; @cols = map{$_ || ''} @cols; push @cols, $genre; return \@cols; } 1; AnyData-0.12/lib/AnyData/Format/Paragraph.pm000644 000765 000024 00000004245 12462127544 020515 0ustar00snostaff000000 000000 ######################################################### package AnyData::Format::Paragraph; ######################################################### # copyright (c) 2000, Jeff Zucker ######################################################### =head1 NAME AnyData::Format::Paragraph - tiedhash & DBI/SQL access to vertical files =head1 SYNOPSIS use AnyData; my $table = adHash( 'Paragraph', $filename,'r',$flags ); while (my $row = each %$table) { print $row->{name},"\n" if $row->{country} =~ /us|mx|ca/; } # ... other tied hash operations OR use DBI my $dbh = DBI->connect('dbi:AnyData:'); $dbh->func('table1','Paragraph', $filename,$flags,'ad_catalog'); my $hits = $dbh->selectall_arrayref( qq{ SELECT name FROM table1 WHERE country = 'us' }); # ... other DBI/SQL operations =head1 DESCRIPTION This is a plug-in format parser for the AnyData and DBD::AnyData modules. It handles "vertical" files in which the record name occurs on a line by itself followed by records on lines by themselves, e.g. Photoslop /My Photos/ .jpg, .gif, .psd Nutscrape /htdocs/ .html, .htm Please refer to the documentation for AnyData.pm and DBD::AnyData.pm for further details. =head1 AUTHOR & COPYRIGHT copyright 2000, Jeff Zucker all rights reserved =cut use strict; use warnings; use AnyData; use AnyData::Format::CSV; use vars qw/@ISA $VERSION/; @ISA = qw(AnyData::Format::CSV); $VERSION = '0.12'; sub new { my $class = shift; my $flags = shift || {}; my $f = $flags->{record_sep} || ''; #print "<$f>"; $flags->{field_sep} = "\n"; $flags->{record_sep} = "\n\n"; #print "[",$flags->{record_sep},"]"; my $self = AnyData::Format::CSV::->new( $flags ); return bless $self, $class; } sub write_fields { my($self,@fields) = @_; @fields = map {$_ || ''} @fields; return join("\n",@fields) . $self->{record_sep}; } sub read_fields { my $self = shift; my $str = shift || return undef; return undef unless $str; my @fields = split /\n/, $str; @fields = map{s/\s+$//; $_}@fields; die "Couldn't parse line '$str'\n" unless defined $fields[0]; return( @fields ); } AnyData-0.12/lib/AnyData/Format/Passwd.pm000644 000765 000024 00000003513 12462127544 020046 0ustar00snostaff000000 000000 ###################################################################### package AnyData::Format::Passwd; ###################################################################### # # copyright 2000 by Jeff Zucker # all rights reserved # ###################################################################### =head1 NAME Passwd - tied hash and DBI access to passwd files =head1 SYNOPSIS use AnyData; my $users = adTie( 'Passwd', '/etc/passwd' ); print $users->{jdoe}->{homedir}; # ... other tied hash operations OR use DBI my $dbh = DBI->connect('dbi:AnyData:'); $dbh->func('users','Passwd','/etc/passwd','ad_catalog'); my $g7 = $dbh->selectall_arrayref( qq{ SELECT username, homedir FROM users WHERE GID = '7' }); # ... other DBI/SQL operations =head1 DESCRIPTION This module provides a tied hash interface and a DBI/SQL interface to passwd files. Simply specify the format as 'Passwd' and give the name of the file and the modules will build a hash table with the column names username passwd UID GID fullname homedir shell The username field is treated as a key column. This module is a submodule of the AnyData.pm and DBD::AnyData.pm modules. 
Refer to their documentation for further details. =head1 AUTHOR & COPYRIGHT copyright 2000, Jeff Zucker all rights reserved =cut use strict; use warnings; use AnyData::Format::CSV; use vars qw( @ISA $VERSION); @AnyData::Format::Passwd::ISA = qw( AnyData::Format::CSV ); $VERSION = '0.12'; sub new { my $class = shift; my $flags = shift || {}; $flags->{field_sep} = q(:); $flags->{col_names} = 'username,passwd,UID,GID,fullname,homedir,shell'; $flags->{key} = 'username'; $flags->{keep_first_line} = 1; my $self = AnyData::Format::CSV::->new( $flags ); return bless $self, $class; } 1; AnyData-0.12/lib/AnyData/Format/Pipe.pm000644 000765 000024 00000003547 12462127544 017511 0ustar00snostaff000000 000000 ######################################################### package AnyData::Format::Pipe; ######################################################### # copyright (c) 2000, Jeff Zucker # all rights reserved ######################################################### =head1 NAME AnyData::Format::Pipe - tiedhash & DBI/SQL access to Pipe delimited files =head1 SYNOPSIS use AnyData; my $table = adHash( 'Pipe', $filename,'r',$flags ); while (my $row = each %$table) { print $row->{name},"\n" if $row->{country} =~ /us|mx|ca/; } # ... other tied hash operations OR use DBI my $dbh = DBI->connect('dbi:AnyData:'); $dbh->func('table1','Pipe', $filename,$flags,'ad_catalog'); my $hits = $dbh->selectall_arrayref( qq{ SELECT name FROM table1 WHERE country = 'us' }); # ... other DBI/SQL operations =head1 DESCRIPTION This is a plug-in format parser for the AnyData and DBD::AnyData modules. It will read column names from the first row of the file, or accept names passed by the user. In addition to column names, the user may set other options as follows: col_names : a pipe separated list of column names If you are using this with DBD::AnyData, put ad_ in front of the flags, e.g. ad_eol. Please refer to the documentation for AnyData.pm and DBD::AnyData.pm for further details. =head1 AUTHOR & COPYRIGHT copyright 2000, Jeff Zucker all rights reserved =cut use strict; use warnings; use AnyData::Format::CSV; use vars qw( @ISA $VERSION); @AnyData::Format::Pipe::ISA = qw( AnyData::Format::CSV ); $VERSION = '0.12'; sub new { my $class = shift; my $flags = shift || {}; $flags->{field_sep} ||= q(\|); # $flags->{field_sep} ||= q(\s*\|\s*); # my $self = AnyData::Format::CSV::->new({ my $self = new AnyData::Format::CSV({ %$flags }); return bless $self, $class; } 1; __END__ AnyData-0.12/lib/AnyData/Format/Tab.pm000644 000765 000024 00000003677 12462127544 017326 0ustar00snostaff000000 000000 ######################################################### package AnyData::Format::Tab; ######################################################### # copyright (c) 2000, Jeff Zucker # all rights reserved ######################################################### =head1 NAME AnyData::Format::Tab - tiedhash & DBI/SQL access to Tab delimited files =head1 SYNOPSIS use AnyData; my $table = adHash( 'Tab', $filename,'r',$flags ); while (my $row = each %$table) { print $row->{name},"\n" if $row->{country} =~ /us|mx|ca/; } # ... other tied hash operations OR use DBI my $dbh = DBI->connect('dbi:AnyData:'); $dbh->func('table1','Tab', $filename,$flags,'ad_catalog'); my $hits = $dbh->selectall_arrayref( qq{ SELECT name FROM table1 WHERE country = 'us' }); # ... other DBI/SQL operations =head1 DESCRIPTION This is a plug-in format parser for the AnyData and DBD::AnyData modules. 
It will read column names from the first row of the file, or accept names passed by the user. In addition to column names, the user may set other options as follows: col_names : a tab separated list of column names eol : the end of record mark, \n by default quote_char : the character used to quote fields " by default escape_char : the character used to escape the quote char, " by default If you are using this with DBD::AnyData, put ad_ in front of the flags, e.g. ad_eol. Please refer to the documentation for AnyData.pm and DBD::AnyData.pm for further details. =head1 AUTHOR & COPYRIGHT copyright 2000, Jeff Zucker all rights reserved =cut use strict; use warnings; use AnyData::Format::CSV; use vars qw( @ISA $VERSION ); @AnyData::Format::Tab::ISA = qw( AnyData::Format::CSV ); $VERSION = '0.12'; sub new { my $class = shift; my $flags = shift || {}; $flags->{field_sep} ||= qq(\t); my $self = AnyData::Format::CSV::->new( $flags ); return bless $self, $class; } 1; AnyData-0.12/lib/AnyData/Format/Text.pm000644 000765 000024 00000003754 12423400733 017527 0ustar00snostaff000000 000000 ######################################################### package AnyData::Format::Text; ######################################################### # AnyData driver for plain text files # copyright (c) 2000, Jeff Zucker ######################################################### use strict; use warnings; use AnyData::Format::Base; use AnyData::Storage::FileSys; use vars qw( @ISA $DEBUG ); @AnyData::Format::Text::ISA = qw( AnyData::Format::Base ); $DEBUG = 0; sub new { my $class = shift; my $self = shift || {}; #use Data::Dumper; die Dumper $self; $self->{rec_sep} ||= "\n"; if ($self->{dirs}) { $self->{storage} = 'FileSys'; $self->{col_names} = 'fullpath,path,name,ext,size,content'; $self->{records} = get_data( {},$self->{dirs} ); } else { $self->{col_names} = 'text'; $self->{key} = 'text'; } $self->{keep_first_line} = 1; return bless $self, $class; } sub write_fields { my $self = shift; return $self->{dirs} ? pop @_ : join '', @_; } sub read_fields { my $self = shift; my $str = shift || return undef; if (!$self->{dirs}) { my @row = ($str); return @row } } sub get_data { my $self = shift; my $dirs = shift; # my @col_names = @{ $self->{col_names} }; my $table = []; my @files = AnyData::Storage::FileSys::get_filename_parts( dirs => $dirs ); for my $file_info(@files) { my $file = $file_info->[0]; # my $cols = get_mp3_tag($file) || next; #my $filesize = -s $file; #my @row = (@$file_info,$filesize); my @row = ( $file_info->[0], $file_info->[2], $file_info->[1], $file_info->[3], ); push @$table, \@row; # 'fullpath,path,name,ext,size,content'; # 'fullpath,file_name,path,ext,size,' # 'name,artist,album,year,comment,genre'; } return $table; } 1; AnyData-0.12/lib/AnyData/Format/Weblog.pm000644 000765 000024 00000005336 12462127544 020031 0ustar00snostaff000000 000000 ######################################################### package AnyData::Format::Weblog; ######################################################### # AnyData driver for "Common Log Format" web log files # copyright (c) 2000, Jeff Zucker ######################################################### =head1 NAME AnyData::Format::Weblog - tiedhash & DBI/SQL access to HTTPD Logs =head1 SYNOPSIS use AnyData; my $weblog = adTie( 'Weblog', $filename ); while (my $hit = each %$weblog) { print $hit->{remotehost},"\n" if $hit->{request} =~ /mypage.html/; } # ... 
other tied hash operations OR use DBI my $dbh = DBI->connect('dbi:AnyData:'); $dbh->func('hits','Weblog','access_log','ad_catalog'); my $hits = $dbh->selectall_arrayref( qq{ SELECT remotehost FROM hits WHERE request LIKE '%mypage.html%' }); # ... other DBI/SQL read operations =head1 DESCRIPTION This is a plug-in format parser for the AnyData and DBD::AnyData modules. You can gain read access to Common Log Format files web server log files (e.g. NCSA or Apache) either through tied hashes or arrays or through SQL database queries. Fieldnames are taken from the W3 definitions found at http://www.w3.org/Daemon/User/Config/Logging.html#common-logfile-format remotehost usernname authuser date request status bytes In addition, two extra fields that may be present in extended format logfiles are: referer client This module does not currently support writing to weblog files. Please refer to the documentation for AnyData.pm and DBD::AnyData.pm for further details. =head1 AUTHOR & COPYRIGHT copyright 2000, Jeff Zucker all rights reserved =cut use strict; use warnings; use AnyData::Format::Base; use vars qw( @ISA $DEBUG $VERSION); @AnyData::Format::Weblog::ISA = qw( AnyData::Format::Base ); $DEBUG = 0; $VERSION = '0.12'; sub new { my $class = shift; my $self = shift || {}; $self->{col_names} = 'remotehost,username,authuser,date,request,status,bytes,referer,client'; $self->{record_sep} = "\n"; $self->{key} = 'datestamp'; $self->{keep_first_line} = 1; return bless $self, $class; } sub read_fields { print "PARSE RECORD\n" if $DEBUG; my $self = shift; my $str = shift || return undef; $str =~ s/^\s+//; $str =~ s/\s+$//; return undef unless $str; my (@row) = $str =~ /^(\S*) (\S*) (\S*) \[([^\]]*)\] "(.*?)" (\S*) (\S*)\s*(.*)$/; return undef unless defined $row[0]; my ( $referer, $client ) = $row[7] =~ /^(.*?)\s(.*)$/; $client ||= ''; $referer ||= ''; ( $row[7], $row[8] ) = ( $referer, $client ); # $row[3] =~ s/\s*-\s*(\S*)$//; # hide GMT offset on datestamp return @row; } 1; AnyData-0.12/lib/AnyData/Format/XML.pm000644 000765 000024 00000106476 12462127544 017261 0ustar00snostaff000000 000000 ################################################################## package AnyData::Format::XML; ################################################################## # an AnyData format parser for XML # by Jeff Zucker ################################################################## use strict; use warnings; use AnyData::Format::Base; use AnyData::Storage::RAM; use XML::Twig; use vars qw( @ISA $DEBUG $VERSION); @AnyData::Format::XML::ISA = qw( AnyData::Format::Base ); $VERSION = '0.12'; sub seek { 1 } sub get_pos { 1 } sub go_pos { 1 } sub new { my $class = shift; my $self = shift || {}; $self->{export_on_close} = 1; $self->{slurp_mode} = 1; if ($self->{col_names}) { ## something goes here :-) } return bless $self, $class; } sub storage_type { 'PassThru'; } sub truncate { my $self = shift; my $data = shift; # from SQL::Statement, ignored for my $e( $self->{twig}->root->descendants ) { next unless $e->gi eq 'delete__'; $e->delete; } undef $self->{last_before_delete}; } sub push_row { my $self = shift; my @fields = @_; my @ch = caller 3; # tied-hash my @cd = caller 4; # DBD my $hash_caller = $ch[3] || ''; my $dbd_caller = $cd[3] || ''; my @f = caller 4; if ($dbd_caller =~ /SQL/ ) { # DELETE | UPDATE | INSERT if ( !$self->{last_before_delete} && $dbd_caller !~ /INSERT/ ) { $self->{last_before_delete} = 1; my @children = $self->{twig}->root->descendants; for my $e(@children) { next unless $e->path eq 
$self->{record_tag}->path; next if $e->cmp($self->{record_tag}) == 0; $e->set_gi("delete__") } } #$self->{twig}->print; exit; return $self->insert_record(\@fields); } $self->insert_record(\@fields); } sub insert_record { my $self = shift; my $row = shift; #print "@$row\n"; my $rect = $self->{record_tag}; my $col_structure = $self->{col_structure}; my @tags = @{$col_structure->{col_names}}; my @cols = @{$col_structure->{pretty_cols}}; my $is_atr = $col_structure->{amap}; my $p = $rect->path; my $has_parent_atr; for my $atr(keys %$is_atr) { $has_parent_atr++ unless $atr =~ /^$p/; } my $col2tag = $col_structure->{col2tag} || {}; my $elt= new XML::Twig::Elt($rect->gi); # CREATE ELEMENT my $par; my $par_name = $rect->parent->gi; if ($has_parent_atr) { $par = new XML::Twig::Elt($rect->parent->gi); # CREATE PARENT } my $rowhash; @{$rowhash}{@cols} = @$row; for my $i(0..$#cols) { my $tag = $tags[$i]; my $col = $cols[$i]; my $value = $rowhash->{$col}; $tag ||= $col2tag->{$col}; my($path,$name) = $tag =~ m!^(.*)/([^/]*)$!; if ($is_atr->{$tag} && defined $value) { if ($tag =~ /^$p/) { $elt->set_att($name,$value); # ADD ELT ATTRIBUTE } else { $par->set_att($name,$value); # ADD PARENT ATTRIBUTE } } elsif (defined $name && defined $value) { my $kid= new XML::Twig::Elt($name); # CREATE CHILD $kid->set_text($value); # ADD TEXT TO CHILD $kid->paste('last_child',$elt); # PASTE CHILD INTO ELEMENT } } if ($has_parent_atr) { $elt->paste('last_child',$par); # PASTE ELT INTO PARENT my $last = $rect->parent->parent->last_child($par_name); $par->paste('after',$last); # PASTE PARENT INTO TREE } else { my $last = $rect->parent->last_child($rect->gi); $elt->paste('after',$last); # PASTE ELEMENT INTO TREE } #$self->{twig}->print; } sub delete_record { #my @calls = caller 3; #my $call = $calls[3] || ''; #return if $call =~ /UPDATE/i; #print "$call\n"; my $self = shift; my $elt = $self->{prev_element}; my $rec = $self->{record_tag}; my $p = $rec->path; my $new = $elt->prev_elt($rec->gi); $elt = $new if $elt->path !~ /^$p/; $self->{skip} = 1 if !$elt; return undef unless $elt; $elt->delete; } sub DESTROY { return; print "XML DESTROYED"; my $self = shift; if ( $self->{storage}->{fh} && $self->{storage}->{open_mode} ne 'r' ){ $self->export( $self->{storage} ); } #undef $self->{twig}; #undef $self->{storage}->{fh}; } sub read_fields { my $self = shift; my $c = $self->{current_element}; return undef unless defined $c; $c = $self->{current_element} = $c->next_elt($c->gi) if $c->att('record_tag__'); $self->{prev_element} = $self->{current_element}; $self->{current_element} = $c->next_elt($c->gi) if $c; return $self->process_element( $self->{prev_element} ); } sub process_element { my $self = shift; my $element = shift; my @col_names = @{ $self->{col_structure}->{col_names} }; my @row; my $parent = $element->parent; my $values = { $element->path => $element->text }; my $par_ats = {}; $par_ats = $parent->atts if $parent; my $elt_ats = $element->atts || {}; while( my($att_key,$att_val) = each %$par_ats) { $values->{$parent->path.'/'.$att_key} = $att_val; } while( my($att_key,$att_val) = each %$elt_ats) { $values->{$element->path.'/'.$att_key} = $att_val; } for my $kid($element->children) { if ( defined $values->{$kid->path} ) { if (!ref $values->{$kid->path}) { $values->{$kid->path} = [ $values->{$kid->path} ] ; } push ( @{ $values->{$kid->path} }, $kid->text ); } else { $values->{$kid->path} = $kid->text; } } for my $col(@col_names) { if (ref $values->{$col}) { @row = (@row,@{$values->{$col}}); } else { push @row, 
$values->{$col}; } } # use Data::Dumper; print Dumper $values, Dumper \@row; exit; return @row; } sub seek_first_record { my $self = shift; return unless $self->{twig} and $self->{twig}->root; $self->{current_element} = $self->{record_tag}; } sub push_names { my $self = shift; my $col_names = shift || $self->{col_names}; #my @c= caller 1; die $c[3]."!!!"; my $str = "\n \n"; #print "CREATING"; for (@$col_names) { $str .= " <$_>dummy__\n"; } $str .= " \n
\n"; $str = $self->{template} if $self->{template}; if ( $self->{dtd} ) { $str = $self->{dtd} if $self->{dtd}; my $root = $str; $root =~ s/.*"; #die $str; } $self->get_data( $str ); return $self->{col_names}; } sub import { my $self = shift; my $data = shift; my $storage = shift; $self->init_parser($storage,$data); return $self->get_data($data,$storage->{col_names}); } #### # GET DATA FROM STRING ### sub init_parser { my $self = shift; my $storage = shift; my $fh_or_str = shift; return if 'co' =~ /$storage->{open_mode}/; #print " INIT ..."; #print "HAS RECS\n" if $storage->{recs}; #print "HAS DATA\n" if ref $storage->{file} eq 'HASH'; $fh_or_str ||= $storage->{fh} if $storage->{fh}; $fh_or_str ||= $storage->{file}->{data} if ref $storage->{file} eq 'HASH'; $fh_or_str ||= $storage->{recs}; # $fh_or_str ||= join('',@$fh_or_str) if ref $fh_or_str eq 'ARRAY'; #print $fh_or_str; exit; ###z $self->create_new_twig( $self->{col_names} ); my $rv = $self->get_data( $fh_or_str,$self->{col_names} ); return undef unless $rv; $self->{current_element} = $self->{twig}->root; $storage->{col_names} = $self->{col_names}; return 1; } sub create_new_twig { my $self = shift; my $flags = $self; my $root_tag = $flags->{root_tag}; my $depth_limit = $flags->{depth_limit}; # $flags->{twig_flags}->{TwigRoots} = {$root_tag=>'1'} if $root_tag; # $flags->{twig_flags}->{KeepEncoding} ||= 1; # $flags->{twig_flags}->{ProtocolEncoding} ||= 'ISO-8859-1'; $flags = $self->check_twig_options($flags); $self->{twig}= new XML::Twig(%{$flags}); #$self->{twig}= new XML::Twig(%{$flags->{twig_flags}}); } sub read_dtd { my $self = shift; my $twig = shift; #print Dumper $self->{dtd}; exit; my $record_tag = $self->{record_tag}; my $col_names = $self->{col_names}; $col_names = $self->{dtd}->{elt_list}; my $newc; my $colh; #print Dumper $self->{dtd}; exit; my $col_text; for my $col(@$col_names) { while (my($k,$v) = each %{ $self->{dtd}->{model} } ) { if ($v =~ /(#P*CDATA)/ ) { $col_text->{"$k$1"}++; } if ($v =~ /[(\s,]+$col[)\s,]+/ ) { my @path = ($k,$col); push @$newc, \@path; $colh->{$col} = $k; } } } $col_names = []; my $done; my $nh; while (!$done) { $done = 1; my $i; for $i(0..scalar @$newc -1) { my $cur = $newc->[$i]->[0]; # $cur =~ s"^.*/([^/]+)$"$1"; #print "$cur : "; my $path = $colh->{ $cur }; my $p; if ($path) { $p = $newc->[$i]->[0] = $path . '/' . $newc->[$i]->[0]; # delete $colh->{$cur}; $nh->{$cur} = $p; $done=0; } while (my($k,$v)=each %$nh) { if ( $cur =~ m"^$k/(.*)") { $newc->[$i]->[0] = $v . '/' . $1; $done = 0; } } } } #@array = grep(!$is_member{$_}++, @array); my %is_member; for my $row (@$newc) { my $c = '/' . $row->[0] . '/' . $row->[1]; push @$col_names, $c if !$is_member{$c}; $is_member{$c}++; } # put in order by depth @$col_names = sort { my $x=$a; my $y=$b; $x =~ s"[^/]""g;; $x=length $x; $y =~ s"[^/]""g;; $y=length $y; $x <=> $y; } @$col_names; $record_tag ||= $col_names->[0]; $record_tag =~ s".*/([^/]+)$"$1"; # $record_tag = $twig->first_elt($record_tag) # || die "Can't find column '$record_tag'!". 
$@; #print $record_tag, Dumper $col_names; exit; my %done; for my $c(@$col_names) { my @tags = split '/', $c; shift @tags; # remove root for my $i(0..$#tags) { my $t = $tags[$i]; next if $done{$c.$t}; next unless $c =~ m"/$t$"; # print "$c:$t\n"; # next if $done{$t}; $done{$c.$t}++ ; my $nxt = $twig->root->next_elt($t); next if $nxt and $nxt->path =~ /^$c/; next if $t eq $twig->root->gi; my $p= $tags[$i-1]; my $pos = $twig->root->next_elt($p); $pos ||= $twig->root; my $e= new XML::Twig::Elt($t); #if ($col_text->{$e->gi.'#PCDATA'}) { # $e->append_pcdata("xxx"); #} $e->paste('last_child',$pos); # if ($col_text->{$e->gi.'#PCDATA'} ) { # print $e->gi.'#PCDATA'."\n"; # $twig->root->next_elt($e->gi)->append_pcdata('x'); # } } } my $atts; while (my($k,$v)=each%{$self->{dtd}->{att}}) { my $cur = $twig->root->next_elt($k); next unless $cur; while (my($k2,$v2)=each%{$v}) { ### $cur->set_att($k2,""); $atts->{$cur->path.'/'.$k2}=$k2; #print "[".$cur->path.'/'.$k2."]"; } } $record_tag ||= $twig->root->first_child->gi; $record_tag = $twig->root->next_elt($record_tag) || die "Couldn't find column '$record_tag'!"; $newc = []; my $found; for my $org(@$col_names) { my $x = $org; $x =~ s".*/([^/]+)$"$1"; my $p =$record_tag->parent->path; next unless $org =~ /^$p/; #next unless $p =~ /^$_/; while (my($k,$v)=each%$atts) { next if $found->{$k}; if ( $k =~ m"$p/([^/]+)$" or $k =~ m"$p/([^/]+/[^/]+)$" ) { #print "$k\n"; push @$newc, $k; $found->{$k}++; } } push @$newc, $org if $col_text->{$x.'#PCDATA'} or $col_text->{$x.'#CDATA'}; } #unshift @$newc, $record_tag->gi unless $found; #die Dumper $newc; #$twig->print; exit; my $elt = $twig->root; if (!$self->{recs}) { while ( $elt = $elt->next_elt ) { $elt->set_att('xstruct__','1'); } $record_tag->set_att('record_tag__','1'); } ######### # COMMENT THIS TO SEE STRUCTURE TAGS # $self->{destroy}++; #print Dumper $record_tag->gi,$newc, $atts; return( $record_tag,$newc, $atts); #$twig->print; exit; } sub get_structure { my $self = shift; my $twig = shift; my $record_tag = $self->{record_tag}; # $record_tag ||= $self->{table_name}; my $col_names = $self->{col_names}; if ($self->{dtd}) { return $self->read_dtd($twig) } $record_tag = $twig->first_elt($record_tag) if $record_tag; $record_tag ||= $twig->root->first_child; #print $record_tag->gi; # if (!$record_tag) { # $record_tag = $twig->root->first_child; # if ( $record_tag # and $record_tag->first_child # and !$record_tag->contains_text # and !$record_tag->first_child->contains_text # ) { # $record_tag = $record_tag->first_child; # } # } $self->{record_tag} = $record_tag; if ($self->{create}) { my $elt = $twig->root; while ( $elt = $elt->next_elt ) { $elt->set_att('xstruct__','1'); } ###z# print "1"; $record_tag->set_att('record_tag__','1'); } if ($col_names) { @$col_names = map { my $o = $_; if ($o !~ m"/") { $o = $twig->first_elt($o)->path; } $o; } @$col_names; } else { @$col_names = map {$_->path} $record_tag->descendants; #die join "\n",@$col_names; my $newcolz = []; my %hashz; for (@$col_names) { next unless m"/#PCDATA|/#CDATA"; next if $hashz{$_}; push @$newcolz, $_; $hashz{$_}++; } $col_names = $newcolz; } # my $oldcols = $col_names; # $col_names = []; #print join "\n", @$col_names; exit; my $atts; my @atts_to_check = ($record_tag,$record_tag->descendants); #print $record_tag->gi,"\n"; if ($record_tag->parent and $record_tag->parent->parent) { unshift @atts_to_check, $record_tag->parent; } my $has_record_tag = 1 if $record_tag->att('record_tag__'); #print $has_record_tag ? 
'HAS' : 'NONE'; #$twig->print; #print $record_tag->path,"!\n"; # for (keys %{$record_tag->atts}) { print "$_#"; } my @att_col; for my $t(@atts_to_check) { my $ats = $t->atts; next unless $ats; delete $ats->{record_tag__}; # push @$col_names, $t->path if $t->is_text; # print $t->path . '/' . $_ for keys %$ats; # unshift @$col_names, $t->path . '/' . $_ for keys %$ats; push @att_col, $t->path . '/' . $_ for keys %$ats; #unshift @$atts, $ats->{$_} for keys %$ats; $atts->{$t->path . '/'. $_} = $_ for keys %$ats; } @$col_names = (@att_col,@$col_names); @$col_names = map {s"/#P*CDATA""; $_} @$col_names; #print join "\n",@$col_names; ###z# print 2; # $record_tag->set_att('record_tag__','true'); $record_tag->set_att('record_tag__','true') if $record_tag->text =~/dummy__/; # $record_tag->set_att('xstruct__','true') if $record_tag->text =~/dummy__/; # #if ($has_record_tag) { $twig->print; exit; } return($record_tag,$col_names,$atts) if $has_record_tag; my $cols; @$cols = map {$_} @$col_names; my $elt= new XML::Twig::Elt($record_tag->gi); for my $a(keys %$atts ) { $a =~ s".*/([^/]+)$"$1"; next if $a eq 'record_tag__'; next unless $record_tag->att($a); $elt->set_att($a,''); } for my $c(@$cols ) { next if $atts->{$c}; next if $c =~ m"/#PCDATA"; $c =~ s"/#PCDATA""; $c =~ s".*/([^/]+)$"$1"; my $e= new XML::Twig::Elt($c); $e->paste('last_child',$elt); } my $par; if ($record_tag->parent and $record_tag->parent->parent) { $par= new XML::Twig::Elt($record_tag->parent->gi); for my $a(keys %$atts ) { $a =~ s".*/([^/]+)$"$1"; next unless $record_tag->parent->att($a); $par->set_att($a,''); } ###z# print 3; $elt->set_att('record_tag__','true'); $elt->paste('first_child',$par); $par->paste('before',$record_tag->parent); # $record_tag = $self->{record_tag} = $record_tag->parent->prev_sibling->first_child; $record_tag ||= $self->{record_tag} = $record_tag->parent->prev_sibling->first_child; } else { for my $a(keys %$atts ) { $a =~ s".*/([^/]+)$"$1"; next unless $record_tag->att($a); $elt->set_att($a,''); } ###z# print 4; $record_tag ||= $self->{record_tag} = $record_tag->prev_sibling; # $record_tag = $self->{record_tag} = $record_tag->prev_sibling; } $record_tag ||= $twig->root->first_child; my $old = $record_tag->next_elt($record_tag->gi); # $old->delete if $self->{create}; # $old->set_att('frump','foo') if $old; $old->del_att('record_tag__') if $old; #$twig->print; #print "\n"; #$old->print if $old; ##print "\n"; # my $par = $self->create_record; # $self->{blank_element} = $par; #printf "\n%s\n %s\n", $elt->path, "@$col_names"; @$col_names = map {s"/#PCDATA""; $_} @$col_names; #$twig->print; print "\n\n"; #use Data::Dumper; print Dumper $record_tag->gi,$col_names,$atts; return $record_tag,$col_names,$atts; } sub check_twig_options { my $flags = shift; my $new_flags; my %twig_opt = %XML::Twig::valid_option; return $flags unless scalar (keys %twig_opt); while (my($k,$v) = each %$flags) { $new_flags->{$k} = $v if $twig_opt{$k}; } return $new_flags; } sub get_structure_from_map { my $self = shift; my $twig = shift; my $col_map = shift; my($amap,$map,$multi,$col_names,$pretty_cols,$col2tag); for my $col(@$col_map) { my($tag_name,$col_name) = ($col,$col); ($tag_name,$col_name) = each %$col if ref $col eq 'HASH'; my($tname,$tparent) = ($tag_name,$tag_name); if ($tname =~ m!(.*)/([^/]*)$! ) { $tparent = $1; $tname = $2; $tparent =~ s!.*/([^/]*)$!$1!; } my $tag = $twig->first_elt($tname); $tag_name=$tag->path if $tag; if (!$tag) { my $new_tag = $twig->first_elt($tparent); # die "No such element '$tname'!" 
unless $tag; if (!$new_tag) { $tag_name = $tname; } else { $tag_name=$new_tag->path . '/' . $tname; } $amap->{$tag_name}++; } if (ref $col_name eq 'ARRAY') { for my $col2(@$col_name) { $col2tag->{$col2} = $tag_name; $multi->{$tag_name}++; push @$pretty_cols, $col2; } } push @$col_names, $tag_name; push @$pretty_cols, $col_name unless ref $col_name eq 'ARRAY'; $map->{$tag_name} = $col_name; } my $record_tag; my $record_tag_path = ''; for my $col(@$col_names) { my($rt) = $col =~ m!(.*)/[^/]*$!; next unless $rt; $record_tag_path = $rt if length $rt > length $record_tag_path; } my @children = $twig->root->descendants; for my $e(@children) { next unless $e->path eq $record_tag_path; $record_tag = $e; last; } if (!$record_tag) { $record_tag = $twig->root->first_child; my $p = $record_tag->path; @$col_names = map {$p.'/'.$_}@$col_names; # use Data::Dumper; print Dumper $amap; my $newmap; $newmap->{ $p.'/'.$_ }++ for keys %{$amap}; $amap = $newmap; $newmap = {}; $newmap->{ $p.'/'.$_ } = $map->{$_} for keys %{$map}; $map = $newmap; } ## #=pod #paste into parent record_tag__ # my $rt_atts = $record_tag->atts; # if (!$rt_atts->{record_tag__}) { # my $new_rt = $record_tag->copy; # $new_rt->set_att('record_tag__','1'); # $new_rt->set_att('xstruct__','1'); # $new_rt->paste('first_child',$record_tag->parent); # $record_tag = $new_rt; # } #=cut my $col_structure = { amap => $amap, map => $map, multi => $multi, col_names => $col_names, pretty_cols => $pretty_cols, col2tag => $col2tag, }; # print $record_tag->path, "\n"; # use Data::Dumper; print Dumper $col_structure; # exit; return $record_tag, $col_structure; } sub get_data { my $self = shift; my $fh_or_str = shift; my $url = $self->{url}; if ( $url ) { $fh_or_str = AnyData::Storage::RAM::get_remote_data({},$url); } return if( ! 
defined( $fh_or_str ) ); my $col_names = shift || []; $col_names = []; #### IGNORE USER COLUMN NAMES FOR NOW my $flags; while (my($k,$v)=each %$self) { $flags->{$k}=$v; } my $root_tag = $flags->{root_tag}; my $depth_limit = $flags->{depth_limit}; my $supplied_col_names = $flags->{col_names}; my $have_col_names = 1 if $supplied_col_names; my $pretty_col_names = $supplied_col_names; my $col_structure = $self->{col_structure}; undef $col_structure unless $col_structure->{col_names} and scalar @{$col_structure->{col_names}}; my %multi; my %map; my %amap; $flags->{LoadDTD} = 1; $flags->{TwigRoots} = {$root_tag=>'1'} if $root_tag; # # DEFAULTS : KeepEncoding OFF to mirror XML::Twig # ProtocolEncoding 'ISO-8859-1' # # $flags->{KeepEncoding} ||= 1; # $flags->{ProtocolEncoding} ||= 'ISO-8859-1'; #use Data::Dumper; die Dumper $flags; $flags = check_twig_options($flags); my $twig= new XML::Twig(%{$flags}); my $success = $twig->safe_parse($fh_or_str); $self->{errstr} = $@ unless $success; die $self->{errstr} if $self->{errstr}; return undef unless $success; $self->{dtd} = $twig->dtd; my $root = $twig->first_elt($root_tag) || $twig->root; my $name = $root->path; my $element= $twig->root; my($record_tag,$colZ,$atts); my $col_map = $self->{col_map}; if ($col_map) { ($record_tag,$col_structure) = $self->get_structure_from_map($twig,$col_map); } else { ($record_tag,$colZ,$atts) = $self->get_structure($twig); if (!$col_structure) { $have_col_names++; $col_structure = build_column_names($colZ,$root,$root_tag,$colZ); $col_structure->{amap} = $atts; } } # CREATE A DUMMY RECORD TAG # my $rt_atts = $record_tag->atts; if (!$rt_atts->{record_tag__}) { my $new_rt = $record_tag->copy; $new_rt->set_att('record_tag__','1'); $new_rt->set_att('xstruct__','1'); $new_rt->paste('first_child',$record_tag->parent); $record_tag = $new_rt; } # $twig->print; # use Data::Dumper; print Dumper $col_structure; # print $self->{record_tag}->path; $self->{record_tag} = $record_tag; $self->{twig} = $twig; $self->{col_names} = $col_structure->{pretty_cols}; $self->{col_structure} = $col_structure; return 1; } ############################################################### # MAP A ROW HASH ONTO A COLUMN NAMES ARRAY ############################################################### sub rowhash_to_array { my $row = shift; my $col_structure = shift; #die Dumper $col_structure; my $col_names = $col_structure->{col_names}; my %map = %{ $col_structure->{map} } if $col_structure->{map}; my %multi = %{ $col_structure->{multi} } if $col_structure->{multi}; my $pretty_col_names = $col_structure->{pretty_cols} if $col_structure->{pretty_cols}; my @newvals; my %visited; for my $coln(@$col_names) { my $tag = $map{$coln}; #next unless $tag; if (!$multi{$tag}) { $row->{$tag} ? push @newvals, $row->{$tag} : push @newvals, undef; } else { if (!$visited{$tag}) { $visited{$tag}++; my @multi_col = ref $row->{$tag} eq 'ARRAY' ? @{$row->{$tag}} : ($row->{$tag}); push @multi_col,undef unless scalar @multi_col; my $dif = ($multi{$tag}) - (scalar @multi_col); push @multi_col,undef for 0 .. 
$dif; push @newvals,$_ for @multi_col; } } } return( \@newvals ); } ############################################################### # BUILD A COLUMN NAMES LIST IF NONE HAS BEEN BUILT YET ############################################################### sub build_column_names { my $tags = shift; my $root = shift; my $root_tag = shift; my $col_names = shift || []; my %multi; my %map; for my $col(@$col_names) { $multi{$col}++; } my %num; my $newcolz; for my $col(@$col_names) { if ($multi{$col} <2) { push @$newcolz, $col; $map{$col}=$col; next; } $num{$col}++; push @$newcolz, $col.$num{$col}; $map{$col.$num{$col}}=$col; } $col_names = $newcolz; # REMOVE AS MUCH OF THE PATH AS POSS., KEEPING NAMES UNIQUE # my $prefix = $root->gi; $prefix .= "/$root_tag" if $root_tag; my $pretty_col_names; die "No Column Names!" unless$col_names; @$pretty_col_names = @$col_names; @$pretty_col_names = map {$_ =~ s"^/$prefix/"";$_} @$pretty_col_names; my %is_member; my @newcols; for my $col(@$pretty_col_names) { my $newc = $col; $newc =~ s".*/([^/]*)$"$1"; if ($is_member{$newc}) { $newc = $col; $newc =~ s"[^/]*/(.*)"$1"; } push @newcols, $is_member{$newc} ? $col : $newc; $is_member{$newc}=1; } @$pretty_col_names = @newcols; @$pretty_col_names = map {s"/"_"g;$_} @$pretty_col_names; for (keys %multi) { $multi{$_} = $multi{$_} -1; delete $multi{$_} unless $multi{$_}; } my $col_structure = { col_names => $col_names, map => \%map, multi => \%multi, pretty_cols => $pretty_col_names, }; #print Dumper $col_structure; return( $col_structure ); } ############################################################### sub export { my $self = shift; my $storage = shift; #z my $format = shift; my $file = shift; my $flags = shift || {}; #$self->{twig}->print; if ( ( $storage and $file and !$storage->{fh} ) ) { $storage->{file_name} = $file; $storage->{fh} = $storage->open_local_file($file,'o'); } return unless $self->{twig}; #$self->{twig}->print; $self->{twig}->set_pretty_print($flags->{pretty_print}) if $flags->{pretty_print}; #$self->{twig}->print; print "\n\n"; #$self->{twig}->finish_print; my $fh = $storage->{fh} if $storage; my $r = $self->{twig}->root->gi; my $str = $self->{outside_of_tree} || ''; my $rectag = $self->{record_tag}; # $self->{twig}->first_elt($self->{record_tag}->gi)->delete; my $elt= $self->{twig}->root; # $self->{destroy}= 1; #print "FOO"; if ( $self->{destroy}) { for my $e($elt->descendants) { if ( $e->att('xstruct__') ) { $e->del_att('xstruct__'); # next unless defined $e->next_elt($e->gi); $e->delete; #next; } } } $elt= $self->{twig}->root; while( $elt= $elt->next_elt ){ #for (keys %{$elt->atts}) { print "$_#"; } my $del_parent; # $elt->delete if $elt->text =~ /x/ and $elt->is_text; if ($elt->children == 0 ) { # $elt->delete; next; } next if !$elt->att('record_tag__'); for ($elt->children) { $_->delete; } if ($elt->parent and $elt->parent->children < 2) { $del_parent++; } $elt->parent->delete if $del_parent; $elt->delete; } ###z # $str = defined $fh ? 
$self->{twig}->print($fh) # : $self->{twig}->sprint(); if($file and defined $fh ){ $str = $self->{twig}->print($fh) } else {$str = $self->{twig}->sprint();} undef $storage->{fh}; return $str; } 1; =head1 NAME AnyData::Format::XML - tiedhash and DBI access to XML =head1 SYNOPSIS # access XML data via a multidimensional tied hash # see AnyData.pod for full details # use AnyData; my $table = adTie( 'XML', $file, $mode, $flags ); OR # convert data to and from XML # see AnyData.pod for full details # use AnyData; adConvert( 'XML', $file1, $any_other_format, $file2, $flags ); adConvert( $any_other_format, $file1, 'XML', $file2, $flags ); OR # access the data via DBI and SQL # see DBD::AnyData.pod for full details # use DBI; my $dbh = DBI->connect( 'dbi:AnyData' ); $dbh->func('mytable','XML',$file,$flags,'ad_catalog'); See below for a description of the optional flags that apply to all of these examples. =head1 DESCRIPTION This module allows you to create, search, modify and/or convert XML data and files by treating them as databases without having to actually create separate database files. The data can be accessed via a multidimensional tiedhash using AnyData.pm or via DBI and SQL commands using DBD::AnyData.pm. See those modules for complete details of usage. The module is built on top of Michel Rodriguez's excellent XML::Twig which means that the AnyData interfaces can now include information from DTDs, be smarter about inferring data structure, reduce memory consumption on huge files, and provide access to many powerful features of XML::Twig and XML::Parser on which it is based. Importing options allow you to import/access/modify XML of almost any length or complexity. This includes the ability to access different subtrees as separate or joined databases. Exporting and converting options allow you to take data from almost any source (a perl array, any DBI database, etc.) and output it as an XML file. You can control the formatting of the resulting XML either by supplying a DTD listing things like nesting of tags and which columns should be output as attributes and/or you can use XML::Twig pretty_print settings to generate half a dozen different levels of compactness or whitespace in how the XML looks. The documentation below outlines the special flags that can be used in either of the interfaces to fine-tune how the XML is treated. The flags listed below define the relationship between tags and attributes in the XML document and columns in the resulting database. In many cases, you can simply accept the defaults and the database will be built automatically. However, you can also fine tune the generation of the database by specifying which tags and attributes you are interested in and their relationship with database columns. =head1 USAGE =head2 Prerequisites To use the tied hash interface, you will need AnyData XML::Twig XML::Parser To use the DBI/SQL interface, you will need those, and also DBI DBD::AnyData =head2 Required flags ( none ) If no flags are specified, then the module determines the database structure from examining the file or data itself, making use of the DTD if there is one, otherwise scanning the first child of the XML tree for structural information. =head2 Optional flags If the default behavior is not sufficient, you may either specify a "record_tag" which will be used to define column names, or you can define an entire tag-to-column mapping. For simple XML, no flags are necessary: JoeSeattleSuePortland
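A document along these lines, for example (the root tag name here is only illustrative; the record tag and column tags match the names discussed next), needs no flags:

    <table>
      <row row_id="1"><name>Joe</name><location>Seattle</location></row>
      <row row_id="2"><name>Sue</name><location>Portland</location></row>
    </table>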
The record_tag will default to the first child, namely "row". The column names will be generated from the attributes of the record tag and all of the tags included under the record tag, so the column names in this example will be "row_id","name","location". If the record_tag is not the first child, you will need to specify it. For example:

    <db>
      <table table_id="1">
        <row row_id="1"><name>Joe</name><location>Seattle</location></row>
        <row row_id="2"><name>Sue</name><location>Portland</location></row>
      </table>
      <table table_id="2">
        <row row_id="1"><name>Bob</name><location>Boise</location></row>
        <row row_id="2"><name>Bev</name><location>Billings</location></row>
      </table>
    </db>
In this case you will need to specify "row" as the record_tag since it is not the first child of the tree. The column names will be generated from the attributes of row's parent (if the parent is not the root), from row's attributes and sub tags, i.e. "table_id","row_id","name","location". In some cases you will need to specify an entire tag-to-column mapping. For example, if you want to use a different name for the database column than is used in the XML (especially if the XML tag is not a valid SQL column name). You'd also need to specify a mapping if there are two tags with the same name in different places in the XML tree. The column mapping is a reference to an array of column definitions. A column definition is either a simple name of a tag, or a hash reference with the key containing the full path of the XML tag and the value containing the desired column name alias. For example: col_map => [ 'part_id', 'part_name', 'availability' ]; That will find the first three tags with those names and create the database using the same names for the tags. Or: col_map => [ { '/parts/shop/id' => 'shop_id'}, { '/parts/shop/part/id' => 'part_id'}, { '/parts/shop/part/name' => 'part_name'}, ]; That would find the three tags referenced on the left and create a database with the three column names referenced on the right. When exporting XML, you can specify a DTD to control the output. For example, if you import a table from CSV or from an Array, you can output as XML and specify which of the columns become tags and which become attributes and also specify the nesting of the tags in your DTD. The XML format parser is built on top of Michel Rodriguez's excellent XML::Twig which is itself based on XML::Parser. Parameters to either of those modules may be passed in the flags for adTie() and the other commands including the "prettyPrint" flag to specify how the output XML is displayed and things like ProtocolEncoding. ProtocolEncoding defaults to 'ISO-8859-1', all other flags keep the defaults of XML::Twig and XML::Parser. See the documentation of those modules for details; CAUTION: Unlike other formats, the XML format does not save changes to the file as they are entered, but only saves the changes when you explicitly request them to be saved with the adExport() command. =head1 AUTHOR & COPYRIGHT copyright 2000, Jeff Zucker all rights reserved =cut