Statistics-Descriptive-3.0702000755000764000764 013364403553 16521 5ustar00shlomifshlomif000000000000README100644000764000764 273713364403553 17473 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702INSTALLATION To install this module, run the following commands: perl Build.PL ./Build ./Build test ./Build install SUPPORT AND DOCUMENTATION After installing, you can find documentation for this module with the perldoc command. perldoc Statistics::Descriptive You can also look for information at: RT, CPAN's request tracker http://rt.cpan.org/NoAuth/Bugs.html?Dist=Statistics-Descriptive CPAN Ratings http://cpanratings.perl.org/d/Statistics-Descriptive Search CPAN http://metacpan.org/release/Statistics-Descriptive COPYRIGHT AND LICENCE This program is released under the following license: perl ------------ Old README: =========== This new version (2.5) of Statistics::Descriptive contains: - frequency_distribution was modified so that specific bins could be defined. Also, caching was fixed so that it works. - Lots of changes to the test suite to enhance coverage. - Caching was removed from least_squares_fit since it didn't make sense. - Added user survey to distribution. List of methods by class: Statistics::Descriptive::Sparse ---------------------------------- add_data count mean sum variance pseudo_variance min max mindex maxdex standard_deviation sample_range Statistics::Descriptive::Full ---------------------------------- All methods above and: get_data sort_data presorted percentile median trimmed_mean harmonic_mean mode geometric_mean frequency_distribution least_squares_fit ckuskie@cpan.org Changes100644000764000764 2366413364403553 20130 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702Revision history for Perl extension Statistics::Descriptive. 3.0702 2018-10-25 - Rephrase some lacking phrasing in the docs. - https://github.com/shlomif/perl-Statistics-Descriptive/issues/5 - Thanks to @melak . 3.0701 2018-07-14 - Typo corrections thanks to Debian 3.0700 2018-07-13 - Typo corrections thanks to Debian - Added the ->summary() method to Statistics::Descriptive::Full. - https://rt.cpan.org/Public/Bug/Display.html?id=82852 - Thanks to Su-Shee 3.0613 2018-05-02 - Convert to Dist-Zilla. 3.0612 2016-01-09 - Move the VCS repository to GitHub. - People are more used to it, and it can take part in the CPAN pull-req challenge. 3.0611 2016-01-07 - Apply patch from Debian to correct a spelling error. - https://rt.cpan.org/Ticket/Display.html?id=110987 - Thanks to Salvatore Bonaccorso for the report and patch. 3.0610 2016-01-06 - Clarified the docs for add_data_with_samples(). - Some people thought they were repeat counts of the data , but they are normally ignored. - Replace remaining tabs with spaces. 3.0609 2015-06-19 - Update some old URLs and addresses. 3.0608 2015-01-14 - Apply patch to speed up skewness calculations. - https://rt.cpan.org/Public/Bug/Display.html?id=101422 - Thanks to Shawn (SLAFFAN). 3.0607 2014-02-01 - Fix the List::MoreUtils prereqs. - http://www.cpantesters.org/cpan/report/365b752c-8adf-11e3-bd14-e3bee4621ba3 - Thanks to Chris Williams (BINGOS) for the CPAN Testers Report. 3.0606 2014-01-31 - Implement the median_absolute_deviation method. - https://bitbucket.org/shlomif/perl-statistics-descriptive/pull-request/5/median-absolute-deviation-method/diff - Thanks to Kang-min Liu. - Minimal version of perl set to 5.6.0 (CPANTS). - Add standalone LICENSE file (CPANTS). 3.0605 2013-05-21 - Add t/style-trailing-space.t . - Add t/cpan-changes.t . - Convert Changes to it. 3.0604 2012-07-14 - Correct a misspelling of "weight" in lib/Statistics/Descriptive/Smoother/Weightedexponential.pm - Thanks to Wilhelm for the report. - Update the scripts/tag-release.pl file for Mercurial. 3.0603 2012-05-15 - Use in_between to compare decimal numbers - Smoothing tests were failing because of rounding problems - Thanks to Andreas J. König for reporting it and to Fabio Ponciroli for fixing it. 3.0602 2012-05-12 - Correct a typo: - https://rt.cpan.org/Ticket/Display.html?id=77145 - Thanks to Salvatore Bonaccorso and the Debian Perl Group for the report. 3.0601 2012-05-11 - No longer using Test::Exception in the tests. - It was used by the tests and not specified in test_requires/build_requires. - Thanks to hsk@fli-leibniz.de for the report. 3.0600 2012-05-11 - Add the smoothing functionality. - Add the following public methods: add_data_with_samples(), set_smoother(), get_smoothed_data() to the main module. - Add the lib/Statistics/Descriptive/Smoother.pm and lib/Statistics/Descriptive/Smoother/Exponential.pm lib/Statistics/Descriptive/Smoother/Weightedexponential.pm modules. - Thanks to Fabio Ponciroli - Add the scripts/bump-version-number.pl to facilitate bumping the version number. 3.0500 2012-05-03 - Add the get_data_without_outliers() and the set_outlier_filter() methods. - See https://bitbucket.org/barbasa/perl-statistics-descriptive/overview - Thanks to Fabio Ponciroli 3.0400 2012-03-01 - Fix https://rt.cpan.org/Ticket/Display.html?id=74890 - some function should return undef() in list context so they can be easily assigned to values in hash initialisations. - thanks to SLAFFAN for a preliminary patch which was modified by SHLOMIF (the current Statistics-Descriptive maintainer). 3.0300 2012-02-11 - Now mean() and median() and other routines return undef() if there are no data. - Somewhat incompatible change: some methods that returned undef() under list context now return an empty list (so it will be false). - it is generally not recommended to call such methods in list context as they should always be called in scalar context. - Resolves https://rt.cpan.org/Ticket/Display.html?id=74693 - thanks to Shawn Laffan for the report and the patch. 3.0203 2011-11-17 - Fix https://rt.cpan.org/Ticket/Display.html?id=72495 . - percentile should not die and should return undef if there are no elements in the collection. 3.0202 2011-07-23 - Moved tag-release.pl to scripts/tag-release.pl (though we now use Mercurial instead of Subversion.) - Add t/mode.t to test the ->mode() method. - Documented ->mode() better. - Optimized ->mode(). 3.0201 2010-10-14 - Add some documentation clarifying the 0th percentile return, as it returns undef() for representing -inf: - Fix https://rt.cpan.org/Ticket/Display.html?id=62055 - Thanks to Dave Breimann for reporting it. - Add the tag-release.pl to tag a release using Subversion. 3.0200 2010-06-18 - Added skewness and kurtosis - https://rt.cpan.org/Ticket/Display.html?id=58187 - Thanks to Shawn Laffan. - Removed the Changes / Revision log from the .pm file. 3.0102 2010-06-15 - Add the $VERSION variable to Statistics::Descriptive::Sparse and Statistics::Descriptive::Full. This was done to silence the CPAN indexer. 3.0101 2010-06-15 - Moved the trimmed_mean caching test (that used the Benchmark.pm module) to rejects/descr.t , because it kept failing. 3.0100 2009-07-20 - Added the quantile method - thanks to Djibril Ousmanou (DJIBEL). - https://rt.cpan.org/Ticket/Display.html?id=47948 3.0000 2009-05-29 - Added tests (for ->count, ->sum, ->sumsq, ->min, ->max) - Localized the scope of $stat and other variables in t/descr.t - Got rid of AUTOLOAD in favour of individual accessors. - Converted many direct member accesses to the accessors. - Added ->frequency_distribution_ref() which deprecates frequency_distribution(). - Some refactoring of the lib/Statistics/Descriptive.pm module (without breaking the documented API). 2.9 2009-05-13 - Fixed bug https://rt.cpan.org/Public/Bug/Display.html?id=46026 : - standard_deviation failing due to a variance that got evaluated to 0 due to rounding errors. - Kwalitee : added a LICENSE section to the POD. - Kwalitee (CPANTS) : added an examples/ directory with a script. 2.8 2009-05-09 - Enabled "./Build runtest" and "./Build distruntest" (using Test::Run) in the distribution. - Fixed incomplete/broken tests in t/descr.t. 2.7 2009-05-03 - Converted the distribution to Build.PL and re-organized it to put everything under its proper place. Started maintaining it in: - [Old SVN Repository] - URL Elided. - Converted t/descr.t to use "use strict;" and "use warnings;". - Converted t/descr.t to use Test::More. - Cleaned up the "use" statement of lib/Statistics/Descriptive.pm. - Added more explicit dependencies (core, though) to Build.PL. - Fixed RT bug #34999: freq distribution generated too many bins. - https://rt.cpan.org/Ticket/Display.html?id=34999 - Added some keywords and resources to the META.yml, using Build.PL's meta_merge. - Fixed https://rt.cpan.org/Ticket/Display.html?id=32183 - more authoritative (and non-broken) link to the RFC. - Applied the patch in https://rt.cpan.org/Ticket/Display.html?id=9160 - {{#9160: Variance and Standard Deviation use costly pseudo-variance, instead of computing real variance}}. 2.6 2002-10-10 - Fixed caching in trimmed mean and modified code to allow trimming 0% from upper bound. Formerly if 0 was requested then it used the lower bound! - POD format patch from ddunlap 2.5 1999-05-12 - Forgot to document change in v2.4, which included fixing percentile so that it worked right and added to the test harness. - Modified frequency_distribution so that specific bins could be passed in. Fixed caching so that it actually works (it only used to get stuck returning the result of the first call). - Turned off caching for least_squares_fit because there's no way to generate a unique key for memorization. 2.3 1998-11-12 - Fix for frequency distribution. Changed Makefile.PL to ease ActiveState distribution of the module. Andrea's code for preventing division by zero and other improvements. He also wrote a great test bench. Added code from Warren Matthews to calculate percentile. 2.2 1998-02-23 - Multiple bug fixes: Fixed min/max bug with '0' vs defined. Provided fix for bug with AUTOLOAD/DESTROY/Carp problem. 2.1 1997-09-02 - Multiple bug fixes: Cleaned up syntax error with my scoping. Fixed errors in least_squares_fit and median methods 2.00 1997-08-20 - new version; created by h2xs 1.16 - Complete rewrite of OO interface by Colin Kuskie. - Now has 2 classes instead of 1.5, a base class without data storage and a class that inherits the base methods and extends them with data storage and more statistics. 1.1 1995-04-01 - Added LeastSquaresFit and FrequencyDistribution. 1.0 1995-03-01 - Released to comp.lang.perl and placed on archive sites. 0.20 1994-12-01 - Complete rewrite after extensive and invaluable e-mail correspondence with Anno Siegel. 0.10 1994-12-01 - Initital concept, released to perl5-porters list. - Jason Kastner LICENSE100644000764000764 5013113364403553 17627 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702Terms of Perl itself a) the GNU General Public License as published by the Free Software Foundation; either version 1, or (at your option) any later version, or b) the "Artistic License" ---------------------------------------------------------------------------- GNU GENERAL PUBLIC LICENSE Version 2, June 1991 Copyright (C) 1989, 1991 Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Lesser General Public License instead.) You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. The precise terms and conditions for copying, distribution and modification follow. GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you". Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does. 1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change. b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License. c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program. In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following: a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.) The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code. 4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it. 6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License. 7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 9. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation. 10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS ---------------------------------------------------------------------------- The Artistic License Preamble The intent of this document is to state the conditions under which a Package may be copied, such that the Copyright Holder maintains some semblance of artistic control over the development of the package, while giving the users of the package the right to use and distribute the Package in a more-or-less customary fashion, plus the right to make reasonable modifications. Definitions: - "Package" refers to the collection of files distributed by the Copyright Holder, and derivatives of that collection of files created through textual modification. - "Standard Version" refers to such a Package if it has not been modified, or has been modified in accordance with the wishes of the Copyright Holder. - "Copyright Holder" is whoever is named in the copyright or copyrights for the package. - "You" is you, if you're thinking about copying or distributing this Package. - "Reasonable copying fee" is whatever you can justify on the basis of media cost, duplication charges, time of people involved, and so on. (You will not be required to justify it to the Copyright Holder, but only to the computing community at large as a market that must bear the fee.) - "Freely Available" means that no fee is charged for the item itself, though there may be fees involved in handling the item. It also means that recipients of the item may redistribute it under the same conditions they received it. 1. You may make and give away verbatim copies of the source form of the Standard Version of this Package without restriction, provided that you duplicate all of the original copyright notices and associated disclaimers. 2. You may apply bug fixes, portability fixes and other modifications derived from the Public Domain or from the Copyright Holder. A Package modified in such a way shall still be considered the Standard Version. 3. You may otherwise modify your copy of this Package in any way, provided that you insert a prominent notice in each changed file stating how and when you changed that file, and provided that you do at least ONE of the following: a) place your modifications in the Public Domain or otherwise make them Freely Available, such as by posting said modifications to Usenet or an equivalent medium, or placing the modifications on a major archive site such as ftp.uu.net, or by allowing the Copyright Holder to include your modifications in the Standard Version of the Package. b) use the modified Package only within your corporation or organization. c) rename any non-standard executables so the names do not conflict with standard executables, which must also be provided, and provide a separate manual page for each non-standard executable that clearly documents how it differs from the Standard Version. d) make other distribution arrangements with the Copyright Holder. 4. You may distribute the programs of this Package in object code or executable form, provided that you do at least ONE of the following: a) distribute a Standard Version of the executables and library files, together with instructions (in the manual page or equivalent) on where to get the Standard Version. b) accompany the distribution with the machine-readable source of the Package with your modifications. c) accompany any non-standard executables with their corresponding Standard Version executables, giving the non-standard executables non-standard names, and clearly documenting the differences in manual pages (or equivalent), together with instructions on where to get the Standard Version. d) make other distribution arrangements with the Copyright Holder. 5. You may charge a reasonable copying fee for any distribution of this Package. You may charge any fee you choose for support of this Package. You may not charge a fee for this Package itself. However, you may distribute this Package in aggregate with other (possibly commercial) programs as part of a larger (possibly commercial) software distribution provided that you do not advertise this Package as a product of your own. 6. The scripts and library files supplied as input to or produced as output from the programs of this Package do not automatically fall under the copyright of this Package, but belong to whomever generated them, and may be sold commercially, and may be aggregated with this Package. 7. C or perl subroutines supplied by you and linked into this Package shall not be considered part of this Package. 8. The name of the Copyright Holder may not be used to endorse or promote products derived from this software without specific prior written permission. 9. THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE. The End dist.ini100644000764000764 113413364403553 20245 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702name = Statistics-Descriptive author = Shlomi Fish license = Perl_5 copyright_holder = Jason Kastner, Andrea Spinelli, Colin Kuskie, and others copyright_year = 1997 [@Filter] -bundle = @SHLOMIF -remove = License -remove = Readme -remove = Test::Compile dist = Statistics-Descriptive github_name = perl-Statistics-Descriptive [Keywords] keyword = average keyword = distribution keyword = mean keyword = median keyword = statistics keyword = stats keyword = stddev [PruneFiles] match = ^rejects/ [RewriteVersion] [Test::Compile] fake_home = 1 skip = bump-ver|tag-release|run_agg_tests t000755000764000764 013364403553 16705 5ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702mode.t100644000764000764 135613364403553 20163 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/t#!/usr/bin/perl use strict; use warnings; use Test::More tests => 4; use Statistics::Descriptive; { my $stat = Statistics::Descriptive::Full->new(); $stat->add_data( 1, 10, 100 ); my $mode = $stat->mode(); # TEST ok (!defined($mode), "No mode for a flat distribution." ); my $second_mode = $stat->mode(); # TEST ok (!defined($second_mode), "No mode after a second call." ); } { my $stat = Statistics::Descriptive::Full->new(); $stat->add_data( 1, 5,5,5,10,19,19,30 ); my $mode = $stat->mode(); # TEST is ($mode, 5, "Mode is 5." ); my $second_mode = $stat->mode(); # TEST is ($second_mode, 5, "Second call mode is 5." ); } META.yml100644000764000764 2664713364403553 20112 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702--- abstract: 'Module of basic descriptive statistical functions.' author: - 'Shlomi Fish ' build_requires: Benchmark: '0' Exporter: '0' File::Spec: '0' File::Temp: '0' IO::Handle: '0' IPC::Open3: '0' Module::Build: '0.28' Test::More: '0.88' lib: '0' configure_requires: ExtUtils::MakeMaker: '0' Module::Build: '0.28' dynamic_config: 0 generated_by: 'Dist::Zilla version 6.012, CPAN::Meta::Converter version 2.150010' keywords: - average - distribution - mean - median - statistics - stats - stddev license: perl meta-spec: url: http://module-build.sourceforge.net/META-spec-v1.4.html version: '1.4' name: Statistics-Descriptive provides: Statistics::Descriptive: file: lib/Statistics/Descriptive.pm version: '3.0702' Statistics::Descriptive::Full: file: lib/Statistics/Descriptive/Full.pm version: '3.0702' Statistics::Descriptive::Smoother: file: lib/Statistics/Descriptive/Smoother.pm version: '3.0702' Statistics::Descriptive::Smoother::Exponential: file: lib/Statistics/Descriptive/Smoother/Exponential.pm version: '3.0702' Statistics::Descriptive::Smoother::Weightedexponential: file: lib/Statistics/Descriptive/Smoother/Weightedexponential.pm version: '3.0702' Statistics::Descriptive::Sparse: file: lib/Statistics/Descriptive/Sparse.pm version: '3.0702' requires: Carp: '0' List::MoreUtils: '0' List::Util: '0' POSIX: '0' base: '0' parent: '0' perl: '5.006' strict: '0' vars: '0' warnings: '0' resources: bugtracker: https://github.com/shlomif/perl-Statistics-Descriptive/issues homepage: http://metacpan.org/release/Statistics-Descriptive repository: git://github.com/shlomif/perl-Statistics-Descriptive.git version: '3.0702' x_Dist_Zilla: perl: version: '5.028000' plugins: - class: Dist::Zilla::Plugin::Prereqs config: Dist::Zilla::Plugin::Prereqs: phase: test type: requires name: '@Filter/TestMoreDoneTesting' version: '6.012' - class: Dist::Zilla::Plugin::AutoPrereqs name: '@Filter/AutoPrereqs' version: '6.012' - class: Dist::Zilla::Plugin::ExecDir name: '@Filter/ExecDir' version: '6.012' - class: Dist::Zilla::Plugin::GatherDir config: Dist::Zilla::Plugin::GatherDir: exclude_filename: [] exclude_match: [] follow_symlinks: 0 include_dotfiles: 0 prefix: '' prune_directory: [] root: . name: '@Filter/GatherDir' version: '6.012' - class: Dist::Zilla::Plugin::ManifestSkip name: '@Filter/ManifestSkip' version: '6.012' - class: Dist::Zilla::Plugin::MetaYAML name: '@Filter/MetaYAML' version: '6.012' - class: Dist::Zilla::Plugin::PruneCruft name: '@Filter/PruneCruft' version: '6.012' - class: Dist::Zilla::Plugin::RunExtraTests config: Dist::Zilla::Role::TestRunner: default_jobs: 1 name: '@Filter/RunExtraTests' version: '0.029' - class: Dist::Zilla::Plugin::ShareDir name: '@Filter/ShareDir' version: '6.012' - class: Dist::Zilla::Plugin::MakeMaker config: Dist::Zilla::Role::TestRunner: default_jobs: 1 name: '@Filter/MakeMaker' version: '6.012' - class: Dist::Zilla::Plugin::CheckChangesHasContent name: '@Filter/CheckChangesHasContent' version: '0.011' - class: Dist::Zilla::Plugin::ConfirmRelease name: '@Filter/ConfirmRelease' version: '6.012' - class: Dist::Zilla::Plugin::Manifest name: '@Filter/Manifest' version: '6.012' - class: Dist::Zilla::Plugin::MetaConfig name: '@Filter/MetaConfig' version: '6.012' - class: Dist::Zilla::Plugin::MetaJSON name: '@Filter/MetaJSON' version: '6.012' - class: Dist::Zilla::Plugin::MetaProvides::Package config: Dist::Zilla::Plugin::MetaProvides::Package: finder_objects: - class: Dist::Zilla::Plugin::FinderCode name: '@Filter/MetaProvides::Package/AUTOVIV/:InstallModulesPM' version: '6.012' include_underscores: 0 Dist::Zilla::Role::MetaProvider::Provider: $Dist::Zilla::Role::MetaProvider::Provider::VERSION: '2.002004' inherit_missing: '1' inherit_version: '1' meta_noindex: '1' Dist::Zilla::Role::ModuleMetadata: Module::Metadata: '1.000033' version: '0.006' name: '@Filter/MetaProvides::Package' version: '2.004003' - class: Dist::Zilla::Plugin::MetaResources name: '@Filter/MetaResources' version: '6.012' - class: Dist::Zilla::Plugin::ModuleBuild config: Dist::Zilla::Role::TestRunner: default_jobs: 1 name: '@Filter/ModuleBuild' version: '6.012' - class: Dist::Zilla::Plugin::PkgVersion name: '@Filter/PkgVersion' version: '6.012' - class: Dist::Zilla::Plugin::PodCoverageTests name: '@Filter/PodCoverageTests' version: '6.012' - class: Dist::Zilla::Plugin::PodSyntaxTests name: '@Filter/PodSyntaxTests' version: '6.012' - class: Dist::Zilla::Plugin::PodVersion name: '@Filter/PodVersion' version: '6.012' - class: Dist::Zilla::Plugin::Test::CPAN::Changes config: Dist::Zilla::Plugin::Test::CPAN::Changes: changelog: Changes name: '@Filter/Test::CPAN::Changes' version: '0.012' - class: Dist::Zilla::Plugin::Test::EOL config: Dist::Zilla::Plugin::Test::EOL: filename: xt/author/eol.t finder: - ':ExecFiles' - ':InstallModules' - ':TestFiles' trailing_whitespace: 1 name: '@Filter/Test::EOL' version: '0.19' - class: Dist::Zilla::Plugin::Test::NoTabs config: Dist::Zilla::Plugin::Test::NoTabs: filename: xt/author/no-tabs.t finder: - ':InstallModules' - ':ExecFiles' - ':TestFiles' name: '@Filter/Test::NoTabs' version: '0.15' - class: Dist::Zilla::Plugin::Test::TrailingSpace name: '@Filter/Test::TrailingSpace' version: 0.2.0 - class: Dist::Zilla::Plugin::TestRelease name: '@Filter/TestRelease' version: '6.012' - class: Dist::Zilla::Plugin::PodWeaver config: Dist::Zilla::Plugin::PodWeaver: finder: - ':InstallModules' - ':ExecFiles' plugins: - class: Pod::Weaver::Plugin::EnsurePod5 name: '@CorePrep/EnsurePod5' version: '4.015' - class: Pod::Weaver::Plugin::H1Nester name: '@CorePrep/H1Nester' version: '4.015' - class: Pod::Weaver::Plugin::SingleEncoding name: -SingleEncoding version: '4.015' - class: Pod::Weaver::Section::Generic name: NAME version: '4.015' - class: Pod::Weaver::Section::Version name: Version version: '4.015' - class: Pod::Weaver::Section::Region name: prelude version: '4.015' - class: Pod::Weaver::Section::Generic name: SYNOPSIS version: '4.015' - class: Pod::Weaver::Section::Generic name: DESCRIPTION version: '4.015' - class: Pod::Weaver::Section::Generic name: OVERVIEW version: '4.015' - class: Pod::Weaver::Section::Collect name: ATTRIBUTES version: '4.015' - class: Pod::Weaver::Section::Collect name: METHODS version: '4.015' - class: Pod::Weaver::Section::Leftovers name: Leftovers version: '4.015' - class: Pod::Weaver::Section::Region name: postlude version: '4.015' - class: Pod::Weaver::Section::Authors name: Authors version: '4.015' - class: Pod::Weaver::Section::Legal name: Legal version: '4.015' - class: Pod::Weaver::Section::Bugs name: Bugs version: '4.015' - class: Pod::Weaver::Section::Support name: Support version: '1.010' name: '@Filter/PodWeaver' version: '4.008' - class: Dist::Zilla::Plugin::UploadToCPAN name: '@Filter/UploadToCPAN' version: '6.012' - class: Dist::Zilla::Plugin::Keywords config: Dist::Zilla::Plugin::Keywords: keywords: - average - distribution - mean - median - statistics - stats - stddev name: Keywords version: '0.007' - class: Dist::Zilla::Plugin::PruneFiles name: PruneFiles version: '6.012' - class: Dist::Zilla::Plugin::RewriteVersion config: Dist::Zilla::Plugin::RewriteVersion: add_tarball_name: 0 finders: - ':ExecFiles' - ':InstallModules' global: 0 skip_version_provider: 0 name: RewriteVersion version: '0.018' - class: Dist::Zilla::Plugin::Test::Compile config: Dist::Zilla::Plugin::Test::Compile: bail_out_on_fail: '0' fail_on_warning: author fake_home: 1 filename: t/00-compile.t module_finder: - ':InstallModules' needs_display: 0 phase: test script_finder: - ':PerlExecFiles' skips: - bump-ver|tag-release|run_agg_tests switch: [] name: Test::Compile version: '2.058' - class: Dist::Zilla::Plugin::FinderCode name: ':InstallModules' version: '6.012' - class: Dist::Zilla::Plugin::FinderCode name: ':IncModules' version: '6.012' - class: Dist::Zilla::Plugin::FinderCode name: ':TestFiles' version: '6.012' - class: Dist::Zilla::Plugin::FinderCode name: ':ExtraTestFiles' version: '6.012' - class: Dist::Zilla::Plugin::FinderCode name: ':ExecFiles' version: '6.012' - class: Dist::Zilla::Plugin::FinderCode name: ':PerlExecFiles' version: '6.012' - class: Dist::Zilla::Plugin::FinderCode name: ':ShareFiles' version: '6.012' - class: Dist::Zilla::Plugin::FinderCode name: ':MainModule' version: '6.012' - class: Dist::Zilla::Plugin::FinderCode name: ':AllFiles' version: '6.012' - class: Dist::Zilla::Plugin::FinderCode name: ':NoFiles' version: '6.012' - class: Dist::Zilla::Plugin::FinderCode name: '@Filter/MetaProvides::Package/AUTOVIV/:InstallModulesPM' version: '6.012' zilla: class: Dist::Zilla::Dist::Builder config: is_trial: '0' version: '6.012' x_generated_by_perl: v5.28.0 x_serialization_backend: 'YAML::Tiny version 1.73' MANIFEST100644000764000764 172013364403553 17733 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702# This file was automatically generated by Dist::Zilla::Plugin::Manifest v6.012. Build.PL Changes LICENSE MANIFEST MANIFEST.SKIP META.json META.yml Makefile.PL README UserSurvey.txt dist.ini examples/statistical-analysis.pl inc/Test/Run/Builder.pm lib/Statistics/Descriptive.pm lib/Statistics/Descriptive/Full.pm lib/Statistics/Descriptive/Smoother.pm lib/Statistics/Descriptive/Smoother/Exponential.pm lib/Statistics/Descriptive/Smoother/Weightedexponential.pm lib/Statistics/Descriptive/Sparse.pm scripts/bump-version-number.pl scripts/tag-release.pl t/00-compile.t t/00-load.t t/boilerplate.t t/descr.t t/descr_smooth_methods.t t/freq_distribution-1-rt-34999.t t/lib/Utils.pm t/median_absolute_deviation.t t/mode.t t/outliers.t t/quantile.t t/smoother.t t/smoother_exponential.t t/smoother_weightedexponential.t t/summary.t weaver.ini xt/author/eol.t xt/author/no-tabs.t xt/author/pod-coverage.t xt/author/pod-syntax.t xt/release/cpan-changes.t xt/release/trailing-space.t Build.PL100644000764000764 300513364403553 20074 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702 # This file was automatically generated by Dist::Zilla::Plugin::ModuleBuild v6.012. use strict; use warnings; use Module::Build 0.28; my %module_build_args = ( "build_requires" => { "Module::Build" => "0.28" }, "configure_requires" => { "ExtUtils::MakeMaker" => 0, "Module::Build" => "0.28" }, "dist_abstract" => "Module of basic descriptive statistical functions.", "dist_author" => [ "Shlomi Fish " ], "dist_name" => "Statistics-Descriptive", "dist_version" => "3.0702", "license" => "perl", "module_name" => "Statistics::Descriptive", "recursive_test_files" => 1, "requires" => { "Carp" => 0, "List::MoreUtils" => 0, "List::Util" => 0, "POSIX" => 0, "base" => 0, "parent" => 0, "perl" => "5.006", "strict" => 0, "vars" => 0, "warnings" => 0 }, "test_requires" => { "Benchmark" => 0, "Exporter" => 0, "File::Spec" => 0, "File::Temp" => 0, "IO::Handle" => 0, "IPC::Open3" => 0, "Test::More" => "0.88", "lib" => 0 } ); my %fallback_build_requires = ( "Benchmark" => 0, "Exporter" => 0, "File::Spec" => 0, "File::Temp" => 0, "IO::Handle" => 0, "IPC::Open3" => 0, "Module::Build" => "0.28", "Test::More" => "0.88", "lib" => 0 ); unless ( eval { Module::Build->VERSION(0.4004) } ) { delete $module_build_args{test_requires}; $module_build_args{build_requires} = \%fallback_build_requires; } my $build = Module::Build->new(%module_build_args); $build->create_build_script; descr.t100644000764000764 3006313364403553 20354 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/t#!/usr/bin/perl use strict; use warnings; use Test::More tests => 61; use lib 't/lib'; use Utils qw/is_between compare_hash_by_ranges/; use Benchmark; use Statistics::Descriptive; { # test #1 my $stat = Statistics::Descriptive::Full->new(); my @results = $stat->least_squares_fit(); # TEST ok (!scalar(@results), "Least-squares results on a non-filled object are empty."); # test #2 # data are y = 2*x - 1 $stat->add_data( 1, 3, 5, 7 ); @results = $stat->least_squares_fit(); # TEST is_deeply ( [@results[0..1]], [-1, 2], "least_squares_fit returns the correct result." ); } { # test #3 # test error condition on harmonic mean : one element zero my $stat = Statistics::Descriptive::Full->new(); $stat->add_data( 1.1, 2.9, 4.9, 0.0 ); my $single_result = $stat->harmonic_mean(); # TEST ok (!defined($single_result), "harmonic_mean is undefined if there's a 0 datum." ); } { # test #4 # test error condition on harmonic mean : sum of elements zero my $stat = Statistics::Descriptive::Full->new(); $stat->add_data( 1.0, -1.0 ); my $single_result = $stat->harmonic_mean(); # TEST ok (!defined($single_result), "harmonic_mean is undefined if the sum of the reciprocals is zero." ); } { # test #5 # test error condition on harmonic mean : sum of elements near zero my $stat = Statistics::Descriptive::Full->new(); local $Statistics::Descriptive::Tolerance = 0.1; $stat->add_data( 1.01, -1.0 ); my $single_result = $stat->harmonic_mean(); # TEST ok (! defined( $single_result ), "test error condition on harmonic mean : sum of elements near zero" ); } { # test #6 # test normal function of harmonic mean my $stat = Statistics::Descriptive::Full->new(); $stat->add_data( 1,2,3 ); my $single_result = $stat->harmonic_mean(); # TEST ok (scalar(abs( $single_result - 1.6363 ) < 0.001), "test normal function of harmonic mean", ); } { # test #7 # test stringification of hash keys in frequency distribution my $stat = Statistics::Descriptive::Full->new(); $stat->add_data(0.1, 0.15, 0.16, 1/3); my %f = $stat->frequency_distribution(2); # TEST compare_hash_by_ranges( \%f, [[0.216666,0.216667,3],[0.3333,0.3334,1]], "Test stringification of hash keys in frequency distribution", ); # test #8 ##Test memorization of last frequency distribution my %g = $stat->frequency_distribution(); # TEST is_deeply( \%f, \%g, "memorization of last frequency distribution" ); } { # test #9 # test the frequency distribution with specified bins my $stat = Statistics::Descriptive::Full->new(); my @freq_bins=(20,40,60,80,100); $stat->add_data(23.92, 32.30, 15.27, 39.89, 8.96, 40.71, 16.20, 34.61, 27.98, 74.40); my %f = $stat->frequency_distribution(\@freq_bins); # TEST is_deeply( \%f, { 20 => 3, 40 => 5, 60 => 1, 80 => 1, 100 => 0, }, "Test the frequency distribution with specified bins" ); } { # test #10 and #11 # Test the percentile function and caching my $stat = Statistics::Descriptive::Full->new(); $stat->add_data(-5,-2,4,7,7,18); ##Check algorithm # TEST is ($stat->percentile(50), 4, "percentile function and caching - 1", ); # TEST is ($stat->percentile(25), -2, "percentile function and caching - 2", ); } { # tests #12 and #13 # Check correct parsing of method parameters my $stat = Statistics::Descriptive::Full->new(); $stat->add_data(1,2,3,4,5,6,7,8,9,10); # TEST is( $stat->trimmed_mean(0.1,0.1), $stat->trimmed_mean(0.1), "correct parsing of method parameters", ); # TEST is ($stat->trimmed_mean(0.1,0), 6, "correct parsing of method parameters - 2", ); } { my $stat = Statistics::Descriptive::Full->new(); $stat->add_data((0.001) x 6); # TEST is_between ($stat->variance(), 0, 0.00001, "Workaround to avoid rounding errors that yield negative variance." ); # TEST is_between ($stat->standard_deviation(), 0, 0.00001, "Workaround to avoid rounding errors that yield negative std-dev." ); } { my $stat = Statistics::Descriptive::Full->new(); $stat->add_data(1, 2, 3, 5); # TEST is ($stat->count(), 4, "There are 4 elements." ); # TEST is ($stat->sum(), 11, "The sum is 11", ); # TEST is ($stat->sumsq(), 39, "The sum of squares is 39" ); # TEST is ($stat->min(), 1, "The minimum is 1." ); # TEST is ($stat->max(), 5, "The maximum is 5." ); } { # test #9 # test the frequency distribution with specified bins my $stat = Statistics::Descriptive::Full->new(); my @freq_bins=(20,40,60,80,100); $stat->add_data(23.92, 32.30, 15.27, 39.89, 8.96, 40.71, 16.20, 34.61, 27.98, 74.40); my $f_d = $stat->frequency_distribution_ref(\@freq_bins); # TEST is_deeply( $f_d, { 20 => 3, 40 => 5, 60 => 1, 80 => 1, 100 => 0, }, "Test the frequency distribution returned as a scalar reference" ); } { # test #9 # test the frequency distribution with specified bins my $stat = Statistics::Descriptive::Full->new(); $stat->add_data(2, 4, 8); # TEST is_between( $stat->geometric_mean(), (4-1e-4), (4+1e-4), "Geometric Mean Test #1", ) } { my $stat = Statistics::Descriptive::Full->new(); my $expected; $stat->add_data(1 .. 9, 100); # TEST $expected = 3.11889574523909; is_between ($stat->skewness(), $expected - 1E-13, $expected + 1E-13, "Skewness of $expected +/- 1E-13" ); # TEST $expected = 9.79924471616366; is_between ($stat->kurtosis(), $expected - 1E-13, $expected + 1E-13, "Kurtosis of $expected +/- 1E-13" ); $stat->add_data(100 .. 110); # now check that cached skew and kurt values are recalculated # TEST $expected = -0.306705104889384; is_between ($stat->skewness(), $expected - 1E-13, $expected + 1E-13, "Skewness of $expected +/- 1E-13" ); # TEST $expected = -2.09839497356215; is_between ($stat->kurtosis(), $expected - 1E-13, $expected + 1E-13, "Kurtosis of $expected +/- 1E-13" ); } { my $stat = Statistics::Descriptive::Full->new(); $stat->add_data(1,2); my $def; # TEST $def = defined $stat->skewness() ? 1 : 0; is ($def, 0, 'Skewness is undef for 2 samples' ); $stat->add_data (1); # TEST $def = defined $stat->kurtosis() ? 1 : 0; is ($def, 0, 'Kurtosis is undef for 3 samples' ); } { # This is a fix for: # https://rt.cpan.org/Ticket/Display.html?id=72495 # Thanks to Robert Messer my $stat = Statistics::Descriptive::Full->new(); my $ret = $stat->percentile(100); # TEST ok (!defined($ret), 'Returns undef and does not die.'); } # test stats when no data have been added { my $stat = Statistics::Descriptive::Full->new(); my ($result, $str); # An accessor method for _permitted would be handy, # or one to get all the stats methods my @methods = qw { mean sum variance standard_deviation min mindex max maxdex sample_range skewness kurtosis median harmonic_mean geometric_mean mode least_squares_fit percentile frequency_distribution }; # least_squares_fit is handled in an earlier test, so is actually a duplicate here #diag 'Results are undef when no data added'; # need to update next line when new methods are tested here # TEST:$method_count=18 foreach my $method (sort @methods) { $result = $stat->$method; # TEST*$method_count ok (!defined ($result), "$method is undef when object has no data."); } # quantile and trimmed_mean require valid args, so don't test in the method loop my $method = 'quantile'; $result = $stat->$method(1); # TEST ok (!defined ($result), "$method is undef when object has no data."); $method = 'trimmed_mean'; $result = $stat->$method(0.1); # TEST ok (!defined ($result), "$method is undef when object has no data."); } # test SD when only one value added { my $stat = Statistics::Descriptive::Full->new(); $stat->add_data( 1 ); my $result = $stat->standard_deviation(); # TEST ok ($result == 0, "SD is zero when object has one record."); } # Test function returns undef in list context when no data added. # The test itself is almost redundant. # Fixes https://rt.cpan.org/Ticket/Display.html?id=74890 { my $stat = Statistics::Descriptive::Full->new(); # TEST is_deeply( [ $stat->median(), ], [ undef() ], "->median() Returns undef in list-context.", ); # TEST is_deeply( [ $stat->standard_deviation(), ], [ undef() ], "->standard_deviation() Returns undef in list-context.", ); } { my $stats = Statistics::Descriptive::Full->new(); $stats->add_data_with_samples([{1 => 10}, {2 => 20}, {3 => 30}, {4 => 40}, {5 => 50}]); # TEST is_deeply( $stats->_data(), [ 1, 2, 3, 4, 5 ], 'add_data_with_samples: data set is correct', ); # TEST is_deeply( $stats->_samples(), [ 10, 20, 30, 40, 50 ], 'add_data_with_samples: samples are correct', ); } # Tests for mindex and maxdex on unsorted data, # including when new data are added which should not change the values { my $stats_class = 'Statistics::Descriptive::Full'; my $stat1 = $stats_class->new(); my @data1 = (20, 1 .. 3, 100, 1..5); my @data2 = (25, 30); my $e_maxdex = 4; my $e_mindex = 1; $stat1->add_data(@data1); # initialise # TEST*2 is ($stat1->mindex, $e_mindex, "initial mindex is correct"); is ($stat1->maxdex, $e_maxdex, "initial maxdex is correct"); # TEST*2 $stat1->add_data(@data2); # add new data is ($stat1->mindex, $e_mindex, "mindex is correct after new data added"); is ($stat1->maxdex, $e_maxdex, "maxdex is correct after new data added"); # TEST*2 $stat1->median; # trigger a sort $e_maxdex = scalar @data1 + scalar @data2 - 1; is ($stat1->mindex, 0, "mindex is correct after sorting"); is ($stat1->maxdex, $e_maxdex, "maxdex is correct after sorting"); } # what happens when we add new data? # Recycle the same data so mean, sd etc remain the same { my $stats_class = 'Statistics::Descriptive::Full'; my $stat1 = $stats_class->new(); my $stat2 = $stats_class->new(); my @data1 = (1 .. 9, 100); my @data2 = (100 .. 110); # sample of methods my @methods = qw /mean standard_deviation count skewness kurtosis median/; $stat1->add_data(@data1); # initialise foreach my $meth (@methods) { # run some methods $stat1->$meth; } $stat1->add_data(@data2); # add new data foreach my $meth (@methods) { # re-run some methods $stat1->$meth; } $stat2->add_data(@data1, @data2); # initialise with all data foreach my $meth (@methods) { # run some methods $stat2->$meth; } # TEST is_deeply ( $stat1, $stat2, 'stats consistent after adding new data', ); } META.json100644000764000764 4413613364403553 20253 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702{ "abstract" : "Module of basic descriptive statistical functions.", "author" : [ "Shlomi Fish " ], "dynamic_config" : 0, "generated_by" : "Dist::Zilla version 6.012, CPAN::Meta::Converter version 2.150010", "keywords" : [ "average", "distribution", "mean", "median", "statistics", "stats", "stddev" ], "license" : [ "perl_5" ], "meta-spec" : { "url" : "http://search.cpan.org/perldoc?CPAN::Meta::Spec", "version" : 2 }, "name" : "Statistics-Descriptive", "prereqs" : { "build" : { "requires" : { "Module::Build" : "0.28" } }, "configure" : { "requires" : { "ExtUtils::MakeMaker" : "0", "Module::Build" : "0.28" } }, "develop" : { "requires" : { "Pod::Coverage::TrustPod" : "0", "Test::CPAN::Changes" : "0.19", "Test::EOL" : "0", "Test::More" : "0.96", "Test::NoTabs" : "0", "Test::Pod" : "1.41", "Test::Pod::Coverage" : "1.08", "Test::TrailingSpace" : "0.0203" } }, "runtime" : { "requires" : { "Carp" : "0", "List::MoreUtils" : "0", "List::Util" : "0", "POSIX" : "0", "base" : "0", "parent" : "0", "perl" : "5.006", "strict" : "0", "vars" : "0", "warnings" : "0" } }, "test" : { "requires" : { "Benchmark" : "0", "Exporter" : "0", "File::Spec" : "0", "File::Temp" : "0", "IO::Handle" : "0", "IPC::Open3" : "0", "Test::More" : "0.88", "lib" : "0" } } }, "provides" : { "Statistics::Descriptive" : { "file" : "lib/Statistics/Descriptive.pm", "version" : "3.0702" }, "Statistics::Descriptive::Full" : { "file" : "lib/Statistics/Descriptive/Full.pm", "version" : "3.0702" }, "Statistics::Descriptive::Smoother" : { "file" : "lib/Statistics/Descriptive/Smoother.pm", "version" : "3.0702" }, "Statistics::Descriptive::Smoother::Exponential" : { "file" : "lib/Statistics/Descriptive/Smoother/Exponential.pm", "version" : "3.0702" }, "Statistics::Descriptive::Smoother::Weightedexponential" : { "file" : "lib/Statistics/Descriptive/Smoother/Weightedexponential.pm", "version" : "3.0702" }, "Statistics::Descriptive::Sparse" : { "file" : "lib/Statistics/Descriptive/Sparse.pm", "version" : "3.0702" } }, "release_status" : "stable", "resources" : { "bugtracker" : { "web" : "https://github.com/shlomif/perl-Statistics-Descriptive/issues" }, "homepage" : "http://metacpan.org/release/Statistics-Descriptive", "repository" : { "type" : "git", "url" : "git://github.com/shlomif/perl-Statistics-Descriptive.git", "web" : "https://github.com/shlomif/perl-Statistics-Descriptive" } }, "version" : "3.0702", "x_Dist_Zilla" : { "perl" : { "version" : "5.028000" }, "plugins" : [ { "class" : "Dist::Zilla::Plugin::Prereqs", "config" : { "Dist::Zilla::Plugin::Prereqs" : { "phase" : "test", "type" : "requires" } }, "name" : "@Filter/TestMoreDoneTesting", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::AutoPrereqs", "name" : "@Filter/AutoPrereqs", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::ExecDir", "name" : "@Filter/ExecDir", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::GatherDir", "config" : { "Dist::Zilla::Plugin::GatherDir" : { "exclude_filename" : [], "exclude_match" : [], "follow_symlinks" : 0, "include_dotfiles" : 0, "prefix" : "", "prune_directory" : [], "root" : "." } }, "name" : "@Filter/GatherDir", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::ManifestSkip", "name" : "@Filter/ManifestSkip", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::MetaYAML", "name" : "@Filter/MetaYAML", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::PruneCruft", "name" : "@Filter/PruneCruft", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::RunExtraTests", "config" : { "Dist::Zilla::Role::TestRunner" : { "default_jobs" : 1 } }, "name" : "@Filter/RunExtraTests", "version" : "0.029" }, { "class" : "Dist::Zilla::Plugin::ShareDir", "name" : "@Filter/ShareDir", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::MakeMaker", "config" : { "Dist::Zilla::Role::TestRunner" : { "default_jobs" : 1 } }, "name" : "@Filter/MakeMaker", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::CheckChangesHasContent", "name" : "@Filter/CheckChangesHasContent", "version" : "0.011" }, { "class" : "Dist::Zilla::Plugin::ConfirmRelease", "name" : "@Filter/ConfirmRelease", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::Manifest", "name" : "@Filter/Manifest", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::MetaConfig", "name" : "@Filter/MetaConfig", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::MetaJSON", "name" : "@Filter/MetaJSON", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::MetaProvides::Package", "config" : { "Dist::Zilla::Plugin::MetaProvides::Package" : { "finder_objects" : [ { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : "@Filter/MetaProvides::Package/AUTOVIV/:InstallModulesPM", "version" : "6.012" } ], "include_underscores" : 0 }, "Dist::Zilla::Role::MetaProvider::Provider" : { "$Dist::Zilla::Role::MetaProvider::Provider::VERSION" : "2.002004", "inherit_missing" : 1, "inherit_version" : 1, "meta_noindex" : 1 }, "Dist::Zilla::Role::ModuleMetadata" : { "Module::Metadata" : "1.000033", "version" : "0.006" } }, "name" : "@Filter/MetaProvides::Package", "version" : "2.004003" }, { "class" : "Dist::Zilla::Plugin::MetaResources", "name" : "@Filter/MetaResources", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::ModuleBuild", "config" : { "Dist::Zilla::Role::TestRunner" : { "default_jobs" : 1 } }, "name" : "@Filter/ModuleBuild", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::PkgVersion", "name" : "@Filter/PkgVersion", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::PodCoverageTests", "name" : "@Filter/PodCoverageTests", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::PodSyntaxTests", "name" : "@Filter/PodSyntaxTests", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::PodVersion", "name" : "@Filter/PodVersion", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::Test::CPAN::Changes", "config" : { "Dist::Zilla::Plugin::Test::CPAN::Changes" : { "changelog" : "Changes" } }, "name" : "@Filter/Test::CPAN::Changes", "version" : "0.012" }, { "class" : "Dist::Zilla::Plugin::Test::EOL", "config" : { "Dist::Zilla::Plugin::Test::EOL" : { "filename" : "xt/author/eol.t", "finder" : [ ":ExecFiles", ":InstallModules", ":TestFiles" ], "trailing_whitespace" : 1 } }, "name" : "@Filter/Test::EOL", "version" : "0.19" }, { "class" : "Dist::Zilla::Plugin::Test::NoTabs", "config" : { "Dist::Zilla::Plugin::Test::NoTabs" : { "filename" : "xt/author/no-tabs.t", "finder" : [ ":InstallModules", ":ExecFiles", ":TestFiles" ] } }, "name" : "@Filter/Test::NoTabs", "version" : "0.15" }, { "class" : "Dist::Zilla::Plugin::Test::TrailingSpace", "name" : "@Filter/Test::TrailingSpace", "version" : "0.2.0" }, { "class" : "Dist::Zilla::Plugin::TestRelease", "name" : "@Filter/TestRelease", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::PodWeaver", "config" : { "Dist::Zilla::Plugin::PodWeaver" : { "finder" : [ ":InstallModules", ":ExecFiles" ], "plugins" : [ { "class" : "Pod::Weaver::Plugin::EnsurePod5", "name" : "@CorePrep/EnsurePod5", "version" : "4.015" }, { "class" : "Pod::Weaver::Plugin::H1Nester", "name" : "@CorePrep/H1Nester", "version" : "4.015" }, { "class" : "Pod::Weaver::Plugin::SingleEncoding", "name" : "-SingleEncoding", "version" : "4.015" }, { "class" : "Pod::Weaver::Section::Generic", "name" : "NAME", "version" : "4.015" }, { "class" : "Pod::Weaver::Section::Version", "name" : "Version", "version" : "4.015" }, { "class" : "Pod::Weaver::Section::Region", "name" : "prelude", "version" : "4.015" }, { "class" : "Pod::Weaver::Section::Generic", "name" : "SYNOPSIS", "version" : "4.015" }, { "class" : "Pod::Weaver::Section::Generic", "name" : "DESCRIPTION", "version" : "4.015" }, { "class" : "Pod::Weaver::Section::Generic", "name" : "OVERVIEW", "version" : "4.015" }, { "class" : "Pod::Weaver::Section::Collect", "name" : "ATTRIBUTES", "version" : "4.015" }, { "class" : "Pod::Weaver::Section::Collect", "name" : "METHODS", "version" : "4.015" }, { "class" : "Pod::Weaver::Section::Leftovers", "name" : "Leftovers", "version" : "4.015" }, { "class" : "Pod::Weaver::Section::Region", "name" : "postlude", "version" : "4.015" }, { "class" : "Pod::Weaver::Section::Authors", "name" : "Authors", "version" : "4.015" }, { "class" : "Pod::Weaver::Section::Legal", "name" : "Legal", "version" : "4.015" }, { "class" : "Pod::Weaver::Section::Bugs", "name" : "Bugs", "version" : "4.015" }, { "class" : "Pod::Weaver::Section::Support", "name" : "Support", "version" : "1.010" } ] } }, "name" : "@Filter/PodWeaver", "version" : "4.008" }, { "class" : "Dist::Zilla::Plugin::UploadToCPAN", "name" : "@Filter/UploadToCPAN", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::Keywords", "config" : { "Dist::Zilla::Plugin::Keywords" : { "keywords" : [ "average", "distribution", "mean", "median", "statistics", "stats", "stddev" ] } }, "name" : "Keywords", "version" : "0.007" }, { "class" : "Dist::Zilla::Plugin::PruneFiles", "name" : "PruneFiles", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::RewriteVersion", "config" : { "Dist::Zilla::Plugin::RewriteVersion" : { "add_tarball_name" : 0, "finders" : [ ":ExecFiles", ":InstallModules" ], "global" : 0, "skip_version_provider" : 0 } }, "name" : "RewriteVersion", "version" : "0.018" }, { "class" : "Dist::Zilla::Plugin::Test::Compile", "config" : { "Dist::Zilla::Plugin::Test::Compile" : { "bail_out_on_fail" : 0, "fail_on_warning" : "author", "fake_home" : 1, "filename" : "t/00-compile.t", "module_finder" : [ ":InstallModules" ], "needs_display" : 0, "phase" : "test", "script_finder" : [ ":PerlExecFiles" ], "skips" : [ "bump-ver|tag-release|run_agg_tests" ], "switch" : [] } }, "name" : "Test::Compile", "version" : "2.058" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : ":InstallModules", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : ":IncModules", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : ":TestFiles", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : ":ExtraTestFiles", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : ":ExecFiles", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : ":PerlExecFiles", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : ":ShareFiles", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : ":MainModule", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : ":AllFiles", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : ":NoFiles", "version" : "6.012" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : "@Filter/MetaProvides::Package/AUTOVIV/:InstallModulesPM", "version" : "6.012" } ], "zilla" : { "class" : "Dist::Zilla::Dist::Builder", "config" : { "is_trial" : 0 }, "version" : "6.012" } }, "x_generated_by_perl" : "v5.28.0", "x_serialization_backend" : "Cpanel::JSON::XS version 4.06" } weaver.ini100644000764000764 66513364403553 20563 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702[@CorePrep] [-SingleEncoding] [Generic / NAME] [Version] [Region / prelude] [Generic / SYNOPSIS] [Generic / DESCRIPTION] [Generic / OVERVIEW] [Collect / ATTRIBUTES] command = attr [Collect / METHODS] command = method [Leftovers] [Region / postlude] [Authors] [Legal] ; [Generic / DESCRIPTION] ; required = 1 ; [Generic / BUGS] ; [Generic / Section::Bugs] ; [Generic / Section::License] ; [Bugs] [Support] all_modules = 1 00-load.t100644000764000764 26613364403553 20352 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/t#!perl -T use Test::More tests => 1; BEGIN { use_ok( 'Statistics::Descriptive' ); } diag( "Testing Statistics::Descriptive $Statistics::Descriptive::VERSION, Perl $], $^X" ); summary.t100644000764000764 343613364403553 20735 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/t#!/usr/bin/perl use strict; use warnings; use Test::More tests => 6; use Statistics::Descriptive; { my @data1 = ( 1 .. 20 ); my $obj = Statistics::Descriptive::Full->new; $obj->add_data(@data1); my $summary = $obj->summary; # TEST like( $summary, qr#\bMin: 1#, 'min' ); # TEST like( $summary, qr#\bMax: 2#, 'max' ); # TEST like( $summary, qr#\bMean: 1#, 'mean' ); # TEST like( $summary, qr#\bMedian: 1#, 'mean' ); # TEST like( $summary, qr#\b1st quantile: 5#, 'quantile' ); # TEST like( $summary, qr#\b3rd quantile: 1#, 'quantile' ); } __END__ =head1 COPYRIGHT & LICENSE Copyright 2018 by Shlomi Fish This program is distributed under the MIT / Expat License: L Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. =cut Makefile.PL100644000764000764 337013364403553 20557 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702# This file was automatically generated by Dist::Zilla::Plugin::MakeMaker v6.012. use strict; use warnings; use 5.006; use ExtUtils::MakeMaker; my %WriteMakefileArgs = ( "ABSTRACT" => "Module of basic descriptive statistical functions.", "AUTHOR" => "Shlomi Fish ", "BUILD_REQUIRES" => { "Module::Build" => "0.28" }, "CONFIGURE_REQUIRES" => { "ExtUtils::MakeMaker" => 0, "Module::Build" => "0.28" }, "DISTNAME" => "Statistics-Descriptive", "LICENSE" => "perl", "MIN_PERL_VERSION" => "5.006", "NAME" => "Statistics::Descriptive", "PREREQ_PM" => { "Carp" => 0, "List::MoreUtils" => 0, "List::Util" => 0, "POSIX" => 0, "base" => 0, "parent" => 0, "strict" => 0, "vars" => 0, "warnings" => 0 }, "TEST_REQUIRES" => { "Benchmark" => 0, "Exporter" => 0, "File::Spec" => 0, "File::Temp" => 0, "IO::Handle" => 0, "IPC::Open3" => 0, "Test::More" => "0.88", "lib" => 0 }, "VERSION" => "3.0702", "test" => { "TESTS" => "t/*.t" } ); my %FallbackPrereqs = ( "Benchmark" => 0, "Carp" => 0, "Exporter" => 0, "File::Spec" => 0, "File::Temp" => 0, "IO::Handle" => 0, "IPC::Open3" => 0, "List::MoreUtils" => 0, "List::Util" => 0, "Module::Build" => "0.28", "POSIX" => 0, "Test::More" => "0.88", "base" => 0, "lib" => 0, "parent" => 0, "strict" => 0, "vars" => 0, "warnings" => 0 ); unless ( eval { ExtUtils::MakeMaker->VERSION(6.63_03) } ) { delete $WriteMakefileArgs{TEST_REQUIRES}; delete $WriteMakefileArgs{BUILD_REQUIRES}; $WriteMakefileArgs{PREREQ_PM} = \%FallbackPrereqs; } delete $WriteMakefileArgs{CONFIGURE_REQUIRES} unless eval { ExtUtils::MakeMaker->VERSION(6.52) }; WriteMakefile(%WriteMakefileArgs); outliers.t100644000764000764 1032213364403553 21116 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/t#!/usr/bin/perl use strict; use warnings; use Test::More tests => 11; use Statistics::Descriptive; sub foo {return;}; local $SIG{__WARN__} = sub { }; { # testing set_outlier_filter my $stat = Statistics::Descriptive::Full->new(); # TEST ok ( !defined($stat->set_outlier_filter()), 'set_outlier_filter: undef code reference value'); # TEST ok ( !defined($stat->set_outlier_filter(1)), 'set_outlier_filter: invalid code ref value'); # TEST is ( $stat->set_outlier_filter(\&foo), 1, 'set_outlier_filter: valid code reference - return value'); # TEST is ( $stat->{_outlier_filter}, \&foo, 'set_outlier_filter: valid code reference - internal'); } { # testing get_data_without_outliers without removing outliers my $stat = Statistics::Descriptive::Full->new(); # TEST ok ( !defined($stat->get_data_without_outliers()), 'get_data_without_outliers: insufficient samples'); $stat->add_data( 1, 2, 3, 4, 5 ); # TEST ok ( !defined($stat->get_data_without_outliers()), 'get_data_without_outliers: undefined filter'); # We force the filter function to never detect outliers... $stat->set_outlier_filter( sub {0} ); no warnings 'redefine'; local *Statistics::Descriptive::Full::_outlier_candidate_index = sub { 0 }; my @results = $stat->get_data_without_outliers(); #...we expect the data set to be unmodified # TEST is_deeply ( [@results], [1, 2, 3, 4, 5], 'get_data_without_outliers: no outliers', ); } { # testing get_data_without_outliers removing outliers my $stat = Statistics::Descriptive::Full->new(); # 100 is definitively the candidate to be an outlier in this series $stat->add_data( 1, 2, 3, 4, 100, 6, 7, 8 ); # We force the filter function to always detect outliers for this data set $stat->set_outlier_filter( sub {$_[1] > 0} ); my @results = $stat->get_data_without_outliers(); # Note that 100 has been filtered out from the data set # TEST is_deeply ( [@results], [1, 2, 3, 4, 6, 7, 8, ], 'get_data_without_outliers: remove outliers', ); } my ($first_val, $second_val); sub check_params { ($first_val, $second_val) = @_; } { # testing params passed to outlier filter my $stat = Statistics::Descriptive::Full->new(); # 100 is definitively the candidate to be an outlier in this series $stat->add_data( 1, 2, 3, 4, 100, 6, 7, 8 ); $stat->set_outlier_filter( \&check_params ); my @results = $stat->get_data_without_outliers(); # TEST isa_ok ($first_val, 'Statistics::Descriptive::Full', 'first param of outlier filter ok'); # TEST is ($second_val, 100, 'second param of outlier filter ok'); } { # testing _outlier_candidate_index my $stat = Statistics::Descriptive::Full->new(); # 100 is definitively the candidate to be an outlier in this series $stat->add_data( 1, 2, 3, 4, 100, 6, 7, 8 ); # TEST is ($stat->_outlier_candidate_index, 4, '_outlier_candidate_index' ); } =pod =head1 AUTHOR Fabio Ponciroli =head1 COPYRIGHT Copyright(c) 2012 by Fabio Ponciroli. =head1 LICENSE This file is licensed under the MIT/X11 License: http://www.opensource.org/licenses/mit-license.php. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. =cut smoother.t100644000764000764 1030513364403553 21111 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/t#!/usr/bin/perl use strict; use warnings; use Test::More tests => 11; use Statistics::Descriptive::Smoother; local $SIG{__WARN__} = sub { }; { #Test factory pattern my $smoother = Statistics::Descriptive::Smoother->instantiate({ method => 'exponential', coeff => 0, data => [1,2,3], samples => [100, 100, 100], }); # TEST isa_ok ($smoother, 'Statistics::Descriptive::Smoother::Exponential', 'Exponential class correctly created'); } { my $smoother = Statistics::Descriptive::Smoother->instantiate({ method => 'weightedExponential', coeff => 0, data => [1,2,3], samples => [100, 100, 100], }); # TEST isa_ok ($smoother, 'Statistics::Descriptive::Smoother::Weightedexponential', 'Weightedexponential class correctly created'); } { # Test invalid smoothing method eval { Statistics::Descriptive::Smoother->instantiate({ method => 'invalid_method', coeff => 0, data => [1,2,3], samples => [100, 100, 100], }); }; # TEST ok ($@, 'Invalid method'); } { #TODO get output from Carp #Test invalid coefficient my $smoother_neg = Statistics::Descriptive::Smoother->instantiate({ method => 'exponential', coeff => -123, data => [1,2,3], samples => [100, 100, 100], }); # TEST is ($smoother_neg, undef, 'Invalid coefficient: < 0'); my $smoother_pos = Statistics::Descriptive::Smoother->instantiate({ method => 'exponential', coeff => 123, data => [1,2,3], samples => [100, 100, 100], }); # TEST is ($smoother_pos, undef, 'Invalid coefficient: > 1'); } { #Test unsufficient number of samples my $smoother = Statistics::Descriptive::Smoother->instantiate({ method => 'exponential', coeff => 0, data => [1], samples => [100], }); # TEST is ($smoother, undef, 'Insufficient number of samples'); } { #Test smoothing coefficient accessors my $smoother = Statistics::Descriptive::Smoother->instantiate({ method => 'exponential', coeff => 0.5, data => [1,2,3], samples => [100, 100, 100], }); # TEST is ($smoother->get_smoothing_coeff(), 0.5, 'get_smoothing_coeff'); my $ok = $smoother->set_smoothing_coeff(0.7); # TEST ok ($ok, 'set_smoothing_coeff: set went fine'); # TEST is ($smoother->get_smoothing_coeff(), 0.7, 'set_smoothing_coeff: correct value set'); my $ok2 = $smoother->set_smoothing_coeff(123); # TEST is ($ok2, undef, 'set_smoothing_coeff: set failed'); # TEST is ($smoother->get_smoothing_coeff(), 0.7, 'set_smoothing_coeff: value not modified after failure'); } 1; =pod =head1 AUTHOR Fabio Ponciroli =head1 COPYRIGHT Copyright(c) 2012 by Fabio Ponciroli. =head1 LICENSE This file is licensed under the MIT/X11 License: http://www.opensource.org/licenses/mit-license.php. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. =cut quantile.t100644000764000764 350013364403553 21052 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/t#!/usr/bin/perl #================================================================== # Author : Djibril Ousmanou # Copyright : 2009 # Update : 20/07/2009 # AIM : Test quantile type 7 calcul #================================================================== use strict; use warnings; use Carp; use Test::More tests => 15; use Statistics::Descriptive; my @data1 = ( 1 .. 10 ); my @data2 = ( 601, 449, 424, 568, 569, 447, 425, 621, 616, 573, 584, 635, 480, 437, 724, 711, 717, 576, 724, 585, 458, 752, 753, 709, 584, 748, 628, 483, 739, 747, 694, 601, 758, 653, 487, 720, 750, 660, 588, 719, 631, 492, 584, 647, 548, 585, 649, 532, 492, 598, 653, 524, 567, 570, 506, 475, 640, 725, 688, 567, 634, 520, 488, 718, 769, 739, 576, 718, 527, 497, 698, 736, 785, 581, 733, 540, 537, 683, 691, 785, 588, 733, 531, 564, 581, 554, 765, 580, 626, 510, 533, 495, 470, 713, 571, 573, 476, 526, 441, 431, 686, 563, 496, 447, 518 ); my @data3 = qw/-9 2 3 44 -10 6 7/; my %DataTest = ( 'First sample test' => { 'Data' => \@data1, 'Test' => { '0' => '1', '1' => '3.25', '2' => '5.5', '3' => '7.75', '4' => '10', }, }, 'Second sample test' => { 'Data' => \@data2, 'Test' => { '0' => '424', '1' => '526', '2' => '584', '3' => '698', '4' => '785', }, }, 'Third sample test' => { 'Data' => \@data3, 'Test' => { '0' => '-10', '1' => '-3.5', '2' => '3', '3' => '6.5', '4' => '44', }, } ); # Test Quantile, foreach my $MessageTest ( sort keys %DataTest ) { my $stat = Statistics::Descriptive::Full->new(); $stat->add_data( @{ $DataTest{$MessageTest}->{Data} } ); for ( 0 .. 4 ) { is( $stat->quantile($_), $DataTest{$MessageTest}->{Test}{$_}, $MessageTest . ", Q$_" ); } } MANIFEST.SKIP100644000764000764 313364403553 20411 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702~$ UserSurvey.txt100644000764000764 211713364403553 21500 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702Over the three years that I've owned Statistics::Descriptive I've often wondered how users of the module feel about it. I would appreciate it if you would answer the following questions: 1) What do you use Statistics::Descriptive for? (This is something that always amazes me. I think the coolest app so far is summarizing web server performance for the 1998 Olympics in Atlanta) 2) On what platform do you use Perl and Statistics::Descriptive? 3) Is the manpage helpful? (If not, please explain why). 4) Do you use the caching for Full method outputs such as trimmed_mean()? 5) Is there anything really annoying about Statistics::Descriptive? (aside from flakiness from the maintainer like surveys and bad releases) 6) Is there anything missing from Statistics::Descriptive? (I inherited it from Jason Kastner because I wanted to add a few features and he didn't have time to maintain the module anymore. After v2.2, any new features (2-3) that were added have come from requests from users.) 7) Any other comments you'd like to make that I haven't asked about? lib000755000764000764 013364403553 17453 5ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/tUtils.pm100644000764000764 756713364403553 21270 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/t/libpackage Utils; use strict; use warnings; require Exporter; our @ISA = qw/Exporter/; our @EXPORT_OK = qw/is_between compare_hash_by_ranges is_array_between/; use Test::More; sub is_between { local $Test::Builder::Level = $Test::Builder::Level + 1; my ($have, $want_bottom, $want_top, $blurb) = @_; ok ( _is_between($have, $want_bottom, $want_top), $blurb ); } sub is_array_between { local $Test::Builder::Level = $Test::Builder::Level + 1; my ($got_array_ref, $expected_array_ref, $low_tolerance, $high_tolerance, $blurb) = @_; my $success = 1; if (scalar @$expected_array_ref != scalar @$got_array_ref) { $success = 0; diag('Arrays have different lengths'); } else { for my $idx (0 .. $#$got_array_ref) { my $expected_bottom = $expected_array_ref->[$idx] - $low_tolerance; my $expected_top = $expected_array_ref->[$idx] + $high_tolerance; unless (_is_between($got_array_ref->[$idx], $expected_bottom, $expected_top)) { $success = 0; diag(<<"EOF"); Value $idx is out of range: Got: [$got_array_ref->[$idx]] Expected: [$expected_bottom, $expected_top, $expected_array_ref->[$idx]] EOF last; } } } ok($success, $blurb); } sub compare_hash_by_ranges { local $Test::Builder::Level = $Test::Builder::Level + 1; my $got_hash_ref = shift; my $expected = shift; my $blurb = shift; my $got = [ map { [$_, $got_hash_ref->{$_} ] } sort { $a <=> $b } keys(%$got_hash_ref) ] ; my $success = 1; if (scalar(@$expected) != scalar(@$got)) { $success = 0; diag("Number of keys differ in hashes."); } else { COMPARE_KEYS: for my $idx (0 .. $#$got) { my ($got_key, $got_val) = @{$got->[$idx]}; my ($expected_bottom, $expected_top, $expected_val) = @{$expected->[$idx]}; if (! ( ($got_key >= $expected_bottom) && ($got_key <= $expected_top) && ($got_val == $expected_val) ) ) { $success = 0; diag(<<"EOF"); Key/Val pair No. $idx is out of range or wrong: Got: [$got_key, $got_val] Expected: [$expected_bottom, $expected_top, $expected_val] EOF last COMPARE_KEYS; } } } ok($success, $blurb); } sub _is_between { my ($have, $want_bottom, $want_top,) = @_; return (($have >= $want_bottom) && ($want_top >= $have)); } 1; =pod =head1 AUTHOR Shlomi Fish, L , C =head1 COPYRIGHT Copyright(c) 2012 by Shlomi Fish. =head1 LICENSE This file is licensed under the MIT/X11 License: http://www.opensource.org/licenses/mit-license.php. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. =cut 00-compile.t100644000764000764 334013364403553 21077 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/tuse 5.006; use strict; use warnings; # this test was generated with Dist::Zilla::Plugin::Test::Compile 2.058 use Test::More; plan tests => 6 + ($ENV{AUTHOR_TESTING} ? 1 : 0); my @module_files = ( 'Statistics/Descriptive.pm', 'Statistics/Descriptive/Full.pm', 'Statistics/Descriptive/Smoother.pm', 'Statistics/Descriptive/Smoother/Exponential.pm', 'Statistics/Descriptive/Smoother/Weightedexponential.pm', 'Statistics/Descriptive/Sparse.pm' ); # fake home for cpan-testers use File::Temp; local $ENV{HOME} = File::Temp::tempdir( CLEANUP => 1 ); my @switches = ( -d 'blib' ? '-Mblib' : '-Ilib', ); use File::Spec; use IPC::Open3; use IO::Handle; open my $stdin, '<', File::Spec->devnull or die "can't open devnull: $!"; my @warnings; for my $lib (@module_files) { # see L my $stderr = IO::Handle->new; diag('Running: ', join(', ', map { my $str = $_; $str =~ s/'/\\'/g; q{'} . $str . q{'} } $^X, @switches, '-e', "require q[$lib]")) if $ENV{PERL_COMPILE_TEST_DEBUG}; my $pid = open3($stdin, '>&STDERR', $stderr, $^X, @switches, '-e', "require q[$lib]"); binmode $stderr, ':crlf' if $^O eq 'MSWin32'; my @_warnings = <$stderr>; waitpid($pid, 0); is($?, 0, "$lib loaded ok"); shift @_warnings if @_warnings and $_warnings[0] =~ /^Using .*\bblib/ and not eval { +require blib; blib->VERSION('1.01') }; if (@_warnings) { warn @_warnings; push @warnings, @_warnings; } } is(scalar(@warnings), 0, 'no warnings found') or diag 'got warnings: ', ( Test::More->can('explain') ? Test::More::explain(\@warnings) : join("\n", '', @warnings) ) if $ENV{AUTHOR_TESTING}; boilerplate.t100644000764000764 236713364403553 21544 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/t#!perl -T use strict; use warnings; use Test::More tests => 3; sub not_in_file_ok { my ($filename, %regex) = @_; open( my $fh, '<', $filename ) or die "couldn't open $filename for reading: $!"; my %violated; while (my $line = <$fh>) { while (my ($desc, $regex) = each %regex) { if ($line =~ $regex) { push @{$violated{$desc}||=[]}, $.; } } } if (%violated) { fail("$filename contains boilerplate text"); diag "$_ appears on lines @{$violated{$_}}" for keys %violated; } else { pass("$filename contains no boilerplate text"); } } sub module_boilerplate_ok { my ($module) = @_; not_in_file_ok($module => 'the great new $MODULENAME' => qr/ - The great new /, 'boilerplate description' => qr/Quick summary of what the module/, 'stub function definition' => qr/function[12]/, ); } TODO: { not_in_file_ok(README => "The README is used..." => qr/The README is used/, "'version information here'" => qr/to provide version information/, ); not_in_file_ok(Changes => "placeholder date/time" => qr(Date/time) ); module_boilerplate_ok('lib/Statistics/Descriptive.pm'); } author000755000764000764 013364403553 20377 5ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/xteol.t100644000764000764 160113364403553 21501 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/xt/authoruse strict; use warnings; # this test was generated with Dist::Zilla::Plugin::Test::EOL 0.19 use Test::More 0.88; use Test::EOL; my @files = ( 'lib/Statistics/Descriptive.pm', 'lib/Statistics/Descriptive/Full.pm', 'lib/Statistics/Descriptive/Smoother.pm', 'lib/Statistics/Descriptive/Smoother/Exponential.pm', 'lib/Statistics/Descriptive/Smoother/Weightedexponential.pm', 'lib/Statistics/Descriptive/Sparse.pm', 't/00-compile.t', 't/00-load.t', 't/boilerplate.t', 't/descr.t', 't/descr_smooth_methods.t', 't/freq_distribution-1-rt-34999.t', 't/lib/Utils.pm', 't/median_absolute_deviation.t', 't/mode.t', 't/outliers.t', 't/quantile.t', 't/smoother.t', 't/smoother_exponential.t', 't/smoother_weightedexponential.t', 't/summary.t' ); eol_unix_ok($_, { trailing_whitespace => 1 }) foreach @files; done_testing; no-tabs.t100644000764000764 154713364403553 22276 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/xt/authoruse strict; use warnings; # this test was generated with Dist::Zilla::Plugin::Test::NoTabs 0.15 use Test::More 0.88; use Test::NoTabs; my @files = ( 'lib/Statistics/Descriptive.pm', 'lib/Statistics/Descriptive/Full.pm', 'lib/Statistics/Descriptive/Smoother.pm', 'lib/Statistics/Descriptive/Smoother/Exponential.pm', 'lib/Statistics/Descriptive/Smoother/Weightedexponential.pm', 'lib/Statistics/Descriptive/Sparse.pm', 't/00-compile.t', 't/00-load.t', 't/boilerplate.t', 't/descr.t', 't/descr_smooth_methods.t', 't/freq_distribution-1-rt-34999.t', 't/lib/Utils.pm', 't/median_absolute_deviation.t', 't/mode.t', 't/outliers.t', 't/quantile.t', 't/smoother.t', 't/smoother_exponential.t', 't/smoother_weightedexponential.t', 't/summary.t' ); notabs_ok($_) foreach @files; done_testing; scripts000755000764000764 013364403553 20131 5ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702tag-release.pl100644000764000764 65513364403553 23005 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/scripts#!/usr/bin/perl use strict; use warnings; use IO::All; my ($version) = (map { m{\$VERSION *= *'([^']+)'} ? ($1) : () } io->file('lib/Statistics/Descriptive.pm')->getlines() ) ; if (!defined ($version)) { die "Version is undefined!"; } my @cmd = ( "git", "tag", "-m", "Tagging the Statistics-Descriptive release as $version", "releases/$version", ); print join(" ", @cmd), "\n"; exec(@cmd); pod-syntax.t100644000764000764 25213364403553 23011 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/xt/author#!perl # This file was automatically generated by Dist::Zilla::Plugin::PodSyntaxTests. use strict; use warnings; use Test::More; use Test::Pod 1.41; all_pod_files_ok(); Run000755000764000764 013364403553 20676 5ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/inc/TestBuilder.pm100644000764000764 321113364403553 22757 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/inc/Test/Runpackage Test::Run::Builder; use strict; use warnings; use Module::Build; use vars qw(@ISA); @ISA = (qw(Module::Build)); sub ACTION_runtest { my ($self) = @_; my $p = $self->{properties}; $self->depends_on('code'); local @INC = @INC; # Make sure we test the module in blib/ unshift @INC, (File::Spec->catdir($p->{base_dir}, $self->blib, 'lib'), File::Spec->catdir($p->{base_dir}, $self->blib, 'arch')); $self->do_test_run_tests; } sub ACTION_distruntest { my ($self) = @_; $self->depends_on('distdir'); my $start_dir = $self->cwd; my $dist_dir = $self->dist_dir; chdir $dist_dir or die "Cannot chdir to $dist_dir: $!"; # XXX could be different names for scripts $self->run_perl_script('Build.PL') # XXX Should this be run w/ --nouse-rcfile or die "Error executing 'Build.PL' in dist directory: $!"; $self->run_perl_script('Build') or die "Error executing 'Build' in dist directory: $!"; $self->run_perl_script('Build', [], ['runtest']) or die "Error executing 'Build test' in dist directory"; chdir $start_dir; } sub do_test_run_tests { my $self = shift; require Test::Run::CmdLine::Iface; my $test_run = Test::Run::CmdLine::Iface->new( { 'test_files' => [glob("t/*.t")], } # 'backend_params' => $self->_get_backend_params(), ); return $test_run->run(); } sub ACTION_tags { return system(qw( ctags -f tags --recurse --totals --exclude=blib/ --exclude=t/lib --exclude=.svn --exclude='*~' --languages=Perl --langmap=Perl:+.t )); } 1; descr_smooth_methods.t100644000764000764 465613364403553 23461 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/t#!/usr/bin/perl use strict; use warnings; use Test::More tests => 3; use Statistics::Descriptive; local $SIG{__WARN__} = sub { }; my @original_data = (1 .. 10); { # testing set_smoother my $stats = Statistics::Descriptive::Full->new(); $stats->add_data(\@original_data ); $stats->set_smoother({ method => 'exponential', coeff => 0, }); # TEST isa_ok ( $stats->{_smoother}, 'Statistics::Descriptive::Smoother::Exponential', 'set_smoother: smoother set correctly'); } { # testing get_smoothed_data my $stats = Statistics::Descriptive::Full->new(); # TEST is ( $stats->get_smoothed_data(), undef, 'get_smoothed_data: smoother needs to be defined'); $stats->add_data(\@original_data ); $stats->set_smoother({ method => 'exponential', coeff => 0.5, }); my @expected_values = ( 1, 1.5, 2.25, 3.125, 4.0625, 5.03125, 6.015625, 7.0078125, 8.00390625, 9.001953125, ); my @smoothed_data = $stats->get_smoothed_data(); # TEST is_deeply( \@smoothed_data, \@expected_values, 'Smoothing with C=0.5'); } =pod =head1 AUTHOR Fabio Ponciroli =head1 COPYRIGHT Copyright(c) 2012 by Fabio Ponciroli. =head1 LICENSE This file is licensed under the MIT/X11 License: http://www.opensource.org/licenses/mit-license.php. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. =cut smoother_exponential.t100644000764000764 616513364403553 23510 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/t#!/usr/bin/perl use strict; use warnings; use lib 't/lib'; use Utils qw/is_array_between/; use Test::More tests => 3; use Statistics::Descriptive::Smoother; my @original_data = (1 .. 10); my @original_samples = (3, 3, 3, 3, 3, 3, 3, 3, 3, 3,); { #Test no smoothing my $smoother = Statistics::Descriptive::Smoother->instantiate({ method => 'exponential', coeff => 0, data => \@original_data, samples => \@original_samples, }); my @smoothed_data = $smoother->get_smoothed_data(); # When the smoothing coefficient is 0 the series is not smoothed # TEST is_deeply( \@smoothed_data, \@original_data, 'No smoothing C=0'); } { #Test max smoothing my $smoother = Statistics::Descriptive::Smoother->instantiate({ method => 'exponential', coeff => 1, data => \@original_data, samples => \@original_samples, }); my @smoothed_data = $smoother->get_smoothed_data(); # When the smoothing coefficient is 1 the series is universally equal to the initial unsmoothed value my @expected_values = map { $original_data[0] } 1 .. $smoother->{count}; # TEST is_deeply( \@smoothed_data, \@expected_values, 'Max smoothing C=1'); } { #Test smoothing coeff 0.5 my $smoother = Statistics::Descriptive::Smoother->instantiate({ method => 'exponential', coeff => 0.5, data => \@original_data, samples => \@original_samples, }); my @smoothed_data = $smoother->get_smoothed_data(); my @expected_values = ( 1, 1.5, 2.25, 3.125, 4.0625, 5.03125, 6.015625, 7.0078125, 8.00390625, 9.001953125, ); # TEST is_array_between( \@smoothed_data, \@expected_values, 1E-13, 1E-13, 'Smoothing with C=0.5'); } 1; =pod =head1 AUTHOR Fabio Ponciroli =head1 COPYRIGHT Copyright(c) 2012 by Fabio Ponciroli. =head1 LICENSE This file is licensed under the MIT/X11 License: http://www.opensource.org/licenses/mit-license.php. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. =cut pod-coverage.t100644000764000764 33413364403553 23257 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/xt/author#!perl # This file was automatically generated by Dist::Zilla::Plugin::PodCoverageTests. use Test::Pod::Coverage 1.08; use Pod::Coverage::TrustPod; all_pod_coverage_ok({ coverage_class => 'Pod::Coverage::TrustPod' }); release000755000764000764 013364403553 20515 5ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/xtcpan-changes.t100644000764000764 34413364403553 23352 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/xt/releaseuse strict; use warnings; # this test was generated with Dist::Zilla::Plugin::Test::CPAN::Changes 0.012 use Test::More 0.96 tests => 1; use Test::CPAN::Changes; subtest 'changes_ok' => sub { changes_file_ok('Changes'); }; trailing-space.t100644000764000764 103413364403553 23742 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/xt/release#!perl use strict; use warnings; use Test::More; eval "use Test::TrailingSpace"; if ($@) { plan skip_all => "Test::TrailingSpace required for trailing space test."; } else { plan tests => 1; } # TODO: add .pod, .PL, the README/Changes/TODO/etc. documents and possibly # some other stuff. my $finder = Test::TrailingSpace->new( { root => '.', filename_regex => qr#(?:\.(?:t|pm|pl|xs|c|h|txt|pod|PL)|README|Changes|TODO|LICENSE)\z#, }, ); # TEST $finder->no_trailing_space( "No trailing space was found." ); Statistics000755000764000764 013364403553 21342 5ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/libDescriptive.pm100644000764000764 4740313364403553 24351 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/lib/Statisticspackage Statistics::Descriptive; use strict; use warnings; ##This module draws heavily from perltoot v0.4 from Tom Christiansen. use 5.006; use vars (qw($Tolerance $Min_samples_number)); our $VERSION = '3.0702'; $Tolerance = 0.0; $Min_samples_number = 4; use Statistics::Descriptive::Sparse; use Statistics::Descriptive::Full; package Statistics::Descriptive; ##All modules return true. 1; __END__ =pod =encoding UTF-8 =head1 NAME Statistics::Descriptive - Module of basic descriptive statistical functions. =head1 VERSION version 3.0702 =head1 SYNOPSIS use Statistics::Descriptive; my $stat = Statistics::Descriptive::Full->new(); $stat->add_data(1,2,3,4); my $mean = $stat->mean(); my $var = $stat->variance(); my $tm = $stat->trimmed_mean(.25); $Statistics::Descriptive::Tolerance = 1e-10; =head1 DESCRIPTION This module provides basic functions used in descriptive statistics. It has an object oriented design and supports two different types of data storage and calculation objects: sparse and full. With the sparse method, none of the data is stored and only a few statistical measures are available. Using the full method, the entire data set is retained and additional functions are available. Whenever a division by zero may occur, the denominator is checked to be greater than the value C<$Statistics::Descriptive::Tolerance>, which defaults to 0.0. You may want to change this value to some small positive value such as 1e-24 in order to obtain error messages in case of very small denominators. Many of the methods (both Sparse and Full) cache values so that subsequent calls with the same arguments are faster. =head1 VERSION version 3.0702 =head1 METHODS =head2 Sparse Methods =over 5 =item $stat = Statistics::Descriptive::Sparse->new(); Create a new sparse statistics object. =item $stat->clear(); Effectively the same as my $class = ref($stat); undef $stat; $stat = new $class; except more efficient. =item $stat->add_data(1,2,3); Adds data to the statistics variable. The cached statistical values are updated automatically. =item $stat->count(); Returns the number of data items. =item $stat->mean(); Returns the mean of the data. =item $stat->sum(); Returns the sum of the data. =item $stat->variance(); Returns the variance of the data. Division by n-1 is used. =item $stat->standard_deviation(); Returns the standard deviation of the data. Division by n-1 is used. =item $stat->min(); Returns the minimum value of the data set. =item $stat->mindex(); Returns the index of the minimum value of the data set. =item $stat->max(); Returns the maximum value of the data set. =item $stat->maxdex(); Returns the index of the maximum value of the data set. =item $stat->sample_range(); Returns the sample range (max - min) of the data set. =back =head2 Full Methods Similar to the Sparse Methods above, any Full Method that is called caches the current result so that it doesn't have to be recalculated. In some cases, several values can be cached at the same time. =over 5 =item $stat = Statistics::Descriptive::Full->new(); Create a new statistics object that inherits from Statistics::Descriptive::Sparse so that it contains all the methods described above. =item $stat->add_data(1,2,4,5); Adds data to the statistics variable. All of the sparse statistical values are updated and cached. Cached values from Full methods are deleted since they are no longer valid. I =item $stat->add_data_with_samples([{1 => 10}, {2 => 20}, {3 => 30},]); Add data to the statistics variable and set the number of samples each value has been built with. The data is the key of each element of the input array ref, while the value is the number of samples: [{data1 => smaples1}, {data2 => samples2}, ...]. B The number of samples is only used by the smoothing function and is ignored otherwise. It is not equivalent to repeat count. In order to repeat a certain datum more than one time call add_data() like this: my $value = 5; my $repeat_count = 10; $stat->add_data( [ ($value) x $repeat_count ] ); =item $stat->get_data(); Returns a copy of the data array. =item $stat->get_data_without_outliers(); Returns a copy of the data array without outliers. The number minimum of samples to apply the outlier filtering is C<$Statistics::Descriptive::Min_samples_number>, 4 by default. A function to detect outliers need to be defined (see C), otherwise the function will return an undef value. The filtering will act only on the most extreme value of the data set (i.e.: value with the highest absolute standard deviation from the mean). If there is the need to remove more than one outlier, the filtering need to be re-run for the next most extreme value with the initial outlier removed. This is not always needed since the test (for example Grubb's test) usually can only detect the most exreme value. If there is more than one extreme case in a set, then the standard deviation will be high enough to make neither case an outlier. =item $stat->set_outlier_filter($code_ref); Set the function to filter out the outlier. C<$code_ref> is the reference to the subroutine implementing the filtering function. Returns C for invalid values of C<$code_ref> (i.e.: not defined or not a code reference), C<1> otherwise. =over 4 =item Example #1: Undefined code reference my $stat = Statistics::Descriptive::Full->new(); $stat->add_data(1, 2, 3, 4, 5); print $stat->set_outlier_filter(); # => undef =item Example #2: Valid code reference sub outlier_filter { return $_[1] > 1; } my $stat = Statistics::Descriptive::Full->new(); $stat->add_data( 1, 1, 1, 100, 1, ); print $stat->set_outlier_filter( \&outlier_filter ); # => 1 my @filtered_data = $stat->get_data_without_outliers(); # @filtered_data is (1, 1, 1, 1) In this example the series is really simple and the outlier filter function as well. For more complex series the outlier filter function might be more complex (see Grubbs' test for outliers). The outlier filter function will receive as first parameter the Statistics::Descriptive::Full object, as second the value of the candidate outlier. Having the object in the function might be useful for complex filters where statistics property are needed (again see Grubbs' test for outlier). =back =item $stat->set_smoother({ method => 'exponential', coeff => 0, }); Set the method used to smooth the data and the smoothing coefficient. See C for more details. =item $stat->get_smoothed_data(); Returns a copy of the smoothed data array. The smoothing method and coefficient need to be defined (see C), otherwise the function will return an undef value. =item $stat->sort_data(); Sort the stored data and update the mindex and maxdex methods. This method uses perl's internal sort. =item $stat->presorted(1); =item $stat->presorted(); If called with a non-zero argument, this method sets a flag that says the data is already sorted and need not be sorted again. Since some of the methods in this class require sorted data, this saves some time. If you supply sorted data to the object, call this method to prevent the data from being sorted again. The flag is cleared whenever add_data is called. Calling the method without an argument returns the value of the flag. =item $stat->skewness(); Returns the skewness of the data. A value of zero is no skew, negative is a left skewed tail, positive is a right skewed tail. This is consistent with Excel. =item $stat->kurtosis(); Returns the kurtosis of the data. Positive is peaked, negative is flattened. =item $x = $stat->percentile(25); =item ($x, $index) = $stat->percentile(25); Sorts the data and returns the value that corresponds to the percentile as defined in RFC2330: =over 4 =item For example, given the 6 measurements: -2, 7, 7, 4, 18, -5 Then F(-8) = 0, F(-5) = 1/6, F(-5.0001) = 0, F(-4.999) = 1/6, F(7) = 5/6, F(18) = 1, F(239) = 1. Note that we can recover the different measured values and how many times each occurred from F(x) -- no information regarding the range in values is lost. Summarizing measurements using histograms, on the other hand, in general loses information about the different values observed, so the EDF is preferred. Using either the EDF or a histogram, however, we do lose information regarding the order in which the values were observed. Whether this loss is potentially significant will depend on the metric being measured. We will use the term "percentile" to refer to the smallest value of x for which F(x) >= a given percentage. So the 50th percentile of the example above is 4, since F(4) = 3/6 = 50%; the 25th percentile is -2, since F(-5) = 1/6 < 25%, and F(-2) = 2/6 >= 25%; the 100th percentile is 18; and the 0th percentile is -infinity, as is the 15th percentile, which for ease of handling and backward compatibility is returned as undef() by the function. Care must be taken when using percentiles to summarize a sample, because they can lend an unwarranted appearance of more precision than is really available. Any such summary must include the sample size N, because any percentile difference finer than 1/N is below the resolution of the sample. =back (Taken from: I, Section 11.3. Defining Statistical Distributions. RFC2330 is available from: L .) If the percentile method is called in a list context then it will also return the index of the percentile. =item $x = $stat->quantile($Type); Sorts the data and returns estimates of underlying distribution quantiles based on one or two order statistics from the supplied elements. This method use the same algorithm as Excel and R language (quantile B). The generic function quantile produces sample quantiles corresponding to the given probabilities. B<$Type> is an integer value between 0 to 4 : 0 => zero quartile (Q0) : minimal value 1 => first quartile (Q1) : lower quartile = lowest cut off (25%) of data = 25th percentile 2 => second quartile (Q2) : median = it cuts data set in half = 50th percentile 3 => third quartile (Q3) : upper quartile = highest cut off (25%) of data, or lowest 75% = 75th percentile 4 => fourth quartile (Q4) : maximal value Example : my @data = (1..10); my $stat = Statistics::Descriptive::Full->new(); $stat->add_data(@data); print $stat->quantile(0); # => 1 print $stat->quantile(1); # => 3.25 print $stat->quantile(2); # => 5.5 print $stat->quantile(3); # => 7.75 print $stat->quantile(4); # => 10 =item $stat->median(); Sorts the data and returns the median value of the data. =item $stat->harmonic_mean(); Returns the harmonic mean of the data. Since the mean is undefined if any of the data are zero or if the sum of the reciprocals is zero, it will return undef for both of those cases. =item $stat->geometric_mean(); Returns the geometric mean of the data. =item my $mode = $stat->mode(); Returns the mode of the data. The mode is the most commonly occurring datum. See L . If all values occur only once, then mode() will return undef. =item $stat->trimmed_mean(ltrim[,utrim]); C returns the mean with a fraction C of entries at each end dropped. C returns the mean after a fraction C has been removed from the lower end of the data and a fraction C has been removed from the upper end of the data. This method sorts the data before beginning to analyze it. All calls to trimmed_mean() are cached so that they don't have to be calculated a second time. =item $stat->frequency_distribution_ref($num_partitions); =item $stat->frequency_distribution_ref(\@bins); =item $stat->frequency_distribution_ref(); C slices the data into C<$num_partitions> sets (where $num_partitions is greater than 1) and counts the number of items that fall into each partition. It returns a reference to a hash where the keys are the numerical values of the partitions used. The minimum value of the data set is not a key and the maximum value of the data set is always a key. The number of entries for a particular partition key are the number of items which are greater than the previous partition key and less then or equal to the current partition key. As an example, $stat->add_data(1,1.5,2,2.5,3,3.5,4); $f = $stat->frequency_distribution_ref(2); for (sort {$a <=> $b} keys %$f) { print "key = $_, count = $f->{$_}\n"; } prints key = 2.5, count = 4 key = 4, count = 3 since there are four items less than or equal to 2.5, and 3 items greater than 2.5 and less than 4. C provides the bins that are to be used for the distribution. This allows for non-uniform distributions as well as trimmed or sample distributions to be found. C<@bins> must be monotonic and must contain at least one element. Note that unless the set of bins contains the full range of the data, the total counts returned will be less than the sample size. Calling C with no arguments returns the last distribution calculated, if such exists. =item my %hash = $stat->frequency_distribution($partitions); =item my %hash = $stat->frequency_distribution(\@bins); =item my %hash = $stat->frequency_distribution(); Same as C except that it returns the hash clobbered into the return list. Kept for compatibility reasons with previous versions of Statistics::Descriptive and using it is discouraged. =item $stat->least_squares_fit(); =item $stat->least_squares_fit(@x); C performs a least squares fit on the data, assuming a domain of C<@x> or a default of 1..$stat->count(). It returns an array of four elements C<($q, $m, $r, $rms)> where =over 4 =item C<$q and $m> satisfy the equation C($y = $m*$x + $q). =item C<$r> is the Pearson linear correlation cofficient. =item C<$rms> is the root-mean-square error. =back If case of error or division by zero, the empty list is returned. The array that is returned can be "coerced" into a hash structure by doing the following: my %hash = (); @hash{'q', 'm', 'r', 'err'} = $stat->least_squares_fit(); Because calling C with no arguments defaults to using the current range, there is no caching of the results. =back =head1 REPORTING ERRORS I read my email frequently, but since adopting this module I've added 2 children and 1 dog to my family, so please be patient about my response times. When reporting errors, please include the following to help me out: =over 4 =item * Your version of perl. This can be obtained by typing perl C<-v> at the command line. =item * Which version of Statistics::Descriptive you're using. As you can see below, I do make mistakes. Unfortunately for me, right now there are thousands of CD's with the version of this module with the bugs in it. Fortunately for you, I'm a very patient module maintainer. =item * Details about what the error is. Try to narrow down the scope of the problem and send me code that I can run to verify and track it down. =back =head1 AUTHOR Current maintainer: Shlomi Fish, L , C Previously: Colin Kuskie My email address can be found at http://www.perl.com under Who's Who or at: https://metacpan.org/author/COLINK . =head1 CONTRIBUTORS Fabio Ponciroli & Adzuna Ltd. team (outliers handling) =head1 REFERENCES RFC2330, Framework for IP Performance Metrics The Art of Computer Programming, Volume 2, Donald Knuth. Handbook of Mathematica Functions, Milton Abramowitz and Irene Stegun. Probability and Statistics for Engineering and the Sciences, Jay Devore. =head1 COPYRIGHT Copyright (c) 1997,1998 Colin Kuskie. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. Copyright (c) 1998 Andrea Spinelli. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. Copyright (c) 1994,1995 Jason Kastner. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. =head1 LICENSE This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. =head1 AUTHOR Shlomi Fish =head1 COPYRIGHT AND LICENSE This software is copyright (c) 1997 by Jason Kastner, Andrea Spinelli, Colin Kuskie, and others. This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself. =head1 BUGS Please report any bugs or feature requests on the bugtracker website L When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature. =for :stopwords cpan testmatrix url annocpan anno bugtracker rt cpants kwalitee diff irc mailto metadata placeholders metacpan =head1 SUPPORT =head2 Perldoc You can find documentation for this module with the perldoc command. perldoc Statistics::Descriptive =head2 Websites The following websites have more information about this module, and may be of help to you. As always, in addition to those websites please use your favorite search engine to discover more resources. =over 4 =item * MetaCPAN A modern, open-source CPAN search engine, useful to view POD in HTML format. L =item * Search CPAN The default CPAN search engine, useful to view POD in HTML format. L =item * RT: CPAN's Bug Tracker The RT ( Request Tracker ) website is the default bug/issue tracking system for CPAN. L =item * AnnoCPAN The AnnoCPAN is a website that allows community annotations of Perl module documentation. L =item * CPAN Ratings The CPAN Ratings is a website that allows community ratings and reviews of Perl modules. L =item * CPANTS The CPANTS is a website that analyzes the Kwalitee ( code metrics ) of a distribution. L =item * CPAN Testers The CPAN Testers is a network of smoke testers who run automated tests on uploaded CPAN distributions. L =item * CPAN Testers Matrix The CPAN Testers Matrix is a website that provides a visual overview of the test results for a distribution on various Perls/platforms. L =item * CPAN Testers Dependencies The CPAN Testers Dependencies is a website that shows a chart of the test results of all dependencies for a distribution. L =back =head2 Bugs / Feature Requests Please report any bugs or feature requests by email to C, or through the web interface at L. You will be automatically notified of any progress on the request by the system. =head2 Source Code The code is open to the world, and available for you to hack on. Please feel free to browse it and play with it, or whatever. If you want to contribute patches, please send me a diff or prod me to pull from your repository :) L git clone git://github.com/shlomif/perl-Statistics-Descriptive.git =cut median_absolute_deviation.t100644000764000764 66613364403553 24417 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/t#!/usr/bin/perl use strict; use warnings; use Test::More tests => 2; use Statistics::Descriptive; { my $stat = Statistics::Descriptive::Full->new(); $stat->add_data( 1, 1, 1, 2, 2, 2, 2, 4, 7 ); my $mad = $stat->median_absolute_deviation(); # TEST ok (defined($mad), "median_absolute_deviation is not undefined" ); # TEST is($mad, 1, "median_absolute_deviation is correct" ); } bump-version-number.pl100644000764000764 121013364403553 24534 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/scripts#!/usr/bin/perl use strict; use warnings; use File::Find::Object; use IO::All; my $tree = File::Find::Object->new({}, 'lib/'); my $version_n = shift(@ARGV); if (!defined($version_n)) { die "Specify version number as an argument! bump-version-number.pl '0.0.1'"; } while (my $r = $tree->next()) { if ($r =~ m{/\.svn\z}) { $tree->prune(); } elsif ($r =~ m{\.pm\z}) { my @lines = io->file($r)->getlines(); foreach (@lines) { s#(\$VERSION = '|^Version )\d+\.\d+(?:\.\d+)?('|)#$1 . $version_n . $2#e; } io->file($r)->print( @lines ); } } examples000755000764000764 013364403553 20260 5ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702statistical-analysis.pl100644000764000764 402513364403553 25123 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/examples#!/usr/bin/perl # This script analyses two distributions from tab-separated input # and outputs some basic statistcs. # # It was used to analyse the distribution of the number of moves in # the solutions generated by http://fc-solve.shlomifish.org/ . use strict; use warnings; use Statistics::Descriptive; my $num_fields = 2; my @stats = (map { Statistics::Descriptive::Full->new(); } (1 .. $num_fields) ); while(my $line = ) { chomp($line); my @vals = split(/\t/, $line); foreach my $f (0 .. $num_fields-1) { $stats[$f]->add_data($vals[$f]); } } foreach my $f (0 .. $num_fields-1) { my $s = $stats[$f]; print "Field No. $f\n"; print "---------------------------\n"; print "Min: " , $s->min(), "\n"; print "Max: " , $s->max(), "\n"; print "Average: " , $s->mean(), "\n"; print "StdDev: " , $s->standard_deviation(), "\n"; print "Median: " , $s->median(), "\n"; print "\n"; } =head1 COPYRIGHT AND LICENSE Copyright (c) 2009 Shlomi Fish Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. =cut smoother_weightedexponential.t100644000764000764 732613364403553 25231 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/t#!/usr/bin/perl use strict; use warnings; use lib 't/lib'; use Utils qw/is_array_between/; use Test::More tests => 4; use Statistics::Descriptive::Smoother; local $SIG{__WARN__} = sub { }; my @original_data = (1 .. 10); my @original_samples = (100, 50, 100, 50, 100, 50, 100, 50, 100, 50,); { #Test no smoothing my $smoother = Statistics::Descriptive::Smoother->instantiate({ method => 'weightedExponential', coeff => 0, data => \@original_data, samples => \@original_samples, }); my @smoothed_data = $smoother->get_smoothed_data(); # When the smoothing coefficient is 0 the series is not smoothed # TEST is_deeply( \@smoothed_data, \@original_data, 'No smoothing C=0'); } { #Test max smoothing my $smoother = Statistics::Descriptive::Smoother->instantiate({ method => 'weightedExponential', coeff => 1, data => \@original_data, samples => \@original_samples, }); my @smoothed_data = $smoother->get_smoothed_data(); # When the smoothing coefficient is 1 the series is universally equal to the initial unsmoothed value my @expected_values = map { $original_data[0] } 1 .. $smoother->{count}; # TEST is_deeply( \@smoothed_data, \@expected_values, 'Max smoothing C=1'); } { #Test smoothing coeff 0.5 my $smoother = Statistics::Descriptive::Smoother->instantiate({ method => 'weightedExponential', coeff => 0.5, data => \@original_data, samples => \@original_samples, }); my @smoothed_data = $smoother->get_smoothed_data(); my @expected_values = ( 1, 1.33333333333333, 2.24242424242424, 2.85944551901999, 4.0651836704636, 4.75526654493058, 6.03174342835728, 6.7367839208657, 8.02706266125788, 8.73457937329917, ); # TEST is_array_between( \@smoothed_data, \@expected_values, 1E-13, 1E-13, 'Smoothing with C=0.5'); } { #Test different number of samples and data are not allowed my $smoother = Statistics::Descriptive::Smoother->instantiate({ method => 'weightedExponential', coeff => 0, data => [1,2,3,4], samples => [1,2,3], }); # TEST is ( $smoother, undef, 'Different number of samples and data'); } 1; =pod =head1 AUTHOR Fabio Ponciroli =head1 COPYRIGHT Copyright(c) 2012 by Fabio Ponciroli. =head1 LICENSE This file is licensed under the MIT/X11 License: http://www.opensource.org/licenses/mit-license.php. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. =cut freq_distribution-1-rt-34999.t100644000764000764 167513364403553 24257 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/t#!/usr/bin/perl use strict; use warnings; use Test::More tests => 2; use Statistics::Descriptive; my @data=( 601,449,424,568,569,447,425,621,616,573,584,635,480,437,724,711, 717,576,724,585,458,752,753,709,584,748,628,483,739,747,694,601, 758,653,487,720,750,660,588,719,631,492,584,647,548,585,649,532, 492,598,653,524,567,570,506,475,640,725,688,567,634,520,488,718, 769,739,576,718,527,497,698,736,785,581,733,540,537,683,691,785, 588,733,531,564,581,554,765,580,626,510,533,495,470,713,571,573, 476,526,441,431,686,563,496,447,518 ); my $stat = Statistics::Descriptive::Full->new(); $stat->add_data(@data); # I should get 20 partitions, shouldn't I? my %freqs=$stat->frequency_distribution (20); # TEST is (scalar(keys(%freqs)), 20, "We got 20 partitions" ); my $sum = 0; foreach my $v (values(%freqs)) { $sum += $v; } # TEST is ($sum, scalar(@data), "The total number of elements in the bins" ); Descriptive000755000764000764 013364403553 23623 5ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/lib/StatisticsFull.pm100644000764000764 11236713364403553 25275 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/lib/Statistics/Descriptivepackage Statistics::Descriptive::Full; $Statistics::Descriptive::Full::VERSION = '3.0702'; use strict; use warnings; use Carp; use POSIX (); use Statistics::Descriptive::Smoother; use vars qw($a $b %fields); use parent qw(Statistics::Descriptive::Sparse); use List::MoreUtils (); use List::Util (); ##Create a list of fields not to remove when data is updated %fields = ( _permitted => undef, ##Place holder for the inherited key hash data => undef, ##Our data samples => undef, ##Number of samples for each value of the data set presorted => undef, ##Flag to indicate the data is already sorted _reserved => undef, ##Place holder for this lookup hash ); __PACKAGE__->_make_private_accessors( [qw(data samples frequency geometric_mean harmonic_mean least_squares_fit median mode skewness kurtosis median_absolute_deviation ) ] ); __PACKAGE__->_make_accessors([qw(presorted _reserved _trimmed_mean_cache)]); sub _clear_fields { my $self = shift; # Empty array ref for holding data later! $self->_data([]); $self->_samples([]); $self->_reserved(\%fields); $self->presorted(0); $self->_trimmed_mean_cache(+{}); return; } ##Have to override the base method to add the data to the object ##The proxy method from above is still valid sub new { my $proto = shift; my $class = ref($proto) || $proto; # Create my self re SUPER my $self = $class->SUPER::new(); bless ($self, $class); #Re-anneal the object $self->_clear_fields(); return $self; } sub _is_reserved { my $self = shift; my $field = shift; return exists($self->_reserved->{$field}); } sub _delete_all_cached_keys { my $self = shift; my %keys = %{ $self }; # Remove reserved keys for this class from the deletion list delete @keys{keys %{$self->_reserved}}; delete @keys{keys %{$self->_permitted}}; delete $keys{_trimmed_mean_cache}; KEYS_LOOP: foreach my $key (keys %keys) { # Check each key in the object delete $self->{$key}; # Delete any out of date cached key } $self->{_trimmed_mean_cache} = {}; # just reset this one return; } ##Clear a stat. More efficient than destroying an object and calling ##new. sub clear { my $self = shift; ##Myself my $key; if (!$self->count()) { return; } $self->_delete_all_cached_keys(); $self->SUPER::clear(); $self->_clear_fields(); } sub add_data { my $self = shift; ##Myself my $aref; if (ref $_[0] eq 'ARRAY') { $aref = $_[0]; } else { $aref = \@_; } ##If we were given no data, we do nothing. return 1 if (!@{ $aref }); my $oldmean; my ($min, $max, $sum, $sumsq); my $count = $self->count; # $count is modified lower down, but we need this flag after that my $has_existing_data = $count; # Take care of appending to an existing data set if ($has_existing_data) { $min = $self->min(); $max = $self->max(); $sum = $self->sum(); $sumsq = $self->sumsq(); } else { $min = $aref->[0]; $max = $aref->[0]; $sum = 0; $sumsq = 0; } # need to allow for already having data $sum += List::Util::sum (@$aref); $sumsq += List::Util::sum (map {$_ ** 2} @$aref); $max = List::Util::max ($max, @$aref); $min = List::Util::min ($min, @$aref); $count += scalar @$aref; my $mean = $sum / $count; $self->min($min); $self->max($max); $self->sample_range($max - $min); $self->sum($sum); $self->sumsq($sumsq); $self->mean($mean); $self->count($count); ##Variance isn't commonly enough ##used to recompute every single data add, so just clear its cache. $self->_variance(undef); push @{ $self->_data() }, @{ $aref }; # no need to clear keys if we are a newly populated object, # and profiling shows it takes a long time when creating # and populating many stats objects if ($has_existing_data) { ##Clear the presorted flag $self->presorted(0); $self->_delete_all_cached_keys(); } return 1; } sub add_data_with_samples { my ($self,$aref_values) = @_; return 1 if (!@{ $aref_values }); my $aref_data = [map { keys %$_ } @{ $aref_values }]; my $aref_samples = [map { values %$_ } @{ $aref_values }]; $self->add_data($aref_data); push @{ $self->_samples() }, @{ $aref_samples }; return 1; } sub get_data { my $self = shift; return @{ $self->_data() }; } sub get_data_without_outliers { my $self = shift; if ($self->count() < $Statistics::Descriptive::Min_samples_number) { carp("Need at least $Statistics::Descriptive::Min_samples_number samples\n"); return; } if (!defined $self->{_outlier_filter}) { carp("Outliers filter not defined\n"); return; } my $outlier_candidate_index = $self->_outlier_candidate_index; my $possible_outlier = ($self->_data())->[$outlier_candidate_index]; my $is_outlier = $self->{_outlier_filter}->($self, $possible_outlier); return $self->get_data unless $is_outlier; # Removing the outlier from the dataset my @good_indexes = grep { $_ != $outlier_candidate_index } (0 .. $self->count() - 1); my @data = $self->get_data; my @filtered_data = @data[@good_indexes]; return @filtered_data; } sub set_outlier_filter { my ($self, $code_ref) = @_; if (!$code_ref || ref($code_ref) ne "CODE") { carp("Need to pass a code reference"); return; } $self->{_outlier_filter} = $code_ref; return 1; } sub _outlier_candidate_index { my $self = shift; my $mean = $self->mean(); my $outlier_candidate_index = 0; my $max_std_deviation = abs(($self->_data())->[0] - $mean); foreach my $idx (1 .. ($self->count() - 1) ) { my $curr_value = ($self->_data())->[$idx]; if ($max_std_deviation < abs($curr_value - $mean) ) { $outlier_candidate_index = $idx; $max_std_deviation = abs($curr_value - $mean); } } return $outlier_candidate_index; } sub set_smoother { my ($self, $args) = @_; $args->{data} = $self->_data(); $args->{samples} = $self->_samples(); $self->{_smoother} = Statistics::Descriptive::Smoother->instantiate($args); } sub get_smoothed_data { my ($self, $args) = @_; if (!defined $self->{_smoother}) { carp("Smoother object not defined\n"); return; } $self->{_smoother}->get_smoothed_data(); } sub maxdex { my $self = shift; return undef if !$self->count; my $maxdex; if ($self->presorted) { $maxdex = $self->count - 1; } else { my $max = $self->max; $maxdex = List::MoreUtils::first_index {$_ == $max} $self->get_data; } $self->{maxdex} = $maxdex; return $maxdex; } sub mindex { my $self = shift; return undef if !$self->count; #my $maxdex = $self->{maxdex}; #return $maxdex if defined $maxdex; my $mindex; if ($self->presorted) { $mindex = 0; } else { my $min = $self->min; $mindex = List::MoreUtils::first_index {$_ == $min} $self->get_data; } $self->{mindex} = $mindex; return $mindex; } sub sort_data { my $self = shift; if (! $self->presorted()) { ##Sort the data in descending order $self->_data([ sort {$a <=> $b} @{$self->_data()} ]); $self->presorted(1); ##Fix the maxima and minima indices - no, this is unnecessary now we have methods #$self->mindex(0); #$self->maxdex($#{$self->_data()}); } return 1; } sub percentile { my $self = shift; my $percentile = shift || 0; ##Since we're returning a single value there's no real need ##to cache this. ##If the requested percentile is less than the "percentile bin ##size" then return undef. Check description of RFC 2330 in the ##POD below. my $count = $self->count(); if ((! $count) || ($percentile < 100 / $count)) { return; # allow for both scalar and list context } $self->sort_data(); my $num = $count*$percentile/100; my $index = &POSIX::ceil($num) - 1; my $val = $self->_data->[$index]; return wantarray ? ($val, $index) : $val ; } sub _calc_new_median { my $self = shift; my $count = $self->count(); ##Even or odd if ($count % 2) { return $self->_data->[($count-1)/2]; } else { return ( ($self->_data->[($count)/2] + $self->_data->[($count-2)/2] ) / 2 ); } } sub median { my $self = shift; return undef if !$self->count; ##Cached? if (! defined($self->_median())) { $self->sort_data(); $self->_median($self->_calc_new_median()); } return $self->_median(); } sub quantile { my ( $self, $QuantileNumber ) = @_; unless ( defined $QuantileNumber and $QuantileNumber =~ m/^0|1|2|3|4$/ ) { carp("Bad quartile type, must be 0, 1, 2, 3 or 4\n"); return; } # check data count after the args are checked - should help debugging return undef if !$self->count; $self->sort_data(); return $self->_data->[0] if ( $QuantileNumber == 0 ); my $count = $self->count(); return $self->_data->[ $count - 1 ] if ( $QuantileNumber == 4 ); my $K_quantile = ( ( $QuantileNumber / 4 ) * ( $count - 1 ) + 1 ); my $F_quantile = $K_quantile - POSIX::floor($K_quantile); $K_quantile = POSIX::floor($K_quantile); # interpolation my $aK_quantile = $self->_data->[ $K_quantile - 1 ]; return $aK_quantile if ( $F_quantile == 0 ); my $aKPlus_quantile = $self->_data->[$K_quantile]; # Calcul quantile my $quantile = $aK_quantile + ( $F_quantile * ( $aKPlus_quantile - $aK_quantile ) ); return $quantile; } sub _real_calc_trimmed_mean { my $self = shift; my $lower = shift; my $upper = shift; my $lower_trim = int ($self->count()*$lower); my $upper_trim = int ($self->count()*$upper); my ($val,$oldmean) = (0,0); my ($tm_count,$tm_mean,$index) = (0,0,$lower_trim); $self->sort_data(); while ($index <= $self->count() - $upper_trim -1) { $val = $self->_data()->[$index]; $oldmean = $tm_mean; $index++; $tm_count++; $tm_mean += ($val - $oldmean) / $tm_count; } return $tm_mean; } sub trimmed_mean { my $self = shift; my ($lower,$upper); #upper bound is in arg list or is same as lower if (@_ == 1) { ($lower,$upper) = ($_[0],$_[0]); } else { ($lower,$upper) = ($_[0],$_[1]); } # check data count after the args return undef if !$self->count; ##Cache my $thistm = join ':',$lower,$upper; my $cache = $self->_trimmed_mean_cache(); if (!exists($cache->{$thistm})) { $cache->{$thistm} = $self->_real_calc_trimmed_mean($lower, $upper); } return $cache->{$thistm}; } sub _test_for_too_small_val { my $self = shift; my $val = shift; return (abs($val) <= $Statistics::Descriptive::Tolerance); } sub _calc_harmonic_mean { my $self = shift; my $hs = 0; foreach my $item ( @{$self->_data()} ) { ##Guarantee that there are no divide by zeros if ($self->_test_for_too_small_val($item)) { return; } $hs += 1/$item; } if ($self->_test_for_too_small_val($hs)) { return; } return $self->count()/$hs; } sub harmonic_mean { my $self = shift; if (!defined($self->_harmonic_mean())) { $self->_harmonic_mean(scalar($self->_calc_harmonic_mean())); } return $self->_harmonic_mean(); } sub mode { my $self = shift; if (!defined ($self->_mode())) { my $mode = 0; my $occurances = 0; my %count; foreach my $item (@{ $self->_data() }) { my $count = ++$count{$item}; if ($count > $occurances) { $mode = $item; $occurances = $count; } } $self->_mode( ($occurances > 1) ? {exists => 1, mode => $mode} : {exists => 0,} ); } my $m = $self->_mode; return $m->{'exists'} ? $m->{mode} : undef; } sub geometric_mean { my $self = shift; return undef if !$self->count; if (!defined($self->_geometric_mean())) { my $gm = 1; my $exponent = 1/$self->count(); for my $val (@{ $self->_data() }) { if ($val < 0) { return undef; } $gm *= $val**$exponent; } $self->_geometric_mean($gm); } return $self->_geometric_mean(); } sub skewness { my $self = shift; if (!defined($self->_skewness())) { my $n = $self->count(); my $sd = $self->standard_deviation(); my $skew; # skip if insufficient records if ( $sd && $n > 2) { my $mean = $self->mean(); my $sum_pow3; foreach my $rec ( $self->get_data ) { $sum_pow3 += (($rec - $mean) / $sd) ** 3; } my $correction = $n / ( ($n-1) * ($n-2) ); $skew = $correction * $sum_pow3; } $self->_skewness($skew); } return $self->_skewness(); } sub kurtosis { my $self = shift; if (!defined($self->_kurtosis())) { my $kurt; my $n = $self->count(); my $sd = $self->standard_deviation(); if ( $sd && $n > 3) { my $mean = $self->mean(); my $sum_pow4; foreach my $rec ( $self->get_data ) { $sum_pow4 += ( ($rec - $mean ) / $sd ) ** 4; } my $correction1 = ( $n * ($n+1) ) / ( ($n-1) * ($n-2) * ($n-3) ); my $correction2 = ( 3 * ($n-1) ** 2) / ( ($n-2) * ($n-3) ); $kurt = ( $correction1 * $sum_pow4 ) - $correction2; } $self->_kurtosis($kurt); } return $self->_kurtosis(); } sub frequency_distribution_ref { my $self = shift; my @k = (); # Must have at least two elements if ($self->count() < 2) { return undef; } if ((!@_) && (defined $self->_frequency())) { return $self->_frequency() } my %bins; my $partitions = shift; if (ref($partitions) eq 'ARRAY') { @k = @{ $partitions }; return undef unless @k; ##Empty array if (@k > 1) { ##Check for monotonicity my $element = $k[0]; for my $next_elem (@k[1..$#k]) { if ($element > $next_elem) { carp "Non monotonic array cannot be used as frequency bins!\n"; return undef; } $element = $next_elem; } } %bins = map { $_ => 0 } @k; } else { return undef unless $partitions >= 1; my $interval = $self->sample_range() / $partitions; foreach my $idx (1 .. ($partitions-1)) { push @k, ($self->min() + $idx * $interval); } $bins{$self->max()} = 0; push @k, $self->max(); } ELEMENT: foreach my $element (@{$self->_data()}) { foreach my $limit (@k) { if ($element <= $limit) { $bins{$limit}++; next ELEMENT; } } } return $self->_frequency(\%bins); } sub frequency_distribution { my $self = shift; my $ret = $self->frequency_distribution_ref(@_); if (!defined($ret)) { return undef; } else { return %$ret; } } sub least_squares_fit { my $self = shift; return () if $self->count() < 2; ##Sigma sums my ($sigmaxy, $sigmax, $sigmaxx, $sigmayy, $sigmay) = (0,0,0,0,$self->sum); my ($xvar, $yvar, $err); ##Work variables my ($iter,$y,$x,$denom) = (0,0,0,0); my $count = $self->count(); my @x; ##Outputs my ($m, $q, $r, $rms); if (!defined $_[1]) { @x = 1..$self->count(); } else { @x = @_; if ( $self->count() != scalar @x) { carp "Range and domain are of unequal length."; return (); } } foreach $x (@x) { $y = $self->_data->[$iter]; $sigmayy += $y * $y; $sigmaxx += $x * $x; $sigmaxy += $x * $y; $sigmax += $x; $iter++; } $denom = $count * $sigmaxx - $sigmax*$sigmax; return () unless abs( $denom ) > $Statistics::Descriptive::Tolerance; $m = ($count*$sigmaxy - $sigmax*$sigmay) / $denom; $q = ($sigmaxx*$sigmay - $sigmax*$sigmaxy ) / $denom; $xvar = $sigmaxx - $sigmax*$sigmax / $count; $yvar = $sigmayy - $sigmay*$sigmay / $count; $denom = sqrt( $xvar * $yvar ); return () unless (abs( $denom ) > $Statistics::Descriptive::Tolerance); $r = ($sigmaxy - $sigmax*$sigmay / $count )/ $denom; $iter = 0; $rms = 0.0; foreach (@x) { ##Error = Real y - calculated y $err = $self->_data->[$iter] - ( $m * $_ + $q ); $rms += $err*$err; $iter++; } $rms = sqrt($rms / $count); $self->_least_squares_fit([$q, $m, $r, $rms]); return @{ $self->_least_squares_fit() }; } sub median_absolute_deviation { my ($self) = @_; if (!defined($self->_median_absolute_deviation())) { my $stat = $self->new; $stat->add_data(map { abs($_ - $self->median) } $self->get_data); $self->_median_absolute_deviation($stat->median); } return $self->_median_absolute_deviation(); } sub summary { my ($self) = @_; my $FMT = '%.5e'; return sprintf("Min: $FMT\nMax: $FMT\nMean: $FMT\nMedian: $FMT\n" . "1st quantile: $FMT\n3rd quantile: $FMT\n", $self->min, $self->max, $self->mean, $self->median, $self->quantile(1), $self->quantile(3), ); } 1; __END__ =pod =encoding UTF-8 =head1 NAME Statistics::Descriptive - Module of basic descriptive statistical functions. =head1 VERSION version 3.0702 =head1 SYNOPSIS use Statistics::Descriptive; my $stat = Statistics::Descriptive::Full->new(); $stat->add_data(1,2,3,4); my $mean = $stat->mean(); my $var = $stat->variance(); my $tm = $stat->trimmed_mean(.25); $Statistics::Descriptive::Tolerance = 1e-10; =head1 DESCRIPTION This module provides basic functions used in descriptive statistics. It has an object oriented design and supports two different types of data storage and calculation objects: sparse and full. With the sparse method, none of the data is stored and only a few statistical measures are available. Using the full method, the entire data set is retained and additional functions are available. Whenever a division by zero may occur, the denominator is checked to be greater than the value C<$Statistics::Descriptive::Tolerance>, which defaults to 0.0. You may want to change this value to some small positive value such as 1e-24 in order to obtain error messages in case of very small denominators. Many of the methods (both Sparse and Full) cache values so that subsequent calls with the same arguments are faster. =head1 VERSION version 3.0702 =head1 METHODS =head2 Sparse Methods =over 5 =item $stat = Statistics::Descriptive::Sparse->new(); Create a new sparse statistics object. =item $stat->clear(); Effectively the same as my $class = ref($stat); undef $stat; $stat = new $class; except more efficient. =item $stat->add_data(1,2,3); Adds data to the statistics variable. The cached statistical values are updated automatically. =item $stat->count(); Returns the number of data items. =item $stat->mean(); Returns the mean of the data. =item $stat->sum(); Returns the sum of the data. =item $stat->variance(); Returns the variance of the data. Division by n-1 is used. =item $stat->standard_deviation(); Returns the standard deviation of the data. Division by n-1 is used. =item $stat->min(); Returns the minimum value of the data set. =item $stat->mindex(); Returns the index of the minimum value of the data set. =item $stat->max(); Returns the maximum value of the data set. =item $stat->maxdex(); Returns the index of the maximum value of the data set. =item $stat->sample_range(); Returns the sample range (max - min) of the data set. =back =head2 Full Methods Similar to the Sparse Methods above, any Full Method that is called caches the current result so that it doesn't have to be recalculated. In some cases, several values can be cached at the same time. =over 5 =item $stat = Statistics::Descriptive::Full->new(); Create a new statistics object that inherits from Statistics::Descriptive::Sparse so that it contains all the methods described above. =item $stat->add_data(1,2,4,5); Adds data to the statistics variable. All of the sparse statistical values are updated and cached. Cached values from Full methods are deleted since they are no longer valid. I =item $stat->add_data_with_samples([{1 => 10}, {2 => 20}, {3 => 30},]); Add data to the statistics variable and set the number of samples each value has been built with. The data is the key of each element of the input array ref, while the value is the number of samples: [{data1 => smaples1}, {data2 => samples2}, ...]. B The number of samples is only used by the smoothing function and is ignored otherwise. It is not equivalent to repeat count. In order to repeat a certain datum more than one time call add_data() like this: my $value = 5; my $repeat_count = 10; $stat->add_data( [ ($value) x $repeat_count ] ); =item $stat->get_data(); Returns a copy of the data array. =item $stat->get_data_without_outliers(); Returns a copy of the data array without outliers. The number minimum of samples to apply the outlier filtering is C<$Statistics::Descriptive::Min_samples_number>, 4 by default. A function to detect outliers need to be defined (see C), otherwise the function will return an undef value. The filtering will act only on the most extreme value of the data set (i.e.: value with the highest absolute standard deviation from the mean). If there is the need to remove more than one outlier, the filtering need to be re-run for the next most extreme value with the initial outlier removed. This is not always needed since the test (for example Grubb's test) usually can only detect the most exreme value. If there is more than one extreme case in a set, then the standard deviation will be high enough to make neither case an outlier. =item $stat->set_outlier_filter($code_ref); Set the function to filter out the outlier. C<$code_ref> is the reference to the subroutine implementing the filtering function. Returns C for invalid values of C<$code_ref> (i.e.: not defined or not a code reference), C<1> otherwise. =over 4 =item Example #1: Undefined code reference my $stat = Statistics::Descriptive::Full->new(); $stat->add_data(1, 2, 3, 4, 5); print $stat->set_outlier_filter(); # => undef =item Example #2: Valid code reference sub outlier_filter { return $_[1] > 1; } my $stat = Statistics::Descriptive::Full->new(); $stat->add_data( 1, 1, 1, 100, 1, ); print $stat->set_outlier_filter( \&outlier_filter ); # => 1 my @filtered_data = $stat->get_data_without_outliers(); # @filtered_data is (1, 1, 1, 1) In this example the series is really simple and the outlier filter function as well. For more complex series the outlier filter function might be more complex (see Grubbs' test for outliers). The outlier filter function will receive as first parameter the Statistics::Descriptive::Full object, as second the value of the candidate outlier. Having the object in the function might be useful for complex filters where statistics property are needed (again see Grubbs' test for outlier). =back =item $stat->set_smoother({ method => 'exponential', coeff => 0, }); Set the method used to smooth the data and the smoothing coefficient. See C for more details. =item $stat->get_smoothed_data(); Returns a copy of the smoothed data array. The smoothing method and coefficient need to be defined (see C), otherwise the function will return an undef value. =item $stat->sort_data(); Sort the stored data and update the mindex and maxdex methods. This method uses perl's internal sort. =item $stat->presorted(1); =item $stat->presorted(); If called with a non-zero argument, this method sets a flag that says the data is already sorted and need not be sorted again. Since some of the methods in this class require sorted data, this saves some time. If you supply sorted data to the object, call this method to prevent the data from being sorted again. The flag is cleared whenever add_data is called. Calling the method without an argument returns the value of the flag. =item $stat->skewness(); Returns the skewness of the data. A value of zero is no skew, negative is a left skewed tail, positive is a right skewed tail. This is consistent with Excel. =item $stat->kurtosis(); Returns the kurtosis of the data. Positive is peaked, negative is flattened. =item $x = $stat->percentile(25); =item ($x, $index) = $stat->percentile(25); Sorts the data and returns the value that corresponds to the percentile as defined in RFC2330: =over 4 =item For example, given the 6 measurements: -2, 7, 7, 4, 18, -5 Then F(-8) = 0, F(-5) = 1/6, F(-5.0001) = 0, F(-4.999) = 1/6, F(7) = 5/6, F(18) = 1, F(239) = 1. Note that we can recover the different measured values and how many times each occurred from F(x) -- no information regarding the range in values is lost. Summarizing measurements using histograms, on the other hand, in general loses information about the different values observed, so the EDF is preferred. Using either the EDF or a histogram, however, we do lose information regarding the order in which the values were observed. Whether this loss is potentially significant will depend on the metric being measured. We will use the term "percentile" to refer to the smallest value of x for which F(x) >= a given percentage. So the 50th percentile of the example above is 4, since F(4) = 3/6 = 50%; the 25th percentile is -2, since F(-5) = 1/6 < 25%, and F(-2) = 2/6 >= 25%; the 100th percentile is 18; and the 0th percentile is -infinity, as is the 15th percentile, which for ease of handling and backward compatibility is returned as undef() by the function. Care must be taken when using percentiles to summarize a sample, because they can lend an unwarranted appearance of more precision than is really available. Any such summary must include the sample size N, because any percentile difference finer than 1/N is below the resolution of the sample. =back (Taken from: I, Section 11.3. Defining Statistical Distributions. RFC2330 is available from: L .) If the percentile method is called in a list context then it will also return the index of the percentile. =item $x = $stat->quantile($Type); Sorts the data and returns estimates of underlying distribution quantiles based on one or two order statistics from the supplied elements. This method use the same algorithm as Excel and R language (quantile B). The generic function quantile produces sample quantiles corresponding to the given probabilities. B<$Type> is an integer value between 0 to 4 : 0 => zero quartile (Q0) : minimal value 1 => first quartile (Q1) : lower quartile = lowest cut off (25%) of data = 25th percentile 2 => second quartile (Q2) : median = it cuts data set in half = 50th percentile 3 => third quartile (Q3) : upper quartile = highest cut off (25%) of data, or lowest 75% = 75th percentile 4 => fourth quartile (Q4) : maximal value Example : my @data = (1..10); my $stat = Statistics::Descriptive::Full->new(); $stat->add_data(@data); print $stat->quantile(0); # => 1 print $stat->quantile(1); # => 3.25 print $stat->quantile(2); # => 5.5 print $stat->quantile(3); # => 7.75 print $stat->quantile(4); # => 10 =item $stat->median(); Sorts the data and returns the median value of the data. =item $stat->harmonic_mean(); Returns the harmonic mean of the data. Since the mean is undefined if any of the data are zero or if the sum of the reciprocals is zero, it will return undef for both of those cases. =item $stat->geometric_mean(); Returns the geometric mean of the data. =item my $mode = $stat->mode(); Returns the mode of the data. The mode is the most commonly occurring datum. See L . If all values occur only once, then mode() will return undef. =item $stat->trimmed_mean(ltrim[,utrim]); C returns the mean with a fraction C of entries at each end dropped. C returns the mean after a fraction C has been removed from the lower end of the data and a fraction C has been removed from the upper end of the data. This method sorts the data before beginning to analyze it. All calls to trimmed_mean() are cached so that they don't have to be calculated a second time. =item $stat->frequency_distribution_ref($partitions); =item $stat->frequency_distribution_ref(\@bins); =item $stat->frequency_distribution_ref(); C slices the data into C<$partition> sets (where $partition is greater than 1) and counts the number of items that fall into each partition. It returns a reference to a hash where the keys are the numerical values of the partitions used. The minimum value of the data set is not a key and the maximum value of the data set is always a key. The number of entries for a particular partition key are the number of items which are greater than the previous partition key and less then or equal to the current partition key. As an example, $stat->add_data(1,1.5,2,2.5,3,3.5,4); $f = $stat->frequency_distribution_ref(2); for (sort {$a <=> $b} keys %$f) { print "key = $_, count = $f->{$_}\n"; } prints key = 2.5, count = 4 key = 4, count = 3 since there are four items less than or equal to 2.5, and 3 items greater than 2.5 and less than 4. C provides the bins that are to be used for the distribution. This allows for non-uniform distributions as well as trimmed or sample distributions to be found. C<@bins> must be monotonic and contain at least one element. Note that unless the set of bins contains the range that the total counts returned will be less than the sample size. Calling C with no arguments returns the last distribution calculated, if such exists. =item my %hash = $stat->frequency_distribution($partitions); =item my %hash = $stat->frequency_distribution(\@bins); =item my %hash = $stat->frequency_distribution(); Same as C except that returns the hash clobbered into the return list. Kept for compatibility reasons with previous versions of Statistics::Descriptive and using it is discouraged. =item $stat->median_absolute_deviation() The median absolute deviation. =item $stat->summary() Returns a textual summary of the distribution - min, max, median, mean and quantiles. (New in version 3.0700 .) =item $stat->least_squares_fit(); =item $stat->least_squares_fit(@x); C performs a least squares fit on the data, assuming a domain of C<@x> or a default of 1..$stat->count(). It returns an array of four elements C<($q, $m, $r, $rms)> where =over 4 =item C<$q and $m> satisfy the equation C($y = $m*$x + $q). =item C<$r> is the Pearson linear correlation cofficient. =item C<$rms> is the root-mean-square error. =back If case of error or division by zero, the empty list is returned. The array that is returned can be "coerced" into a hash structure by doing the following: my %hash = (); @hash{'q', 'm', 'r', 'err'} = $stat->least_squares_fit(); Because calling C with no arguments defaults to using the current range, there is no caching of the results. =back =head1 REPORTING ERRORS I read my email frequently, but since adopting this module I've added 2 children and 1 dog to my family, so please be patient about my response times. When reporting errors, please include the following to help me out: =over 4 =item * Your version of perl. This can be obtained by typing perl C<-v> at the command line. =item * Which version of Statistics::Descriptive you're using. As you can see below, I do make mistakes. Unfortunately for me, right now there are thousands of CD's with the version of this module with the bugs in it. Fortunately for you, I'm a very patient module maintainer. =item * Details about what the error is. Try to narrow down the scope of the problem and send me code that I can run to verify and track it down. =back =head1 AUTHOR Current maintainer: Shlomi Fish, L , C Previously: Colin Kuskie My email address can be found at http://www.perl.com under Who's Who or at: https://metacpan.org/author/COLINK . =head1 CONTRIBUTORS Fabio Ponciroli & Adzuna Ltd. team (outliers handling) =head1 REFERENCES RFC2330, Framework for IP Performance Metrics The Art of Computer Programming, Volume 2, Donald Knuth. Handbook of Mathematica Functions, Milton Abramowitz and Irene Stegun. Probability and Statistics for Engineering and the Sciences, Jay Devore. =head1 COPYRIGHT Copyright (c) 1997,1998 Colin Kuskie. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. Copyright (c) 1998 Andrea Spinelli. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. Copyright (c) 1994,1995 Jason Kastner. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. =head1 LICENSE This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. =head1 AUTHOR Shlomi Fish =head1 COPYRIGHT AND LICENSE This software is copyright (c) 1997 by Jason Kastner, Andrea Spinelli, Colin Kuskie, and others. This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself. =head1 BUGS Please report any bugs or feature requests on the bugtracker website L When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature. =for :stopwords cpan testmatrix url annocpan anno bugtracker rt cpants kwalitee diff irc mailto metadata placeholders metacpan =head1 SUPPORT =head2 Perldoc You can find documentation for this module with the perldoc command. perldoc Statistics::Descriptive::Full =head2 Websites The following websites have more information about this module, and may be of help to you. As always, in addition to those websites please use your favorite search engine to discover more resources. =over 4 =item * MetaCPAN A modern, open-source CPAN search engine, useful to view POD in HTML format. L =item * Search CPAN The default CPAN search engine, useful to view POD in HTML format. L =item * RT: CPAN's Bug Tracker The RT ( Request Tracker ) website is the default bug/issue tracking system for CPAN. L =item * AnnoCPAN The AnnoCPAN is a website that allows community annotations of Perl module documentation. L =item * CPAN Ratings The CPAN Ratings is a website that allows community ratings and reviews of Perl modules. L =item * CPANTS The CPANTS is a website that analyzes the Kwalitee ( code metrics ) of a distribution. L =item * CPAN Testers The CPAN Testers is a network of smoke testers who run automated tests on uploaded CPAN distributions. L =item * CPAN Testers Matrix The CPAN Testers Matrix is a website that provides a visual overview of the test results for a distribution on various Perls/platforms. L =item * CPAN Testers Dependencies The CPAN Testers Dependencies is a website that shows a chart of the test results of all dependencies for a distribution. L =back =head2 Bugs / Feature Requests Please report any bugs or feature requests by email to C, or through the web interface at L. You will be automatically notified of any progress on the request by the system. =head2 Source Code The code is open to the world, and available for you to hack on. Please feel free to browse it and play with it, or whatever. If you want to contribute patches, please send me a diff or prod me to pull from your repository :) L git clone git://github.com/shlomif/perl-Statistics-Descriptive.git =cut Sparse.pm100644000764000764 6003613364403553 25603 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/lib/Statistics/Descriptivepackage Statistics::Descriptive::Sparse; $Statistics::Descriptive::Sparse::VERSION = '3.0702'; use strict; use warnings; use vars qw(%fields); use Carp; require Statistics::Descriptive; use Statistics::Descriptive::Smoother; sub _make_accessors { my ($pkg, $methods) = @_; no strict 'refs'; foreach my $method (@$methods) { *{$pkg."::".$method} = do { my $m = $method; sub { my $self = shift; if (@_) { $self->{$m} = shift; } return $self->{$m}; }; }; } return; } sub _make_private_accessors { my ($pkg, $methods) = @_; no strict 'refs'; foreach my $method (@$methods) { *{$pkg."::_".$method} = do { my $m = $method; sub { my $self = shift; if (@_) { $self->{$m} = shift; } return $self->{$m}; }; }; } return; } ##Define the fields to be used as methods %fields = ( count => 0, mean => undef, sum => undef, sumsq => undef, min => undef, max => undef, mindex => undef, maxdex => undef, sample_range => undef, variance => undef, ); __PACKAGE__->_make_accessors( [ grep { $_ ne "variance" } keys(%fields) ] ); __PACKAGE__->_make_accessors( ["_permitted"] ); __PACKAGE__->_make_private_accessors(["variance"]); sub new { my $proto = shift; my $class = ref($proto) || $proto; my $self = { %fields, }; bless ($self, $class); $self->_permitted(\%fields); return $self; } sub _is_permitted { my $self = shift; my $key = shift; return exists($self->_permitted()->{$key}); } sub add_data { my $self = shift; ##Myself my $oldmean; my ($min,$mindex,$max,$maxdex,$sum,$sumsq,$count); my $aref; if (ref $_[0] eq 'ARRAY') { $aref = $_[0]; } else { $aref = \@_; } ##If we were given no data, we do nothing. return 1 if (!@{ $aref }); ##Take care of appending to an existing data set if (!defined($min = $self->min())) { $min = $aref->[$mindex = 0]; } else { $mindex = $self->mindex(); } if (!defined($max = $self->max())) { $max = $aref->[$maxdex = 0]; } else { $maxdex = $self->maxdex(); } $sum = $self->sum(); $sumsq = $self->sumsq(); $count = $self->count(); ##Calculate new mean, sumsq, min and max; foreach ( @{ $aref } ) { $sum += $_; $sumsq += $_**2; $count++; if ($_ >= $max) { $max = $_; $maxdex = $count-1; } if ($_ <= $min) { $min = $_; $mindex = $count-1; } } $self->min($min); $self->mindex($mindex); $self->max($max); $self->maxdex($maxdex); $self->sample_range($max - $min); $self->sum($sum); $self->sumsq($sumsq); $self->mean($sum / $count); $self->count($count); ##indicator the value is not cached. Variance isn't commonly enough ##used to recompute every single data add. $self->_variance(undef); return 1; } sub standard_deviation { my $self = shift; ##Myself return undef if (!$self->count()); return sqrt($self->variance()); } ##Return variance; if needed, compute and cache it. sub variance { my $self = shift; ##Myself my $count = $self->count(); return undef if !$count; return 0 if $count == 1; if (!defined($self->_variance())) { my $variance = ($self->sumsq()- $count * $self->mean()**2); # Sometimes due to rounding errors we get a number below 0. # This makes sure this is handled as gracefully as possible. # # See: # # https://rt.cpan.org/Public/Bug/Display.html?id=46026 $variance = $variance < 0 ? 0 : $variance / ($count - 1); $self->_variance($variance); # Return now to avoid re-entering this sub # (and therefore save time when many objects are used). return $variance; } return $self->_variance(); } ##Clear a stat. More efficient than destroying an object and calling ##new. sub clear { my $self = shift; ##Myself my $key; return if (!$self->count()); while (my($field, $value) = each %fields) { # could use a slice assignment here $self->{$field} = $value; } } 1; __END__ =pod =encoding UTF-8 =head1 NAME Statistics::Descriptive - Module of basic descriptive statistical functions. =head1 VERSION version 3.0702 =head1 SYNOPSIS use Statistics::Descriptive; my $stat = Statistics::Descriptive::Full->new(); $stat->add_data(1,2,3,4); my $mean = $stat->mean(); my $var = $stat->variance(); my $tm = $stat->trimmed_mean(.25); $Statistics::Descriptive::Tolerance = 1e-10; =head1 DESCRIPTION This module provides basic functions used in descriptive statistics. It has an object oriented design and supports two different types of data storage and calculation objects: sparse and full. With the sparse method, none of the data is stored and only a few statistical measures are available. Using the full method, the entire data set is retained and additional functions are available. Whenever a division by zero may occur, the denominator is checked to be greater than the value C<$Statistics::Descriptive::Tolerance>, which defaults to 0.0. You may want to change this value to some small positive value such as 1e-24 in order to obtain error messages in case of very small denominators. Many of the methods (both Sparse and Full) cache values so that subsequent calls with the same arguments are faster. =head1 VERSION version 3.0702 =head1 METHODS =head2 Sparse Methods =over 5 =item $stat = Statistics::Descriptive::Sparse->new(); Create a new sparse statistics object. =item $stat->clear(); Effectively the same as my $class = ref($stat); undef $stat; $stat = new $class; except more efficient. =item $stat->add_data(1,2,3); Adds data to the statistics variable. The cached statistical values are updated automatically. =item $stat->count(); Returns the number of data items. =item $stat->mean(); Returns the mean of the data. =item $stat->sum(); Returns the sum of the data. =item $stat->variance(); Returns the variance of the data. Division by n-1 is used. =item $stat->standard_deviation(); Returns the standard deviation of the data. Division by n-1 is used. =item $stat->min(); Returns the minimum value of the data set. =item $stat->mindex(); Returns the index of the minimum value of the data set. =item $stat->max(); Returns the maximum value of the data set. =item $stat->maxdex(); Returns the index of the maximum value of the data set. =item $stat->sample_range(); Returns the sample range (max - min) of the data set. =back =head2 Full Methods Similar to the Sparse Methods above, any Full Method that is called caches the current result so that it doesn't have to be recalculated. In some cases, several values can be cached at the same time. =over 5 =item $stat = Statistics::Descriptive::Full->new(); Create a new statistics object that inherits from Statistics::Descriptive::Sparse so that it contains all the methods described above. =item $stat->add_data(1,2,4,5); Adds data to the statistics variable. All of the sparse statistical values are updated and cached. Cached values from Full methods are deleted since they are no longer valid. I =item $stat->add_data_with_samples([{1 => 10}, {2 => 20}, {3 => 30},]); Add data to the statistics variable and set the number of samples each value has been built with. The data is the key of each element of the input array ref, while the value is the number of samples: [{data1 => smaples1}, {data2 => samples2}, ...]. B The number of samples is only used by the smoothing function and is ignored otherwise. It is not equivalent to repeat count. In order to repeat a certain datum more than one time call add_data() like this: my $value = 5; my $repeat_count = 10; $stat->add_data( [ ($value) x $repeat_count ] ); =item $stat->get_data(); Returns a copy of the data array. =item $stat->get_data_without_outliers(); Returns a copy of the data array without outliers. The number minimum of samples to apply the outlier filtering is C<$Statistics::Descriptive::Min_samples_number>, 4 by default. A function to detect outliers need to be defined (see C), otherwise the function will return an undef value. The filtering will act only on the most extreme value of the data set (i.e.: value with the highest absolute standard deviation from the mean). If there is the need to remove more than one outlier, the filtering need to be re-run for the next most extreme value with the initial outlier removed. This is not always needed since the test (for example Grubb's test) usually can only detect the most exreme value. If there is more than one extreme case in a set, then the standard deviation will be high enough to make neither case an outlier. =item $stat->set_outlier_filter($code_ref); Set the function to filter out the outlier. C<$code_ref> is the reference to the subroutine implementing the filtering function. Returns C for invalid values of C<$code_ref> (i.e.: not defined or not a code reference), C<1> otherwise. =over 4 =item Example #1: Undefined code reference my $stat = Statistics::Descriptive::Full->new(); $stat->add_data(1, 2, 3, 4, 5); print $stat->set_outlier_filter(); # => undef =item Example #2: Valid code reference sub outlier_filter { return $_[1] > 1; } my $stat = Statistics::Descriptive::Full->new(); $stat->add_data( 1, 1, 1, 100, 1, ); print $stat->set_outlier_filter( \&outlier_filter ); # => 1 my @filtered_data = $stat->get_data_without_outliers(); # @filtered_data is (1, 1, 1, 1) In this example the series is really simple and the outlier filter function as well. For more complex series the outlier filter function might be more complex (see Grubbs' test for outliers). The outlier filter function will receive as first parameter the Statistics::Descriptive::Full object, as second the value of the candidate outlier. Having the object in the function might be useful for complex filters where statistics property are needed (again see Grubbs' test for outlier). =back =item $stat->set_smoother({ method => 'exponential', coeff => 0, }); Set the method used to smooth the data and the smoothing coefficient. See C for more details. =item $stat->get_smoothed_data(); Returns a copy of the smoothed data array. The smoothing method and coefficient need to be defined (see C), otherwise the function will return an undef value. =item $stat->sort_data(); Sort the stored data and update the mindex and maxdex methods. This method uses perl's internal sort. =item $stat->presorted(1); =item $stat->presorted(); If called with a non-zero argument, this method sets a flag that says the data is already sorted and need not be sorted again. Since some of the methods in this class require sorted data, this saves some time. If you supply sorted data to the object, call this method to prevent the data from being sorted again. The flag is cleared whenever add_data is called. Calling the method without an argument returns the value of the flag. =item $stat->skewness(); Returns the skewness of the data. A value of zero is no skew, negative is a left skewed tail, positive is a right skewed tail. This is consistent with Excel. =item $stat->kurtosis(); Returns the kurtosis of the data. Positive is peaked, negative is flattened. =item $x = $stat->percentile(25); =item ($x, $index) = $stat->percentile(25); Sorts the data and returns the value that corresponds to the percentile as defined in RFC2330: =over 4 =item For example, given the 6 measurements: -2, 7, 7, 4, 18, -5 Then F(-8) = 0, F(-5) = 1/6, F(-5.0001) = 0, F(-4.999) = 1/6, F(7) = 5/6, F(18) = 1, F(239) = 1. Note that we can recover the different measured values and how many times each occurred from F(x) -- no information regarding the range in values is lost. Summarizing measurements using histograms, on the other hand, in general loses information about the different values observed, so the EDF is preferred. Using either the EDF or a histogram, however, we do lose information regarding the order in which the values were observed. Whether this loss is potentially significant will depend on the metric being measured. We will use the term "percentile" to refer to the smallest value of x for which F(x) >= a given percentage. So the 50th percentile of the example above is 4, since F(4) = 3/6 = 50%; the 25th percentile is -2, since F(-5) = 1/6 < 25%, and F(-2) = 2/6 >= 25%; the 100th percentile is 18; and the 0th percentile is -infinity, as is the 15th percentile, which for ease of handling and backward compatibility is returned as undef() by the function. Care must be taken when using percentiles to summarize a sample, because they can lend an unwarranted appearance of more precision than is really available. Any such summary must include the sample size N, because any percentile difference finer than 1/N is below the resolution of the sample. =back (Taken from: I, Section 11.3. Defining Statistical Distributions. RFC2330 is available from: L .) If the percentile method is called in a list context then it will also return the index of the percentile. =item $x = $stat->quantile($Type); Sorts the data and returns estimates of underlying distribution quantiles based on one or two order statistics from the supplied elements. This method use the same algorithm as Excel and R language (quantile B). The generic function quantile produces sample quantiles corresponding to the given probabilities. B<$Type> is an integer value between 0 to 4 : 0 => zero quartile (Q0) : minimal value 1 => first quartile (Q1) : lower quartile = lowest cut off (25%) of data = 25th percentile 2 => second quartile (Q2) : median = it cuts data set in half = 50th percentile 3 => third quartile (Q3) : upper quartile = highest cut off (25%) of data, or lowest 75% = 75th percentile 4 => fourth quartile (Q4) : maximal value Example : my @data = (1..10); my $stat = Statistics::Descriptive::Full->new(); $stat->add_data(@data); print $stat->quantile(0); # => 1 print $stat->quantile(1); # => 3.25 print $stat->quantile(2); # => 5.5 print $stat->quantile(3); # => 7.75 print $stat->quantile(4); # => 10 =item $stat->median(); Sorts the data and returns the median value of the data. =item $stat->harmonic_mean(); Returns the harmonic mean of the data. Since the mean is undefined if any of the data are zero or if the sum of the reciprocals is zero, it will return undef for both of those cases. =item $stat->geometric_mean(); Returns the geometric mean of the data. =item my $mode = $stat->mode(); Returns the mode of the data. The mode is the most commonly occurring datum. See L . If all values occur only once, then mode() will return undef. =item $stat->sumsq() The sum of squares. =item $stat->trimmed_mean(ltrim[,utrim]); C returns the mean with a fraction C of entries at each end dropped. C returns the mean after a fraction C has been removed from the lower end of the data and a fraction C has been removed from the upper end of the data. This method sorts the data before beginning to analyze it. All calls to trimmed_mean() are cached so that they don't have to be calculated a second time. =item $stat->frequency_distribution_ref($partitions); =item $stat->frequency_distribution_ref(\@bins); =item $stat->frequency_distribution_ref(); C slices the data into C<$partition> sets (where $partition is greater than 1) and counts the number of items that fall into each partition. It returns a reference to a hash where the keys are the numerical values of the partitions used. The minimum value of the data set is not a key and the maximum value of the data set is always a key. The number of entries for a particular partition key are the number of items which are greater than the previous partition key and less then or equal to the current partition key. As an example, $stat->add_data(1,1.5,2,2.5,3,3.5,4); $f = $stat->frequency_distribution_ref(2); for (sort {$a <=> $b} keys %$f) { print "key = $_, count = $f->{$_}\n"; } prints key = 2.5, count = 4 key = 4, count = 3 since there are four items less than or equal to 2.5, and 3 items greater than 2.5 and less than 4. C provides the bins that are to be used for the distribution. This allows for non-uniform distributions as well as trimmed or sample distributions to be found. C<@bins> must be monotonic and contain at least one element. Note that unless the set of bins contains the range that the total counts returned will be less than the sample size. Calling C with no arguments returns the last distribution calculated, if such exists. =item my %hash = $stat->frequency_distribution($partitions); =item my %hash = $stat->frequency_distribution(\@bins); =item my %hash = $stat->frequency_distribution(); Same as C except that returns the hash clobbered into the return list. Kept for compatibility reasons with previous versions of Statistics::Descriptive and using it is discouraged. =item $stat->least_squares_fit(); =item $stat->least_squares_fit(@x); C performs a least squares fit on the data, assuming a domain of C<@x> or a default of 1..$stat->count(). It returns an array of four elements C<($q, $m, $r, $rms)> where =over 4 =item C<$q and $m> satisfy the equation C($y = $m*$x + $q). =item C<$r> is the Pearson linear correlation cofficient. =item C<$rms> is the root-mean-square error. =back If case of error or division by zero, the empty list is returned. The array that is returned can be "coerced" into a hash structure by doing the following: my %hash = (); @hash{'q', 'm', 'r', 'err'} = $stat->least_squares_fit(); Because calling C with no arguments defaults to using the current range, there is no caching of the results. =back =head1 REPORTING ERRORS I read my email frequently, but since adopting this module I've added 2 children and 1 dog to my family, so please be patient about my response times. When reporting errors, please include the following to help me out: =over 4 =item * Your version of perl. This can be obtained by typing perl C<-v> at the command line. =item * Which version of Statistics::Descriptive you're using. As you can see below, I do make mistakes. Unfortunately for me, right now there are thousands of CD's with the version of this module with the bugs in it. Fortunately for you, I'm a very patient module maintainer. =item * Details about what the error is. Try to narrow down the scope of the problem and send me code that I can run to verify and track it down. =back =head1 AUTHOR Current maintainer: Shlomi Fish, L , C Previously: Colin Kuskie My email address can be found at http://www.perl.com under Who's Who or at: https://metacpan.org/author/COLINK . =head1 CONTRIBUTORS Fabio Ponciroli & Adzuna Ltd. team (outliers handling) =head1 REFERENCES RFC2330, Framework for IP Performance Metrics The Art of Computer Programming, Volume 2, Donald Knuth. Handbook of Mathematica Functions, Milton Abramowitz and Irene Stegun. Probability and Statistics for Engineering and the Sciences, Jay Devore. =head1 COPYRIGHT Copyright (c) 1997,1998 Colin Kuskie. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. Copyright (c) 1998 Andrea Spinelli. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. Copyright (c) 1994,1995 Jason Kastner. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. =head1 LICENSE This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. =head1 AUTHOR Shlomi Fish =head1 COPYRIGHT AND LICENSE This software is copyright (c) 1997 by Jason Kastner, Andrea Spinelli, Colin Kuskie, and others. This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself. =head1 BUGS Please report any bugs or feature requests on the bugtracker website L When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature. =for :stopwords cpan testmatrix url annocpan anno bugtracker rt cpants kwalitee diff irc mailto metadata placeholders metacpan =head1 SUPPORT =head2 Perldoc You can find documentation for this module with the perldoc command. perldoc Statistics::Descriptive::Sparse =head2 Websites The following websites have more information about this module, and may be of help to you. As always, in addition to those websites please use your favorite search engine to discover more resources. =over 4 =item * MetaCPAN A modern, open-source CPAN search engine, useful to view POD in HTML format. L =item * Search CPAN The default CPAN search engine, useful to view POD in HTML format. L =item * RT: CPAN's Bug Tracker The RT ( Request Tracker ) website is the default bug/issue tracking system for CPAN. L =item * AnnoCPAN The AnnoCPAN is a website that allows community annotations of Perl module documentation. L =item * CPAN Ratings The CPAN Ratings is a website that allows community ratings and reviews of Perl modules. L =item * CPANTS The CPANTS is a website that analyzes the Kwalitee ( code metrics ) of a distribution. L =item * CPAN Testers The CPAN Testers is a network of smoke testers who run automated tests on uploaded CPAN distributions. L =item * CPAN Testers Matrix The CPAN Testers Matrix is a website that provides a visual overview of the test results for a distribution on various Perls/platforms. L =item * CPAN Testers Dependencies The CPAN Testers Dependencies is a website that shows a chart of the test results of all dependencies for a distribution. L =back =head2 Bugs / Feature Requests Please report any bugs or feature requests by email to C, or through the web interface at L. You will be automatically notified of any progress on the request by the system. =head2 Source Code The code is open to the world, and available for you to hack on. Please feel free to browse it and play with it, or whatever. If you want to contribute patches, please send me a diff or prod me to pull from your repository :) L git clone git://github.com/shlomif/perl-Statistics-Descriptive.git =cut Smoother.pm100644000764000764 1726413364403553 26153 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/lib/Statistics/Descriptivepackage Statistics::Descriptive::Smoother; use strict; use warnings; use Carp; our $VERSION = '3.0702'; sub instantiate { my ($class, $args) = @_; my $method = delete $args->{method}; my $coeff = delete $args->{coeff} || 0; my $ra_samples = delete $args->{samples}; my $ra_data = delete $args->{data}; if ($coeff < 0 || $coeff > 1) { carp("Invalid smoothing coefficient C $coeff\n"); return; } if (@$ra_data < 2) { carp("Need at least 2 samples to smooth the data\n"); return; } $method = ucfirst(lc($method)); my $sub_class = __PACKAGE__."::$method"; eval "require $sub_class"; die "No such class $sub_class: $@" if $@; return $sub_class->_new({ data => $ra_data, samples => $ra_samples, count => scalar @$ra_data, coeff => $coeff, }); } sub get_smoothing_coeff { $_[0]->{coeff} } sub set_smoothing_coeff { my ($self, $coeff) = @_; if ($coeff < 0 || $coeff > 1) { carp("Invalid smoothing coefficient C $coeff\n"); return; } $self->{coeff} = $coeff; return 1; } 1; __END__ =pod =encoding UTF-8 =head1 NAME Statistics::Descriptive::Smoother - Base module for smoothing statistical data =head1 VERSION version 3.0702 =head1 SYNOPSIS use Statistics::Descriptive::Smoother; my $smoother = Statistics::Descriptive::Smoother->instantiate({ method => 'exponential', coeff => 0.5, data => [1, 2, 3, 4, 5], samples => [110, 120, 130, 140, 150], }); my @smoothed_data = $smoother->get_smoothed_data(); =head1 DESCRIPTION This module provide methods to smooth the trend of a series of statistical data. The methods provided are the C and the C (see respectively C and C for more details). This class is just a factory that will instantiate the object to perform the chosen smoothing algorithm. =head1 VERSION version 3.0702 =head1 METHODS =over 5 =item Statistics::Descriptive::Smoother->instantiate({}); Create a new Smoother object. This method require several parameters: =over 5 =item method Method used for the smoothing. Allowed values are: C and C =item coeff Smoothing coefficient. It needs to be in the [0;1] range, otherwise undef will be reutrned. C<0> means that the series is not smoothed at all, while C<1> the series is universally equal to the initial unsmoothed value. =item data Array ref with the data of the series. At least 2 values are needed to smooth the series, undef is returned otherwise. =item samples Array ref with the samples each data value has been built with. This is an optional parameter since it is not used by all the smoothing algorithm. =back =item $smoother->get_smoothing_coeff(); Returns the smoothing coefficient. =item $smoother->set_smoothing_coeff(0.5); Set the smoothing coefficient value. It needs to be in the [0;1] range, otherwise undef will be reutrned. =back =head1 AUTHOR Fabio Ponciroli =head1 COPYRIGHT Copyright(c) 2012 by Fabio Ponciroli. =head1 LICENSE This file is licensed under the MIT/X11 License: http://www.opensource.org/licenses/mit-license.php. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. =head1 AUTHOR Shlomi Fish =head1 COPYRIGHT AND LICENSE This software is copyright (c) 1997 by Jason Kastner, Andrea Spinelli, Colin Kuskie, and others. This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself. =head1 BUGS Please report any bugs or feature requests on the bugtracker website L When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature. =for :stopwords cpan testmatrix url annocpan anno bugtracker rt cpants kwalitee diff irc mailto metadata placeholders metacpan =head1 SUPPORT =head2 Perldoc You can find documentation for this module with the perldoc command. perldoc Statistics::Descriptive::Smoother =head2 Websites The following websites have more information about this module, and may be of help to you. As always, in addition to those websites please use your favorite search engine to discover more resources. =over 4 =item * MetaCPAN A modern, open-source CPAN search engine, useful to view POD in HTML format. L =item * Search CPAN The default CPAN search engine, useful to view POD in HTML format. L =item * RT: CPAN's Bug Tracker The RT ( Request Tracker ) website is the default bug/issue tracking system for CPAN. L =item * AnnoCPAN The AnnoCPAN is a website that allows community annotations of Perl module documentation. L =item * CPAN Ratings The CPAN Ratings is a website that allows community ratings and reviews of Perl modules. L =item * CPANTS The CPANTS is a website that analyzes the Kwalitee ( code metrics ) of a distribution. L =item * CPAN Testers The CPAN Testers is a network of smoke testers who run automated tests on uploaded CPAN distributions. L =item * CPAN Testers Matrix The CPAN Testers Matrix is a website that provides a visual overview of the test results for a distribution on various Perls/platforms. L =item * CPAN Testers Dependencies The CPAN Testers Dependencies is a website that shows a chart of the test results of all dependencies for a distribution. L =back =head2 Bugs / Feature Requests Please report any bugs or feature requests by email to C, or through the web interface at L. You will be automatically notified of any progress on the request by the system. =head2 Source Code The code is open to the world, and available for you to hack on. Please feel free to browse it and play with it, or whatever. If you want to contribute patches, please send me a diff or prod me to pull from your repository :) L git clone git://github.com/shlomif/perl-Statistics-Descriptive.git =cut Smoother000755000764000764 013364403553 25423 5ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/lib/Statistics/DescriptiveExponential.pm100644000764000764 1522613364403553 30435 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/lib/Statistics/Descriptive/Smootherpackage Statistics::Descriptive::Smoother::Exponential; use strict; use warnings; use base 'Statistics::Descriptive::Smoother'; our $VERSION = '3.0702'; sub _new { my ($class, $args) = @_; return bless $args || {}, $class; } # The name of the variables used in the code refers to the explanation in the pod sub get_smoothed_data { my ($self) = @_; my @smoothed_values; push @smoothed_values, @{$self->{data}}[0]; my $C = $self->get_smoothing_coeff(); foreach my $sample_idx (1 .. $self->{count} -1) { my $smoothed_value = $C * ($smoothed_values[-1]) + (1 - $C) * $self->{data}->[$sample_idx]; push @smoothed_values, $smoothed_value; } return @smoothed_values; } 1; __END__ =pod =encoding UTF-8 =head1 NAME Statistics::Descriptive::Smoother::Exponential - Implement exponential smoothing =head1 VERSION version 3.0702 =head1 SYNOPSIS use Statistics::Descriptive::Smoother; my $smoother = Statistics::Descriptive::Smoother->instantiate({ method => 'exponential', coeff => 0.5, data => [1, 2, 3, 4, 5], samples => [110, 120, 130, 140, 150], }); my @smoothed_data = $smoother->get_smoothed_data(); =head1 DESCRIPTION This module implement the exponential smoothing algorithm to smooth the trend of a series of statistical data. This algorithm works well for unsmoothed data build with big number of samples. If this is not the case you might consider using the C one. The algorithm implements the following formula: S(0) = X(0) S(t) = C*S(t-1) + (1-C)*X(t) where: =over 3 =item * t = index in the series =item * S(t) = smoothed series value at position t =item * C = smoothing coefficient. Value in the [0;1] range. C<0> means that the series is not smoothed at all, while C<1> the series is universally equal to the initial unsmoothed value. =item * X(t) = unsmoothed series value at position t =back =head1 VERSION version 3.0702 =head1 METHODS =over 5 =item $stats->get_smoothed_data(); Returns a copy of the smoothed data array. =back =head1 AUTHOR Fabio Ponciroli =head1 COPYRIGHT Copyright(c) 2012 by Fabio Ponciroli. =head1 LICENSE This file is licensed under the MIT/X11 License: http://www.opensource.org/licenses/mit-license.php. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. =head1 AUTHOR Shlomi Fish =head1 COPYRIGHT AND LICENSE This software is copyright (c) 1997 by Jason Kastner, Andrea Spinelli, Colin Kuskie, and others. This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself. =head1 BUGS Please report any bugs or feature requests on the bugtracker website L When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature. =for :stopwords cpan testmatrix url annocpan anno bugtracker rt cpants kwalitee diff irc mailto metadata placeholders metacpan =head1 SUPPORT =head2 Perldoc You can find documentation for this module with the perldoc command. perldoc Statistics::Descriptive::Smoother::Exponential =head2 Websites The following websites have more information about this module, and may be of help to you. As always, in addition to those websites please use your favorite search engine to discover more resources. =over 4 =item * MetaCPAN A modern, open-source CPAN search engine, useful to view POD in HTML format. L =item * Search CPAN The default CPAN search engine, useful to view POD in HTML format. L =item * RT: CPAN's Bug Tracker The RT ( Request Tracker ) website is the default bug/issue tracking system for CPAN. L =item * AnnoCPAN The AnnoCPAN is a website that allows community annotations of Perl module documentation. L =item * CPAN Ratings The CPAN Ratings is a website that allows community ratings and reviews of Perl modules. L =item * CPANTS The CPANTS is a website that analyzes the Kwalitee ( code metrics ) of a distribution. L =item * CPAN Testers The CPAN Testers is a network of smoke testers who run automated tests on uploaded CPAN distributions. L =item * CPAN Testers Matrix The CPAN Testers Matrix is a website that provides a visual overview of the test results for a distribution on various Perls/platforms. L =item * CPAN Testers Dependencies The CPAN Testers Dependencies is a website that shows a chart of the test results of all dependencies for a distribution. L =back =head2 Bugs / Feature Requests Please report any bugs or feature requests by email to C, or through the web interface at L. You will be automatically notified of any progress on the request by the system. =head2 Source Code The code is open to the world, and available for you to hack on. Please feel free to browse it and play with it, or whatever. If you want to contribute patches, please send me a diff or prod me to pull from your repository :) L git clone git://github.com/shlomif/perl-Statistics-Descriptive.git =cut Weightedexponential.pm100644000764000764 1766413364403553 32166 0ustar00shlomifshlomif000000000000Statistics-Descriptive-3.0702/lib/Statistics/Descriptive/Smootherpackage Statistics::Descriptive::Smoother::Weightedexponential; use strict; use warnings; use Carp; use base 'Statistics::Descriptive::Smoother'; our $VERSION = '3.0702'; sub _new { my ($class, $args) = @_; if (scalar @{$args->{data}} != scalar @{$args->{samples}}) { carp("Number of data values and samples need to be the same\n"); return; } return bless $args || {}, $class; } # The name of the variables used in the code refers to the explanation in the pod sub get_smoothed_data { my ($self) = @_; my (@smoothed_values, @Wts); # W(0) = N(0) push @Wts, @{$self->{samples}}[0]; # S(0) = X(0) push @smoothed_values, @{$self->{data}}[0]; my $C = $self->get_smoothing_coeff(); foreach my $idx (1 .. ($self->{count} -1)) { my $Xt = $self->{data}->[$idx]; my $Nt = $self->{samples}->[$idx]; my $St_1 = $smoothed_values[-1]; my $Wt_1 = $Wts[-1]; push @Wts, $self->_get_Wt($Wt_1, $Nt); my $coeff_a = $self->_get_coeff_A($Wt_1, $Nt); my $coeff_b = $self->_get_coeff_B($Wt_1, $Nt); my $smoothed_value = ( $St_1 * $coeff_a + $Xt * $coeff_b ) / ( $coeff_a + $coeff_b ); push @smoothed_values, $smoothed_value; } return @smoothed_values; } sub _get_Wt { my ($self, $Wt_1, $Nt) = @_; my $C = $self->get_smoothing_coeff(); my $coeff_a = $self->_get_coeff_A($Wt_1, $Nt); my $coeff_b = $self->_get_coeff_B($Wt_1, $Nt);; return (($Wt_1 * $coeff_a + $Nt * $coeff_b)/($coeff_a + $coeff_b)); } sub _get_coeff_A { my ($self, $Wt_1, $Nt) = @_; my $C = $self->get_smoothing_coeff(); return $C * ( $Wt_1 / ($Wt_1 + $Nt) ); } sub _get_coeff_B { my ($self, $Wt_1, $Nt) = @_; my $C = $self->get_smoothing_coeff(); return (1 - $C) * ( $Nt / ($Nt + $Wt_1) ); } 1; __END__ =pod =encoding UTF-8 =head1 NAME Statistics::Descriptive::Smoother::Weigthedexponential - Implement weighted exponential smoothing =head1 VERSION version 3.0702 =head1 SYNOPSIS use Statistics::Descriptive::Smoother; my $smoother = Statistics::Descriptive::Smoother->instantiate({ method => 'weightedexponential', coeff => 0.5, data => [1, 2, 3, 4, 5], samples => [110, 120, 130, 140, 150], }); my @smoothed_data = $smoother->get_smoothed_data(); =head1 DESCRIPTION This module implement the weighted exponential smoothing algorithm to smooth the trend of a series of statistical data. This algorithm can help to control large swings in the unsmoothed data that arise from small samples for those data points. The algorithm implements the following formula: W(0) = N(0) W(t) = ( W(t-1) * CoeffA + N(t) * CoeffB ) / (CoeffA + CoeffB) CoeffA = C * ( W(t-1) / (W(t-1) + N(t) ) ) CoeffB = (1 - C) * ( N(t) * (W(t-1) + N(t)) ) S(t) = (S(t-1)*CoeffA + X(t)*CoeffB) / (CoeffA + CoeffB) where: =over 3 =item * t = index in the series =item * S(t) = smoothed series value at position t =item * C = smoothing coefficient. Value in the [0;1] range. C<0> means that the series is not smoothed at all, while C<1> the series is universally equal to the initial unsmoothed value. =item * X(t) = unsmoothed series value at position t =back =head1 VERSION version 3.0702 =head1 METHODS =over 5 =item $stats->get_smoothed_data(); Returns a copy of the smoothed data array. =back =head1 AUTHOR Fabio Ponciroli =head1 COPYRIGHT Copyright(c) 2012 by Fabio Ponciroli. =head1 LICENSE This file is licensed under the MIT/X11 License: http://www.opensource.org/licenses/mit-license.php. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. =head1 AUTHOR Shlomi Fish =head1 COPYRIGHT AND LICENSE This software is copyright (c) 1997 by Jason Kastner, Andrea Spinelli, Colin Kuskie, and others. This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself. =head1 BUGS Please report any bugs or feature requests on the bugtracker website L When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature. =for :stopwords cpan testmatrix url annocpan anno bugtracker rt cpants kwalitee diff irc mailto metadata placeholders metacpan =head1 SUPPORT =head2 Perldoc You can find documentation for this module with the perldoc command. perldoc Statistics::Descriptive::Smoother::Weightedexponential =head2 Websites The following websites have more information about this module, and may be of help to you. As always, in addition to those websites please use your favorite search engine to discover more resources. =over 4 =item * MetaCPAN A modern, open-source CPAN search engine, useful to view POD in HTML format. L =item * Search CPAN The default CPAN search engine, useful to view POD in HTML format. L =item * RT: CPAN's Bug Tracker The RT ( Request Tracker ) website is the default bug/issue tracking system for CPAN. L =item * AnnoCPAN The AnnoCPAN is a website that allows community annotations of Perl module documentation. L =item * CPAN Ratings The CPAN Ratings is a website that allows community ratings and reviews of Perl modules. L =item * CPANTS The CPANTS is a website that analyzes the Kwalitee ( code metrics ) of a distribution. L =item * CPAN Testers The CPAN Testers is a network of smoke testers who run automated tests on uploaded CPAN distributions. L =item * CPAN Testers Matrix The CPAN Testers Matrix is a website that provides a visual overview of the test results for a distribution on various Perls/platforms. L =item * CPAN Testers Dependencies The CPAN Testers Dependencies is a website that shows a chart of the test results of all dependencies for a distribution. L =back =head2 Bugs / Feature Requests Please report any bugs or feature requests by email to C, or through the web interface at L. You will be automatically notified of any progress on the request by the system. =head2 Source Code The code is open to the world, and available for you to hack on. Please feel free to browse it and play with it, or whatever. If you want to contribute patches, please send me a diff or prod me to pull from your repository :) L git clone git://github.com/shlomif/perl-Statistics-Descriptive.git =cut