Bio-Coordinate-1.007001000755000765000024 013024421741 14605 5ustar00cjfieldsstaff000000000000Changes100644000765000024 72013024421741 16140 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001Summary of important user-visible changes for Bio-Coordinate ------------------------------------------------------------ 1.007001 2016-12-14 23:02:19-06:00 America/Chicago * Second point release after initial indexing fail: added a stub module for the distribution 1.007000 2016-11-14 19:31:51-06:00 America/Chicago * First release after split from bioperl-live. * Bio::Coordinate::Collection - allow passing an array reference to mappers(). LICENSE100644000765000024 4365513024421741 15730 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001This software is copyright (c) 2016 by BioPerl Team. This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself. Terms of the Perl programming language system itself a) the GNU General Public License as published by the Free Software Foundation; either version 1, or (at your option) any later version, or b) the "Artistic License" --- The GNU General Public License, Version 1, February 1989 --- This software is Copyright (c) 2016 by BioPerl Team. This is free software, licensed under: The GNU General Public License, Version 1, February 1989 GNU GENERAL PUBLIC LICENSE Version 1, February 1989 Copyright (C) 1989 Free Software Foundation, Inc. 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The license agreements of most software companies try to keep users at the mercy of those companies. By contrast, our General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. The General Public License applies to the Free Software Foundation's software and to any other program whose authors commit to using it. You can use it for your programs, too. When we speak of free software, we are referring to freedom, not price. Specifically, the General Public License is designed to make sure that you have the freedom to give away or sell copies of free software, that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of a such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must tell them their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. The precise terms and conditions for copying, distribution and modification follow. GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License Agreement applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any work containing the Program or a portion of it, either verbatim or with modifications. Each licensee is addressed as "you". 1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this General Public License and to the absence of any warranty; and give any other recipients of the Program a copy of this General Public License along with the Program. You may charge a fee for the physical act of transferring a copy. 2. You may modify your copy or copies of the Program or any portion of it, and copy and distribute such modifications under the terms of Paragraph 1 above, provided that you also do the following: a) cause the modified files to carry prominent notices stating that you changed the files and the date of any change; and b) cause the whole of any work that you distribute or publish, that in whole or in part contains the Program or any part thereof, either with or without modifications, to be licensed at no charge to all third parties under the terms of this General Public License (except that you may choose to grant warranty protection to some or all third parties, at your option). c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the simplest and most usual way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this General Public License. d) You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. Mere aggregation of another independent work with the Program (or its derivative) on a volume of a storage or distribution medium does not bring the other work under the scope of these terms. 3. You may copy and distribute the Program (or a portion or derivative of it, under Paragraph 2) in object code or executable form under the terms of Paragraphs 1 and 2 above provided that you also do one of the following: a) accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Paragraphs 1 and 2 above; or, b) accompany it with a written offer, valid for at least three years, to give any third party free (except for a nominal charge for the cost of distribution) a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Paragraphs 1 and 2 above; or, c) accompany it with the information you received as to where the corresponding source code may be obtained. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form alone.) Source code for a work means the preferred form of the work for making modifications to it. For an executable file, complete source code means all the source code for all modules it contains; but, as a special exception, it need not include source code for modules which are standard libraries that accompany the operating system on which the executable file runs, or for standard header files or definitions files that accompany that operating system. 4. You may not copy, modify, sublicense, distribute or transfer the Program except as expressly provided under this General Public License. Any attempt otherwise to copy, modify, sublicense, distribute or transfer the Program is void, and will automatically terminate your rights to use the Program under this License. However, parties who have received copies, or rights to use copies, from you under this General Public License will not have their licenses terminated so long as such parties remain in full compliance. 5. By copying, distributing or modifying the Program (or any work based on the Program) you indicate your acceptance of this license to do so, and all its terms and conditions. 6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. 7. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of the license which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the license, you may choose any version ever published by the Free Software Foundation. 8. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 9. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 10. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS Appendix: How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to humanity, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) 19yy This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 1, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA 02110-1301 USA Also add information on how to contact you by electronic and paper mail. If the program is interactive, make it output a short notice like this when it starts in an interactive mode: Gnomovision version 69, Copyright (C) 19xx name of author Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than `show w' and `show c'; they could even be mouse-clicks or menu items--whatever suits your program. You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary. Here a sample; alter the names: Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (a program to direct compilers to make passes at assemblers) written by James Hacker. , 1 April 1989 Ty Coon, President of Vice That's all there is to it! --- The Artistic License 1.0 --- This software is Copyright (c) 2016 by BioPerl Team. This is free software, licensed under: The Artistic License 1.0 The Artistic License Preamble The intent of this document is to state the conditions under which a Package may be copied, such that the Copyright Holder maintains some semblance of artistic control over the development of the package, while giving the users of the package the right to use and distribute the Package in a more-or-less customary fashion, plus the right to make reasonable modifications. Definitions: - "Package" refers to the collection of files distributed by the Copyright Holder, and derivatives of that collection of files created through textual modification. - "Standard Version" refers to such a Package if it has not been modified, or has been modified in accordance with the wishes of the Copyright Holder. - "Copyright Holder" is whoever is named in the copyright or copyrights for the package. - "You" is you, if you're thinking about copying or distributing this Package. - "Reasonable copying fee" is whatever you can justify on the basis of media cost, duplication charges, time of people involved, and so on. (You will not be required to justify it to the Copyright Holder, but only to the computing community at large as a market that must bear the fee.) - "Freely Available" means that no fee is charged for the item itself, though there may be fees involved in handling the item. It also means that recipients of the item may redistribute it under the same conditions they received it. 1. You may make and give away verbatim copies of the source form of the Standard Version of this Package without restriction, provided that you duplicate all of the original copyright notices and associated disclaimers. 2. You may apply bug fixes, portability fixes and other modifications derived from the Public Domain or from the Copyright Holder. A Package modified in such a way shall still be considered the Standard Version. 3. You may otherwise modify your copy of this Package in any way, provided that you insert a prominent notice in each changed file stating how and when you changed that file, and provided that you do at least ONE of the following: a) place your modifications in the Public Domain or otherwise make them Freely Available, such as by posting said modifications to Usenet or an equivalent medium, or placing the modifications on a major archive site such as ftp.uu.net, or by allowing the Copyright Holder to include your modifications in the Standard Version of the Package. b) use the modified Package only within your corporation or organization. c) rename any non-standard executables so the names do not conflict with standard executables, which must also be provided, and provide a separate manual page for each non-standard executable that clearly documents how it differs from the Standard Version. d) make other distribution arrangements with the Copyright Holder. 4. You may distribute the programs of this Package in object code or executable form, provided that you do at least ONE of the following: a) distribute a Standard Version of the executables and library files, together with instructions (in the manual page or equivalent) on where to get the Standard Version. b) accompany the distribution with the machine-readable source of the Package with your modifications. c) accompany any non-standard executables with their corresponding Standard Version executables, giving the non-standard executables non-standard names, and clearly documenting the differences in manual pages (or equivalent), together with instructions on where to get the Standard Version. d) make other distribution arrangements with the Copyright Holder. 5. You may charge a reasonable copying fee for any distribution of this Package. You may charge any fee you choose for support of this Package. You may not charge a fee for this Package itself. However, you may distribute this Package in aggregate with other (possibly commercial) programs as part of a larger (possibly commercial) software distribution provided that you do not advertise this Package as a product of your own. 6. The scripts and library files supplied as input to or produced as output from the programs of this Package do not automatically fall under the copyright of this Package, but belong to whomever generated them, and may be sold commercially, and may be aggregated with this Package. 7. C or perl subroutines supplied by you and linked into this Package shall not be considered part of this Package. 8. The name of the Copyright Holder may not be used to endorse or promote products derived from this software without specific prior written permission. 9. THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE. The End dist.ini100644000765000024 67413024421741 16321 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001name = Bio-Coordinate abstract = Methods for dealing with genomic coordinates. main_module = lib/Bio/Coordinate/Chain.pm version = 1.007001 author = BioPerl Team license = Perl_5 copyright_holder = BioPerl Team [@Filter] -bundle = @BioPerl -remove = Test::NoTabs ; because CoordinateMapper.t and GeneCoordinateMapper.t require tabs -remove = PodCoverageTests META.yml100644000765000024 2530513024421741 16164 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001--- abstract: 'Methods for dealing with genomic coordinates.' author: - 'BioPerl Team ' build_requires: Bio::LocatableSeq: '0' Bio::Root::Test: '0' Bio::SimpleAlign: '0' File::Spec: '0' IO::Handle: '0' IPC::Open3: '0' Test::More: '0' blib: '1.01' perl: '5.006' configure_requires: ExtUtils::MakeMaker: '0' dynamic_config: 0 generated_by: 'Dist::Zilla version 6.007, CPAN::Meta::Converter version 2.150001' license: perl meta-spec: url: http://module-build.sourceforge.net/META-spec-v1.4.html version: '1.4' name: Bio-Coordinate requires: Bio::Location::Simple: '0' Bio::Location::Split: '0' Bio::LocationI: '0' Bio::Root::Root: '0' Bio::Root::RootI: '0' parent: '0' strict: '0' utf8: '0' warnings: '0' resources: bugtracker: https://github.com/bioperl/%%7Bdist%7D homepage: https://metacpan.org/release/Bio-Coordinate repository: git://github.com/bioperl/bio-coordinate.git version: '1.007001' x_Dist_Zilla: perl: version: '5.022001' plugins: - class: Dist::Zilla::Plugin::GatherDir config: Dist::Zilla::Plugin::GatherDir: exclude_filename: [] exclude_match: [] follow_symlinks: 0 include_dotfiles: 0 prefix: '' prune_directory: [] root: . name: '@Filter/@Filter/GatherDir' version: '6.007' - class: Dist::Zilla::Plugin::PruneCruft name: '@Filter/@Filter/PruneCruft' version: '6.007' - class: Dist::Zilla::Plugin::ManifestSkip name: '@Filter/@Filter/ManifestSkip' version: '6.007' - class: Dist::Zilla::Plugin::MetaYAML name: '@Filter/@Filter/MetaYAML' version: '6.007' - class: Dist::Zilla::Plugin::License name: '@Filter/@Filter/License' version: '6.007' - class: Dist::Zilla::Plugin::ExtraTests name: '@Filter/@Filter/ExtraTests' version: '6.007' - class: Dist::Zilla::Plugin::ExecDir name: '@Filter/@Filter/ExecDir' version: '6.007' - class: Dist::Zilla::Plugin::ShareDir name: '@Filter/@Filter/ShareDir' version: '6.007' - class: Dist::Zilla::Plugin::MakeMaker config: Dist::Zilla::Role::TestRunner: default_jobs: 1 name: '@Filter/@Filter/MakeMaker' version: '6.007' - class: Dist::Zilla::Plugin::Manifest name: '@Filter/@Filter/Manifest' version: '6.007' - class: Dist::Zilla::Plugin::TestRelease name: '@Filter/@Filter/TestRelease' version: '6.007' - class: Dist::Zilla::Plugin::ConfirmRelease name: '@Filter/@Filter/ConfirmRelease' version: '6.007' - class: Dist::Zilla::Plugin::UploadToCPAN name: '@Filter/@Filter/UploadToCPAN' version: '6.007' - class: Dist::Zilla::Plugin::MetaConfig name: '@Filter/MetaConfig' version: '6.007' - class: Dist::Zilla::Plugin::MetaJSON name: '@Filter/MetaJSON' version: '6.007' - class: Dist::Zilla::Plugin::PkgVersion name: '@Filter/PkgVersion' version: '6.007' - class: Dist::Zilla::Plugin::PodSyntaxTests name: '@Filter/PodSyntaxTests' version: '6.007' - class: Dist::Zilla::Plugin::NextRelease name: '@Filter/NextRelease' version: '6.007' - class: Dist::Zilla::Plugin::Test::Compile config: Dist::Zilla::Plugin::Test::Compile: bail_out_on_fail: '0' fail_on_warning: author fake_home: 0 filename: t/00-compile.t module_finder: - ':InstallModules' needs_display: 0 phase: test script_finder: - ':PerlExecFiles' skips: [] name: '@Filter/Test::Compile' version: '2.054' - class: Dist::Zilla::Plugin::MojibakeTests name: '@Filter/MojibakeTests' version: '0.8' - class: Dist::Zilla::Plugin::AutoPrereqs name: '@Filter/AutoPrereqs' version: '6.007' - class: Dist::Zilla::Plugin::RunExtraTests config: Dist::Zilla::Role::TestRunner: default_jobs: 1 name: '@Filter/RunExtraTests' version: '0.029' - class: Dist::Zilla::Plugin::AutoMetaResources name: '@Filter/AutoMetaResources' version: '1.21' - class: Dist::Zilla::Plugin::MetaResources name: '@Filter/MetaResources' version: '6.007' - class: Dist::Zilla::Plugin::Authority name: '@Filter/Authority' version: '1.009' - class: Dist::Zilla::Plugin::EOLTests config: Dist::Zilla::Plugin::Test::EOL: filename: xt/release/eol.t finder: - ':ExecFiles' - ':InstallModules' - ':TestFiles' trailing_whitespace: 1 version: '0.19' name: '@Filter/EOLTests' version: '0.19' - class: Dist::Zilla::Plugin::PodWeaver config: Dist::Zilla::Plugin::PodWeaver: config_plugins: - '@BioPerl' finder: - ':InstallModules' - ':ExecFiles' plugins: - class: Pod::Weaver::Plugin::EnsurePod5 name: '@CorePrep/EnsurePod5' version: '4.013' - class: Pod::Weaver::Plugin::H1Nester name: '@CorePrep/H1Nester' version: '4.013' - class: Pod::Weaver::Section::Name name: '@BioPerl/Name' version: '4.013' - class: Pod::Weaver::Section::Version name: '@BioPerl/Version' version: '4.013' - class: Pod::Weaver::Section::Region name: '@BioPerl/prelude' version: '4.013' - class: Pod::Weaver::Section::Generic name: SYNOPSIS version: '4.013' - class: Pod::Weaver::Section::Generic name: DESCRIPTION version: '4.013' - class: Pod::Weaver::Section::Generic name: OVERVIEW version: '4.013' - class: Pod::Weaver::Section::Collect name: ATTRIBUTES version: '4.013' - class: Pod::Weaver::Section::Collect name: METHODS version: '4.013' - class: Pod::Weaver::Section::Collect name: FUNCTIONS version: '4.013' - class: Pod::Weaver::Section::Collect name: 'INTERNAL METHODS' version: '4.013' - class: Pod::Weaver::Section::Leftovers name: '@BioPerl/Leftovers' version: '4.013' - class: Pod::Weaver::Section::Region name: '@BioPerl/postlude' version: '4.013' - class: Pod::Weaver::Section::GenerateSection name: FEEDBACK version: '1.02' - class: Pod::Weaver::Section::GenerateSection name: 'Mailing lists' version: '1.02' - class: Pod::Weaver::Section::GenerateSection name: Support version: '1.02' - class: Pod::Weaver::Section::GenerateSection name: 'Reporting bugs' version: '1.02' - class: Pod::Weaver::Section::Legal::Complicated name: '@BioPerl/Legal' version: '1.21' - class: Pod::Weaver::Section::Contributors name: '@BioPerl/Contributors' version: '0.009' - class: Pod::Weaver::Plugin::Encoding name: Encoding version: '0.03' - class: Pod::Weaver::Plugin::Transformer name: '@BioPerl/List' version: '4.013' - class: Pod::Weaver::Plugin::EnsureUniqueSections name: EnsureUniqueSections version: '0.121550' name: '@Filter/PodWeaver' version: '4.008' - class: Dist::Zilla::Plugin::Git::Check config: Dist::Zilla::Plugin::Git::Check: untracked_files: die Dist::Zilla::Role::Git::DirtyFiles: allow_dirty: - Changes - dist.ini allow_dirty_match: [] changelog: Changes Dist::Zilla::Role::Git::Repo: repo_root: . name: '@Filter/Git::Check' version: '2.039' - class: Dist::Zilla::Plugin::Git::Commit config: Dist::Zilla::Plugin::Git::Commit: add_files_in: [] commit_msg: v%v%n%n%c Dist::Zilla::Role::Git::DirtyFiles: allow_dirty: - Changes - dist.ini allow_dirty_match: [] changelog: Changes Dist::Zilla::Role::Git::Repo: repo_root: . Dist::Zilla::Role::Git::StringFormatter: time_zone: local name: '@Filter/Git::Commit' version: '2.039' - class: Dist::Zilla::Plugin::Git::Tag config: Dist::Zilla::Plugin::Git::Tag: branch: ~ changelog: Changes signed: 0 tag: Bio-Coordinate-v1.007001 tag_format: '%N-v%v' tag_message: '%N-v%v' Dist::Zilla::Role::Git::Repo: repo_root: . Dist::Zilla::Role::Git::StringFormatter: time_zone: local name: '@Filter/Git::Tag' version: '2.039' - class: Dist::Zilla::Plugin::FinderCode name: ':InstallModules' version: '6.007' - class: Dist::Zilla::Plugin::FinderCode name: ':IncModules' version: '6.007' - class: Dist::Zilla::Plugin::FinderCode name: ':TestFiles' version: '6.007' - class: Dist::Zilla::Plugin::FinderCode name: ':ExtraTestFiles' version: '6.007' - class: Dist::Zilla::Plugin::FinderCode name: ':ExecFiles' version: '6.007' - class: Dist::Zilla::Plugin::FinderCode name: ':PerlExecFiles' version: '6.007' - class: Dist::Zilla::Plugin::FinderCode name: ':ShareFiles' version: '6.007' - class: Dist::Zilla::Plugin::FinderCode name: ':MainModule' version: '6.007' - class: Dist::Zilla::Plugin::FinderCode name: ':AllFiles' version: '6.007' - class: Dist::Zilla::Plugin::FinderCode name: ':NoFiles' version: '6.007' zilla: class: Dist::Zilla::Dist::Builder config: is_trial: '0' version: '6.007' x_authority: cpan:BIOPERLML x_serialization_backend: 'YAML::Tiny version 1.69' MANIFEST100644000765000024 131613024421741 16020 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001# This file was automatically generated by Dist::Zilla::Plugin::Manifest v6.007. Changes LICENSE MANIFEST META.json META.yml Makefile.PL README.md dist.ini lib/Bio/Coordinate.pm lib/Bio/Coordinate/Chain.pm lib/Bio/Coordinate/Collection.pm lib/Bio/Coordinate/ExtrapolatingPair.pm lib/Bio/Coordinate/GeneMapper.pm lib/Bio/Coordinate/Graph.pm lib/Bio/Coordinate/MapperI.pm lib/Bio/Coordinate/Pair.pm lib/Bio/Coordinate/Result.pm lib/Bio/Coordinate/Result/Gap.pm lib/Bio/Coordinate/Result/Match.pm lib/Bio/Coordinate/ResultI.pm lib/Bio/Coordinate/Utils.pm t/00-compile.t t/CoordinateBoundaryTest.t t/CoordinateGraph.t t/CoordinateMapper.t t/GeneCoordinateMapper.t t/author-mojibake.t t/author-pod-syntax.t t/release-eol.t README.md100644000765000024 21713024421741 16125 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001Bio-Coordinate =============== The Bio-Coordinate distribution. This distribution is part of the [BioPerl](http://www.bioperl.org/) project. META.json100644000765000024 4163313024421741 16336 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001{ "abstract" : "Methods for dealing with genomic coordinates.", "author" : [ "BioPerl Team " ], "dynamic_config" : 0, "generated_by" : "Dist::Zilla version 6.007, CPAN::Meta::Converter version 2.150001", "license" : [ "perl_5" ], "meta-spec" : { "url" : "http://search.cpan.org/perldoc?CPAN::Meta::Spec", "version" : 2 }, "name" : "Bio-Coordinate", "prereqs" : { "configure" : { "requires" : { "ExtUtils::MakeMaker" : "0" } }, "develop" : { "requires" : { "Test::EOL" : "0", "Test::Mojibake" : "0", "Test::More" : "0.88", "Test::Pod" : "1.41" } }, "runtime" : { "requires" : { "Bio::Location::Simple" : "0", "Bio::Location::Split" : "0", "Bio::LocationI" : "0", "Bio::Root::Root" : "0", "Bio::Root::RootI" : "0", "parent" : "0", "strict" : "0", "utf8" : "0", "warnings" : "0" } }, "test" : { "requires" : { "Bio::LocatableSeq" : "0", "Bio::Root::Test" : "0", "Bio::SimpleAlign" : "0", "File::Spec" : "0", "IO::Handle" : "0", "IPC::Open3" : "0", "Test::More" : "0", "blib" : "1.01", "perl" : "5.006" } } }, "release_status" : "stable", "resources" : { "bugtracker" : { "mailto" : "bioperl-l@bioperl.org", "web" : "https://github.com/bioperl/%%7Bdist%7D" }, "homepage" : "https://metacpan.org/release/Bio-Coordinate", "repository" : { "type" : "git", "url" : "git://github.com/bioperl/bio-coordinate.git", "web" : "https://github.com/bioperl/bio-coordinate" } }, "version" : "1.007001", "x_Dist_Zilla" : { "perl" : { "version" : "5.022001" }, "plugins" : [ { "class" : "Dist::Zilla::Plugin::GatherDir", "config" : { "Dist::Zilla::Plugin::GatherDir" : { "exclude_filename" : [], "exclude_match" : [], "follow_symlinks" : 0, "include_dotfiles" : 0, "prefix" : "", "prune_directory" : [], "root" : "." } }, "name" : "@Filter/@Filter/GatherDir", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::PruneCruft", "name" : "@Filter/@Filter/PruneCruft", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::ManifestSkip", "name" : "@Filter/@Filter/ManifestSkip", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::MetaYAML", "name" : "@Filter/@Filter/MetaYAML", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::License", "name" : "@Filter/@Filter/License", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::ExtraTests", "name" : "@Filter/@Filter/ExtraTests", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::ExecDir", "name" : "@Filter/@Filter/ExecDir", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::ShareDir", "name" : "@Filter/@Filter/ShareDir", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::MakeMaker", "config" : { "Dist::Zilla::Role::TestRunner" : { "default_jobs" : 1 } }, "name" : "@Filter/@Filter/MakeMaker", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::Manifest", "name" : "@Filter/@Filter/Manifest", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::TestRelease", "name" : "@Filter/@Filter/TestRelease", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::ConfirmRelease", "name" : "@Filter/@Filter/ConfirmRelease", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::UploadToCPAN", "name" : "@Filter/@Filter/UploadToCPAN", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::MetaConfig", "name" : "@Filter/MetaConfig", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::MetaJSON", "name" : "@Filter/MetaJSON", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::PkgVersion", "name" : "@Filter/PkgVersion", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::PodSyntaxTests", "name" : "@Filter/PodSyntaxTests", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::NextRelease", "name" : "@Filter/NextRelease", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::Test::Compile", "config" : { "Dist::Zilla::Plugin::Test::Compile" : { "bail_out_on_fail" : "0", "fail_on_warning" : "author", "fake_home" : 0, "filename" : "t/00-compile.t", "module_finder" : [ ":InstallModules" ], "needs_display" : 0, "phase" : "test", "script_finder" : [ ":PerlExecFiles" ], "skips" : [] } }, "name" : "@Filter/Test::Compile", "version" : "2.054" }, { "class" : "Dist::Zilla::Plugin::MojibakeTests", "name" : "@Filter/MojibakeTests", "version" : "0.8" }, { "class" : "Dist::Zilla::Plugin::AutoPrereqs", "name" : "@Filter/AutoPrereqs", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::RunExtraTests", "config" : { "Dist::Zilla::Role::TestRunner" : { "default_jobs" : 1 } }, "name" : "@Filter/RunExtraTests", "version" : "0.029" }, { "class" : "Dist::Zilla::Plugin::AutoMetaResources", "name" : "@Filter/AutoMetaResources", "version" : "1.21" }, { "class" : "Dist::Zilla::Plugin::MetaResources", "name" : "@Filter/MetaResources", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::Authority", "name" : "@Filter/Authority", "version" : "1.009" }, { "class" : "Dist::Zilla::Plugin::EOLTests", "config" : { "Dist::Zilla::Plugin::Test::EOL" : { "filename" : "xt/release/eol.t", "finder" : [ ":ExecFiles", ":InstallModules", ":TestFiles" ], "trailing_whitespace" : 1, "version" : "0.19" } }, "name" : "@Filter/EOLTests", "version" : "0.19" }, { "class" : "Dist::Zilla::Plugin::PodWeaver", "config" : { "Dist::Zilla::Plugin::PodWeaver" : { "config_plugins" : [ "@BioPerl" ], "finder" : [ ":InstallModules", ":ExecFiles" ], "plugins" : [ { "class" : "Pod::Weaver::Plugin::EnsurePod5", "name" : "@CorePrep/EnsurePod5", "version" : "4.013" }, { "class" : "Pod::Weaver::Plugin::H1Nester", "name" : "@CorePrep/H1Nester", "version" : "4.013" }, { "class" : "Pod::Weaver::Section::Name", "name" : "@BioPerl/Name", "version" : "4.013" }, { "class" : "Pod::Weaver::Section::Version", "name" : "@BioPerl/Version", "version" : "4.013" }, { "class" : "Pod::Weaver::Section::Region", "name" : "@BioPerl/prelude", "version" : "4.013" }, { "class" : "Pod::Weaver::Section::Generic", "name" : "SYNOPSIS", "version" : "4.013" }, { "class" : "Pod::Weaver::Section::Generic", "name" : "DESCRIPTION", "version" : "4.013" }, { "class" : "Pod::Weaver::Section::Generic", "name" : "OVERVIEW", "version" : "4.013" }, { "class" : "Pod::Weaver::Section::Collect", "name" : "ATTRIBUTES", "version" : "4.013" }, { "class" : "Pod::Weaver::Section::Collect", "name" : "METHODS", "version" : "4.013" }, { "class" : "Pod::Weaver::Section::Collect", "name" : "FUNCTIONS", "version" : "4.013" }, { "class" : "Pod::Weaver::Section::Collect", "name" : "INTERNAL METHODS", "version" : "4.013" }, { "class" : "Pod::Weaver::Section::Leftovers", "name" : "@BioPerl/Leftovers", "version" : "4.013" }, { "class" : "Pod::Weaver::Section::Region", "name" : "@BioPerl/postlude", "version" : "4.013" }, { "class" : "Pod::Weaver::Section::GenerateSection", "name" : "FEEDBACK", "version" : "1.02" }, { "class" : "Pod::Weaver::Section::GenerateSection", "name" : "Mailing lists", "version" : "1.02" }, { "class" : "Pod::Weaver::Section::GenerateSection", "name" : "Support", "version" : "1.02" }, { "class" : "Pod::Weaver::Section::GenerateSection", "name" : "Reporting bugs", "version" : "1.02" }, { "class" : "Pod::Weaver::Section::Legal::Complicated", "name" : "@BioPerl/Legal", "version" : "1.21" }, { "class" : "Pod::Weaver::Section::Contributors", "name" : "@BioPerl/Contributors", "version" : "0.009" }, { "class" : "Pod::Weaver::Plugin::Encoding", "name" : "Encoding", "version" : "0.03" }, { "class" : "Pod::Weaver::Plugin::Transformer", "name" : "@BioPerl/List", "version" : "4.013" }, { "class" : "Pod::Weaver::Plugin::EnsureUniqueSections", "name" : "EnsureUniqueSections", "version" : "0.121550" } ] } }, "name" : "@Filter/PodWeaver", "version" : "4.008" }, { "class" : "Dist::Zilla::Plugin::Git::Check", "config" : { "Dist::Zilla::Plugin::Git::Check" : { "untracked_files" : "die" }, "Dist::Zilla::Role::Git::DirtyFiles" : { "allow_dirty" : [ "Changes", "dist.ini" ], "allow_dirty_match" : [], "changelog" : "Changes" }, "Dist::Zilla::Role::Git::Repo" : { "repo_root" : "." } }, "name" : "@Filter/Git::Check", "version" : "2.039" }, { "class" : "Dist::Zilla::Plugin::Git::Commit", "config" : { "Dist::Zilla::Plugin::Git::Commit" : { "add_files_in" : [], "commit_msg" : "v%v%n%n%c" }, "Dist::Zilla::Role::Git::DirtyFiles" : { "allow_dirty" : [ "Changes", "dist.ini" ], "allow_dirty_match" : [], "changelog" : "Changes" }, "Dist::Zilla::Role::Git::Repo" : { "repo_root" : "." }, "Dist::Zilla::Role::Git::StringFormatter" : { "time_zone" : "local" } }, "name" : "@Filter/Git::Commit", "version" : "2.039" }, { "class" : "Dist::Zilla::Plugin::Git::Tag", "config" : { "Dist::Zilla::Plugin::Git::Tag" : { "branch" : null, "changelog" : "Changes", "signed" : 0, "tag" : "Bio-Coordinate-v1.007001", "tag_format" : "%N-v%v", "tag_message" : "%N-v%v" }, "Dist::Zilla::Role::Git::Repo" : { "repo_root" : "." }, "Dist::Zilla::Role::Git::StringFormatter" : { "time_zone" : "local" } }, "name" : "@Filter/Git::Tag", "version" : "2.039" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : ":InstallModules", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : ":IncModules", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : ":TestFiles", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : ":ExtraTestFiles", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : ":ExecFiles", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : ":PerlExecFiles", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : ":ShareFiles", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : ":MainModule", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : ":AllFiles", "version" : "6.007" }, { "class" : "Dist::Zilla::Plugin::FinderCode", "name" : ":NoFiles", "version" : "6.007" } ], "zilla" : { "class" : "Dist::Zilla::Dist::Builder", "config" : { "is_trial" : "0" }, "version" : "6.007" } }, "x_authority" : "cpan:BIOPERLML", "x_serialization_backend" : "Cpanel::JSON::XS version 3.0217" } Makefile.PL100644000765000024 340013024421741 16635 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001# This file was automatically generated by Dist::Zilla::Plugin::MakeMaker v6.007. use strict; use warnings; use 5.006; use ExtUtils::MakeMaker; my %WriteMakefileArgs = ( "ABSTRACT" => "Methods for dealing with genomic coordinates.", "AUTHOR" => "BioPerl Team ", "CONFIGURE_REQUIRES" => { "ExtUtils::MakeMaker" => 0 }, "DISTNAME" => "Bio-Coordinate", "LICENSE" => "perl", "MIN_PERL_VERSION" => "5.006", "NAME" => "Bio::Coordinate", "PREREQ_PM" => { "Bio::Location::Simple" => 0, "Bio::Location::Split" => 0, "Bio::LocationI" => 0, "Bio::Root::Root" => 0, "Bio::Root::RootI" => 0, "parent" => 0, "strict" => 0, "utf8" => 0, "warnings" => 0 }, "TEST_REQUIRES" => { "Bio::LocatableSeq" => 0, "Bio::Root::Test" => 0, "Bio::SimpleAlign" => 0, "File::Spec" => 0, "IO::Handle" => 0, "IPC::Open3" => 0, "Test::More" => 0, "blib" => "1.01" }, "VERSION" => "1.007001", "test" => { "TESTS" => "t/*.t" } ); my %FallbackPrereqs = ( "Bio::LocatableSeq" => 0, "Bio::Location::Simple" => 0, "Bio::Location::Split" => 0, "Bio::LocationI" => 0, "Bio::Root::Root" => 0, "Bio::Root::RootI" => 0, "Bio::Root::Test" => 0, "Bio::SimpleAlign" => 0, "File::Spec" => 0, "IO::Handle" => 0, "IPC::Open3" => 0, "Test::More" => 0, "blib" => "1.01", "parent" => 0, "strict" => 0, "utf8" => 0, "warnings" => 0 ); unless ( eval { ExtUtils::MakeMaker->VERSION(6.63_03) } ) { delete $WriteMakefileArgs{TEST_REQUIRES}; delete $WriteMakefileArgs{BUILD_REQUIRES}; $WriteMakefileArgs{PREREQ_PM} = \%FallbackPrereqs; } delete $WriteMakefileArgs{CONFIGURE_REQUIRES} unless eval { ExtUtils::MakeMaker->VERSION(6.52) }; WriteMakefile(%WriteMakefileArgs); t000755000765000024 013024421741 14771 5ustar00cjfieldsstaff000000000000Bio-Coordinate-1.00700100-compile.t100644000765000024 316613024421741 17171 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/tuse 5.006; use strict; use warnings; # this test was generated with Dist::Zilla::Plugin::Test::Compile 2.054 use Test::More; plan tests => 13 + ($ENV{AUTHOR_TESTING} ? 1 : 0); my @module_files = ( 'Bio/Coordinate.pm', 'Bio/Coordinate/Chain.pm', 'Bio/Coordinate/Collection.pm', 'Bio/Coordinate/ExtrapolatingPair.pm', 'Bio/Coordinate/GeneMapper.pm', 'Bio/Coordinate/Graph.pm', 'Bio/Coordinate/MapperI.pm', 'Bio/Coordinate/Pair.pm', 'Bio/Coordinate/Result.pm', 'Bio/Coordinate/Result/Gap.pm', 'Bio/Coordinate/Result/Match.pm', 'Bio/Coordinate/ResultI.pm', 'Bio/Coordinate/Utils.pm' ); # no fake home requested my $inc_switch = -d 'blib' ? '-Mblib' : '-Ilib'; use File::Spec; use IPC::Open3; use IO::Handle; open my $stdin, '<', File::Spec->devnull or die "can't open devnull: $!"; my @warnings; for my $lib (@module_files) { # see L my $stderr = IO::Handle->new; my $pid = open3($stdin, '>&STDERR', $stderr, $^X, $inc_switch, '-e', "require q[$lib]"); binmode $stderr, ':crlf' if $^O eq 'MSWin32'; my @_warnings = <$stderr>; waitpid($pid, 0); is($?, 0, "$lib loaded ok"); shift @_warnings if @_warnings and $_warnings[0] =~ /^Using .*\bblib/ and not eval { require blib; blib->VERSION('1.01') }; if (@_warnings) { warn @_warnings; push @warnings, @_warnings; } } is(scalar(@warnings), 0, 'no warnings found') or diag 'got warnings: ', ( Test::More->can('explain') ? Test::More::explain(\@warnings) : join("\n", '', @warnings) ) if $ENV{AUTHOR_TESTING}; release-eol.t100644000765000024 205213024421741 17512 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/t BEGIN { unless ($ENV{RELEASE_TESTING}) { print "1..0 # SKIP these tests are for release candidate testing\n"; exit } } use strict; use warnings; # this test was generated with Dist::Zilla::Plugin::EOLTests 0.19 use Test::More 0.88; use Test::EOL; my @files = ( 'lib/Bio/Coordinate.pm', 'lib/Bio/Coordinate/Chain.pm', 'lib/Bio/Coordinate/Collection.pm', 'lib/Bio/Coordinate/ExtrapolatingPair.pm', 'lib/Bio/Coordinate/GeneMapper.pm', 'lib/Bio/Coordinate/Graph.pm', 'lib/Bio/Coordinate/MapperI.pm', 'lib/Bio/Coordinate/Pair.pm', 'lib/Bio/Coordinate/Result.pm', 'lib/Bio/Coordinate/Result/Gap.pm', 'lib/Bio/Coordinate/Result/Match.pm', 'lib/Bio/Coordinate/ResultI.pm', 'lib/Bio/Coordinate/Utils.pm', 't/00-compile.t', 't/CoordinateBoundaryTest.t', 't/CoordinateGraph.t', 't/CoordinateMapper.t', 't/GeneCoordinateMapper.t', 't/author-mojibake.t', 't/author-pod-syntax.t', 't/release-eol.t' ); eol_unix_ok($_, { trailing_whitespace => 1 }) foreach @files; done_testing; CoordinateGraph.t100644000765000024 127013024421741 20367 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/tuse strict; use warnings; BEGIN { use Bio::Root::Test; test_begin(-tests => 7); use_ok('Bio::Coordinate::Graph'); } ok my $graph = Bio::Coordinate::Graph->new(); # graph structure my $dag = { 9 => [], 8 => [9], 7 => [], 6 => [7, 8], 5 => [], 4 => [5], 3 => [6], 2 => [3, 4, 6], 1 => [2] }; ok $graph->hash_of_arrays($dag); my $a = 1; my $b = 6; is my @a = $graph->shortest_path($a, $b), 3; $a = 7; $b = 8; is @a = $graph->shortest_path($a, $b), 1; $a = 8; $b = 9; is @a = $graph->shortest_path($a, $b), 2; $b = 2; is @a = $graph->shortest_path($a, $b), 3; author-mojibake.t100644000765000024 35113024421741 20356 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/t#!perl BEGIN { unless ($ENV{AUTHOR_TESTING}) { print "1..0 # SKIP these tests are for testing by the author\n"; exit } } use strict; use warnings qw(all); use Test::More; use Test::Mojibake; all_files_encoding_ok(); CoordinateMapper.t100644000765000024 4456413024421741 20607 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/tuse strict; use warnings; BEGIN { use Bio::Root::Test; test_begin(-tests => 175); use_ok('Bio::Location::Simple'); use_ok('Bio::Coordinate::Pair'); use_ok('Bio::Coordinate::Result::Match'); use_ok('Bio::Coordinate::Result::Gap'); use_ok('Bio::Coordinate::Chain'); use_ok('Bio::Coordinate::Collection'); } my ($c, $value); ok $c = Bio::Coordinate::Result::Match-> new; ok $c = Bio::Coordinate::Result::Gap-> new; # propepide my $match1 = Bio::Location::Simple->new (-seq_id => 'propeptide', -start => 21, -end => 40, -strand=>1 ); # peptide my $match2 = Bio::Location::Simple->new (-seq_id => 'peptide', -start => 1, -end => 20, -strand=>1 ); ok my $pair = Bio::Coordinate::Pair->new(-in => $match1, -out => $match2, -negative => 0, # false, default ); ok $pair->test; is $pair->strand(), 1; # = in->strand * out->strand is $pair->in->seq_id(), 'propeptide'; my ($count, $pos, $pos2, $res, $match, $res2); # # match within # $pos = Bio::Location::Simple->new (-start => 25, -end => 25, -strand=> -1 ); # results are in Bio::Coordinate::Result # they can be Matches and Gaps; are Bio::LocationIs ok $res = $pair->map($pos); isa_ok $res, 'Bio::Coordinate::Result'; isa_ok $res, 'Bio::Location::SplitLocationI'; is $res->each_match, 1; is $res->each_gap, 0; is $res->each_Location, 1; isa_ok $res->match, 'Bio::LocationI'; isa_ok $res->match, 'Bio::Coordinate::Result::Match'; is $res->match->start, 5; is $res->match->end, 5; is $res->match->strand, -1; is $res->match->seq_id, 'peptide'; is $res->start, 5; is $res->end, 5; is $res->strand, -1; #is $res->seq_id, 'peptide'; # lets do the reverse $match = $res->match; ok $pair->swap; $res2 = $pair->map($match); is $res2->match->start, $pos->start; is $res2->match->end, $pos->end; is $res2->match->strand, $pos->strand; is $res2->match->seq_id, $pair->out->seq_id; ok $pair->swap; # # match outside = Gap # $pos = Bio::Location::Simple->new (-start => 5, -end => 5 ); ok $res = $pair->map($pos); #$res->verbose(2); is $res->each_Location, 1; is $res->each_gap, 1; isa_ok $res->gap, 'Bio::Coordinate::Result::Gap'; isa_ok $res->gap, 'Bio::LocationI'; is $res->gap->strand, 1; is $res->gap->start, 5; is $res->gap->length, $pos->length; is $res->gap->seq_id, 'propeptide'; # # partial match = gap & match # $pos2 = Bio::Location::Simple->new (-start => 20, -end => 22, -strand=> -1 ); ok $res = $pair->map($pos2); is $res->each_match, 1; is $res->each_gap, 1; is $res->each_Location, 2; is $res->match->length + $res->gap->length, $pos2->length; is $res->match->start, 1; is $res->match->end, 2; is $res->match->seq_id, 'peptide'; is $res->match->strand, -1; is $res->gap->start, 20; is $res->gap->end, 20; is $res->gap->seq_id, 'propeptide'; is $res->gap->strand, -1; # # partial match = match & gap # $pos2 = Bio::Location::Simple->new (-start => 40, -end => 41, -strand=> 1 ); ok $res = $pair->map($pos2); is $res->match->length + $res->gap->length, $pos2->length; # #enveloping # $pos2 = Bio::Location::Simple->new (-start => 19, -end => 41, -strand=> 1 ); ok $res = $pair->map($pos2); $count = 0; map {$count += $_->length} $res->each_Location; is $count, $pos2->length; # # Testing insertions # #out $pos = Bio::Location::Simple->new (-start => 5, -end => 6, -location_type=>'^'); $res = $pair->map($pos); is $res->each_gap, 1; is $res->each_Location, 1; #in $pos = Bio::Location::Simple->new (-start => 21, -end => 22, -location_type=>'^'); $res = $pair->map($pos); is $res->each_match, 1; is $res->each_Location, 1; #just before $pos = Bio::Location::Simple->new (-start => 20, -end => 21, -location_type=>'^'); $res = $pair->map($pos); is $res->each_gap, 1; is $res->each_Location, 1; #just after $pos = Bio::Location::Simple->new (-start => 40, -end => 41, -location_type=>'^'); $res = $pair->map($pos); is $res->each_gap, 1; is $res->each_Location, 1; # # strandness # # 11 6 4 2 # -|--------|- # -|--------|- # 2 7 9 11 # # from $match1 = Bio::Location::Simple->new (-seq_id => 'from', -start => 2, -end => 11, -strand=>1 ); # to $match2 = Bio::Location::Simple->new (-seq_id => 'to', -start => 2, -end => 11, -strand=>-1 ); $pair = Bio::Coordinate::Pair->new(-in => $match1, -out => $match2 ); # # match within # ok $pair->test; is $pair->strand(), -1; $pos = Bio::Location::Simple->new (-seq_id => 'from', -start => 7, -end => 9, -strand=>1 ); $res = $pair->map($pos); is $res->match->start, 4; is $res->match->end, 6; is $res->match->strand, -1; $pos = Bio::Location::Simple->new (-seq_id => 'from', -start => 3, -end => 10, -strand=>-1 ); $res = $pair->map($pos); is $res->match->start, 3; is $res->match->end, 10; is $res->match->strand, 1; # # match outside = Gap # $pos = Bio::Location::Simple->new (-seq_id => 'from', -start => 1, -end => 1, -strand=>1 ); $res = $pair->map($pos); is $res->gap->start, 1; is $res->gap->end, 1; is $res->gap->strand, 1; $pos = Bio::Location::Simple->new (-seq_id => 'from', -start => 12, -end => 12, -strand=>-1 ); $res = $pair->map($pos); is $res->gap->start, 12; is $res->gap->end, 12; is $res->gap->strand, -1; # # partial match1 = gap & match # $pos = Bio::Location::Simple->new (-seq_id => 'from', -start => 1, -end => 7, -strand=>-1 ); $res = $pair->map($pos); is $res->gap->start, 1; is $res->gap->end, 1; is $res->gap->strand, -1; is $res->match->start, 6; is $res->match->end, 11; is $res->match->strand, 1; # # partial match2 = match & gap # $pos = Bio::Location::Simple->new (-seq_id => 'from', -start => 9, -end => 12, -strand=>-1 ); $res = $pair->map($pos); is $res->match->start, 2; is $res->match->end, 4; is $res->match->strand, 1; is $res->gap->start, 12; is $res->gap->end, 12; is $res->gap->strand, -1; # #enveloping # $pos = Bio::Location::Simple->new (-seq_id => 'from', -start => 1, -end => 12, -strand=>-1 ); $res = $pair->map($pos); is $res->match->start, 2; is $res->match->end, 11; is $res->match->strand, 1; my ($gap1, $gap2) = $res->each_gap; is $gap1->start, 1; is $gap1->end, 1; is $gap1->strand, -1; is $gap2->start, 12; is $gap2->end, 12; is $gap2->strand, -1; # # Chain # # chain (two) mappers together # # propepide $match1 = Bio::Location::Simple->new (-seq_id => 'propeptide', -start => 5, -end => 40, -strand=>1 ); # peptide $match2 = Bio::Location::Simple->new (-seq_id => 'peptide', -start => 1, -end => 36, -strand=>1 ); ok $pair = Bio::Coordinate::Pair->new(-in => $match1, -out => $match2 ); ok my $chain = Bio::Coordinate::Chain->new; ok $chain->add_mapper($pair); $chain->add_mapper($pair); $pos = Bio::Location::Simple->new (-seq_id => 'from', -start => 6, -end => 21, -strand=> 1 ); # 6 -> 2 -> 1 # 21 -> 17 -> 13 $match = $chain->map($pos); isa_ok $match, 'Bio::Coordinate::Result::Match'; is $match->start, 1; is $match->end, 13; is $match->strand, 1; # # Collection # # 1 5 6 10 # |---| |---| #-----|----------------------- # 1 5 9 15 19 # pair1 pair2 # gene $match1 = Bio::Location::Simple->new (-seq_id => 'gene', -start => 5, -end => 9, -strand=>1 ); # exon2 $match2 = Bio::Location::Simple->new (-seq_id => 'exon1', -start => 1, -end => 5, -strand=>1 ); ok my $pair1 = Bio::Coordinate::Pair->new(-in => $match1, -out => $match2, ); # gene my $match3 = Bio::Location::Simple->new (-seq_id => 'gene', -start => 15, -end => 19, -strand=>1 ); # exon my $match4 = Bio::Location::Simple->new (-seq_id => 'exon2', -start => 6, -end => 10, -strand=>1 ); ok my $pair2 = Bio::Coordinate::Pair->new(-in => $match3, -out => $match4, ); ok my $transcribe = Bio::Coordinate::Collection->new; ok $transcribe->add_mapper($pair1); ok $transcribe->add_mapper($pair2); # simple match $pos = Bio::Location::Simple->new (-start => 5, -end => 9 ); ok $res = $transcribe->map($pos); is $res->match->start, 1; is $res->match->end, 5; is $res->match->seq_id, 'exon1'; # flank pre $pos = Bio::Location::Simple->new (-start => 2, -end => 9 ); ok $res = $transcribe->map($pos); is $res->each_gap, 1; is $res->each_match, 1; is $res->match->start, 1; is $res->match->end, 5; # flank post $pos = Bio::Location::Simple->new (-start => 5, -end => 12 ); ok $res = $transcribe->map($pos); is $res->each_gap, 1; is $res->each_match, 1; is $res->match->start, 1; is $res->match->end, 5; # match more than two $pos = Bio::Location::Simple->new (-start => 5, -end => 19 ); ok $res = $transcribe->map($pos); is $res->each_gap, 2; is $res->each_match, 2; # testing sorting # # 1 5 6 10 11 15 # |---| |---| |---| #-----|-----------------------|---|-- # 1 5 9 15 19 25 29 # pair1 pair2 pair3 # # # create the third pair # gene my $match5 = Bio::Location::Simple->new (-seq_id => 'gene', -start => 25, -end => 29, -strand=>1 ); # exon my $match6 = Bio::Location::Simple->new (-seq_id => 'exon3', -start => 11, -end => 15, -strand=>1 ); my $pair3 = Bio::Coordinate::Pair->new(-in => $match5, -out => $match6 ); # create a new collection in wrong order $transcribe = Bio::Coordinate::Collection->new; $transcribe->add_mapper($pair3); $transcribe->add_mapper($pair1); $transcribe->add_mapper($pair2); ok $transcribe->sort; my @res; map {push @res, $_->in->start } $transcribe->each_mapper; ok compare_arrays ([5, 15, 25], \@res); # # Test using genomic data # my $mapper = Bio::Coordinate::Collection->new; load_data($mapper, undef ); # transform a segment entirely within the first rawcontig #test_transform ($mapper, # [627012, 2, 5, -1, "rawcontig"], # ["chr1", 2, 5, -1]); $pos = Bio::Location::Simple->new (-start => 2, -end => 5, -strand => -1); $res = $mapper->map($pos); is $res->match->start, 2; is $res->match->end, 5; is $res->match->strand, -1; is $res->match->seq_id, '627012'; ## now a split coord my @testres = ( [314696, 31917, 31937, -1], [341, 126, 59773, -1], [315843, 5332, 5963, +1] ); $pos = Bio::Location::Simple->new (-start => 383700, -end => 444000, -strand => 1); $res = $mapper->map($pos); @res = $res->each_match; compare (shift @res, shift @testres); compare (shift @res, shift @testres); compare (shift @res, shift @testres); ## now a simple gap @testres = ( [627011, 7447, 7507, +1], ["chr1", 273762, 273781, 1] ); $pos = Bio::Location::Simple->new (-start => 273701, -end => 273781, -strand => 1); $res = $mapper->map($pos); is $res->each_match, 1; is $res->each_gap, 1; @res = $res->each_Location; compare (shift @res, shift @testres); compare (shift @res, shift @testres); ok $mapper->swap; $pos = Bio::Location::Simple->new (-start => 2, -end => 5, -strand => -1, -seq_id => '627012'); $res = $mapper->map($pos); is $res->match->start, 2; is $res->match->end, 5; is $res->match->strand, -1; is $res->match->seq_id, 'chr1'; # # tests for split locations # # testing a simple pair $match1 = Bio::Location::Simple->new (-seq_id => 'a', -start => 5, -end => 17, -strand=>1 ); $match2 = Bio::Location::Simple->new (-seq_id => 'b', -start => 1, -end => 13, -strand=>-1 ); $pair = Bio::Coordinate::Pair->new(-in => $match1, -out => $match2, ); # split location ok my $split = Bio::Location::Split->new(); ok $split->add_sub_Location(Bio::Location::Simple->new(-start=>6, -end=>8, -strand=>1)); $split->add_sub_Location(Bio::Location::Simple->new(-start=>15, -end=>16, -strand=>1)); $res=$pair->map($split); ok my @sublocs = $res->each_Location(1); is @sublocs, 2; #print Dumper \@sublocs; is $sublocs[0]->start, 2; is $sublocs[0]->end, 3; is $sublocs[1]->start, 10; is $sublocs[1]->end, 12; # # from Align # use Bio::Coordinate::Utils; use Bio::LocatableSeq; use Bio::SimpleAlign; my $string; #y $out = IO::String->new($string); #AAA/3-10 --wtatgtng #BBB/1-7 -aaaat-tt- my $s1 = Bio::LocatableSeq->new(-id => 'AAA', -seq => '--wtatgtng', -start => 3, -end => 10, -alphabet => 'dna' ); my $s2 = Bio::LocatableSeq->new(-id => 'BBB', -seq => '-aaaat-tt-', -start => 1, -end => 7, -alphabet => 'dna' ); $a = Bio::SimpleAlign->new(); $a->add_seq($s1); $a->add_seq($s2); ok my $uti = Bio::Coordinate::Utils->new; $mapper = $uti->from_align($a); #print Dumper $mapper; is $mapper->return_match, 1; is $mapper->return_match(1), 1; $pos = Bio::Location::Simple->new (-start => 4, -end => 8, -strand => 1); $res = $mapper->map($pos); #print Dumper $res; exit; # end of tests # # subroutines only after this # sub compare_arrays { my ($first, $second) = @_; return 0 unless @$first == @$second; for (my $i = 0; $i < @$first; $i++) { return 0 if $first->[$i] ne $second->[$i]; } return 1; } sub compare { my ($match, $test) = @_; is $match->seq_id eq $test->[0], 1, "Match: |". $match->seq_id. "| Test: ". $test->[0]. "|"; is $match->start, $test->[1]; is $match->end, $test->[2]; is $match->strand, $test->[3]; } sub load_data { my ($map, $reverse) = @_; #chr_name raw_id chr_start chr_end raw_start raw_end raw_ori my @sgp_dump = split ( /\n/, qq { chr1 627012 1 31276 1 31276 1 chr1 627010 31377 42949 72250 83822 -1 chr1 2768 42950 180950 251 138251 1 chr1 10423 180951 266154 1 85204 -1 chr1 627011 266255 273761 1 7507 1 chr1 314698 273862 283122 1 9261 -1 chr1 627009 283223 331394 251 48422 -1 chr1 314695 331395 352162 1 20768 -1 chr1 314697 352263 359444 1 7182 -1 chr1 314696 359545 383720 31917 56092 -1 chr1 341 383721 443368 126 59773 -1 chr1 315843 443369 444727 5332 6690 1 chr1 315844 444828 453463 1 8636 -1 chr1 315834 453564 456692 1 3129 1 chr1 315831 456793 458919 1 2127 1 chr1 315827 459020 468965 251 10196 -1 chr1 544782 468966 469955 1 990 -1 chr1 315837 470056 473446 186 3576 -1 chr1 544807 473447 474456 1 1010 -1 chr1 315832 474557 477289 1 2733 1 chr1 544806 477390 477601 1086 1297 -1 chr1 315840 477602 482655 21 5074 1 chr1 544802 482656 483460 1 805 -1 chr1 544811 483561 484162 6599 7200 -1 chr1 315829 484163 498439 15 14291 -1 chr1 544813 498440 500980 1 2541 -1 chr1 544773 501081 502190 1217 2326 -1 chr1 315828 502191 513296 72 11177 1 chr1 544815 513297 517276 2179 6158 1 chr1 315836 517277 517662 2958 3343 1 chr1 544805 517663 520643 299 3279 1 chr1 315835 520744 521682 2462 3400 -1 chr1 544784 521683 526369 54 4740 1 chr1 544796 526470 527698 1 1229 1 chr1 315833 527799 528303 2530 3034 -1 chr1 544803 528304 531476 1 3173 -1 chr1 544821 531577 532691 1 1115 1 chr1 544810 532792 533843 1 1052 1 chr1 544800 533944 535249 1 1306 1 chr1 544786 535350 536652 1 1303 1 chr1 544814 536753 538358 1 1606 1 chr1 544812 538459 540004 1 1546 1 chr1 544818 540105 541505 1 1401 1 chr1 544816 541606 542693 1 1088 1 chr1 544778 542794 544023 1 1230 1 chr1 544779 544124 545709 1 1586 1 chr1 544804 545810 547660 1 1851 1 chr1 544774 547761 550105 1 2345 1 chr1 544817 550206 552105 1 1900 1 chr1 544781 552206 553640 1 1435 1 chr1 315830 553741 555769 1 2029 -1 chr1 544819 555870 558904 1 3035 -1 chr1 544777 559005 560670 1 1666 1 chr1 544795 560771 563092 1 2322 1 chr1 544809 563193 565523 1 2331 1 chr1 544808 565624 568113 1 2490 1 chr1 544798 568214 570324 1 2111 1 chr1 544783 570425 574640 1 4216 1 chr1 544824 574741 578101 1 3361 1 chr1 544775 578202 580180 1 1979 -1 chr1 544825 580281 581858 1 1578 -1 chr1 544772 581959 585312 1 3354 1 chr1 544793 585413 588740 1 3328 1 chr1 544785 588841 591656 1 2816 -1 chr1 544791 591757 594687 1 2931 1 chr1 544820 594788 597671 1 2884 1 chr1 544790 597772 601587 1 3816 1 chr1 544794 601688 603324 1 1637 -1 chr1 544823 603425 607433 1 4009 1 chr1 544789 607534 610856 1 3323 1 chr1 544799 610957 614618 1 3662 1 chr1 544776 614719 618674 1 3956 -1 chr1 544797 618775 624522 1 5748 -1 chr1 544787 624623 629720 1 5098 -1 chr1 544792 629821 637065 1 7245 1 chr1 622020 837066 851064 1 13999 -1 chr1 622021 851165 854101 1 2937 -1 chr1 622016 854202 856489 1 2288 -1 chr1 625275 856590 888524 420 32354 -1 chr1 622015 888525 891483 1 2959 -1 chr1 622024 891584 896208 8871 13495 -1 chr1 625537 896209 952170 1 55962 -1 chr1 625538 952271 1051812 251 99792 -1 chr1 625277 1051813 1055193 1 3381 -1 chr1 625266 1055294 1062471 1 7178 -1 chr1 598266 1062572 1086504 11 23943 -1 chr1 625271 1086505 1096571 3943 14009 1 chr1 625265 1096572 1100161 2436 6025 -1 chr1 173125 1100162 1106067 3329 9234 -1 chr1 598265 1106068 1112101 286 6319 1 chr1 625360 1112102 1172572 251 60721 1 chr1 173111 1172573 1172716 1 144 -1 chr1 173103 1172817 1173945 1 1129 1 chr1 170531 1174046 1174188 8791 8933 -1 chr1 625363 1174189 1183590 67 9468 1 chr1 173120 1183591 1183929 153 491 -1 chr1 170509 1183930 1184112 864 1046 1 chr1 173119 1184213 1189703 1 5491 -1 chr1 625357 1189804 1213915 1 24112 1 chr1 625359 1214016 1216330 1 2315 1 } ); # test the auto-sorting feature # @sgp_dump = reverse (@sgp_dump) if defined $reverse; my $first = 1; for my $line ( @sgp_dump ) { if( $first ) { $first = 0; next; } my ( $chr_name, $contig_id, $chr_start, $chr_end, $contig_start, $contig_end, $contig_strand ) = split ( /\t/, $line ); my $match1 = Bio::Location::Simple->new (-seq_id => $chr_name, -start => $chr_start, -end => $chr_end, -strand=>1 ); my $match2 = Bio::Location::Simple->new (-seq_id => $contig_id, -start => $contig_start, -end => $contig_end, -strand=>$contig_strand ); my $pair = Bio::Coordinate::Pair->new(-in => $match1, -out => $match2, ); $map->add_mapper($pair); } return $map; } Bio000755000765000024 013024421741 16005 5ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/libCoordinate.pm100644000765000024 467613024421741 20607 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/lib/Biouse strict; use warnings; package Bio::Coordinate; our $AUTHORITY = 'cpan:BIOPERLML'; $Bio::Coordinate::VERSION = '1.007001'; # ABSTRACT: Modules for working with biological coordinates # AUTHOR: Heikki Lehvaslaiho # OWNER: Heikki Lehvaslaiho # LICENSE: Perl_5 # CONTRIBUTOR: Ewan Birney 1; __END__ =pod =encoding utf-8 =head1 NAME Bio::Coordinate - Modules for working with biological coordinates =head1 VERSION version 1.007001 =head1 SYNOPSIS # create Bio::Coordinate::Pairs or other Bio::Coordinate::MapperIs somehow $pair1; $pair2; # add them into a Collection $collection = Bio::Coordinate::Collection->new; $collection->add_mapper($pair1); $collection->add_mapper($pair2); # create a position and map it $pos = Bio::Location::Simple->new (-start => 5, -end => 9 ); $res = $collection->map($pos); $res->match->start == 1; $res->match->end == 5; # if mapping is many to one (*>1) or many-to-many (*>*) # you have to give seq_id not get unrelevant entries $pos = Bio::Location::Simple->new (-start => 5, -end => 9 -seq_id=>'clone1'); =head1 DESCRIPTION Bio::Coordinate classes are used for working with various biological coordinate systems. See L and L for examples. =head1 FEEDBACK =head2 Mailing lists User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing list. Your participation is much appreciated. bioperl-l@bioperl.org - General discussion http://bioperl.org/wiki/Mailing_lists - About the mailing lists =head2 Support Please direct usage questions or support issues to the mailing list: I rather than to the module maintainer directly. Many experienced and reponsive experts will be able look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible. =head2 Reporting bugs Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web: https://github.com/bioperl/%%7Bdist%7D =head1 AUTHOR Heikki Lehvaslaiho =head1 COPYRIGHT This software is copyright (c) by Heikki Lehvaslaiho. This software is available under the same terms as the perl 5 programming language system itself. =cut author-pod-syntax.t100644000765000024 45213024421741 20705 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/t#!perl BEGIN { unless ($ENV{AUTHOR_TESTING}) { print "1..0 # SKIP these tests are for testing by the author\n"; exit } } # This file was automatically generated by Dist::Zilla::Plugin::PodSyntaxTests. use strict; use warnings; use Test::More; use Test::Pod 1.41; all_pod_files_ok(); GeneCoordinateMapper.t100644000765000024 3402113024421741 21371 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/tuse strict; use warnings; BEGIN { use Bio::Root::Test; test_begin(-tests => 116); use_ok('Bio::Location::Simple'); use_ok('Bio::Coordinate::Pair'); use_ok('Bio::Coordinate::ExtrapolatingPair'); use_ok('Bio::Coordinate::GeneMapper'); } # # Extrapolating pairs # # No gaps returned, matches extrapolated # returns always a match or undef # -strict # # the reverse strand pair my $inr = Bio::Location::Simple->new(-start=>2, -end=>5, -strand=>1); my $outr = Bio::Location::Simple->new(-start=>10, -end=>13, -strand=>-1); ok my $pairr = Bio::Coordinate::ExtrapolatingPair-> new(-in => $inr, -out => $outr ); my $posr = Bio::Location::Simple->new (-start => 3, -end => 4, -strand=> 1 ); my $resr = $pairr->map($posr); is $resr->start, 11; is $resr->end, 12; is $resr->strand, -1; # propepide my $match1 = Bio::Location::Simple->new (-seq_id => 'propeptide', -start => 21, -end => 40, -strand=>1 ); # peptide my $match2 = Bio::Location::Simple->new (-seq_id => 'peptide', -start => 1, -end => 20, -strand=>1 ); ok my $pair = Bio::Coordinate::ExtrapolatingPair-> new(-in => $match1, -out => $match2, -strict => 1 ); ok $pair->test; is $pair->strand(), 1; # = in->strand * out->strand is $pair->in->seq_id(), 'propeptide'; is $pair->strict(), 1; my ($count, $pos, $pos2, $res, $match, $res2); # match within $pos = Bio::Location::Simple->new (-start => 25, -end => 25, -strand=> -1 ); $res = $pair->map($pos); isa_ok $res, 'Bio::Location::Simple'; is $res->start, 5; is $res->end, 5; is $res->strand, -1; is $res->seq_id, 'peptide'; # match outside = undef $pos = Bio::Location::Simple->new (-start => 5, -end => 5 ); $res = $pair->map($pos); is $res, undef; # # partial match = match # $pos2 = Bio::Location::Simple->new (-start => 20, -end => 22, -strand=> -1 ); ok $res = $pair->map($pos2); is $res->start, 0; is $res->end, 2; is $res->seq_id, 'peptide'; is $res->strand, -1; # # partial match2 = match & gap # $pos2 = Bio::Location::Simple->new (-start => 40, -end => 41, -strand=> 1 ); ok $res = $pair->map($pos2); is $res->start, 20; is $res->end, 20; # #enveloping # $pos2 = Bio::Location::Simple->new (-start => 19, -end => 41, -strand=> 1 ); ok $res = $pair->map($pos2); is $res->start, 1; is $res->end, 20; # # testing the changing the strand # # chr $match1 = Bio::Location::Simple->new (-seq_id => 'chr', -start => 21, -end => 40, -strand=>1 ); # gene $match2 = Bio::Location::Simple->new (-seq_id => 'gene', -start => 1, -end => 20, -strand=>-1 ); $pair = Bio::Coordinate::ExtrapolatingPair-> #my $pair = Bio::Coordinate::Pair-> new(-in => $match1, -out => $match2, -strict => 0 ); $pos = Bio::Location::Simple->new (-start => 38, -end => 40, -strand=> 1 ); $res = $pair->map($pos); is $res->start, 1; is $res->end, 3; is $res->strand, -1; $pos = Bio::Location::Simple->new (-start => 1, -end => 3, -strand=> 1 ); $res = $pair->map($pos); is $res->start, 38; is $res->end, 40; is $res->strand, -1; # # # Gene Mapper # # ok my $m = Bio::Coordinate::GeneMapper->new(-in => 'propeptide', -out => 'peptide'); #$m->verbose(2); is $m->peptide_offset(5), 5; # match within $pos = Bio::Location::Simple->new (-start => 25, -end => 25, -strand=> 1 ); $res = $m->map($pos); is $res->start, 20; is $res->end, 20; is $res->strand, 1; is $res->seq_id, 'peptide'; # # nozero # # match within $pos = Bio::Location::Simple->new (-start => 4, -end => 5, -strand=> 1 ); $res = $m->map($pos); is $res->start, -1; is $res->end, 0; is $m->nozero('in&out'), 'in&out'; $res = $m->map($pos); is $res->start, -2; is $res->end, -1; is $m->nozero(0), 0; ok $m->swap; $pos = Bio::Location::Simple->new (-start => 5, -end => 5, -strand=> 1 ); $res = $m->map($pos); is $res->start, 10; # cds -> propeptide is $m->in('cds'), 'cds'; is $m->out('propeptide'), 'propeptide'; $res = $m->map($pos); is $res->start, 2; ok $res = $m->_translate($pos); is $res->start, 2; ok $res = $m->_reverse_translate($pos); is $res->start, 13; is $res->end, 15; $pos = Bio::Location::Simple->new (-start => 26, -end => 26, -strand=> 1 ); $m->out('peptide'); $res = $m->map($pos); is $res->start, 4; # # frame # $pos = Bio::Location::Simple->new (-start => 1, -end => 3, -strand=> 1 ); $res = $m->_frame($pos); is $res->start, 1; is $res->end, 3; # Collection representing exons # # cds 1 5 6 10 11 15 # exon 1 5 1 5 1 5 # gene 1 5 11 15 21 25 # |---| |---| |---| #-----|-----------------------|---|-- # chr 1 5 9 15 19 25 29 # pair1 pair2 pair3 # gene my $e1 = Bio::Location::Simple->new (-seq_id => 'gene', -start => 5, -end => 9, -strand=>1 ); my $e2 = Bio::Location::Simple->new (-seq_id => 'gene', -start => 15, -end => 19, -strand=>1 ); my $e3 = Bio::Location::Simple->new (-seq_id => 'gene', -start => 25, -end => 29, -strand=>1 ); my @cexons = ($e1, $e2, $e3); $m= Bio::Coordinate::GeneMapper->new(); $m->in('chr'); $m->out('gene'); my $off = $m->cds(5); is $off->start, 5; # start of the coding region is $m->exons(@cexons), 3; $m->out('exon'); $pos = Bio::Location::Simple->new (-start => 6, -end => 7, -strand=> 1 ); $res = $m->map($pos); is $res->start, 2; is $res->end, 3; $m->out('negative_intron'); $pos = Bio::Location::Simple->new (-start => 12, -end => 14, -strand=> 1 ); $res = $m->map($pos); is $res->start, -3; is $res->end, -1; is $res->seq_id, 'intron1'; # cds $m->out('cds'); $pos = Bio::Location::Simple->new (-start => 5, -end => 9, -strand=> 1 ); $res = $m->map($pos); is $res->start, 1; is $res->end, 5; $pos = Bio::Location::Simple->new (-start => 15, -end => 25, -strand=> 1 ); $res = $m->map($pos); is $res->start, 6; is $res->end, 11; $pos = Bio::Location::Simple->new (-start => 5, -end => 19, -strand=> 1 ); $res = $m->map($pos); is $res->start, 1; is $res->end, 10; # # chr to cds ; ranges into one # my $exons = Bio::Location::Split->new(-seq_id => 'gene'); $exons->add_sub_Location($e1); $exons->add_sub_Location($e2); $exons->add_sub_Location($e3); $res = $m->map($exons); isa_ok $res,'Bio::Location::Simple'; is $res->start, 1; is $res->end, 15; # # cds to chr; single range into two # $m->in('cds'); $m->out('gene'); $pos = Bio::Location::Simple->new (-start => 4, -end => 7, -strand=> 1 ); $res = $m->map($pos); is $res->start, 4; is $res->end, 12; # Collection representing exons # # cds -11 -7 -6 -2 -1 3 :27 # cds -6 -2 -1 1 3 4 8 :17 # exon 1 5 1 5 1 5 # gene -21 -17 -11 -7 -1 1 3 :27 # gene -11 -7 -1 1 3 9 13 :17 # |---| |---| |---| #-----|-----------------------|---|-- # chr 1 5 9 15 19 25 29 # pair1 pair2 pair3 $m= Bio::Coordinate::GeneMapper->new(); $m->in('chr'); $m->out('gene'); $off = $m->cds(17); is $off->start, 17; # start of the coding region is $m->exons(@cexons), 3; # testing parameter handling in the constructor ok $m = Bio::Coordinate::GeneMapper->new(-in => 'gene', -out => 'peptide', -cds => 3, -exons => @cexons, -utr => 7, -peptide_offset => 5 ); # # Real life data # Mapping SNPs into human serum protein MSE55 and # human galecting LGALS2 from Ensembl: # #Ensembl Gene ID Exon Start (Chr bp) Exon End (Chr bp) Exon Coding Start (Chr bp) # Exon Coding End (Chr bp) Strand my @gene1_dump = split ( /\n/, qq { ENSG00000128283 34571058 34571126 1 ENSG00000128283 34576610 34577350 34576888 34577350 1 ENSG00000128283 34578646 34579858 34578646 34579355 1 }); my @gene2_dump = split ( /\n/, qq { ENSG00000100079 34590438 34590464 -1 ENSG00000100079 34582387 34582469 34582387 34582469 -1 ENSG00000100079 34581114 34581273 34581114 34581273 -1 ENSG00000100079 34580784 34580950 34580804 34580950 -1 }); # exon start should be less than end or is this intentional? #Chromosome Name Location (bp) Strand Reference ID my @snp_dump = split ( /\n/, qq { 22 34572694 1 2235335 22 34572799 1 2235336 22 34572843 1 2235337 22 34574896 1 2076087 22 34575256 1 2076088 22 34578830 1 2281098 22 34579111 1 2281099 22 34580411 1 2235338 22 34580591 1 2281097 22 34580845 1 2235339 22 34581963 1 2281100 22 34583722 1 140057 22 34585003 1 140058 22 34587726 1 968725 22 34588207 1 2284055 22 34591507 1 1969639 22 34591949 1 140059 }); shift @snp_dump; my ($cdsr, @exons) = read_gene_data(@gene1_dump); ok my $g1 = Bio::Coordinate::GeneMapper->new(-in=>'chr', -out=>'gene'); $g1->cds($cdsr); #$pos = Bio::Location::Simple->new # (-start => 34576888, -end => 34579355, -strand=> 1 ); $res = $g1->map($cdsr); is $res->start, 1; is $res->end, 2468; $g1->exons(@exons); $g1->in('gene'); $g1->out('cds'); $res = $g1->map($res); is $res->start, 1; is $res->end, 1173; #map_snps($g1, @snp_dump); #gene 2 in reverse strand ($cdsr, @exons) = read_gene_data(@gene2_dump); ok my $g2 = Bio::Coordinate::GeneMapper->new(-in=>'chr', -out=>'gene'); $g2->cds($cdsr); $pos = Bio::Location::Simple->new (-start => $cdsr->end-2, -end => $cdsr->end, -strand=> 1 ); $res = $g2->map($pos); is $res->start, 1; is $res->end, 3; is $res->strand, -1; $g2->exons(@exons); #map_snps($g2, @snp_dump); $match1 = Bio::Location::Simple->new (-seq_id => 'a', -start => 5, -end => 17, -strand=>1 ); $match2 = Bio::Location::Simple->new (-seq_id => 'b', -start => 1, -end => 13, -strand=>-1 ); ok $pair = Bio::Coordinate::Pair->new(-in => $match1, -out => $match2, ); # # split location # ok my $split = Bio::Location::Split->new(); ok $split->add_sub_Location(Bio::Location::Simple->new(-start=>6, -end=>8, -strand=>1)); $split->add_sub_Location(Bio::Location::Simple->new(-start=>15, -end=>16, -strand=>1)); $res=$pair->map($split); ok my @sublocs = $res->each_Location(1); is @sublocs, 2; #print Dumper \@sublocs; is $sublocs[0]->start, 2; is $sublocs[0]->end, 3; is $sublocs[1]->start, 10; is $sublocs[1]->end, 12; # testing cds -> gene/chr which generates a split location from a simple one # exons in reverse strand! # # pept 33222 111 # cds 8 4 3 1-1 # exon 5 1 5 1 # gene 13 9 3 1-2 # |---| |---| #-----|------------------- # chr 1 5 9 15 19 # e1 e2 # gene $e1 = Bio::Location::Simple->new (-seq_id => 'gene', -start => 5, -end => 9, -strand=>-1 ); $e2 = Bio::Location::Simple->new (-seq_id => 'gene', -start => 15, -end => 19, -strand=>-1 ); @cexons = ($e1, $e2); my $cds= Bio::Location::Simple->new (-seq_id => 'gene', -start => 5, -end => 17, -strand=>-1 ); $m = Bio::Coordinate::GeneMapper->new(-in=>'cds', -out=>'chr'); $m->cds($cds); # this has to be set first!? is $m->exons(@cexons), 2; my $cds_f= Bio::Location::Simple->new (-start => 2, -end => 7, ); $res = $m->map($cds_f); ok @sublocs = $res->each_Location(1); is @sublocs, 2; is $sublocs[0]->start, 6; is $sublocs[0]->end, 9; is $sublocs[1]->start, 15; is $sublocs[1]->end, 16; # test inex, exon & negative_intron $m->in('gene'); $m->out('inex'); $pos = Bio::Location::Simple->new (-seq_id => 'gene', -start => 2, -end => 10, -strand=> 1 ); $res = $m->map($pos); is $res->each_Location, 3; $m->out('intron'); $res = $m->map($pos); is $res->start, 1; is $res->end, 5; is $res->strand, 1; $m->out('negative_intron'); $res = $m->map($pos); is $res->start, -5; is $res->end, -1; is $res->strand, 1; is $m->_mapper_code2string('1-2'), 'chr-gene'; is $m->_mapper_string2code('chr-gene'), '1-2'; #todo: # strict mapping mode # extrapolating pair code into Bio::Coordinate::Pair ? sub read_gene_data { my ($self,@gene_dump) = @_; my ($cds_start, $cds_end, $strand, @exons); #one line per exon my ($first, $first_line); for my $line ( @gene_dump ) { my ($geneid, $exon_start, $exon_end, $exon_cstart, $exon_cend, $exon_strand) = split /\t/, $line; $strand = $exon_strand if $exon_strand; #print join (' ', $geneid, $exon_start, $exon_strand), "\n"; # CDS location in chromosome coordinates $cds_start = $exon_cstart if !$cds_start and $exon_cstart; $cds_end = $exon_cend if $exon_cend; if ($exon_start > $exon_end) { ($exon_start, $exon_end) = ($exon_end, $exon_start); } my $exon = Bio::Location::Simple->new (-seq_id => 'gene', -start => $exon_start, -end => $exon_end, -strand=>$strand, -verbose=>2); push @exons, $exon; } if ($cds_start > $cds_end) { ($cds_start, $cds_end) = ($cds_end, $cds_start); } my $cdsr = Bio::Location::Simple->new (-start => $cds_start, -end => $cds_end, -strand=> $strand); return ($cdsr, @exons); } sub map_snps { my ($mapper, @snps) =@_; $mapper->in('chr'); $mapper->out('cds'); foreach my $line (@snps) { $mapper->out('cds'); my ($chr, $start, $strand, $id) = split /\t/, $line; my $loc = Bio::Location::Simple->new ( -start => $start, -end => $start, -strand=>$strand ); my $res = $mapper->map($loc); my $cds_start = 0; $cds_start = $res->start if defined $res;#defined $res->start; print $id, "\t", $cds_start, "\n"; # coding if ($cds_start) { $mapper->out('propeptide'); my $frame_obj = $mapper->_frame($res); my $res = $mapper->map($loc); my $cds_start = 0; $cds_start = $res->start if defined $res;#defined $res->start; print "\t\t", $cds_start, " (", $frame_obj->start, ")\n"; } } } Coordinate000755000765000024 013024421741 20074 5ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/lib/BioPair.pm100644000765000024 2741313024421741 21514 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/lib/Bio/Coordinatepackage Bio::Coordinate::Pair; our $AUTHORITY = 'cpan:BIOPERLML'; $Bio::Coordinate::Pair::VERSION = '1.007001'; use utf8; use strict; use warnings; use Bio::Coordinate::Result; use Bio::Coordinate::Result::Match; use Bio::Coordinate::Result::Gap; use parent qw(Bio::Root::Root Bio::Coordinate::MapperI); # ABSTRACT: Continuous match between two coordinate sets. # AUTHOR: Heikki Lehvaslaiho # OWNER: Heikki Lehvaslaiho # LICENSE: Perl_5 sub new { my($class,@args) = @_; my $self = $class->SUPER::new(@args); my($in, $out) = $self->_rearrange([qw(IN OUT )], @args); $in && $self->in($in); $out && $self->out($out); return $self; # success - we hope! } sub in { my ($self,$value) = @_; if( defined $value) { $self->throw("Not a valid input Bio::Location [$value] ") unless $value->isa('Bio::LocationI'); $self->{'_in'} = $value; } return $self->{'_in'}; } sub out { my ($self,$value) = @_; if( defined $value) { $self->throw("Not a valid output coordinate Bio::Location [$value] ") unless $value->isa('Bio::LocationI'); $self->{'_out'} = $value; } return $self->{'_out'}; } sub swap { my ($self) = @_; ($self->{'_in'}, $self->{'_out'}) = ($self->{'_out'}, $self->{'_in'}); return 1; } sub strand { my ($self) = @_; $self->warn("Outgoing coordinates are not defined") unless $self->out; $self->warn("Incoming coordinates are not defined") unless $self->in; return ($self->in->strand || 0) * ($self->out->strand || 0); } sub test { my ($self) = @_; $self->warn("Outgoing coordinates are not defined") unless $self->out; $self->warn("Incoming coordinates are not defined") unless $self->in; return ($self->in->end - $self->in->start) == ($self->out->end - $self->out->start); } sub map { my ($self,$value) = @_; $self->throw("Need to pass me a value.") unless defined $value; $self->throw("I need a Bio::Location, not [$value]") unless $value->isa('Bio::LocationI'); $self->throw("Input coordinate system not set") unless $self->in; $self->throw("Output coordinate system not set") unless $self->out; if ($value->isa("Bio::Location::SplitLocationI")) { my $result = Bio::Coordinate::Result->new(); foreach my $loc ( $value->sub_Location(1) ) { my $res = $self->_map($loc); map { $result->add_sub_Location($_) } $res->each_Location; } return $result; } else { return $self->_map($value); } } sub _map { my ($self,$value) = @_; my $result = Bio::Coordinate::Result->new(); my $offset = $self->in->start - $self->out->start; my $start = $value->start - $offset; my $end = $value->end - $offset; my $match = Bio::Location::Simple->new; $match->location_type($value->location_type); $match->strand($self->strand); #within # |-------------------------| # |-| if ($start >= $self->out->start and $end <= $self->out->end) { $match->seq_id($self->out->seq_id); $result->seq_id($self->out->seq_id); if ($self->strand >= 0) { $match->start($start); $match->end($end); } else { $match->start($self->out->end - $end + $self->out->start); $match->end($self->out->end - $start + $self->out->start); } if ($value->strand) { $match->strand($match->strand * $value->strand); $result->strand($match->strand); } bless $match, 'Bio::Coordinate::Result::Match'; $result->add_sub_Location($match); } #out # |-------------------------| # |-| or |-| elsif ( ($end < $self->out->start or $start > $self->out->end ) or #insertions just outside the range need special settings ($value->location_type eq 'IN-BETWEEN' and ($end = $self->out->start or $start = $self->out->end))) { $match->seq_id($self->in->seq_id); $result->seq_id($self->in->seq_id); $match->start($value->start); $match->end($value->end); $match->strand($value->strand); bless $match, 'Bio::Coordinate::Result::Gap'; $result->add_sub_Location($match); } #partial I # |-------------------------| # |-----| elsif ($start < $self->out->start and $end <= $self->out->end ) { $result->seq_id($self->out->seq_id); if ($value->strand) { $match->strand($match->strand * $value->strand); $result->strand($match->strand); } my $gap = Bio::Location::Simple->new; $gap->start($value->start); $gap->end($self->in->start - 1); $gap->strand($value->strand); $gap->seq_id($self->in->seq_id); bless $gap, 'Bio::Coordinate::Result::Gap'; $result->add_sub_Location($gap); # match $match->seq_id($self->out->seq_id); if ($self->strand >= 0) { $match->start($self->out->start); $match->end($end); } else { $match->start($self->out->end - $end + $self->out->start); $match->end($self->out->end); } bless $match, 'Bio::Coordinate::Result::Match'; $result->add_sub_Location($match); } #partial II # |-------------------------| # |------| elsif ($start >= $self->out->start and $end > $self->out->end ) { $match->seq_id($self->out->seq_id); $result->seq_id($self->out->seq_id); if ($value->strand) { $match->strand($match->strand * $value->strand); $result->strand($match->strand); } if ($self->strand >= 0) { $match->start($start); $match->end($self->out->end); } else { $match->start($self->out->start); $match->end($self->out->end - $start + $self->out->start); } bless $match, 'Bio::Coordinate::Result::Match'; $result->add_sub_Location($match); my $gap = Bio::Location::Simple->new; $gap->start($self->in->end + 1); $gap->end($value->end); $gap->strand($value->strand); $gap->seq_id($self->in->seq_id); bless $gap, 'Bio::Coordinate::Result::Gap'; $result->add_sub_Location($gap); } #enveloping # |-------------------------| # |---------------------------------| elsif ($start < $self->out->start and $end > $self->out->end ) { $result->seq_id($self->out->seq_id); if ($value->strand) { $match->strand($match->strand * $value->strand); $result->strand($match->strand); } # gap1 my $gap1 = Bio::Location::Simple->new; $gap1->start($value->start); $gap1->end($self->in->start - 1); $gap1->strand($value->strand); $gap1->seq_id($self->in->seq_id); bless $gap1, 'Bio::Coordinate::Result::Gap'; $result->add_sub_Location($gap1); # match $match->seq_id($self->out->seq_id); $match->start($self->out->start); $match->end($self->out->end); bless $match, 'Bio::Coordinate::Result::Match'; $result->add_sub_Location($match); # gap2 my $gap2 = Bio::Location::Simple->new; $gap2->start($self->in->end + 1); $gap2->end($value->end); $gap2->strand($value->strand); $gap2->seq_id($self->in->seq_id); bless $gap2, 'Bio::Coordinate::Result::Gap'; $result->add_sub_Location($gap2); } else { $self->throw("Should not be here!"); } return $result; } 1; __END__ =pod =encoding utf-8 =head1 NAME Bio::Coordinate::Pair - Continuous match between two coordinate sets. =head1 VERSION version 1.007001 =head1 SYNOPSIS use Bio::Location::Simple; use Bio::Coordinate::Pair; my $match1 = Bio::Location::Simple->new (-seq_id => 'propeptide', -start => 21, -end => 40, -strand=>1 ); my $match2 = Bio::Location::Simple->new (-seq_id => 'peptide', -start => 1, -end => 20, -strand=>1 ); my $pair = Bio::Coordinate::Pair->new(-in => $match1, -out => $match2 ); # location to match $pos = Bio::Location::Simple->new (-start => 25, -end => 25, -strand=> -1 ); # results are in a Bio::Coordinate::Result # they can be Matches and Gaps; are Bio::LocationIs $res = $pair->map($pos); $res->isa('Bio::Coordinate::Result'); $res->each_match == 1; $res->each_gap == 0; $res->each_Location == 1; $res->match->start == 5; $res->match->end == 5; $res->match->strand == -1; $res->match->seq_id eq 'peptide'; =head1 DESCRIPTION This class represents a one continuous match between two coordinate systems represented by Bio::Location::Simple objects. The relationship is directed and reversible. It implements methods to ensure internal consistency, and map continuous and split locations from one coordinate system to another. The map() method returns Bio::Coordinate::Results with Bio::Coordinate::Result::Gaps. The calling code have to deal (process or ignore) them. =head1 METHODS =head2 new =head2 in Title : in Usage : $obj->in('peptide'); Function: Set and read the input coordinate system. Example : Returns : value of input system Args : new value (optional), Bio::LocationI =head2 out Title : out Usage : $obj->out('peptide'); Function: Set and read the output coordinate system. Example : Returns : value of output system Args : new value (optional), Bio::LocationI =head2 swap Title : swap Usage : $obj->swap; Function: Swap the direction of mapping; input <-> output Example : Returns : 1 Args : =head2 strand Title : strand Usage : $obj->strand; Function: Get strand value for the pair Example : Returns : ( 1 | 0 | -1 ) Args : =head2 test Title : test Usage : $obj->test; Function: test that both components are of the same length Example : Returns : ( 1 | undef ) Args : =head2 map Title : map Usage : $newpos = $obj->map($pos); Function: Map the location from the input coordinate system to a new value in the output coordinate system. Example : Returns : new Bio::LocationI in the output coordinate system or undef Args : Bio::LocationI object =head1 INTERNAL METHODS =head2 _map Title : _map Usage : $newpos = $obj->_map($simpleloc); Function: Internal method that does the actual mapping. Called multiple times by map() if the location to be mapped is a split location Example : Returns : new location in the output coordinate system or undef Args : Bio::Location::Simple =head1 FEEDBACK =head2 Mailing lists User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing list. Your participation is much appreciated. bioperl-l@bioperl.org - General discussion http://bioperl.org/wiki/Mailing_lists - About the mailing lists =head2 Support Please direct usage questions or support issues to the mailing list: I rather than to the module maintainer directly. Many experienced and reponsive experts will be able look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible. =head2 Reporting bugs Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web: https://github.com/bioperl/%%7Bdist%7D =head1 AUTHOR Heikki Lehvaslaiho =head1 COPYRIGHT This software is copyright (c) by Heikki Lehvaslaiho. This software is available under the same terms as the perl 5 programming language system itself. =cut CoordinateBoundaryTest.t100644000765000024 3044013024421741 21772 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/t## Test for a suspected bug and tests for debugging. use strict; use warnings; BEGIN { use Bio::Root::Test; test_begin( -tests => 174 ); use_ok('Bio::Location::Simple'); use_ok('Bio::Coordinate::Pair'); } ## Set up two coordinate systems for the same sequence ## The contig ok my $ctg = Bio::Location::Simple-> new( -seq_id => 'ctg', -start => 1, -end => 1001, -strand => +1, ); isa_ok $ctg, 'Bio::Location::Simple'; ## The contig in the chromosome ok my $ctg_on_chr_f = Bio::Location::Simple-> new( -seq_id => 'ctg on chr f', -start => 5001, -end => 6001, -strand => +1, ); isa_ok $ctg_on_chr_f, 'Bio::Location::Simple'; ## The contig in the chromosome (again) ok my $ctg_on_chr_r = Bio::Location::Simple-> new( -seq_id => 'ctg on chr r', -start => 5001, -end => 6001, -strand => -1, ); isa_ok $ctg_on_chr_r, 'Bio::Location::Simple'; ## Set up the mapping between them ok my $agp_f = Bio::Coordinate::Pair-> new( -in => $ctg, -out => $ctg_on_chr_f ); isa_ok $agp_f, 'Bio::Coordinate::Pair'; ok my $agp_r = Bio::Coordinate::Pair-> new( -in => $ctg, -out => $ctg_on_chr_r ); isa_ok $agp_r, 'Bio::Coordinate::Pair'; ## Perform some very basic sanity testing on the resulting map objects ## f ok $agp_f->test; is $agp_f->in->seq_id, 'ctg'; is $agp_f->in->start, 1; is $agp_f->in->end, 1001; is $agp_f->in->strand, +1; is $agp_f->out->seq_id, 'ctg on chr f'; is $agp_f->out->start, 5001; is $agp_f->out->end, 6001; is $agp_f->out->strand, +1; ## r ok $agp_r->test; is $agp_r->in->seq_id, 'ctg'; is $agp_r->in->start, 1; is $agp_r->in->end, 1001; is $agp_r->in->strand, +1; is $agp_r->out->seq_id, 'ctg on chr r'; is $agp_r->out->start, 5001; is $agp_r->out->end, 6001; is $agp_r->out->strand, -1; ## ## Map a particular match through both map objects ## ## Define the match 1 ok my $match_on_ctg_1 = Bio::Location::Simple-> new( -seq_id => 'hit 1', -start => 25, -end => 225, -strand => +1, ); isa_ok $match_on_ctg_1, 'Bio::LocationI'; # Convert the match from contig into chromosomal coordinates ok my $match_on_chr_1_f = $agp_f->map( $match_on_ctg_1 ); isa_ok $match_on_chr_1_f, 'Bio::Coordinate::Result'; ok my $match_on_chr_1_r = $agp_r->map( $match_on_ctg_1 ); isa_ok $match_on_chr_1_r, 'Bio::Coordinate::Result'; ## Perform some very basic sanity testing on the match objects is $match_on_ctg_1->seq_id, 'hit 1'; is $match_on_ctg_1->start, 25; is $match_on_ctg_1->end, 225; is $match_on_ctg_1->strand, +1; is $match_on_chr_1_f->seq_id, 'ctg on chr f'; is $match_on_chr_1_f->start, 5025; is $match_on_chr_1_f->end, 5225; is $match_on_chr_1_f->strand, +1; is $match_on_chr_1_r->seq_id, 'ctg on chr r'; is $match_on_chr_1_r->start, 5777; is $match_on_chr_1_r->end, 5977; is $match_on_chr_1_r->strand, -1; ## Define the match 2 ok my $match_on_ctg_2 = Bio::Location::Simple-> new( -seq_id => 'hit 2', -start => 25, -end => 225, -strand => -1, ); isa_ok $match_on_ctg_2, 'Bio::LocationI'; # Convert the match from contig into chromosomal coordinates ok my $match_on_chr_2_f = $agp_f->map( $match_on_ctg_2 ); isa_ok $match_on_chr_2_f, 'Bio::Coordinate::Result'; ok my $match_on_chr_2_r = $agp_r->map( $match_on_ctg_2 ); isa_ok $match_on_chr_2_r, 'Bio::Coordinate::Result'; ## Perform some very basic sanity testing on the match objects is $match_on_ctg_2->seq_id, 'hit 2'; is $match_on_ctg_2->start, 25; is $match_on_ctg_2->end, 225; is $match_on_ctg_2->strand, -1; is $match_on_chr_2_f->seq_id, 'ctg on chr f'; is $match_on_chr_2_f->start, 5025; is $match_on_chr_2_f->end, 5225; is $match_on_chr_2_f->strand, -1; is $match_on_chr_2_r->seq_id, 'ctg on chr r'; is $match_on_chr_2_r->start, 5777; is $match_on_chr_2_r->end, 5977; is $match_on_chr_2_r->strand, +1; ## OK, now we can get down to some debugging... ## TEST ONE ## Create a match that goes off the end of the contig ## Define the match 3 ok my $match_on_ctg_3 = Bio::Location::Simple-> new( -seq_id => 'hit 3', -start => 925, -end => 1125, -strand => +1, ); isa_ok $match_on_ctg_3, 'Bio::LocationI'; # Convert the match from contig into chromosomal coordinates ok my $match_on_chr_3_f = $agp_f->map( $match_on_ctg_3 ); isa_ok $match_on_chr_3_f, 'Bio::Coordinate::Result'; ok my $match_on_chr_3_r = $agp_r->map( $match_on_ctg_3 ); isa_ok $match_on_chr_3_r, 'Bio::Coordinate::Result'; ## Perform some very basic sanity testing on the match objects is $match_on_ctg_3->seq_id, 'hit 3'; is $match_on_ctg_3->start, 925; is $match_on_ctg_3->end, 1125; is $match_on_ctg_3->strand, +1; is $match_on_chr_3_f->seq_id, 'ctg on chr f'; is $match_on_chr_3_f->start, 5925; isnt $match_on_chr_3_f->end, 6125; # Gets truncated to maximum! is $match_on_chr_3_f->end, 6001; # Gets truncated to maximum! is $match_on_chr_3_f->strand, +1; #print Dumper $match_on_ctg_3; #print Dumper $match_on_chr_3_f; is $match_on_chr_3_r->seq_id, 'ctg on chr r'; isnt $match_on_chr_3_r->start, 4877; # Gets truncated to minimum! is $match_on_chr_3_r->start, 5001; # Gets truncated to minimum! is $match_on_chr_3_r->end, 5077; #is $match_on_chr_3_r->strand, -1; # FAIL is $match_on_chr_3_r->strand, undef; # See Bio::Location::Split #print Dumper $match_on_ctg_3; #print Dumper $match_on_chr_3_r; ## Define the match 4 ok my $match_on_ctg_4 = Bio::Location::Simple-> new( -seq_id => 'hit 4', -start => 925, -end => 1125, -strand => -1, ); isa_ok $match_on_ctg_4, 'Bio::LocationI'; # Convert the match from contig into chromosomal coordinates ok my $match_on_chr_4_f = $agp_f->map( $match_on_ctg_4 ); isa_ok $match_on_chr_4_f, 'Bio::Coordinate::Result'; ok my $match_on_chr_4_r = $agp_r->map( $match_on_ctg_4 ); isa_ok $match_on_chr_4_r, 'Bio::Coordinate::Result'; ## Perform some very basic sanity testing on the match objects is $match_on_ctg_4->seq_id, 'hit 4'; is $match_on_ctg_4->start, 925; is $match_on_ctg_4->end, 1125; is $match_on_ctg_4->strand, -1; is $match_on_chr_4_f->seq_id, 'ctg on chr f'; is $match_on_chr_4_f->start, 5925; isnt $match_on_chr_4_f->end, 6125; # Gets truncated to maximum! is $match_on_chr_4_f->end, 6001; # Gets truncated to maximum! is $match_on_chr_4_f->strand, -1; #print Dumper $match_on_ctg_4; #print Dumper $match_on_chr_4_f; is $match_on_chr_4_r->seq_id, 'ctg on chr r'; isnt $match_on_chr_4_r->start, 4877; # Gets truncated to minimum! is $match_on_chr_4_r->start, 5001; # Gets truncated to minimum! is $match_on_chr_4_r->end, 5077; #is $match_on_chr_4_r->strand, +1; # FAIL is $match_on_chr_4_r->strand, undef; # See Bio::Location::Split #print Dumper $match_on_ctg_4; #print Dumper $match_on_chr_4_r; ### ### NOW! NONE OF THE ABOVE SHOULD BE AFFECTED BY LEAVING OFF seq_id ### NOW SHOULD IT?! ### ## Try commenting out the three -seq_id lines below to observe strange ## interactions! ## The contig ok my $ctg_x = Bio::Location::Simple-> new( -seq_id => 'ctg', -start => 1, -end => 1001, -strand => +1, ); isa_ok $ctg_x, 'Bio::Location::Simple'; ## The contig in the chromosome ok my $ctg_on_chr_f_x = Bio::Location::Simple-> new( -seq_id => 'ctg on chr f', -start => 5001, -end => 6001, -strand => +1, ); isa_ok $ctg_on_chr_f_x, 'Bio::Location::Simple'; ## The contig in the chromosome (again) ok my $ctg_on_chr_r_x = Bio::Location::Simple-> new( -seq_id => 'ctg on chr r', -start => 5001, -end => 6001, -strand => -1, ); isa_ok $ctg_on_chr_r_x, 'Bio::Location::Simple'; ## Set up the mapping between them ok my $agp_xf = Bio::Coordinate::Pair-> new( -in => $ctg_x, -out => $ctg_on_chr_f_x ); isa_ok $agp_xf, 'Bio::Coordinate::Pair'; ok my $agp_xr = Bio::Coordinate::Pair-> new( -in => $ctg_x, -out => $ctg_on_chr_r_x ); isa_ok $agp_xr, 'Bio::Coordinate::Pair'; ## Perform some very basic sanity testing on the resulting map objects ## f ok $agp_xf->test; is $agp_xf->in->start, 1; is $agp_xf->in->end, 1001; is $agp_xf->in->strand, +1; is $agp_xf->out->start, 5001; is $agp_xf->out->end, 6001; is $agp_xf->out->strand, +1; ## r ok $agp_r->test; is $agp_xr->in->start, 1; is $agp_xr->in->end, 1001; is $agp_xr->in->strand, +1; is $agp_xr->out->start, 5001; is $agp_xr->out->end, 6001; is $agp_xr->out->strand, -1; ## ## Map a particular match through both map objects ## # Convert the match from contig into chromosomal coordinates ok my $match_on_chr_1_xf = $agp_xf->map( $match_on_ctg_1 ); isa_ok $match_on_chr_1_xf, 'Bio::Coordinate::Result'; ok my $match_on_chr_1_xr = $agp_xr->map( $match_on_ctg_1 ); isa_ok $match_on_chr_1_xr, 'Bio::Coordinate::Result'; ## Perform some very basic sanity testing on the match objects is $match_on_chr_1_xf->start, 5025; is $match_on_chr_1_xf->end, 5225; is $match_on_chr_1_xf->strand, +1; is $match_on_chr_1_xr->start, 5777; is $match_on_chr_1_xr->end, 5977; is $match_on_chr_1_xr->strand, -1; # Convert the match from contig into chromosomal coordinates ok my $match_on_chr_2_xf = $agp_xf->map( $match_on_ctg_2 ); isa_ok $match_on_chr_2_xf, 'Bio::Coordinate::Result'; ok my $match_on_chr_2_xr = $agp_xr->map( $match_on_ctg_2 ); isa_ok $match_on_chr_2_xr, 'Bio::Coordinate::Result'; ## Perform some very basic sanity testing on the match objects is $match_on_chr_2_xf->start, 5025; is $match_on_chr_2_xf->end, 5225; is $match_on_chr_2_xf->strand, -1; is $match_on_chr_2_xr->start, 5777; is $match_on_chr_2_xr->end, 5977; is $match_on_chr_2_xr->strand, +1; # Convert the match from contig into chromosomal coordinates ok my $match_on_chr_3_xf = $agp_xf->map( $match_on_ctg_3 ); isa_ok $match_on_chr_3_xf, 'Bio::Coordinate::Result'; ok my $match_on_chr_3_xr = $agp_xr->map( $match_on_ctg_3 ); isa_ok $match_on_chr_3_xr, 'Bio::Coordinate::Result'; ## Perform some very basic sanity testing on the match objects is $match_on_chr_3_xf->start, 5925; isnt $match_on_chr_3_xf->end, 6125; # Gets truncated to maximum! is $match_on_chr_3_xf->end, 6001; # Gets truncated to maximum! is $match_on_chr_3_xf->strand, +1; isnt $match_on_chr_3_xr->start, 4877; # Gets truncated to minimum! is $match_on_chr_3_xr->start, 5001; # Gets truncated to minimum! is $match_on_chr_3_xr->end, 5077; #is $match_on_chr_3_xr->strand, -1; # FAIL is $match_on_chr_3_xr->strand, undef; # See Bio::Location::Split # Convert the match from contig into chromosomal coordinates ok my $match_on_chr_4_xf = $agp_xf->map( $match_on_ctg_4 ); isa_ok $match_on_chr_4_xf, 'Bio::Coordinate::Result'; ok my $match_on_chr_4_xr = $agp_xr->map( $match_on_ctg_4 ); isa_ok $match_on_chr_4_xr, 'Bio::Coordinate::Result'; ## Perform some very basic sanity testing on the match objects is $match_on_chr_4_xf->start, 5925; isnt $match_on_chr_4_xf->end, 6125; # Gets truncated to maximum! is $match_on_chr_4_xf->end, 6001; # Gets truncated to maximum! is $match_on_chr_4_xf->strand, -1; isnt $match_on_chr_4_xr->start, 4877; # Gets truncated to minimum! is $match_on_chr_4_xr->start, 5001; # Gets truncated to minimum! is $match_on_chr_4_xr->end, 5077; #is $match_on_chr_4_xr->strand, +1; # FAIL is $match_on_chr_4_xr->strand, undef; # See Bio::Location::Split Chain.pm100644000765000024 1207113024421741 21635 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/lib/Bio/Coordinatepackage Bio::Coordinate::Chain; our $AUTHORITY = 'cpan:BIOPERLML'; $Bio::Coordinate::Chain::VERSION = '1.007001'; use utf8; use strict; use warnings; use Bio::Root::Root; use Bio::Coordinate::Result; use parent qw(Bio::Coordinate::Collection Bio::Coordinate::MapperI); # ABSTRACT: Mapping locations through a chain of coordinate mappers. # AUTHOR: Heikki Lehvaslaiho # OWNER: Heikki Lehvaslaiho # LICENSE: Perl_5 # CONTRIBUTOR: Ewan Birney sub map { my ($self,$value) = @_; $self->throw("Need to pass me a value.") unless defined $value; $self->throw("I need a Bio::Location, not [$value]") unless $value->isa('Bio::LocationI'); $self->throw("No coordinate mappers!") unless $self->each_mapper; my $res = Bio::Coordinate::Result->new(); foreach my $mapper ($self->each_mapper) { my $res = $mapper->map($value); return unless $res->each_match; $value = $res->match; } return $value; } sub sort{ my ($self) = @_; $self->warn("You do not really want to sort your chain, do you!\nDoing nothing."); } 1; __END__ =pod =encoding utf-8 =head1 NAME Bio::Coordinate::Chain - Mapping locations through a chain of coordinate mappers. =head1 VERSION version 1.007001 =head1 SYNOPSIS # create Bio::Coordinate::Pairs, or any MapperIs, somehow $pair1; $pair2; # add them into a Chain $collection = Bio::Coordinate::Chain->new; $collection->add_mapper($pair1); $collection->add_mapper($pair2); # create a position and map it $pos = Bio::Location::Simple->new (-start => 5, -end => 9 ); $match = $collection->map($pos); if ($match) { sprintf "Matches at %d-%d\n", $match->start, $match->end, } else { print "No match\n"; } =head1 DESCRIPTION This class assumes that you have built several mappers and want to link them together so that output from the previous mapper is the next mappers input. This way you can build arbitrarily complex mappers from simpler components. Note that Chain does not do any sanity checking on its mappers. You are solely responsible that input and output coordinate systems, direction of mapping and parameters internal to mappers make sense when chained together. To put it bluntly, the present class is just a glorified foreach loop over an array of mappers calling the map method. It would be neat to an internal function that would generate a new single step mapper from those included in the chain. It should speed things up considerably. Any volunteers? =head1 METHODS =head2 map Title : map Usage : $newpos = $obj->map($pos); Function: Map the location through all the mappers in the chain. Example : Returns : new Location in the output coordiante system Args : a Bio::Location::Simple object =head2 sort You do not really want to sort your chain, do you! This function does nothing other than a warning. =head2 Inherited methods =head2 add_mapper Title : add_mapper Usage : $obj->add_mapper($mapper) Function: Pushes one Bio::Coodinate::MapperI into the list of mappers. Sets _is_sorted() to false. Example : Returns : 1 when succeeds, 0 for failure. Args : mapper object =head2 mappers Title : mappers Usage : $obj->mappers(); Function: Returns or sets a list of mappers. Example : Returns : array of mappers Args : array of mappers =head2 each_mapper Title : each_mapper Usage : $obj->each_mapper(); Function: Returns a list of mappers. Example : Returns : array of mappers Args : none =head2 swap Title : swap Usage : $obj->swap; Function: Swap the direction of mapping;input <-> output Example : Returns : 1 Args : =head2 test Title : test Usage : $obj->test; Function: test that both components of all pairs are of the same length. Ran automatically. Example : Returns : boolean Args : =head1 FEEDBACK =head2 Mailing lists User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing list. Your participation is much appreciated. bioperl-l@bioperl.org - General discussion http://bioperl.org/wiki/Mailing_lists - About the mailing lists =head2 Support Please direct usage questions or support issues to the mailing list: I rather than to the module maintainer directly. Many experienced and reponsive experts will be able look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible. =head2 Reporting bugs Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web: https://github.com/bioperl/%%7Bdist%7D =head1 AUTHOR Heikki Lehvaslaiho =head1 COPYRIGHT This software is copyright (c) by Heikki Lehvaslaiho. This software is available under the same terms as the perl 5 programming language system itself. =head1 CONTRIBUTOR =for stopwords Ewan Birney Ewan Birney =cut Graph.pm100644000765000024 2447113024421741 21663 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/lib/Bio/Coordinatepackage Bio::Coordinate::Graph; our $AUTHORITY = 'cpan:BIOPERLML'; $Bio::Coordinate::Graph::VERSION = '1.007001'; use utf8; use strict; use warnings; use parent qw(Bio::Root::Root); # ABSTRACT: Finds shortest path between nodes in a graph. # AUTHOR: Heikki Lehvaslaiho # OWNER: Heikki Lehvaslaiho # LICENSE: Perl_5 sub new { my($class,@args) = @_; my $self = $class->SUPER::new(@args); my($graph, $hasharray) = $self->_rearrange([qw( GRAPH HASHARRAY )], @args); $graph && $self->graph($graph); $hasharray && $self->hasharray($hasharray); $self->{'_root'} = undef; return $self; # success - we hope! } sub graph { my ($self,$value) = @_; if ($value) { $self->throw("Need a hash of hashes") unless ref($value) eq 'HASH' ; $self->{'_dag'} = $value; # empty the cache $self->{'_root'} = undef; } return $self->{'_dag'}; } sub hash_of_arrays { my ($self,$value) = @_; # empty the cache $self->{'_root'} = undef; if ($value) { $self->throw("Need a hash of hashes") unless ref($value) eq 'HASH' ; #copy the hash of arrays into a hash of hashes; my %hash; foreach my $start ( keys %{$value}){ $hash{$start} = undef; map { $hash{$start}{$_} = 1 } @{$value->{$start}}; } $self->{'_dag'} = \%hash; } return $self->{'_dag'}; } sub shortest_path { my ($self, $root, $end) = @_; $self->throw("Two arguments needed") unless @_ == 3; $self->throw("No node name [$root]") unless exists $self->{'_dag'}->{$root}; $self->throw("No node name [$end]") unless exists $self->{'_dag'}->{$end}; my @res; # results my $reverse; if ($root > $end) { ($root, $end) = ($end, $root ); $reverse++; } # try to use cached paths $self->dijkstra($root) unless defined $self->{'_root'} and $self->{'_root'} eq $root; return @res unless $self->{'_paths'} ; # create the list my $node = $end; my $prev = $self->{'_paths'}->{$end}{'prev'}; while ($prev) { unshift @res, $node; $node = $self->{'_paths'}->{$node}{'prev'}; $prev = $self->{'_paths'}->{$node}{'prev'}; } unshift @res, $node; $reverse ? return reverse @res : return @res; } sub dijkstra { my ($self,$root) = @_; $self->throw("I need the name of the root node input") unless $root; $self->throw("No node name [$root]") unless exists $self->{'_dag'}->{$root}; my %est = (); # estimate hash my %res = (); # result hash my $nodes = keys %{$self->{'_dag'}}; my $maxdist = 1000000; # cache the root value $self->{'_root'} = $root; foreach my $node ( keys %{$self->{'_dag'}} ){ if ($node eq $root) { $est{$node}{'prev'} = undef; $est{$node}{'dist'} = 0; } else { $est{$node}{'prev'} = undef; $est{$node}{'dist'} = $maxdist; } } # remove nodes from %est until it is empty while (keys %est) { #select the node closest to current one, or root node my $min_node; my $min = $maxdist; foreach my $node (reverse sort keys %est) { if ( $est{$node}{'dist'} < $min ) { $min = $est{$node}{'dist'}; $min_node = $node; } } # no more links between nodes last unless ($min_node); # move the node from %est into %res; $res{$min_node} = delete $est{$min_node}; # recompute distances to the neighbours my $dist = $res{$min_node}{'dist'}; foreach my $neighbour ( keys %{$self->{'_dag'}->{$min_node}} ){ next unless $est{$neighbour}; # might not be there any more $est{$neighbour}{'prev'} = $min_node; $est{$neighbour}{'dist'} = $dist + $self->{'_dag'}{$min_node}{$neighbour} if $est{$neighbour}{'dist'} > $dist + 1 ; } } return $self->{'_paths'} = \%res; } 1; __END__ =pod =encoding utf-8 =head1 NAME Bio::Coordinate::Graph - Finds shortest path between nodes in a graph. =head1 VERSION version 1.007001 =head1 SYNOPSIS # get a hash of hashes representing the graph. E.g.: my $hash= { '1' => { '2' => 1 }, '2' => { '4' => 1, '3' => 1 }, '3' => undef, '4' => { '5' => 1 }, '5' => undef }; # create the object; my $graph = Bio::Coordinate::Graph->new(-graph => $hash); # find the shortest path between two nodes my $a = 1; my $b = 6; my @path = $graph->shortest_paths($a); print join (", ", @path), "\n"; =head1 DESCRIPTION This class calculates the shortest path between input and output coordinate systems in a graph that defines the relationships between them. This class is primarely designed to analyze gene-related coordinate systems. See L. Note that this module can not be used to manage graphs. Technically the graph implemented here is known as Directed Acyclic Graph (DAG). DAG is composed of vertices (nodes) and edges (with optional weights) linking them. Nodes of the graph are the coordinate systems in gene mapper. The shortest path is found using the Dijkstra's algorithm. This algorithm is fast and greedy and requires all weights to be positive. All weights in the gene coordinate system graph are currently equal (1) making the graph unweighted. That makes the use of Dijkstra's algorithm an overkill. A simpler and faster breadth-first would be enough. Luckily the difference for small graphs is not significant and the implementation is capable of taking weights into account if needed at some later time. =head2 Input format The graph needs to be primed using a hash of hashes where there is a key for each node. The second keys are the names of the downstream neighboring nodes and values are the weights for reaching them. Here is part of the gene coordiante system graph: $hash = { '6' => undef, '3' => { '6' => 1 }, '2' => { '6' => 1, '4' => 1, '3' => 1 }, '1' => { '2' => 1 }, '4' => { '5' => 1 }, '5' => undef }; Note that the names need to be positive integers. Root should be '1' and directness of the graph is taken advantage of to speed calculations by assuming that downsream nodes always have larger number as name. An alternative (shorter) way of describing input is to use hash of arrays. See L. =head1 METHODS =head2 new =head2 graph Title : graph Usage : $obj->graph($my_graph) Function: Read/write method for the graph structure Example : Returns : hash of hashes grah structure Args : reference to a hash of hashes =head2 hash_of_arrays Title : hash_of_arrays Usage : $obj->hash_of_array(%hasharray) Function: An alternative method to read in the graph structure. Hash arrays are easier to type. This method converts arrays into hashes and assigns equal values "1" to weights. Example : Here is an example of simple structure containing a graph. my $DAG = { 6 => [], 5 => [], 4 => [5], 3 => [6], 2 => [3, 4, 6], 1 => [2] }; Returns : hash of hashes graph structure Args : reference to a hash of arrays =head2 shortest_path Title : shortest_path Usage : $obj->shortest_path($a, $b); Function: Method for retrieving the shortest path between nodes. If the start node remains the same, the method is sometimes able to use cached results, otherwise it will recalculate the paths. Example : Returns : array of node names, only the start node name if no path Args : name of the start node : name of the end node =head2 dijkstra Title : dijkstra Usage : $graph->dijkstra(1); Function: Implements Dijkstra's algorithm. Returns or sets a list of mappers. The returned path description is always directed down from the root. Called from shortest_path(). Example : Returns : Reference to a hash of hashes representing a linked list which contains shortest path down to all nodes from the start node. E.g.: $res = { '2' => { 'prev' => '1', 'dist' => 1 }, '1' => { 'prev' => undef, 'dist' => 0 }, }; Args : name of the start node =head1 FEEDBACK =head2 Mailing lists User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing list. Your participation is much appreciated. bioperl-l@bioperl.org - General discussion http://bioperl.org/wiki/Mailing_lists - About the mailing lists =head2 Support Please direct usage questions or support issues to the mailing list: I rather than to the module maintainer directly. Many experienced and reponsive experts will be able look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible. =head2 Reporting bugs Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web: https://github.com/bioperl/%%7Bdist%7D =head1 AUTHOR Heikki Lehvaslaiho =head1 COPYRIGHT This software is copyright (c) by Heikki Lehvaslaiho. This software is available under the same terms as the perl 5 programming language system itself. =cut Utils.pm100644000765000024 1703213024421741 21715 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/lib/Bio/Coordinatepackage Bio::Coordinate::Utils; our $AUTHORITY = 'cpan:BIOPERLML'; $Bio::Coordinate::Utils::VERSION = '1.007001'; use utf8; use strict; use warnings; use Bio::Location::Simple; use Bio::Coordinate::Pair; use Bio::Coordinate::Collection; use parent qw(Bio::Root::Root); # ABSTRACT: Additional methods to create Bio::Coordinate objects. # AUTHOR: Heikki Lehvaslaiho # AUTHOR: Jason Stajich # OWNER: Heikki Lehvaslaiho # OWNER: Jason Stajich # LICENSE: Perl_5 sub from_align { my ($self, $aln, $ref ) = @_; $aln->isa('Bio::Align::AlignI') || $self->throw('Not a Bio::Align::AlignI object but ['. ref($aln). ']'); # default reference sequence to the first sequence $ref ||= 1; my $collection = Bio::Coordinate::Collection->new(-return_match=>1); # this works only for pairs, so split the MSA # take the ref #foreach remaining seq in aln, do: $aln->map_chars('\.','-'); my $cs = $aln->gap_line; my $seq1 = $aln->get_seq_by_pos(1); my $seq2 = $aln->get_seq_by_pos(2); while ( $cs =~ /([^\-]+)/g) { # alignment coordinates my $lenmatch = length($1); my $start = pos($cs) - $lenmatch +1; my $end = $start + $lenmatch -1; my $match1 = Bio::Location::Simple->new (-seq_id => $seq1->id, -start => $seq1->location_from_column($start)->start, -end => $seq1->location_from_column($end)->start, -strand => $seq1->strand ); my $match2 = Bio::Location::Simple->new (-seq_id => $seq2->id, -start => $seq2->location_from_column($start)->start, -end => $seq2->location_from_column($end)->start, -strand => $seq2->strand ); my $pair = Bio::Coordinate::Pair->new (-in => $match1, -out => $match2 ); unless( $pair->test ) { $self->warn(join("", "pair align did not pass test ($start..$end):\n", "\tm1=",$match1->to_FTstring(), " len=", $match1->length, " m2=", $match2->to_FTstring()," len=", $match2->length,"\n")); } $collection->add_mapper($pair); } return ($collection->each_mapper)[0] if $collection->mapper_count == 1; return $collection; } sub from_seq_to_alignmentpos { my ($self, $aln ) = @_; $aln->isa('Bio::Align::AlignI') || $self->throw('Not a Bio::Align::AlignI object but ['. ref($aln). ']'); # default reference sequence to the first sequence my @mappers; $aln->map_chars('\.','-'); for my $seq ( $aln->each_seq ) { my $collection = Bio::Coordinate::Collection->new(-return_match=>1); my $cs = $seq->seq(); # do we change this over to use index and substr for speed? while ( $cs =~ /([^\-]+)/g) { # alignment coordinates my $lenmatch = length($1); my $start = pos($cs) - $lenmatch +1; my $end = $start + $lenmatch -1; my $match1 = Bio::Location::Simple->new (-seq_id => $seq->id, -start => $seq->location_from_column($start)->start, -end => $seq->location_from_column($end)->start, -strand => $seq->strand ); my $match2 = Bio::Location::Simple->new (-seq_id => 'alignment', -start => $start, -end => $end, -strand => 0 ); my $pair = Bio::Coordinate::Pair->new (-in => $match1, -out => $match2 ); unless ( $pair->test ) { $self->warn(join("", "pair align did not pass test ($start..$end):\n", "\tm1=",$match1->to_FTstring(), " len=", $match1->length, " m2=", $match2->to_FTstring()," len=", $match2->length,"\n")); } $collection->add_mapper($pair); } if( $collection->mapper_count == 1) { push @mappers, ($collection->each_mapper)[0]; } else { push @mappers, $collection; } } return @mappers; } 1; __END__ =pod =encoding utf-8 =head1 NAME Bio::Coordinate::Utils - Additional methods to create Bio::Coordinate objects. =head1 VERSION version 1.007001 =head1 SYNOPSIS use Bio::Coordinate::Utils; # get a Bio::Align::AlignI compliant object, $aln, somehow # it could be a Bio::SimpleAlign $mapper = Bio::Coordinate::Utils->from_align($aln, 1); # Build a set of mappers which will map, for each sequence, # that sequence position in the alignment (exon position to alignment # position) my @mappers = Bio::Coordinate::Utils->from_seq_to_alignmentpos($aln); =head1 DESCRIPTION This class is a holder of methods that work on or create Bio::Coordinate::MapperI- compliant objects. . These methods are not part of the Bio::Coordinate::MapperI interface and should in general not be essential to the primary function of sequence objects. If you are thinking of adding essential functions, it might be better to create your own sequence class. See L, L, and L for more. =head1 METHODS =head2 new new() inherited from Root =head2 from_align Title : from_align Usage : $mapper = Bio::Coordinate::Utils->from_align($aln, 1); Function: Create a mapper out of an alignment. The mapper will return a value only when both ends of the input range find a match. Note: This implementation works only on pairwise alignments and is not yet well tested! Returns : A Bio::Coordinate::MapperI Args : Bio::Align::AlignI object Id for the reference sequence, optional =head2 from_seq_to_alignmentpos Title : from_seq_to_alignmentpos Usage : $mapper = Bio::Coordinate::Utils->from_seq_to_alignmentpos($aln, 1); Function: Create a mapper out of an alignment. The mapper will map the position of a sequence into that position in the alignment. Will work on alignments of >= 2 sequences Returns : An array of Bio::Coordinate::MapperI Args : Bio::Align::AlignI object =head1 FEEDBACK =head2 Mailing lists User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing list. Your participation is much appreciated. bioperl-l@bioperl.org - General discussion http://bioperl.org/wiki/Mailing_lists - About the mailing lists =head2 Support Please direct usage questions or support issues to the mailing list: I rather than to the module maintainer directly. Many experienced and reponsive experts will be able look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible. =head2 Reporting bugs Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web: https://github.com/bioperl/%%7Bdist%7D =head1 AUTHORS Heikki Lehvaslaiho Jason Stajich =head1 COPYRIGHT This software is copyright (c) by Heikki Lehvaslaiho, and by Jason Stajich. This software is available under the same terms as the perl 5 programming language system itself. =cut Result.pm100644000765000024 1422513024421741 22074 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/lib/Bio/Coordinatepackage Bio::Coordinate::Result; our $AUTHORITY = 'cpan:BIOPERLML'; $Bio::Coordinate::Result::VERSION = '1.007001'; use utf8; use strict; use warnings; use parent qw(Bio::Location::Split Bio::Coordinate::ResultI); # ABSTRACT: Results from coordinate transformation. # AUTHOR: Heikki Lehvaslaiho # OWNER: Heikki Lehvaslaiho # LICENSE: Perl_5 sub add_sub_Location { my ($self,$value) = @_; if( ! $value ) { $self->warn("provding an empty value for location\n"); return; } $self->throw("Is not a Bio::LocationI but [$value]") unless $value->isa('Bio::LocationI'); $self->{'_match'} = $value if $value->isa('Bio::Coordinate::Result::Match'); $self->{'_gap'} = $value if $value->isa('Bio::Coordinate::Result::Gap'); $self->SUPER::add_sub_Location($value); } sub add_result { my ($self,$value) = @_; $self->throw("Is not a Bio::Coordinate::Result but [$value]") unless $value->isa('Bio::Coordinate::Result'); map { $self->add_sub_Location($_) } $value->each_Location; } sub seq_id { my ($self, $seqid) = @_; my @ls = $self->each_Location; if (@ls) { return $ls[0]->seq_id; } else { return; } } sub each_gap { my ($self) = @_; my @gaps; foreach my $gap ($self->each_Location) { push @gaps, $gap if $gap->isa('Bio::Coordinate::Result::Gap'); } return @gaps; } sub each_match { my ($self) = @_; my @matches; foreach my $match ($self->each_Location) { push @matches, $match if $match->isa('Bio::Coordinate::Result::Match'); } return @matches; } sub match { my ($self) = @_; $self->warn("More than one match in results") if $self->each_match > 1 and $self->verbose > 0; unless (defined $self->{'_match'} ) { my @m = $self->each_match; $self->{'_match'} = $m[-1]; } return $self->{'_match'}; } sub gap { my ($self) = @_; $self->warn("More than one gap in results") if $self->each_gap > 1 and $self->verbose > 0; unless (defined $self->{'_gap'} ) { my @m = $self->each_gap; $self->{'_gap'} = $m[-1]; } return $self->{'_gap'}; } sub purge_gaps { my ($self) = @_; my @matches; my $count = 0; foreach my $loc ($self->each_Location) { if ($loc->isa('Bio::Coordinate::Result::Match')) { push @matches, $loc; } else { $count++ } } @{$self->{'_sublocations'}} = (); delete $self->{'_gap'} ; push @{$self->{'_sublocations'}}, @matches; return $count; } 1; __END__ =pod =encoding utf-8 =head1 NAME Bio::Coordinate::Result - Results from coordinate transformation. =head1 VERSION version 1.007001 =head1 SYNOPSIS use Bio::Coordinate::Result; #get results from a Bio::Coordinate::MapperI $matched = $result->each_match; =head1 DESCRIPTION The results from Bio::Coordinate::MapperI are kept in an object which itself is a split location, See L. The results are either Matches or Gaps. See L and L. If only one Match is returned, there is a convenience method of retrieving it or accessing its methods. Same holds true for a Gap. =head1 ATTRIBUTES =head2 seq_id Title : seq_id Usage : my $seqid = $location->seq_id(); Function: Get/Set seq_id that location refers to We override this here in order to propagate to all sublocations which are not remote (provided this root is not remote either) Returns : seq_id Args : [optional] seq_id value to set =head2 each_gap Title : each_gap Usage : $obj->each_gap(); Function: Returns a list of Bio::Coordianate::Result::Gap objects. Returns : list of gaps Args : none =head2 each_match Title : each_match Usage : $obj->each_match(); Function: Returns a list of Bio::Coordinate::Result::Match objects. Returns : list of Matchs Args : none =head1 METHODS =head2 add_location Title : add_sub_Location Usage : $obj->add_sub_Location($variant) Function: Pushes one Bio::LocationI into the list of variants. Example : Returns : 1 when succeeds Args : Location object =head2 add_result Title : add_result Usage : $obj->add_result($result) Function: Adds the contents of one Bio::Coordinate::Result Example : Returns : 1 when succeeds Args : Result object =head2 match Title : match Usage : $match_object = $obj->match(); #or $gstart = $obj->gap->start; Function: Read only method for retrieving or accessing the match object. Returns : one Bio::Coordinate::Result::Match Args : =head2 gap Title : gap Usage : $gap_object = $obj->gap(); #or $gstart = $obj->gap->start; Function: Read only method for retrieving or accessing the gap object. Returns : one Bio::Coordinate::Result::Gap Args : =head2 purge_gaps Title : purge_gaps Usage : $gap_count = $obj->purge_gaps; Function: remove all gaps from the Result Returns : count of removed gaps Args : =head1 FEEDBACK =head2 Mailing lists User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing list. Your participation is much appreciated. bioperl-l@bioperl.org - General discussion http://bioperl.org/wiki/Mailing_lists - About the mailing lists =head2 Support Please direct usage questions or support issues to the mailing list: I rather than to the module maintainer directly. Many experienced and reponsive experts will be able look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible. =head2 Reporting bugs Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web: https://github.com/bioperl/%%7Bdist%7D =head1 AUTHOR Heikki Lehvaslaiho =head1 COPYRIGHT This software is copyright (c) by Heikki Lehvaslaiho. This software is available under the same terms as the perl 5 programming language system itself. =cut MapperI.pm100644000765000024 714613024421741 22137 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/lib/Bio/Coordinatepackage Bio::Coordinate::MapperI; our $AUTHORITY = 'cpan:BIOPERLML'; $Bio::Coordinate::MapperI::VERSION = '1.007001'; use utf8; use strict; use warnings; use parent qw(Bio::Root::RootI); # ABSTRACT: Interface describing coordinate mappers. # AUTHOR: Heikki Lehvaslaiho # OWNER: Heikki Lehvaslaiho # LICENSE: Perl_5 sub in { my ($self,$value) = @_; $self->throw_not_implemented(); } sub out { my ($self,$value) = @_; $self->throw_not_implemented(); } sub swap { my ($self) = @_; $self->throw_not_implemented(); } sub test { my ($self) = @_; $self->throw_not_implemented(); } sub map { my ($self,$value) = @_; $self->throw_not_implemented(); } sub return_match { my ($self,$value) = @_; if( defined $value) { $value ? ( $self->{'_return_match'} = 1 ) : ( $self->{'_return_match'} = 0 ); } return $self->{'_return_match'} || 0 ; } 1; __END__ =pod =encoding utf-8 =head1 NAME Bio::Coordinate::MapperI - Interface describing coordinate mappers. =head1 VERSION version 1.007001 =head1 SYNOPSIS # not to be used directly =head1 DESCRIPTION MapperI defines methods for classes capable for mapping locations between coordinate systems. =head1 METHODS =head2 in Title : in Usage : $obj->in('peptide'); Function: Set and read the input coordinate system. Example : Returns : value of input system Args : new value (optional), Bio::LocationI =head2 out Title : out Usage : $obj->out('peptide'); Function: Set and read the output coordinate system. Example : Returns : value of output system Args : new value (optional), Bio::LocationI =head2 swap Title : swap Usage : $obj->swap; Function: Swap the direction of mapping: input <-> output) Example : Returns : 1 Args : =head2 test Title : test Usage : $obj->test; Function: test that both components are of same length Example : Returns : ( 1 | undef ) Args : =head2 map Title : map Usage : $newpos = $obj->map($loc); Function: Map the location from the input coordinate system to a new value in the output coordinate system. Example : Returns : new value in the output coordiante system Args : Bio::LocationI =head2 return_match Title : return_match Usage : $obj->return_match(1); Function: A flag to turn on the simplified mode of returning only one joined Match object or undef Example : Returns : boolean Args : boolean (optional) =head1 FEEDBACK =head2 Mailing lists User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing list. Your participation is much appreciated. bioperl-l@bioperl.org - General discussion http://bioperl.org/wiki/Mailing_lists - About the mailing lists =head2 Support Please direct usage questions or support issues to the mailing list: I rather than to the module maintainer directly. Many experienced and reponsive experts will be able look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible. =head2 Reporting bugs Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web: https://github.com/bioperl/%%7Bdist%7D =head1 AUTHOR Heikki Lehvaslaiho =head1 COPYRIGHT This software is copyright (c) by Heikki Lehvaslaiho. This software is available under the same terms as the perl 5 programming language system itself. =cut ResultI.pm100644000765000024 351113024421741 22161 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/lib/Bio/Coordinatepackage Bio::Coordinate::ResultI; our $AUTHORITY = 'cpan:BIOPERLML'; $Bio::Coordinate::ResultI::VERSION = '1.007001'; use utf8; use strict; use warnings; use parent qw(Bio::LocationI); # ABSTRACT: Interface to identify coordinate mapper results. # AUTHOR: Heikki Lehvaslaiho # OWNER: Heikki Lehvaslaiho # LICENSE: Perl_5 1; __END__ =pod =encoding utf-8 =head1 NAME Bio::Coordinate::ResultI - Interface to identify coordinate mapper results. =head1 VERSION version 1.007001 =head1 SYNOPSIS # not to be used directly =head1 DESCRIPTION ResultI identifies Bio::LocationIs returned by Bio::Coordinate::MapperI implementing classes from other locations. =head1 FEEDBACK =head2 Mailing lists User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing list. Your participation is much appreciated. bioperl-l@bioperl.org - General discussion http://bioperl.org/wiki/Mailing_lists - About the mailing lists =head2 Support Please direct usage questions or support issues to the mailing list: I rather than to the module maintainer directly. Many experienced and reponsive experts will be able look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible. =head2 Reporting bugs Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web: https://github.com/bioperl/%%7Bdist%7D =head1 AUTHOR Heikki Lehvaslaiho =head1 COPYRIGHT This software is copyright (c) by Heikki Lehvaslaiho. This software is available under the same terms as the perl 5 programming language system itself. =cut Collection.pm100644000765000024 2437313024421741 22716 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/lib/Bio/Coordinatepackage Bio::Coordinate::Collection; our $AUTHORITY = 'cpan:BIOPERLML'; $Bio::Coordinate::Collection::VERSION = '1.007001'; use utf8; use strict; use warnings; use Bio::Coordinate::Result; use Bio::Coordinate::Result::Gap; use parent qw(Bio::Root::Root Bio::Coordinate::MapperI); # ABSTRACT: Noncontinuous match between two coordinate sets. # AUTHOR: Heikki Lehvaslaiho # OWNER: Heikki Lehvaslaiho # LICENSE: Perl_5 # CONTRIBUTOR: Ewan Birney sub new { my($class,@args) = @_; my $self = $class->SUPER::new(@args); $self->{'_mappers'} = []; my($in, $out, $strict, $mappers, $return_match) = $self->_rearrange([qw(IN OUT STRICT MAPPERS RETURN_MATCH )], @args); $in && $self->in($in); $out && $self->out($out); $mappers && $self->mappers($mappers); $return_match && $self->return_match('return_match'); return $self; # success - we hope! } sub add_mapper { my ($self,$value) = @_; $self->throw("Is not a Bio::Coordinate::MapperI but a [$self]") unless defined $value && $value->isa('Bio::Coordinate::MapperI'); # test pair range lengths $self->warn("Coordinates in pair [". $value . ":" . $value->in->seq_id . "/". $value->out->seq_id . "] are not right.") unless $value->test; $self->_is_sorted(0); push(@{$self->{'_mappers'}},$value); } sub mappers{ my ($self,@args) = @_; if (@args) { if (@args == 1 && ref $args[0] eq 'ARRAY') { @args = @{$args[0]}; } $self->throw("Is not a Bio::Coordinate::MapperI but a [$self]") unless defined $args[0] && $args[0]->isa('Bio::Coordinate::MapperI'); push(@{$self->{'_mappers'}}, @args); } return @{$self->{'_mappers'}}; } sub each_mapper{ my ($self) = @_; return @{$self->{'_mappers'}}; } sub mapper_count{ my $self = shift; return scalar @{$self->{'_mappers'} || []}; } sub swap { my ($self) = @_; $self->sort unless $self->_is_sorted; map {$_->swap;} @{$self->{'_mappers'}}; ($self->{'_in_ids'}, $self->{'_out_ids'}) = ($self->{'_out_ids'}, $self->{'_in_ids'}); 1; } sub test { my ($self) = @_; my $res = 1; foreach my $mapper ($self->each_mapper) { unless( $mapper->test ) { $self->warn("Coordinates in pair [". $mapper . ":" . $mapper->in->seq_id . "/". $mapper->out->seq_id . "] are not right."); $res = 0; } } $res; } sub map { my ($self,$value) = @_; $self->throw("Need to pass me a value.") unless defined $value; $self->throw("I need a Bio::Location, not [$value]") unless $value->isa('Bio::LocationI'); $self->throw("No coordinate mappers!") unless $self->each_mapper; $self->sort unless $self->_is_sorted; if ($value->isa("Bio::Location::SplitLocationI")) { my $result = Bio::Coordinate::Result->new(); foreach my $loc ( $value->sub_Location(1) ) { my $res = $self->_map($loc); map { $result->add_sub_Location($_) } $res->each_Location; } return $result; } else { return $self->_map($value); } } sub _map { my ($self,$value) = @_; my $result = Bio::Coordinate::Result->new(-is_remote=>1); IDMATCH: { # bail out now we if are forcing the use of an ID # and it is not in this collection last IDMATCH if defined $value->seq_id && ! $self->{'_in_ids'}->{$value->seq_id}; foreach my $pair ($self->each_mapper) { # if we are limiting input to a certain ID next if defined $value->seq_id && $value->seq_id ne $pair->in->seq_id; # if we haven't even reached the start, move on next if $pair->in->end < $value->start; # if we have over run, break last if $pair->in->start > $value->end; my $subres = $pair->map($value); $result->add_result($subres); } } $result->seq_id($result->match->seq_id) if $result->match; unless ($result->each_Location) { #build one gap; my $gap = Bio::Location::Simple->new(-start => $value->start, -end => $value->end, -strand => $value->strand, -location_type => $value->location_type ); $gap->seq_id($value->seq_id) if defined $value->seq_id; bless $gap, 'Bio::Coordinate::Result::Gap'; $result->seq_id($value->seq_id) if defined $value->seq_id; $result->add_sub_Location($gap); } return $result; } sub sort{ my ($self) = @_; @{$self->{'_mappers'}} = map { $_->[0] } sort { $a->[1] <=> $b->[1] } map { [ $_, $_->in->start] } @{$self->{'_mappers'}}; #create hashes for sequence ids $self->{'_in_ids'} = (); $self->{'_out_ids'} = (); foreach ($self->each_mapper) { $self->{'_in_ids'}->{$_->in->seq_id} = 1; $self->{'_out_ids'}->{$_->out->seq_id} = 1; } $self->_is_sorted(1); } sub _is_sorted{ my ($self,$value) = @_; $self->{'_is_sorted'} = 1 if defined $value && $value; return $self->{'_is_sorted'}; } 1; __END__ =pod =encoding utf-8 =head1 NAME Bio::Coordinate::Collection - Noncontinuous match between two coordinate sets. =head1 VERSION version 1.007001 =head1 SYNOPSIS # create Bio::Coordinate::Pairs or other Bio::Coordinate::MapperIs somehow $pair1; $pair2; # add them into a Collection $collection = Bio::Coordinate::Collection->new; $collection->add_mapper($pair1); $collection->add_mapper($pair2); # create a position and map it $pos = Bio::Location::Simple->new (-start => 5, -end => 9 ); $res = $collection->map($pos); $res->match->start == 1; $res->match->end == 5; # if mapping is many to one (*>1) or many-to-many (*>*) # you have to give seq_id not get unrelevant entries $pos = Bio::Location::Simple->new (-start => 5, -end => 9 -seq_id=>'clone1'); =head1 DESCRIPTION Generic, context neutral mapper to provide coordinate transforms between two B coordinate systems. It brings into Bioperl the functionality from Ewan Birney's Bio::EnsEMBL::Mapper ported into current bioperl. This class is aimed for representing mapping between whole chromosomes and contigs, or between contigs and clones, or between sequencing reads and assembly. The submaps are automatically sorted, so they can be added in any order. To map coordinates to the other direction, you have to swap() the collection. Keeping track of the direction and ID restrictions are left to the calling code. =head1 ATTRIBUTES =head2 mappers Title : mappers Usage : $obj->mappers(); Function: Returns or sets a list of mappers. Example : Returns : array of mappers Args : array of mappers =head2 each_mapper Title : each_mapper Usage : $obj->each_mapper(); Function: Returns a list of mappers. Example : Returns : list of mappers Args : none =head2 mapper_count Title : mapper_count Usage : my $count = $collection->mapper_count; Function: Get the count of the number of mappers stored in this collection Example : Returns : integer Args : none =head1 METHODS =head2 new =head2 add_mapper Title : add_mapper Usage : $obj->add_mapper($mapper) Function: Pushes one Bio::Coordinate::MapperI into the list of mappers. Sets _is_sorted() to false. Example : Returns : 1 when succeeds, 0 for failure. Args : mapper object =head2 swap Title : swap Usage : $obj->swap; Function: Swap the direction of mapping;input <-> output Example : Returns : 1 Args : =head2 test Title : test Usage : $obj->test; Function: test that both components of all pairs are of the same length. Ran automatically. Example : Returns : boolean Args : =head2 map Title : map Usage : $newpos = $obj->map($pos); Function: Map the location from the input coordinate system to a new value in the output coordinate system. Example : Returns : new value in the output coordinate system Args : integer =head2 sort Title : sort Usage : $obj->sort; Function: Sort function so that all mappings are sorted by input coordinate start Example : Returns : 1 Args : =head1 INTERNAL METHODS =head2 _map Title : _map Usage : $newpos = $obj->_map($simpleloc); Function: Internal method that does the actual mapping. Called multiple times by map() if the location to be mapped is a split location Example : Returns : new location in the output coordinate system or undef Args : Bio::Location::Simple =head2 _is_sorted Title : _is_sorted Usage : $newpos = $obj->_is_sorted; Function: toggle for whether the (internal) coodinate mapper data are sorted Example : Returns : boolean Args : boolean =head1 FEEDBACK =head2 Mailing lists User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing list. Your participation is much appreciated. bioperl-l@bioperl.org - General discussion http://bioperl.org/wiki/Mailing_lists - About the mailing lists =head2 Support Please direct usage questions or support issues to the mailing list: I rather than to the module maintainer directly. Many experienced and reponsive experts will be able look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible. =head2 Reporting bugs Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web: https://github.com/bioperl/%%7Bdist%7D =head1 AUTHOR Heikki Lehvaslaiho =head1 COPYRIGHT This software is copyright (c) by Heikki Lehvaslaiho. This software is available under the same terms as the perl 5 programming language system itself. =cut GeneMapper.pm100644000765000024 12215013024421741 22656 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/lib/Bio/Coordinatepackage Bio::Coordinate::GeneMapper; our $AUTHORITY = 'cpan:BIOPERLML'; $Bio::Coordinate::GeneMapper::VERSION = '1.007001'; use utf8; use strict; use warnings; use Bio::Coordinate::Result; use Bio::Location::Simple; use Bio::Coordinate::Graph; use Bio::Coordinate::Collection; use Bio::Coordinate::Pair; use Bio::Coordinate::ExtrapolatingPair; use parent qw(Bio::Root::Root Bio::Coordinate::MapperI); # ABSTRACT: Transformations between gene related coordinate systems. # AUTHOR: Heikki Lehvaslaiho # OWNER: Heikki Lehvaslaiho # LICENSE: Perl_5 # first set internal values for all translation tables our %COORDINATE_SYSTEMS = ( peptide => 10, propeptide => 9, frame => 8, cds => 7, negative_intron => 6, intron => 5, exon => 4, inex => 3, gene => 2, chr => 1, ); our %COORDINATE_INTS = ( 10 => 'peptide', 9 => 'propeptide', 8 => 'frame', 7 => 'cds', 6 => 'negative_intron', 5 => 'intron', 4 => 'exon', 3 => 'inex', 2 => 'gene', 1 => 'chr' ); our $TRANSLATION = $COORDINATE_SYSTEMS{'cds'}. "-". $COORDINATE_SYSTEMS{'propeptide'}; our $DAG = { 10 => [], 9 => [10], 8 => [], 7 => [8, 9], 6 => [], 5 => [6], 4 => [7], 3 => [4, 5], 2 => [3, 4, 5, 7], 1 => [2], }; our $NOZERO_VALUES = { 0 => 0, 'in' => 1, 'out' => 2, 'in&out' => 3, }; our $NOZERO_KEYS = { 0 => 0, 1 => 'in', 2 => 'out', 3 => 'in&out', }; sub new { my($class,@args) = @_; my $self = $class->SUPER::new(@args); # prime the graph my $graph = Bio::Coordinate::Graph->new(); $graph->hash_of_arrays($DAG); $self->graph($graph); my($in, $out, $peptide_offset, $exons, $cds, $nozero, $strict) = $self->_rearrange([qw(IN OUT PEPTIDE_OFFSET EXONS CDS NOZERO STRICT )], @args); # direction of mapping when going chr to protein $self->{_direction} = 1; $in && $self->in($in); $out && $self->out($out); $cds && $self->cds($cds); $exons && ref($exons) =~ /ARRAY/i && $self->exons(@$exons); $peptide_offset && $self->peptide_offset($peptide_offset); $nozero && $self->nozero($nozero); $strict && $self->strict($strict); return $self; # success - we hope! } sub in { my ($self,$value) = @_; if( defined $value) { $self->throw("Not a valid input coordinate system name [$value]\n". "Valid values are ". join(", ", keys %COORDINATE_SYSTEMS )) unless defined $COORDINATE_SYSTEMS{$value}; $self->{'_in'} = $COORDINATE_SYSTEMS{$value}; } return $COORDINATE_INTS{ $self->{'_in'} }; } sub out { my ($self,$value) = @_; if( defined $value) { $self->throw("Not a valid input coordinate system name [$value]\n". "Valid values are ". join(", ", keys %COORDINATE_SYSTEMS )) unless defined $COORDINATE_SYSTEMS{$value}; $self->{'_out'} = $COORDINATE_SYSTEMS{$value}; } return $COORDINATE_INTS{ $self->{'_out'} }; } sub strict { my ($self,$value) = @_; if( defined $value) { $value ? ( $self->{'_strict'} = 1 ) : ( $self->{'_strict'} = 0 ); ## update in each mapper !! } return $self->{'_strict'} || 0 ; } sub nozero { my ($self,$value) = @_; if (defined $value) { $self->throw("Not a valid value for nozero [$value]\n". "Valid values are ". join(", ", keys %{$NOZERO_VALUES} )) unless defined $NOZERO_VALUES->{$value}; $self->{'_nozero'} = $NOZERO_VALUES->{$value}; } my $res = $self->{'_nozero'} || 0; return $NOZERO_KEYS->{$res}; } sub graph { my ($self,$value) = @_; if( defined $value) { $self->throw("Not a valid graph [$value]\n") unless $value->isa('Bio::Coordinate::Graph'); $self->{'_graph'} = $value; } return $self->{'_graph'}; } sub peptide { my ($self, $value) = @_; if( defined $value) { $self->throw("I need a Bio::LocationI, not [". $value. "]") unless $value->isa('Bio::LocationI'); $self->throw("Peptide start not defined") unless defined $value->start; $self->{'_peptide_offset'} = $value->start - 1; $self->throw("Peptide end not defined") unless defined $value->end; $self->{'_peptide_length'} = $value->end - $self->{'_peptide_offset'}; my $a = $self->_create_pair ('propeptide', 'peptide', $self->strict, $self->{'_peptide_offset'}, $self->{'_peptide_length'} ); my $mapper = $COORDINATE_SYSTEMS{'propeptide'}. "-". $COORDINATE_SYSTEMS{'peptide'}; $self->{'_mappers'}->{$mapper} = $a; } return Bio::Location::Simple->new (-seq_id => 'propeptide', -start => $self->{'_peptide_offset'} + 1 , -end => $self->{'_peptide_length'} + $self->{'_peptide_offset'}, -strand => 1, -verbose => $self->verbose, ); } sub peptide_offset { my ($self,$offset, $len) = @_; if( defined $offset) { $self->throw("I need an integer, not [$offset]") unless $offset =~ /^[+-]?\d+$/; $self->{'_peptide_offset'} = $offset; if (defined $len) { $self->throw("I need an integer, not [$len]") unless $len =~ /^[+-]?\d+$/; $self->{'_peptide_length'} = $len; } my $a = $self->_create_pair ('propeptide', 'peptide', $self->strict, $offset, $self->{'_peptide_length'} ); my $mapper = $COORDINATE_SYSTEMS{'propeptide'}. "-". $COORDINATE_SYSTEMS{'peptide'}; $self->{'_mappers'}->{$mapper} = $a; } return $self->{'_peptide_offset'} || 0; } sub peptide_length { my ($self, $len) = @_; if( defined $len) { $self->throw("I need an integer, not [$len]") if defined $len && $len !~ /^[+-]?\d+$/; $self->{'_peptide_length'} = $len; } return $self->{'_peptide_length'}; } sub exons { my ($self,@value) = @_; my $cds_mapper = $COORDINATE_SYSTEMS{'gene'}. "-". $COORDINATE_SYSTEMS{'cds'}; my $inex_mapper = $COORDINATE_SYSTEMS{'gene'}. "-". $COORDINATE_SYSTEMS{'inex'}; my $exon_mapper = $COORDINATE_SYSTEMS{'gene'}. "-". $COORDINATE_SYSTEMS{'exon'}; my $intron_mapper = $COORDINATE_SYSTEMS{'gene'}. "-". $COORDINATE_SYSTEMS{'intron'}; my $negative_intron_mapper = $COORDINATE_SYSTEMS{'intron'}. "-". $COORDINATE_SYSTEMS{'negative_intron'}; my $exon_cds_mapper = $COORDINATE_SYSTEMS{'exon'}. "-". $COORDINATE_SYSTEMS{'cds'}; if(@value) { if (ref($value[0]) && $value[0]->isa('Bio::SeqFeatureI') and $value[0]->location->isa('Bio::Location::SplitLocationI')) { @value = $value[0]->location->each_Location; } else { $self->throw("I need an array , not [@value]") unless ref \@value eq 'ARRAY'; $self->throw("I need a reference to an array of Bio::LocationIs, not to [". $value[0]. "]") unless ref $value[0] and $value[0]->isa('Bio::LocationI'); } # # sort the input array # # and if the used has not defined CDS assume it is the complete exonic range if (defined $value[0]->strand && $value[0]->strand == - 1) { #reverse strand @value = map { $_->[0] } sort { $b->[1] <=> $a->[1] } map { [ $_, $_->start] } @value; unless ($self->cds) { $self->cds(Bio::Location::Simple->new (-start => $value[-1]->start, -end => $value[0]->end, -strand => $value[0]->strand, -seq_id => $value[0]->seq_id, -verbose => $self->verbose, ) ); } } else { # undef or forward strand @value = map { $_->[0] } sort { $a->[1] <=> $b->[1] } map { [ $_, $_->start] } @value; unless ($self->cds) { $self->cds(Bio::Location::Simple->new (-start => $value[0]->start, -end => $value[-1]->end, -strand => $value[0]->strand, -seq_id => $value[0]->seq_id, -verbose => $self->verbose, ) ); } } $self->{'_chr_exons'} = \@value; # transform exons from chromosome to gene coordinates # but only if gene coordinate system has been set my @exons ; #my $gene_mapper = $self->$COORDINATE_SYSTEMS{'chr'}. "-". $COORDINATE_SYSTEMS{'gene'}; my $gene_mapper = "1-2"; if (defined $self->{'_mappers'}->{$gene_mapper} ) { my $tmp_in = $self->{'_in'}; my $tmp_out = $self->{'_out'}; my $tmp_verb = $self->verbose; $self->verbose(0); $self->in('chr'); $self->out('gene'); @exons = map {$self->map($_) } @value; $self->{'_in'} = ($tmp_in); $self->{'_out'} = ($tmp_out); $self->verbose($tmp_verb); } else { @exons = @value; } my $cds_map = Bio::Coordinate::Collection->new; my $inex_map = Bio::Coordinate::Collection->new; my $exon_map = Bio::Coordinate::Collection->new; my $exon_cds_map = Bio::Coordinate::Collection->new; my $intron_map = Bio::Coordinate::Collection->new; my $negative_intron_map = Bio::Coordinate::Collection->new; my $tr_end = 0; my $coffset; my $exon_counter; my $prev_exon_end; for my $exon ( @exons ) { $exon_counter++; # # gene -> cds # my $match1 = Bio::Location::Simple->new (-seq_id =>'gene' , -start => $exon->start, -end => $exon->end, -strand => 1, -verbose=> $self->verbose); my $match2 = Bio::Location::Simple->new (-seq_id => 'cds', -start => $tr_end + 1, -end => $tr_end + $exon->end - $exon->start +1, -strand=>$exon->strand, -verbose=>$self->verbose); $cds_map->add_mapper(Bio::Coordinate::Pair->new (-in => $match1, -out => $match2, ) ); if ($exon->start <= 1 and $exon->end >= 1) { $coffset = $tr_end - $exon->start + 1; } $tr_end = $tr_end + $exon->end - $exon->start + 1; # # gene -> intron # if (defined $prev_exon_end) { my $match3 = Bio::Location::Simple->new (-seq_id => 'gene', -start => $prev_exon_end + 1, -end => $exon->start -1, -strand => $exon->strand, -verbose => $self->verbose); my $match4 = Bio::Location::Simple->new (-seq_id => 'intron'. ($exon_counter -1), -start => 1, -end => $exon->start - 1 - $prev_exon_end, -strand =>$exon->strand, -verbose => $self->verbose,); # negative intron coordinates my $match5 = Bio::Location::Simple->new (-seq_id => 'intron'. ($exon_counter -1), -start => -1 * ($exon->start - 2 - $prev_exon_end) -1, -end => -1, -strand => $exon->strand, -verbose => $self->verbose); $inex_map->add_mapper(Bio::Coordinate::Pair->new (-in => $match3, -out => $match4 ) ); $intron_map->add_mapper(Bio::Coordinate::Pair->new (-in => $self->_clone_loc($match3), -out => $self->_clone_loc($match4) ) ); $negative_intron_map->add_mapper(Bio::Coordinate::Pair->new (-in => $self->_clone_loc($match4), -out => $match5 )); } # store the value $prev_exon_end = $exon->end; # # gene -> exon # my $match6 = Bio::Location::Simple->new (-seq_id => 'exon'. $exon_counter, -start => 1, -end => $exon->end - $exon->start +1, -strand => $exon->strand, -verbose=> $self->verbose,); my $pair2 = Bio::Coordinate::Pair->new(-in => $self->_clone_loc($match1), -out => $match6 ); my $pair3 = Bio::Coordinate::Pair->new(-in => $self->_clone_loc($match6), -out => $self->_clone_loc($match2) ); $inex_map->add_mapper(Bio::Coordinate::Pair->new (-in => $self->_clone_loc($match1), -out => $match6 ) ); $exon_map->add_mapper(Bio::Coordinate::Pair->new (-in => $self->_clone_loc($match1), -out => $self->_clone_loc($match6) ) ); $exon_cds_map->add_mapper(Bio::Coordinate::Pair->new (-in => $self->_clone_loc($match6), -out => $self->_clone_loc($match2) ) ); } # move coordinate start if exons have negative values if ($coffset) { foreach my $m ($cds_map->each_mapper) { $m->out->start($m->out->start - $coffset); $m->out->end($m->out->end - $coffset); } } $self->{'_mappers'}->{$cds_mapper} = $cds_map; $self->{'_mappers'}->{$exon_cds_mapper} = $exon_cds_map; $self->{'_mappers'}->{$inex_mapper} = $inex_map; $self->{'_mappers'}->{$exon_mapper} = $exon_map; $self->{'_mappers'}->{$intron_mapper} = $intron_map; $self->{'_mappers'}->{$negative_intron_mapper} = $negative_intron_map; } return @{$self->{'_chr_exons'}} || 0; } sub _clone_loc { # clone a simple location my ($self,$loc) = @_; $self->throw("I need a Bio::Location::Simple , not [". ref $loc. "]") unless $loc->isa('Bio::Location::Simple'); return Bio::Location::Simple->new (-verbose => $self->verbose, -seq_id => $loc->seq_id, -start => $loc->start, -end => $loc->end, -strand => $loc->strand, -location_type => $loc->location_type ); } sub cds { my ($self,$value) = @_; if( defined $value) { if ($value =~ /^[+-]?\d+$/ ) { my $loc = Bio::Location::Simple->new(-start=>$value, -end => $value, -verbose=>$self->verbose); $self->{'_cds'} = $loc; } elsif (ref $value && $value->isa('Bio::RangeI') ) { $self->{'_cds'} = $value; } else { $self->throw("I need an integer or Bio::RangeI, not [$value]") } # strand !! my $len; $len = $self->{'_cds'}->end - $self->{'_cds'}->start +1 if defined $self->{'_cds'}->end; my $a = $self->_create_pair ('chr', 'gene', 0, $self->{'_cds'}->start-1, $len, $self->{'_cds'}->strand); my $mapper = $COORDINATE_SYSTEMS{'chr'}. "-". $COORDINATE_SYSTEMS{'gene'}; $self->{'_mappers'}->{$mapper} = $a; # recalculate exon-based mappers if ( defined $self->{'_chr_exons'} ) { $self->exons(@{$self->{'_chr_exons'}}); } } return $self->{'_cds'} || 0; } sub map { my ($self,$value) = @_; my ($res); $self->throw("Need to pass me a Bio::Location::Simple or ". "Bio::Location::Simple or Bio::SeqFeatureI, not [". ref($value). "]") unless ref($value) && ($value->isa('Bio::Location::Simple') or $value->isa('Bio::Location::SplitLocationI') or $value->isa('Bio::SeqFeatureI')); $self->throw("Input coordinate system not set") unless $self->{'_in'}; $self->throw("Output coordinate system not set") unless $self->{'_out'}; $self->throw("Do not be silly. Input and output coordinate ". "systems are the same!") unless $self->{'_in'} != $self->{'_out'}; $self->_check_direction(); $value = $value->location if $value->isa('Bio::SeqFeatureI'); $self->debug( "=== Start location: ". $value->start. ",". $value->end. " (". ($value->strand || ''). ")\n"); # if nozero coordinate system is used in the input values if ( defined $self->{'_nozero'} && ( $self->{'_nozero'} == 1 || $self->{'_nozero'} == 3 ) ) { $value->start($value->start + 1) if defined $value->start && $value->start < 1; $value->end($value->end + 1) if defined $value->end && $value->end < 1; } my @steps = $self->_get_path(); $self->debug( "mapping ". $self->{'_in'}. "->". $self->{'_out'}. " Mappers: ". join(", ", @steps). "\n"); foreach my $mapper (@steps) { if ($mapper eq $TRANSLATION) { if ($self->direction == 1) { $value = $self->_translate($value); $self->debug( "+ $TRANSLATION cds -> propeptide (translate) \n"); } else { $value = $self->_reverse_translate($value); $self->debug("+ $TRANSLATION propeptide -> cds (reverse translate) \n"); } } # keep the start and end values, and go on to next iteration # if this mapper is not set elsif ( ! defined $self->{'_mappers'}->{$mapper} ) { # update mapper name $mapper =~ /\d+-(\d+)/; my ($counter) = $1; $value->seq_id($COORDINATE_INTS{$counter}); $self->debug( "- $mapper\n"); } else { # # the DEFAULT : generic mapping # $value = $self->{'_mappers'}->{$mapper}->map($value); $value->purge_gaps if ($value && $value->isa('Bio::Location::SplitLocationI') && $value->can('gap')); $self->debug( "+ $mapper (". $self->direction. "): start ". $value->start. " end ". $value->end. "\n") if $value && $self->verbose > 0; } } # if nozero coordinate system is asked to be used in the output values if ( defined $value && defined $self->{'_nozero'} && ( $self->{'_nozero'} == 2 || $self->{'_nozero'} == 3 ) ) { $value->start($value->start - 1) if defined $value->start && $value->start < 1; $value->end($value->end - 1) if defined $value->end && $value->end < 1; } # handle merging of adjacent split locations! if (ref $value eq "Bio::Coordinate::Result" && $value->each_match > 1 ) { my $prevloc; my $merging = 0; my $newvalue; my @matches; foreach my $loc ( $value->each_Location(1) ) { unless ($prevloc) { $prevloc = $loc; push @matches, $prevloc; next; } if ($prevloc->end == ($loc->start - 1) && $prevloc->seq_id eq $loc->seq_id) { $prevloc->end($loc->end); $merging = 1; } else { push @matches, $loc; $prevloc = $loc; } } if ($merging) { if (@matches > 1 ) { $newvalue = Bio::Coordinate::Result->new; map {$newvalue->add_sub_Location} @matches; } else { $newvalue = Bio::Coordinate::Result::Match->new (-seq_id => $matches[0]->seq_id, -start => $matches[0]->start, -end => $matches[0]->end, -strand => $matches[0]->strand, -verbose => $self->verbose,); } $value = $newvalue; } } elsif (ref $value eq "Bio::Coordinate::Result" && $value->each_match == 1 ){ $value = $value->match; } return $value; } sub direction { my ($self) = @_; return $self->{'_direction'}; } sub swap { my ($self,$value) = @_; ($self->{'_in'}, $self->{'_out'}) = ($self->{'_out'}, $self->{'_in'}); map { $self->{'_mappers'}->{$_}->swap } keys %{$self->{'_mappers'}}; # record the changed direction; $self->{_direction} *= -1; return 1; } sub to_string { my ($self) = shift; print "-" x 40, "\n"; # chr-gene my $mapper_str = 'chr-gene'; my $mapper = $self->_mapper_string2code($mapper_str); printf "\n %-12s (%s)\n", $mapper_str, $mapper ; if (defined $self->cds) { my $end = $self->cds->end -1 if defined $self->cds->end; printf "%16s%s: %s (%s)\n", ' ', 'gene offset', $self->cds->start-1 , $end || ''; printf "%16s%s: %s\n", ' ', 'gene strand', $self->cds->strand || 0; } # gene-intron $mapper_str = 'gene-intron'; $mapper = $self->_mapper_string2code($mapper_str); printf "\n %-12s (%s)\n", $mapper_str, $mapper ; my $i = 1; foreach my $pair ( $self->{'_mappers'}->{$mapper}->each_mapper ) { printf "%8s :%8s -> %-12s\n", $i, $pair->in->start, $pair->out->start ; printf "%8s :%8s -> %-12s\n", '', $pair->in->end, $pair->out->end ; $i++; } # intron-negative_intron $mapper_str = 'intron-negative_intron'; $mapper = $self->_mapper_string2code($mapper_str); printf "\n %-12s (%s)\n", $mapper_str, $mapper ; $i = 1; foreach my $pair ( $self->{'_mappers'}->{$mapper}->each_mapper ) { printf "%8s :%8s -> %-12s\n", $i, $pair->in->start, $pair->out->start ; printf "%8s :%8s -> %-12s\n", '', $pair->in->end, $pair->out->end ; $i++; } # gene-exon $mapper_str = 'gene-exon'; $mapper = $self->_mapper_string2code($mapper_str); printf "\n %-12s (%s)\n", $mapper_str, $mapper ; $i = 1; foreach my $pair ( $self->{'_mappers'}->{$mapper}->each_mapper ) { printf "%8s :%8s -> %-12s\n", $i, $pair->in->start, $pair->out->start ; printf "%8s :%8s -> %-12s\n", '', $pair->in->end, $pair->out->end ; $i++; } # gene-cds $mapper_str = 'gene-cds'; $mapper = $self->_mapper_string2code($mapper_str); printf "\n %-12s (%s)\n", $mapper_str, $mapper ; $i = 1; foreach my $pair ( $self->{'_mappers'}->{$mapper}->each_mapper ) { printf "%8s :%8s -> %-12s\n", $i, $pair->in->start, $pair->out->start ; printf "%8s :%8s -> %-12s\n", '', $pair->in->end, $pair->out->end ; $i++; } # cds-propeptide $mapper_str = 'cds-propeptide'; $mapper = $self->_mapper_string2code($mapper_str); printf "\n %-12s (%s)\n", $mapper_str, $mapper ; printf "%9s%-12s\n", "", '"translate"'; # propeptide-peptide $mapper_str = 'propeptide-peptide'; $mapper = $self->_mapper_string2code($mapper_str); printf "\n %-12s (%s)\n", $mapper_str, $mapper ; printf "%16s%s: %s\n", ' ', "peptide offset", $self->peptide_offset; print "\nin : ", $self->in, "\n"; print "out: ", $self->out, "\n"; my $dir; $self->direction ? ($dir='forward') : ($dir='reverse'); printf "direction: %-8s(%s)\n", $dir, $self->direction; print "\n", "-" x 40, "\n"; 1; } sub _mapper_code2string { my ($self, $code) = @_; my ($a, $b) = $code =~ /(\d+)-(\d+)/; return $COORDINATE_INTS{$a}. '-'. $COORDINATE_INTS{$b}; } sub _mapper_string2code { my ($self, $string) =@_; my ($a, $b) = $string =~ /([^-]+)-(.*)/; return $COORDINATE_SYSTEMS{$a}. '-'. $COORDINATE_SYSTEMS{$b}; } sub _create_pair { my ($self, $in, $out, $strict, $offset, $length, $strand ) = @_; $strict ||= 0; $strand ||= 1; $length ||= 20; my $match1 = Bio::Location::Simple->new (-seq_id => $in, -start => $offset+1, -end => $offset+$length, -strand => 1, -verbose => $self->verbose); my $match2 = Bio::Location::Simple->new (-seq_id => $out, -start => 1, -end => $length, -strand => $strand, -verbose => $self->verbose); my $pair = Bio::Coordinate::ExtrapolatingPair->new (-in => $match1, -out => $match2, -strict => $strict, -verbose => $self->verbose, ); return $pair; } sub _translate { my ($self,$value) = @_; $self->throw("Need to pass me a Bio::Location::Simple or ". "Bio::Location::SplitLocationI, not [". ref($value). "]") unless defined $value && ($value->isa('Bio::Location::Simple') || $value->isa('Bio::Location::SplitLocationI')); my $seqid = 'propeptide'; if ($value->isa("Bio::Location::SplitLocationI") ) { my $split = Bio::Location::Split->new(-seq_id=>$seqid); foreach my $loc ( $value->each_Location(1) ) { my $match = Bio::Location::Simple->new (-start => int ($loc->start / 3 ) +1, -end => int ($loc->end / 3 ) +1, -seq_id => $seqid, -strand => 1, -verbose => $self->verbose, ); $split->add_sub_Location($match); } return $split; } else { return new Bio::Location::Simple(-start => int($value->start / 3 )+1, -end => int($value->end / 3 )+1, -seq_id => $seqid, -strand => 1, -verbose=> $self->verbose, ); } } sub _frame { my ($self,$value) = @_; $self->throw("Need to pass me a Bio::Location::Simple or ". "Bio::Location::SplitLocationI, not [". ref($value). "]") unless defined $value && ($value->isa('Bio::Location::Simple') || $value->isa('Bio::Location::SplitLocationI')); my $seqid = 'propeptide'; if ($value->isa("Bio::Location::SplitLocationI")) { my $split = Bio::Location::Split->new(-seq_id=>$seqid); foreach my $loc ( $value->each_Location(1) ) { my $match = Bio::Location::Simple->new (-start => ($value->start-1) % 3 +1, -end => ($value->end-1) % 3 +1, -seq_id => 'frame', -strand => 1, -verbose=> $self->verbose); $split->add_sub_Location($match); } return $split; } else { return new Bio::Location::Simple(-start => ($value->start-1) % 3 +1, -end => ($value->end-1) % 3 +1, -seq_id => 'frame', -strand => 1, -verbose => $self->verbose, ); } } sub _reverse_translate { my ($self,$value) = @_; $self->throw("Need to pass me a Bio::Location::Simple or ". "Bio::Location::SplitLocationI, not [". ref($value). "]") unless defined $value && ($value->isa('Bio::Location::Simple') || $value->isa('Bio::Location::SplitLocationI')); my $seqid = 'cds'; if ($value->isa("Bio::Location::SplitLocationI")) { my $split = Bio::Location::Split->new(-seq_id=>$seqid); foreach my $loc ( $value->each_Location(1) ) { my $match = Bio::Location::Simple->new (-start => $value->start * 3 - 2, -end => $value->end * 3, -seq_id => $seqid, -strand => 1, -verbose => $self->verbose, ); $split->add_sub_Location($match); } return $split; } else { return new Bio::Location::Simple(-start => $value->start * 3 - 2, -end => $value->end * 3, -seq_id => $seqid, -strand => 1, -verbose => $self->verbose, ); } } sub _check_direction { my ($self) = @_; my $new_direction = 1; $new_direction = -1 if $self->{'_in'} > $self->{'_out'}; unless ($new_direction == $self->{_direction} ) { map { $self->{'_mappers'}->{$_}->swap } keys %{$self->{'_mappers'}}; # record the changed direction; $self->{_direction} *= -1; } 1; } sub _get_path { my ($self) = @_; my $start = $self->{'_in'} || 0; my $end = $self->{'_out'} || 0; # note the order # always go from smaller to bigger: it makes caching more efficient my $reverse; if ($start > $end) { ($start, $end) = ($end, $start ); $reverse++; } my @mappers; if (exists $self->{'_previous_path'} and $self->{'_previous_path'} eq "$start$end" ) { # use cache @mappers = @{$self->{'_mapper_path'}}; } else { my $mapper; my $prev_node = ''; @mappers = map { $mapper = "$prev_node-$_"; $prev_node = $_; $mapper; } $self->{'_graph'}->shortest_path($start, $end); shift @mappers; $self->{'_previous_path'} = "$start$end"; $self->{'_mapper_path'} = \@mappers; } $reverse ? return reverse @mappers : return @mappers; } 1; __END__ =pod =encoding utf-8 =head1 NAME Bio::Coordinate::GeneMapper - Transformations between gene related coordinate systems. =head1 VERSION version 1.007001 =head1 SYNOPSIS use Bio::Coordinate::GeneMapper; # get a Bio::RangeI representing the start, end and strand of the CDS # in chromosomal (or entry) coordinates my $cds; # get a Bio::Location::Split or an array of Bio::LocationI objects # holding the start, end and strand of all the exons in chromosomal # (or entry) coordinates my $exons; # create a gene mapper and set it to map from chromosomal to cds coordinates my $gene = Bio::Coordinate::GeneMapper->new(-in =>'chr', -out =>'cds', -cds =>$cds, -exons=>$exons ); # get a a Bio::Location or sequence feature in input (chr) coordinates my $loc; # map the location into output coordinates and get a new location object $newloc = $gene->map($loc); =head1 DESCRIPTION Bio::Coordinate::GeneMapper is a module for simplifying the mappings of coodinate locations between various gene related locations in human genetics. It also adds a special human genetics twist to coordinate systems by making it possible to disable the use of zero (0). Locations before position one start from -1. See method L. It understands by name the following coordinate systems and mapping between them: peptide (peptide length) ^ | -peptide_offset | frame propeptide (propeptide length) ^ ^ \ | translate \ | \ | cds (transcript start and end) ^ negative_intron | \ ^ | \ transcribe \ | \ intron exon \ ^ ^ ^ / splice \ \ / | / \ \ / | / \ inex | / \ ^ | / \ \ |/ ----- gene (gene_length) ^ | - gene_offset | chr (or entry) This structure is kept in the global variable $DAG which is a representation of a Directed Acyclic Graph. The path calculations traversing this graph are done in a helper class. See L. Of these, two operations are special cases, translate and splice. Translating and reverse translating are implemented as internal methods that do the simple 1E-E3 conversion. Splicing needs additional information that is provided by method L which takes in an array of Bio::LocationI objects. Most of the coordinate system names should be selfexplanatory to anyone familiar with genes. Negative intron coordinate system is starts counting backwards from -1 as the last nucleotide in the intron. This used when only exon and a few flanking intron nucleotides are known. This class models coordinates within one transcript of a gene, so to tackle multiple transcripts you need several instances of the class. It is therefore valid to argue that the name of the class should be TranscriptMapper. GeneMapper is a catchier name, so it stuck. =head1 ATTRIBUTES =head2 nozero Title : nozero Usage : $obj->nozero(1); Function: Flag to disable the use of zero in the input, output or both coordinate systems. Use of coordinate systems without zero is a peculiarity common in human genetics community. Example : Returns : 0 (default), or 'in', 'out', 'in&out' Args : 0 (default), or 'in', 'out', 'in&out' =head1 METHODS =head2 new =head2 in Title : in Usage : $obj->in('peptide'); Function: Set and read the input coordinate system. Example : Returns : value of input system Args : new value (optional) =head2 out Title : out Usage : $obj->out('peptide'); Function: Set and read the output coordinate system. Example : Returns : value of output system Args : new value (optional) =head2 strict Title : strict Usage : $obj->strict('peptide'); Function: Set and read whether strict boundaried of coordinate systems are enforced. When strict is on, the end of the coordinate range must be defined. Example : Returns : boolean Args : boolean (optional) =head2 graph Title : graph Usage : $obj->graph($new_graph); Function: Set and read the graph object representing relationships between coordinate systems Example : Returns : Bio::Coordinate::Graph object Args : new Bio::Coordinate::Graph object (optional) =head2 peptide Title : peptide Usage : $obj->peptide_offset($peptide_coord); Function: Read and write the offset of peptide from the start of propeptide and peptide length Returns : a Bio::Location::Simple object Args : a Bio::LocationI object =head2 peptide_offset Title : peptide_offset Usage : $obj->peptide_offset(20); Function: Set and read the offset of peptide from the start of propeptide Returns : set value or 0 Args : new value (optional) =head2 peptide_length Title : peptide_length Usage : $obj->peptide_length(20); Function: Set and read the offset of peptide from the start of propeptide Returns : set value or 0 Args : new value (optional) =head2 exons Title : exons Usage : $obj->exons(@exons); Function: Set and read the offset of CDS from the start of transcript You do not have to sort the exons before calling this method as they will be sorted automatically. If you have not defined the CDS, is will be set to span all exons here. Returns : array of Bio::LocationI exons in genome coordinates or 0 Args : array of Bio::LocationI exons in genome (or entry) coordinates =head2 cds Title : cds Usage : $obj->cds(20); Function: Set and read the offset of CDS from the start of transcipt Simple input can be an integer which gives the start of the coding region in genomic coordinate. If you want to provide the end of the coding region or indicate the use of the opposite strand, you have to pass a Bio::RangeI (e.g. Bio::Location::Simple or Bio::SegFeature::Generic) object to this method. Returns : set value or 0 Args : new value (optional) =head2 map Title : map Usage : $newpos = $obj->map(5); Function: Map the location from the input coordinate system to a new value in the output coordinate system. Example : Returns : new value in the output coordiante system Args : a Bio::Location::Simple =head2 direction Title : direction Usage : $obj->direction('peptide'); Function: Read-only method for the direction of mapping deduced from predefined input and output coordinate names. Example : Returns : 1 or -1, mapping direction Args : new value (optional) =head2 swap Title : swap Usage : $obj->swap; Function: Swap the direction of transformation (input <-> output) Example : Returns : 1 Args : =head2 to_string Title : to_string Usage : $newpos = $obj->to_string(5); Function: Dump the internal mapper values into a human readable format Example : Returns : string Args : =head1 INTERNAL METHODS =head2 _clone_loc Title : _clone_loc Usage : $copy_of_loc = $obj->_clone_loc($loc); Function: Make a deep copy of a simple location Returns : a Bio::Location::Simple object Args : a Bio::Location::Simple object to be cloned =head2 _mapper_code2string =head2 _mapper_string2code =head2 _create_pair Title : _create_pair Usage : $mapper = $obj->_create_pair('chr', 'gene', 0, 2555, 10000, -1); Function: Internal helper method to create a mapper between two coordinate systems Returns : a Bio::Coordinate::Pair object Args : string, input coordinate system name, string, output coordinate system name, boolean, strict mapping positive integer, offset positive integer, length 1 || -1 , strand =head2 _translate Title : _translate Usage : $newpos = $obj->_translate($loc); Function: Translate the location from the CDS coordinate system to a new value in the propeptide coordinate system. Example : Returns : new location Args : a Bio::Location::Simple or Bio::Location::SplitLocationI =head2 _frame =head2 _reverse_translate Title : _reverse_translate Usage : $newpos = $obj->_reverse_translate(5); Function: Reverse translate the location from the propeptide coordinate system to a new value in the CSD. Note that a single peptide location expands to cover the codon triplet Example : Returns : new location in the CDS coordinate system Args : a Bio::Location::Simple or Bio::Location::SplitLocationI =head2 _check_direction Title : _check_direction Usage : $obj->_check_direction(); Function: Check and swap when needed the direction the location mapping Pairs based on input and output values Example : Returns : new location Args : a Bio::Location::Simple =head2 _get_path Title : _get_path Usage : $obj->_get_path('peptide'); Function: internal method for finding that shortest path between input and output coordinate systems. Calculations and caching are handled by the graph class. See L. Example : Returns : array of the mappers Args : none =head1 FEEDBACK =head2 Mailing lists User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing list. Your participation is much appreciated. bioperl-l@bioperl.org - General discussion http://bioperl.org/wiki/Mailing_lists - About the mailing lists =head2 Support Please direct usage questions or support issues to the mailing list: I rather than to the module maintainer directly. Many experienced and reponsive experts will be able look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible. =head2 Reporting bugs Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web: https://github.com/bioperl/%%7Bdist%7D =head1 AUTHOR Heikki Lehvaslaiho =head1 COPYRIGHT This software is copyright (c) by Heikki Lehvaslaiho. This software is available under the same terms as the perl 5 programming language system itself. =cut Result000755000765000024 013024421741 21352 5ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/lib/Bio/CoordinateGap.pm100644000765000024 366613024421741 22572 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/lib/Bio/Coordinate/Resultpackage Bio::Coordinate::Result::Gap; our $AUTHORITY = 'cpan:BIOPERLML'; $Bio::Coordinate::Result::Gap::VERSION = '1.007001'; use utf8; use strict; use warnings; use parent qw(Bio::Location::Simple Bio::Coordinate::ResultI); # ABSTRACT: Another name for L. # AUTHOR: Heikki Lehvaslaiho # OWNER: Heikki Lehvaslaiho # LICENSE: Perl_5 1; __END__ =pod =encoding utf-8 =head1 NAME Bio::Coordinate::Result::Gap - Another name for L. =head1 VERSION version 1.007001 =head1 SYNOPSIS $loc = Bio::Coordinate::Result::Gap->new(-start=>10, -end=>30, -strand=>1); =head1 DESCRIPTION This is a location object for coordinate mapping results. =head1 FEEDBACK =head2 Mailing lists User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing list. Your participation is much appreciated. bioperl-l@bioperl.org - General discussion http://bioperl.org/wiki/Mailing_lists - About the mailing lists =head2 Support Please direct usage questions or support issues to the mailing list: I rather than to the module maintainer directly. Many experienced and reponsive experts will be able look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible. =head2 Reporting bugs Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web: https://github.com/bioperl/%%7Bdist%7D =head1 AUTHOR Heikki Lehvaslaiho =head1 COPYRIGHT This software is copyright (c) by Heikki Lehvaslaiho. This software is available under the same terms as the perl 5 programming language system itself. =cut Match.pm100644000765000024 360013024421741 23103 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/lib/Bio/Coordinate/Resultpackage Bio::Coordinate::Result::Match; our $AUTHORITY = 'cpan:BIOPERLML'; $Bio::Coordinate::Result::Match::VERSION = '1.007001'; use utf8; use strict; use warnings; use parent qw(Bio::Location::Simple Bio::Coordinate::ResultI); # ABSTRACT: Another name for L. # AUTHOR: Heikki Lehvaslaiho # OWNER: Heikki Lehvaslaiho # LICENSE: Perl_5 1; __END__ =pod =encoding utf-8 =head1 NAME Bio::Coordinate::Result::Match - Another name for L. =head1 VERSION version 1.007001 =head1 SYNOPSIS $loc = Bio::Coordinate::Result::Match->new( -start=>10, -end=>30, -strand=>+1 ); =head1 DESCRIPTION This is a location class for coordinate mapping results. =head1 FEEDBACK =head2 Mailing lists User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing list. Your participation is much appreciated. bioperl-l@bioperl.org - General discussion http://bioperl.org/wiki/Mailing_lists - About the mailing lists =head2 Support Please direct usage questions or support issues to the mailing list: I rather than to the module maintainer directly. Many experienced and reponsive experts will be able look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible. =head2 Reporting bugs Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web: https://github.com/bioperl/%%7Bdist%7D =head1 AUTHOR Heikki Lehvaslaiho =head1 COPYRIGHT This software is copyright (c) by Heikki Lehvaslaiho. This software is available under the same terms as the perl 5 programming language system itself. =cut ExtrapolatingPair.pm100644000765000024 1434113024421741 24252 0ustar00cjfieldsstaff000000000000Bio-Coordinate-1.007001/lib/Bio/Coordinatepackage Bio::Coordinate::ExtrapolatingPair; our $AUTHORITY = 'cpan:BIOPERLML'; $Bio::Coordinate::ExtrapolatingPair::VERSION = '1.007001'; use utf8; use strict; use warnings; use Bio::Root::Root; use Bio::LocationI; use parent qw(Bio::Coordinate::Pair); # ABSTRACT: Continuous match between two coordinate sets. # AUTHOR: Heikki Lehvaslaiho # OWNER: Heikki Lehvaslaiho # LICENSE: Perl_5 sub new { my($class,@args) = @_; my $self = $class->SUPER::new(@args); my($strict) = $self->_rearrange([qw(STRICT )], @args); $strict && $self->strict($strict); return $self; } sub strict { my ($self,$value) = @_; if( defined $value) { $self->{'_strict'} = 1 if $value; } return $self->{'_strict'}; } sub map { my ($self,$value) = @_; $self->throw("Need to pass me a value.") unless defined $value; $self->throw("I need a Bio::Location, not [$value]") unless $value->isa('Bio::LocationI'); $self->throw("Input coordinate system not set") unless $self->in; $self->throw("Output coordinate system not set") unless $self->out; my $match; if ($value->isa("Bio::Location::SplitLocationI")) { my $split = Bio::Coordinate::Result->new(-seq_id=>$self->out->seq_id); foreach my $loc ( sort { $a->start <=> $b->start } $value->sub_Location ) { $match = $self->_map($loc); $split->add_sub_Location($match) if $match; } $split->each_Location ? (return $split) : return ; } else { return $self->_map($value); } } sub _map { my ($self,$value) = @_; my ($offset, $start, $end); if ($self->strand == -1) { $offset = $self->in->end + $self->out->start; $start = $offset - $value->end; $end = $offset - $value->start ; } else { # undef, 0 or 1 $offset = $self->in->start - $self->out->start; $start = $value->start - $offset; $end = $value->end - $offset; } # strict prevents matches outside stated range if ($self->strict) { return if $start < 0 and $end < 0; return if $start > $self->out->end; $start = 1 if $start < 0; $end = $self->out->end if $end > $self->out->end; } my $match = Bio::Location::Simple-> new(-start => $start, -end => $end, -strand => $self->strand, -seq_id => $self->out->seq_id, -location_type => $value->location_type ); $match->strand($match->strand * $value->strand) if $value->strand; bless $match, 'Bio::Coordinate::Result::Match'; return $match; } 1; __END__ =pod =encoding utf-8 =head1 NAME Bio::Coordinate::ExtrapolatingPair - Continuous match between two coordinate sets. =head1 VERSION version 1.007001 =head1 SYNOPSIS use Bio::Location::Simple; use Bio::Coordinate::ExtrapolatingPair; $match1 = Bio::Location::Simple->new (-seq_id => 'propeptide', -start => 21, -end => 40, -strand=>1 ); $match2 = Bio::Location::Simple->new (-seq_id => 'peptide', -start => 1, -end => 20, -strand=>1 ); $pair = Bio::Coordinate::ExtrapolatingPair-> new(-in => $match1, -out => $match2, -strict => 1 ); $pos = Bio::Location::Simple->new (-start => 40, -end => 60, -strand=> 1 ); $res = $pair->map($pos); $res->start eq 20; $res->end eq 20; =head1 DESCRIPTION This class represents a one continuous match between two coordinate systems represented by Bio::Location::Simple objects. The relationship is directed and reversible. It implements methods to ensure internal consistency, and map continuous and split locations from one coordinate system to another. This class is an elaboration of Bio::Coordinate::Pair. The map function returns only matches which is the mode needed most of tehtime. By default the matching regions between coordinate systems are boundless, so that you can say e.g. that gene starts from here in the chromosomal coordinate system and extends indefinetely in both directions. If you want to define the matching regions exactly, you can do that and set strict() to true. =head1 METHODS =head2 new =head2 strict Title : strict Usage : $obj->strict(1); Function: Set and read the strictness of the coordinate system. Example : Returns : value of input system Args : boolean =head2 map Title : map Usage : $newpos = $obj->map($loc); Function: Map the location from the input coordinate system to a new value in the output coordinate system. In extrapolating coodinate system there is no location zero. Locations are... Example : Returns : new location in the output coordinate system or undef Args : Bio::Location::Simple =head1 INTERNAL METHODS =head2 _map Title : _map Usage : $newpos = $obj->_map($simpleloc); Function: Internal method that does the actual mapping. Called multiple times by map() if the location to be mapped is a split location Example : Returns : new location in the output coordinate system or undef Args : Bio::Location::Simple =head1 FEEDBACK =head2 Mailing lists User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing list. Your participation is much appreciated. bioperl-l@bioperl.org - General discussion http://bioperl.org/wiki/Mailing_lists - About the mailing lists =head2 Support Please direct usage questions or support issues to the mailing list: I rather than to the module maintainer directly. Many experienced and reponsive experts will be able look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible. =head2 Reporting bugs Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web: https://github.com/bioperl/%%7Bdist%7D =head1 AUTHOR Heikki Lehvaslaiho =head1 COPYRIGHT This software is copyright (c) by Heikki Lehvaslaiho. This software is available under the same terms as the perl 5 programming language system itself. =cut