qiime2-2024.5.0/.gitattributes

qiime2/_version.py export-subst

qiime2-2024.5.0/.github/CONTRIBUTING.md

# Contributing to this project

Thanks for thinking of us :heart: :tada: - we would love a helping hand!

## I just have a question

> Note: Please don't file an issue to ask a question. You'll get faster results
> by using the resources below.

### QIIME 2 Users

Check out the [User Docs](https://docs.qiime2.org) - there are many tutorials, walkthroughs, and guides available.

If you still need help, please visit us at the [QIIME 2 Forum](https://forum.qiime2.org/c/user-support).

### QIIME 2 Developers

Check out the [Developer Docs](https://dev.qiime2.org) - there are many tutorials, walkthroughs, and guides available.

If you still need help, please visit us at the [QIIME 2 Forum](https://forum.qiime2.org/c/dev-discussion).

This document is based heavily on the following:
https://github.com/atom/atom/blob/master/CONTRIBUTING.md

qiime2-2024.5.0/.github/ISSUE_TEMPLATE/1-user-need-help.md

---
name: I am a user and I need help with QIIME 2...
about: I am using QIIME 2 and have a question or am experiencing a problem
---

Have you had a chance to check out the docs? https://docs.qiime2.org
There are many tutorials, walkthroughs, and guides available.

If you still need help, please visit: https://forum.qiime2.org/c/user-support

Help requests filed here will not be answered.

qiime2-2024.5.0/.github/ISSUE_TEMPLATE/2-dev-need-help.md

---
name: I am a developer and I need help with QIIME 2...
about: I am developing a QIIME 2 plugin or interface and have a question or a problem
---

Have you had a chance to check out the developer docs? https://dev.qiime2.org
There are many tutorials, walkthroughs, and guides available.

If you still need help, please visit: https://forum.qiime2.org/c/dev-discussion

qiime2-2024.5.0/.github/ISSUE_TEMPLATE/3-found-bug.md

---
name: I am a developer and I found a bug...
about: I am a developer and I found a bug that I can describe
---

**Bug Description**
A clear and concise description of what the bug is.

**Steps to reproduce the behavior**
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error

**Expected behavior**
A clear and concise description of what you expected to happen.

**Screenshots**
If applicable, add screenshots to help explain your problem.

**Computation Environment**
- OS: [e.g. macOS High Sierra]
- QIIME 2 Release: [e.g. 2018.6]

**Questions**
1. An enumerated list with any questions about the problem here.
2. If not applicable, please delete this section.

**Comments**
1. An enumerated list with any other context or comments about the problem here.
2. If not applicable, please delete this section.
**References**
1. An enumerated list of links to relevant references, including forum posts, stack overflow, etc.
2. If not applicable, please delete this section.

qiime2-2024.5.0/.github/ISSUE_TEMPLATE/4-make-better.md

---
name: I am a developer and I have an idea for an improvement...
about: I am a developer and I have an idea for an improvement to existing functionality
---

**Improvement Description**
A clear and concise description of what the improvement is.

**Current Behavior**
Please provide a brief description of the current behavior.

**Proposed Behavior**
Please provide a brief description of the proposed behavior.

**Questions**
1. An enumerated list of questions related to the proposal.
2. If not applicable, please delete this section.

**Comments**
1. An enumerated list of comments related to the proposal that don't fit anywhere else.
2. If not applicable, please delete this section.

**References**
1. An enumerated list of links to relevant references, including forum posts, stack overflow, etc.
2. If not applicable, please delete this section.

qiime2-2024.5.0/.github/ISSUE_TEMPLATE/5-make-new.md

---
name: I am a developer and I have an idea for a new feature...
about: I am a developer and I have an idea for new functionality
---

**Addition Description**
A clear and concise description of what the addition is.

**Current Behavior**
Please provide a brief description of the current behavior, if applicable.

**Proposed Behavior**
Please provide a brief description of the proposed behavior.

**Questions**
1. An enumerated list of questions related to the proposal.
2. If not applicable, please delete this section.

**Comments**
1. An enumerated list of comments related to the proposal that don't fit anywhere else.
2. If not applicable, please delete this section.

**References**
1. An enumerated list of links to relevant references, including forum posts, stack overflow, etc.
2. If not applicable, please delete this section.

qiime2-2024.5.0/.github/ISSUE_TEMPLATE/6-where-to-go.md

---
name: I don't know where to file my issue...
about: I am a developer and I don't know which repo to file this in
---

The repos within the QIIME 2 GitHub Organization are listed below, with a brief description of each repo. Sorted alphabetically by repo name.
- The CI automation engine that builds and distributes QIIME 2
  https://github.com/qiime2/busywork/issues
- A Concourse resource for working with conda
  https://github.com/qiime2/conda-channel-resource/issues
- Web app for vanity URLs for QIIME 2 data assets
  https://github.com/qiime2/data.qiime2.org/issues
- The Developer Documentation
  https://github.com/qiime2/dev-docs/issues
- A discourse plugin for handling queued/unqueued topics
  https://github.com/qiime2/discourse-unhandled-tagger/issues
- The User Documentation
  https://github.com/qiime2/docs/issues
- Rendered QIIME 2 environment files for conda
  https://github.com/qiime2/environment-files/issues
- Google Sheets Add-On for validating tabular data
  https://github.com/qiime2/Keemei/issues
- A docker image for linux-based busywork workers
  https://github.com/qiime2/linux-worker-docker/issues
- Official project logos
  https://github.com/qiime2/logos/issues
- The q2-alignment plugin
  https://github.com/qiime2/q2-alignment/issues
- The q2-composition plugin
  https://github.com/qiime2/q2-composition/issues
- The q2-cutadapt plugin
  https://github.com/qiime2/q2-cutadapt/issues
- The q2-dada2 plugin
  https://github.com/qiime2/q2-dada2/issues
- The q2-deblur plugin
  https://github.com/qiime2/q2-deblur/issues
- The q2-demux plugin
  https://github.com/qiime2/q2-demux/issues
- The q2-diversity plugin
  https://github.com/qiime2/q2-diversity/issues
- The q2-diversity-lib plugin
  https://github.com/qiime2/q2-diversity-lib/issues
- The q2-emperor plugin
  https://github.com/qiime2/q2-emperor/issues
- The q2-feature-classifier plugin
  https://github.com/qiime2/q2-feature-classifier/issues
- The q2-feature-table plugin
  https://github.com/qiime2/q2-feature-table/issues
- The q2-fragment-insertion plugin
  https://github.com/qiime2/q2-fragment-insertion/issues
- The q2-gneiss plugin
  https://github.com/qiime2/q2-gneiss/issues
- The q2-longitudinal plugin
  https://github.com/qiime2/q2-longitudinal/issues
- The q2-metadata plugin
  https://github.com/qiime2/q2-metadata/issues
- The q2-phylogeny plugin
  https://github.com/qiime2/q2-phylogeny/issues
- The q2-quality-control plugin
  https://github.com/qiime2/q2-quality-control/issues
- The q2-quality-filter plugin
  https://github.com/qiime2/q2-quality-filter/issues
- The q2-sample-classifier plugin
  https://github.com/qiime2/q2-sample-classifier/issues
- The q2-shogun plugin
  https://github.com/qiime2/q2-shogun/issues
- The q2-taxa plugin
  https://github.com/qiime2/q2-taxa/issues
- The q2-types plugin
  https://github.com/qiime2/q2-types/issues
- The q2-vsearch plugin
  https://github.com/qiime2/q2-vsearch/issues
- The CLI interface
  https://github.com/qiime2/q2cli/issues
- The prototype CWL interface
  https://github.com/qiime2/q2cwl/issues
- The prototype Galaxy interface
  https://github.com/qiime2/q2galaxy/issues
- An internal tool for ensuring header text and copyrights are present
  https://github.com/qiime2/q2lint/issues
- The prototype GUI interface
  https://github.com/qiime2/q2studio/issues
- A base template for use in official QIIME 2 plugins
  https://github.com/qiime2/q2templates/issues
- The read-only web interface at view.qiime2.org
  https://github.com/qiime2/q2view/issues
- The QIIME 2 homepage at qiime2.org
  https://github.com/qiime2/qiime2.github.io/issues
- The QIIME 2 framework
  https://github.com/qiime2/qiime2/issues
- Centralized templates for repo assets
  https://github.com/qiime2/template-repo/issues
- Scripts for building QIIME 2 VMs
  https://github.com/qiime2/vm-playbooks/issues
- Scripts for building QIIME 2 workshop clusters
  https://github.com/qiime2/workshop-playbooks/issues
- The web app that runs workshops.qiime2.org
  https://github.com/qiime2/workshops.qiime2.org/issues

qiime2-2024.5.0/.github/SUPPORT.md

# QIIME 2 Users

Check out the [User Docs](https://docs.qiime2.org) - there are many tutorials, walkthroughs, and guides available.

If you still need help, please visit us at the [QIIME 2 Forum](https://forum.qiime2.org/c/user-support).

# QIIME 2 Developers

Check out the [Developer Docs](https://dev.qiime2.org) - there are many tutorials, walkthroughs, and guides available.

If you still need help, please visit us at the [QIIME 2 Forum](https://forum.qiime2.org/c/dev-discussion).

# General Bug/Issue Triage Discussion

![rubric](./rubric.png?raw=true)

# Projects/Repositories in the QIIME 2 GitHub Organization

Sorted alphabetically by repo name.

- [busywork](https://github.com/qiime2/busywork/issues) | The CI automation engine that builds and distributes QIIME 2
- [conda-channel-resource](https://github.com/qiime2/conda-channel-resource/issues) | A Concourse resource for working with conda
- [data.qiime2.org](https://github.com/qiime2/data.qiime2.org/issues) | Web app for vanity URLs for QIIME 2 data assets
- [dev-docs](https://github.com/qiime2/dev-docs/issues) | The Developer Documentation
- [discourse-unhandled-tagger](https://github.com/qiime2/discourse-unhandled-tagger/issues) | A discourse plugin for handling queued/unqueued topics
- [docs](https://github.com/qiime2/docs/issues) | The User Documentation
- [environment-files](https://github.com/qiime2/environment-files/issues) | Rendered QIIME 2 environment files for conda
- [Keemei](https://github.com/qiime2/Keemei/issues) | Google Sheets Add-On for validating tabular data
- [linux-worker-docker](https://github.com/qiime2/linux-worker-docker/issues) | A docker image for linux-based busywork workers
- [logos](https://github.com/qiime2/logos/issues) | Official project logos
- [q2-alignment](https://github.com/qiime2/q2-alignment/issues) | The q2-alignment plugin
- [q2-composition](https://github.com/qiime2/q2-composition/issues) | The q2-composition plugin
- [q2-cutadapt](https://github.com/qiime2/q2-cutadapt/issues) | The q2-cutadapt plugin
- [q2-dada2](https://github.com/qiime2/q2-dada2/issues) | The q2-dada2 plugin
- [q2-deblur](https://github.com/qiime2/q2-deblur/issues) | The q2-deblur plugin
- [q2-demux](https://github.com/qiime2/q2-demux/issues) | The q2-demux plugin
- [q2-diversity](https://github.com/qiime2/q2-diversity/issues) | The q2-diversity plugin
- [q2-diversity-lib](https://github.com/qiime2/q2-diversity-lib/issues) | The q2-diversity-lib plugin
- [q2-emperor](https://github.com/qiime2/q2-emperor/issues) | The q2-emperor plugin
- [q2-feature-classifier](https://github.com/qiime2/q2-feature-classifier/issues) | The q2-feature-classifier plugin
- [q2-feature-table](https://github.com/qiime2/q2-feature-table/issues) | The q2-feature-table plugin
- [q2-fragment-insertion](https://github.com/qiime2/q2-fragment-insertion/issues) | The q2-fragment-insertion plugin
- [q2-gneiss](https://github.com/qiime2/q2-gneiss/issues) | The q2-gneiss plugin
- [q2-longitudinal](https://github.com/qiime2/q2-longitudinal/issues) | The q2-longitudinal plugin
- [q2-metadata](https://github.com/qiime2/q2-metadata/issues) | The q2-metadata plugin
- [q2-phylogeny](https://github.com/qiime2/q2-phylogeny/issues) | The q2-phylogeny plugin
- [q2-quality-control](https://github.com/qiime2/q2-quality-control/issues) | The q2-quality-control plugin
- [q2-quality-filter](https://github.com/qiime2/q2-quality-filter/issues) | The q2-quality-filter plugin
- [q2-sample-classifier](https://github.com/qiime2/q2-sample-classifier/issues) | The q2-sample-classifier plugin
- [q2-shogun](https://github.com/qiime2/q2-shogun/issues) | The q2-shogun plugin
- [q2-taxa](https://github.com/qiime2/q2-taxa/issues) | The q2-taxa plugin
- [q2-types](https://github.com/qiime2/q2-types/issues) | The q2-types plugin
- [q2-vsearch](https://github.com/qiime2/q2-vsearch/issues) | The q2-vsearch plugin
- [q2cli](https://github.com/qiime2/q2cli/issues) | The CLI interface
- [q2cwl](https://github.com/qiime2/q2cwl/issues) | The prototype CWL interface
- [q2galaxy](https://github.com/qiime2/q2galaxy/issues) | The prototype Galaxy interface
- [q2lint](https://github.com/qiime2/q2lint/issues) | An internal tool for ensuring header text and copyrights are present
- [q2studio](https://github.com/qiime2/q2studio/issues) | The prototype GUI interface
- [q2templates](https://github.com/qiime2/q2templates/issues) | A base template for use in official QIIME 2 plugins
- [q2view](https://github.com/qiime2/q2view/issues) | The read-only web interface at view.qiime2.org
- [qiime2.github.io](https://github.com/qiime2/qiime2.github.io/issues) | The QIIME 2 homepage at qiime2.org
- [qiime2](https://github.com/qiime2/qiime2/issues) | The QIIME 2 framework
- [template-repo](https://github.com/qiime2/template-repo/issues) | Centralized templates for repo assets
- [vm-playbooks](https://github.com/qiime2/vm-playbooks/issues) | Scripts for building QIIME 2 VMs
- [workshop-playbooks](https://github.com/qiime2/workshop-playbooks/issues) | Scripts for building QIIME 2 workshop clusters
- [workshops.qiime2.org](https://github.com/qiime2/workshops.qiime2.org/issues) | The web app that runs workshops.qiime2.org

qiime2-2024.5.0/.github/pull_request_template.md

Brief summary of the Pull Request, including any issues it may fix using the GitHub closing syntax:
https://help.github.com/articles/closing-issues-using-keywords/

Also, include any co-authors or contributors using the GitHub coauthor tag:
https://help.github.com/articles/creating-a-commit-with-multiple-authors/

---

Include any questions for reviewers, screenshots, sample outputs, etc.

qiime2-2024.5.0/.github/rubric.png

(Binary PNG image of the issue triage rubric referenced from SUPPORT.md; image data omitted.)
<<˖-CQct2doߎ̛7wMx8$$H6m+""Ԑ".Nt20etjFvFŊ'#htF-v놭b?Ag˖p,KiS`l !ݛNDDDDDT8U^&M¤ISNСCow!))I:Q+zs(^tYh(r%0mbだ gնtӧ%JHPNꥆ%6?Ϝ9(UJr{w8ǣ h Q}P 2d*S5*VDڵѡqcҤ jV$EХo& 7o͛cFHH>'ODJJpUX[FVкuk4iEcʉ`3FtH</޽%qFV.;{jDArj{j_p?\p M1q[ҨX<+_ WG5РZ509wd4tZS.]B^Kg`0dHnS֧GHЋL>>>DDF>} ߻w_ܼy񈏏GBBC$''###Vwޅ+dɒ(Z(QT)B 򂇇<==QbExyy,._YYur5O]TySDZOOm[TdtƍTX2_ԵkZ?5M7mRΝYX,Y@Xt H2eаaC4lP:(w?T֕rs]PNOQWժa$$5.]B:*WVC 3{XrEtTd?Qzح[ ǏܹsdN$""""""7nٟ-9f 11%iSuƗܼ%IG(:ZhѰ!_&$@pfϞYYYO}\%зo_L6 /RV3g%KJPNڸQ%`r5<1vy=]qqu%6aa9sek(:vTkjjWŋ@L[0Ro#g^P\xBTT>| $''۷oGƍ|>|Mk`ƍظq#zTZ5oC@ĉt ȑ=%6!!jMM8:JPN .(tѣ<ɓ>d_Vv.9yR)SEk(Wkjvg^-Z :u iiiے?O;wXf DDDDDDDc:~@1kR?.]bkZSCHi#GKl PC9?'SK٣~O)]B:&LP[4:$]^Ξ=~!G߳n:/߿`ٲeyDDDDDDDd6HW.^]hf`> Ԭ)]a-]A|}:u+l1]3~~@4h"Ϟ˘"""""""SHIVΞ.!J ʥJI<|Zt (QB]P.W.G?-tW#e_Ѣj@.]b Yc)QkBg ^ŋg˗ׯ|||~=+/XV3W\ɳ$"""""",\DDH?3IHpQJה5R~u.t)rR%wO˗KHZSNlٲǍ7L2bg>)SѬ:LnTkmѤ #]aTtt h|`| *JtԯvW.hѢO~q%̚5 %Jxc/^ 5))]4u놑#GSN(Vcٳg~-DDDDDDT\,_ܿ/]B:vF0 -KHGǎ1FWK;o']aLMH.!m'JW<8ʅ={ ==;::bѢEضm*~III k:::Ý;wW_aڵػw/_nݺ=BHn%]B: z0:rXVCg`W/ 'իY_d0 U[>|(]B:U^뱯988`Ŋ6m͛ __P>S4im۶>FDDDDDDΝ@LJ&2mТt_[HW)SV++Hĉ3 6m ]:HWjzitN>x =:[ϱrȑ#1bĈ~x !""""""6IW.YS(0ؽ[tY,@:F[*'??A mۀ> ]@F^Ǿֺuk̛7/[?…%K:w 777ٓ!""""""+V<>(.(+']b|1pt (ZT ݥKl23֙GJ"EԚcl ܣGjߐ T$]Q(qرc`8::fԩSQt?|#7onZhhh^Ȯ$$@xt RE]P6wK+WKHGJ[SIIjMIOO5(bŋ%]*ZT1J˙3g /AfͲ9z*U z_V-wet͛?pt h3G(:?_OÆjHa&oSQQ%AugĨS%N 4>իW655r9ͿlٲcbbDDDDDDDv+, X wOtt ds dOqFԄѶ-0qtQDp!?׫NjL"]Qp̽{"555ۏݺuQ\py͚5sDDDDDDDv-- X 8yRt-.(ϛEed3.!NNjZtͣGjMH.TIj֯.!]f+q襩J*lܺuڵk>|g/9 wŋ+WKHs~~FIIҥ@6L&S0atQr2|9pt pwWEKl<>8^t/.]`8Ժuk{{~g"""in/m׮*Tk>vW/KDDDDDDTP?4j2۷ *Jtx{K<.&X.!u ׯK84i{hѢy!333ۯ1x53|3w*WK;t']aN5`" >^tjL,]at|J(45n~݋M6eK,~e2d"!>>>pvvJ'N K}+NRƥIPaq b"]B:zrxC;^mut QK#FeZFFzEܹs'՚7n\]]H/} _Z*N$""""""*v+Hĉ#n^ or&z%+Hט1@ǎF96Md8ʅ3gTRaƌpwwG:uPJHJJ:;;cƌ4EFpȑǾ'Ndu򜛛t֭_JW.__((صKt4l(]a'JW.I ;ۥ+ ^PBO^ZZ._(CM'L?OԮ]q){geeg_(ϼ_5ѣ%I <=Kl=Rk*$DtY,@J6V+aCT"]aaptQ+N>}UV c_駟ZBZ[oƍ|{EtttHҀtQRt)&]B:*TPVr2|9pt pw|}EKlYL.HrԚrq.IIQgƝ=+]BLo6mB޽s\KƶmPlYGcܹst6XVV>6"""""""mfr6ܼ)]B:40ߚ"#KHGݺ[S%[ $.X.!7z|s+WKO8|mڴ֯_@--; A<<" O:ZI~+ΜQ[H޽+ΟW[>x ]B:z 0tImɚ,]B8cq!|5kիyzzwزe N>&M[Mn@DDDDDDf1i7滽{M+H@F7JW1c+6lrp ȑ@׮F!!ڵ@Vt Qr(ڶmmŝ;wPD =UBԨQ㱯)SbŊ|~TT 5jԅ IDAT@͚5DDDDDDDb,`,mgOajΞ.vw#eZS%6۷滻g,MٹS)H z'''TX+V˗q ''59sGF֬ʗg% _T۶5SQQ%665e){Ԛ.ٴINK7$""""""WR88Hܿ|:쏇,]bb2]}j"]bΌ3]}K5UtÇl.!z?I]3`| 2Rtԭk5 ,X\.]B:ͷEue_[S @xt QqGۘ?>KܻwO:ȼڵ&L0vM]P.!['KW]Tlt h6M 573͚m$* 7D8ʥ˗/{R 9s7ٳѣGxyyҩDDDDDDDԯпtљ3ʕ@jt &550iҤ,#""""""3ǫd~`& 5f +F~ _X%c(K `Z +Kt t.]atf )]BC/M5o̜)]aoK_̑0VKC/M9z|-жm|!""""""* 0:w#=ٟ=C+.^>@mI[7` 0`2=ѹ30ztѕ+jM%%I!4UP![srr€! cW^0[`zj.!F]HWkYY%cp{w #GԚ̔.!C!;^ K.45lߛ}+NRƥI+Oxxxoȑ#޾}PfM=gϞͷՊ{/ߞtQۈɎ@Pt9Smwh&l*]ANU۲@`t4 hFh^`f "z];w{iiiXf 5joooL>}ѣ\vzz:M\?]XU+6m ] PtQ` ][hV5"dKW!=0'{??~uؾ};Oʕ+@@@XbX"ʕ+WWW\rpr5qqq8|0\""""""z-.͜ P:uk 58;;ÛGF?#]B:6fϖ0y?uKt4nlA*֒Z[d'52{*2RQzijРtVF/˖w5J(,Lݝc)g^y;V(<\ݙ(]B:ڵ&L0vM)lI47ADDDDDDDt&]a꜓L1t(гtѱcyp 4G)`*u~ٟ~ԇ+Vs. ;(_<|_ ݃oމKD4-::{8gdLuֱcO瀇0dt :SL+|w7RLKݫ֔n0A7%65ePK^DDDD/_756I(wd^5\oش (_Yt¤KleHc!ŋ%6AAjMFE ϝ. 
RF*&ooo+VL:T)5(]Z&-Mmvt pqQ]]%#X\w5R9;~~.PT۱~t ppPKlV`z!zFp DDDDDDD4ժ f,Y\*]B:*WV$) X\wQyy5%uWܿ|pt P9;K$'}dРA3ӭ&n:\R:RՑ#GdɒB5DDFEF;Qa/Ξx9Mنߺy1C P!+QQ6nmTQxqߨ\QyrM///s;wٳ]bc ksC\^Ov Hge[hhtѯ:H80TޙCyC) WyqMQ^KKM no z>t=3xW.pd͛7ǔ)S3*117n!C]Ȧ#DDyvx~} #G۷+Ud+Փ0dyQlY'8~\[7kIsmujtԽt ' ơW.ԬYŊCZZSSrei?<ʕ+<<2==111ؿ?..]K.DDOuEkܹhԨ`o^DTxq ѣ95f^U:ODfp*Х гtܼy08p jժ%XVZSI\ t t 30W/]b.7rKz傓ԩ3g<=777,_G"jŲe0m4deek5bQk ㏁ͥk(??bckL`ZhRrUk*&FF֭֭݁kH/k,wZ jMk']C^ԠAdž^%J L<ׯ_& +ZQtҀ #''Uה3P3.=##"~UQ//{*WV9sKl-ԖOd_l=`RjזTfRCL3HN/WC njlo TJ GjMqC#^.=\3fl5k,,\=*&"""6p 0rtK`" Ly kH_fz_Fx Ĥѣ#5j,ʆƍ.Vk7@KÆ{I?*W__ߖ.sǶ^rvmugoH 5Ut 31b@ 3i.!}JW=KI.!zCHW>39Ytt .]at2lpt  jűcǰh"khӦ jժ*U ^^^miӀ-+زEtM j%]ag(]A&Nڶ0 6m ]:HW<l ]AF:w0;`zj.!#FݺIWzsڣG%d8Gʕ+E1c֭[ÇիBbb"bbbpl?ӧ /`ǎXODDDDT|}5+ݻ+Hԩ#]ae}40 >LtIWm|t3xE ;mۤ+H̙@Fv?^ ..'ND:uyxxqJJ 3gΠo߾hݺ5.\gODDDDTJP˕.IOW[?.]B:U/wwL`QZS*Hde!!%x9 ظ8tHtUHWm8 ]A|}ե+6m +QFg̚50 K/>V2LDDDDJuAL.\.!*oMݻΣ .!jM99I$'ss.^.!nnrѢ%6~?/]B:ʕS%KlRSՙqOKRԟ}KKؤV'OJq蕇֭[nݺ֭[YYY{c_OMM믿!C $=j0W$Ӱl&o@Tt _|k*&F.!uo@ ,XDDH5ͷ぀5QIb"d pt ^yd=zt"E`xQmݺݺu󭁈(tc7+W_3:IW %m[`D `B .NtjL"]at|J-ӥ+"#Հ>&Ft4m ̞-]atZSoK q9^{ YYYZQxǾ7_oDDDDDfW/ 'իY_d0 V>.!}ue&gϪ-RRKHG^?3'%{w`p KԖ|}9R(,LmG$]B&áW.bذa(}WGGǾ3fhQ2EL+Hĉ3 V}7NIh&7JWѣΝ+֭x}1U($XxHt !]at(@_'s+-ZBXbҥKuV*""""<`NdV/+HРtѶm]Wg&||t3Gqi&;vUdf50ڵ  ]Ӧ-[JW}et^? G[o͘1+ """"P|Kddw'=*]B:QC OOG5k;>Y,@J6V+~CT"]aq#ptXե+6m ] -]ae {t^&8mڴyn;#PDDDDD*URC 3IJRgI ԅ"&pr:ETѢ%6Ο.!ʩ;SK.IMUgƝ>-]B:JRJ.IKSt pqQkU&=]msxt M_^:%K)]B:Q$6X.!5koM\SZ5TBZS%re󭩻w՛Ӯ\.!aziAx 3'%%a^&""""{֡0ntQxP ]B:ڶ&N0-KHGV)Fׯ5+]B:ZPgIdzGLt hLe&7o5ut h0Vk*:ZqoţG~1ԩ&L[0`Ӷ={|׷oߞ/GDDDDT`0 U[I}պ23g+az0:^mut 6L%%%kW`H 0uwNRt 3FU`b 1Q8Iثcǎ={_VZ?~ڵm[lį9r$O_?{wU{! IDAT`n(\TMKrռ.2d{ݺR.i*Z{(\Pr Qm:̙~xs>z^gf<߇DV:$ln|y`p$ȑj́IWsNBzS{J'1;tHͩ1cGl@7tSvI'1 VsjX׽ R*wNbvڏi8H4dy*(H:YhSkM\Sc63g{m/h޼9ƎkL+W~㉉f\""""r>>@)mۤS^&%Bk&`哑l p5ejMg$۶?O'y^?6H ƍS~d. 
@: `K؇?I&͚5u$/^FDDDDUbE$f+W;>L@ժ)rrի,c2իK7֪FԪ%Bk:`)B+ عS:e2uJZX)_S:;;?K.mHH~Bۛ`EJKu!}IMU{<#SFjsڋY$G2/Pt4`R)$*RH'1wX8qB: Q:O)#,#C;zT: Q<*,+Kݜ }K/wެY3ԨQC(ؿ}DDDDH4Pj$!=NBzԭ.(IbS4Bvxsu``SZꂲܼ ̛p[ TTRp!%Zxsm5""P>aKt/駟7'nVKDDDDm[`hZ.s i0XUxى-9F̚ d7W{|jNK'!=7V{ɕ+'//ԜtI: tzpW~巌 lxf%RVNuj!&x5ZgΨVwH'!=w Nujɚ":T:Vd$hZQAC`ZQQjٟv퀑#ShEG襓=ϟ?/D˗CFR_d~`Z׈:|-#:T*$8Xͩl$ǠAj$!!j,$G@Ϟ)VP{}[:VXlړ^:i~?{,;&e׮]9sSz|JCDDDD$dRTd&`V哑l l,2hT:VP(8Nc~t kxukZv)H1cTKr#ٻNAyE/<<<zlƌY6n܈}O|#""""S&<(rrի O>>7<[8p@: Ԭ)BسG:e2);S^&Pt UA쓯/Рt mۤSPaK&M<ؾ}0|pdÒ۔L:o&<;ʔ)癈D"E"IRS^LgJ'!=ʖUYTZ3)$GRjN=e{ˁ'%J F| ptңhQuruNbZ>,(THͩJݿڱK'^.]NBz4hVIB:O=p^:9;;W^O|Nhh(|}}Ѻuk.]...Dݺu1}ts]oDʕϢL2h֬???>}ZW֮]zz5Zg_ӑ4H:Vx8xSO.!)"#En%;ѩ0|t (`ϫ#GJЊS+Tm #B+&Fݜvt0j(DGG#""… FBBRRRlM6UM"""""+#FFzO@.ڔ  t,B+8Xs> <M:VH$+K: 1`гt cǀ+^_dzN,[$$ӻWFrꔚS9!eM⥗^DG@DDDD$g Q#Z7OSF79( NAzMx{KڱX^:5~<кt ]ט1j ['9hN:ڵ)X_|'''Tvm3Z՜(Lt 5kS^@͚)={S^&)BkzS^&Pt ᅲNAzK mNAzLt M-[SXRV0h}駟H"1ʮ-T/.,- XTR!SS..I/W-(.()#,=]ͩP$GѢjN+',3S<|X: *VNbvSIH/ ZU:YNjE~t^6`գG1vmH_UgUK]P67yh$TR0>%VML5d`B 2R: Q'YYIH=Sh;| !K:j߸t$G@߾)NT{I'!=^{ _:֙3;I,z٘7BBB흯}_|1_KDDDDdN6Nl(4 Ocat kxU+Z;w)Hc^N{7/5 hN:ڵ)H#Sh:|-;m򀧧'~g|xgXE;#ԩS'OEDDDDP|}5ShI ||Z?J |}}߰{ԯ/B+0ضM:e2^^)6mlNAzM֭)(X#Θ>}:"""0yd1p@:u _|+f񉈈^RHQtwU[0$Gɒ_IUP$3Ϩ9UtL:a$S+J'1_ͩ`$T*,'XF!d2իKZ8p@:^yRJŋ_s(Z5jO>QQQ."""""kxx/Fr&0o-pw7ޜJJժ_F ,ZDFJ'!=*URB, ,Y;'puU7YqRS^LgJ'!=ʔQsxq$fiiSSͱ.]?k.ܺu 1g7ݻwGfаaCxxxM6K/!CGPPp ]?1hL,B%`, >^: Ѥ 0mt +WٳW?.իfT+gpҋV/#WIVtcGZG+VIH7zNu*ĥK'!=Nuj!&x5ZgΨ;IH݁ASh;ZJ'G`ы&OVdv`ׄ @V)vS^c/$Bkn_:5jЮt uS^#F:H:tXɑNBz  t,B+8X ΖNB`ыL&C:V@)H/__NZ6/2ShmܨdL&K:V` ut ktQ#Z7OSMJ R*2#9%?#yڇJ'!=yFͩr夓ef֙GH'!=bE$fYYjNt\Y:YNk!$IZ8p@: Ԭ)BسG:^i5j@b gggxzz^ /񸻫 5F,X?/VM]P6d`B 2R: QSNNIRRŋ^'d*TPNbb:{V: Q%%ǏG #88qqqHcC#::۶mÄ PF <GDDDDTp4iڨˀptңQ#` ZWfEaCy*.N: QTb"0{6+]xsu`\ &F: Q͛@tt^eddØ?>z-4m+VDQV-4l͚5CVбcGtCr 6m ";effbڵhذ!}dee?:w NV$'K'!=ڷ~[:VtPstWѣSh].(?_26mqShƪ9(h8Q:ŋ@tңys`dZ.ؙM^:-[ĉn:?~V޼y:txbQI233GsHJJҝ0p Уt #G+ $ǛozI:~\ٙ>}}Sh< ,[ܽ+x5u2ӧVt>SIxjɚ*2D:Vd$hptE/իgrrr0|pV~t )))6HFDDDDD VSh l ;x%ZwI FRd>_:juz5#6L7``* ]`uN\ #tڵů)Z(J,+WDPP-b=#lSG:V@*~}QI` }t dR|I` ut k lެ~>M 4m*B+(H>Mx{Kڱ7 bK_~ sAL2 [FF># 7|cq gQEeef6GH'!=Q eeE: _+K'1Q+sNBzLt 5kS^@͚)֭SS>)NQ SB0gΜ'>p5j"##gjժ=g͚5͓.ُv?#IN.T{\Y)''$f))js礓* EH'1KMU{1=+([VNb,]"S*R<#w՜ NBz,TI~I gyR ۇ 믿KƴiӰsNbǎ:t( ڵk-Q#H^f._NBz4lhbjB0{6ptң^=ͩk׀9sk FrS11IH㝧nƛSII@TtE/+?yJ#44/_GZtF!<<ХK4iݺuêUyf.\Xٳg[4>BoKЊ,nݒNBz 0zt huM$G6ر)bbT"1Q: Ѫ0at ??UT%Ӣ0yt KԍDIH&MiӤSh]SWH')0X޽{XtCתU {A*U:Ʋe,:fcT~Lo_ƍd oVO?W99IHaÀΝSh9-<M:VHj%0@` #C:cK}=،3ЬY\H+_jԨ hۻwoCDDDDD29Fl.2>_Fl*1CGh$[7K N6Nu+qt k$Ho6lNAzMڲΝ@@t ,^8qB{r0eʔ\x3,( OXEDDD@ZZ._W"..vp-ܺu IIIAJJ rʔ)pqqAQbETT UTAŊQZ5ԩS'W+ĉ|}՞F٧&+Kݝ\;Zt YKܑNܽ!,_xJTsj>J^^eJORdkV-yg穛7(ʕ󔷷tĢN&l׮]_{- dɒ߯_l*nDDTp%''ѣ8vN<ӧO#<<믿_g}mڴA۶m^^^ (6T?L:YBM{%pvvf l'K'!GѢZE1{tK| IDATTJӐ4M ?s$fWfTji +!:-[V%6mڔ>WYv5o< *T@ƍ1qD]Νף?5j+W"/m`N=ݽkoĔʣPkzo)Bi`2՞O]ekW``Zj%EJtңS'`0Hڷ~[:Vt -$E/,zvߏ˗/ù>N.],Wh=Rb=MNNN> O>X"^xL4 }nWw0|p?Dbbt$"zѣW^NonE/ׂɋb8Y#6L]T&!CTH~Uc2"=H/IaK IN6mdѝ]u|X4?ADDdTgFJ &`˖-qt|>pD||>_F|]l<D1C1͛ Lo)ȑL )H T[V"[;V6ݻ-9E/^x{yk؂>+VD֭-kk֬Ecӧ/RRRz߮89BTH{dg~ {%sׯP;3o\&ʽVS=^Ӧ7˗oNAz5j$ WLMMN`X„ zl—_~7nDorQeNOO: 99YjժA_]"-ʈW/_?ZNKt&.^)iSDTu &B+&F:Y}{S#y`hdC,zY7@z<22w~aֆs={9/^<ޥIDDT۷mڴAllt"#W_Nuf ?{H!K t$a۷99)zNA7DDҧ!j*/^ܦΙ3Ώ:''GĉQZ5 8-.\ȂgΜa;'^:͛1㏑"}6 hD:@`t "2 T[V"[;`k=Z"Ǣׯu=HLj#гgOc?3F777x{{cHLLx/^DhhMt:vk׮IG![9yg/Pt??Lt -U""MkKxXptdՓNAdR|]c^u|(SUt .|ٳgcҥr劮qCBBEDDD!22=z@*/l; ૯p$GrK3H'AKCJVKNbv3ߤ/ʖNw#)H"EԍDI'!GS+K'!+e#:uBHHZh{͛QX1'Ξ=k1 2eʠvhڴ)6m:h~ZlM...qr;V:JBI'!=QE AiF`SC.ɍySZQfTj*p!)\Y# T1H$mznݺ ֭[駟O}Mjbȑ(\p|rKDDhQvmXL 0g*,պ50~<`MDLy4 ;W:Y\0k{@JiHZFK'1zUH{@ժiRFlI^=usGI'!X1'''{F\\~={ q-eˢAhժڶmBlp¨^:<<#""rdKFӦMѢE 
_>իE8Ŋ'<==ѥK?}6b۶mضmmzܧ;v,ZjOO|=.ST IڌrgOu _uS+֬Nb,YL*%@3Gi'ٗD` $fӧ# lrFx祗cŋXCիWǨQgF…>… t(_?*TE;"""{Vn]+xѢE ԩS'On<ɭg}ݺuCnݰd۷/eXJJ a׮]y~,'6IWEݻpFYoJbULݹS:Yp:O Q*`7mQ<$,4X0n S7M#ҫat^vW^쎇:v숗_~ T" *;cǎᅬk";;;O?bԩSͪ}IHiſ\Io0ySG}mۦZ ?95xtc8uA知ޭ԰aIQۧoK'!=~[}ڷO: DDDD6TX1tsŹspy|W0` ^Y&[;Ϗ7mڴVNBzի!NmA}6 "5 UK:j" X2&?HۥS#ٸ;׌U%n/_?7n`׮]8q"ԩ#j^^^駟9 s+$G2o͖0ӊ1kZ(9uAHhaIQdf6FZHWzPA:YVj"Ev6ZHUNAE>6A;vĴiӤ# TR׿w^cʕٳ'JbI0Lعs'fK,ɳIHb"0{6#TE< J/ʕ,9c*嗚5e#u XNB") XNBzT.(jNEDH'!G ,YK'!=ʕSѹߟ9lɓػw?DDDD6駟b…h߾}2:_~AʕdCϓI''یs"0w*&M'1f*JCiS`Z/fIQ\ԕ+IH/:T:VdZI,j5!ٟvQShEG7oJ'!GÛYV ) Lڵk8t """"ݼ~z*dp[ldrrCBp̊ח }N8ӠA@)V1eZ_}ӥ}}H -ݓNB)5Ҥ=zH(0E/Xft""""tS=ŋq9Ku+qt kys/?#+#|l\n$fDeKZ?_/"@kn){6z4+)S#9pj5|8Щt zUZh3ٹO>6w6 dZ`~Ԫe˲ _0Q`Kf2kKZعS:U Pt ;S^>>@)モNAdf`&״i@&)uu7nܐBDDD[ѢE|}|L2wU S..HQE/ g+ El+F&qŋ eH'1PmNBz89󔫫t,`J $D: QS*I'1Vmݕ/rrj^&.@_EfͰzjddX3z 6`}IGt:0g $fW{" //uϤ%$̙tTjN}d6t횚Sァn Vb"0{:Oլ),nN{]$Z;w`5(W5j&Mm۶h߾=J(! `РA6-zegg#22^^^6 9u)#E>e V R_Aͩ+?,XAOv_K'1S%˕NCz%`(bb z$uUt70i0wt777o1k,˗G>}phDDD$5m|`[UIHAfX9pLkI:vLŭ S߾@)I'!=zNujK&→n5ILn) <̽{ehѢ~WHDDD$]v6͛6 n6`Fĉ@˖=< @V';6nкt ]ט1@۶)S^#G*B`H_`j 'G: 1dеtE/;vQncǎEzzt"""РA; `e\f2uJZرC: モNAz?/Bkf`&Hnild/``ɒ%hӦ \"Y:ul:^J5C.ݻ,_8!(QB]P.SWLrxȎ-ji+$GBjNU$,;[,UNa] :$Q`׮NAzL@Z) $gѥKԩS(^H,֭[HII۷qҥKرch׮<*Uqb"""25jt<*n{NC^0w-+yWDjUufd$fokK!KUԔ)ijߜzӐʕS+&O6ξw_}V7R,5mw=ʗ^xA: Yy7YT0p@899Iyׯصklق>7"":uBHHJ,)HM3g#C/fWiBnV+7CK]Pǫ̙tTjN}tOUԭ.(tD`w5Ӑ OW us̙\5djT+Sg̐NRlfޫR ~'[vqQ:uٳ 4k?}4Ǝ HRRl:w2Hm$d%K%ѶCW#ShEG7oJ'!=ڶƌNׯK'!=ZƏN ̞ `dIShũ}BtңiS`Ml25ڦTP!G+>yWFPPP>#""" 6D6PH;(`%%~}Q+PR-B+, XLIHwo5)B(-2={Hw$Gn)ΝS-Y&>u *(PkڵknݺIǰcǎ1{.L:FiLDDDy{pU2&$g`z CBOVaP6qQV޽t kHH֮NAz t$B!o$ǐ!@.)UԾcd wNQ `UTvm:tUV}oQQQXb@*"""Om܆Zj6X@st zT7Ϫ1+@)6nmNAzd&`״i@)nU*OS͛Kڶ  NAzMl)W#\jհi&*y!w9WF Gv,# 4T: =ŋqي=h X%"V*RT(}Վ58X: e2WLNzZCdWNv-t djՒN-2ڵS8S*[^iժFΝnZDD*Qc['S̟DEI'CΗcڵ &MBdddF ,$"ul=z d6lPnݤ83/y+V0vΞQ** e2 5NI'1۴IzNBzL.(8!lV5xm>/F IDATMѣI̶mSsj$ĉ/$D:?*ou#θqjNK'qްx] 9Ei,^lY;..ΝV,xݘ3gMYu9L`JuG)_Ϟ"+lרv颊+(K=3ڜVV+t.ZR^Eի4jnUC57֪FԪ%Bk:`w\Eg`<"ْԭ+!pqss.@\nP"@izjcGAXX5j0p@#66ϟ1zꅚ5k"11NB {lA%$$/xh߾ƣ 9YPPt~~ȶag?899*/Փ'Fiߜ8)_WO: Y5 c$fw\]xYY8㑑r\? \IO$\ @ypP㏟xƃiij8WW+F⢊'s=`r57NCOq?;/xT"/_V籜wr\|F+ uuus<:nv͛[;巢Eg1c?Xr0...=vU$h֬Y۷o?ҥKx>>(T BY̙t6s*~Uc|8d}Gam8nU+7-UiS7}$$rT/͛#FHк|Yͩx$^Xxڧ당[࿧NYUʍ;,0 jk}8 ˒\?T$`>)NNNxwj[o "ʭcԩ6?)b19~\IONp20/u1 ƊSQ1M>=~8yRc[3gTC K0ib=~Y]> lsTKVķOmJ'xXd$h;RS1j\lWZZHGs(usZRR#2.Z=F[-zߗ-[OIVN.lu'zt|"XnlӵkW6 ݻj)}]5 hΪڵ)-[J':t[]P kBh`)jNeoD!!ʕpј) r-4Z8v XBET`Ο?ݻIQ\rAAAO|Çqĉ|JUpnn)֬NA`AHسVHB ;S }azUP%* f̘1ϟpAHLb|%CFFF[]w#&_SA8xwЫW/KdIuA>ҧNH/6l@hdUc 4EjNiTٳIHeU1xq$fiijϸSAAz@S{[ u?!>(SF:YF7\D _+9LιG+??}o\YmRIl,L gN"Q+W`4pg,ǁF 0猋3!"\ 6}*>fφӧu'n,)x_;S9sUG6]fϞ}ߋ+… ̞ B!,,)SШQ#T´iӸqSԩ-) M47ԝڙ30}:\ۯ-^2[9x0r a2YI;Gݺ0lΟW/N"LĂ*uX 0y2\dx.O? 
GNa-*Jݧt'nfP o\S.>.e6""" a&}+F63n!441p@ZlIrxBᖢٽ{73gΤ[n+W+2j(Թիv>ުG\(V?wy{WXG}DΜ9]U!0c 9;+ƚ5khԨ(UƍS ǎN!$ vt'1t>M}د% JcfN_6)vQdΨQjA!I2|vQb gM_~tȜCϾ}t'ɰmO;pC[Q= 74nY1%K(Zj̙3ڧz'Yd I!ױcG>,^Yj^$,Xv%˘i?8^reA %J8gl{XaS6C2SX[ Vw aR;P 6X\mR%)] 7N!>p/TR^,X@w DGGLժU&~i'bp»Q 4zM~9ƢEM!ըQ;vB``8B BLΗÉ٘-Y,9l*?ZwHJR=՝D#wnuMN-X\stQd:A_$RSŨu'n"_ᙿp玺tQdM/aÆl21lre֮]K=(S ~!l`O:uPx[rԩ~PIжm9?#vJ>!OrJ~W^xqn+e3iYILb0`tnsFUgP|ԕ+0}:DFN"QZP681""t'&uh aL8}SzL`l ӝD?ɓB a, 55}pB:vH 7񤦦ùsx"'OѣX,lԨ9!9s{|ڵ^Kݙ6mCYt)3fpJF!4hah߾=ٳgGIÆ0dZ1HINeqKm_~qhJ%K2U}օR fqL&@B[ծ #Fԩdp&Ow1WX7V@ G ʗy!/~~w_w#);wHLNr|<^\vԴ4k?oLDEeܧ-ȓO ?ԝ$åK>5a{y)ZAy޽C{]o޺EJj*))s>6Wx SS5 6vbb dIل0Bˁ8pNצM g۶mwܷo_r׾Z|9~!9s4<BduJk׮ӇJf5 ̥m[ujZI2; ,9!U޾=yyx\Y >sݜs$̝ F\Ȝ-5|$`=ѝ-xyyQhQjV@ X|bV Hsp<'HsǠ 2@i5 ƌ|L8GӦ>eV;V!(U=K{JX8}"ᄆaniz_g)3g2N+XP[.!e7^{517oe}mʕ[C7ٳS !DVӰaC^z%ZlI5rVy^]-ܩ;I~@0?'7W.^mԈW5b|w/Kmcv@&P5oغUy S{NavuM;[KkJfMivX l\y=p+wߩk7\Bc#pˋQF!g}F\\?<ҥ s~DZc 'Y˗2d+WZj{?~\w4.a {)\JB#sxoSfƞVT͸qPV6שF ʛڵcϴiD]ˢ#h] 5/v8B;~8V>, .$99Yw,afeI2$%^Y!3g]]cdULEʥT^.oaN2~>>ڨ_>׬aƠA4VnP: pp,T9> tN5VBt'p玺!v! @r1cnJ*VT_fr L8Ńs3}@y81""t'(]Z-I|<̜ ONTE `L΄/_Cz1cԮ\gda3!!́0'%NUOi#81̓?НDrVv^|kMgt_?`  2`_HL .LtwF3f63gwޙ~o_ޑxB!2ҥK > *duGfTjxn&)p$Nsuޘ6q&@ax.pIG[xᰚ5ah)EEԩ;fxέZ_x`HZqx*_ީs%M>u Z|ĨSNryrd?Iڵ42 dʕGGӧCXlwgXaWNٞk)յkWl{c/ZȮlB!s^u5j$=ܑW^g?/k~g8yUOzssCc4Q͛(  Sԍ{4o}Na-, fVt#$3/;}`r6LhO?rNcшxevST(>'ii{t6䧟`HIѝD W?N~îF{p# :NaRZMmK}ǹ qi/UÛ78ƑM/˺u뤬IZZ .g}ZjU'Z`ƾ~:QQQ>}'NoqQӝ6ӰaCmF5\:0yB͈)Y,O=;ݾ eɶm=;KF^;.s*I<;_},SSU̠ ][wa^Hco€ 1-M]S B.0nO];b Bƺ{…}M$~v?ۓ&NzþN3өF!#^n(00sw^ʖ-;pSs"̤I(R(P Q^%fϞ=k !ȟ??/2ӧOٳ۷iٲ%NKڵՓfr>ٌ_[țcb}VTlbcaTsWC)o3 <鄙plzҥKf"kq+Vxoݺŵk2Ojjj^ϊ!DE={6ϟgĉ-Zԩs^t_|8в%;S`s7nth ///~ Sst'xxu)ED׮N"lTlYFth|AK5lCNaYuM9gThR. o>p”7G>wNm_مp욘X@@jf׮]3n8sJ>S_mBc̘1>}w}0ӧѣׄSm~IZ$u;5~SW7{y~IJIѝDأsghNw k q­ӣePС^GEϥp?m۪{; tn o :aNT%Y0˲ݐk׮MX"r.???C)^8^^^# cX7oײgNF(S 6ȹs8piYz*ׯ[n6#YIΜ9yݻ7}[Flٲ>f+%1B=]OdشI56g-[94F`޼LߠD⡶nUWvQdΠAҞ=dؾOI rNcC]/vTהN7yuuڹSw ?@O>37X}9~6)#W SԠA UɄIdM/ooo.\H*]vq˗gTT׿Ŝ9s ʦBdR2eعs'3gd6ͬ &ТE z)n,8Xm~jysI觓'~2y H$kjPcӍ"sƍSNԝ$CH^~YwanM5)"PсBBCf;(2gXI;I}CI=FRY}y>S~%ȾJ]S]8rۯ3f̐ /ݿ!5c C6@={6iϞ=4/Bar^^^ 6ܹs>~ZZ$==˕K-(;ᚳ۪|}_e 8KkӹIcBKMUetQd6"-MceGl>ތtA#X`rؽۈфPV'U.0~<)/qT3>3שFeM5jo!6ϟg=M .L- gȑbaņ+y|qB߉h"nLK\fAx$Vr%"##wΜ>ܘ@"`l ӝDأHl2]7n';Af1tL/?Nb$}ʉzm >_έNeOkSe>)3jYV߿?٤ϟos}dϞy֭KʕO>[n>Bx5ksNoIf~<ʨŋ0e DGN&8<·QL'Vhcǝ4iթ`!z50ۦO?ҥ/&^!z{dH<[Vuw$:Kh*$'‹>k`YcTä$G3'ה;Zb=M'dT|ɕ#齃b/cFɄ'a|yb`k"-[+t'p*=z4\.O8_@߸aX`lxmUS'M/!<ԁx <<*U,Kzxbz-J(a؉ͮP;|O:nqܩDtȜ_W;wNa.(X|{nz٣~ h"szT)3߿-S'f#Q\w q#]R^6{pO#FrfiNʊL1zbhիQ+%y&&MA\r塯/iԨ[ly))GƲ`d ݻw[fj#fذa|͛oɞ={25?~#uF||]s!jժexgϞ5t<||`8YНDxtSsj?ם"Ţʙ}$ƵNZeS6PVoѝBX:`kךT(r.Uw?fiժTX@=z4<Ξ=˷~Kxx8'Oȑ#9sŶgAÞ={ȕ+գqtޝeoq.^ݻoIOOx1114nܘuҴiS/NPP5H"VMMM믿ŋ9s/f^n۶mUV<*T"EТE Bᨲe˒={vN?ސqDRZ1S3UbEi'HLyN7+P >ec IJ 5U4H20 Uϸ@QCG\}jHHHНF}[ ;֌%ݺeX k`AnnO6'OdOq%''sn߾]^7ofvex7ocvA Bg" )o(BA?ׯ6b НZDZRQ"#atY!Oz0tΝS!6Vw,-3yz$~]1 .2vj*736 3N9s`hȝ[wa_T-ӝ$CX:ې74!n0l,"")1c ~g&(QK"CjYe5klH!7Չݻu'ɰcGѯ$SڥQtȜ}ՂdسGݧ//i11؛Q7jd @kj~&I2,] C};|fL6B!u'x);Vw k2a;w_z8 7'^B!"&&˛7ac qWЩ֎U%Ēt'aRьrvQdNp:sFw VAм$e-|߿~|B!WN7>u$BB (|=ܺݻ3 i׭SW^ѝDB!o6lz XIIѝFySm,U>%Krɒ.f> Ϗf5k3X2KdU`AQCwaYiݾʇ,<;G/4|zϸ(5U9 4Fx>>0~bcuQ`2uMtX,Lt©M<敞~)n# B!<ƍILL4l\w kÔ);3Ϩtfr"L .Nntǖ}i3y)^|}Ĩ}.NR^q_[?]S/k9IɦB! o5t{x6mkW)?J޼;GVнC=l2y6-Z@^SX S')t'q[o/ZD:eTfЯs;$<f4OEa矇tN;;3L^g>n瞃F` vHdz8IIɦB!׿Ezzc6iȔ}ޭX,{6*d>sGwaݡuk)<1Nv?-[ LԪT)s/ڶ5CU=SUlz !BÇb~~~ԕB1cT'3ٰ֭ӝBkHUKw k7nBÆ~NnN(b]GjԤ ^^^Ð!а9}\S7qc)N؜ʒoߦK6[׭K92д8d.XBw aB%B0`᧼Zlc a`(YRw k+WwN! 
eNaҴ…B)Y[NRrI}_gax80WYZw ajU)_N1M¯aaN /O:e>_;0B!pS&LlJD֓/ZəSw ɪ,ѣ{Mܹu'p** IDATo{ſ|t'ɐ$n0`t~8r)(W7T`앖˖{d˦…u'ɐFݫ;{orV唱T)Zթoŋ;e^X,N"LD6B!Ж-[4iK/>6+_^-(ɕ+0mj-O2滦` НDأdIl&0{6>;)Y,̚ŲoqctŊ~]]SNN"Qu'ɐɓIko'ӹ3V@uM9m~%%^ǏN"LB6B!́ԩe ^{5|}} Wԭz0y2N"Qe&/k%I=~Z"4(uMEEY}y70zB.k Wڝ;1m6mru*WI R6bb`pAwa*UwM]mz۶ԩDfa/gNdPKVhbc4qlz !´ܹC~8o/C楗^͛-[6x ![CSX{hf+NR$$N"Ѭ;p5K[nz`\Bەb'u #:*_ڴ]NKLԝDh$^B!0.\HJxwb,RSSywر#I6BY>>>{N[a󳺑#Ui:3ٸ֭ӝBk0xXv]֬ѝBk仙lW?[L *?ƌaӁX,t۷x1 㬓?;z{3{P>p 4nl옎ڱVҝBثx\:W۵ ~C.:u ݺѾaCc+_|ի4 ?#>&zDD!W^e͚5YlٲQfM{9֭K:u(i IIIرM6i&b`A *Ă \>h\Q", Q} ѝFتY3ضt$NYԆJ0iTݧ-ҝP 7ovO ŊQD *,I%XO+Fqh89oO[x8lݚ-Zvƍa@u,"#a4Ӡ ~֘Źsjk r7oÇ@%RrRwcE p4畄EFoO˩S8{ѿZթûF?Y ӧ;#ΟW&TpB!GKOO_~UX1ԩCթT+VbŊq"͛7 ѣi !Bx8___V\ɘ1ctG¹jքѣuS'ͷuL .N"QZPn!{lۣ&L 6UZP6X6 "#u'(_|ի0cNhՊ&Q8~=ʔ1穸8uMEDN"@6B!ooajR=E ѥKɦB!?Ahh(M6Eڴѝ/2*)){tNa-4T]SoN"Ѿ)kЀ gtGɼ6mK)?J޼;Gн.^ԝm<6>7%CЫTIdK!ȑ'm64tzLmիu85ҝpON!Q`AM7F5ӝ`N"ѫڨn#@ ɓ\8׽P5aRHM՝DD6B!ŋSXKHPt'(\X-4mb"̛'ON")__NB@~fnծ/gAYQS 9sN!) .Gu'PT@F ~7=5,=%Jd9g|t'ɐ$AY B}|g\|/ݻL5[.} 8ooyQzIPc&.Ґ]UjԘ:ΟםDأR%ܧr 4>^ټhRF&jf DQ36Z w L{-65 ,ΧKBLlߞ}\9rhN|*Ub֐!D]1cPH+YהxpӺȦB!B9h۶-+V 66~cO29s[nT;Ν2sf=M-ZT.?n`7lV ˔D\*lR%tv{mtoB_=NoL0>/穏>N̜cǢmӦh۴) Dzͱ쎏˗WBӢn*uUu$mjl$ӧKA3(ծ]O>$|I@||<֭[7_~ݻ;ŋGӦMѮ];t ]tA"0MQyz$V+ÂE-65X ^]h0P^V;Ú5ҦzJ; zNB&^xAV nFE, GSRp$) RRp"5Gqlz:^8 ,BBBPLX\9 CԊTAԎDuQҡmv/}Qof$˗E?rŊo-7ވ;;C'NDj*8~4RΝ/-U,W!*-+zDjV*V*Q0-g8\))~ *,zQWT)n[}ѣػw/<#G->>O?\jժjժVШQ#4n 6dHCh)FNN#9sNC&1cgΕοӐh9O=a^;ÂҦxr?O}?nviS.i'!X""""""d MS-^ ,]LhFieKv_~)* NF[k[XH;zYSح\ ,\L=4оv UxqZcыBBd!յ8XY^; jNa7ovv 25~Z;7Nax1lv rE/"""""\ҡ!=x`^$d""B:KN␑!kڥL//m\9$.fɔO|ʔp$/KڶM; (YRTJI23e͛PyZU;Cv5X""""""ZFGu IN&OMSIDzҡHRSSC:u]F~Q`PvI:sF; h>\;2i$\2^vsL t额nF` 'G; 0U;ݦMҦ~ݵS +ID>@Ϟ)~U֍|Y; [;Νf*/NSǢ;-Zh3Qp3hR;ݗ_Kh SF[k+OSgS;݊…)3i[ X@;6 Q;ݚ5)<<jNa7o)Tt4Pv  NA(v ~L4hONAbbƍS-^ ,[L͚iSϵSi,zy!%Kj'qȐvNB&åM-Egv$dlY NpLmv2QJ8df~lެLJZU;CvLǺqv2Ԩ뵓Y,zyA='IDTtT`T!$dNkSgӦh'!5k{_Z0s&LT&b-}<}ID*r!QXvtw{Itv!""""" pw .!88uJ; h =Z;ѣ ɓIcj;~4 8qB; h\:ɉr:vL; h$TR|JLNB&5 >$E^DDDDDD _;]l\ML<0pv 8୷w >;if̐Q:|2D;ݡC2$dC`pvSOk'!m#Fh;rDfeHNNREDDDDDA.]Sm̝+P0@_e휬,$d)-[ٳ+WGzNa믲nID^@)v-32=}S#^`ыĘ12X YXX;=Z; $_}|v 25rLHV.NAyhN;ݪU)԰a@ǎ)֬>X;<I;ݺuy8EDDDDDd*&SG;݂k S11@Tv oNAƏ4Na'_k S11@),"=hY3vD^xhB;gɍ^DDDDDDålY$/ʴ<۷k'!eJ NpLImv2QJ8dfԙ[h'!RN␕%m꧟iSիk'q,6zz$d*&UK;ݼyڵ) ="*J$i ;:uM9L)ISIDFצI#GMSI -A<]hۿ9S~ӥ 0hv 8L{/0dv Cdd3ID)6uv2Ѷ-0bv #GHLi.!A Ni')X""""""NaϲIVv2ѿ?Эv -[d+WGzNaY~^޽SmsIG9Wݻwߕ(t&IlLɚaыȟFW_|L yv +ئ3i[ cdj0HX@;2DFu뀹s9B4HF9s$ ^DDDDDDDEi[P:)8EG˺dB)~QpsR#|ŋKSqd=@gr4f вv /s ^DDDDD$$D;Cٲҡ\v˗em۴ҥC9"B;CfLsev2DFj'qʒXI;  TI,KF_Lji7djxn]v,S/X""""""uՁ`tA$dV- $ii#IDҦ邋e}Hy N␞.k1j'!r*]Z;CFݻⴲe8\GHyƢQ~iRQ $ǎ''Nh'!-ZtO SML=L4j$ʁ)yKHNB&6L k'!QQޗ L*QEDDDDDtI\0sҡs!)P^t !KMNB&ڵyF;]|)ئӝw#GjKHbSIDV)6nNX""""""oݺiۼYFdfj'!> 衝 ޽^Sm.=]L)@{7pv 2խ| $2%kzv2)NNX""""""*#Gmh`"djm[T . h SC@evʋAd} ٸ;,$dbkWG,zY",\XL9f:d*&F $K_|L'˘1k[XX;5J;BE/""""`tv2Qt(GDh'qf*, ?Nbliv 2"յ8}v2 Ԭ Nv T N11@Tv Ѷm)DZGu IKfNB&W$IDdP$o9|*U%\Sxe8\(n߮L-+m*<\; ^DDDDDy`D V-Z/q`OfR$IIr:zT; h8 NIJHNB&4f >^; Tj*0mpv2QN)^DDDDD*.NFRi'!:k;xP:jΜNB&:v Naw0utRixv@rv2qȑ)N&Mb VZGkKL}Rv2qn-ȺL|zNam0kpv2ѻ7Ыv e=GNa{7@Fv2ѭпv 3ӵ NaL}v2ѥ 0hv 2ĢQ`"djm[vVL t蠝٦ِ!=h[7O;4Y;݆ ܹei'!J+3@Vv2ѿ)EDDDDT,ZXLi[X\;4Nax1)Tt4мv >>T;;V $Kʹ2a +OS#eZV *,zW4[j'!aaҡ\v,:y&$dX1)V!'#ず5S8X$d*&SG;݂2:SL nBEOixA g35&;'mj~$djUiSIeݜX$dR%iS%Kj'q}W.mlY$/ʚq;vh'!eʅDI._RmNB&J6ĢQar80aKY+|%%'Gj'!^:uJTBv2Ѡ\HRRɓx$d"*J:ԩIDݺצΜONB&j S^DDDDDi5aôS:$IDv3hLb;}V;]B0iLn .1Q.$JJNB&ZƌNaw\qv2Ѣ0nv ^DDDDDѶmו+ID@^)o.]NB&zNak{2=݁~SLLt n~`Ly$dK` vqq2utZv2q!) 
+el NÇk[X@;:vki SO> tn`\NaYYID@n)l23GzNAnEDDDDT-Z,_L4inb/Shysv}|v 25v,p).,NAFZNaWlSlHMv|#)8m\`ы0ʒ7mNB&EUN␓|I@͚), ?X^; Na7>fv 2 ԫn`*d*&__;…)TL Шv rE/""""9Yb~$djU:4T;Cz:;@lv2Qt(,iS{h'!ҦʔNpcv2Q/e`,Y7OҦ""8dfgt|BH$t 'GƍC9$%IJLNB&6 6 Lk'!7$ʁi`Ta$dn]P$gӧj'!jy*-M.NNB&W6¢QQqtj'!O?.><HINB&xYv RJNNB&ZFNaL z >-[ch;vL.8yR; h7N;ҦNB&6 8ہ.]NB&~G;ݮ]2XFv2ѽ;Яv ~[Ѥӵ+0`v M?Lt ..1CFP^``vӦhB >;Æi?EDDDDTԬ^-Sp:vkk SO> tn`\Naq#0gLu( 衝n6Y7e$dwoW/%Ke˴Shysv~ |v 25v,pm).sC;ݲe)smh`"djm[vVL t蠝cы(>P| ԨddITL Pv 5kS^=v H2~}v +Vh S11@F)//NAbbd/RâQQ.kj'!U%Jh'qpx`$dbE`xL$/ʚq;vh'!y|y$/g4b|J6!3S9ܺU;  "#8dei6i'!!!ҦUNRdEDDDDT%%'IDÆHIx$d⦛M> L6֕bj IMONB&jՒ@r,0cpv2QF൩sM߯LT*BCI,zu@Jv2qjKHL֭QS%&&HO˖W 9vLɓIm/qiSǏk'!͚E=L4nxm`ы]22݁}S#S^Lt D;1hV;ʕE)@)V,NAG;E9X0wСԮn|IH)&WO;}LEGk[XB;6Nah|v 25~<Фv ŋ/NA͵S ,z݅ Ӟ=IDŊQSveZ]r夣\9$.fL)#mbE$WȔt[j'!%JHJ$YY6i'!ŊIVM;CNLI@͚) =z#IM7Ir II6Lԫ'$5:8tH; ];ٳIDצΝ6v2QP$rqZlv2Q%j,zsGH"9Y; hF $ @Rv2qW 9vLɓImZqiSǏk'!͚^+) 88zT; h8pTbv2ѠA]VȰEDDDDD#W^L< )~8^; ~'SH4$dS'``vӧ/ >;Æi;tHFj'!O?.>^.NKINB&xY^DDDDD~ei'!O{?nvY%$dᇁ>}SLt망P I!! [TuꤝLt$${j'qKKYrOkvҹpUwHk,_(<)z*mNr%PTO;=$-[l.ҦxByAR矵8]+mœQ8Pw#S4]DDT[nw؁͛+&""opw""""""r[;E@DDDDDDDDDDDDX""""""""""""ǢPDDD-44իWTZ'"r.$D;QPbыo'NЎADei' """""" Jސ^DDDDDDDDDDDDX""""""""""""5 7-[j "ΝSKRB$IJR67NoHDDDDDDDDDDDDAE/"""""""""""" z,zQcы^DDDDDDDDDDDDX""""""""""""Ǣ^7o)>K,K;  _nۖ-!+gΨ+RSg幱,re)ŕ+=Ϟul "BQٶͿiiunq"#|Y_v @T?k+W+u++^#) XXؽطھV2@˖@vw:xWKO5 ѯ?Qlsp"p̕+u/ɑ8Ko67J{` 23QH/"""""MHv%[nڷ:u:wJNEqHqh0kyR+? 4mz99ԩJ( ,\h(Wr aaw]}uj'*8? ,]jX"?EDDDDD ׵*TF4NC Y``tO;&M#ٷ+_8}Z:"^W g`X3?tcY"?EDDDDDG9y[!X1U+vmvp0p:x5o IDATY? ݻh`N !ǰiWΔ, 4kvLCZ<HIcceJ <42k>ٰh??]>*S8!]];[䟗_D}{X"?EDDDDDykx10c}AqOT/Zv]f,^mX:AתYS^cxbSѣc+~]P6ަ}{`{w <>/_u> GZcܱ,_›JdVW֝;v.Y-S0֮^yE.pE/"^DDDDD'O=$x㣏}?ŋ^\hb鄧).h,z…siS̏o H7z,1?&;Zx}xncl*⡡Cߖi l` `<۱ED`ы))I9}vŋKtic][oy6,Lցjx^y?]Ϣ˒TZ!!qK/I~x1瑩__:Kl,pW^8/J˛QO> ̚Udnp>n.<T vy)x@ٲ̙z̔<(x$rrd7p^*UJ}utv;AQ#GpljyXR.͏L*TDD^`ы˛Nnz xM۝<)W`STFaYRz}(!kp?ǏiIǎu]No_d "աwըcwra'11jQ^DDDDD_tJGy>)p +2m+|t钿BB=jDDժy]dת0a231]QƑE/""""" ~!!)9O/LZrrG 5k _|_ . ] 﨣75ڵM;.Qcы =od ڵe-ğJUOHԙoN-LŊs渞ȑCDx ~}sg`24G{Go" ¢y1`ϢeZ̞z 8w'e\\+Ӧ9/!`wOM bEg!" ,zQq}ޭysgѲzv Vs+Wukws:vG|".pjT"иRS?Q`ы  *v;w G?ŋ@Lm &;/pN.D$-.]NQ5hvǎo" .5jxަNx뵘ܙ:}i:@nǕΝ[o/5H… …AD$X"""""¥re$%v Fof@Xt!cؿ !*ʎZR!,̻ʔDDA">Q*mr`>Μq#Lo-k:YiӀ,Yn!s p,dfv F̘~Zf &7J:u,DEՉ]\PRRy?GA9yx,M&W]cы] ̙hHMPj D!?Qzm€{q.]> XP:*ϟin[ÆI?\$?myΝGVI+w$+%JNSAu4_+5S/Xz6+K1{G M@TTgrt8_IO8r.۾\ܖ-rۺU:rG*z9#=+T5FNuodek>7˖-As^JJ JZ*?Ęw_̝+M<)[nG9_;+Kn?&#Cs/K\?eɱ}ֻ K [F}<4\"9s:;W\tiM) bN`t/{ 4l(ˀ;//y1ŋK9/תLGӦKz^."B>O ~|oj7}j>劏f̐zGx޾V-,Яй}Je x!/""""""|5iZ:S'oӽtZHO^xG'P駁G:Wqm7gؿ_%xWm&#~ܭg<1w.0bw&LmsٹSK<)R^m2``yyZN0{LG'^xAFe)~7k4^"&HqWw|!Рdns}3s옌8{,[A[xۥ潘 yeK#}y5˒"ʸqލvL)sW(QkgJNƏs­+M ^DDDDD_ZE}dOW(%KQd;iBj'dR:p@?k^_xK:-ҀÇ_~qbŀ[Fuڑke4D^DUKggVANޭ6n4+R=[: MRjVF awm/ =k浝3gYѕ+tLH 7:A)Δ//={w7n, oք_)n~d輸8amwuL`,ٗՓ%nv:߬,ߝ[-,kڻײRS-%JL%K,kWϙg9'mzW^of<*]Fs\?vF˪]u-eYVޮWc,;ֲBB{+]ڲ~-dI?[6I%'= '"W{%>52ɓ}>㏖U/^ܲzʲv1YYzeOaY[z>f?#߹ӲV b,k,~w._ߗ28IӊLǸxѲOwzRŲ?~ͲjԸ!!5`eΖ`v3-kYk=_y?eI1hв>Pk;fY/hYJ׿ZVNY9;ҤerۻײRR8zԲ>̲zsmʲ\=ו+u]}Y֫ZVFefZ\g?WU5ye9sRS-k4\YZR2珩Yxa}з-[ukIU/qPd6~ݻ}˒k{]}V=lMk3ҥiI]u#""""BGz}ez>ܹ{diTCWs}fv} Ov\ j_{4+;[F\zZвbc-)JOxZJF+}`L=zh'7QZ}|+9YF-gYrpսee*Vt g9{VTɻ,'OZ։^եeedF42,|W-#W9ڴIzVFxx]27(Zs,kx uݏ??ƫmۑ^չe *Wߞx²yIJuA5kzR<Сd;2RF b@?K&y!#rr,ǝ}ƍېs}Ϝ9{7@?ܷL%yy__>~פIgDDDDDTT++˲|S:p=ϗBL}\9ٖլ۷Ѳ ;n׷"UFNߟ׸qdjt=xPڬvV%W_yy iۿ쏢eYV^cRc-Q|]w_uؤ,qqBɓyuӦ{sVʽU:ǹsuM?&,̲vkYRwVȽM>Ms;} y=X[r`AOvL|mbŤM9f맞p:[1nY#ITQQ_Sڦ,G^1c=Vdӕ_ΞmÇNYrն-`y&3G﫯d:WS? 
q6ՕOM\0v߯T 2lcIL&Mn?S4תV)BCB't/GM-ޠor(,Ybz03>ի;iiYP__=)932}'>,k&4i)#]ȜFɚt0aT̳Vd A2~}:wvMﳠuzu|MʮvZ[ 3xqۭ\iWA> \ߤW(Yz 1#UT)|;wWYl UL 5,/, h='<Ι3RU˅ bZ'(6yAX1+Ov\]弰4twYgʗ5\6ͻ  kʟGK|0j5k^ٱc=嗝׭pfyBBbW>77+HWNVΕΐ^ƍZÆղSG-H'?{YW UWۼY:]1MWm'O:=[Ώ?eey^\9onKM79&$}Kq͓P;2j܏<իck=ިRhy<47߷}^˛,K˨]w^}U.rhBFT.شIt9'-Za_~VZS&Vҙ yƁƛb Eߞ 7%seȫ]zՒ%}ɑ#z!!Wkp*bVb"z>P:_+;[7ɋs]o{HHp]ݹ}|hSi~iOvъGtFJ2{w}fΦM,zNoHDDDDD̳uaa@F2ާ IDAT2(Ym;'J{< g?|Y%11rPXT)44L]]OQp钌p_~\˛u_I_hq7/ zg)*deA/]ߗs}m?{>z)Nbaы{w]UuB0 DhQDD<%BE}P ڊRUh>dh@ & S@ 㷲2{>{o.|?kE=g}=So&""""ozHR?/~=0iRkX=5r^'6t/ |+0,z8S󟻟ͩnn ȟ?< E ffze`[#*۴q֭y-^l=VMys*37{:J8x]^,] eM3fWoT0xM`~%%˫KIo=~~TsmU+ ED{|f) """"""rRϤaazM7X)66nwo>s.eecD? n w)!rTO? wr$8tȷv68_)Ds?۷޽HII~ ̜_OwCwM/;ŨQ5./X4l(ϣJ;keב#%Á7h ? կo#ysguҸi"""""Ką _{Z`Qx<%818uU  M7l*+v}}sK7J|<^Z&N4TYlqK#o 2୷>}~gjIt.(ۓ'_m/^yyՊ =z7}€%A/""""""Skݻ?|ue믾,F"(Uun$^+&-=s)4iʕ{~%dh]N59mnE7OٳN=]i?iiP)`""""""frWׯaТ~ڽ)t=do.vr11@RRBצ~d׃M͛-$ave.F/~XV[O}ohܪUeǵʨcǀU$_O)I+)5g۶~eN~Ǡ'۷u߻v`B ,, X2GP<h@Ւ>ؿeK$j?ޔ;R#fH[zc_mlw2c}Lڴɝr;]J={ԅ- n헯;oRSݻj?,Yx׿ii/]hYlz".Ngsz_~EA/"""""ʬ.eb8[7e**>L}?bce R79*JJ,=cpz%'{o$ lR鮻̖o/h2CQ@ˬZ>UTX|y=x~ 11{RMYopu Qu;wOLtVny9' ;= ѯKLƏS܆ 2+ML3Ѩh|U =/rs<8z/%k֟GOGد~^ˁck"5U2S7Zq1oW)2X>*ll**wLxiϿrϿ7o| R_"k""""""w>_tA˭|iSzܹ2squIݒ7x1С~Sgw }AyMet}+4n Wӧݓ'ufffe[~(s)/`W1OF3 $(Q)2~u )S̶o*/ny'rUp<:{6;2fgJzݹ8rٺ_}߽;=.nn-[̚%cƸ[6A/"""""rI`;(/8wYQXhɓϽ{|{w;BgL 3|?.J|/ɜYƴ"i뫯Gå]xQW$b <^Aߪ*Х))k>)XLG:6Cgy=<ϗԹS<`um:wf㏁ɓ}o~|69{0GPʯP֭>utJIiǒ%fi`R_Z/oo۰C-W v>:TƴiiB8nz;/[w}Zcy}\Fpk^#@^[SSec}k7Xs..y X9=ON ^?l6z6o7}JtxCt|{ET9E (y(ݬ,z;=v?WU+|`ڝfd(5rdmiw^~AR͚YDRRe9u5ʺ^xAӧ}ի߉s;vҳR.jv6h_ɚ5J<}~{+ӓwߵu׉26.7R[ltR'O:K̚evlJ)j{~QmJ .ٳJmڤT||{}T?ryԼT5Y| w~*թپ&Vj,_nާ)zZ/>zYsaaJf{vmbb:|X^i|7&W+l^BNVYN+DDDDDt+?4kFV yzHiI2?f7JCmJK̔F+a?I Sj6TxCk\^uwoS)uRGJYO<9Ҧ,oQ_^zImZFDG+ηQdd(d}+u= +yR)ըcQ>?~&)r>ٸQ.zIIJkϢ:T3ZHG_;j¬Q/))JkgV3g~mJJPn uРnTZ:UWNfVo%Θ1_CO=ԁJ;'Ad lYs'pY-7WȈWdR?/:_9瘼:w~)rz@KV粊%v]u͛]iiխQ#V_~WJ=R_~=3x,d @V@orNth޼j>}:lu_Ujy̔g>}>U׏YڵfY盯x@y:atF=yR;tȑ=O_)TGQ9rD~> |_k TxfGeKIҽkնe(˵l)nV%)JL |Y#0c<1+}KWwe p-e?_Ni#scΫWۛsgסwoڵ.WR"s:%CIhOI ٱh| x%Igz5m*t5HZ¦M<8wNy9{VR</@k Tӧ (/\R(z2qK.#C-ɹn[Mw  cl_Uwꔜ{ii*jj{Y4Pʹز#T&%eDD5sgaCI{ x4ol&*}2OnY%ArjMΡv\ׯ7?6%`^rv*LJΪU6I:[O+rsÇ%\judnI4c=) *9v:t{gͫi壏$}'J5o3//<\ΙD筸XRR<#5yjX$%E1u4]V=O^ٺv~s_hB%Z3_NK1N,Zdw1C@Rܲo__:kӎG.…ԛ1c$WNI{@f̰==]RrIQܤ pr?={Acn8QӓޗWbb>MխR[,;/<]\e zik5{}lq{K~nsxIKsw;sQjR#FM7IJYVz{U/<謌6m2}{挳}6guIMkB?A +1^zM7ϓW^ 7FUƌ1HIV_,]V^w;ifYa];3iM_11CuG vR{Mlե9QQU2j>d85;&5ĩSJ wANGRp5"""""ڮx} kKH  ]vxL~D2g(r""BFpC}U);{/\6LFsmV5B;%ˁSZ[(C74i"rwF&_1!ٳx1{' ѣ@bp͚I:8`͒[} RXp! 
Bolyen E, Rideout JR, Dillon MR, Bokulich NA, Abnet CC, Al-Ghalith GA, Alexander H, Alm EJ, Arumugam M, Asnicar F, Bai Y, Bisanz JE, Bittinger K, Brejnrod A, Brislawn CJ, Brown CT, Callahan BJ, Caraballo-Rodríguez AM, Chase J, Cope EK, Da Silva R, Diener C, Dorrestein PC, Douglas GM, Durall DM, Duvallet C, Edwardson CF, Ernst M, Estaki M, Fouquier J, Gauglitz JM, Gibbons SM, Gibson DL, Gonzalez A, Gorlick K, Guo J, Hillmann B, Holmes S, Holste H, Huttenhower C, Huttley GA, Janssen S, Jarmusch AK, Jiang L, Kaehler BD, Kang KB, Keefe CR, Keim P, Kelley ST, Knights D, Koester I, Kosciolek T, Kreps J, Langille MGI, Lee J, Ley R, Liu YX, Loftfield E, Lozupone C, Maher M, Marotz C, Martin BD, McDonald D, McIver LJ, Melnik AV, Metcalf JL, Morgan SC, Morton JT, Naimey AT, Navas-Molina JA, Nothias LF, Orchanian SB, Pearson T, Peoples SL, Petras D, Preuss ML, Pruesse E, Rasmussen LB, Rivers A, Robeson MS, Rosenthal P, Segata N, Shaffer M, Shiffer A, Sinha R, Song SJ, Spear JR, Swafford AD, Thompson LR, Torres PJ, Trinh P, Tripathi A, Turnbaugh PJ, Ul-Hasan S, van der Hooft JJJ, Vargas F, Vázquez-Baeza Y, Vogtmann E, von Hippel M, Walters W, Wan Y, Wang M, Warren J, Weber KC, Williamson CHD, Willis AD, Xu ZZ, Zaneveld JR, Zhang Y, Zhu Q, Knight R, and Caporaso JG. 2019. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nature Biotechnology 37:852–857. https://doi.org/10.1038/s41587-019-0209-9 qiime2-2024.5.0/ci/000077500000000000000000000000001462552636000135365ustar00rootroot00000000000000qiime2-2024.5.0/ci/recipe/000077500000000000000000000000001462552636000150055ustar00rootroot00000000000000qiime2-2024.5.0/ci/recipe/meta.yaml000066400000000000000000000025341462552636000166230ustar00rootroot00000000000000{% set data = load_setup_py_data() %} {% set version = data.get('version') %} package: name: qiime2 version: {{ version }} source: path: ../.. build: script: make install requirements: host: - python {{ python }} - setuptools run: - python {{ python }} - pyyaml - decorator >=4,<5 - pandas {{ pandas }} # tzlocal 3 is currently broken - once this is fixed drop pin - tzlocal <3 - python-dateutil - bibtexparser # This is pinned because networkx 3.2 was trying to use # importlib.resources.files which apparently doesn't exist on # importlib_resources 6.1.0 which is what we were using when this was # pinned. Pinning this to the version that was working for us was simpler # than digging around in importlib_resources to find a compatible version - networkx =3.1 - dill - psutil - flufl.lock - parsl {{ parsl }} - appdirs - tomlkit - lxml test: requires: - pytest - tornado - notebook <7 imports: - qiime2 commands: # TODO don't require devs to remember setting this env var before running # tests. The value can be anything. 
- QIIMETEST= python -c "import qiime2.plugins.dummy_plugin" - QIIMETEST= py.test --pyargs --doctest-modules qiime2 about: home: https://qiime2.org license: BSD-3-Clause license_family: BSD qiime2-2024.5.0/hooks/000077500000000000000000000000001462552636000142665ustar00rootroot00000000000000qiime2-2024.5.0/hooks/00_activate_qiime2_envs.sh000066400000000000000000000002241462552636000212200ustar00rootroot00000000000000#!/bin/sh export MPLBACKEND='Agg' export R_LIBS_USER=$CONDA_PREFIX/lib/R/library/ export PYTHONNOUSERSITE=$CONDA_PREFIX/lib/python*/site-packages/ qiime2-2024.5.0/hooks/00_deactivate_qiime2_envs.sh000066400000000000000000000001051462552636000215270ustar00rootroot00000000000000#!/bin/sh unset MPLBACKEND unset R_LIBS_USER unset PYTHONNOUSERSITE qiime2-2024.5.0/qiime2/000077500000000000000000000000001462552636000143315ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/__init__.py000066400000000000000000000025251462552636000164460ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- from qiime2.sdk import Artifact, Visualization, ResultCollection from qiime2.metadata import (Metadata, MetadataColumn, CategoricalMetadataColumn, NumericMetadataColumn) from qiime2.plugin import Citations from qiime2.core.cache import Cache, Pool from ._version import get_versions __version__ = get_versions()['version'] del get_versions # "Train release" version includes . and excludes patch numbers # and pre/post-release tags. All versions within a train release are expected # to be compatible. __release__ = '.'.join(__version__.split('.')[:2]) __citations__ = tuple(Citations.load('citations.bib', package='qiime2')) __website__ = 'https://qiime2.org' __all__ = ['Artifact', 'Visualization', 'ResultCollection', 'Metadata', 'MetadataColumn', 'CategoricalMetadataColumn', 'NumericMetadataColumn', 'Cache', 'Pool'] # Used by `jupyter serverextension enable` def _jupyter_server_extension_paths(): return [{"module": "qiime2.jupyter"}] qiime2-2024.5.0/qiime2/_version.py000066400000000000000000000441211462552636000165310ustar00rootroot00000000000000 # This file helps to compute a version number in source trees obtained from # git-archive tarball (such as those provided by githubs download-from-tag # feature). Distribution tarballs (built by setup.py sdist) and build # directories (produced by setup.py build) will contain a much shorter file # that just contains the computed version number. # This file is released into the public domain. Generated by # versioneer-0.18 (https://github.com/warner/python-versioneer) """Git implementation of _version.py.""" import errno import os import re import subprocess import sys def get_keywords(): """Get the keywords needed to look up the version information.""" # these strings will be replaced by git during git-archive. # setup.py/versioneer.py will grep for the variable names, so they must # each be defined on a line of their own. _version.py will just call # get_keywords(). 
git_refnames = " (tag: 2024.5.0, Release-2024.5)" git_full = "981462dc70891c067625d8a24f1c185eeb32ef0f" git_date = "2024-05-29 04:20:00 +0000" keywords = {"refnames": git_refnames, "full": git_full, "date": git_date} return keywords class VersioneerConfig: """Container for Versioneer configuration parameters.""" def get_config(): """Create, populate and return the VersioneerConfig() object.""" # these strings are filled in when 'setup.py versioneer' creates # _version.py cfg = VersioneerConfig() cfg.VCS = "git" cfg.style = "pep440" cfg.tag_prefix = "" cfg.parentdir_prefix = "qiime2-" cfg.versionfile_source = "qiime2/_version.py" cfg.verbose = False return cfg class NotThisMethod(Exception): """Exception raised if a method is not valid for the current scenario.""" LONG_VERSION_PY = {} HANDLERS = {} def register_vcs_handler(vcs, method): # decorator """Decorator to mark a method as the handler for a particular VCS.""" def decorate(f): """Store f in HANDLERS[vcs][method].""" if vcs not in HANDLERS: HANDLERS[vcs] = {} HANDLERS[vcs][method] = f return f return decorate def run_command(commands, args, cwd=None, verbose=False, hide_stderr=False, env=None): """Call the given command(s).""" assert isinstance(commands, list) p = None for c in commands: try: dispcmd = str([c] + args) # remember shell=False, so use git.cmd on windows, not just git p = subprocess.Popen([c] + args, cwd=cwd, env=env, stdout=subprocess.PIPE, stderr=(subprocess.PIPE if hide_stderr else None)) break except EnvironmentError: e = sys.exc_info()[1] if e.errno == errno.ENOENT: continue if verbose: print("unable to run %s" % dispcmd) print(e) return None, None else: if verbose: print("unable to find command, tried %s" % (commands,)) return None, None stdout = p.communicate()[0].strip() if sys.version_info[0] >= 3: stdout = stdout.decode() if p.returncode != 0: if verbose: print("unable to run %s (error)" % dispcmd) print("stdout was %s" % stdout) return None, p.returncode return stdout, p.returncode def versions_from_parentdir(parentdir_prefix, root, verbose): """Try to determine the version from the parent directory name. Source tarballs conventionally unpack into a directory that includes both the project name and a version string. We will also support searching up two directory levels for an appropriately named parent directory """ rootdirs = [] for i in range(3): dirname = os.path.basename(root) if dirname.startswith(parentdir_prefix): return {"version": dirname[len(parentdir_prefix):], "full-revisionid": None, "dirty": False, "error": None, "date": None} else: rootdirs.append(root) root = os.path.dirname(root) # up a level if verbose: print("Tried directories %s but none started with prefix %s" % (str(rootdirs), parentdir_prefix)) raise NotThisMethod("rootdir doesn't start with parentdir_prefix") @register_vcs_handler("git", "get_keywords") def git_get_keywords(versionfile_abs): """Extract version information from the given file.""" # the code embedded in _version.py can just fetch the value of these # keywords. When used from setup.py, we don't want to import _version.py, # so we do it with a regexp instead. This function is not used from # _version.py. 
keywords = {} try: f = open(versionfile_abs, "r") for line in f.readlines(): if line.strip().startswith("git_refnames ="): mo = re.search(r'=\s*"(.*)"', line) if mo: keywords["refnames"] = mo.group(1) if line.strip().startswith("git_full ="): mo = re.search(r'=\s*"(.*)"', line) if mo: keywords["full"] = mo.group(1) if line.strip().startswith("git_date ="): mo = re.search(r'=\s*"(.*)"', line) if mo: keywords["date"] = mo.group(1) f.close() except EnvironmentError: pass return keywords @register_vcs_handler("git", "keywords") def git_versions_from_keywords(keywords, tag_prefix, verbose): """Get version information from git keywords.""" if not keywords: raise NotThisMethod("no keywords at all, weird") date = keywords.get("date") if date is not None: # git-2.2.0 added "%cI", which expands to an ISO-8601 -compliant # datestamp. However we prefer "%ci" (which expands to an "ISO-8601 # -like" string, which we must then edit to make compliant), because # it's been around since git-1.5.3, and it's too difficult to # discover which version we're using, or to work around using an # older one. date = date.strip().replace(" ", "T", 1).replace(" ", "", 1) refnames = keywords["refnames"].strip() if refnames.startswith("$Format"): if verbose: print("keywords are unexpanded, not using") raise NotThisMethod("unexpanded keywords, not a git-archive tarball") refs = set([r.strip() for r in refnames.strip("()").split(",")]) # starting in git-1.8.3, tags are listed as "tag: foo-1.0" instead of # just "foo-1.0". If we see a "tag: " prefix, prefer those. TAG = "tag: " tags = set([r[len(TAG):] for r in refs if r.startswith(TAG)]) if not tags: # Either we're using git < 1.8.3, or there really are no tags. We use # a heuristic: assume all version tags have a digit. The old git %d # expansion behaves like git log --decorate=short and strips out the # refs/heads/ and refs/tags/ prefixes that would let us distinguish # between branches and tags. By ignoring refnames without digits, we # filter out many common branch names like "release" and # "stabilization", as well as "HEAD" and "master". tags = set([r for r in refs if re.search(r'\d', r)]) if verbose: print("discarding '%s', no digits" % ",".join(refs - tags)) if verbose: print("likely tags: %s" % ",".join(sorted(tags))) for ref in sorted(tags): # sorting will prefer e.g. "2.0" over "2.0rc1" if ref.startswith(tag_prefix): r = ref[len(tag_prefix):] if verbose: print("picking %s" % r) return {"version": r, "full-revisionid": keywords["full"].strip(), "dirty": False, "error": None, "date": date} # no suitable tags, so version is "0+unknown", but full hex is still there if verbose: print("no suitable tags, using unknown + full revision id") return {"version": "0+unknown", "full-revisionid": keywords["full"].strip(), "dirty": False, "error": "no suitable tags", "date": None} @register_vcs_handler("git", "pieces_from_vcs") def git_pieces_from_vcs(tag_prefix, root, verbose, run_command=run_command): """Get version from 'git describe' in the root of the source tree. This only gets called if the git-archive 'subst' keywords were *not* expanded, and _version.py hasn't already been rewritten with a short version string, meaning we're inside a checked out source tree. 
""" GITS = ["git"] if sys.platform == "win32": GITS = ["git.cmd", "git.exe"] out, rc = run_command(GITS, ["rev-parse", "--git-dir"], cwd=root, hide_stderr=True) if rc != 0: if verbose: print("Directory %s not under git control" % root) raise NotThisMethod("'git rev-parse --git-dir' returned error") # if there is a tag matching tag_prefix, this yields TAG-NUM-gHEX[-dirty] # if there isn't one, this yields HEX[-dirty] (no NUM) describe_out, rc = run_command(GITS, ["describe", "--tags", "--dirty", "--always", "--long", "--match", "%s*" % tag_prefix], cwd=root) # --long was added in git-1.5.5 if describe_out is None: raise NotThisMethod("'git describe' failed") describe_out = describe_out.strip() full_out, rc = run_command(GITS, ["rev-parse", "HEAD"], cwd=root) if full_out is None: raise NotThisMethod("'git rev-parse' failed") full_out = full_out.strip() pieces = {} pieces["long"] = full_out pieces["short"] = full_out[:7] # maybe improved later pieces["error"] = None # parse describe_out. It will be like TAG-NUM-gHEX[-dirty] or HEX[-dirty] # TAG might have hyphens. git_describe = describe_out # look for -dirty suffix dirty = git_describe.endswith("-dirty") pieces["dirty"] = dirty if dirty: git_describe = git_describe[:git_describe.rindex("-dirty")] # now we have TAG-NUM-gHEX or HEX if "-" in git_describe: # TAG-NUM-gHEX mo = re.search(r'^(.+)-(\d+)-g([0-9a-f]+)$', git_describe) if not mo: # unparseable. Maybe git-describe is misbehaving? pieces["error"] = ("unable to parse git-describe output: '%s'" % describe_out) return pieces # tag full_tag = mo.group(1) if not full_tag.startswith(tag_prefix): if verbose: fmt = "tag '%s' doesn't start with prefix '%s'" print(fmt % (full_tag, tag_prefix)) pieces["error"] = ("tag '%s' doesn't start with prefix '%s'" % (full_tag, tag_prefix)) return pieces pieces["closest-tag"] = full_tag[len(tag_prefix):] # distance: number of commits since tag pieces["distance"] = int(mo.group(2)) # commit: short hex revision ID pieces["short"] = mo.group(3) else: # HEX: no tags pieces["closest-tag"] = None count_out, rc = run_command(GITS, ["rev-list", "HEAD", "--count"], cwd=root) pieces["distance"] = int(count_out) # total number of commits # commit date: see ISO-8601 comment in git_versions_from_keywords() date = run_command(GITS, ["show", "-s", "--format=%ci", "HEAD"], cwd=root)[0].strip() pieces["date"] = date.strip().replace(" ", "T", 1).replace(" ", "", 1) return pieces def plus_or_dot(pieces): """Return a + if we don't already have one, else return a .""" if "+" in pieces.get("closest-tag", ""): return "." return "+" def render_pep440(pieces): """Build up version string, with post-release "local version identifier". Our goal: TAG[+DISTANCE.gHEX[.dirty]] . Note that if you get a tagged build and then dirty it, you'll get TAG+0.gHEX.dirty Exceptions: 1: no tags. git_describe was just HEX. 0+untagged.DISTANCE.gHEX[.dirty] """ if pieces["closest-tag"]: rendered = pieces["closest-tag"] if pieces["distance"] or pieces["dirty"]: rendered += plus_or_dot(pieces) rendered += "%d.g%s" % (pieces["distance"], pieces["short"]) if pieces["dirty"]: rendered += ".dirty" else: # exception #1 rendered = "0+untagged.%d.g%s" % (pieces["distance"], pieces["short"]) if pieces["dirty"]: rendered += ".dirty" return rendered def render_pep440_pre(pieces): """TAG[.post.devDISTANCE] -- No -dirty. Exceptions: 1: no tags. 
0.post.devDISTANCE """ if pieces["closest-tag"]: rendered = pieces["closest-tag"] if pieces["distance"]: rendered += ".post.dev%d" % pieces["distance"] else: # exception #1 rendered = "0.post.dev%d" % pieces["distance"] return rendered def render_pep440_post(pieces): """TAG[.postDISTANCE[.dev0]+gHEX] . The ".dev0" means dirty. Note that .dev0 sorts backwards (a dirty tree will appear "older" than the corresponding clean one), but you shouldn't be releasing software with -dirty anyways. Exceptions: 1: no tags. 0.postDISTANCE[.dev0] """ if pieces["closest-tag"]: rendered = pieces["closest-tag"] if pieces["distance"] or pieces["dirty"]: rendered += ".post%d" % pieces["distance"] if pieces["dirty"]: rendered += ".dev0" rendered += plus_or_dot(pieces) rendered += "g%s" % pieces["short"] else: # exception #1 rendered = "0.post%d" % pieces["distance"] if pieces["dirty"]: rendered += ".dev0" rendered += "+g%s" % pieces["short"] return rendered def render_pep440_old(pieces): """TAG[.postDISTANCE[.dev0]] . The ".dev0" means dirty. Eexceptions: 1: no tags. 0.postDISTANCE[.dev0] """ if pieces["closest-tag"]: rendered = pieces["closest-tag"] if pieces["distance"] or pieces["dirty"]: rendered += ".post%d" % pieces["distance"] if pieces["dirty"]: rendered += ".dev0" else: # exception #1 rendered = "0.post%d" % pieces["distance"] if pieces["dirty"]: rendered += ".dev0" return rendered def render_git_describe(pieces): """TAG[-DISTANCE-gHEX][-dirty]. Like 'git describe --tags --dirty --always'. Exceptions: 1: no tags. HEX[-dirty] (note: no 'g' prefix) """ if pieces["closest-tag"]: rendered = pieces["closest-tag"] if pieces["distance"]: rendered += "-%d-g%s" % (pieces["distance"], pieces["short"]) else: # exception #1 rendered = pieces["short"] if pieces["dirty"]: rendered += "-dirty" return rendered def render_git_describe_long(pieces): """TAG-DISTANCE-gHEX[-dirty]. Like 'git describe --tags --dirty --always -long'. The distance/hash is unconditional. Exceptions: 1: no tags. HEX[-dirty] (note: no 'g' prefix) """ if pieces["closest-tag"]: rendered = pieces["closest-tag"] rendered += "-%d-g%s" % (pieces["distance"], pieces["short"]) else: # exception #1 rendered = pieces["short"] if pieces["dirty"]: rendered += "-dirty" return rendered def render(pieces, style): """Render the given version pieces into the requested style.""" if pieces["error"]: return {"version": "unknown", "full-revisionid": pieces.get("long"), "dirty": None, "error": pieces["error"], "date": None} if not style or style == "default": style = "pep440" # the default if style == "pep440": rendered = render_pep440(pieces) elif style == "pep440-pre": rendered = render_pep440_pre(pieces) elif style == "pep440-post": rendered = render_pep440_post(pieces) elif style == "pep440-old": rendered = render_pep440_old(pieces) elif style == "git-describe": rendered = render_git_describe(pieces) elif style == "git-describe-long": rendered = render_git_describe_long(pieces) else: raise ValueError("unknown style '%s'" % style) return {"version": rendered, "full-revisionid": pieces["long"], "dirty": pieces["dirty"], "error": None, "date": pieces.get("date")} def get_versions(): """Get version information or return default if unable to do so.""" # I am in _version.py, which lives at ROOT/VERSIONFILE_SOURCE. If we have # __file__, we can work backwards from there to the root. Some # py2exe/bbfreeze/non-CPython implementations don't do __file__, in which # case we can only use expanded keywords. 
cfg = get_config() verbose = cfg.verbose try: return git_versions_from_keywords(get_keywords(), cfg.tag_prefix, verbose) except NotThisMethod: pass try: root = os.path.realpath(__file__) # versionfile_source is the relative path from the top of the source # tree (where the .git directory might live) to this file. Invert # this to find the root from __file__. for i in cfg.versionfile_source.split('/'): root = os.path.dirname(root) except NameError: return {"version": "0+unknown", "full-revisionid": None, "dirty": None, "error": "unable to find root of source tree", "date": None} try: pieces = git_pieces_from_vcs(cfg.tag_prefix, root, verbose) return render(pieces, cfg.style) except NotThisMethod: pass try: if cfg.parentdir_prefix: return versions_from_parentdir(cfg.parentdir_prefix, root, verbose) except NotThisMethod: pass return {"version": "0+unknown", "full-revisionid": None, "dirty": None, "error": "unable to compute version", "date": None} qiime2-2024.5.0/qiime2/citations.bib000066400000000000000000000053031462552636000170050ustar00rootroot00000000000000@Article{Bolyen2019, author={Bolyen, Evan and Rideout, Jai Ram and Dillon, Matthew R. and Bokulich, Nicholas A. and Abnet, Christian C. and Al-Ghalith, Gabriel A. and Alexander, Harriet and Alm, Eric J. and Arumugam, Manimozhiyan and Asnicar, Francesco and Bai, Yang and Bisanz, Jordan E. and Bittinger, Kyle and Brejnrod, Asker and Brislawn, Colin J. and Brown, C. Titus and Callahan, Benjamin J. and Caraballo-Rodr{\'i}guez, Andr{\'e}s Mauricio and Chase, John and Cope, Emily K. and Da Silva, Ricardo and Diener, Christian and Dorrestein, Pieter C. and Douglas, Gavin M. and Durall, Daniel M. and Duvallet, Claire and Edwardson, Christian F. and Ernst, Madeleine and Estaki, Mehrbod and Fouquier, Jennifer and Gauglitz, Julia M. and Gibbons, Sean M. and Gibson, Deanna L. and Gonzalez, Antonio and Gorlick, Kestrel and Guo, Jiarong and Hillmann, Benjamin and Holmes, Susan and Holste, Hannes and Huttenhower, Curtis and Huttley, Gavin A. and Janssen, Stefan and Jarmusch, Alan K. and Jiang, Lingjing and Kaehler, Benjamin D. and Kang, Kyo Bin and Keefe, Christopher R. and Keim, Paul and Kelley, Scott T. and Knights, Dan and Koester, Irina and Kosciolek, Tomasz and Kreps, Jorden and Langille, Morgan G. I. and Lee, Joslynn and Ley, Ruth and Liu, Yong-Xin and Loftfield, Erikka and Lozupone, Catherine and Maher, Massoud and Marotz, Clarisse and Martin, Bryan D. and McDonald, Daniel and McIver, Lauren J. and Melnik, Alexey V. and Metcalf, Jessica L. and Morgan, Sydney C. and Morton, Jamie T. and Naimey, Ahmad Turan and Navas-Molina, Jose A. and Nothias, Louis Felix and Orchanian, Stephanie B. and Pearson, Talima and Peoples, Samuel L. and Petras, Daniel and Preuss, Mary Lai and Pruesse, Elmar and Rasmussen, Lasse Buur and Rivers, Adam and Robeson, Michael S. and Rosenthal, Patrick and Segata, Nicola and Shaffer, Michael and Shiffer, Arron and Sinha, Rashmi and Song, Se Jin and Spear, John R. and Swafford, Austin D. and Thompson, Luke R. and Torres, Pedro J. and Trinh, Pauline and Tripathi, Anupriya and Turnbaugh, Peter J. and Ul-Hasan, Sabah and van der Hooft, Justin J. J. and Vargas, Fernando and V{\'a}zquez-Baeza, Yoshiki and Vogtmann, Emily and von Hippel, Max and Walters, William and Wan, Yunhu and Wang, Mingxun and Warren, Jonathan and Weber, Kyle C. and Williamson, Charles H. D. and Willis, Amy D. and Xu, Zhenjiang Zech and Zaneveld, Jesse R. and Zhang, Yilong and Zhu, Qiyun and Knight, Rob and Caporaso, J. 
Gregory}, title={Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2}, journal={Nature Biotechnology}, year={2019}, volume={37}, number={8}, pages={852-857}, issn={1546-1696}, doi={10.1038/s41587-019-0209-9}, url={https://doi.org/10.1038/s41587-019-0209-9} } qiime2-2024.5.0/qiime2/core/000077500000000000000000000000001462552636000152615ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/__init__.py000066400000000000000000000005351462552636000173750ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- qiime2-2024.5.0/qiime2/core/archive/000077500000000000000000000000001462552636000167025ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/__init__.py000066400000000000000000000011631462552636000210140ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- from .provenance import (ImportProvenanceCapture, ActionProvenanceCapture, PipelineProvenanceCapture) from .archiver import Archiver __all__ = ['Archiver', 'ImportProvenanceCapture', 'ActionProvenanceCapture', 'PipelineProvenanceCapture'] qiime2-2024.5.0/qiime2/core/archive/archiver.py000066400000000000000000000414151462552636000210640ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import collections import uuid as _uuid import pathlib import weakref import zipfile import importlib import os import io import qiime2 import qiime2.core.cite as cite from qiime2.core.util import md5sum_directory, from_checksum_format, is_uuid4 _VERSION_TEMPLATE = """\ QIIME 2 archive: %s framework: %s """ ArchiveRecord = collections.namedtuple( 'ArchiveRecord', ['root', 'version_fp', 'uuid', 'version', 'framework_version']) ChecksumDiff = collections.namedtuple( 'ChecksumDiff', ['added', 'removed', 'changed']) class _Archive: """Abstraction layer over the archive filesystem. Responsible for details concerning manipulating an archive agnostic to its format. It is responsible for managing archive UUID, format, and framework versions as those are designed to be constant throughout all future format implementations. Breaking compatibility with that is a BIG DEAL and should avoided at (nearly) any cost. Example filesystem:: / !--- 770509e6-85f4-432c-9663-cdc04eb07db2 |--- VERSION !--- VERSION file:: QIIME 2 archive: framework: This file is itentionally not YAML/INI/An actual format. This is to discourage the situation where the format changes from something like YAML to another format and VERSION is updated with it "for consistency". To emphasize, the VERSION (filepath and content) and root archive structure MUST NOT CHANGE. 
If they change, then there is no longer a consistent way to dispatch to an appropriate format. """ VERSION_FILE = 'VERSION' @classmethod def is_archive_type(cls, filepath): raise NotImplementedError @classmethod def setup(cls, uuid, path, version, framework_version): root_dir = path version_fp = root_dir / cls.VERSION_FILE version_fp.write_text(_VERSION_TEMPLATE % (version, framework_version)) return ArchiveRecord(root_dir, version_fp, uuid, version, framework_version) @classmethod def save(cls, source, destination): raise NotImplementedError def __init__(self, path): self.path = path self.uuid = self._get_uuid() self.version, self.framework_version = self._get_versions() def _get_uuid(self): if not self.path.exists(): raise TypeError("%s does not exist or is not a filepath." % self.path) roots = set() for relpath in self.relative_iterdir(): if not relpath.startswith('.'): roots.add(relpath) if len(roots) == 0: raise ValueError("Archive does not have a visible root directory.") if len(roots) > 1: raise ValueError("Archive has multiple root directories: %r" % roots) uuid = roots.pop() if not is_uuid4(uuid): raise ValueError( "Archive root directory name %r is not a valid version 4 " "UUID." % uuid) return uuid def _get_versions(self): try: with self.open(self.VERSION_FILE) as fh: header, version_line, framework_version_line, eof = \ fh.read().split('\n') if header.strip() != 'QIIME 2': raise Exception() # GOTO except Exception version = version_line.split(':')[1].strip() framework_version = framework_version_line.split(':')[1].strip() return version, framework_version except Exception: # TODO: make a "better" parser which isn't just a catch-all raise ValueError("Archive does not contain a correctly formatted" " VERSION file.") def relative_iterdir(self, relpath='.'): raise NotImplementedError def open(self, relpath): raise NotImplementedError def mount(self, filepath): raise NotImplementedError class _ZipArchive(_Archive): """A specific variant of Archive which deals with ZIP64 files.""" @classmethod def is_archive_type(cls, path): return zipfile.is_zipfile(str(path)) @classmethod def save(cls, source, destination): parent_dir = os.path.split(source)[0] with zipfile.ZipFile(str(destination), mode='w', compression=zipfile.ZIP_DEFLATED, allowZip64=True) as zf: for root, dirs, files in os.walk(str(source)): # Prune hidden directories from traversal. 
Strategy modified # from http://stackoverflow.com/a/13454267/3776794 dirs[:] = [d for d in dirs if not d.startswith('.')] for file in files: if file.startswith('.'): continue abspath = pathlib.Path(root) / file relpath = abspath.relative_to(parent_dir) zf.write(str(abspath), arcname=cls._as_zip_path(relpath)) def relative_iterdir(self, relpath=''): relpath = self._as_zip_path(relpath) seen = set() with zipfile.ZipFile(str(self.path), mode='r') as zf: for name in zf.namelist(): if name.startswith(relpath): parts = pathlib.PurePosixPath(name).parts if len(parts) > 0: result = parts[0] if result not in seen: seen.add(result) yield result def open(self, relpath): relpath = pathlib.Path(str(self.uuid)) / relpath with zipfile.ZipFile(str(self.path), mode='r') as zf: # The filehandle will still work even when `zf` is "closed" return io.TextIOWrapper(zf.open(self._as_zip_path(relpath))) def mount(self, filepath): # TODO: use FUSE/MacFUSE/Dokany bindings (many Python bindings are # outdated, we may need to take up maintenance/fork) # We will have already allocated filepath at this point, we check if # the VERSION file exists to determine whether or not we have alredy # written to the allocated directory. This is relevant when you try to # load an artifact that is already in the cache because data/ # will be read only, so attempting to extract there will error. We also # just don't need to put the data there again if it is already there if not os.path.exists(filepath / 'VERSION'): self.extract(filepath) root = filepath return ArchiveRecord(root, root / self.VERSION_FILE, self.uuid, self.version, self.framework_version) def extract(self, filepath): filepath = pathlib.Path(filepath) assert os.path.basename(filepath) == str(self.uuid) with zipfile.ZipFile(str(self.path), mode='r') as zf: for name in zf.namelist(): if name.startswith(str(self.uuid)): # extract removes `..` components, so as long as we extract # into `filepath`, the path won't go backwards. zf.extract(name, path=str(filepath.parent)) return filepath @classmethod def _as_zip_path(self, path): path = str(pathlib.PurePosixPath(path)) # zip files don't work well with '.' 
which is the identity of a Path # obj, so just convert to empty string which is basically the identity # of a zip's entry if path == '.': path = '' return path class _NoOpArchive(_Archive): """For dealing with unzipped artifacts""" @classmethod def is_archive_type(cls, path): return os.path.isdir(str(path)) def _get_uuid(self): """If we are using a _NoOpArchive we are a data element in a pool meaning we are unzipped and our name is our uuid """ return os.path.basename(self.path) def relative_iterdir(self, relpath=''): seen = set() for name in os.listdir(str(self.path)): if name.startswith(relpath) and name not in seen: seen.add(name) yield name def open(self, relpath): return open(os.path.join(self.path, relpath)) def mount(self, path): root = path return ArchiveRecord(root, root / self.VERSION_FILE, self.uuid, self.version, self.framework_version) class ArchiveCheck(_Archive): """Used by the Jupyter handlers""" # TODO: make this part of the archiver API at some point def open(self, relpath): abspath = os.path.join(str(self.path), relpath) return open(abspath, 'r') def relative_iterdir(self, relpath='.'): for p in pathlib.Path(self.path).iterdir(): yield str(p.relative_to(self.path)) def _get_uuid(self): return os.path.basename(self.path) class Archiver: CURRENT_FORMAT_VERSION = '6' _FORMAT_REGISTRY = { # NOTE: add more archive formats as things change '0': 'qiime2.core.archive.format.v0:ArchiveFormat', '1': 'qiime2.core.archive.format.v1:ArchiveFormat', '2': 'qiime2.core.archive.format.v2:ArchiveFormat', '3': 'qiime2.core.archive.format.v3:ArchiveFormat', '4': 'qiime2.core.archive.format.v4:ArchiveFormat', '5': 'qiime2.core.archive.format.v5:ArchiveFormat', '6': 'qiime2.core.archive.format.v6:ArchiveFormat' } @classmethod def _make_temp_path(cls, uuid): """Allocates a place in the cache for the file to be temporarily written. Returns this location and the cache in use. """ from qiime2.core.cache import get_cache cache = get_cache() path = cache.process_pool._allocate(uuid) return path, cache @classmethod def _destroy_temp_path(cls, process_alias): from qiime2.core.cache import get_cache cache = get_cache() cache.process_pool.remove(str(process_alias)) @classmethod def get_format_class(cls, version): try: imp, fmt_cls = cls._FORMAT_REGISTRY[version].split(':') except KeyError: return None return getattr(importlib.import_module(imp), fmt_cls) @classmethod def get_archive(cls, filepath): filepath = pathlib.Path(filepath) if not filepath.exists(): raise ValueError("%s does not exist." % filepath) if _ZipArchive.is_archive_type(filepath): archive = _ZipArchive(filepath) elif _NoOpArchive.is_archive_type(filepath): archive = _NoOpArchive(filepath) else: raise ValueError("%s is not a QIIME archive." % filepath) return archive @classmethod def _futuristic_archive_error(cls, filepath, archive): raise ValueError("%s was created by 'QIIME %s'. The currently" " installed framework cannot interpret archive" " version %r." % (filepath, archive.framework_version, archive.version)) @classmethod def peek(cls, filepath): archive = cls.get_archive(filepath) Format = cls.get_format_class(archive.version) if Format is None: cls._futuristic_archive_error(filepath, archive) # NOTE: in the future, we may want to manipulate the results so that # older formats provide the "new" API even if they don't support it. # e.g. a new format has a new property that peek should describe. We # add some compatability code here to return a default for that # property on older formats. 
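        # A minimal usage sketch (the artifact filename is assumed for
        # illustration): because load_metadata only reads metadata.yaml via
        # the archive's open(), nothing is extracted from the zip, e.g.
        #
        #     uuid, type, format = Archiver.peek('some-artifact.qza')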
return Format.load_metadata(archive) @classmethod def extract(cls, filepath, dest): archive = cls.get_archive(filepath) dest = os.path.join(dest, str(archive.uuid)) os.makedirs(dest) # Format really doesn't matter, the archive knows how to extract so # that is sufficient, furthermore it would suck if something was wrong # with an archive's format and extract failed to actually extract. return str(archive.extract(dest)) @classmethod def load(cls, filepath): archive = cls.get_archive(filepath) path, cache = cls._make_temp_path(archive.uuid) try: Format = cls.get_format_class(archive.version) if Format is None: cls._futuristic_archive_error(filepath, archive) archive.mount(path) process_alias, data_path = \ cache._rename_to_data(archive.uuid, path) rec = ArchiveRecord( data_path, data_path / archive.VERSION_FILE, archive.uuid, archive.version, archive.framework_version) ref = cls(data_path, process_alias, Format(rec), cache) return ref # We really just want to kill these paths if anything at all goes wrong # Exceptions including keyboard interrupts are re-raised except: # noqa: E722 cls._destroy_temp_path(archive.uuid) if 'process_alias' in vars(): cls._destroy_temp_path(process_alias) raise @classmethod def load_raw(cls, filepath, cache): archive = cls.get_archive(filepath) process_alias = cache._alias(str(archive.uuid)) Format = cls.get_format_class(archive.version) if Format is None: cls._futuristic_archive_error(filepath, archive) path = pathlib.Path(filepath) rec = archive.mount(path) ref = cls(path, process_alias, Format(rec), cache) return ref @classmethod def from_data(cls, type, format, data_initializer, provenance_capture): uuid = _uuid.uuid4() path, cache = cls._make_temp_path(uuid) try: rec = _Archive.setup(uuid, path, cls.CURRENT_FORMAT_VERSION, qiime2.__version__) Format = cls.get_format_class(cls.CURRENT_FORMAT_VERSION) Format.write(rec, type, format, data_initializer, provenance_capture) process_alias, data_path = cache._rename_to_data(uuid, path) rec = ArchiveRecord(data_path, data_path / _Archive.VERSION_FILE, uuid, cls.CURRENT_FORMAT_VERSION, qiime2.__version__) ref = cls(data_path, process_alias, Format(rec), cache) return ref # We really just want to kill these paths if anything at all goes wrong # Exceptions including keyboard interrupts are re-raised except: # noqa: E722 cls._destroy_temp_path(uuid) if 'process_alias' in vars(): cls._destroy_temp_path(process_alias) raise def __init__(self, path, process_alias, fmt, cache): self.path = path self.process_alias = process_alias self._fmt = fmt self._destructor = weakref.finalize(self, cache._deallocate, str(self.process_alias)) @property def uuid(self): return self._fmt.uuid @property def type(self): return self._fmt.type @property def format(self): return self._fmt.format @property def data_dir(self): return self._fmt.data_dir @property def root_dir(self): return self._fmt.path @property def provenance_dir(self): return getattr(self._fmt, 'provenance_dir', None) @property def citations(self): return getattr(self._fmt, 'citations', cite.Citations()) def save(self, filepath): _ZipArchive.save(self.path, filepath) def validate_checksums(self): if not isinstance(self._fmt, self.get_format_class('5')): return ChecksumDiff({}, {}, {}) obs = dict(x for x in md5sum_directory(str(self.root_dir)).items() if x[0] != self._fmt.CHECKSUM_FILE) with open(self.root_dir / self._fmt.CHECKSUM_FILE) as fh: exp = dict(from_checksum_format(line) for line in fh.readlines()) obs_keys = set(obs) exp_keys = set(exp) added = {x: obs[x] for x in 
obs_keys - exp_keys} removed = {x: exp[x] for x in exp_keys - obs_keys} changed = {x: (exp[x], obs[x]) for x in exp_keys & obs_keys if exp[x] != obs[x]} return ChecksumDiff(added=added, removed=removed, changed=changed) qiime2-2024.5.0/qiime2/core/archive/format/000077500000000000000000000000001462552636000201725ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/format/__init__.py000066400000000000000000000007371462552636000223120ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- # NOTE: Don't import anything here. Importing one format shouldn't import all # of them, that is a waste of the computer's time. qiime2-2024.5.0/qiime2/core/archive/format/tests/000077500000000000000000000000001462552636000213345ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/format/tests/__init__.py000066400000000000000000000005351462552636000234500ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- qiime2-2024.5.0/qiime2/core/archive/format/tests/test_util.py000066400000000000000000000056061462552636000237310ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import unittest import tempfile import os import zipfile from qiime2.core.testing.type import FourInts from qiime2.core.testing.util import ArchiveTestingMixin import qiime2.core.archive as archive from qiime2.core.archive.format.util import artifact_version from qiime2.sdk import Artifact class TestArtifactVersion(unittest.TestCase, ArchiveTestingMixin): def setUp(self): prefix = "qiime2-test-temp-" self.temp_dir = tempfile.TemporaryDirectory(prefix=prefix) self.provenance_capture = archive.ImportProvenanceCapture() def tearDown(self): self.temp_dir.cleanup() def test_nonexistent_archive_format(self): with self.assertRaisesRegex(ValueError, 'Version foo not supported'): with artifact_version('foo'): pass def test_write_v0_archive(self): fp = os.path.join(self.temp_dir.name, 'artifact_v0.qza') with artifact_version(0): artifact = Artifact._from_view(FourInts, [-1, 42, 0, 43], list, self.provenance_capture) artifact.save(fp) root_dir = str(artifact.uuid) # There should be no provenance expected = { 'VERSION', 'metadata.yaml', 'data/file1.txt', 'data/file2.txt', 'data/nested/file3.txt', 'data/nested/file4.txt', } self.assertArchiveMembers(fp, root_dir, expected) with zipfile.ZipFile(fp, mode='r') as zf: version = zf.read(os.path.join(root_dir, 'VERSION')) self.assertRegex(str(version), '^.*archive: 0.*$') def test_write_v4_archive(self): fp = os.path.join(self.temp_dir.name, 'artifact_v4.qza') with artifact_version(4): artifact = Artifact._from_view(FourInts, [-1, 42, 0, 43], list, self.provenance_capture) artifact.save(fp) root_dir = str(artifact.uuid) expected = { 'VERSION', 'metadata.yaml', 'data/file1.txt', 'data/file2.txt', 'data/nested/file3.txt', 'data/nested/file4.txt', 'provenance/metadata.yaml', 'provenance/VERSION', 'provenance/citations.bib', 'provenance/action/action.yaml', } self.assertArchiveMembers(fp, root_dir, expected) with zipfile.ZipFile(fp, mode='r') as zf: version = zf.read(os.path.join(root_dir, 'VERSION')) self.assertRegex(str(version), '^.*archive: 4.*$') qiime2-2024.5.0/qiime2/core/archive/format/tests/test_v0.py000066400000000000000000000045131462552636000232750ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import unittest import tempfile import uuid as _uuid import pathlib import io from qiime2.core.testing.type import IntSequence1 from qiime2.core.testing.format import IntSequenceDirectoryFormat from qiime2.core.archive.archiver import _ZipArchive, ArchiveRecord from qiime2.core.archive.format.v0 import ArchiveFormat class TestArchiveFormat(unittest.TestCase): def setUp(self): prefix = "qiime2-test-temp-" self.temp_dir = tempfile.TemporaryDirectory(prefix=prefix) def tearDown(self): self.temp_dir.cleanup() def test_format_metadata(self): uuid = _uuid.uuid4() with io.StringIO() as fh: ArchiveFormat._format_metadata(fh, uuid, IntSequence1, IntSequenceDirectoryFormat) result = fh.getvalue() self.assertEqual(result, "uuid: %s\ntype: IntSequence1\nformat: " "IntSequenceDirectoryFormat\n" % uuid) def test_format_metadata_none(self): uuid = _uuid.uuid4() with io.StringIO() as fh: ArchiveFormat._format_metadata(fh, uuid, IntSequence1, None) result = fh.getvalue() self.assertEqual(result, "uuid: %s\ntype: IntSequence1\nformat: null\n" % uuid) def test_load_root_dir_metadata_uuid_mismatch(self): fp = pathlib.Path(self.temp_dir.name) / 'root-dir-metadata-mismatch' fp.mkdir() r = _ZipArchive.setup(_uuid.uuid4(), fp, 'foo', 'bar') fake = ArchiveRecord(r.root, r.version_fp, _uuid.uuid4(), # This will trick the format r.version, r.framework_version) ArchiveFormat.write(fake, IntSequence1, IntSequenceDirectoryFormat, lambda x: None, None) with self.assertRaisesRegex( ValueError, 'root directory must match UUID.*metadata'): ArchiveFormat(r) if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/core/archive/format/util.py000066400000000000000000000014621462552636000215240ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- from contextlib import contextmanager from qiime2.core.archive import Archiver @contextmanager def artifact_version(version): version = str(version) if version not in Archiver._FORMAT_REGISTRY: raise ValueError("Version %s not supported" % version) original_version = Archiver.CURRENT_FORMAT_VERSION try: Archiver.CURRENT_FORMAT_VERSION = version yield finally: Archiver.CURRENT_FORMAT_VERSION = original_version qiime2-2024.5.0/qiime2/core/archive/format/v0.py000066400000000000000000000046771462552636000211070ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import collections import uuid as _uuid import yaml import qiime2.sdk as sdk # Allow OrderedDict to be serialized for YAML representation yaml.add_representer(collections.OrderedDict, lambda dumper, data: dumper.represent_dict(data.items())) class ArchiveFormat: DATA_DIR = 'data' METADATA_FILE = 'metadata.yaml' @classmethod def _parse_metadata(self, fh, expected_uuid): metadata = yaml.safe_load(fh) if metadata['uuid'] != str(expected_uuid): raise ValueError( "Archive root directory must match UUID present in archive's" " metadata: %s != %s" % (expected_uuid, metadata['uuid'])) return metadata['uuid'], metadata['type'], metadata['format'] @classmethod def _format_metadata(self, fh, uuid, type, format): metadata = collections.OrderedDict() metadata['uuid'] = str(uuid) metadata['type'] = repr(type) metadata['format'] = None if format is not None: metadata['format'] = format.__name__ fh.write(yaml.dump(metadata, default_flow_style=False)) @classmethod def load_metadata(self, archive): with archive.open(self.METADATA_FILE) as fh: return self._parse_metadata(fh, expected_uuid=archive.uuid) @classmethod def write(cls, archive_record, type, format, data_initializer, _): root = archive_record.root metadata_fp = root / cls.METADATA_FILE with metadata_fp.open(mode='w') as fh: cls._format_metadata(fh, archive_record.uuid, type, format) data_dir = root / cls.DATA_DIR data_dir.mkdir() data_initializer(data_dir) def __init__(self, archive_record): path = archive_record.root with (path / self.METADATA_FILE).open() as fh: uuid, type, format = \ self._parse_metadata(fh, expected_uuid=archive_record.uuid) self.uuid = _uuid.UUID(uuid) self.type = sdk.parse_type(type) self.format = sdk.parse_format(format) self.path = path self.data_dir = path / self.DATA_DIR qiime2-2024.5.0/qiime2/core/archive/format/v1.py000066400000000000000000000020641462552636000210740ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import qiime2.core.archive.format.v0 as v0 class ArchiveFormat(v0.ArchiveFormat): PROVENANCE_DIR = 'provenance' @classmethod def write(cls, archive_record, type, format, data_initializer, provenance_capture): super().write(archive_record, type, format, data_initializer, provenance_capture) root = archive_record.root prov_dir = root / cls.PROVENANCE_DIR prov_dir.mkdir() provenance_capture.finalize( prov_dir, [root / cls.METADATA_FILE, archive_record.version_fp]) def __init__(self, archive_record): super().__init__(archive_record) self.provenance_dir = archive_record.root / self.PROVENANCE_DIR qiime2-2024.5.0/qiime2/core/archive/format/v2.py000066400000000000000000000012761462552636000211010ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import qiime2.core.archive.format.v1 as v1 class ArchiveFormat(v1.ArchiveFormat): # Exactly the same as v1, but in provenance, when the action type isn't # import, there is an `output-name` key in the action section with that # node's output name according to the action's signature object. Also has # pipeline action types. pass qiime2-2024.5.0/qiime2/core/archive/format/v3.py000066400000000000000000000011641462552636000210760ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import qiime2.core.archive.format.v2 as v2 class ArchiveFormat(v2.ArchiveFormat): # Exactly the same as v2, but inputs may be variadic where the UUIDs are in # a YAML sequence. Additionally `Set` is now represented as a sequence # with a custom !set tag. pass qiime2-2024.5.0/qiime2/core/archive/format/v4.py000066400000000000000000000027331462552636000211020ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import qiime2.core.archive.format.v3 as v3 from qiime2.core.cite import Citations class ArchiveFormat(v3.ArchiveFormat): # - Adds a transformers section to action.yaml # - Adds citations via the !cite yaml type which references the # /provenance/citations.bib file (this is nested like everything else # in the /provenance/artifacts/ # directories). # - environment:framework has been updated to be a nested object, # its schema is identical to a environment:plugins: object. # Prior to v4, it was only a version string. @property def citations(self): files = [] files.append(str(self.provenance_dir / 'citations.bib')) if (self.provenance_dir / 'artifacts').exists(): for ancestor in (self.provenance_dir / 'artifacts').iterdir(): if (ancestor / 'citations.bib').exists(): files.append(str(ancestor / 'citations.bib')) citations = Citations() for f in files: citations.update(Citations.load(f)) return citations qiime2-2024.5.0/qiime2/core/archive/format/v5.py000066400000000000000000000021001462552636000210670ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import qiime2.core.archive.format.v4 as v4 from qiime2.core.util import md5sum_directory, to_checksum_format class ArchiveFormat(v4.ArchiveFormat): CHECKSUM_FILE = 'checksums.md5' # Adds `checksums.md5` to root of directory structure @classmethod def write(cls, archive_record, type, format, data_initializer, provenance_capture): super().write(archive_record, type, format, data_initializer, provenance_capture) checksums = md5sum_directory(str(archive_record.root)) with (archive_record.root / cls.CHECKSUM_FILE).open('w') as fh: for item in checksums.items(): fh.write(to_checksum_format(*item)) fh.write('\n') qiime2-2024.5.0/qiime2/core/archive/format/v6.py000066400000000000000000000023511462552636000211000ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import qiime2.core.archive.format.v5 as v5 class ArchiveFormat(v5.ArchiveFormat): # - Adds execution_context to the execution section of action.yaml. # This looks like: # # execution_context: # type: parsl/synchronous/asynchronous # parsl_type (if type is parsl): Type of executor # # NOTE: Import actions will not have an execution_context section # # - Adds support for output collections. # This looks like: # # output-name: # - The name of the entire output collection (the qiime output name) # - The key of this element in the collection # - The index of this element in the collection ex. '5/10' for the 5th # out of 10 elements in the collection # # Input collections now look like: # # input-name: # - key: value # - key: value # etc. for n elements pass qiime2-2024.5.0/qiime2/core/archive/provenance.py000066400000000000000000000572311462552636000214240ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import os import time import collections import collections.abc import pkg_resources import uuid import copy import shutil import sys import warnings from datetime import datetime, timezone from typing import Any, List, NamedTuple, Set, Union from pathlib import Path import distutils import yaml import tzlocal import dateutil.relativedelta as relativedelta import qiime2 import qiime2.core.util as util from qiime2.core.cite import Citations def _ts_to_date(ts): time_zone = timezone.utc try: time_zone = tzlocal.get_localzone() except ValueError: pass return datetime.fromtimestamp(ts, tz=time_zone) # Used to give PyYAML something to recognize for custom tags ForwardRef = collections.namedtuple('ForwardRef', ['reference']) NoProvenance = collections.namedtuple('NoProvenance', ['uuid']) MetadataPath = collections.namedtuple('MetadataPath', ['path']) ColorPrimitive = collections.namedtuple('ColorPrimitive', ['hex']) LiteralString = collections.namedtuple('LiteralString', ['string']) CitationKey = collections.namedtuple('CitationKey', ['key']) class OrderedKeyValue(collections.OrderedDict): pass # Used for yaml that looks like: # - key1: value1 # - key2: value2 yaml.add_representer(OrderedKeyValue, lambda dumper, data: dumper.represent_list([ {k: v} for k, v in data.items()])) # Controlling the order of dictionaries (even if semantically irrelevant) is # important to making it look nice. yaml.add_representer(collections.OrderedDict, lambda dumper, data: dumper.represent_dict(data.items())) # YAML libraries aren't good at writing a clean version of this, and typically # the fact that it is a set is irrelevant to tools that use provenance # so add a custom tag and treat it like a sequence. Then code doesn't need to # special case set vs list in their business logic when it isn't important. yaml.add_representer(set, lambda dumper, data: dumper.represent_sequence('!set', data)) # LiteralString uses the | character and has literal newlines yaml.add_representer(LiteralString, lambda dumper, data: dumper.represent_scalar('tag:yaml.org,2002:str', data.string, style='|')) # Make our timestamps pretty (unquoted). yaml.add_representer(datetime, lambda dumper, data: dumper.represent_scalar('tag:yaml.org,2002:timestamp', data.isoformat())) # Forward reference to something else in the document, namespaces are # delimited by colons (:). yaml.add_representer(ForwardRef, lambda dumper, data: dumper.represent_scalar('!ref', data.reference)) # This tag represents an artifact without provenance, this is to support # archive format v0. Ideally this won't be seen in the wild in practice. yaml.add_representer(NoProvenance, lambda dumper, data: dumper.represent_scalar('!no-provenance', str(data.uuid))) # A reference to Metadata and MetadataColumn whose data can be found at the # relative path indicated as its value yaml.add_representer(MetadataPath, lambda dumper, data: dumper.represent_scalar('!metadata', data.path)) # A color primitive. yaml.add_representer(ColorPrimitive, lambda dumper, data: dumper.represent_scalar('!color', data.hex)) yaml.add_representer(CitationKey, lambda dumper, data: dumper.represent_scalar('!cite', data.key)) def citation_key_constructor(loader, node) -> str: """ A constructor for !cite yaml tags, returning a bibtex key as a str. All we need for now is a key string we can match in citations.bib, so _we're not parsing these into component substrings_. 
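
    For illustration (a sketch of the surrounding yaml, not an example parsed
    by this module), such a key typically appears in action.yaml as:

        citations:
        - !cite 'framework|qiime2:2020.6.0.dev0|0'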
    If that need arises in future, these are spec'ed in provenance.py as:
    <domain>|<package>:<version>|[<identifier>|]<index>

    and frequently look like this (note no identifier):
    framework|qiime2:2020.6.0.dev0|0
    """
    value = loader.construct_scalar(node)
    return value


def color_constructor(loader, node) -> str:
    """
    Constructor for !color tags, returning a str.

    Color was a primitive type representing a 3 or 6 digit color hex code,
    matching ^#(?:[0-9a-fA-F]{3}){1,2}$

    Per E. Bolyen, these were unused by any plugins. They were removed in
    e58ed5f8ba453035169d560e0223e6a37774ae08, which was released in 2019.4.
    """
    return loader.construct_scalar(node)


class MetadataInfo(NamedTuple):
    """
    A namedtuple representation of the data in one !metadata yaml tag.

    Attributes
    ----------
    input_artifact_uuids : list of str
        The uuids of any artifacts viewed as metadata.
    relative_fp : str
        The filepath of the metadata file relative to the action.yaml file.
    md5sum_hash : str
        The md5sum hash of the contents of the corresponding metadata file,
        needed by qiime2.core.cache to tell if two metadata inputs are equal.
    """
    input_artifact_uuids: List[str]
    relative_fp: str
    md5sum_hash: str


def metadata_path_constructor(loader, node) -> MetadataInfo:
    """
    A constructor for !metadata yaml tags, which come in the form
    [<artifact_uuid>[,<artifact_uuid>][...]:]<relative_filepath>

    Returns a MetadataInfo object containing a list of UUIDs, and the
    relative filepath where the metadata was written into the zip archive.

    Most commonly, we see:

    !metadata 'sample_metadata.tsv'

    In cases where Artifacts are used as metadata, we see:

    !metadata '415409a4-371d-4c69-9433-e3eaba5301b4:feature_metadata.tsv'

    In cases where multiple Artifacts as metadata were merged, it is possible
    for multiple comma-separated uuids to precede the ':'

    !metadata '<uuid1>,<uuid2>,...,<uuidn>:feature_metadata.tsv'

    The metadata files (including "Artifact metadata") are saved in the same
    dir as `action.yaml`. The UUIDs listed must be incorporated into our
    provenance graph as parents, so are returned in list form.

    NOTES
    -----
    Assumes `loader` has been passed a filehandle to an action.yaml file. If
    instead of a filehandle e.g. file contents are passed, will break because
    `loader.name` will no longer be a useful path that we can use to find the
    corresponding metadata file.
    """
    raw = loader.construct_scalar(node)
    if ':' in raw:
        artifact_uuids, rel_fp = raw.split(':')
        artifact_uuids = artifact_uuids.split(',')
    else:
        artifact_uuids = []
        rel_fp = raw

    action_fp = Path(loader.name)
    metadata_fp = action_fp.parent / rel_fp
    md5sum_hash = util.md5sum(metadata_fp)

    return MetadataInfo(artifact_uuids, rel_fp, md5sum_hash)


def no_provenance_constructor(loader, node) -> str:
    """
    Constructor for !no-provenance tags. These tags are written by QIIME 2
    when an input has no /provenance dir, as in the case of v0 archives that
    have been used in analyses in QIIME2 V1+. They look like this:

    action:
        inputs:
        -   table: !no-provenance '34b07e56-27a5-4f03-ae57-ff427b50aaa1'

    For now at least, this constructor warns but otherwise disregards the
    no-provenance-ness of these. The v0 parser deals with them directly
    anyway.
    """
    uuid = loader.construct_scalar(node)
    warnings.warn(f"Artifact {uuid} was created prior to provenance tracking. "
                  + "Provenance data will be incomplete.", UserWarning)
    return uuid


def ref_constructor(loader, node) -> Union[str, List[str]]:
    """
    A constructor for !ref yaml tags. These tags describe yaml values that
    reference other namespaces within the document, using colons to separate
    namespaces.
For example: !ref 'environment:plugins:sample-classifier' At present, ForwardRef tags are only used in the framework to 'link' the plugin name to the plugin version and other details in the 'execution' namespace of action.yaml This constructor explicitly handles this type of !ref by extracting and returning the plugin name to simplify parsing, while supporting the return of a generic list of 'keys' (e.g. ['environment', 'framework', 'version']) in the event ForwardRef is used more broadly in future. """ value = loader.construct_scalar(node) keys = value.split(':') if keys[0:2] == ['environment', 'plugins']: plugin_name = keys[2] return plugin_name else: return keys def set_constructor(loader, node) -> Set[Any]: """ A constructor for !set yaml tags, returning a python set object """ value = loader.construct_sequence(node) return set(value) # NOTE: New yaml tag constructors must be added to this registry, or tags will # raise ConstructorErrors CONSTRUCTOR_REGISTRY = { '!cite': citation_key_constructor, '!color': color_constructor, '!metadata': metadata_path_constructor, '!no-provenance': no_provenance_constructor, '!ref': ref_constructor, '!set': set_constructor, } for key in CONSTRUCTOR_REGISTRY: yaml.SafeLoader.add_constructor(key, CONSTRUCTOR_REGISTRY[key]) class ProvenanceCapture: ANCESTOR_DIR = 'artifacts' ACTION_DIR = 'action' ACTION_FILE = 'action.yaml' CITATION_FILE = 'citations.bib' def __init__(self): self.start = time.time() self.uuid = uuid.uuid4() self.end = None self.plugins = collections.OrderedDict() # For the purposes of this dict, `return` is a special case for output # we expect to transform this later when serializing, but this lets # us treat all transformations uniformly. self.transformers = collections.OrderedDict() self.citations = Citations() self._framework_citations = [] for idx, citation in enumerate(qiime2.__citations__): citation_key = self.make_citation_key('framework') self.citations[citation_key.key] = citation self._framework_citations.append(citation_key) self._build_paths() @property def _destructor(self): return self.path._destructor def _build_paths(self): self.path = qiime2.core.path.ProvenancePath() self.ancestor_dir = self.path / self.ANCESTOR_DIR self.ancestor_dir.mkdir() self.action_dir = self.path / self.ACTION_DIR self.action_dir.mkdir() def add_ancestor(self, artifact): other_path = artifact._archiver.provenance_dir if other_path is None: # The artifact doesn't have provenance (e.g. version 0) # it would be possible to invent a metadata.yaml, but we won't know # the framework version for the VERSION file. Even if we did # it won't accomplish a lot and there shouldn't be enough # version 0 artifacts in the wild to be important in practice. # NOTE: this implies that it is possible for an action.yaml file to # contain an artifact UUID that is not in the artifacts/ directory. 
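            # Sketch of the resulting serialization (the input name and uuid
            # here are illustrative): the NoProvenance representer registered
            # above writes such inputs as, e.g.
            #
            #   inputs:
            #   - table: !no-provenance '34b07e56-27a5-4f03-ae57-ff427b50aaa1'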
return NoProvenance(artifact.uuid) destination = self.ancestor_dir / str(artifact.uuid) # If it exists, then the artifact is already in the provenance # (and so are its ancestors) if not destination.exists(): # Handle root node of ancestor shutil.copytree( str(other_path), str(destination), ignore=shutil.ignore_patterns(self.ANCESTOR_DIR + '*')) # Handle ancestral nodes of ancestor grandcestor_path = other_path / self.ANCESTOR_DIR if grandcestor_path.exists(): for grandcestor in grandcestor_path.iterdir(): destination = self.ancestor_dir / grandcestor.name if not destination.exists(): shutil.copytree(str(grandcestor), str(destination)) return str(artifact.uuid) def make_citation_key(self, domain, package=None, identifier=None, index=0): if domain == 'framework': package, version = 'qiime2', qiime2.__version__ else: package, version = package.name, package.version id_block = [] if identifier is None else [identifier] return CitationKey('|'.join( [domain, package + ':' + version] + id_block + [str(index)])) def make_software_entry(self, version, website, citations=()): entry = collections.OrderedDict() entry['version'] = version entry['website'] = website if citations: entry['citations'] = citations return entry def reference_plugin(self, plugin): plugin_citations = [] for idx, citation in enumerate(plugin.citations): citation_key = self.make_citation_key('plugin', plugin, index=idx) self.citations[citation_key.key] = citation plugin_citations.append(citation_key) self.plugins[plugin.name] = self.make_software_entry( plugin.version, plugin.website, plugin_citations) return ForwardRef('environment:plugins:' + plugin.name) def capture_env(self): return collections.OrderedDict( (d.project_name, d.version) for d in pkg_resources.working_set) def transformation_recorder(self, name): section = self.transformers[name] = [] def recorder(transformer_record, input_name, input_record, output_name, output_record): entry = collections.OrderedDict() entry['from'] = input_name entry['to'] = output_name citation_keys = [] if transformer_record is not None: plugin = transformer_record.plugin entry['plugin'] = self.reference_plugin(plugin) for idx, citation in enumerate(transformer_record.citations): citation_key = self.make_citation_key( 'transformer', plugin, '%s->%s' % (input_name, output_name), idx) self.citations[citation_key.key] = citation citation_keys.append(citation_key) records = [] if input_record is not None: records.append(input_record) if output_record is not None: records.append(output_record) for record in records: self.reference_plugin(record.plugin) for idx, citation in enumerate(record.citations): citation_key = self.make_citation_key( 'view', record.plugin, record.name, idx) self.citations[citation_key.key] = citation citation_keys.append(citation_key) if citation_keys: entry['citations'] = citation_keys # Don't create duplicate transformer records. These were happening # with collections of inputs. If we have a method that takes a # List[IntSequence1] with view type of list, we need to transform # every IntSequence1 into a list to match the view type. This would # add a transformation record for every IntSequence1 in the list. # This ensures we only end up with one record of a given type for a # given input while still allowing multiple unique records. # NOTE: This does redundant work creating the record, do we care? 
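            # A sketch of the de-duplication (entry values are hypothetical):
            # each element of a List[IntSequence1] viewed as list would yield
            # an identical entry such as
            #     {'from': 'IntSequenceDirectoryFormat', 'to': 'list'}
            # so the membership test below keeps only the first copy.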
if entry not in section: section.append(entry) return recorder def make_execution_section(self): execution = collections.OrderedDict() execution['uuid'] = str(self.uuid) execution['runtime'] = runtime = collections.OrderedDict() runtime['start'] = start = _ts_to_date(self.start) runtime['end'] = end = _ts_to_date(self.end) runtime['duration'] = \ util.duration_time(relativedelta.relativedelta(end, start)) if not isinstance(self, ImportProvenanceCapture): execution['execution_context'] = collections.OrderedDict( {k: v for k, v in self.execution_context.items()}) return execution def make_transformers_section(self): transformers = collections.OrderedDict() data = self.transformers.copy() output = data.pop('return', None) if data: transformers['inputs'] = data if output is not None: transformers['output'] = output return transformers def make_env_section(self): env = collections.OrderedDict() env['platform'] = pkg_resources.get_build_platform() # There is a trailing whitespace in sys.version, strip so that YAML can # use literal formatting. env['python'] = LiteralString('\n'.join(line.strip() for line in sys.version.split('\n'))) env['framework'] = self.make_software_entry( qiime2.__version__, qiime2.__website__, self._framework_citations) env['plugins'] = self.plugins env['python-packages'] = self.capture_env() return env def write_action_yaml(self): settings = dict(default_flow_style=False, indent=4) with (self.action_dir / self.ACTION_FILE).open(mode='w') as fh: fh.write(yaml.dump({'execution': self.make_execution_section()}, **settings)) fh.write('\n') fh.write(yaml.dump({'action': self.make_action_section()}, **settings)) if self.transformers: # pipelines don't have these fh.write('\n') fh.write(yaml.dump( {'transformers': self.make_transformers_section()}, **settings)) fh.write('\n') fh.write(yaml.dump({'environment': self.make_env_section()}, **settings)) def write_citations_bib(self): self.citations.save(str(self.path / self.CITATION_FILE)) def finalize(self, final_path, node_members): self.end = time.time() for member in node_members: shutil.copy(str(member), str(self.path)) self.write_action_yaml() self.write_citations_bib() # Certain networked filesystems will experience a race # condition on `rename`, so fall back to copying. 
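        # (errno 18 is EXDEV, "invalid cross-device link", which os.rename
        # raises when source and destination live on different filesystems.)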
try: os.rename(self.path, final_path) except (FileExistsError, OSError) as err: if isinstance(err, FileExistsError) or isinstance(err, OSError) \ and err.errno == 18: distutils.dir_util.copy_tree(str(self.path), str(final_path)) distutils.dir_util.remove_tree(str(self.path)) else: raise err def fork(self): forked = copy.copy(self) # Unique state for each output of an action forked.plugins = forked.plugins.copy() forked.transformers = forked.transformers.copy() forked.citations = forked.citations.copy() # create a copy of the backing dir so factory (the hard stuff is # mostly done by this point) forked._build_paths() distutils.dir_util.copy_tree(str(self.path), str(forked.path)) return forked class ImportProvenanceCapture(ProvenanceCapture): def __init__(self, format=None, checksums=None): super().__init__() self.format_name = format.__name__ if format is not None else None self.checksums = checksums def make_action_section(self): action = collections.OrderedDict() action['type'] = 'import' if self.format_name is not None: action['format'] = self.format_name if self.checksums is not None: action['manifest'] = [ collections.OrderedDict([('name', name), ('md5sum', md5sum)]) for name, md5sum in self.checksums.items()] return action class ActionProvenanceCapture(ProvenanceCapture): def __init__(self, action_type, plugin_id, action_id, execution_context): from qiime2.sdk import PluginManager super().__init__() self._plugin = PluginManager().get_plugin(id=plugin_id) self.action = self._plugin.actions[action_id] self.action_type = action_type self.inputs = OrderedKeyValue() self.parameters = OrderedKeyValue() self.output_name = '' self.execution_context = execution_context self._action_citations = [] for idx, citation in enumerate(self.action.citations): citation_key = self.make_citation_key( 'action', self._plugin, ':'.join([self.action_type, self.action.id]), idx) self.citations[citation_key.key] = citation self._action_citations.append(citation_key) def handle_metadata(self, name, value): if value is None: return None uuid_ref = "" if value.artifacts: uuids = [] for artifact in value.artifacts: uuids.append(str(artifact.uuid)) self.add_ancestor(artifact) uuid_ref = ",".join(uuids) + ":" relpath = name + '.tsv' value.save(str(self.action_dir / relpath)) return MetadataPath(uuid_ref + relpath) def add_parameter(self, name, type_expr, parameter): type_map = { 'Color': ColorPrimitive, 'Metadata': lambda x: self.handle_metadata(name, x), 'MetadataColumn': lambda x: self.handle_metadata(name, x) # TODO: handle collection primitives (not currently used) } # Make sure if we get a Collection of params the items are put into # provenance in the right order if isinstance(parameter, dict): parameter = [{k: v} for k, v in parameter.items()] handler = type_map.get(type_expr.to_ast().get('name'), lambda x: x) self.parameters[name] = handler(parameter) def add_input(self, name, input): if input is None: self.inputs[name] = None elif isinstance(input, qiime2.sdk.result.ResultCollection): # If we took a Collection input, we will have a ResultCollection, # and we want the keys to line up with the processed values we were # given, so we can maintain the order of the artifacts self.inputs[name] = \ [{k: self.add_ancestor(v)} for k, v in input.items()] elif isinstance(input, collections.abc.Iterable): self.inputs[name] = type(input)( [self.add_ancestor(artifact) for artifact in input]) else: self.inputs[name] = self.add_ancestor(input) def make_action_section(self): action = collections.OrderedDict() action['type'] 
= self.action_type action['plugin'] = self.reference_plugin(self._plugin) action['action'] = self.action.id action['inputs'] = self.inputs action['parameters'] = self.parameters action['output-name'] = self.output_name if self._action_citations: action['citations'] = self._action_citations return action def fork(self, name): forked = super().fork() forked.output_name = name return forked class PipelineProvenanceCapture(ActionProvenanceCapture): def make_action_section(self): action = super().make_action_section() action['alias-of'] = str(self.alias.uuid) return action def fork(self, name, alias): forked = super().fork(name) forked.alias = alias forked.add_ancestor(alias) return forked qiime2-2024.5.0/qiime2/core/archive/provenance_lib/000077500000000000000000000000001462552636000216705ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/__init__.py000066400000000000000000000022101462552636000237740ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- """ Software to support scientific reproducibility, attribution, and collaboration on the QIIME 2 platform. Core objects: - ProvDAG: A directed, acyclic graph (DAG) describing QIIME 2 provenance - ProvNode: Parsed data about a single QIIME 2 Result, generated internally by a provenance parser and available through the ProvDAG """ from .parse import ProvDAG, archive_not_parsed from .replay import ( replay_provenance, replay_citations, replay_supplement, ) from .util import get_root_uuid, get_nonroot_uuid from .usage_drivers import ReplayPythonUsage from .tests.testing_utilities import DummyArtifacts __all__ = [ 'ProvDAG', 'archive_not_parsed', 'get_root_uuid', 'get_nonroot_uuid', 'replay_provenance', 'replay_citations', 'replay_supplement', 'ReplayPythonUsage', 'DummyArtifacts' ] qiime2-2024.5.0/qiime2/core/archive/provenance_lib/_checksum_validator.py000066400000000000000000000121071462552636000262510ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- from enum import IntEnum import pathlib import warnings from typing import Optional, Tuple from zipfile import ZipFile from qiime2.core.util import md5sum_directory_zip, from_checksum_format from qiime2.core.archive.archiver import ChecksumDiff from .util import get_root_uuid, parse_version class ValidationCode(IntEnum): ''' Codes indicating the level of validation a ProvDAG has passed. The code that determines which ValidationCode an archive receives is by necessity scattered. INVALID: One or more files are known to be missing or unparseable. Occurs either when checksum validation fails, or when expected files are absent or unparseable. VALIDATION_OPTOUT: The user opted out of checksum validation. This will be overridden by INVALID iff a required file is missing. In this context, `checksums.md5` is not required. 
    If data files, for example, have been manually modified, the code will
    remain VALIDATION_OPTOUT, but if an action.yaml file is missing, INVALID
    will result.

    PREDATES_CHECKSUMS: The archive format predates the creation of
    checksums.md5, so full validation is impossible. We initially assume
    validity. This will be overridden by INVALID iff an expected file is
    missing or unparseable. If data files, for example, have been manually
    modified, the code will remain PREDATES_CHECKSUMS.

    VALID: The archive has passed checksum validation and is "known" to be
    valid. Md5 checksums are technically falsifiable, so this is not a
    guarantee of correctness/authenticity.
    '''
    INVALID = 0
    VALIDATION_OPTOUT = 1
    PREDATES_CHECKSUMS = 2
    VALID = 3


def validate_checksums(
    zf: ZipFile
) -> Tuple[ValidationCode, Optional[ChecksumDiff]]:
    '''
    Uses diff_checksums to validate the archive's provenance, warning the
    user if checksums.md5 is missing, or if the archive is corrupt or has
    been modified.

    Parameters
    ----------
    zf : ZipFile
        The zipfile object of the archive.

    Returns
    -------
    tuple of (ValidationCode, ChecksumDiff)
        If the checksums.md5 file isn't present, ChecksumDiff is None and the
        ValidationCode is INVALID.
    '''
    checksum_diff: Optional[ChecksumDiff]
    provenance_is_valid = ValidationCode.VALID

    for fp in zf.namelist():
        if 'checksums.md5' in fp:
            break
    else:
        warnings.warn(
            'The checksums.md5 file is missing from the archive. '
            'Archive may be corrupt or provenance may be false.',
            UserWarning
        )
        return ValidationCode.INVALID, None

    checksum_diff = diff_checksums(zf)
    if checksum_diff != ChecksumDiff({}, {}, {}):
        root_uuid = get_root_uuid(zf)
        warnings.warn(
            f'Checksums are invalid for Archive {root_uuid}\n'
            'Archive may be corrupt or provenance may be false.\n'
            f'Files added since archive creation: {checksum_diff.added}\n'
            'Files removed since archive creation: '
            f'{checksum_diff.removed}\n'
            'Files changed since archive creation: '
            f'{checksum_diff.changed}',
            UserWarning
        )
        provenance_is_valid = ValidationCode.INVALID

    return provenance_is_valid, checksum_diff


def diff_checksums(zf: ZipFile) -> ChecksumDiff:
    '''
    Calculates checksums for all files in an archive (except checksums.md5).
    Compares these against the checksums stored in checksums.md5, returning a
    summary ChecksumDiff.

    Parameters
    ----------
    zf : ZipFile
        The zipfile object of the archive.

    Returns
    -------
    ChecksumDiff
        A tuple of three dicts, one each for added, removed, and changed
        files. Keys are filepaths. For the added and removed dicts values are
        the checksum of the added or removed file. For the changed dict
        values are a tuple of (expected checksum, observed checksum).
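
    Notes
    -----
    An empty diff, ChecksumDiff({}, {}, {}), means every recorded checksum
    matched; validate_checksums above treats any other result as INVALID.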
''' archive_version, _ = parse_version(zf) # TODO: don't think this is ever called if int(archive_version) < 5: return ChecksumDiff({}, {}, {}) root_dir = pathlib.Path(get_root_uuid(zf)) checksum_fp = str(root_dir / 'checksums.md5') obs = md5sum_directory_zip(zf) exp = {} for line in zf.open(checksum_fp): fp, checksum = from_checksum_format(str(line, 'utf-8')) exp[fp] = checksum obs_fps = set(obs) exp_fps = set(exp) added = {fp: obs[fp] for fp in obs_fps - exp_fps} removed = {fp: exp[fp] for fp in exp_fps - obs_fps} changed = { fp: (exp[fp], obs[fp]) for fp in exp_fps & obs_fps if exp[fp] != obs[fp] } return ChecksumDiff(added=added, removed=removed, changed=changed) qiime2-2024.5.0/qiime2/core/archive/provenance_lib/archive_parser.py000066400000000000000000001101031462552636000252330ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import abc import os import pandas as pd import pathlib import tempfile import yaml import warnings from zipfile import ZipFile from dataclasses import dataclass from datetime import timedelta from io import BytesIO from typing import Any, Dict, List, Optional, Set, Tuple, Union import bibtexparser as bp import networkx as nx from ._checksum_validator import ( ValidationCode, ChecksumDiff, validate_checksums ) from .util import get_root_uuid, get_nonroot_uuid, parse_version from ..provenance import MetadataInfo @dataclass class Config(): ''' Dataclass that stores user-selected configuration options. Attributes ---------- perform_checksum_validation : bool Whether to opt in or out of checksum validation. parse_study_metadata : bool Whether to parse study metadata stored in provenance. recurse : bool Whether to recursively parse nested directories that contain artifacts. verbose : bool Whether to print status messages to stdout during processing. ''' perform_checksum_validation: bool = True parse_study_metadata: bool = True recurse: bool = False verbose: bool = False @dataclass class ParserResults(): ''' Results generated and returned by a ParserVx. Attributes ---------- parsed_artifact_uuids : set of str The uuids of the artifacts directly parsed by a parser. Does not include the uuids of artifact parsed from provenance. When parsing a single archive this is a single member set of that uuid. When parsing a directory, it is the set of all artifact uuids in that directory. prov_digraph : nx.Digraph The directed acyclic graph representation of the parsed provenance as an nx.DiGraph object. provenance_is_valid : ValidationCode A flag indicating the level of checksum validation. checksum_diff : ChecksumDiff or None A tuple of three dictionaries indicating the uuids of files that have been 1) added 2) removed or 3) changed in the archive since the archive was checksummed. None if no checksum validation was perfomed, e.g. when opted out or impossible because archive version did not support checksums, or when checksums.md5 missing from archive where it was expected. Interpretable only in conjunction with provenance_is_valid. ''' parsed_artifact_uuids: Set[str] prov_digraph: nx.DiGraph provenance_is_valid: ValidationCode checksum_diff: Optional[ChecksumDiff] class ProvNode: ''' One node of a provenance DAG, describing one QIIME2 Result. 
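
    A minimal usage sketch (attribute values here are hypothetical):

        node.type             # e.g. 'FeatureTable[Frequency]'
        node.has_provenance   # False only for archive versions 0 and 1
        node.metadata         # {parameter_name: pd.DataFrame} or None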
''' @property def _uuid(self) -> str: return self._result_md.uuid @_uuid.setter def _uuid(self, new_uuid: str): ''' ProvNode's UUID. Safe for use as getter. Prefer ProvDAG.relabel_nodes as a setter because it preserves alignment between ids across the dag and its ProvNodes. ''' self._result_md.uuid = new_uuid @property def type(self) -> str: return self._result_md.type @property def format(self) -> Optional[str]: return self._result_md.format @property def archive_version(self) -> str: return self._archive_version @property def framework_version(self) -> str: return self._framework_version @property def has_provenance(self) -> bool: return int(self.archive_version) > 1 @property def citations(self) -> Dict: citations = {} if hasattr(self, '_citations'): citations = self._citations.citations return citations @property def metadata(self) -> Optional[Dict[str, pd.DataFrame]]: ''' A dict containing {parameter_name: metadata_dataframe} pairs where parameter_name is the registered name of the parameter the Metadata or MetadataColumn was passed to. Returns an empty dict if this action takes no Metadata or MetadataColumn. Returns None if this action has no metadata because the archive has no provenance, or the user opted out of metadata parsing. ''' self._metadata: Optional[Dict[str, pd.DataFrame]] md = None if hasattr(self, '_metadata'): md = self._metadata return md @property def _parents(self) -> Optional[List[Dict[str, str]]]: ''' A list of single-item {Type: UUID} dicts describing this action's inputs, including Artifacts passed as Metadata parameters. Returns [] if this action is an Import. NOTE: This property is private because it is slightly unsafe, reporting original node IDs that are not updated if the user renames nodes using the networkx API instead of ProvDAG.relabel_nodes. ProvDAG and its extensions should use the networkx.DiGraph itself to work with ancestry when possible. ''' if not self.has_provenance: return None inputs = self.action._action_details.get('inputs') parents = [] if inputs is not None: # Inputs are a list of single-item dicts for input in inputs: (name, value), = input.items() # value is usually a uuid, but may be a collection of uuids # the following are specced in qiime2/core/type/collection if type(value) in (set, list, tuple): for i in range(len(value)): # Make these unique in case the single-item dicts get # merged into a single dict downstream if type(value[i]) is dict: unq_name, = value[i].keys() v, = value[i].values() else: unq_name = f'{name}_{i}' v = value[i] parents.append({unq_name: v}) elif value is not None: parents.append({name: value}) else: # skip None-by-default optional inputs pass return parents + self._artifacts_passed_as_md def __init__( self, cfg: Config, zf: ZipFile, node_fps: List[pathlib.Path] ): ''' Constructs a ProvNode from a zipfile and the collected provenance-relevant filepaths for a single result within it. 
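
        The filenames recognized below are VERSION, metadata.yaml,
        action.yaml, citations.bib and checksums.md5; any other entries in
        node_fps are ignored.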
''' for fp in node_fps: if fp.name == 'VERSION': self._archive_version, self._framework_version = \ parse_version(zf) elif fp.name == 'metadata.yaml': self._result_md = _ResultMetadata(zf, str(fp)) elif fp.name == 'action.yaml': self.action = _Action(zf, str(fp)) elif fp.name == 'citations.bib': self._citations = _Citations(zf, str(fp)) elif fp.name == 'checksums.md5': # Handled in ProvDAG pass if self.has_provenance: all_metadata_fps, self._artifacts_passed_as_md = \ self._get_metadata_from_Action(self.action._action_details) if cfg.parse_study_metadata: self._metadata = self._parse_metadata(zf, all_metadata_fps) def _get_metadata_from_Action( self, action_details: Dict[str, List] ) -> Tuple[Dict[str, str], List[Dict[str, str]]]: ''' Gathers data related to Metadata and MetadataColumn-based metadata files from the parsed action.yaml file. Captures filepath and parameter-name data for all study metadata files, so that these can be located for parsing, and then associated with the correct parameters during replay. It captures uuids for all artifacts passed to this action as metadata so they can be included as parents of this node. Parameters ---------- action_details : dict The parsed dictionary of the `action` section from action.yaml. Returns ------- tuple of (all_metadata, artifacts_as_metadata) Where all_metadata is a dict of {parameter_name: filename}. Where artifacts_as_metadata is a list of single-items dict of the structure {'artifact_passed_as_metadata': }. Notes ----- When Artifacts are passed as Metadata, they are captured in action['parameters'], rather than in action['inputs'] with the other Artifacts. Semantic Type data is thus not captured. This function returns a filler 'Type' for all UUIDs discovered here: 'artifact_passed_as_metadata'. Because Artifacts passed (viewed) as Metadata retain their provenance, downstream Artifacts are linked to their real parent Artifact nodes with the proper Type information. ''' all_metadata = dict() artifacts_as_metadata = [] if (all_params := action_details.get('parameters')) is not None: for param in all_params: param_val, = param.values() if isinstance(param_val, MetadataInfo): param_name, = param.keys() md_fp = param_val.relative_fp all_metadata.update({param_name: md_fp}) artifacts_as_metadata += [ {'artifact_passed_as_metadata': uuid} for uuid in param_val.input_artifact_uuids ] return all_metadata, artifacts_as_metadata def _parse_metadata( self, zf: ZipFile, metadata_fps: Dict[str, str] ) -> Dict[str, pd.DataFrame]: ''' Parses all metadata files captured from Metadata and MetadataColumns (identifiable by !metadata tags) into pd.DataFrames. Parameters ---------- zf : ZipFile The zipfile object of the archive. metadata_fps : dict A dict of parameter names to metadata filenames for metadata paramters. Returns ------- dict A dict of parameter names to dataframe objects that is loaded from the corresponding metadata file. An empty dict if there is no metadata. 
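
        For example (the parameter name and contents are hypothetical), an
        action run with one Metadata input might yield:

            {'sample_metadata': <DataFrame parsed from sample_metadata.tsv>}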
''' if metadata_fps == {}: return {} root_uuid = get_root_uuid(zf) pfx = pathlib.Path(root_uuid) / 'provenance' if root_uuid == self._uuid: pfx = pfx / 'action' else: pfx = pfx / 'artifacts' / self._uuid / 'action' all_md = dict() for param_name in metadata_fps: filepath = str(pfx / metadata_fps[param_name]) with zf.open(filepath) as fh: df = pd.read_csv(BytesIO(fh.read()), sep='\t') all_md[param_name] = df return all_md def __repr__(self) -> str: return repr(self._result_md) __str__ = __repr__ def __hash__(self) -> int: return hash(self._uuid) def __eq__(self, other) -> bool: return ( self.__class__ == other.__class__ and self._uuid == other._uuid ) class _Action: '''Provenance data from action.yaml for a single QIIME2 Result.''' @property def action_id(self) -> str: '''The UUID of the Action itself.''' return self._execution_details['uuid'] @property def action_type(self) -> str: ''' The type of Action represented e.g. Method, Pipeline, et al. ''' return self._action_details['type'] @property def runtime(self) -> timedelta: '''The elapsed run time of the Action, as a datetime object.''' end = self._execution_details['runtime']['end'] start = self._execution_details['runtime']['start'] return end - start @property def runtime_str(self) -> str: ''' The elapsed run time of the Action in seconds and microseconds.''' return self._execution_details['runtime']['duration'] @property def action_name(self) -> str: ''' The name of the action itself. Imports return 'import'. ''' if self.action_type == 'import': return 'import' return self._action_details.get('action') @property def plugin(self) -> str: ''' The plugin which executed this Action. Returns 'framework' if this is an import. ''' if self.action_type == 'import': return 'framework' plugin = self._action_details.get('plugin') return plugin.replace('-', '_') @property def inputs(self) -> dict: ''' Creates a dict of artifact inputs to this action. Returns ------- dict A mapping of input name to the data type passed for that input (either uuid, list of uuid, or dict), see below for details. Notes ----- One of three structures may be encountered when parsing this section of action.yaml, described below: case 1: inputs: - some_input_name: some_uuid - some_other_input_name: some_other_uuid (...) case 2: inputs: - some_input_name: - some_uuid - some_other_uuid (...) case 3 (result collection): inputs: - result_collection_name: - some_key: some_uuid - some_other_key: some_other_uuid (...) and thus is a different structure entirely. ''' inputs = self._action_details.get('inputs') results = {} if inputs is not None: for input_ in inputs: nest_lvl_1 = next(iter(input_.values())) if type(nest_lvl_1) is list and type(nest_lvl_1[0]) is dict: # result collection rc = {} for member in nest_lvl_1: rc.update(member) input_name = next(iter(input_)) results.update({input_name: rc}) else: # not result collection results.update(input_) return results @property def input_result_collections(self): ''' Collects all result collections passed as inputs (if any). Used for constructing the result collection namespace. Returns ------- list of str A list of the names of the result collections passed as input. The names are as registered in the method registration. 
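Examples
--------
Illustrative only; assumes `action` is an _Action instance whose action received a single result collection under the registered input name 'tables':

>>> action.input_result_collections    # doctest: +SKIP
['tables']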
''' result_collection_names = [] for key, value in self.inputs.items(): if type(value) is dict: result_collection_names.append(key) return result_collection_names @property def parameters(self) -> dict: '''Returns a dict of parameters passed to this action.''' params = self._action_details.get('parameters') results = {} if params is not None: for item in params: results.update(item.items()) return results @property def output_name(self) -> Optional[str]: ''' Gets the output name of the node. Returns ------- str or None The name of the output as parsed from action.yaml, or None if there is no output-name section. ''' output_name = self._action_details.get('output-name') if type(output_name) is list: output_name = output_name[0] return output_name @property def result_collection_key(self) -> Optional[str]: ''' Gets the result collection key if the artifact is part of a result collection. Returns ------- str The result collection key if the artifact was output as part of a result collection, None otherwise. Notes ----- We know if the artifact comes from a ResultCollection because outputs from a ResultCollection look like: output-name: - output - key - position/total positions ''' output_name = self._action_details.get('output-name') if type(output_name) is not list: return None return output_name[1] @property def format(self) -> Optional[str]: '''Returns this action's format field if any.''' return self._action_details.get('format') @property def transformers(self) -> Optional[Dict]: '''Returns this action's transformers dictionary if any.''' return self._action_dict.get('transformers') def __init__(self, zf: ZipFile, fp: str): with tempfile.TemporaryDirectory() as tempdir: zf.extractall(tempdir) action_fp = os.path.join(tempdir, fp) with open(action_fp) as fh: self._action_dict = yaml.safe_load(fh) self._action_details = self._action_dict['action'] self._execution_details = self._action_dict['execution'] def __repr__(self): return ( f'_Action(action_id={self.action_id}, type={self.action_type},' f' plugin={self.plugin}, action={self.action_name})' ) class _Citations: ''' Citations for a single QIIME2 Result, as a dict of citation dicts keyed on the citation's bibtex ID. ''' def __init__(self, zf: ZipFile, fp: str): bib_db = bp.loads(zf.read(fp)) self.citations = bib_db.get_entry_dict() def __repr__(self): keys = list(self.citations.keys()) return f'Citations({keys})' class _ResultMetadata: '''Basic metadata about a single QIIME2 Result from metadata.yaml.''' def __init__(self, zf: ZipFile, md_fp: str): _md_dict = yaml.safe_load(zf.read(md_fp)) self.uuid = _md_dict['uuid'] self.type = _md_dict['type'] self.format = _md_dict['format'] def __repr__(self): return ( f'UUID:\t\t{self.uuid}\n' f'Type:\t\t{self.type}\n' f'Data Format:\t{self.format}' ) class Parser(metaclass=abc.ABCMeta): accepted_data_types: str @classmethod @abc.abstractmethod def get_parser(cls, artifact_data: Any) -> 'Parser': ''' Return the appropriate Parser if this Parser type can handle the data passed in. Should raise an appropriate exception if this Parser cannot handle the data. ''' @abc.abstractmethod def parse_prov(self, cfg: Config, data: Any) -> ParserResults: ''' Parse provenance to return a ParserResults. ''' class ArchiveParser(Parser): accepted_data_types = 'a path to a file (a string) or a file-like object' @classmethod def get_parser(cls, artifact: Union[str, pathlib.PosixPath]) -> Parser: ''' Returns the correct archive format parser for a zip archive. 
Parameters ---------- artifact_data : str or pathlib.PosixPath A path to a zipped archive. Returns ------- Parser An ArchiveParser object for the version of the artifact. One of ParserV[0-6]. ''' if type(artifact) is pathlib.PosixPath: artifact = str(artifact) if type(artifact) is not str: raise TypeError( 'ArchiveParser expects a string or pathlib.PosixPath path to ' f'an archive, not an object of type {str(type(artifact))}.' ) if os.path.isdir(artifact): raise ValueError('ArchiveParser expects a file, not a directory.') try: with ZipFile(artifact, 'r') as zf: archive_version, _ = parse_version(zf) return FORMAT_REGISTRY[archive_version]() except KeyError as e: raise KeyError( f'While trying to parse artifact {artifact}, ' 'a corresponding parser was not found for archive version ' f'{archive_version}: {str(e)}.' ) def parse_prov(cls, cfg: Config, data: Any) -> ParserResults: raise NotImplementedError( 'Use a subclass that usefully defines parse_prov for some format.' ) class ParserV0(ArchiveParser): ''' Parser for V0 archives. V0 archives have no ancestral provenance. ''' # These are files we expect will be present in every QIIME2 archive with # this format. "Optional" filenames (like Metadata, which may or may # not be present in an archive) should not be included here. expected_files_root_only = tuple() expected_files_all_nodes = ('metadata.yaml', 'VERSION') def parse_prov(self, cfg: Config, archive: str) -> ParserResults: ''' Parses an artifact's provenance into a directed acyclic graph. In the case of v0 archives, the only provenance information is that which is attached to the artifact itself; information about ancestor nodes does not exist. The parsed dag contains only a single node. In the case of v1 archives, ancestor nodes do exist in the archive. However, because the corresponding action.yaml does not track output names, when two outputs share the same semantic type, it is not possible to untangle provenance. Instead of wrangling with this and in consideration of the expected rarity of v1 archives, it was decided to treat v1 archives as v0 archives. Parameters ---------- cfg : Config A dataclass that stores four boolean flags: whether to perform checksum validation, whether to parse study metadata, whether to recursively parse nested directories, and whether to enable verbose mode. archive : str A path to the artifact to be parsed. Returns ------- ParserResults A dataclass that stores the parsed artifact uuids, the parsed networkx graph, the provenance-is-valid flag, and the checksum diff. ''' with ZipFile(archive) as zf: if cfg.perform_checksum_validation: provenance_is_valid, checksum_diff = \ self._validate_checksums(zf) else: provenance_is_valid = ValidationCode.VALIDATION_OPTOUT checksum_diff = None root_uuid = get_root_uuid(zf) warnings.warn( f'Artifact {root_uuid} was created prior to provenance ' 'tracking. 
Provenance data will be incomplete.', UserWarning ) exp_node_fps = [] for fp in self.expected_files_all_nodes: exp_node_fps.append(pathlib.Path(root_uuid) / fp) prov_fps = self._get_provenance_fps(zf) self._assert_expected_files_present( zf, exp_node_fps, prov_fps ) # we have confirmed that all expected fps for this node exist node_fps = exp_node_fps nodes = {} nodes[root_uuid] = ProvNode(cfg, zf, node_fps) graph = self._digraph_from_archive_contents(nodes) return ParserResults( {root_uuid}, graph, provenance_is_valid, checksum_diff ) def _parse_root_md(self, zf: ZipFile, root_uuid: str) -> _ResultMetadata: ''' Parses the root metadata file of an archive for its uuid, semantic type, and format. Parameters ---------- zf : ZipFile A zipfile object of a v0 artifact. root_uuid : str The uuid of the root node. Because this operates on a v0 archive, the root node is the only node. Returns ------- _ResultMetadata An object representing the information stored in a metadata.yaml file, namely the uuid, type, and format fields. ''' root_md_fp = os.path.join(root_uuid, 'metadata.yaml') if root_md_fp not in zf.namelist(): raise ValueError( 'Malformed Archive: root metadata.yaml file ' f'misplaced or nonexistent in {zf.filename}' ) return _ResultMetadata(zf, root_md_fp) def _validate_checksums( self, zf: ZipFile ) -> Tuple[ValidationCode, Optional[ChecksumDiff]]: ''' Return the ValidationCode and ChecksumDiff for an archive. Because checksums were not introduced, until ArchiveFormat version 5, uses the PREDATES_CHECKSUMS flag and returns None to indicate that checksum diffing was not performed. Parameters ---------- zf : ZipFile The zipfile object representing the archive. Ignored here but needed in signature for inheritance. Returns ------- tuple of (ValidationCode, None) The validation code and None to indicate missing ChecksumDiff. ''' return (ValidationCode.PREDATES_CHECKSUMS, None) def _digraph_from_archive_contents( self, archive_contents: Dict[str, 'ProvNode'] ) -> nx.DiGraph: ''' Builds a networkx.DiGraph from a {UUID: ProvNode} dictionary. 1. Create an empty nx.digraph. 2. Gather nodes and their required attributes and add them to the DiGraph. 3. Add edges to graph (including all !no-provenance nodes) 4. Create guaranteed node attributes for these no-provenance nodes, which wouldn't otherwise have them. Parameters ---------- archive_contents : dict of {str to ProvNode} A dictionary of node uuids to their representative ProvNode objects. Returns ------- nx.DiGraph The directed, acyclic graph representation of the provenance of the archive. Edge directionality is from parent to child. Parents may have multiple children and children may have multiple parents. ''' dag = nx.DiGraph() nodes = [] for node_uuid, node in archive_contents.items(): node_info = { 'node_data': node, 'has_provenance': node.has_provenance } nodes.append((node_uuid, node_info)) dag.add_nodes_from(nodes) edges = [] for node_uuid, attrs in dag.nodes(data=True): if parents := attrs['node_data']._parents: for parent in parents: parent_uuid, = parent.values() edges.append((parent_uuid, node_uuid)) dag.add_edges_from(edges) return dag def _get_provenance_fps(self, zf: ZipFile) -> List[pathlib.Path]: ''' Collect filepaths of all provenance-relevant files in an archive. Relevant is defined by `self.expected_files_all_nodes` and `self.expected_files_root_only` (which is empty). Parameters ---------- zf : ZipFile The zipfile object of the archive. 
Returns ------- list of pathlib.Path Filepaths relative to root of zipfile for each file of interest. ''' fps = [] for fp in zf.namelist(): for expected_filename in self.expected_files_all_nodes: if expected_filename in fp: fps.append(pathlib.Path(fp)) return fps def _assert_expected_files_present( self, zf: ZipFile, expected_node_fps: List[pathlib.Path], prov_fps: List[pathlib.Path], ): ''' Makes sure that all expected files for a given node are present in an archive. Raises a ValueError if not. Parameters ---------- zf : ZipFile The zipfile object representing an archive. expected_node_fps : list of pathlib.Path The filepaths that are expected to be present in the zipfile for some node. prov_fps : list of pathlib.Path All provenance-relevant filepaths in the archive. Raises ------ ValueError If there are expected provenance-relevant files missing from a node. ''' error_contents = 'Malformed Archive: ' root_uuid = get_root_uuid(zf) for fp in expected_node_fps: if fp not in prov_fps: node_uuid = get_nonroot_uuid(fp) error_contents += ( f'{fp.name} file for node {node_uuid} ' f'misplaced or nonexistent in {zf.filename}.\n' ) error_contents += ( f'Archive {root_uuid} may be corrupt ' 'or provenance may be false.' ) raise ValueError(error_contents) class ParserV1(ParserV0): ''' Parser for V1 archives. Although action.yaml was introduced for this archive version, we are pretending that it was introduced in V2 because of difficulties untangling provenance without output names. V1 archives are treated as having no provenance, like V0 archives. ''' expected_files_root_only = ParserV0.expected_files_root_only expected_files_all_nodes = ParserV0.expected_files_all_nodes class ParserV2(ParserV1): ''' Parser for V2 archives. Introduces action/action.yaml to provenance. Directory structure identical to V1, action.yaml changes to support Pipelines. ''' expected_files_root_only = ParserV1.expected_files_root_only expected_files_all_nodes = ( *ParserV1.expected_files_all_nodes, 'action/action.yaml' ) def parse_prov(self, cfg: Config, archive: str) -> ParserResults: ''' Parses an artifact's provenance into a directed acyclic graph. For each artifact in provenance, gathers all corresponding provenance-relevant files and constructs a ProvNode. Once all ProvNodes are constructed, creates the provenance graph. Parameters ---------- cfg : Config A dataclass that stores four boolean flags: whether to perform checksum validation, whether to parse study metadata, whether to recursively parse nested directories, and whether to enable verbose mode. archive_data : str A path to the artifact to be parsed. Returns ------- ParserResults A dataclass that stores the parsed artifact uuids, the parsed networkx graph, the provenance-is-valid flag, and the checksum diff. 
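Examples
--------
A sketch of calling this parser directly; 'table.qza' stands in for a hypothetical archive of this format version:

>>> cfg = Config(True, True, False, False)
>>> results = ParserV2().parse_prov(cfg, 'table.qza')    # doctest: +SKIP
>>> len(results.parsed_artifact_uuids)                   # doctest: +SKIP
1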
''' with ZipFile(archive) as zf: if cfg.perform_checksum_validation: provenance_is_valid, checksum_diff = \ self._validate_checksums(zf) else: provenance_is_valid = ValidationCode.VALIDATION_OPTOUT checksum_diff = None prov_fps = self._get_provenance_fps(zf) root_uuid = get_root_uuid(zf) # make a provnode for each UUID archive_contents = {} for fp in prov_fps: exp_node_fps = [] if 'artifacts' not in fp.parts: node_uuid = root_uuid prefix = pathlib.Path(node_uuid) / 'provenance' root_only_expected_fps = [] for exp_filename in self.expected_files_root_only: root_only_expected_fps.append( pathlib.Path(node_uuid) / exp_filename ) exp_node_fps += root_only_expected_fps else: node_uuid = get_nonroot_uuid(fp) # /root-uuid/provenance/artifacts/node-uuid prefix = pathlib.Path(*fp.parts[0:4]) if node_uuid in archive_contents: continue for expected_file in self.expected_files_all_nodes: exp_node_fps.append(prefix / expected_file) self._assert_expected_files_present( zf, exp_node_fps, prov_fps ) # we have confirmed that all expected fps for this node exist node_fps = exp_node_fps archive_contents[node_uuid] = ProvNode(cfg, zf, node_fps) graph = self._digraph_from_archive_contents(archive_contents) return ParserResults( {root_uuid}, graph, provenance_is_valid, checksum_diff ) def _get_provenance_fps(self, zf: ZipFile) -> List[pathlib.Path]: ''' Collect filepaths of all provenance-relevant files in an archive. Relevant is defined by `self.expected_files_all_nodes` and `self.expected_files_root_only`. Parameters ---------- zf : ZipFile The zipfile object of the archive. Returns ------- list of pathlib.Path Filepaths relative to root of zipfile for each file of interest. ''' fps = [] for fp in zf.namelist(): for expected_filename in self.expected_files_all_nodes: if 'provenance' in fp and expected_filename in fp: fps.append(pathlib.Path(fp)) root_uuid = get_root_uuid(zf) for expected_filename in self.expected_files_root_only: fps.append(pathlib.Path(root_uuid) / expected_filename) return fps class ParserV3(ParserV2): ''' Parser for V3 archives. Directory structure identical to V1 & V2, action.yaml now supports variadic inputs, so !set tags in action.yaml. ''' expected_files_root_only = ParserV2.expected_files_root_only expected_files_all_nodes = ParserV2.expected_files_all_nodes class ParserV4(ParserV3): ''' Parser for V4 archives. Adds citations to directory structure, changes to action.yaml including transformers. ''' expected_files_root_only = ParserV3.expected_files_root_only expected_files_all_nodes = ( *ParserV3.expected_files_all_nodes, 'citations.bib' ) class ParserV5(ParserV4): ''' Parser for V5 archives. Adds checksum validation with checksums.md5. ''' expected_files_root_only = ('checksums.md5', ) expected_files_all_nodes = ParserV4.expected_files_all_nodes def _validate_checksums( self, zf: ZipFile ) -> Tuple[ValidationCode, Optional[ChecksumDiff]]: ''' Checksum support added for v5, so perform checksum validation. Parameters ---------- zf : ZipFile The zipfile object representation of the parsed archive. Returns ------- tuple of (ValidationCode, ChecksumDiff or None) Where ValidationCode is one of valid, invalid, predates checksums, optout. Where ChecksumDiff contains filepaths of all changed, added, and removed files since last checksumming. If checksums.md5 is missing from archive the archive, an invalid code is returned and a ChecksumDiff of None is returned. 
Notes ----- Because a ChecksumDiff of None here has a different interpetation than in pre-V5 archive parsers, the ChecksumDiff should only be intepreted in conjuction with the ValidationCode. ''' return validate_checksums(zf) class ParserV6(ParserV5): ''' Parser for V6 archives. Adds support for output collections, adds execution_context field to action.yaml. ''' expected_files_root_only = ParserV5.expected_files_root_only expected_files_all_nodes = ParserV5.expected_files_all_nodes FORMAT_REGISTRY = { # NOTE: update for new format versions in qiime2.core.archive.Archiver '0': ParserV0, '1': ParserV1, '2': ParserV2, '3': ParserV3, '4': ParserV4, '5': ParserV5, '6': ParserV6 } qiime2-2024.5.0/qiime2/core/archive/provenance_lib/assets/000077500000000000000000000000001462552636000231725ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/assets/copyright_note.txt000066400000000000000000000004661462552636000267760ustar00rootroot00000000000000# This document is a representation of the scholarly work of the creator of the # QIIME 2 Results provided as input to this software, and may be protected by # intellectual property law. Please respect all copyright restrictions and # licenses governing the use, modification, and redistribution of this work. qiime2-2024.5.0/qiime2/core/archive/provenance_lib/assets/python_howto.txt000066400000000000000000000022341462552636000264750ustar00rootroot00000000000000 # Instructions for use: # 1. Open this script in a text editor or IDE. Support for Python # syntax highlighting is helpful. # 2. Search or scan visually for '<' or '>' characters to find places where # user input (e.g. a filepath or column name) is required. If syntax # highlighting is enabled, '<' and '>' will appear as syntax errors. # 3. Search for 'FIXME' comments in the script, and respond as directed. # 4. Remove all 'FIXME' comments from the script completely. Failure to do so # may result in 'Missing Option' errors # 5. Adjust the arguments to the commands below to suit your data and metadata. # If your data is not identical to that in the replayed analysis, # changes may be required. (e.g. sample ids or rarefaction depth) # 6. Optional: search for 'SAVE' comments in the script, commenting out the # `some_result.save` lines for any Results you do not want saved to disk. # 7. Activate your replay conda environment, and confirm you have installed all # plugins used by the script. # 8. Run this script with `python `, or paste commands # into a python interpreter or jupyter notebook for an interactive analysisqiime2-2024.5.0/qiime2/core/archive/provenance_lib/parse.py000066400000000000000000000467131462552636000233670ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- from __future__ import annotations import copy from typing import Any, List, Optional, Set import networkx as nx import os from pathlib import Path from networkx.classes.reportviews import NodeView from ._checksum_validator import ValidationCode, ChecksumDiff from .archive_parser import ( Config, ParserResults, ProvNode, Parser, ArchiveParser ) class ProvDAG: ''' A directed acyclic graph (DAG) representing the provenance of one or more QIIME 2 Archives (.qza or .qzv files). 
Parameters ---------- artifact_data : Any The input data payload for the ProvDAG. Often the path to a file or directory on disk. validate_checksums : bool If True, Archives will be validated against their checksums.md5 manifests. parse_metadata : bool If True, the metadata captured in the input Archives will be parsed and included in the ProvDAG. recurse : bool If True, and if artifact_data is a directory, will recursively parse all .qza and .qzv files within subdirectories. verbose : bool If True, will print parsed filenames to stdout, indicating progress. Attributes ---------- dag : nx.DiGraph A Directed Acyclic Graph (DAG) representing the complete provenance of one or more QIIME 2 Artifacts. This DAG is comprehensive, including pipeline "alias" nodes as well as the inner nodes that compose each pipeline. parsed_artifact_uuids : Set[UUID] The set of user-passed terminal node uuids. Used to generate properties like `terminal_uuids`, this is a superset of terminal_uuids. terminal_uuids : Set[UUID] The set of terminal node ids present in the DAG, not including inner pipeline nodes. terminal_nodes : Set[ProvNode] The terminal ProvNodes present in the DAG, not including inner pipeline nodes. provenance_is_valid : ValidationCode The canonical indicator of provenance validity for a dag, this contains the lowest ValidationCode from all parsed Artifacts unioned into a given ProvDAG. checksum_diff : ChecksumDiff A ChecksumDiff representing all added, removed, and changed filepaths from all parsed Artifacts. If an artifact's checksums.md5 file is missing, this may be None. When multiple artifacts are unioned, this field prefers ChecksumDiffs over Nonetypes, which will be dropped. ''' def __init__( self, artifact_data: Any = None, validate_checksums: bool = True, parse_metadata: bool = True, recurse: bool = False, verbose: bool = False, ): ''' Creates a digraph by getting a parser from the parser dispatcher then parses the incoming data into a ParserResults, and then loads those results into key fields. ''' cfg = Config(validate_checksums, parse_metadata, recurse, verbose) parser_results = parse_provenance(cfg, artifact_data) self.cfg = cfg self._parsed_artifact_uuids = parser_results.parsed_artifact_uuids self.dag = parser_results.prov_digraph self._provenance_is_valid = parser_results.provenance_is_valid self._checksum_diff = parser_results.checksum_diff # clear cache whenever we create a new ProvDAG self._terminal_uuids = None def __repr__(self) -> str: return ( 'ProvDAG representing the provenance of the Artifacts: ' f'{self._parsed_artifact_uuids}' ) __str__ = __repr__ def __len__(self) -> int: return len(self.dag) def __eq__(self, other) -> bool: if self.__class__ != other.__class__: return False if not nx.is_isomorphic(self.dag, other.dag): return False return True def __iter__(self): return iter(self.dag) @property def terminal_uuids(self) -> Set[str]: ''' The UUIDs of the terminal nodes in the DAG, generated by selecting all nodes in a collapsed view of self.dag with an out-degree of zero. We memoize the set of terminal UUIDs to prevent unnecessary traversals, so must set self._terminal_uuid back to None in any method that modifies the structure of self.dag, or the nodes themselves (which are literal UUIDs). These methods include self.union() and self.relabel_nodes(). 
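Examples
--------
A sketch; 'taxonomy.qzv' is a hypothetical archive whose provenance terminates in the parsed visualization itself:

>>> dag = ProvDAG('taxonomy.qzv')                      # doctest: +SKIP
>>> dag.terminal_uuids == dag.parsed_artifact_uuids    # doctest: +SKIP
True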
''' if self._terminal_uuids is not None: return self._terminal_uuids cv = self.collapsed_view self._terminal_uuids = { uuid for uuid, out_degree in cv.out_degree() if out_degree == 0 } return self._terminal_uuids @property def parsed_artifact_uuids(self) -> Set[str]: ''' The set of user-passed terminal node uuids. Used to generate properties like self.terminal_uuids. ''' return self._parsed_artifact_uuids @property def terminal_nodes(self) -> Set[ProvNode]: '''The terminal ProvNodes in the DAG's provenance.''' return {self.get_node_data(uuid) for uuid in self.terminal_uuids} @property def provenance_is_valid(self) -> ValidationCode: return self._provenance_is_valid @property def checksum_diff(self) -> Optional[ChecksumDiff]: return self._checksum_diff @property def nodes(self) -> NodeView: return self.dag.nodes @property def collapsed_view(self) -> nx.DiGraph: ''' Returns a subsetted graphview of self.dag containing only nodes that are exterior to pipeline actions. ''' outer_nodes = set() for terminal_uuid in self._parsed_artifact_uuids: outer_nodes |= self.get_outer_provenance_nodes(terminal_uuid) return nx.subgraph_view(self.dag, lambda node: node in outer_nodes) def has_edge(self, start_node: str, end_node: str) -> bool: return self.dag.has_edge(start_node, end_node) def node_has_provenance(self, uuid: str) -> bool: return self.dag.nodes[uuid]['has_provenance'] def get_node_data(self, uuid: str) -> ProvNode: '''Returns a ProvNode from this ProvDAG selected by UUID.''' return self.dag.nodes[uuid]['node_data'] def predecessors(self, node: str, dag: nx.DiGraph = None) -> Set[str]: ''' Returns the parent UUIDs of a given node. Parameters ---------- node : str The uuid of the node of interest. dag : nx.DiGraph The current provenance graph. Returns ------- set of str The uuids of the parents of the node. ''' dag = self.collapsed_view if dag is None else dag return set(self.dag.predecessors(node)) @classmethod def union(cls, dags: List[ProvDAG]) -> ProvDAG: ''' Class method that creates a new ProvDAG by unioning two or more graphs. The returned ProvDAG._parsed_artifact_uuids will include uuids from all dags, and other DAG attributes are reduced conservatively. Parameters ---------- dags : list of ProvDAG A collection of ProvDAGs to union. Returns ------- ProvDAG A single ProvDAG representing the union of all inputs. ''' if len(dags) < 2: raise ValueError('Please pass at least two ProvDAGs.') union_dag = ProvDAG() union_dag.dag = nx.compose_all((dag.dag for dag in dags)) union_dag._parsed_artifact_uuids = dags[0]._parsed_artifact_uuids union_dag._provenance_is_valid = dags[0]._provenance_is_valid union_dag._checksum_diff = dags[0].checksum_diff for next_dag in dags[1:]: union_dag._parsed_artifact_uuids = \ union_dag._parsed_artifact_uuids.\ union(next_dag._parsed_artifact_uuids) union_dag._provenance_is_valid = min( union_dag.provenance_is_valid, next_dag.provenance_is_valid ) union_dag.cfg.parse_study_metadata = min( union_dag.cfg.parse_study_metadata, next_dag.cfg.parse_study_metadata ) union_dag.cfg.perform_checksum_validation = min( union_dag.cfg.perform_checksum_validation, next_dag.cfg.perform_checksum_validation ) # Here we retain as much data as possible, preferencing any # ChecksumDiff over None. This might mean we keep an empty # ChecksumDiff and drop a None ChecksumDiff, the latter of which is # used to indicate a missing checksums.md5 file in v5+ archives. # union_dag._provenance_is_valid will still be INVALID however. 
if next_dag.checksum_diff is None: continue if union_dag.checksum_diff is None: union_dag._checksum_diff = next_dag.checksum_diff else: union_dag.checksum_diff.added.update( next_dag.checksum_diff.added ) union_dag.checksum_diff.removed.update( next_dag.checksum_diff.removed ) union_dag.checksum_diff.changed.update( next_dag.checksum_diff.changed ) # make union._terminal_uuids be recalculated on next access union_dag._terminal_uuids = None return union_dag def get_outer_provenance_nodes(self, node_id: str = None) -> Set[str]: ''' Performs depth-first traversal of a node's ancestors. Skips over nodes that are interior to a pipeline because pipeline output nodes point to the pipeline's inputs as parents not their direct parents inside the pipeline. Parameters ---------- node_id : str The uuid of the node for which to discover ancestors. Returns ------- set of str All ancestor uuids according to the above definition. ''' nodes = set() if node_id is None else {node_id} parents = [edge_pair[0] for edge_pair in self.dag.in_edges(node_id)] for uuid in parents: nodes = nodes | self.get_outer_provenance_nodes(uuid) return nodes # TODO: can this get nuked? class EmptyParser(Parser): ''' Creates empty ProvDAGs. Disregards Config, because it's not meaningful in this context. ''' accepted_data_types = 'None' @classmethod def get_parser(cls, artifact_data: Any) -> Parser: if artifact_data is None: return EmptyParser() else: raise TypeError(f' in EmptyParser: {artifact_data} is not None.') def parse_prov(self, cfg: Config, data: None) -> ParserResults: ''' Returns a static ParserResults with empty parsed_artifact_uuids, an empty graph, a valid ValidationCode, and a None ChecksumDiff. ''' return ParserResults( parsed_artifact_uuids=set(), prov_digraph=nx.DiGraph(), provenance_is_valid=ValidationCode.VALID, checksum_diff=None, ) class DirectoryParser(Parser): accepted_data_types = \ 'filepath to a directory containing .qza/.qzv archives' @classmethod def get_parser(cls, artifact_data: Any) -> Parser: ''' Return a DirectoryParser if appropriate. Parameters ---------- artifact_data : Any Ideally a path to a directory containing one or more archives, but may be a different type during searches for other Parsers. Raises ------ TypeError If something other than a str or path-like object is input. ValueError If the path does not point to a directory. ''' try: is_dir = os.path.isdir(artifact_data) except TypeError: t = type(artifact_data) raise TypeError( f' in DirectoryParser: expects a directory, not a {t}.' ) if not is_dir: raise ValueError( f' in DirectoryParser: {artifact_data} ' 'is not a valid directory.' ) return DirectoryParser() def parse_prov(self, cfg: Config, data: str) -> ParserResults: ''' Iterates over the directory's .qza and .qzv files, parsing them if their terminal node isn't already in the DAG. This behavior assumes that the ArchiveParsers capture all nodes within the archives they parse by default. Parameters ---------- cfg : Config User-selected configuration options for whether to perform checksum validation, whether to parse study metadata, whether to recursively parse nested directories of artifacts, and whether to print status messages during processing. data : Any The path to the directory containing artifacts. Returns ------- ParserResults A dataclass that stores the parsed artifact uuids, the parsed networkx graph, the provenance-is-valid flag, and the checksum diff. 
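Examples
--------
A sketch; 'results/' is a hypothetical directory containing .qza/.qzv files. Most callers reach this parser indirectly, through the ProvDAG constructor:

>>> cfg = Config(True, True, True, False)    # validate, parse metadata, recurse, not verbose
>>> results = DirectoryParser().parse_prov(cfg, 'results/')    # doctest: +SKIP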
''' dir_name = Path(str(data).rstrip('/') + os.sep) if cfg.recurse: artifacts_to_parse = list(dir_name.rglob('*.qz[av]')) err_msg = ( f'No .qza or .qzv files present in {dir_name} or any ' 'directory nested within it.' ) else: artifacts_to_parse = list(dir_name.glob('*.qz[av]')) err_msg = ( f'No .qza or .qzv files present in {dir_name}. Did you ' 'mean to recurse into nested directories?' ) if not artifacts_to_parse: raise ValueError(err_msg) dag = ProvDAG() for archive in artifacts_to_parse: if cfg.verbose: print("parsing", archive) dag = ProvDAG.union([ dag, ProvDAG( archive, cfg.perform_checksum_validation, cfg.parse_study_metadata ) ]) return ParserResults( dag._parsed_artifact_uuids, dag.dag, dag.provenance_is_valid, dag.checksum_diff, ) def archive_not_parsed(root_uuid: str, dag: ProvDAG) -> bool: ''' Checks if the archive with root_uuid has not already been parsed into dag which is defined as either not in the dag at all, or added only as a !no-provenance parent uuid. Parameters ---------- root_uuid : str The root uuid of an archive. dag : ProvDAG The ProvDAG in which to search for the root uuid. Returns ------- bool Indicating whether the archive represented by root_uuid has not been parsed into the ProvDAG. ''' if root_uuid not in dag.dag: return True elif dag.get_node_data(root_uuid) is None: return True return False class ProvDAGParser(Parser): ''' Effectively a ProvDAG copy constructor, this "parses" a ProvDAG, loading its data into a new ProvDAG. Disregards Config, because it's not meaningful in this context. ''' accepted_data_types = 'ProvDAG' @classmethod def get_parser(cls, artifact_data: Any) -> Parser: ''' Returns ProvDAGParser if appropriate. Parameters ---------- artifact_data : Any Hopefully a ProvDAG but may be a different type during searches for the proper Parser. Returns ------- ProvDAGParser An instance of ProvDAGParser if artifact_data is a ProvDAG. Raises ------ TypeError If artifact_data is not a ProvDAG. ''' if isinstance(artifact_data, ProvDAG): return ProvDAGParser() else: raise TypeError( f' in ProvDAGParser: {artifact_data} is not a ProvDAG.' ) def parse_prov(self, cfg: Config, dag: ProvDAG) -> ParserResults: ''' Parses a ProvDAG returning a ParserResults by deep copying existing attributes that live on the ProvDAG and make up a ParserResults. Parameters ---------- cfg : Config Ignored because a ProvDAG is not being constructed from scratch. Present for inheritance purposes. dag : ProvDAG The ProvDAG to parse, read: copy attributes from. Returns ------- ParserResults A dataclass that stores the parsed artifact uuids, the parsed networkx graph, the provenance-is-valid flag, and the checksum diff. ''' return ParserResults( copy.deepcopy(dag._parsed_artifact_uuids), copy.deepcopy(dag.dag), copy.deepcopy(dag.provenance_is_valid), copy.deepcopy(dag.checksum_diff), ) def parse_provenance(cfg: Config, payload: Any) -> ParserResults: ''' Parses some data payload into a ParserResults object ingestible by ProvDAG. Parameters ---------- cfg : Config A dataclass that stores four boolean flags: whether to perform checksum validation, whether to parse study metadata, whether to recursively parse nested directories, and whether to enable verbose mode. payload : Any The payload to attempt to parse, commonly a path to an archive or directory containing archives. Returns ------- ParserResults A dataclass that stores the parsed artifact uuids, the parsed networkx graph, the provenance-is-valid flag, and the checksum diff. 
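Examples
--------
A sketch; 'table.qza' is a hypothetical archive path:

>>> cfg = Config(True, True, False, False)
>>> results = parse_provenance(cfg, 'table.qza')    # doctest: +SKIP
>>> isinstance(results.prov_digraph, nx.DiGraph)    # doctest: +SKIP
True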
''' parser = select_parser(payload) return parser.parse_prov(cfg, payload) def select_parser(payload: Any) -> Parser: ''' Attempts to find a parser that can handle some given payload. Parameters ---------- payload : Any The payload for which to find a parser. Returns ------- Parser The appropriate Parser for the payload type. Raises ------ UnparseableDataError If no appropriate parser could be found for the payload. ''' _PARSER_TYPE_REGISTRY = [ ArchiveParser, DirectoryParser, ProvDAGParser, EmptyParser ] accepted_data_types = [ parser.accepted_data_types for parser in _PARSER_TYPE_REGISTRY ] optional_parser = None errors = [] for parser in _PARSER_TYPE_REGISTRY: try: optional_parser = parser.get_parser(payload) if optional_parser is not None: return optional_parser except Exception as e: errors.append(e) err_msg = ( f'Input data {payload} is not supported.\n' 'Parsers are available for the following data types: ' f'{accepted_data_types}.\n' 'The following errors were caught while trying to identify a parser ' 'that can_handle this input data:\n' ) for e in errors: err_msg += str(type(e)) + str(e) + '\n' raise UnparseableDataError(err_msg) class UnparseableDataError(Exception): pass qiime2-2024.5.0/qiime2/core/archive/provenance_lib/replay.py000066400000000000000000001450461462552636000235500ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import bibtexparser as bp from bibtexparser.bwriter import BibTexWriter import networkx as nx import os import pathlib import pkg_resources import shutil import tempfile from uuid import uuid4 from dataclasses import dataclass, field from typing import Dict, Iterator, List, Optional, Tuple, Union from .archive_parser import ProvNode from .parse import ProvDAG from .usage_drivers import build_header, build_footer from ..provenance import MetadataInfo from qiime2.sdk import PluginManager from qiime2.sdk.usage import Usage, UsageVariable from qiime2.sdk.util import camel_to_snake @dataclass class ReplayConfig(): ''' Dataclass that stores various user-selected configuration options and other bits of information relevant to provenance replay. Parameters ---------- use : Usage The usage driver to be used for provenance replay. dump_recorded_metadata : bool If True, replay should write the metadata recorded in provenance to disk in .tsv format. use_recorded_metadata : bool If True, replay should use the metadata recorded in provenance. pm : PluginManager The active instance of the QIIME 2 PluginManager. md_context_has_been_printed : bool A flag set by default and used internally, allows context to be printed once and only once. no_provenance_context_has_been_printed : bool Indicates whether the no-provenance context documentation has been printed. header : bool If True, an introductory how-to header should be rendered in the script. verbose : bool If True, progress is reported to stdout. md_out_dir : str The directory where caputred metadata should be written. 
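Examples
--------
A sketch; ReplayPythonUsage is one of the replay usage drivers referenced elsewhere in this module and is assumed here to have been imported:

>>> cfg = ReplayConfig(use=ReplayPythonUsage(), verbose=True)    # doctest: +SKIP
>>> cfg.dump_recorded_metadata                                   # doctest: +SKIP
True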
''' use: Usage dump_recorded_metadata: bool = True use_recorded_metadata: bool = False pm: PluginManager = PluginManager() md_context_has_been_printed: bool = False no_provenance_context_has_been_printed: bool = False header: bool = True verbose: bool = False md_out_dir: str = '' @dataclass class ActionCollections(): ''' std_actions are all normal provenance-tracked q2 actions, arranged as: { : { : 'output_name', 'output_name_2' }, : ... } no_provenance_nodes can't be organized by action, and in some cases we don't know anything but UUID for them, so we can fit these in a list. ''' std_actions: Dict[str, Dict[str, str]] = field(default_factory=dict) no_provenance_nodes: List[str] = field(default_factory=list) @dataclass class UsageVariableRecord: name: str variable: UsageVariable = None @dataclass class ResultCollectionRecord: collection_uuid: str members: Dict[str, str] class ReplayNamespaces: ''' A dataclass collection of objects that each track some useful bit of information relevant to replay/usage namespaces. Attributes ---------- _usg_var_ns : dict The central usage variable namespace that ensures no namespace clashes. Maps artifact uuid to `UsageVariableRecord`. _action_ns : set of str A collection of unique action strings that look like `{plugin}_{action}_{sequential int}`. result_collection_ns : dict Used to keep track of result collection members during usage rendering. Structure is as follows: { action-id: { output-name: { 'collection_uuid': uuid, 'artifacts': { uuid: key-in-collection, (...), } }, (...), } } where the action-id and the output-name uniquely identify a result collection that came from some action, the `collection_uuid` key stores a uuid for the entire collection needed for querying the `usg_var_namespace`, and the `artifacts` key stores all result collection members along with their keys so they can be accessed properly. ''' def __init__(self, dag=None): self._usg_var_ns = {} self._action_ns = set() if dag: self.result_collection_ns = \ self.make_result_collection_namespace(dag) self.artifact_uuid_to_rc_uuid, self.rc_contents_to_rc_uuid = \ self.make_result_collection_mappings() else: self.result_collection_ns = {} self.artifact_uuid_to_rc_uuid = {} self.rc_contents_to_rc_uuid = {} def add_usg_var_record(self, uuid, name, variable=None): ''' Given a uuid, name, and optionally a usage variable, create a usage variable record and add it to the namespace. Parameters ---------- uuid : str The uuid of the the artifact or result collection. name : str The not-yet-unique name of the artifact or result collection. variable : UsageVariable or None The optional UsageVariable instance to add to the record. Returns ------- str The now-unique name of the artifact or result collection. ''' unique_name = self._make_unique_name(name) self._usg_var_ns[uuid] = UsageVariableRecord(unique_name, variable) return unique_name def update_usg_var_record(self, uuid, variable): ''' Given a uuid update the record to contain the passed usage variable. The record is assumed to already be present in the namespace. Parameters ---------- uuid : str The uuid of the artifact or result collection for which to update the usage variable instance. variable : UsageVariable The usage variable to add to the record. ''' self._usg_var_ns[uuid].variable = variable def get_usg_var_record(self, uuid): ''' Given a uuid, return the corresponding usage variable record, or none if the uuid is not in the namespace. 
Parameters ---------- uuid : str The uuid of the artifact or result collection for which to return the record. Returns ------- UsageVariableRecord or None The record if the uuid was found, otherwise None. ''' try: return self._usg_var_ns[uuid] except KeyError: return None def get_usg_var_uuid(self, name: str) -> str: ''' Given a usage variable name, return its uuid, or raise KeyError if the name is not in the namespace. Parameters ---------- name : str The name of the usage variable record of interest. Returns ------- str The corresponding uuid of the record. Raises ------ KeyError If the name is not found in the namespace. ''' for uuid, record in self._usg_var_ns.items(): if name == record.name: return uuid raise KeyError( f'The queried name \'{name}\' does not exist in the namespace.' ) def _make_unique_name(self, name: str) -> str: ''' Appends `_` to name, such that the returned name won't collide with any variable names that already exist in `usg_var_ns`. Parameters ---------- name : str The variable name to make unique. Returns ------- str The unique integer-appended variable name. ''' counter = 0 unique_name = f'{name}_{counter}' names = [record.name for record in self._usg_var_ns.values()] # no-provenance nodes are stored with angle brackets around them while unique_name in names or f'<{unique_name}>' in names: counter += 1 unique_name = f'{name}_{counter}' return unique_name def make_result_collection_namespace(self, dag: nx.digraph) -> dict: ''' Constructs the result collections namespaces from the parsed digraph. Parameters ---------- dag : nx.digraph The digraph representing the parsed provenance. Returns ------- dict The result collection namespace. ''' rc_ns = {} for node in dag: provnode = dag.get_node_data(node) rc_key = provnode.action.result_collection_key if rc_key: # output result collection action_id = provnode.action.action_id output_name = provnode.action.output_name if action_id not in rc_ns: rc_ns[action_id] = {} if output_name not in rc_ns[action_id]: artifacts = {rc_key: provnode._uuid} rc_ns[action_id][output_name] = ResultCollectionRecord( collection_uuid=str(uuid4()), members=artifacts ) else: rc_ns[action_id][output_name].members[rc_key] = \ provnode._uuid return rc_ns def make_result_collection_mappings(self) -> Tuple[Dict]: ''' Builds two mappings: - one from artifact uuid to a tuple of the uuid of the result collection of which it is a member and its key in the collection - one from the hash of the result collection contents (both with and without keys) to the uuid of the result collection Returns ------- tuple of dict The two result collection mappings. ''' a_to_c = {} # artifact uuid -> collection uuid c_to_c = {} # hash of collection contents -> collection uuid for action_id in self.result_collection_ns: for output_name in self.result_collection_ns[action_id]: record = self.result_collection_ns[action_id][output_name] for key, uuid in record.members.items(): a_to_c[uuid] = (record.collection_uuid, key) hashed_contents = self.hash_result_collection(record.members) hashed_contents_with_keys = \ self.hash_result_collection_with_keys(record.members) c_to_c[hashed_contents] = record.collection_uuid c_to_c[hashed_contents_with_keys] = record.collection_uuid return a_to_c, c_to_c def hash_result_collection_with_keys(self, members: Dict) -> int: ''' Hashes the contents of a result collection. Useful for finding corresponding usage variables when rendering the replay of result collections. 
Order of the input result collection is not taken into account (the result collections are ordered alphabetically by key). Parameters ---------- members : dict The contents of a result collection, looks like: { 'a': some-uuid, 'b': some-other-uuid, (...) } Returns ------- int The hashed contents. ''' sorted_members = {key: members[key] for key in sorted(members)} hashable_members_with_keys = tuple( (key, value) for key, value in sorted_members.items() ) return hash(hashable_members_with_keys) def hash_result_collection(self, members: Union[Dict, List]) -> int: ''' Hashes a list of uuids. Useful for finding corresponding result collections that may have been cast to list of uuids. If a dict is input it is first converted to a list of values (uuids). Parameters ---------- members : dict or list The contents of a result collection, either as a dict or list. Returns ------- int The hashed contents. ''' if type(members) is dict: members = list(members.values()) sorted_members = list(sorted(members)) hashable_members = tuple(uuid for uuid in sorted_members) return hash(hashable_members) def add_rc_member_to_ns(self, uuid, name, use): ''' Accesses a result collection member of interest and adds it the central usage variable namespace. Parameters ---------- uuid : str The uuid of the artifact of interest. name : str The desired name of the to-be-made usage variable. use : Usage The currently executing usage driver. ''' collection_uuid, key = self.artifact_uuid_to_rc_uuid[uuid] collection_var = self.get_usg_var_record(collection_uuid).variable var_name = self.add_usg_var_record(uuid, name) usg_var = use.get_artifact_collection_member( var_name, collection_var, key ) self.update_usg_var_record(uuid, usg_var) def uniquify_action_name(self, plugin: str, action: str) -> str: ''' Creates a unique name by concatenating plugin, action, and a counter, and adds this name to _action_ns before returning it. Parameters ---------- plugin : str The name of the plugin. action : str The name of the action. action_namespace : set of str The collection of unqiue action names. Returns ------- str The unique action name. ''' counter = 0 plg_action_name = f'{plugin}_{action}_{counter}' while plg_action_name in self._action_ns: counter += 1 plg_action_name = f'{plugin}_{action}_{counter}' self._action_ns.add(plg_action_name) return plg_action_name def replay_provenance( usage_driver: Usage, payload: Union[str, ProvDAG], out_fp: str, validate_checksums: bool = True, parse_metadata: bool = True, recurse: bool = False, use_recorded_metadata: bool = False, suppress_header: bool = False, verbose: bool = False, dump_recorded_metadata: bool = True, md_out_dir: str = '' ): ''' Renders usage examples describing a ProvDAG, producing an interface- specific executable. ProvDAG inputs retain their original config values. The `validate_checksums`, `parse_metadata`, `recurse`, and `verbose` parameters are disregarded if the payload is a ProvDAG. Parameters ---------- usage_driver : Usage The type of Usage driver to be used. Currently intended to be either `ReplayPythonUsage` or `ReplayCLIUsage`. payload : str or ProvDAG A filepath to an artifact or directory containing artifacts, or the ProvDAG to be parsed. out_fp : str The filepath at which to write the rendered executable. validate_checksums : bool Whether to perform checksum validation on the input artifact. parse_metadata : bool Whether to parse study metadata recorded in provenance. recurse : bool Whether to recursively parse nested directories containing artifacts. 
use_recorded_metadata : bool Whether to use the metadata recorded in provenance. suppress_header : bool Whether to forgo rendering the header and footer that are included by default in replay scripts. verbose : bool Whether to print status messages during processing. dump_recorded_metadata : bool Whether to write the metadata recorded in provenance to disk. md_out_dir : str The directory in which to write the recorded metadata if desired. ''' if type(payload) is ProvDAG: parse_metadata = payload.cfg.parse_study_metadata if not parse_metadata: if use_recorded_metadata: raise ValueError( 'Metadata not parsed for replay. Re-run with parse_metadata, ' 'or set use_recorded_metadata to False.' ) if dump_recorded_metadata: raise ValueError( 'Metadata not parsed, so cannot be written to disk. Re-run ' 'with parse_metadata, or set dump_recorded_metadata to False.' ) if md_out_dir: raise ValueError( 'Metadata not parsed, so cannot be written to disk. Re-run ' 'with parse_metadata, or do not pass a metadata output ' 'filepath argument.' ) if use_recorded_metadata and not dump_recorded_metadata: raise NotImplementedError( 'In order to produce a replay script that uses metadata ' 'captured in provenance, that metadata must first be written to ' 'disk. Re-run with dump-recorded-metadata set to True, or ' 'use-recorded-metadata set to False.' ) dag = ProvDAG( payload, validate_checksums, parse_metadata, recurse, verbose ) cfg = ReplayConfig( use=usage_driver(), use_recorded_metadata=use_recorded_metadata, dump_recorded_metadata=dump_recorded_metadata, verbose=verbose, md_out_dir=md_out_dir ) ns = ReplayNamespaces(dag) build_usage_examples(dag, cfg, ns) if not suppress_header: cfg.use.build_header() cfg.use.build_footer(dag) if cfg.dump_recorded_metadata: print('metadata written to recorded_metadata/') output = cfg.use.render(flush=True) with open(out_fp, mode='w') as out_fh: out_fh.write(output) def build_usage_examples( dag: ProvDAG, cfg: ReplayConfig, ns: ReplayNamespaces ): ''' Builds a chained usage example representing the analysis `dag`. Parameters ---------- dag : ProvDAG The dag representation of parsed provenance. cfg : ReplayConfig Replay configuration options. ns : ReplayNamespaces Info tracking usage and result collection namespaces. ''' sorted_nodes = nx.topological_sort(dag.collapsed_view) actions = group_by_action(dag, sorted_nodes, ns) for node_id in actions.no_provenance_nodes: node = dag.get_node_data(node_id) build_no_provenance_node_usage(node, node_id, ns, cfg) for action_id in (std_actions := actions.std_actions): # we are replaying actions not nodes, so any associated node works try: some_node_id = next(iter(std_actions[action_id])) node = dag.get_node_data(some_node_id) except KeyError: # we have result collection some_output_name = next(iter(ns.result_collection_ns[action_id])) some_node_id = next(iter( ns.result_collection_ns[action_id][ some_output_name].members.values() )) node = dag.get_node_data(some_node_id) if node.action.action_type == 'import': build_import_usage(node, ns, cfg) else: build_action_usage(node, ns, std_actions, action_id, cfg) def group_by_action( dag: ProvDAG, nodes: Iterator[str], ns: ReplayNamespaces ) -> ActionCollections: ''' This groups the nodes from a DAG by action, returning an ActionCollections aggregating the outputs related to each action. Takes an iterator of UUIDs, allowing us to influence the ordering of the grouping. 
In cases where a captured output_name is unavailable, we substitute the output data's Semantic Type, snake-cased because it will be used as a variable name if this data is rendered by ArtifactAPIUsage. Parameters ---------- dag : ProvDAG The dag representation of parsed provenance. nodes : iterator of str An iterator over node uuids. ns : ReplayNamespaces Info tracking usage and result collection namespaces. Returns ------- ActionCollections The outputs grouped by action. ''' actions = ActionCollections() for node_id in nodes: if dag.node_has_provenance(node_id): node = dag.get_node_data(node_id) action_id = node.action.action_id output_name = node.action.output_name if output_name is None: output_name = camel_to_snake(node.type) if node.action.result_collection_key: rc_record = ns.result_collection_ns[action_id][output_name] node_id = rc_record.collection_uuid if action_id not in actions.std_actions: actions.std_actions[action_id] = {node_id: output_name} else: # for result collections we overwrite this but don't care actions.std_actions[action_id][node_id] = output_name else: actions.no_provenance_nodes.append(node_id) return actions def build_no_provenance_node_usage( node: Optional[ProvNode], uuid: str, ns: ReplayNamespaces, cfg: ReplayConfig ): ''' Given a ProvNode (with no provenance), make sure comments will be rendered explaining this, add an empty usage variable to the namespace and log the node. Returns nothing, modifying the passed usage instance in place. Parameters ---------- node : ProvNode or None Either a no-provenance node, or None indicating that only `uuid` is available. uuid : str The uuid of the node/result. ns : ReplayNamespaces Info tracking usage and result collection namespaces. cfg : ReplayConfig Replay configuration options. Contains the modified usage driver. ''' if not cfg.no_provenance_context_has_been_printed: cfg.no_provenance_context_has_been_printed = True cfg.use.comment( 'One or more nodes have no provenance, so full replay is ' 'impossible. Any commands we were able to reconstruct have been ' 'rendered, with the string descriptions below replacing actual ' 'inputs.' ) cfg.use.comment( 'Original Node ID String Description' ) if node is None: # the node is a !no-provenance input and we have only UUID var_name = 'no-provenance-node' else: var_name = camel_to_snake(node.type) ns.add_usg_var_record(uuid, var_name) # make a usage variable for downstream consumption empty_var = cfg.use.usage_variable( ns.get_usg_var_record(uuid).name, lambda: None, 'artifact' ) ns.update_usg_var_record(uuid, empty_var) # log the no-prov node usg_var = ns.get_usg_var_record(uuid).variable cfg.use.comment(f"{uuid} {usg_var.to_interface_name()}") def build_import_usage( node: ProvNode, ns: ReplayNamespaces, cfg: ReplayConfig ): ''' Given a ProvNode, adds an import usage example for it, roughly resembling the below. Returns nothing, modifying the passed usage instance in place. raw_seqs = use.init_format('raw_seqs', lambda: None, ext='fastq.gz') imported_seqs = use.import_from_format( 'emp_single_end_sequences', 'EMPSingleEndSequences', raw_seqs ) The `lambda: None` is a placeholder for some actual data factory, and should not impact the rendered usage. Parameters ---------- node : ProvNode The imported node of interest. ns : ReplayNamespaces Info tracking usage and result collection namespaces. cfg : ReplayConfig Replay configuration options. Contains the modified usage driver. 
''' format_id = node._uuid + '_f' ns.add_usg_var_record(format_id, camel_to_snake(node.type) + '_f') format_for_import = cfg.use.init_format( ns.get_usg_var_record(format_id).name, lambda: None ) var_name = ns.add_usg_var_record(node._uuid, camel_to_snake(node.type)) use_var = cfg.use.import_from_format( var_name, node.type, format_for_import ) ns.update_usg_var_record(node._uuid, use_var) def build_action_usage( node: ProvNode, ns: ReplayNamespaces, std_actions: Dict[str, Dict[str, str]], action_id: str, cfg: ReplayConfig ): ''' Adds an action usage example to `use` for some ProvNode. Returns nothing, modifying the passed usage instance in place. use.action( use.UsageAction(plugin_id='diversity_lib', action_id='pielou_evenness'), use.UsageInputs(table=ft), use.UsageOutputNames(vector='pielou_vector') ) Parameters ---------- node : ProvNode The node the creating action of which is of interest. ns : ReplayNamespaces Info tracking usage and result collection namespaces. std_actions : dict Explained in ActionCollections. action_id : str The uuid of the action. cfg : ReplayConfig Replay configuration options. Contains the modified usage driver. ''' command_specific_md_context_has_been_printed = False plugin = node.action.plugin action = node.action.action_name plg_action_name = ns.uniquify_action_name(plugin, action) inputs = _collect_action_inputs(cfg.use, ns, node) # Process outputs before params so we can access the unique output name # from the namespace when dumping metadata to files below raw_outputs = std_actions[action_id].items() outputs = _uniquify_output_names(ns, raw_outputs) for param_name, param_val in node.action.parameters.items(): # We can currently assume that None arguments are only passed to params # as default values, so we can skip these parameters entirely in replay if param_val is None: continue if isinstance(param_val, MetadataInfo): unique_md_id = ns.get_usg_var_record(node._uuid).name \ + '_' + param_name md_fn = ns.add_usg_var_record( unique_md_id, camel_to_snake(param_name) ) if cfg.dump_recorded_metadata: md_with_ext = md_fn + '.tsv' dump_recorded_md_file( cfg, node, plg_action_name, param_name, md_with_ext ) if cfg.use_recorded_metadata: # the local dir and fp where md will be saved (if at all) is: md_fn = f'{plg_action_name}/{md_fn}' md = init_md_from_recorded_md( node, param_name, unique_md_id, ns, cfg, md_fn ) else: if not cfg.md_context_has_been_printed: cfg.md_context_has_been_printed = True cfg.use.comment( "Replay attempts to represent metadata inputs " "accurately, but metadata .tsv files are merged " "automatically by some interfaces, rendering " "distinctions between file inputs invisible in " "provenance. We output the recorded metadata to disk " "to enable visual inspection.") if not command_specific_md_context_has_been_printed: if cfg.md_out_dir: fp = f'{cfg.md_out_dir}/{plg_action_name}' else: fp = f'./recorded_metadata/{plg_action_name}/' cfg.use.comment( "The following command may have received additional " "metadata .tsv files. 
To confirm you have covered " "your metadata needs adequately, review the original " f"metadata, saved at '{fp}'") if not param_val.input_artifact_uuids: md = init_md_from_md_file( node, param_name, unique_md_id, ns, cfg ) else: md = init_md_from_artifacts(param_val, ns, cfg) param_val = md inputs.update({param_name: param_val}) usg_var = cfg.use.action( cfg.use.UsageAction(plugin_id=plugin, action_id=action), cfg.use.UsageInputs(**inputs), cfg.use.UsageOutputNames(**outputs) ) # add the usage variable(s) to the namespace for res in usg_var: uuid_key = ns.get_usg_var_uuid(res.name) ns.update_usg_var_record(uuid_key, res) def _collect_action_inputs( use: Usage, ns: ReplayNamespaces, node: ProvNode ) -> dict: ''' Returns a dict containing the action Inputs for a ProvNode. Dict structure: {input_name: input_var} or {input_name: [input_var1, ...]}. Parameters ---------- use : Usage The currently executing usage driver. ns : ReplayNamespaces Info tracking usage and result collection namespaces. node : ProvNode The node the creating action of which's inputs are of interest. Returns ------- dict Mapping input names to their corresponding usage variables. ''' inputs_dict = {} for input_name, input_value in node.action.inputs.items(): # Currently we can only have a None as a default value, so we can skip # this as it was not provided if input_value is None: continue # Received a single artifact if type(input_value) is str: if ns.get_usg_var_record(input_value) is None: ns.add_rc_member_to_ns(input_value, input_name, use) resolved_input = ns.get_usg_var_record(input_value).variable # Received a list of artifacts elif type(input_value) is list: # may be rc cast to list so search for equivalent rc # if not then follow algorithm for single str for each input_hash = ns.hash_result_collection(input_value) if collection_uuid := ns.rc_contents_to_rc_uuid.get(input_hash): # corresponding rc found resolved_input = ns.get_usg_var_record( collection_uuid ).variable else: # find each artifact and assemble into a list input_list = [] for input_value in input_value: if ns.get_usg_var_record(input_value) is None: ns.add_rc_member_to_ns(input_value, input_name, use) input_list.append( ns.get_usg_var_record(input_value).variable ) resolved_input = input_list # Received a dict of artifacts (ResultCollection) elif type(input_value) is dict: # search for equivalent rc if not found then create new rc rc = input_value input_hash = ns.hash_result_collection_with_keys(rc) if collection_uuid := ns.rc_contents_to_rc_uuid.get(input_hash): # corresponding rc found resolved_input = ns.get_usg_var_record( collection_uuid ).variable else: # build new rc new_rc = {} for key, input_value in rc.items(): if ns.get_usg_var_record(input_value) is None: ns.add_rc_member_to_ns(input_value, input_name, use) new_rc[key] = ns.get_usg_var_record(input_value).variable # make new rc usg var new_collection_uuid = uuid4() var_name = ns.add_usg_var_record( new_collection_uuid, input_name ) usg_var = use.construct_artifact_collection(var_name, new_rc) ns.update_usg_var_record(new_collection_uuid, usg_var) resolved_input = ns.get_usg_var_record( new_collection_uuid ).variable # If we ever mess with inputs again and add a new type here this should # trip otherwise we should never see it else: msg = f"Got a '{input_value}' as input which is of type" \ f" '{type(input_value)}'. Supported types are str, list," \ " and dict." 
raise ValueError(msg) inputs_dict[input_name] = resolved_input return inputs_dict def _uniquify_output_names( ns: ReplayNamespaces, raw_outputs: dict ) -> dict: ''' Returns a dict containing the uniquified output names from a ProvNode. Dict structure: {output_name: uniquified_output_name}. Parameters ---------- ns : ReplayNamespaces Info tracking usage and result collection namespaces. raw_outputs : dict Mapping of node uuid to output-name as seen in action.yaml. Returns ------- dict Mapping of original output-name to output-name after being made unique. ''' outputs = {} for uuid, output_name in raw_outputs: var_name = ns.add_usg_var_record(uuid, output_name) outputs.update({output_name: var_name}) return outputs def init_md_from_recorded_md( node: ProvNode, param_name: str, md_id: str, ns: ReplayNamespaces, cfg: ReplayConfig, md_fn: str ) -> UsageVariable: ''' Initializes and returns a Metadata UsageVariable from Metadata parsed from provenance. Parameters ---------- node : ProvNode The node the creating action of which was passed metadata. param_name : str The name of the parameter to which metadata was passed. md_id : str Looks like: f'{node uuid}_{param name}'. ns : ReplayNamespaces Namespaces associated with provenance replay. cfg : ReplayConfig Replay configuration options. Contains the executing usage driver. md_fn : str Looks like: f'{plugin}_{action}_{counter}/{unique param name}. Returns ------- UsageVariable Of type metadata or metadata column. Raises ------ ValueError If the node has no metadata. ''' if not node.metadata: raise ValueError( 'This function should only be called if the node has metadata.' ) md_df = node.metadata[param_name] def factory(): from qiime2 import Metadata return Metadata(md_df) cwd = pathlib.Path.cwd() if cfg.md_out_dir: fn = str(cwd / cfg.md_out_dir / md_fn) else: fn = str(cwd / 'recorded_metadata' / md_fn) md = cfg.use.init_metadata( ns.get_usg_var_record(md_id).name, factory, dumped_md_fn=fn ) plugin = node.action.plugin action = node.action.action_name if param_is_metadata_column(cfg, param_name, plugin, action): mdc_id = node._uuid + '_mdc' mdc_name = ns.get_usg_var_record(md_id).name + '_mdc' var_name = ns.add_usg_var_record(mdc_id, mdc_name) md = cfg.use.get_metadata_column(var_name, '', md) return md def init_md_from_md_file( node: ProvNode, param_name: str, md_id: str, ns: ReplayNamespaces, cfg: ReplayConfig ) -> UsageVariable: ''' Initializes and returns a Metadata UsageVariable with no real data, mimicking a user passing md as a .tsv file. Parameters ---------- node : ProvNode The node the creating action of which was passed a metadata file. param_name : str The parameter name to which the metadata file was passed. md_id : str Looks like: f'{node uuid}_{param name}'. ns : ReplayNamespaces Namespaces associated with provenance replay. cfg : ReplayConfig Replay configuration options. Contains the executing usage driver. Returns ------- UsageVariable Of type metadata or metadata column. 
''' plugin = node.action.plugin action = node.action.action_name md = cfg.use.init_metadata(ns.get_usg_var_record(md_id).name, lambda: None) if param_is_metadata_column(cfg, param_name, plugin, action): mdc_id = node._uuid + '_mdc' mdc_name = ns.get_usg_var_record(md_id).name + '_mdc' var_name = ns.add_usg_var_record(mdc_id, mdc_name) md = cfg.use.get_metadata_column(var_name, '', md) return md def init_md_from_artifacts( md_inf: MetadataInfo, ns: ReplayNamespaces, cfg: ReplayConfig ) -> UsageVariable: ''' Initializes and returns a Metadata UsageVariable with no real data, mimicking a user passing one or more QIIME 2 Artifacts as metadata. We expect these usage vars are already in the namespace as artifacts if we're reading them in as metadata. Parameters ---------- md_inf : MetadataInfo Named tuple with fields `input_artifact_uuids` which is a list of uuids and `relative_fp` which is the filename of the metadata file. These are parsed from a !metadata tag in action.yaml. ns : ReplayNamespaces Info tracking usage and result collection namespaces. cfg : ReplayConfig Replay configuration options. Contains the executing usage driver. Returns ------- UsageVariable Of type metadata. Raises ------ ValueError If no input artifact uuids are present in MetadataInfo. ''' if not md_inf.input_artifact_uuids: raise ValueError( 'This function should not be used if ' 'MetadataInfo.input_artifact_uuids is empty.' ) md_files_in = [] for artifact_uuid in md_inf.input_artifact_uuids: amd_id = artifact_uuid + '_a' var_name = ns.get_usg_var_record(artifact_uuid).variable.name + '_a' if ns.get_usg_var_record(amd_id) is None: var_name = ns.add_usg_var_record(amd_id, var_name) art_as_md = cfg.use.view_as_metadata( var_name, ns.get_usg_var_record(artifact_uuid).variable ) ns.update_usg_var_record(amd_id, art_as_md) else: art_as_md = ns.get_usg_var_record(amd_id).variable md_files_in.append(art_as_md) if len(md_inf.input_artifact_uuids) > 1: # we can't uniquify this normally, because one uuid can be merged with # combinations of others merge_id = '-'.join(md_inf.input_artifact_uuids) var_name = ns.add_usg_var_record(merge_id, 'merged_artifacts') art_as_md = cfg.use.merge_metadata( var_name, *md_files_in ) ns.update_usg_var_record(merge_id, art_as_md) return art_as_md def dump_recorded_md_file( cfg: ReplayConfig, node: ProvNode, action_name: str, md_id: str, fn: str ): ''' Writes one metadata DataFrame pointed to by `md_id` to a .tsv file. Each action gets its own directory containing relevant md files. Raises a ValueError if the node has no metadata. Parameters ---------- cfg : ReplayConfig Replay configuration options. Contains the executing usage driver. node : ProvNode The node the creating action of which recorded metadata as input. Used here only to ensure that metadata was in fact recorded. action_name : str Looks like: f'{plugin}_{action}_{counter}'. md_id : str Looks like: f'{node uuid}_{param name}'. fn : str Looks like: f'{unique param name}.tsv'. Raises ------ ValueError If the passed node does not have metadata in its creating action. ''' if node.metadata is None: raise ValueError( 'This function should only be called if the node has metadata.'
) if cfg.md_out_dir: md_out_dir_base = pathlib.Path(cfg.md_out_dir) else: cwd = pathlib.Path.cwd() md_out_dir_base = cwd / 'recorded_metadata' action_dir = md_out_dir_base / action_name action_dir.mkdir(parents=True, exist_ok=True) md_df = node.metadata[md_id] out_fp = action_dir / fn md_df.to_csv(out_fp, sep='\t', index=False) def param_is_metadata_column( cfg: ReplayConfig, param: str, plugin: str, action: str ) -> bool: ''' Returns True if the parameter name `param` is registered as a MetadataColumn. Parameters ---------- cfg : ReplayConfig Replay configuration options. Contains the plugin manager object. param : str The name of the parameter of interest. plugin : str The plugin that the relevant action belongs to. action : str The action that has the parameter of interest. Returns ------- bool Indicating whether the parameter of interest is a MetadataColumn. Raises ------ KeyError - If the plugin of interest is not registered with the plugin manager. - If the action of interest is not registered with the plugin. - If the parameter is not in the signature of the action. ''' plugin = cfg.pm.get_plugin(id=plugin) try: action_f = plugin.actions[action] except KeyError: raise KeyError( f'No action registered with name {action} in plugin {plugin}.' ) try: param_spec = action_f.signature.parameters[param] except KeyError: raise KeyError( f'No parameter registered with name {param} in action {action}.' ) # HACK, but it works without relying on Q2's type system return 'MetadataColumn' in str(param_spec.qiime_type) def collect_citations( dag: ProvDAG, deduplicate: bool = True ) -> bp.bibdatabase.BibDatabase: ''' Returns a BibDatabase of all unique citations from a ProvDAG. If `deduplicate` is True, references will be heuristically deduplicated. Parameters ---------- dag : ProvDAG The ProvDAG object whose nodes contain citations to collect. deduplicate : bool Whether to deduplicate redundant citations. Returns ------- bp.bibdatabase.BibDatabase A BibDatabase object containing the collected citations in bibtex format. ''' bdb = bp.bibdatabase.BibDatabase() citations = [] for node_uuid in dag: node = dag.get_node_data(node_uuid) # Skip no-prov nodes, which never have citations anyway if node is not None: node_citations = list(node.citations.values()) citations.extend(node_citations) if deduplicate: citations = dedupe_citations(citations) bdb.entries = citations return bdb class BibContent(): ''' A hashable data container capturing common bibtex fields. Has many fields because keeping true duplicates is preferable to deduplicating two true non-duplicates. Parameters ---------- entry : dict A dictionary representing a single bibtex entry. ''' def __init__(self, entry): self.title = entry.get('title') self.author = entry.get('author') self.journal = entry.get('journal') self.booktitle = entry.get('booktitle') self.year = entry.get('year') self.pages = entry.get('pages') def __eq__(self, other): return ( type(self) is type(other) and self.title == other.title and self.author == other.author and self.journal == other.journal and self.booktitle == other.booktitle and self.year == other.year and self.pages == other.pages ) def __hash__(self): return hash( str(self.title) + str(self.author) + str(self.journal) + str(self.booktitle) + str(self.year) + str(self.pages) ) def dedupe_citations(citations: List[Dict]) -> List[Dict]: ''' Deduplicates citations based on bibtex id, bibtex content, and DOI. Citations are not guaranteed to be truly unique after deduplicating based on these values. 
Ensures only one qiime2 framework citation. Parameters ---------- citations : list of dict The possibly redundant citations, each dict is a bibtex citation. Returns ------- list of dict The deduplicated citations. ''' deduped_citations = [] is_framework_cited = False id_set = set() doi_set = set() content_set = set() for entry in citations: citation_id = entry['ID'] if 'framework|qiime2' in citation_id: if not is_framework_cited: root = pkg_resources.resource_filename('qiime2', '.') root = os.path.abspath(root) path = os.path.join(root, 'citations.bib') with open(path) as bibtex_file: q2_entry = bp.load(bibtex_file).entries.pop() q2_entry['ID'] = citation_id id_set.add(citation_id) deduped_citations.append(q2_entry) is_framework_cited = True continue # dedupe on id if citation_id in id_set: continue # dedupe on content entry_content = BibContent(entry) if entry_content in content_set: continue else: content_set.add(entry_content) # dedupe on doi if present doi = entry.get('doi') if doi is None: id_set.add(citation_id) deduped_citations.append(entry) elif doi not in doi_set: id_set.add(citation_id) doi_set.add(doi) deduped_citations.append(entry) return deduped_citations def replay_citations( dag: ProvDAG, out_fp: str, deduplicate: bool = True, suppress_header: bool = False ): ''' Writes a bibtex file containing all citations from a ProvDAG to disk. If `deduplicate` is True citations will be deduplicated, see `dedupe_citations()` for details. Parameters ---------- dag : ProvDAG The provenance graph from which to collect citations. out_fp : str The filepath to which to write the citations. deduplicate : bool Whether to deduplicate the collected citations. suppress_header : bool Whether to forgo adding a header and footer to the output file. ''' bib_db = collect_citations(dag, deduplicate=deduplicate) boundary = '#' * 79 header = [] footer = [] extra = [ '', '# This bibtex-formatted citation file can be imported into ' 'popular citation ', '# managers like Zotero and Mendeley, simplifying management and ' 'formatting.' ] if not suppress_header: header = build_header(boundary=boundary, extra_text=extra) + ['\n'] footer = build_footer(dag=dag, boundary=boundary) if bib_db.entries_dict == {}: bib_db = 'No citations were registered to the used Actions.' with open(out_fp, 'w') as bibfile: bibfile.write(bib_db) else: with open(out_fp, 'w') as bibfile: bibfile.write('\n'.join(header)) bibfile.write(BibTexWriter().write(bib_db)) bibfile.write('\n'.join(footer)) def replay_supplement( usage_drivers: List[Usage], payload: Union[str, ProvDAG], out_fp: str, validate_checksums: bool = True, parse_metadata: bool = True, use_recorded_metadata: bool = False, recurse: bool = False, deduplicate: bool = True, suppress_header: bool = False, verbose: bool = True, dump_recorded_metadata: bool = True ): ''' Produces a zipfile package of useful documentation for in silico reproducibility of some QIIME 2 Result(s) from a ProvDAG, a QIIME 2 Artifact, or a directory of Artifacts. Package includes: - replay scripts for all supported interfaces - a bibtex-formatted collection of all citations ProvDAG inputs retain their original config values. The `validate_checksums`, `parse_metadata`, `recurse`, and `verbose` parameters are disregarded if the payload is a ProvDAG. Parameters ---------- usage_drivers : list of Usage The types of Usage drivers to use. Currently intended to consist of `ReplayPythonUsage`, `ReplayCLIUsage`. 
payload : str or ProvDAG A filepath to an artifact or directory containing artifacts, or the ProvDAG to be parsed. out_fp : str The filepath to which to write the zip file. validate_checksums : bool Whether to perform checksum validation on the input artifact. parse_metadata : bool Whether to parse study metadata recorded in provenance. use_recorded_metadata : bool Whether to use the metadata recorded in provenance. recurse : bool Whether to recursively parse nested directories containing artifacts. deduplicate : bool Whether to deduplicate citations collected from provenance. suppress_header : bool Whether to forgo rendering the header and footer that are included by default in replay scripts. verbose : bool Whether to print status messages during processing. dump_recorded_metadata : bool Whether to write the metadata recorded in provenance to disk. ''' dag = ProvDAG( artifact_data=payload, validate_checksums=validate_checksums, parse_metadata=parse_metadata, recurse=recurse, verbose=verbose ) with tempfile.TemporaryDirectory() as tempdir: tempdir_path = pathlib.Path(tempdir) arc_root = tempdir_path / pathlib.Path(out_fp).stem os.makedirs(arc_root) drivers_to_filenames = { 'ReplayPythonUsage': 'python3_replay.py', 'ReplayCLIUsage': 'cli_replay.sh', } for usage_driver in usage_drivers: if usage_driver.__name__ not in drivers_to_filenames: continue rel_fp = drivers_to_filenames[usage_driver.__name__] md_out_dir = arc_root / 'recorded_metadata' tmp_fp = arc_root / rel_fp replay_provenance( usage_driver=usage_driver, payload=dag, out_fp=str(tmp_fp), use_recorded_metadata=use_recorded_metadata, suppress_header=suppress_header, verbose=verbose, dump_recorded_metadata=dump_recorded_metadata, md_out_dir=md_out_dir ) print( f'The {usage_driver.__name__} replay script was written to ' f'{rel_fp}.' ) citations_fp = arc_root / 'citations.bib' replay_citations( dag, out_fp=str(citations_fp), deduplicate=deduplicate, suppress_header=suppress_header ) print('The citations bibtex file was written to citations.bib.') out_fp = pathlib.Path(os.path.realpath(out_fp)) if out_fp.suffix == '.zip': out_fp = out_fp.with_suffix('') shutil.make_archive(out_fp, 'zip', tempdir) print(f'The reproducibility package was written to {out_fp}.zip.') qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/000077500000000000000000000000001462552636000230325ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/__init__.py000066400000000000000000000005351462552636000251460ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/000077500000000000000000000000001462552636000237435ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/cite_many.zip000066400000000000000000000111061462552636000264360ustar00rootroot00000000000000PKqRQ ֖. 
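# ---------------------------------------------------------------------------
# Editor's sketch (not part of the distributed sources): a minimal example of
# driving the replay API defined above. The import paths below are
# assumptions inferred from this repository's layout (the module containing
# these functions and a sibling module providing the ReplayPythonUsage and
# ReplayCLIUsage drivers under qiime2.core.archive.provenance_lib); adjust
# them to match the installed package. 'table.qza' is a hypothetical artifact
# (or it could be a directory of artifacts) with captured provenance.
# ---------------------------------------------------------------------------
from qiime2.core.archive.provenance_lib.replay import (  # assumed path
    ProvDAG, replay_provenance, replay_citations, replay_supplement)
from qiime2.core.archive.provenance_lib.usage_drivers import (  # assumed path
    ReplayPythonUsage, ReplayCLIUsage)

# Parse provenance once; the resulting DAG can be reused for several outputs.
dag = ProvDAG('table.qza')

# Render a Python 3 replay script for the recorded analysis.
replay_provenance(
    usage_driver=ReplayPythonUsage, payload=dag, out_fp='python3_replay.py')

# Write a deduplicated bibtex file of every citation found in provenance.
replay_citations(dag, out_fp='citations.bib')

# Or build the complete reproducibility package (replay scripts for all
# supported interfaces plus citations) as a single zip archive.
replay_supplement(
    usage_drivers=[ReplayPythonUsage, ReplayCLIUsage],
    payload='table.qza', out_fp='reproducibility-supplement.zip')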
cite_many.bibUT )g`Lg`ux dZvHW|H%sZO[rIjʛ9I I Lծe/z3J۫H#2ƍ IRGLILp?b~~*c_NLVyO/e.dk+S=Jjq-3nmZK+Uax#AqX"Uо?%~@U)MI~e*BF`*-= '~[%-g-;)_fvUw-trlpvMD)~-`࿒ h83ǡ]b: B;(/z7j!K5 %7%r>|tJdCtK0AYav 3U3 6Su[dbkjOh*_M VE7W67yitP1TQ`SHR˜w؃)x}?ARƹDyo33H|RJ)^eZr #u{2*OqAUZ=2 a jaPz0Au_@$"2WKf&@WC ~`2 Ʈꊍ&5F݃|w0G}!i~O ,Sj#Kp xܔ0;7Yy/ io?؛P[QqZP5UgD:GhdT"z,u娗tdLLɄ~\iʲIYhg[vI-eC(zo?-i13.TϺ?F{|_NLAGUR!gMz$mz1ezϹtzhqv;8F&q۬T鄙RKAY;yF29 J 8E*ZJsk24RlL /52 xf0%YUܔbA5!g82UxҶ$ACK0L{FSL8htV2QZJr S %H46A1k QN!VֳB 4I :2:-VchVS:qVUבPZ,Ra&Fҏ˲5יQqp(cKbbs:%5iLX`{a7PD51T/OQ = E[!CmHa!fvК6s-?#̂]'L $oQXg]XN`p,xfJra3r@3GkS ILLfq9Wfh[f)dmp4HC\ A56sJr@YJ'=;vQՀȃ+'>U +]z((>T_ga֊n֮~TŀYE#"V"l<ԑIJYHvÊ3#7qF.7J!o@hgFpVo?%iShgt>m68yR3sQUZX[sPh?>%N=BCTi.c.{,u0EJYWb\G7> Mn){d?ՙS9S#~CDcWB~]FWjC73,"$ ˮbB!HuT4 ^zڿ02WF:B\p6Dpd[iUٵjC)Ot6\!.*+UZh N *Nh| bٜIef$x^KO9XNQl֩Fcn`ù W|s͉rVGk+stoM%\0\}.k k8:<Mh*Z4UFa;kP6~0}y]nKbb[Z썺/^1SW V`gp`Vun?F"Bk-A~JVoPG`H$]dܫ pl*gq2YJ4]h f9xk_ D|fN.[sϧp(N1ys4بfJk:xMC8*saLBu) E[1RBWt=kX#\R޺w %cO .|RLẃc0j>h)C)G]пd/}MHyPHܐ0VJrȹtѩqY_ч2ӛܞС' v*TETwe`<Բ&y-}ic? <_Zhd $n[Ot=@ŻT"\] )] kd*NU|5˷E3kTf tiM=qZ9ԤH@ŭk30Ɩm,!mMmdM]x'.xt^6K ^JU|Ҥ5ۉߊwrX+~tazc MU^3b$[»0$2򊳙> f8aR-+Ҧ `% +ؖz~ .Spp804V\G[n1.oʛI? kDP3S-8b}vjޱ'D]>y:l7]&^2v^?1MiFڿG !Dd˾\9Rm^U_E}G_xSw(_{m8Ii &+>V^ O'/ϩoqz|8Gg?z}ПE&׷= w{;~st9X q^5ʗJ"&nZu=A7$>{bo2ߏGO.g^dOP9u(_s>NnN%+f[5"m8Ukb _L~/|1:hKz!o[:<:Ğh3E O _U;`WZ{3 j^|mUCc][5^u"S[UW]nMjD!4P|dQiӫnja9)\\*9Mb9 r nΩiuU\lZacTk.QS3Gf~j3MR3y^)O ﳮƹ2sz9cg/LyMűtMؠS!#6_}>&GAϊx܏,dcUz]]go'T^Qk[RF^󜂄ZV`!ƫJfNl# xpTmД`Mh܋L<NAV@fFܸvfרf%r*H 8xp3_DjXPkH0kJtϭ~9F{9[spK"luryA[ѯ1EpC`؇}5p"㽔88"F/Ev!4Eč{64OeCl2Ng%.Ĺ5+;,u9bcԼ >"VD/B3am?e|tn/Ȍ 9))+TDY#IOUC; 3Nc>ť5\BrFyoc] gEe5;;dMϔ'֐TJ#+q9DGZRuCÔL=j 7hQ_!ܥæCGҟ(;}()k6pSt*EYzKQ>ɥCn`vx3 bq^*`īrߒ̓$?(2n80^D=t1)PQQjh6K/22xPRJi3R3 ,Y&NjХSDh殱2ww%2vJ\\-_%,.^3GNxS/ȧb䊑>B^@9=P9cg=Ͽb@H*7jCy"K \gy}$R?UIG!9&v˹$)"#G!fH(N!aV.iWӮ;}&)Mi=^!5I}}V W3Þ貌ET֭૫J>vw;/l޼v?t^ "ɤނ>nh<6E\,4Gdjk얗S5 RCӀ*U=&1XU:;QF1SrtZ%%MKv{OmWV61{QQڃُ[[PKqRA cite_one.bibUTg`ux dPKRqiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/000077500000000000000000000000001462552636000270225ustar00rootroot000000000000005b929500-e4d6-4d3f-8f5f-93fd95d1117d/000077500000000000000000000000001462552636000335515ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1VERSION000066400000000000000000000000471462552636000346220ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/5b929500-e4d6-4d3f-8f5f-93fd95d1117dQIIME 2 archive: 1 framework: 2017.2.0 data/000077500000000000000000000000001462552636000344625ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/5b929500-e4d6-4d3f-8f5f-93fd95d1117dints.txt000066400000000000000000000000261462552636000361760ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/5b929500-e4d6-4d3f-8f5f-93fd95d1117d/data1 2 3 1 2 3 4 8 7 100 metadata.yaml000066400000000000000000000001411462552636000362110ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/5b929500-e4d6-4d3f-8f5f-93fd95d1117duuid: 5b929500-e4d6-4d3f-8f5f-93fd95d1117d type: IntSequence1 format: IntSequenceDirectoryFormat 
provenance/000077500000000000000000000000001462552636000357115ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/5b929500-e4d6-4d3f-8f5f-93fd95d1117dVERSION000066400000000000000000000000471462552636000367620ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/5b929500-e4d6-4d3f-8f5f-93fd95d1117d/provenanceQIIME 2 archive: 1 framework: 2017.2.0 action/000077500000000000000000000000001462552636000371665ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/5b929500-e4d6-4d3f-8f5f-93fd95d1117d/provenanceaction.yaml000066400000000000000000000022531462552636000413310ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/5b929500-e4d6-4d3f-8f5f-93fd95d1117d/provenance/actionexecution: uuid: f8c2524f-d01d-460b-9519-686612a683c4 runtime: start: 2023-07-26T15:13:13.650134-07:00 end: 2023-07-26T15:13:13.657998-07:00 duration: 7864 microseconds action: type: method plugin: !ref 'environment:plugins:dummy-plugin' action: concatenate_ints inputs: - ints1: 96c7ce7b-837f-4046-a826-44843c7e1846 - ints2: 96c7ce7b-837f-4046-a826-44843c7e1846 - ints3: 9e8ffbe3-9207-49f0-9137-45e333d63261 parameters: - int1: 7 - int2: 100 environment: platform: macosx-10.6-x86_64 python: |- 3.5.6 |Anaconda, Inc.| (default, Aug 26 2018, 16:30:03) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] framework: 2017.2.0 plugins: dummy-plugin: version: 0.0.0-dev website: https://github.com/qiime2/qiime2 python-packages: qiime2: 2017.2.0 wheel: 0.37.1 tzlocal: '2.1' six: 1.16.0 setuptools: 40.2.0 PyYAML: 5.3.1 pytz: '2023.3' python-dateutil: 2.8.2 pip: 10.0.1 pandas: 0.25.3 numpy: 1.18.5 decorator: 5.1.1 certifi: 2020.6.20 artifacts/000077500000000000000000000000001462552636000376715ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/5b929500-e4d6-4d3f-8f5f-93fd95d1117d/provenance96c7ce7b-837f-4046-a826-44843c7e1846/000077500000000000000000000000001462552636000442665ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/5b929500-e4d6-4d3f-8f5f-93fd95d1117d/provenance/artifacts96c7ce7b-837f-4046-a826-44843c7e1846/VERSION000066400000000000000000000000471462552636000453370ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/5b929500-e4d6-4d3f-8f5f-93fd95d1117d/provenance/artifactsQIIME 2 archive: 1 framework: 2017.2.0 96c7ce7b-837f-4046-a826-44843c7e1846/action/000077500000000000000000000000001462552636000455435ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/5b929500-e4d6-4d3f-8f5f-93fd95d1117d/provenance/artifacts96c7ce7b-837f-4046-a826-44843c7e1846/action/action.yaml000066400000000000000000000014301462552636000477020ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/5b929500-e4d6-4d3f-8f5f-93fd95d1117d/provenance/artifactsexecution: uuid: 0dc30d09-3f48-4791-8186-3e4ef8b31ba9 runtime: start: 2023-07-26T15:13:13.634950-07:00 end: 2023-07-26T15:13:13.639495-07:00 duration: 4545 microseconds action: type: import environment: platform: macosx-10.6-x86_64 python: |- 3.5.6 |Anaconda, Inc.| (default, Aug 26 2018, 16:30:03) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] framework: 2017.2.0 plugins: {} python-packages: qiime2: 
2017.2.0 wheel: 0.37.1 tzlocal: '2.1' six: 1.16.0 setuptools: 40.2.0 PyYAML: 5.3.1 pytz: '2023.3' python-dateutil: 2.8.2 pip: 10.0.1 pandas: 0.25.3 numpy: 1.18.5 decorator: 5.1.1 certifi: 2020.6.20 96c7ce7b-837f-4046-a826-44843c7e1846/metadata.yaml000066400000000000000000000001411462552636000467260ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/5b929500-e4d6-4d3f-8f5f-93fd95d1117d/provenance/artifactsuuid: 96c7ce7b-837f-4046-a826-44843c7e1846 type: IntSequence1 format: IntSequenceDirectoryFormat 9e8ffbe3-9207-49f0-9137-45e333d63261/000077500000000000000000000000001462552636000442565ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/5b929500-e4d6-4d3f-8f5f-93fd95d1117d/provenance/artifacts9e8ffbe3-9207-49f0-9137-45e333d63261/VERSION000066400000000000000000000000471462552636000453270ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/5b929500-e4d6-4d3f-8f5f-93fd95d1117d/provenance/artifactsQIIME 2 archive: 1 framework: 2017.2.0 9e8ffbe3-9207-49f0-9137-45e333d63261/action/000077500000000000000000000000001462552636000455335ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/5b929500-e4d6-4d3f-8f5f-93fd95d1117d/provenance/artifacts9e8ffbe3-9207-49f0-9137-45e333d63261/action/action.yaml000066400000000000000000000014301462552636000476720ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/5b929500-e4d6-4d3f-8f5f-93fd95d1117d/provenance/artifactsexecution: uuid: 1bbaa33e-8a99-4f7d-88f6-50ae6e992e98 runtime: start: 2023-07-26T15:13:13.644943-07:00 end: 2023-07-26T15:13:13.646660-07:00 duration: 1717 microseconds action: type: import environment: platform: macosx-10.6-x86_64 python: |- 3.5.6 |Anaconda, Inc.| (default, Aug 26 2018, 16:30:03) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] framework: 2017.2.0 plugins: {} python-packages: qiime2: 2017.2.0 wheel: 0.37.1 tzlocal: '2.1' six: 1.16.0 setuptools: 40.2.0 PyYAML: 5.3.1 pytz: '2023.3' python-dateutil: 2.8.2 pip: 10.0.1 pandas: 0.25.3 numpy: 1.18.5 decorator: 5.1.1 certifi: 2020.6.20 9e8ffbe3-9207-49f0-9137-45e333d63261/metadata.yaml000066400000000000000000000001411462552636000467160ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/5b929500-e4d6-4d3f-8f5f-93fd95d1117d/provenance/artifactsuuid: 9e8ffbe3-9207-49f0-9137-45e333d63261 type: IntSequence2 format: IntSequenceDirectoryFormat metadata.yaml000066400000000000000000000001411462552636000403510ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v1/5b929500-e4d6-4d3f-8f5f-93fd95d1117d/provenanceuuid: 5b929500-e4d6-4d3f-8f5f-93fd95d1117d type: IntSequence1 format: IntSequenceDirectoryFormat qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/000077500000000000000000000000001462552636000270235ustar00rootroot00000000000000e01f0484-40d4-420e-adcf-ca9be58ed1ee/000077500000000000000000000000001462552636000340755ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2VERSION000066400000000000000000000000471462552636000351460ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/e01f0484-40d4-420e-adcf-ca9be58ed1eeQIIME 2 archive: 2 framework: 2017.9.0 
data/000077500000000000000000000000001462552636000350065ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/e01f0484-40d4-420e-adcf-ca9be58ed1eeints.txt000066400000000000000000000000261462552636000365220ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/e01f0484-40d4-420e-adcf-ca9be58ed1ee/data1 2 3 1 2 3 4 8 7 100 metadata.yaml000066400000000000000000000001411462552636000365350ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/e01f0484-40d4-420e-adcf-ca9be58ed1eeuuid: e01f0484-40d4-420e-adcf-ca9be58ed1ee type: IntSequence1 format: IntSequenceDirectoryFormat provenance/000077500000000000000000000000001462552636000362355ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/e01f0484-40d4-420e-adcf-ca9be58ed1eeVERSION000066400000000000000000000000471462552636000373060ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/e01f0484-40d4-420e-adcf-ca9be58ed1ee/provenanceQIIME 2 archive: 2 framework: 2017.9.0 action/000077500000000000000000000000001462552636000375125ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/e01f0484-40d4-420e-adcf-ca9be58ed1ee/provenanceaction.yaml000066400000000000000000000023161462552636000416550ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/e01f0484-40d4-420e-adcf-ca9be58ed1ee/provenance/actionexecution: uuid: c6faae37-2d5d-4182-ba9a-e41bdf393e33 runtime: start: 2023-07-26T15:19:08.493600-07:00 end: 2023-07-26T15:19:08.500628-07:00 duration: 7028 microseconds action: type: method plugin: !ref 'environment:plugins:dummy-plugin' action: concatenate_ints inputs: - ints1: 37db1974-0ff9-487b-8ac7-f611a5af5ad0 - ints2: 37db1974-0ff9-487b-8ac7-f611a5af5ad0 - ints3: 4dea926f-fe8c-463f-bfc8-c6335bea3384 parameters: - int1: 7 - int2: 100 output-name: concatenated_ints environment: platform: macosx-10.6-x86_64 python: |- 3.5.6 |Anaconda, Inc.| (default, Aug 26 2018, 16:30:03) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] framework: 2017.9.0 plugins: dummy-plugin: version: 0.0.0-dev website: https://github.com/qiime2/qiime2 python-packages: wheel: 0.37.1 tzlocal: '2.1' six: 1.16.0 setuptools: 40.2.0 qiime2: 2017.9.0 PyYAML: 5.3.1 pytz: '2023.3' python-dateutil: 2.8.2 pip: 10.0.1 pandas: 0.25.3 numpy: 1.18.5 decorator: 5.1.1 certifi: 2020.6.20 artifacts/000077500000000000000000000000001462552636000402155ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/e01f0484-40d4-420e-adcf-ca9be58ed1ee/provenance37db1974-0ff9-487b-8ac7-f611a5af5ad0/000077500000000000000000000000001462552636000451475ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/e01f0484-40d4-420e-adcf-ca9be58ed1ee/provenance/artifacts37db1974-0ff9-487b-8ac7-f611a5af5ad0/VERSION000066400000000000000000000000471462552636000462200ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/e01f0484-40d4-420e-adcf-ca9be58ed1ee/provenance/artifactsQIIME 2 archive: 2 framework: 2017.9.0 
37db1974-0ff9-487b-8ac7-f611a5af5ad0/action/000077500000000000000000000000001462552636000464245ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/e01f0484-40d4-420e-adcf-ca9be58ed1ee/provenance/artifacts37db1974-0ff9-487b-8ac7-f611a5af5ad0/action/action.yaml000066400000000000000000000014301462552636000505630ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/e01f0484-40d4-420e-adcf-ca9be58ed1ee/provenance/artifactsexecution: uuid: 1a13eb82-979f-4500-b38c-41cd035e23da runtime: start: 2023-07-26T15:19:08.479684-07:00 end: 2023-07-26T15:19:08.483349-07:00 duration: 3665 microseconds action: type: import environment: platform: macosx-10.6-x86_64 python: |- 3.5.6 |Anaconda, Inc.| (default, Aug 26 2018, 16:30:03) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] framework: 2017.9.0 plugins: {} python-packages: wheel: 0.37.1 tzlocal: '2.1' six: 1.16.0 setuptools: 40.2.0 qiime2: 2017.9.0 PyYAML: 5.3.1 pytz: '2023.3' python-dateutil: 2.8.2 pip: 10.0.1 pandas: 0.25.3 numpy: 1.18.5 decorator: 5.1.1 certifi: 2020.6.20 37db1974-0ff9-487b-8ac7-f611a5af5ad0/metadata.yaml000066400000000000000000000001411462552636000476070ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/e01f0484-40d4-420e-adcf-ca9be58ed1ee/provenance/artifactsuuid: 37db1974-0ff9-487b-8ac7-f611a5af5ad0 type: IntSequence1 format: IntSequenceDirectoryFormat 4dea926f-fe8c-463f-bfc8-c6335bea3384/000077500000000000000000000000001462552636000453175ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/e01f0484-40d4-420e-adcf-ca9be58ed1ee/provenance/artifacts4dea926f-fe8c-463f-bfc8-c6335bea3384/VERSION000066400000000000000000000000471462552636000463700ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/e01f0484-40d4-420e-adcf-ca9be58ed1ee/provenance/artifactsQIIME 2 archive: 2 framework: 2017.9.0 4dea926f-fe8c-463f-bfc8-c6335bea3384/action/000077500000000000000000000000001462552636000465745ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/e01f0484-40d4-420e-adcf-ca9be58ed1ee/provenance/artifacts4dea926f-fe8c-463f-bfc8-c6335bea3384/action/action.yaml000066400000000000000000000014301462552636000507330ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/e01f0484-40d4-420e-adcf-ca9be58ed1ee/provenance/artifactsexecution: uuid: 631c3001-67b4-4373-9242-8016d5b27d4b runtime: start: 2023-07-26T15:19:08.488210-07:00 end: 2023-07-26T15:19:08.490148-07:00 duration: 1938 microseconds action: type: import environment: platform: macosx-10.6-x86_64 python: |- 3.5.6 |Anaconda, Inc.| (default, Aug 26 2018, 16:30:03) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] framework: 2017.9.0 plugins: {} python-packages: wheel: 0.37.1 tzlocal: '2.1' six: 1.16.0 setuptools: 40.2.0 qiime2: 2017.9.0 PyYAML: 5.3.1 pytz: '2023.3' python-dateutil: 2.8.2 pip: 10.0.1 pandas: 0.25.3 numpy: 1.18.5 decorator: 5.1.1 certifi: 2020.6.20 4dea926f-fe8c-463f-bfc8-c6335bea3384/metadata.yaml000066400000000000000000000001431462552636000477610ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/e01f0484-40d4-420e-adcf-ca9be58ed1ee/provenance/artifactsuuid: 4dea926f-fe8c-463f-bfc8-c6335bea3384 type: IntSequence2 format: 
IntSequenceV2DirectoryFormat metadata.yaml000066400000000000000000000001411462552636000406750ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v2/e01f0484-40d4-420e-adcf-ca9be58ed1ee/provenanceuuid: e01f0484-40d4-420e-adcf-ca9be58ed1ee type: IntSequence1 format: IntSequenceDirectoryFormat qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/000077500000000000000000000000001462552636000270245ustar00rootroot00000000000000aa960110-4069-4b7c-97a3-8a768875e515/000077500000000000000000000000001462552636000332445ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3VERSION000066400000000000000000000000471462552636000343150ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/aa960110-4069-4b7c-97a3-8a768875e515QIIME 2 archive: 3 framework: 2018.2.0 data/000077500000000000000000000000001462552636000341555ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/aa960110-4069-4b7c-97a3-8a768875e515ints.txt000066400000000000000000000000261462552636000356710ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/aa960110-4069-4b7c-97a3-8a768875e515/data1 2 3 1 2 3 4 8 7 100 metadata.yaml000066400000000000000000000001411462552636000357040ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/aa960110-4069-4b7c-97a3-8a768875e515uuid: aa960110-4069-4b7c-97a3-8a768875e515 type: IntSequence1 format: IntSequenceDirectoryFormat provenance/000077500000000000000000000000001462552636000354045ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/aa960110-4069-4b7c-97a3-8a768875e515VERSION000066400000000000000000000000471462552636000364550ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/aa960110-4069-4b7c-97a3-8a768875e515/provenanceQIIME 2 archive: 3 framework: 2018.2.0 action/000077500000000000000000000000001462552636000366615ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/aa960110-4069-4b7c-97a3-8a768875e515/provenanceaction.yaml000066400000000000000000000023161462552636000410240ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/aa960110-4069-4b7c-97a3-8a768875e515/provenance/actionexecution: uuid: 28536786-c8f5-46c1-839c-8961c0747bdd runtime: start: 2023-07-26T15:20:54.661709-07:00 end: 2023-07-26T15:20:54.669409-07:00 duration: 7700 microseconds action: type: method plugin: !ref 'environment:plugins:dummy-plugin' action: concatenate_ints inputs: - ints1: c3905e60-2702-418a-8527-7e9b799f32eb - ints2: c3905e60-2702-418a-8527-7e9b799f32eb - ints3: cfea882d-0b50-44af-97d9-b0b71ecae73e parameters: - int1: 7 - int2: 100 output-name: concatenated_ints environment: platform: macosx-10.6-x86_64 python: |- 3.5.6 |Anaconda, Inc.| (default, Aug 26 2018, 16:30:03) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] framework: 2018.2.0 plugins: dummy-plugin: version: 0.0.0-dev website: https://github.com/qiime2/qiime2 python-packages: wheel: 0.37.1 tzlocal: '2.1' six: 1.16.0 setuptools: 40.2.0 qiime2: 2018.2.0 PyYAML: 5.3.1 pytz: '2023.3' python-dateutil: 2.8.2 pip: 10.0.1 pandas: 0.25.3 numpy: 1.18.5 decorator: 5.1.1 certifi: 2020.6.20 
artifacts/000077500000000000000000000000001462552636000373645ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/aa960110-4069-4b7c-97a3-8a768875e515/provenancec3905e60-2702-418a-8527-7e9b799f32eb/000077500000000000000000000000001462552636000437475ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/aa960110-4069-4b7c-97a3-8a768875e515/provenance/artifactsc3905e60-2702-418a-8527-7e9b799f32eb/VERSION000066400000000000000000000000471462552636000450200ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/aa960110-4069-4b7c-97a3-8a768875e515/provenance/artifactsQIIME 2 archive: 3 framework: 2018.2.0 c3905e60-2702-418a-8527-7e9b799f32eb/action/000077500000000000000000000000001462552636000452245ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/aa960110-4069-4b7c-97a3-8a768875e515/provenance/artifactsc3905e60-2702-418a-8527-7e9b799f32eb/action/action.yaml000066400000000000000000000014301462552636000473630ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/aa960110-4069-4b7c-97a3-8a768875e515/provenance/artifactsexecution: uuid: a0ba8e1b-bf7b-46ec-b4a1-88fccfc3482a runtime: start: 2023-07-26T15:20:54.646821-07:00 end: 2023-07-26T15:20:54.651089-07:00 duration: 4268 microseconds action: type: import environment: platform: macosx-10.6-x86_64 python: |- 3.5.6 |Anaconda, Inc.| (default, Aug 26 2018, 16:30:03) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] framework: 2018.2.0 plugins: {} python-packages: wheel: 0.37.1 tzlocal: '2.1' six: 1.16.0 setuptools: 40.2.0 qiime2: 2018.2.0 PyYAML: 5.3.1 pytz: '2023.3' python-dateutil: 2.8.2 pip: 10.0.1 pandas: 0.25.3 numpy: 1.18.5 decorator: 5.1.1 certifi: 2020.6.20 c3905e60-2702-418a-8527-7e9b799f32eb/metadata.yaml000066400000000000000000000001411462552636000464070ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/aa960110-4069-4b7c-97a3-8a768875e515/provenance/artifactsuuid: c3905e60-2702-418a-8527-7e9b799f32eb type: IntSequence1 format: IntSequenceDirectoryFormat cfea882d-0b50-44af-97d9-b0b71ecae73e/000077500000000000000000000000001462552636000445305ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/aa960110-4069-4b7c-97a3-8a768875e515/provenance/artifactscfea882d-0b50-44af-97d9-b0b71ecae73e/VERSION000066400000000000000000000000471462552636000456010ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/aa960110-4069-4b7c-97a3-8a768875e515/provenance/artifactsQIIME 2 archive: 3 framework: 2018.2.0 cfea882d-0b50-44af-97d9-b0b71ecae73e/action/000077500000000000000000000000001462552636000460055ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/aa960110-4069-4b7c-97a3-8a768875e515/provenance/artifactscfea882d-0b50-44af-97d9-b0b71ecae73e/action/action.yaml000066400000000000000000000014301462552636000501440ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/aa960110-4069-4b7c-97a3-8a768875e515/provenance/artifactsexecution: uuid: 84428784-85be-46d9-a3c9-1b580e42eb74 runtime: start: 2023-07-26T15:20:54.655949-07:00 end: 2023-07-26T15:20:54.658281-07:00 duration: 2332 microseconds action: type: import 
environment: platform: macosx-10.6-x86_64 python: |- 3.5.6 |Anaconda, Inc.| (default, Aug 26 2018, 16:30:03) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] framework: 2018.2.0 plugins: {} python-packages: wheel: 0.37.1 tzlocal: '2.1' six: 1.16.0 setuptools: 40.2.0 qiime2: 2018.2.0 PyYAML: 5.3.1 pytz: '2023.3' python-dateutil: 2.8.2 pip: 10.0.1 pandas: 0.25.3 numpy: 1.18.5 decorator: 5.1.1 certifi: 2020.6.20 cfea882d-0b50-44af-97d9-b0b71ecae73e/metadata.yaml000066400000000000000000000001431462552636000471720ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/aa960110-4069-4b7c-97a3-8a768875e515/provenance/artifactsuuid: cfea882d-0b50-44af-97d9-b0b71ecae73e type: IntSequence2 format: IntSequenceV2DirectoryFormat metadata.yaml000066400000000000000000000001411462552636000400440ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v3/aa960110-4069-4b7c-97a3-8a768875e515/provenanceuuid: aa960110-4069-4b7c-97a3-8a768875e515 type: IntSequence1 format: IntSequenceDirectoryFormat qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/000077500000000000000000000000001462552636000270255ustar00rootroot00000000000000856502cb-66f2-45aa-a86c-e484cc9bfd57/000077500000000000000000000000001462552636000337055ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4VERSION000066400000000000000000000000471462552636000347560ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57QIIME 2 archive: 4 framework: 2018.6.0 data/000077500000000000000000000000001462552636000346165ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57ints.txt000066400000000000000000000000261462552636000363320ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57/data1 2 3 1 2 3 4 8 7 100 metadata.yaml000066400000000000000000000001411462552636000363450ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57uuid: 856502cb-66f2-45aa-a86c-e484cc9bfd57 type: IntSequence1 format: IntSequenceDirectoryFormat provenance/000077500000000000000000000000001462552636000360455ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57VERSION000066400000000000000000000000471462552636000371160ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57/provenanceQIIME 2 archive: 4 framework: 2018.6.0 action/000077500000000000000000000000001462552636000373225ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57/provenanceaction.yaml000066400000000000000000000052671462552636000414750ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57/provenance/actionexecution: uuid: 386b6893-db57-4b71-8133-d44dcacafd07 runtime: start: 2023-07-26T15:28:51.561876-07:00 end: 2023-07-26T15:28:51.569243-07:00 duration: 7367 microseconds action: type: method plugin: !ref 
'environment:plugins:dummy-plugin' action: concatenate_ints inputs: - ints1: a2465031-c2f6-4596-93ea-184a3602e460 - ints2: a2465031-c2f6-4596-93ea-184a3602e460 - ints3: c0888e03-83a6-4a21-b9b4-378f3e3f8341 parameters: - int1: 7 - int2: 100 output-name: concatenated_ints citations: - !cite 'action|dummy-plugin:0.0.0-dev|method:concatenate_ints|0' transformers: inputs: ints1: - from: IntSequenceDirectoryFormat to: builtins:list plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0' ints2: - from: IntSequenceDirectoryFormat to: builtins:list plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0' ints3: - from: IntSequenceV2DirectoryFormat to: builtins:list plugin: !ref 'environment:plugins:dummy-plugin' output: - from: builtins:list to: IntSequenceDirectoryFormat plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceDirectoryFormat|0' - !cite 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0' environment: platform: macosx-10.9-x86_64 python: |- 3.6.13 |Anaconda, Inc.| (default, Feb 23 2021, 12:58:59) [GCC Clang 10.0.0 ] framework: version: 2018.6.0 website: https://qiime2.org citations: - !cite 'framework|qiime2:2018.6.0|0' plugins: dummy-plugin: version: 0.0.0-dev website: https://github.com/qiime2/qiime2 citations: - !cite 'plugin|dummy-plugin:0.0.0-dev|0' - !cite 'plugin|dummy-plugin:0.0.0-dev|1' python-packages: zipp: 3.6.0 wheel: 0.37.1 tzlocal: '4.2' tzdata: '2023.3' six: 1.16.0 setuptools: 58.0.4 qiime2: 2018.6.0 PyYAML: 6.0.1 pytz: '2023.3' pytz-deprecation-shim: 0.1.0.post0 python-dateutil: 2.8.2 pyparsing: 3.1.0 pip: 21.2.2 pandas: 1.1.5 numpy: 1.19.5 importlib-resources: 5.4.0 decorator: 5.1.1 certifi: 2021.5.30 bibtexparser: 1.4.0 backports.zoneinfo: 0.2.1 artifacts/000077500000000000000000000000001462552636000400255ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57/provenancea2465031-c2f6-4596-93ea-184a3602e460/000077500000000000000000000000001462552636000443015ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57/provenance/artifactsa2465031-c2f6-4596-93ea-184a3602e460/VERSION000066400000000000000000000000471462552636000453520ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57/provenance/artifactsQIIME 2 archive: 4 framework: 2018.6.0 a2465031-c2f6-4596-93ea-184a3602e460/action/000077500000000000000000000000001462552636000455565ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57/provenance/artifactsa2465031-c2f6-4596-93ea-184a3602e460/action/action.yaml000066400000000000000000000031701462552636000477200ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57/provenance/artifactsexecution: uuid: 619b10aa-9c30-4f1c-b228-64e408ed2657 runtime: start: 2023-07-26T15:28:51.541334-07:00 end: 2023-07-26T15:28:51.544271-07:00 duration: 2937 microseconds action: type: import transformers: output: - from: builtins:list to: IntSequenceDirectoryFormat plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 
'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceDirectoryFormat|0' - !cite 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0' environment: platform: macosx-10.9-x86_64 python: |- 3.6.13 |Anaconda, Inc.| (default, Feb 23 2021, 12:58:59) [GCC Clang 10.0.0 ] framework: version: 2018.6.0 website: https://qiime2.org citations: - !cite 'framework|qiime2:2018.6.0|0' plugins: dummy-plugin: version: 0.0.0-dev website: https://github.com/qiime2/qiime2 citations: - !cite 'plugin|dummy-plugin:0.0.0-dev|0' - !cite 'plugin|dummy-plugin:0.0.0-dev|1' python-packages: zipp: 3.6.0 wheel: 0.37.1 tzlocal: '4.2' tzdata: '2023.3' six: 1.16.0 setuptools: 58.0.4 qiime2: 2018.6.0 PyYAML: 6.0.1 pytz: '2023.3' pytz-deprecation-shim: 0.1.0.post0 python-dateutil: 2.8.2 pyparsing: 3.1.0 pip: 21.2.2 pandas: 1.1.5 numpy: 1.19.5 importlib-resources: 5.4.0 decorator: 5.1.1 certifi: 2021.5.30 bibtexparser: 1.4.0 backports.zoneinfo: 0.2.1 a2465031-c2f6-4596-93ea-184a3602e460/citations.bib000066400000000000000000000042601462552636000467560ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57/provenance/artifacts@article{framework|qiime2:2018.6.0|0, author = {Caporaso, J Gregory and Kuczynski, Justin and Stombaugh, Jesse and Bittinger, Kyle and Bushman, Frederic D and Costello, Elizabeth K and Fierer, Noah and Peña, Antonio Gonzalez and Goodrich, Julia K and Gordon, Jeffrey I and Huttley, Gavin A and Kelley, Scott T and Knights, Dan and Koenig, Jeremy E and Ley, Ruth E and Lozupone, Catherine A and McDonald, Daniel and Muegge, Brian D and Pirrung, Meg and Reeder, Jens and Sevinsky, Joel R and Turnbaugh, Peter J and Walters, William A and Widmann, Jeremy and Yatsunenko, Tanya and Zaneveld, Jesse and Knight, Rob}, doi = {10.1038/nmeth.f.303}, journal = {Nature methods}, number = {5}, pages = {335}, publisher = {Nature Publishing Group}, title = {QIIME allows analysis of high-throughput community sequencing data}, volume = {7}, year = {2010} } @article{plugin|dummy-plugin:0.0.0-dev|0, author = {Unger, Donald L}, journal = {Arthritis & Rheumatology}, number = {5}, pages = {949--950}, publisher = {Wiley Online Library}, title = {Does knuckle cracking lead to arthritis of the fingers?}, volume = {41}, year = {1998} } @article{plugin|dummy-plugin:0.0.0-dev|1, author = {Berry, Michael Victor and Geim, Andre Konstantin}, journal = {European Journal of Physics}, number = {4}, pages = {307}, publisher = {IOP Publishing}, title = {Of flying frogs and levitrons}, volume = {18}, year = {1997} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceDirectoryFormat|0, author = {Krauth, Stefanie J and Coulibaly, Jean T and Knopp, Stefanie and Traoré, Mahamadou and N'Goran, Eliézer K and Utzinger, Jürg}, journal = {PLoS neglected tropical diseases}, number = {12}, pages = {e1969}, publisher = {Public Library of Science}, title = {An in-depth analysis of a piece of shit: distribution of Schistosoma mansoni and hookworm eggs in human stool}, volume = {6}, year = {2012} } @article{view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0, author = {Mayer, Hans C and Krechetnikov, Rouslan}, journal = {Physical Review E}, number = {4}, pages = {046117}, publisher = {APS}, title = {Walking with coffee: Why does it spill?}, volume = {85}, year = {2012} } 
a2465031-c2f6-4596-93ea-184a3602e460/metadata.yaml000066400000000000000000000001411462552636000467410ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57/provenance/artifactsuuid: a2465031-c2f6-4596-93ea-184a3602e460 type: IntSequence1 format: IntSequenceDirectoryFormat c0888e03-83a6-4a21-b9b4-378f3e3f8341/000077500000000000000000000000001462552636000444575ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57/provenance/artifactsc0888e03-83a6-4a21-b9b4-378f3e3f8341/VERSION000066400000000000000000000000471462552636000455300ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57/provenance/artifactsQIIME 2 archive: 4 framework: 2018.6.0 c0888e03-83a6-4a21-b9b4-378f3e3f8341/action/000077500000000000000000000000001462552636000457345ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57/provenance/artifactsc0888e03-83a6-4a21-b9b4-378f3e3f8341/action/action.yaml000066400000000000000000000045111462552636000500760ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57/provenance/artifactsexecution: uuid: 4d4c89a8-f54a-4f55-9ae8-cdb5ba48575c runtime: start: 2023-07-26T15:28:51.556107-07:00 end: 2023-07-26T15:28:51.557702-07:00 duration: 1595 microseconds action: type: import transformers: output: - from: builtins:list to: IntSequenceV2DirectoryFormat plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|0' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|1' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|2' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|3' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|4' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|5' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|6' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|7' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|8' environment: platform: macosx-10.9-x86_64 python: |- 3.6.13 |Anaconda, Inc.| (default, Feb 23 2021, 12:58:59) [GCC Clang 10.0.0 ] framework: version: 2018.6.0 website: https://qiime2.org citations: - !cite 'framework|qiime2:2018.6.0|0' plugins: dummy-plugin: version: 0.0.0-dev website: https://github.com/qiime2/qiime2 citations: - !cite 'plugin|dummy-plugin:0.0.0-dev|0' - !cite 'plugin|dummy-plugin:0.0.0-dev|1' python-packages: zipp: 3.6.0 wheel: 0.37.1 tzlocal: '4.2' tzdata: '2023.3' six: 1.16.0 setuptools: 58.0.4 qiime2: 2018.6.0 PyYAML: 6.0.1 pytz: '2023.3' pytz-deprecation-shim: 0.1.0.post0 python-dateutil: 2.8.2 pyparsing: 3.1.0 pip: 21.2.2 pandas: 1.1.5 numpy: 1.19.5 importlib-resources: 5.4.0 decorator: 5.1.1 certifi: 2021.5.30 bibtexparser: 1.4.0 backports.zoneinfo: 0.2.1 
c0888e03-83a6-4a21-b9b4-378f3e3f8341/citations.bib000066400000000000000000000112411462552636000471310ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57/provenance/artifacts@article{framework|qiime2:2018.6.0|0, author = {Caporaso, J Gregory and Kuczynski, Justin and Stombaugh, Jesse and Bittinger, Kyle and Bushman, Frederic D and Costello, Elizabeth K and Fierer, Noah and Peña, Antonio Gonzalez and Goodrich, Julia K and Gordon, Jeffrey I and Huttley, Gavin A and Kelley, Scott T and Knights, Dan and Koenig, Jeremy E and Ley, Ruth E and Lozupone, Catherine A and McDonald, Daniel and Muegge, Brian D and Pirrung, Meg and Reeder, Jens and Sevinsky, Joel R and Turnbaugh, Peter J and Walters, William A and Widmann, Jeremy and Yatsunenko, Tanya and Zaneveld, Jesse and Knight, Rob}, doi = {10.1038/nmeth.f.303}, journal = {Nature methods}, number = {5}, pages = {335}, publisher = {Nature Publishing Group}, title = {QIIME allows analysis of high-throughput community sequencing data}, volume = {7}, year = {2010} } @article{plugin|dummy-plugin:0.0.0-dev|0, author = {Unger, Donald L}, journal = {Arthritis & Rheumatology}, number = {5}, pages = {949--950}, publisher = {Wiley Online Library}, title = {Does knuckle cracking lead to arthritis of the fingers?}, volume = {41}, year = {1998} } @article{plugin|dummy-plugin:0.0.0-dev|1, author = {Berry, Michael Victor and Geim, Andre Konstantin}, journal = {European Journal of Physics}, number = {4}, pages = {307}, publisher = {IOP Publishing}, title = {Of flying frogs and levitrons}, volume = {18}, year = {1997} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|0, author = {Unger, Donald L}, journal = {Arthritis & Rheumatology}, number = {5}, pages = {949--950}, publisher = {Wiley Online Library}, title = {Does knuckle cracking lead to arthritis of the fingers?}, volume = {41}, year = {1998} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|1, author = {Berry, Michael Victor and Geim, Andre Konstantin}, journal = {European Journal of Physics}, number = {4}, pages = {307}, publisher = {IOP Publishing}, title = {Of flying frogs and levitrons}, volume = {18}, year = {1997} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|2, author = {Mayer, Hans C and Krechetnikov, Rouslan}, journal = {Physical Review E}, number = {4}, pages = {046117}, publisher = {APS}, title = {Walking with coffee: Why does it spill?}, volume = {85}, year = {2012} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|3, author = {Baerheim, Anders and Sandvik, Hogne}, journal = {BMJ}, number = {6970}, pages = {1689}, publisher = {British Medical Journal Publishing Group}, title = {Effect of ale, garlic, and soured cream on the appetite of leeches}, volume = {309}, year = {1994} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|4, author = {Witcombe, Brian and Meyer, Dan}, journal = {BMJ}, number = {7582}, pages = {1285--1287}, publisher = {British Medical Journal Publishing Group}, title = {Sword swallowing and its side effects}, volume = {333}, year = {2006} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|5, author = {Reimers, Eigil and Eftestøl, Sindre}, journal = {Arctic, antarctic, and alpine research}, number = {4}, pages = {483--489}, publisher = {BioOne}, title = 
{Response behaviors of Svalbard reindeer towards humans and humans disguised as polar bears on Edgeøya}, volume = {44}, year = {2012} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|6, author = {Barbeito, Manuel S and Mathews, Charles T and Taylor, Larry A}, journal = {Applied microbiology}, number = {4}, pages = {899--906}, publisher = {Am Soc Microbiol}, title = {Microbiological laboratory hazard of bearded men}, volume = {15}, year = {1967} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|7, author = {Krauth, Stefanie J and Coulibaly, Jean T and Knopp, Stefanie and Traoré, Mahamadou and N'Goran, Eliézer K and Utzinger, Jürg}, journal = {PLoS neglected tropical diseases}, number = {12}, pages = {e1969}, publisher = {Public Library of Science}, title = {An in-depth analysis of a piece of shit: distribution of Schistosoma mansoni and hookworm eggs in human stool}, volume = {6}, year = {2012} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|8, author = {Silvers, Vicki L and Kreiner, David S}, journal = {Literacy Research and Instruction}, number = {3}, pages = {217--223}, publisher = {Taylor & Francis}, title = {The effects of pre-existing inappropriate highlighting on reading comprehension}, volume = {36}, year = {1997} } c0888e03-83a6-4a21-b9b4-378f3e3f8341/metadata.yaml000066400000000000000000000001431462552636000471210ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57/provenance/artifactsuuid: c0888e03-83a6-4a21-b9b4-378f3e3f8341 type: IntSequence2 format: IntSequenceV2DirectoryFormat citations.bib000066400000000000000000000050031462552636000405160ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57/provenance@article{framework|qiime2:2018.6.0|0, author = {Caporaso, J Gregory and Kuczynski, Justin and Stombaugh, Jesse and Bittinger, Kyle and Bushman, Frederic D and Costello, Elizabeth K and Fierer, Noah and Peña, Antonio Gonzalez and Goodrich, Julia K and Gordon, Jeffrey I and Huttley, Gavin A and Kelley, Scott T and Knights, Dan and Koenig, Jeremy E and Ley, Ruth E and Lozupone, Catherine A and McDonald, Daniel and Muegge, Brian D and Pirrung, Meg and Reeder, Jens and Sevinsky, Joel R and Turnbaugh, Peter J and Walters, William A and Widmann, Jeremy and Yatsunenko, Tanya and Zaneveld, Jesse and Knight, Rob}, doi = {10.1038/nmeth.f.303}, journal = {Nature methods}, number = {5}, pages = {335}, publisher = {Nature Publishing Group}, title = {QIIME allows analysis of high-throughput community sequencing data}, volume = {7}, year = {2010} } @article{action|dummy-plugin:0.0.0-dev|method:concatenate_ints|0, author = {Baerheim, Anders and Sandvik, Hogne}, journal = {BMJ}, number = {6970}, pages = {1689}, publisher = {British Medical Journal Publishing Group}, title = {Effect of ale, garlic, and soured cream on the appetite of leeches}, volume = {309}, year = {1994} } @article{plugin|dummy-plugin:0.0.0-dev|0, author = {Unger, Donald L}, journal = {Arthritis & Rheumatology}, number = {5}, pages = {949--950}, publisher = {Wiley Online Library}, title = {Does knuckle cracking lead to arthritis of the fingers?}, volume = {41}, year = {1998} } @article{plugin|dummy-plugin:0.0.0-dev|1, author = {Berry, Michael Victor and Geim, Andre Konstantin}, journal = {European Journal of Physics}, number = 
{4}, pages = {307}, publisher = {IOP Publishing}, title = {Of flying frogs and levitrons}, volume = {18}, year = {1997} } @article{view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0, author = {Mayer, Hans C and Krechetnikov, Rouslan}, journal = {Physical Review E}, number = {4}, pages = {046117}, publisher = {APS}, title = {Walking with coffee: Why does it spill?}, volume = {85}, year = {2012} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceDirectoryFormat|0, author = {Krauth, Stefanie J and Coulibaly, Jean T and Knopp, Stefanie and Traoré, Mahamadou and N'Goran, Eliézer K and Utzinger, Jürg}, journal = {PLoS neglected tropical diseases}, number = {12}, pages = {e1969}, publisher = {Public Library of Science}, title = {An in-depth analysis of a piece of shit: distribution of Schistosoma mansoni and hookworm eggs in human stool}, volume = {6}, year = {2012} } metadata.yaml000066400000000000000000000001411462552636000405050ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v4/856502cb-66f2-45aa-a86c-e484cc9bfd57/provenanceuuid: 856502cb-66f2-45aa-a86c-e484cc9bfd57 type: IntSequence1 format: IntSequenceDirectoryFormat qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/000077500000000000000000000000001462552636000270265ustar00rootroot0000000000000048af8384-2b0a-4b26-b85c-11b79c0d6ea6/000077500000000000000000000000001462552636000336065ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5VERSION000066400000000000000000000000501462552636000346510ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6QIIME 2 archive: 5 framework: 2018.11.0 checksums.md5000066400000000000000000000023031462552636000362000ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea69c23cd98b6f30e693ab31711771fff4e VERSION dce71be8cb19d6f40c1175d8bd457e64 metadata.yaml a1315b18ca0d82f18561154b797f3016 data/ints.txt 9c23cd98b6f30e693ab31711771fff4e provenance/VERSION 0d8079e65685e5e09096ff62fa517a82 provenance/citations.bib dce71be8cb19d6f40c1175d8bd457e64 provenance/metadata.yaml 21ba5d3ad907ecfc7560f73f07563d64 provenance/action/action.yaml 9c23cd98b6f30e693ab31711771fff4e provenance/artifacts/21b490ad-8364-493a-a89a-532c5efb308f/VERSION 653d5b94fdb0637bffeb7959115d3848 provenance/artifacts/21b490ad-8364-493a-a89a-532c5efb308f/citations.bib 81113f7abeb87d5bbed7456558b589c2 provenance/artifacts/21b490ad-8364-493a-a89a-532c5efb308f/metadata.yaml bec1a2f307a2d8c936225e9538b40894 provenance/artifacts/21b490ad-8364-493a-a89a-532c5efb308f/action/action.yaml 9c23cd98b6f30e693ab31711771fff4e provenance/artifacts/886c64b6-46ff-40d9-8bdb-bf4dc1338048/VERSION f9ca8ccd075f26a5a012385ea0f53e54 provenance/artifacts/886c64b6-46ff-40d9-8bdb-bf4dc1338048/citations.bib 07aaf26a88a8029a65baf233b656a0de provenance/artifacts/886c64b6-46ff-40d9-8bdb-bf4dc1338048/metadata.yaml d38fbee1499196dc8a450ed4b7d0880c provenance/artifacts/886c64b6-46ff-40d9-8bdb-bf4dc1338048/action/action.yaml 
data/000077500000000000000000000000001462552636000345175ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6ints.txt000066400000000000000000000000261462552636000362330ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6/data1 2 3 1 2 3 4 8 7 100 metadata.yaml000066400000000000000000000001411462552636000362460ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6uuid: 48af8384-2b0a-4b26-b85c-11b79c0d6ea6 type: IntSequence1 format: IntSequenceDirectoryFormat provenance/000077500000000000000000000000001462552636000357465ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6VERSION000066400000000000000000000000501462552636000370110ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6/provenanceQIIME 2 archive: 5 framework: 2018.11.0 action/000077500000000000000000000000001462552636000372235ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6/provenanceaction.yaml000066400000000000000000000052721462552636000413720ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6/provenance/actionexecution: uuid: abb9da30-7735-48fb-b650-fe0a3369e4d2 runtime: start: 2023-07-26T15:29:30.751527-07:00 end: 2023-07-26T15:29:30.759167-07:00 duration: 7640 microseconds action: type: method plugin: !ref 'environment:plugins:dummy-plugin' action: concatenate_ints inputs: - ints1: 886c64b6-46ff-40d9-8bdb-bf4dc1338048 - ints2: 886c64b6-46ff-40d9-8bdb-bf4dc1338048 - ints3: 21b490ad-8364-493a-a89a-532c5efb308f parameters: - int1: 7 - int2: 100 output-name: concatenated_ints citations: - !cite 'action|dummy-plugin:0.0.0-dev|method:concatenate_ints|0' transformers: inputs: ints1: - from: IntSequenceDirectoryFormat to: builtins:list plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0' ints2: - from: IntSequenceDirectoryFormat to: builtins:list plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0' ints3: - from: IntSequenceV2DirectoryFormat to: builtins:list plugin: !ref 'environment:plugins:dummy-plugin' output: - from: builtins:list to: IntSequenceDirectoryFormat plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceDirectoryFormat|0' - !cite 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0' environment: platform: macosx-10.9-x86_64 python: |- 3.6.13 |Anaconda, Inc.| (default, Feb 23 2021, 12:58:59) [GCC Clang 10.0.0 ] framework: version: 2018.11.0 website: https://qiime2.org citations: - !cite 'framework|qiime2:2018.11.0|0' plugins: dummy-plugin: version: 0.0.0-dev website: https://github.com/qiime2/qiime2 citations: - !cite 'plugin|dummy-plugin:0.0.0-dev|0' - !cite 'plugin|dummy-plugin:0.0.0-dev|1' python-packages: zipp: 3.6.0 wheel: 0.37.1 tzlocal: '4.2' tzdata: '2023.3' six: 1.16.0 setuptools: 58.0.4 qiime2: 2018.11.0 PyYAML: 6.0.1 pytz: '2023.3' pytz-deprecation-shim: 
0.1.0.post0 python-dateutil: 2.8.2 pyparsing: 3.1.0 pip: 21.2.2 pandas: 1.1.5 numpy: 1.19.5 importlib-resources: 5.4.0 decorator: 5.1.1 certifi: 2021.5.30 bibtexparser: 1.4.0 backports.zoneinfo: 0.2.1 artifacts/000077500000000000000000000000001462552636000377265ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6/provenance21b490ad-8364-493a-a89a-532c5efb308f/000077500000000000000000000000001462552636000445115ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6/provenance/artifacts21b490ad-8364-493a-a89a-532c5efb308f/VERSION000066400000000000000000000000501462552636000455540ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6/provenance/artifactsQIIME 2 archive: 5 framework: 2018.11.0 21b490ad-8364-493a-a89a-532c5efb308f/action/000077500000000000000000000000001462552636000457665ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6/provenance/artifacts21b490ad-8364-493a-a89a-532c5efb308f/action/action.yaml000066400000000000000000000045141462552636000501330ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6/provenance/artifactsexecution: uuid: f9e5cd62-e1e2-40e1-a9ed-f85f57411026 runtime: start: 2023-07-26T15:29:30.745198-07:00 end: 2023-07-26T15:29:30.746705-07:00 duration: 1507 microseconds action: type: import transformers: output: - from: builtins:list to: IntSequenceV2DirectoryFormat plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|0' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|1' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|2' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|3' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|4' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|5' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|6' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|7' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|8' environment: platform: macosx-10.9-x86_64 python: |- 3.6.13 |Anaconda, Inc.| (default, Feb 23 2021, 12:58:59) [GCC Clang 10.0.0 ] framework: version: 2018.11.0 website: https://qiime2.org citations: - !cite 'framework|qiime2:2018.11.0|0' plugins: dummy-plugin: version: 0.0.0-dev website: https://github.com/qiime2/qiime2 citations: - !cite 'plugin|dummy-plugin:0.0.0-dev|0' - !cite 'plugin|dummy-plugin:0.0.0-dev|1' python-packages: zipp: 3.6.0 wheel: 0.37.1 tzlocal: '4.2' tzdata: '2023.3' six: 1.16.0 setuptools: 58.0.4 qiime2: 2018.11.0 PyYAML: 6.0.1 pytz: '2023.3' pytz-deprecation-shim: 0.1.0.post0 python-dateutil: 2.8.2 pyparsing: 3.1.0 pip: 21.2.2 pandas: 1.1.5 numpy: 1.19.5 importlib-resources: 5.4.0 decorator: 5.1.1 certifi: 2021.5.30 bibtexparser: 1.4.0 backports.zoneinfo: 0.2.1 
21b490ad-8364-493a-a89a-532c5efb308f/citations.bib000066400000000000000000000146571462552636000472010ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6/provenance/artifacts@article{framework|qiime2:2018.11.0|0, author = {Bolyen, Evan and Rideout, Jai Ram and Dillon, Matthew R and Bokulich, Nicholas A and Abnet, Christian and Al-Ghalith, Gabriel A and Alexander, Harriet and Alm, Eric J and Arumugam, Manimozhiyan and Asnicar, Francesco and Bai, Yang and Bisanz, Jordan E and Bittinger, Kyle and Brejnrod, Asker and Brislawn, Colin J and Brown, C Titus and Callahan, Benjamin J and Caraballo-Rodríguez, Andrés Mauricio and Chase, John and Cope, Emily and Da Silva, Ricardo and Dorrestein, Pieter C and Douglas, Gavin M and Durall, Daniel M and Duvallet, Claire and Edwardson, Christian F and Ernst, Madeleine and Estaki, Mehrbod and Fouquier, Jennifer and Gauglitz, Julia M and Gibson, Deanna L and Gonzalez, Antonio and Gorlick, Kestrel and Guo, Jiarong and Hillmann, Benjamin and Holmes, Susan and Holste, Hannes and Huttenhower, Curtis and Huttley, Gavin and Janssen, Stefan and Jarmusch, Alan K and Jiang, Lingjing and Kaehler, Benjamin and Kang, Kyo Bin and Keefe, Christopher R and Keim, Paul and Kelley, Scott T and Knights, Dan and Koester, Irina and Kosciolek, Tomasz and Kreps, Jorden and Langille, Morgan GI and Lee, Joslynn and Ley, Ruth and Liu, Yong-Xin and Loftfield, Erikka and Lozupone, Catherine and Maher, Massoud and Marotz, Clarisse and Martin, Bryan and McDonald, Daniel and McIver, Lauren J and Melnik, Alexey V and Metcalf, Jessica L and Morgan, Sydney C and Morton, Jamie and Naimey, Ahmad Turan and Navas-Molina, Jose A and Nothias, Louis Felix and Orchanian, Stephanie B and Pearson, Talima and Peoples, Samuel L and Petras, Daniel and Preuss, Mary Lai and Pruesse, Elmar and Rasmussen, Lasse Buur and Rivers, Adam and Robeson, II, Michael S and Rosenthal, Patrick and Segata, Nicola and Shaffer, Michael and Shiffer, Arron and Sinha, Rashmi and Song, Se Jin and Spear, John R and Swafford, Austin D and Thompson, Luke R and Torres, Pedro J and Trinh, Pauline and Tripathi, Anupriya and Turnbaugh, Peter J and Ul-Hasan, Sabah and van der Hooft, Justin JJ and Vargas, Fernando and Vázquez-Baeza, Yoshiki and Vogtmann, Emily and von Hippel, Max and Walters, William and Wan, Yunhu and Wang, Mingxun and Warren, Jonathan and Weber, Kyle C and Williamson, Chase HD and Willis, Amy D and Xu, Zhenjiang Zech and Zaneveld, Jesse R and Zhang, Yilong and Knight, Rob and Caporaso, J Gregory}, doi = {10.7287/peerj.preprints.27295v1}, issn = {2167-9843}, journal = {PeerJ Preprints}, month = {oct}, pages = {e27295v1}, title = {QIIME 2: Reproducible, interactive, scalable, and extensible microbiome data science}, url = {https://doi.org/10.7287/peerj.preprints.27295v1}, volume = {6}, year = {2018} } @article{plugin|dummy-plugin:0.0.0-dev|0, author = {Unger, Donald L}, journal = {Arthritis & Rheumatology}, number = {5}, pages = {949--950}, publisher = {Wiley Online Library}, title = {Does knuckle cracking lead to arthritis of the fingers?}, volume = {41}, year = {1998} } @article{plugin|dummy-plugin:0.0.0-dev|1, author = {Berry, Michael Victor and Geim, Andre Konstantin}, journal = {European Journal of Physics}, number = {4}, pages = {307}, publisher = {IOP Publishing}, title = {Of flying frogs and levitrons}, volume = {18}, year = {1997} } 
@article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|0, author = {Unger, Donald L}, journal = {Arthritis & Rheumatology}, number = {5}, pages = {949--950}, publisher = {Wiley Online Library}, title = {Does knuckle cracking lead to arthritis of the fingers?}, volume = {41}, year = {1998} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|1, author = {Berry, Michael Victor and Geim, Andre Konstantin}, journal = {European Journal of Physics}, number = {4}, pages = {307}, publisher = {IOP Publishing}, title = {Of flying frogs and levitrons}, volume = {18}, year = {1997} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|2, author = {Mayer, Hans C and Krechetnikov, Rouslan}, journal = {Physical Review E}, number = {4}, pages = {046117}, publisher = {APS}, title = {Walking with coffee: Why does it spill?}, volume = {85}, year = {2012} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|3, author = {Baerheim, Anders and Sandvik, Hogne}, journal = {BMJ}, number = {6970}, pages = {1689}, publisher = {British Medical Journal Publishing Group}, title = {Effect of ale, garlic, and soured cream on the appetite of leeches}, volume = {309}, year = {1994} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|4, author = {Witcombe, Brian and Meyer, Dan}, journal = {BMJ}, number = {7582}, pages = {1285--1287}, publisher = {British Medical Journal Publishing Group}, title = {Sword swallowing and its side effects}, volume = {333}, year = {2006} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|5, author = {Reimers, Eigil and Eftestøl, Sindre}, journal = {Arctic, antarctic, and alpine research}, number = {4}, pages = {483--489}, publisher = {BioOne}, title = {Response behaviors of Svalbard reindeer towards humans and humans disguised as polar bears on Edgeøya}, volume = {44}, year = {2012} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|6, author = {Barbeito, Manuel S and Mathews, Charles T and Taylor, Larry A}, journal = {Applied microbiology}, number = {4}, pages = {899--906}, publisher = {Am Soc Microbiol}, title = {Microbiological laboratory hazard of bearded men}, volume = {15}, year = {1967} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|7, author = {Krauth, Stefanie J and Coulibaly, Jean T and Knopp, Stefanie and Traoré, Mahamadou and N'Goran, Eliézer K and Utzinger, Jürg}, journal = {PLoS neglected tropical diseases}, number = {12}, pages = {e1969}, publisher = {Public Library of Science}, title = {An in-depth analysis of a piece of shit: distribution of Schistosoma mansoni and hookworm eggs in human stool}, volume = {6}, year = {2012} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|8, author = {Silvers, Vicki L and Kreiner, David S}, journal = {Literacy Research and Instruction}, number = {3}, pages = {217--223}, publisher = {Taylor & Francis}, title = {The effects of pre-existing inappropriate highlighting on reading comprehension}, volume = {36}, year = {1997} } 21b490ad-8364-493a-a89a-532c5efb308f/metadata.yaml000066400000000000000000000001431462552636000471530ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6/provenance/artifactsuuid: 
21b490ad-8364-493a-a89a-532c5efb308f type: IntSequence2 format: IntSequenceV2DirectoryFormat 886c64b6-46ff-40d9-8bdb-bf4dc1338048/000077500000000000000000000000001462552636000446115ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6/provenance/artifacts886c64b6-46ff-40d9-8bdb-bf4dc1338048/VERSION000066400000000000000000000000501462552636000456540ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6/provenance/artifactsQIIME 2 archive: 5 framework: 2018.11.0 886c64b6-46ff-40d9-8bdb-bf4dc1338048/action/000077500000000000000000000000001462552636000460665ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6/provenance/artifacts886c64b6-46ff-40d9-8bdb-bf4dc1338048/action/action.yaml000066400000000000000000000031731462552636000502330ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6/provenance/artifactsexecution: uuid: 96157f9a-a876-4dbc-94d9-3890e839ca04 runtime: start: 2023-07-26T15:29:30.731768-07:00 end: 2023-07-26T15:29:30.735653-07:00 duration: 3885 microseconds action: type: import transformers: output: - from: builtins:list to: IntSequenceDirectoryFormat plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceDirectoryFormat|0' - !cite 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0' environment: platform: macosx-10.9-x86_64 python: |- 3.6.13 |Anaconda, Inc.| (default, Feb 23 2021, 12:58:59) [GCC Clang 10.0.0 ] framework: version: 2018.11.0 website: https://qiime2.org citations: - !cite 'framework|qiime2:2018.11.0|0' plugins: dummy-plugin: version: 0.0.0-dev website: https://github.com/qiime2/qiime2 citations: - !cite 'plugin|dummy-plugin:0.0.0-dev|0' - !cite 'plugin|dummy-plugin:0.0.0-dev|1' python-packages: zipp: 3.6.0 wheel: 0.37.1 tzlocal: '4.2' tzdata: '2023.3' six: 1.16.0 setuptools: 58.0.4 qiime2: 2018.11.0 PyYAML: 6.0.1 pytz: '2023.3' pytz-deprecation-shim: 0.1.0.post0 python-dateutil: 2.8.2 pyparsing: 3.1.0 pip: 21.2.2 pandas: 1.1.5 numpy: 1.19.5 importlib-resources: 5.4.0 decorator: 5.1.1 certifi: 2021.5.30 bibtexparser: 1.4.0 backports.zoneinfo: 0.2.1 886c64b6-46ff-40d9-8bdb-bf4dc1338048/citations.bib000066400000000000000000000076761462552636000473040ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6/provenance/artifacts@article{framework|qiime2:2018.11.0|0, author = {Bolyen, Evan and Rideout, Jai Ram and Dillon, Matthew R and Bokulich, Nicholas A and Abnet, Christian and Al-Ghalith, Gabriel A and Alexander, Harriet and Alm, Eric J and Arumugam, Manimozhiyan and Asnicar, Francesco and Bai, Yang and Bisanz, Jordan E and Bittinger, Kyle and Brejnrod, Asker and Brislawn, Colin J and Brown, C Titus and Callahan, Benjamin J and Caraballo-Rodríguez, Andrés Mauricio and Chase, John and Cope, Emily and Da Silva, Ricardo and Dorrestein, Pieter C and Douglas, Gavin M and Durall, Daniel M and Duvallet, Claire and Edwardson, Christian F and Ernst, Madeleine and Estaki, Mehrbod and Fouquier, Jennifer and Gauglitz, Julia M and Gibson, Deanna L and Gonzalez, Antonio and Gorlick, Kestrel and Guo, Jiarong and Hillmann, Benjamin and Holmes, Susan and Holste, 
Hannes and Huttenhower, Curtis and Huttley, Gavin and Janssen, Stefan and Jarmusch, Alan K and Jiang, Lingjing and Kaehler, Benjamin and Kang, Kyo Bin and Keefe, Christopher R and Keim, Paul and Kelley, Scott T and Knights, Dan and Koester, Irina and Kosciolek, Tomasz and Kreps, Jorden and Langille, Morgan GI and Lee, Joslynn and Ley, Ruth and Liu, Yong-Xin and Loftfield, Erikka and Lozupone, Catherine and Maher, Massoud and Marotz, Clarisse and Martin, Bryan and McDonald, Daniel and McIver, Lauren J and Melnik, Alexey V and Metcalf, Jessica L and Morgan, Sydney C and Morton, Jamie and Naimey, Ahmad Turan and Navas-Molina, Jose A and Nothias, Louis Felix and Orchanian, Stephanie B and Pearson, Talima and Peoples, Samuel L and Petras, Daniel and Preuss, Mary Lai and Pruesse, Elmar and Rasmussen, Lasse Buur and Rivers, Adam and Robeson, II, Michael S and Rosenthal, Patrick and Segata, Nicola and Shaffer, Michael and Shiffer, Arron and Sinha, Rashmi and Song, Se Jin and Spear, John R and Swafford, Austin D and Thompson, Luke R and Torres, Pedro J and Trinh, Pauline and Tripathi, Anupriya and Turnbaugh, Peter J and Ul-Hasan, Sabah and van der Hooft, Justin JJ and Vargas, Fernando and Vázquez-Baeza, Yoshiki and Vogtmann, Emily and von Hippel, Max and Walters, William and Wan, Yunhu and Wang, Mingxun and Warren, Jonathan and Weber, Kyle C and Williamson, Chase HD and Willis, Amy D and Xu, Zhenjiang Zech and Zaneveld, Jesse R and Zhang, Yilong and Knight, Rob and Caporaso, J Gregory}, doi = {10.7287/peerj.preprints.27295v1}, issn = {2167-9843}, journal = {PeerJ Preprints}, month = {oct}, pages = {e27295v1}, title = {QIIME 2: Reproducible, interactive, scalable, and extensible microbiome data science}, url = {https://doi.org/10.7287/peerj.preprints.27295v1}, volume = {6}, year = {2018} } @article{plugin|dummy-plugin:0.0.0-dev|0, author = {Unger, Donald L}, journal = {Arthritis & Rheumatology}, number = {5}, pages = {949--950}, publisher = {Wiley Online Library}, title = {Does knuckle cracking lead to arthritis of the fingers?}, volume = {41}, year = {1998} } @article{plugin|dummy-plugin:0.0.0-dev|1, author = {Berry, Michael Victor and Geim, Andre Konstantin}, journal = {European Journal of Physics}, number = {4}, pages = {307}, publisher = {IOP Publishing}, title = {Of flying frogs and levitrons}, volume = {18}, year = {1997} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceDirectoryFormat|0, author = {Krauth, Stefanie J and Coulibaly, Jean T and Knopp, Stefanie and Traoré, Mahamadou and N'Goran, Eliézer K and Utzinger, Jürg}, journal = {PLoS neglected tropical diseases}, number = {12}, pages = {e1969}, publisher = {Public Library of Science}, title = {An in-depth analysis of a piece of shit: distribution of Schistosoma mansoni and hookworm eggs in human stool}, volume = {6}, year = {2012} } @article{view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0, author = {Mayer, Hans C and Krechetnikov, Rouslan}, journal = {Physical Review E}, number = {4}, pages = {046117}, publisher = {APS}, title = {Walking with coffee: Why does it spill?}, volume = {85}, year = {2012} } 886c64b6-46ff-40d9-8bdb-bf4dc1338048/metadata.yaml000066400000000000000000000001411462552636000472510ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6/provenance/artifactsuuid: 886c64b6-46ff-40d9-8bdb-bf4dc1338048 type: IntSequence1 format: IntSequenceDirectoryFormat 
citations.bib000066400000000000000000000104211462552636000404170ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6/provenance@article{framework|qiime2:2018.11.0|0, author = {Bolyen, Evan and Rideout, Jai Ram and Dillon, Matthew R and Bokulich, Nicholas A and Abnet, Christian and Al-Ghalith, Gabriel A and Alexander, Harriet and Alm, Eric J and Arumugam, Manimozhiyan and Asnicar, Francesco and Bai, Yang and Bisanz, Jordan E and Bittinger, Kyle and Brejnrod, Asker and Brislawn, Colin J and Brown, C Titus and Callahan, Benjamin J and Caraballo-Rodríguez, Andrés Mauricio and Chase, John and Cope, Emily and Da Silva, Ricardo and Dorrestein, Pieter C and Douglas, Gavin M and Durall, Daniel M and Duvallet, Claire and Edwardson, Christian F and Ernst, Madeleine and Estaki, Mehrbod and Fouquier, Jennifer and Gauglitz, Julia M and Gibson, Deanna L and Gonzalez, Antonio and Gorlick, Kestrel and Guo, Jiarong and Hillmann, Benjamin and Holmes, Susan and Holste, Hannes and Huttenhower, Curtis and Huttley, Gavin and Janssen, Stefan and Jarmusch, Alan K and Jiang, Lingjing and Kaehler, Benjamin and Kang, Kyo Bin and Keefe, Christopher R and Keim, Paul and Kelley, Scott T and Knights, Dan and Koester, Irina and Kosciolek, Tomasz and Kreps, Jorden and Langille, Morgan GI and Lee, Joslynn and Ley, Ruth and Liu, Yong-Xin and Loftfield, Erikka and Lozupone, Catherine and Maher, Massoud and Marotz, Clarisse and Martin, Bryan and McDonald, Daniel and McIver, Lauren J and Melnik, Alexey V and Metcalf, Jessica L and Morgan, Sydney C and Morton, Jamie and Naimey, Ahmad Turan and Navas-Molina, Jose A and Nothias, Louis Felix and Orchanian, Stephanie B and Pearson, Talima and Peoples, Samuel L and Petras, Daniel and Preuss, Mary Lai and Pruesse, Elmar and Rasmussen, Lasse Buur and Rivers, Adam and Robeson, II, Michael S and Rosenthal, Patrick and Segata, Nicola and Shaffer, Michael and Shiffer, Arron and Sinha, Rashmi and Song, Se Jin and Spear, John R and Swafford, Austin D and Thompson, Luke R and Torres, Pedro J and Trinh, Pauline and Tripathi, Anupriya and Turnbaugh, Peter J and Ul-Hasan, Sabah and van der Hooft, Justin JJ and Vargas, Fernando and Vázquez-Baeza, Yoshiki and Vogtmann, Emily and von Hippel, Max and Walters, William and Wan, Yunhu and Wang, Mingxun and Warren, Jonathan and Weber, Kyle C and Williamson, Chase HD and Willis, Amy D and Xu, Zhenjiang Zech and Zaneveld, Jesse R and Zhang, Yilong and Knight, Rob and Caporaso, J Gregory}, doi = {10.7287/peerj.preprints.27295v1}, issn = {2167-9843}, journal = {PeerJ Preprints}, month = {oct}, pages = {e27295v1}, title = {QIIME 2: Reproducible, interactive, scalable, and extensible microbiome data science}, url = {https://doi.org/10.7287/peerj.preprints.27295v1}, volume = {6}, year = {2018} } @article{action|dummy-plugin:0.0.0-dev|method:concatenate_ints|0, author = {Baerheim, Anders and Sandvik, Hogne}, journal = {BMJ}, number = {6970}, pages = {1689}, publisher = {British Medical Journal Publishing Group}, title = {Effect of ale, garlic, and soured cream on the appetite of leeches}, volume = {309}, year = {1994} } @article{plugin|dummy-plugin:0.0.0-dev|0, author = {Unger, Donald L}, journal = {Arthritis & Rheumatology}, number = {5}, pages = {949--950}, publisher = {Wiley Online Library}, title = {Does knuckle cracking lead to arthritis of the fingers?}, volume = {41}, year = {1998} } @article{plugin|dummy-plugin:0.0.0-dev|1, author = {Berry, Michael Victor and 
Geim, Andre Konstantin}, journal = {European Journal of Physics}, number = {4}, pages = {307}, publisher = {IOP Publishing}, title = {Of flying frogs and levitrons}, volume = {18}, year = {1997} } @article{view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0, author = {Mayer, Hans C and Krechetnikov, Rouslan}, journal = {Physical Review E}, number = {4}, pages = {046117}, publisher = {APS}, title = {Walking with coffee: Why does it spill?}, volume = {85}, year = {2012} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceDirectoryFormat|0, author = {Krauth, Stefanie J and Coulibaly, Jean T and Knopp, Stefanie and Traoré, Mahamadou and N'Goran, Eliézer K and Utzinger, Jürg}, journal = {PLoS neglected tropical diseases}, number = {12}, pages = {e1969}, publisher = {Public Library of Science}, title = {An in-depth analysis of a piece of shit: distribution of Schistosoma mansoni and hookworm eggs in human stool}, volume = {6}, year = {2012} } metadata.yaml000066400000000000000000000001411462552636000404060ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v5/48af8384-2b0a-4b26-b85c-11b79c0d6ea6/provenanceuuid: 48af8384-2b0a-4b26-b85c-11b79c0d6ea6 type: IntSequence1 format: IntSequenceDirectoryFormat qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/000077500000000000000000000000001462552636000270275ustar00rootroot000000000000006facaf61-1676-45eb-ada0-d530be678b27/000077500000000000000000000000001462552636000337465ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6VERSION000066400000000000000000000000471462552636000350170ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27QIIME 2 archive: 6 framework: 2023.5.1 checksums.md5000066400000000000000000000023031462552636000363400ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b272f8ba2bcaec6859457078fc22d53a9df VERSION 70568a60e5ac4d0821296cf6a6f71ff7 metadata.yaml a1315b18ca0d82f18561154b797f3016 data/ints.txt 2f8ba2bcaec6859457078fc22d53a9df provenance/VERSION 6682d32e912a0c17776a0e5752962822 provenance/citations.bib 70568a60e5ac4d0821296cf6a6f71ff7 provenance/metadata.yaml 6825448d037af0d597a06e55ab4404c2 provenance/action/action.yaml 2f8ba2bcaec6859457078fc22d53a9df provenance/artifacts/7727c060-5384-445d-b007-b64b41a090ee/VERSION 6b555e26fb9e7ae7d0f9bc3ea6f24ac3 provenance/artifacts/7727c060-5384-445d-b007-b64b41a090ee/citations.bib 634320b640a9fbf82a5b4a8b28ff018b provenance/artifacts/7727c060-5384-445d-b007-b64b41a090ee/metadata.yaml 235493d6721d2129f9ae9f653e8ef7ac provenance/artifacts/7727c060-5384-445d-b007-b64b41a090ee/action/action.yaml 2f8ba2bcaec6859457078fc22d53a9df provenance/artifacts/8dea2f1a-2164-4a85-9f7d-e0641b1db22b/VERSION 6935eddc031e4b123a138262a9c00d7b provenance/artifacts/8dea2f1a-2164-4a85-9f7d-e0641b1db22b/citations.bib faaca4642b556373df8f729076077513 provenance/artifacts/8dea2f1a-2164-4a85-9f7d-e0641b1db22b/metadata.yaml 498ebc2e3b7b9fea39d0bd6574485aa6 provenance/artifacts/8dea2f1a-2164-4a85-9f7d-e0641b1db22b/action/action.yaml 
data/000077500000000000000000000000001462552636000346575ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27ints.txt000066400000000000000000000000261462552636000363730ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27/data1 2 3 1 2 3 4 8 7 100 metadata.yaml000066400000000000000000000001411462552636000364060ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27uuid: 6facaf61-1676-45eb-ada0-d530be678b27 type: IntSequence1 format: IntSequenceDirectoryFormat provenance/000077500000000000000000000000001462552636000361065ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27VERSION000066400000000000000000000000471462552636000371570ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27/provenanceQIIME 2 archive: 6 framework: 2023.5.1 action/000077500000000000000000000000001462552636000373635ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27/provenanceaction.yaml000066400000000000000000000202041462552636000415220ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27/provenance/actionexecution: uuid: 5035a60e-6f9a-40d4-b412-48ae52255bb5 runtime: start: 2023-07-26T15:45:12.678884-07:00 end: 2023-07-26T15:45:12.685724-07:00 duration: 6840 microseconds execution_context: type: synchronous action: type: method plugin: !ref 'environment:plugins:dummy-plugin' action: concatenate_ints inputs: - ints1: 8dea2f1a-2164-4a85-9f7d-e0641b1db22b - ints2: 8dea2f1a-2164-4a85-9f7d-e0641b1db22b - ints3: 7727c060-5384-445d-b007-b64b41a090ee parameters: - int1: 7 - int2: 100 output-name: concatenated_ints citations: - !cite 'action|dummy-plugin:0.0.0-dev|method:concatenate_ints|0' transformers: inputs: ints1: - from: IntSequenceDirectoryFormat to: builtins:list plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0' ints2: - from: IntSequenceDirectoryFormat to: builtins:list plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0' ints3: - from: IntSequenceV2DirectoryFormat to: builtins:list plugin: !ref 'environment:plugins:dummy-plugin' output: - from: builtins:list to: IntSequenceDirectoryFormat plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceDirectoryFormat|0' - !cite 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0' environment: platform: macosx-10.9-x86_64 python: |- 3.8.16 | packaged by conda-forge | (default, Feb 1 2023, 16:05:36) [Clang 14.0.6 ] framework: version: 2023.5.1 website: https://qiime2.org citations: - !cite 'framework|qiime2:2023.5.1|0' plugins: dummy-plugin: version: 0.0.0-dev website: https://github.com/qiime2/qiime2 citations: - !cite 'plugin|dummy-plugin:0.0.0-dev|0' - !cite 'plugin|dummy-plugin:0.0.0-dev|1' python-packages: CacheControl: 0.12.11 Cython: 0.29.35 DendroPy: 4.5.2 Jinja2: 3.1.2 MarkupSafe: 2.1.2 Pillow: 9.5.0 PyJWT: 2.7.0 
PyNaCl: 1.5.0 PySocks: 1.7.1 PyYAML: '6.0' Pygments: 2.15.1 Send2Trash: 1.8.2 altair: 5.0.1 anyio: 3.7.0 appdirs: 1.4.4 appnope: 0.1.3 argcomplete: 3.0.8 argon2-cffi: 21.3.0 argon2-cffi-bindings: 21.2.0 arrow: 1.2.3 astor: 0.8.1 asttokens: 2.2.1 atpublic: 3.0.1 attrs: 23.1.0 backcall: 0.2.0 backports.functools-lru-cache: 1.6.4 bcrypt: 3.2.2 beautifulsoup4: 4.12.2 bibtexparser: 1.4.0 biom-format: 2.1.12 bleach: 6.0.0 bokeh: 3.1.1 cached-property: 1.5.2 certifi: 2023.5.7 cffi: 1.15.1 charset-normalizer: 3.1.0 click: 8.1.3 colorama: 0.4.6 comm: 0.1.3 contourpy: 1.0.7 cryptography: 40.0.2 cutadapt: '4.4' cycler: 0.11.0 deblur: 1.1.1 debugpy: 1.6.7 decorator: 4.4.2 defusedxml: 0.7.1 dill: 0.3.6 dnaio: 0.10.0 emperor: 1.0.3 entrypoints: '0.4' exceptiongroup: 1.1.1 executing: 1.2.0 fastcluster: 1.2.6 fastjsonschema: 2.17.1 flit-core: 3.9.0 flufl.lock: '7.1' fonttools: 4.39.4 formulaic: 0.6.1 fqdn: 1.5.1 future: 0.18.3 globus-sdk: 3.19.0 gneiss: 0.4.6 graphlib-backport: 1.0.3 h5py: 2.10.0 hdmedians: 0.14.2 idna: '3.4' ijson: 3.2.0.post0 importlib-metadata: 4.8.3 importlib-resources: 5.12.0 iniconfig: 2.0.0 interface-meta: 1.3.0 iow: 1.0.5 ipykernel: 6.23.1 ipython: 8.12.2 ipython-genutils: 0.2.0 ipywidgets: 8.0.6 isal: 1.1.0 isoduration: 20.11.0 jedi: 0.18.2 joblib: 1.2.0 jsonpointer: '2.0' jsonschema: 4.17.3 jupyter-client: 8.2.0 jupyter-core: 5.3.0 jupyter-events: 0.6.3 jupyter-server: 2.6.0 jupyter-server-terminals: 0.4.4 jupyterlab-pygments: 0.2.2 jupyterlab-widgets: 3.0.7 kiwisolver: 1.4.4 llvmlite: 0.39.1 lockfile: 0.12.2 lxml: 4.9.2 lz4: 4.3.2 matplotlib: 3.6.0 matplotlib-inline: 0.1.6 mistune: 2.0.5 msgpack: 1.0.5 munkres: 1.1.4 mypy: 1.3.0 mypy-extensions: 1.0.0 natsort: 8.3.1 nbclassic: 1.0.0 nbclient: 0.8.0 nbconvert: 7.4.0 nbformat: 5.8.0 nest-asyncio: 1.5.6 networkx: '3.1' nlopt: 2.7.1 nose: 1.3.7 notebook: 6.5.4 notebook-shim: 0.2.3 numba: 0.56.4 numpy: 1.23.5 overrides: 7.3.1 packaging: '23.1' pandas: 1.5.3 pandocfilters: 1.5.0 paramiko: 3.2.0 parsl: 2023.5.29 parso: 0.8.3 patsy: 0.5.3 pexpect: 4.8.0 pickleshare: 0.7.5 pip: 23.1.2 pkgutil-resolve-name: 1.3.10 platformdirs: 3.5.1 pluggy: 1.0.0 prometheus-client: 0.17.0 prompt-toolkit: 3.0.38 provenance-lib: 2023.5.1 psutil: 5.9.5 ptyprocess: 0.7.0 pure-eval: 0.2.2 pycparser: '2.21' pynndescent: 0.5.10 pyobjc-core: 9.1.1 pyobjc-framework-Cocoa: 9.1.1 pyparsing: 3.0.9 pyrsistent: 0.19.3 pytest: 7.3.1 python-dateutil: 2.8.2 python-json-logger: 2.0.7 pytz: '2023.3' pyzmq: 25.0.2 q2-alignment: 2023.5.0 q2-composition: 2023.5.0 q2-cutadapt: 2023.5.1 q2-dada2: 2023.5.0 q2-deblur: 2023.5.0 q2-demux: 2023.5.0 q2-diversity: 2023.5.1 q2-diversity-lib: 2023.5.0 q2-emperor: 2023.5.0 q2-feature-classifier: 2023.5.0 q2-feature-table: 2023.5.0 q2-fragment-insertion: 2023.5.0 q2-gneiss: 2023.5.0 q2-longitudinal: 2023.5.0 q2-metadata: 2023.5.0 q2-mystery-stew: 2023.5.0 q2-phylogeny: 2023.5.0 q2-quality-control: 2023.5.0 q2-quality-filter: 2023.5.0 q2-sample-classifier: 2023.5.0 q2-taxa: 2023.5.0 q2-types: 2023.5.0 q2-vsearch: 2023.5.0 q2cli: 2023.5.1 q2galaxy: 2023.5.0 q2templates: 2023.5.0 qiime2: 2023.5.1 requests: 2.31.0 rfc3339-validator: 0.1.4 rfc3986-validator: 0.1.1 scikit-bio: 0.5.7 scikit-learn: 0.24.1 scipy: 1.8.1 seaborn: 0.12.2 sepp: 4.3.10 setproctitle: 1.3.2 setuptools: 67.7.2 six: 1.16.0 sniffio: 1.3.0 soupsieve: 2.3.2.post1 stack-data: 0.6.2 statsmodels: 0.14.0 tblib: 1.7.0 terminado: 0.17.1 threadpoolctl: 3.1.0 tinycss2: 1.2.1 toml: 0.10.2 tomli: 2.0.1 tomlkit: 0.11.8 toolz: 0.12.0 tornado: 6.3.2 tqdm: 4.65.0 traitlets: 5.9.0 
typeguard: 2.13.3 types-cryptography: 3.3.23.2 types-enum34: 1.1.8 types-ipaddress: 1.0.8 types-paramiko: 3.0.0.10 types-requests: 2.31.0.1 types-six: 1.16.21.8 types-urllib3: 1.26.25.13 typing-extensions: 4.6.2 typing-utils: 0.1.0 tzlocal: '2.1' umap-learn: 0.5.3 unicodedata2: 15.0.0 unifrac: 1.0.0 uri-template: 1.3.0 urllib3: 2.0.2 wcwidth: 0.2.6 webcolors: '1.13' webencodings: 0.5.1 websocket-client: 1.5.2 wheel: 0.40.0 widgetsnbextension: 4.0.7 wrapt: 1.15.0 xmltodict: 0.13.0 xopen: 1.7.0 xyzservices: 2023.5.0 yq: 3.2.2 zipp: 3.15.0 zstandard: 0.19.0 artifacts/000077500000000000000000000000001462552636000400665ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27/provenance7727c060-5384-445d-b007-b64b41a090ee/000077500000000000000000000000001462552636000444165ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27/provenance/artifacts7727c060-5384-445d-b007-b64b41a090ee/VERSION000066400000000000000000000000471462552636000454670ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27/provenance/artifactsQIIME 2 archive: 6 framework: 2023.5.1 7727c060-5384-445d-b007-b64b41a090ee/action/000077500000000000000000000000001462552636000456735ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27/provenance/artifacts7727c060-5384-445d-b007-b64b41a090ee/action/action.yaml000066400000000000000000000173451462552636000500460ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27/provenance/artifactsexecution: uuid: 12988290-1ebf-47ad-8c34-5469d42e5ffe runtime: start: 2023-07-26T15:45:12.664907-07:00 end: 2023-07-26T15:45:12.666980-07:00 duration: 2073 microseconds action: type: import transformers: output: - from: builtins:list to: IntSequenceV2DirectoryFormat plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|0' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|1' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|2' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|3' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|4' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|5' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|6' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|7' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|8' environment: platform: macosx-10.9-x86_64 python: |- 3.8.16 | packaged by conda-forge | (default, Feb 1 2023, 16:05:36) [Clang 14.0.6 ] framework: version: 2023.5.1 website: https://qiime2.org citations: - !cite 'framework|qiime2:2023.5.1|0' plugins: dummy-plugin: version: 0.0.0-dev website: https://github.com/qiime2/qiime2 citations: - !cite 'plugin|dummy-plugin:0.0.0-dev|0' - !cite 'plugin|dummy-plugin:0.0.0-dev|1' python-packages: CacheControl: 0.12.11 Cython: 0.29.35 DendroPy: 4.5.2 Jinja2: 3.1.2 MarkupSafe: 2.1.2 Pillow: 9.5.0 PyJWT: 
2.7.0 PyNaCl: 1.5.0 PySocks: 1.7.1 PyYAML: '6.0' Pygments: 2.15.1 Send2Trash: 1.8.2 altair: 5.0.1 anyio: 3.7.0 appdirs: 1.4.4 appnope: 0.1.3 argcomplete: 3.0.8 argon2-cffi: 21.3.0 argon2-cffi-bindings: 21.2.0 arrow: 1.2.3 astor: 0.8.1 asttokens: 2.2.1 atpublic: 3.0.1 attrs: 23.1.0 backcall: 0.2.0 backports.functools-lru-cache: 1.6.4 bcrypt: 3.2.2 beautifulsoup4: 4.12.2 bibtexparser: 1.4.0 biom-format: 2.1.12 bleach: 6.0.0 bokeh: 3.1.1 cached-property: 1.5.2 certifi: 2023.5.7 cffi: 1.15.1 charset-normalizer: 3.1.0 click: 8.1.3 colorama: 0.4.6 comm: 0.1.3 contourpy: 1.0.7 cryptography: 40.0.2 cutadapt: '4.4' cycler: 0.11.0 deblur: 1.1.1 debugpy: 1.6.7 decorator: 4.4.2 defusedxml: 0.7.1 dill: 0.3.6 dnaio: 0.10.0 emperor: 1.0.3 entrypoints: '0.4' exceptiongroup: 1.1.1 executing: 1.2.0 fastcluster: 1.2.6 fastjsonschema: 2.17.1 flit-core: 3.9.0 flufl.lock: '7.1' fonttools: 4.39.4 formulaic: 0.6.1 fqdn: 1.5.1 future: 0.18.3 globus-sdk: 3.19.0 gneiss: 0.4.6 graphlib-backport: 1.0.3 h5py: 2.10.0 hdmedians: 0.14.2 idna: '3.4' ijson: 3.2.0.post0 importlib-metadata: 4.8.3 importlib-resources: 5.12.0 iniconfig: 2.0.0 interface-meta: 1.3.0 iow: 1.0.5 ipykernel: 6.23.1 ipython: 8.12.2 ipython-genutils: 0.2.0 ipywidgets: 8.0.6 isal: 1.1.0 isoduration: 20.11.0 jedi: 0.18.2 joblib: 1.2.0 jsonpointer: '2.0' jsonschema: 4.17.3 jupyter-client: 8.2.0 jupyter-core: 5.3.0 jupyter-events: 0.6.3 jupyter-server: 2.6.0 jupyter-server-terminals: 0.4.4 jupyterlab-pygments: 0.2.2 jupyterlab-widgets: 3.0.7 kiwisolver: 1.4.4 llvmlite: 0.39.1 lockfile: 0.12.2 lxml: 4.9.2 lz4: 4.3.2 matplotlib: 3.6.0 matplotlib-inline: 0.1.6 mistune: 2.0.5 msgpack: 1.0.5 munkres: 1.1.4 mypy: 1.3.0 mypy-extensions: 1.0.0 natsort: 8.3.1 nbclassic: 1.0.0 nbclient: 0.8.0 nbconvert: 7.4.0 nbformat: 5.8.0 nest-asyncio: 1.5.6 networkx: '3.1' nlopt: 2.7.1 nose: 1.3.7 notebook: 6.5.4 notebook-shim: 0.2.3 numba: 0.56.4 numpy: 1.23.5 overrides: 7.3.1 packaging: '23.1' pandas: 1.5.3 pandocfilters: 1.5.0 paramiko: 3.2.0 parsl: 2023.5.29 parso: 0.8.3 patsy: 0.5.3 pexpect: 4.8.0 pickleshare: 0.7.5 pip: 23.1.2 pkgutil-resolve-name: 1.3.10 platformdirs: 3.5.1 pluggy: 1.0.0 prometheus-client: 0.17.0 prompt-toolkit: 3.0.38 provenance-lib: 2023.5.1 psutil: 5.9.5 ptyprocess: 0.7.0 pure-eval: 0.2.2 pycparser: '2.21' pynndescent: 0.5.10 pyobjc-core: 9.1.1 pyobjc-framework-Cocoa: 9.1.1 pyparsing: 3.0.9 pyrsistent: 0.19.3 pytest: 7.3.1 python-dateutil: 2.8.2 python-json-logger: 2.0.7 pytz: '2023.3' pyzmq: 25.0.2 q2-alignment: 2023.5.0 q2-composition: 2023.5.0 q2-cutadapt: 2023.5.1 q2-dada2: 2023.5.0 q2-deblur: 2023.5.0 q2-demux: 2023.5.0 q2-diversity: 2023.5.1 q2-diversity-lib: 2023.5.0 q2-emperor: 2023.5.0 q2-feature-classifier: 2023.5.0 q2-feature-table: 2023.5.0 q2-fragment-insertion: 2023.5.0 q2-gneiss: 2023.5.0 q2-longitudinal: 2023.5.0 q2-metadata: 2023.5.0 q2-mystery-stew: 2023.5.0 q2-phylogeny: 2023.5.0 q2-quality-control: 2023.5.0 q2-quality-filter: 2023.5.0 q2-sample-classifier: 2023.5.0 q2-taxa: 2023.5.0 q2-types: 2023.5.0 q2-vsearch: 2023.5.0 q2cli: 2023.5.1 q2galaxy: 2023.5.0 q2templates: 2023.5.0 qiime2: 2023.5.1 requests: 2.31.0 rfc3339-validator: 0.1.4 rfc3986-validator: 0.1.1 scikit-bio: 0.5.7 scikit-learn: 0.24.1 scipy: 1.8.1 seaborn: 0.12.2 sepp: 4.3.10 setproctitle: 1.3.2 setuptools: 67.7.2 six: 1.16.0 sniffio: 1.3.0 soupsieve: 2.3.2.post1 stack-data: 0.6.2 statsmodels: 0.14.0 tblib: 1.7.0 terminado: 0.17.1 threadpoolctl: 3.1.0 tinycss2: 1.2.1 toml: 0.10.2 tomli: 2.0.1 tomlkit: 0.11.8 toolz: 0.12.0 tornado: 6.3.2 tqdm: 4.65.0 traitlets: 5.9.0 
typeguard: 2.13.3 types-cryptography: 3.3.23.2 types-enum34: 1.1.8 types-ipaddress: 1.0.8 types-paramiko: 3.0.0.10 types-requests: 2.31.0.1 types-six: 1.16.21.8 types-urllib3: 1.26.25.13 typing-extensions: 4.6.2 typing-utils: 0.1.0 tzlocal: '2.1' umap-learn: 0.5.3 unicodedata2: 15.0.0 unifrac: 1.0.0 uri-template: 1.3.0 urllib3: 2.0.2 wcwidth: 0.2.6 webcolors: '1.13' webencodings: 0.5.1 websocket-client: 1.5.2 wheel: 0.40.0 widgetsnbextension: 4.0.7 wrapt: 1.15.0 xmltodict: 0.13.0 xopen: 1.7.0 xyzservices: 2023.5.0 yq: 3.2.2 zipp: 3.15.0 zstandard: 0.19.0 7727c060-5384-445d-b007-b64b41a090ee/citations.bib000066400000000000000000000150371462552636000470770ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27/provenance/artifacts@article{framework|qiime2:2023.5.1|0, author = {Bolyen, Evan and Rideout, Jai Ram and Dillon, Matthew R. and Bokulich, Nicholas A. and Abnet, Christian C. and Al-Ghalith, Gabriel A. and Alexander, Harriet and Alm, Eric J. and Arumugam, Manimozhiyan and Asnicar, Francesco and Bai, Yang and Bisanz, Jordan E. and Bittinger, Kyle and Brejnrod, Asker and Brislawn, Colin J. and Brown, C. Titus and Callahan, Benjamin J. and Caraballo-Rodríguez, Andrés Mauricio and Chase, John and Cope, Emily K. and Da Silva, Ricardo and Diener, Christian and Dorrestein, Pieter C. and Douglas, Gavin M. and Durall, Daniel M. and Duvallet, Claire and Edwardson, Christian F. and Ernst, Madeleine and Estaki, Mehrbod and Fouquier, Jennifer and Gauglitz, Julia M. and Gibbons, Sean M. and Gibson, Deanna L. and Gonzalez, Antonio and Gorlick, Kestrel and Guo, Jiarong and Hillmann, Benjamin and Holmes, Susan and Holste, Hannes and Huttenhower, Curtis and Huttley, Gavin A. and Janssen, Stefan and Jarmusch, Alan K. and Jiang, Lingjing and Kaehler, Benjamin D. and Kang, Kyo Bin and Keefe, Christopher R. and Keim, Paul and Kelley, Scott T. and Knights, Dan and Koester, Irina and Kosciolek, Tomasz and Kreps, Jorden and Langille, Morgan G. I. and Lee, Joslynn and Ley, Ruth and Liu, Yong-Xin and Loftfield, Erikka and Lozupone, Catherine and Maher, Massoud and Marotz, Clarisse and Martin, Bryan D. and McDonald, Daniel and McIver, Lauren J. and Melnik, Alexey V. and Metcalf, Jessica L. and Morgan, Sydney C. and Morton, Jamie T. and Naimey, Ahmad Turan and Navas-Molina, Jose A. and Nothias, Louis Felix and Orchanian, Stephanie B. and Pearson, Talima and Peoples, Samuel L. and Petras, Daniel and Preuss, Mary Lai and Pruesse, Elmar and Rasmussen, Lasse Buur and Rivers, Adam and Robeson, Michael S. and Rosenthal, Patrick and Segata, Nicola and Shaffer, Michael and Shiffer, Arron and Sinha, Rashmi and Song, Se Jin and Spear, John R. and Swafford, Austin D. and Thompson, Luke R. and Torres, Pedro J. and Trinh, Pauline and Tripathi, Anupriya and Turnbaugh, Peter J. and Ul-Hasan, Sabah and van der Hooft, Justin J. J. and Vargas, Fernando and Vázquez-Baeza, Yoshiki and Vogtmann, Emily and von Hippel, Max and Walters, William and Wan, Yunhu and Wang, Mingxun and Warren, Jonathan and Weber, Kyle C. and Williamson, Charles H. D. and Willis, Amy D. and Xu, Zhenjiang Zech and Zaneveld, Jesse R. and Zhang, Yilong and Zhu, Qiyun and Knight, Rob and Caporaso, J. 
Gregory}, doi = {10.1038/s41587-019-0209-9}, issn = {1546-1696}, journal = {Nature Biotechnology}, number = {8}, pages = {852-857}, title = {Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2}, url = {https://doi.org/10.1038/s41587-019-0209-9}, volume = {37}, year = {2019} } @article{plugin|dummy-plugin:0.0.0-dev|0, author = {Unger, Donald L}, journal = {Arthritis & Rheumatology}, number = {5}, pages = {949--950}, publisher = {Wiley Online Library}, title = {Does knuckle cracking lead to arthritis of the fingers?}, volume = {41}, year = {1998} } @article{plugin|dummy-plugin:0.0.0-dev|1, author = {Berry, Michael Victor and Geim, Andre Konstantin}, journal = {European Journal of Physics}, number = {4}, pages = {307}, publisher = {IOP Publishing}, title = {Of flying frogs and levitrons}, volume = {18}, year = {1997} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|0, author = {Unger, Donald L}, journal = {Arthritis & Rheumatology}, number = {5}, pages = {949--950}, publisher = {Wiley Online Library}, title = {Does knuckle cracking lead to arthritis of the fingers?}, volume = {41}, year = {1998} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|1, author = {Berry, Michael Victor and Geim, Andre Konstantin}, journal = {European Journal of Physics}, number = {4}, pages = {307}, publisher = {IOP Publishing}, title = {Of flying frogs and levitrons}, volume = {18}, year = {1997} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|2, author = {Mayer, Hans C and Krechetnikov, Rouslan}, journal = {Physical Review E}, number = {4}, pages = {046117}, publisher = {APS}, title = {Walking with coffee: Why does it spill?}, volume = {85}, year = {2012} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|3, author = {Baerheim, Anders and Sandvik, Hogne}, journal = {BMJ}, number = {6970}, pages = {1689}, publisher = {British Medical Journal Publishing Group}, title = {Effect of ale, garlic, and soured cream on the appetite of leeches}, volume = {309}, year = {1994} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|4, author = {Witcombe, Brian and Meyer, Dan}, journal = {BMJ}, number = {7582}, pages = {1285--1287}, publisher = {British Medical Journal Publishing Group}, title = {Sword swallowing and its side effects}, volume = {333}, year = {2006} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|5, author = {Reimers, Eigil and Eftestøl, Sindre}, journal = {Arctic, antarctic, and alpine research}, number = {4}, pages = {483--489}, publisher = {BioOne}, title = {Response behaviors of Svalbard reindeer towards humans and humans disguised as polar bears on Edgeøya}, volume = {44}, year = {2012} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|6, author = {Barbeito, Manuel S and Mathews, Charles T and Taylor, Larry A}, journal = {Applied microbiology}, number = {4}, pages = {899--906}, publisher = {Am Soc Microbiol}, title = {Microbiological laboratory hazard of bearded men}, volume = {15}, year = {1967} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|7, author = {Krauth, Stefanie J and Coulibaly, Jean T and Knopp, Stefanie and Traoré, Mahamadou and N'Goran, Eliézer K and Utzinger, Jürg}, journal = {PLoS neglected tropical diseases}, number = {12}, pages = 
{e1969}, publisher = {Public Library of Science}, title = {An in-depth analysis of a piece of shit: distribution of Schistosoma mansoni and hookworm eggs in human stool}, volume = {6}, year = {2012} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|8, author = {Silvers, Vicki L and Kreiner, David S}, journal = {Literacy Research and Instruction}, number = {3}, pages = {217--223}, publisher = {Taylor & Francis}, title = {The effects of pre-existing inappropriate highlighting on reading comprehension}, volume = {36}, year = {1997} } 7727c060-5384-445d-b007-b64b41a090ee/metadata.yaml000066400000000000000000000001431462552636000470600ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27/provenance/artifactsuuid: 7727c060-5384-445d-b007-b64b41a090ee type: IntSequence2 format: IntSequenceV2DirectoryFormat 8dea2f1a-2164-4a85-9f7d-e0641b1db22b/000077500000000000000000000000001462552636000447755ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27/provenance/artifacts8dea2f1a-2164-4a85-9f7d-e0641b1db22b/VERSION000066400000000000000000000000471462552636000460460ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27/provenance/artifactsQIIME 2 archive: 6 framework: 2023.5.1 8dea2f1a-2164-4a85-9f7d-e0641b1db22b/action/000077500000000000000000000000001462552636000462525ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27/provenance/artifacts8dea2f1a-2164-4a85-9f7d-e0641b1db22b/action/action.yaml000066400000000000000000000160251462552636000504170ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27/provenance/artifactsexecution: uuid: b49e497c-19b2-49f7-b9a2-0d837016c151 runtime: start: 2023-07-26T15:45:12.596590-07:00 end: 2023-07-26T15:45:12.606734-07:00 duration: 10144 microseconds action: type: import transformers: output: - from: builtins:list to: IntSequenceDirectoryFormat plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceDirectoryFormat|0' - !cite 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0' environment: platform: macosx-10.9-x86_64 python: |- 3.8.16 | packaged by conda-forge | (default, Feb 1 2023, 16:05:36) [Clang 14.0.6 ] framework: version: 2023.5.1 website: https://qiime2.org citations: - !cite 'framework|qiime2:2023.5.1|0' plugins: dummy-plugin: version: 0.0.0-dev website: https://github.com/qiime2/qiime2 citations: - !cite 'plugin|dummy-plugin:0.0.0-dev|0' - !cite 'plugin|dummy-plugin:0.0.0-dev|1' python-packages: CacheControl: 0.12.11 Cython: 0.29.35 DendroPy: 4.5.2 Jinja2: 3.1.2 MarkupSafe: 2.1.2 Pillow: 9.5.0 PyJWT: 2.7.0 PyNaCl: 1.5.0 PySocks: 1.7.1 PyYAML: '6.0' Pygments: 2.15.1 Send2Trash: 1.8.2 altair: 5.0.1 anyio: 3.7.0 appdirs: 1.4.4 appnope: 0.1.3 argcomplete: 3.0.8 argon2-cffi: 21.3.0 argon2-cffi-bindings: 21.2.0 arrow: 1.2.3 astor: 0.8.1 asttokens: 2.2.1 atpublic: 3.0.1 attrs: 23.1.0 backcall: 0.2.0 backports.functools-lru-cache: 1.6.4 bcrypt: 3.2.2 beautifulsoup4: 4.12.2 bibtexparser: 1.4.0 biom-format: 2.1.12 bleach: 6.0.0 bokeh: 3.1.1 cached-property: 1.5.2 certifi: 2023.5.7 cffi: 1.15.1 
charset-normalizer: 3.1.0 click: 8.1.3 colorama: 0.4.6 comm: 0.1.3 contourpy: 1.0.7 cryptography: 40.0.2 cutadapt: '4.4' cycler: 0.11.0 deblur: 1.1.1 debugpy: 1.6.7 decorator: 4.4.2 defusedxml: 0.7.1 dill: 0.3.6 dnaio: 0.10.0 emperor: 1.0.3 entrypoints: '0.4' exceptiongroup: 1.1.1 executing: 1.2.0 fastcluster: 1.2.6 fastjsonschema: 2.17.1 flit-core: 3.9.0 flufl.lock: '7.1' fonttools: 4.39.4 formulaic: 0.6.1 fqdn: 1.5.1 future: 0.18.3 globus-sdk: 3.19.0 gneiss: 0.4.6 graphlib-backport: 1.0.3 h5py: 2.10.0 hdmedians: 0.14.2 idna: '3.4' ijson: 3.2.0.post0 importlib-metadata: 4.8.3 importlib-resources: 5.12.0 iniconfig: 2.0.0 interface-meta: 1.3.0 iow: 1.0.5 ipykernel: 6.23.1 ipython: 8.12.2 ipython-genutils: 0.2.0 ipywidgets: 8.0.6 isal: 1.1.0 isoduration: 20.11.0 jedi: 0.18.2 joblib: 1.2.0 jsonpointer: '2.0' jsonschema: 4.17.3 jupyter-client: 8.2.0 jupyter-core: 5.3.0 jupyter-events: 0.6.3 jupyter-server: 2.6.0 jupyter-server-terminals: 0.4.4 jupyterlab-pygments: 0.2.2 jupyterlab-widgets: 3.0.7 kiwisolver: 1.4.4 llvmlite: 0.39.1 lockfile: 0.12.2 lxml: 4.9.2 lz4: 4.3.2 matplotlib: 3.6.0 matplotlib-inline: 0.1.6 mistune: 2.0.5 msgpack: 1.0.5 munkres: 1.1.4 mypy: 1.3.0 mypy-extensions: 1.0.0 natsort: 8.3.1 nbclassic: 1.0.0 nbclient: 0.8.0 nbconvert: 7.4.0 nbformat: 5.8.0 nest-asyncio: 1.5.6 networkx: '3.1' nlopt: 2.7.1 nose: 1.3.7 notebook: 6.5.4 notebook-shim: 0.2.3 numba: 0.56.4 numpy: 1.23.5 overrides: 7.3.1 packaging: '23.1' pandas: 1.5.3 pandocfilters: 1.5.0 paramiko: 3.2.0 parsl: 2023.5.29 parso: 0.8.3 patsy: 0.5.3 pexpect: 4.8.0 pickleshare: 0.7.5 pip: 23.1.2 pkgutil-resolve-name: 1.3.10 platformdirs: 3.5.1 pluggy: 1.0.0 prometheus-client: 0.17.0 prompt-toolkit: 3.0.38 provenance-lib: 2023.5.1 psutil: 5.9.5 ptyprocess: 0.7.0 pure-eval: 0.2.2 pycparser: '2.21' pynndescent: 0.5.10 pyobjc-core: 9.1.1 pyobjc-framework-Cocoa: 9.1.1 pyparsing: 3.0.9 pyrsistent: 0.19.3 pytest: 7.3.1 python-dateutil: 2.8.2 python-json-logger: 2.0.7 pytz: '2023.3' pyzmq: 25.0.2 q2-alignment: 2023.5.0 q2-composition: 2023.5.0 q2-cutadapt: 2023.5.1 q2-dada2: 2023.5.0 q2-deblur: 2023.5.0 q2-demux: 2023.5.0 q2-diversity: 2023.5.1 q2-diversity-lib: 2023.5.0 q2-emperor: 2023.5.0 q2-feature-classifier: 2023.5.0 q2-feature-table: 2023.5.0 q2-fragment-insertion: 2023.5.0 q2-gneiss: 2023.5.0 q2-longitudinal: 2023.5.0 q2-metadata: 2023.5.0 q2-mystery-stew: 2023.5.0 q2-phylogeny: 2023.5.0 q2-quality-control: 2023.5.0 q2-quality-filter: 2023.5.0 q2-sample-classifier: 2023.5.0 q2-taxa: 2023.5.0 q2-types: 2023.5.0 q2-vsearch: 2023.5.0 q2cli: 2023.5.1 q2galaxy: 2023.5.0 q2templates: 2023.5.0 qiime2: 2023.5.1 requests: 2.31.0 rfc3339-validator: 0.1.4 rfc3986-validator: 0.1.1 scikit-bio: 0.5.7 scikit-learn: 0.24.1 scipy: 1.8.1 seaborn: 0.12.2 sepp: 4.3.10 setproctitle: 1.3.2 setuptools: 67.7.2 six: 1.16.0 sniffio: 1.3.0 soupsieve: 2.3.2.post1 stack-data: 0.6.2 statsmodels: 0.14.0 tblib: 1.7.0 terminado: 0.17.1 threadpoolctl: 3.1.0 tinycss2: 1.2.1 toml: 0.10.2 tomli: 2.0.1 tomlkit: 0.11.8 toolz: 0.12.0 tornado: 6.3.2 tqdm: 4.65.0 traitlets: 5.9.0 typeguard: 2.13.3 types-cryptography: 3.3.23.2 types-enum34: 1.1.8 types-ipaddress: 1.0.8 types-paramiko: 3.0.0.10 types-requests: 2.31.0.1 types-six: 1.16.21.8 types-urllib3: 1.26.25.13 typing-extensions: 4.6.2 typing-utils: 0.1.0 tzlocal: '2.1' umap-learn: 0.5.3 unicodedata2: 15.0.0 unifrac: 1.0.0 uri-template: 1.3.0 urllib3: 2.0.2 wcwidth: 0.2.6 webcolors: '1.13' webencodings: 0.5.1 websocket-client: 1.5.2 wheel: 0.40.0 widgetsnbextension: 4.0.7 wrapt: 1.15.0 xmltodict: 0.13.0 xopen: 
1.7.0 xyzservices: 2023.5.0 yq: 3.2.2 zipp: 3.15.0 zstandard: 0.19.0 8dea2f1a-2164-4a85-9f7d-e0641b1db22b/citations.bib000066400000000000000000000100561462552636000474520ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27/provenance/artifacts@article{framework|qiime2:2023.5.1|0, author = {Bolyen, Evan and Rideout, Jai Ram and Dillon, Matthew R. and Bokulich, Nicholas A. and Abnet, Christian C. and Al-Ghalith, Gabriel A. and Alexander, Harriet and Alm, Eric J. and Arumugam, Manimozhiyan and Asnicar, Francesco and Bai, Yang and Bisanz, Jordan E. and Bittinger, Kyle and Brejnrod, Asker and Brislawn, Colin J. and Brown, C. Titus and Callahan, Benjamin J. and Caraballo-Rodríguez, Andrés Mauricio and Chase, John and Cope, Emily K. and Da Silva, Ricardo and Diener, Christian and Dorrestein, Pieter C. and Douglas, Gavin M. and Durall, Daniel M. and Duvallet, Claire and Edwardson, Christian F. and Ernst, Madeleine and Estaki, Mehrbod and Fouquier, Jennifer and Gauglitz, Julia M. and Gibbons, Sean M. and Gibson, Deanna L. and Gonzalez, Antonio and Gorlick, Kestrel and Guo, Jiarong and Hillmann, Benjamin and Holmes, Susan and Holste, Hannes and Huttenhower, Curtis and Huttley, Gavin A. and Janssen, Stefan and Jarmusch, Alan K. and Jiang, Lingjing and Kaehler, Benjamin D. and Kang, Kyo Bin and Keefe, Christopher R. and Keim, Paul and Kelley, Scott T. and Knights, Dan and Koester, Irina and Kosciolek, Tomasz and Kreps, Jorden and Langille, Morgan G. I. and Lee, Joslynn and Ley, Ruth and Liu, Yong-Xin and Loftfield, Erikka and Lozupone, Catherine and Maher, Massoud and Marotz, Clarisse and Martin, Bryan D. and McDonald, Daniel and McIver, Lauren J. and Melnik, Alexey V. and Metcalf, Jessica L. and Morgan, Sydney C. and Morton, Jamie T. and Naimey, Ahmad Turan and Navas-Molina, Jose A. and Nothias, Louis Felix and Orchanian, Stephanie B. and Pearson, Talima and Peoples, Samuel L. and Petras, Daniel and Preuss, Mary Lai and Pruesse, Elmar and Rasmussen, Lasse Buur and Rivers, Adam and Robeson, Michael S. and Rosenthal, Patrick and Segata, Nicola and Shaffer, Michael and Shiffer, Arron and Sinha, Rashmi and Song, Se Jin and Spear, John R. and Swafford, Austin D. and Thompson, Luke R. and Torres, Pedro J. and Trinh, Pauline and Tripathi, Anupriya and Turnbaugh, Peter J. and Ul-Hasan, Sabah and van der Hooft, Justin J. J. and Vargas, Fernando and Vázquez-Baeza, Yoshiki and Vogtmann, Emily and von Hippel, Max and Walters, William and Wan, Yunhu and Wang, Mingxun and Warren, Jonathan and Weber, Kyle C. and Williamson, Charles H. D. and Willis, Amy D. and Xu, Zhenjiang Zech and Zaneveld, Jesse R. and Zhang, Yilong and Zhu, Qiyun and Knight, Rob and Caporaso, J. 
Gregory}, doi = {10.1038/s41587-019-0209-9}, issn = {1546-1696}, journal = {Nature Biotechnology}, number = {8}, pages = {852-857}, title = {Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2}, url = {https://doi.org/10.1038/s41587-019-0209-9}, volume = {37}, year = {2019} } @article{plugin|dummy-plugin:0.0.0-dev|0, author = {Unger, Donald L}, journal = {Arthritis & Rheumatology}, number = {5}, pages = {949--950}, publisher = {Wiley Online Library}, title = {Does knuckle cracking lead to arthritis of the fingers?}, volume = {41}, year = {1998} } @article{plugin|dummy-plugin:0.0.0-dev|1, author = {Berry, Michael Victor and Geim, Andre Konstantin}, journal = {European Journal of Physics}, number = {4}, pages = {307}, publisher = {IOP Publishing}, title = {Of flying frogs and levitrons}, volume = {18}, year = {1997} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceDirectoryFormat|0, author = {Krauth, Stefanie J and Coulibaly, Jean T and Knopp, Stefanie and Traoré, Mahamadou and N'Goran, Eliézer K and Utzinger, Jürg}, journal = {PLoS neglected tropical diseases}, number = {12}, pages = {e1969}, publisher = {Public Library of Science}, title = {An in-depth analysis of a piece of shit: distribution of Schistosoma mansoni and hookworm eggs in human stool}, volume = {6}, year = {2012} } @article{view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0, author = {Mayer, Hans C and Krechetnikov, Rouslan}, journal = {Physical Review E}, number = {4}, pages = {046117}, publisher = {APS}, title = {Walking with coffee: Why does it spill?}, volume = {85}, year = {2012} } 8dea2f1a-2164-4a85-9f7d-e0641b1db22b/metadata.yaml000066400000000000000000000001411462552636000474350ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27/provenance/artifactsuuid: 8dea2f1a-2164-4a85-9f7d-e0641b1db22b type: IntSequence1 format: IntSequenceDirectoryFormat citations.bib000066400000000000000000000106011462552636000405570ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27/provenance@article{framework|qiime2:2023.5.1|0, author = {Bolyen, Evan and Rideout, Jai Ram and Dillon, Matthew R. and Bokulich, Nicholas A. and Abnet, Christian C. and Al-Ghalith, Gabriel A. and Alexander, Harriet and Alm, Eric J. and Arumugam, Manimozhiyan and Asnicar, Francesco and Bai, Yang and Bisanz, Jordan E. and Bittinger, Kyle and Brejnrod, Asker and Brislawn, Colin J. and Brown, C. Titus and Callahan, Benjamin J. and Caraballo-Rodríguez, Andrés Mauricio and Chase, John and Cope, Emily K. and Da Silva, Ricardo and Diener, Christian and Dorrestein, Pieter C. and Douglas, Gavin M. and Durall, Daniel M. and Duvallet, Claire and Edwardson, Christian F. and Ernst, Madeleine and Estaki, Mehrbod and Fouquier, Jennifer and Gauglitz, Julia M. and Gibbons, Sean M. and Gibson, Deanna L. and Gonzalez, Antonio and Gorlick, Kestrel and Guo, Jiarong and Hillmann, Benjamin and Holmes, Susan and Holste, Hannes and Huttenhower, Curtis and Huttley, Gavin A. and Janssen, Stefan and Jarmusch, Alan K. and Jiang, Lingjing and Kaehler, Benjamin D. and Kang, Kyo Bin and Keefe, Christopher R. and Keim, Paul and Kelley, Scott T. and Knights, Dan and Koester, Irina and Kosciolek, Tomasz and Kreps, Jorden and Langille, Morgan G. I. 
and Lee, Joslynn and Ley, Ruth and Liu, Yong-Xin and Loftfield, Erikka and Lozupone, Catherine and Maher, Massoud and Marotz, Clarisse and Martin, Bryan D. and McDonald, Daniel and McIver, Lauren J. and Melnik, Alexey V. and Metcalf, Jessica L. and Morgan, Sydney C. and Morton, Jamie T. and Naimey, Ahmad Turan and Navas-Molina, Jose A. and Nothias, Louis Felix and Orchanian, Stephanie B. and Pearson, Talima and Peoples, Samuel L. and Petras, Daniel and Preuss, Mary Lai and Pruesse, Elmar and Rasmussen, Lasse Buur and Rivers, Adam and Robeson, Michael S. and Rosenthal, Patrick and Segata, Nicola and Shaffer, Michael and Shiffer, Arron and Sinha, Rashmi and Song, Se Jin and Spear, John R. and Swafford, Austin D. and Thompson, Luke R. and Torres, Pedro J. and Trinh, Pauline and Tripathi, Anupriya and Turnbaugh, Peter J. and Ul-Hasan, Sabah and van der Hooft, Justin J. J. and Vargas, Fernando and Vázquez-Baeza, Yoshiki and Vogtmann, Emily and von Hippel, Max and Walters, William and Wan, Yunhu and Wang, Mingxun and Warren, Jonathan and Weber, Kyle C. and Williamson, Charles H. D. and Willis, Amy D. and Xu, Zhenjiang Zech and Zaneveld, Jesse R. and Zhang, Yilong and Zhu, Qiyun and Knight, Rob and Caporaso, J. Gregory}, doi = {10.1038/s41587-019-0209-9}, issn = {1546-1696}, journal = {Nature Biotechnology}, number = {8}, pages = {852-857}, title = {Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2}, url = {https://doi.org/10.1038/s41587-019-0209-9}, volume = {37}, year = {2019} } @article{action|dummy-plugin:0.0.0-dev|method:concatenate_ints|0, author = {Baerheim, Anders and Sandvik, Hogne}, journal = {BMJ}, number = {6970}, pages = {1689}, publisher = {British Medical Journal Publishing Group}, title = {Effect of ale, garlic, and soured cream on the appetite of leeches}, volume = {309}, year = {1994} } @article{plugin|dummy-plugin:0.0.0-dev|0, author = {Unger, Donald L}, journal = {Arthritis & Rheumatology}, number = {5}, pages = {949--950}, publisher = {Wiley Online Library}, title = {Does knuckle cracking lead to arthritis of the fingers?}, volume = {41}, year = {1998} } @article{plugin|dummy-plugin:0.0.0-dev|1, author = {Berry, Michael Victor and Geim, Andre Konstantin}, journal = {European Journal of Physics}, number = {4}, pages = {307}, publisher = {IOP Publishing}, title = {Of flying frogs and levitrons}, volume = {18}, year = {1997} } @article{view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0, author = {Mayer, Hans C and Krechetnikov, Rouslan}, journal = {Physical Review E}, number = {4}, pages = {046117}, publisher = {APS}, title = {Walking with coffee: Why does it spill?}, volume = {85}, year = {2012} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceDirectoryFormat|0, author = {Krauth, Stefanie J and Coulibaly, Jean T and Knopp, Stefanie and Traoré, Mahamadou and N'Goran, Eliézer K and Utzinger, Jürg}, journal = {PLoS neglected tropical diseases}, number = {12}, pages = {e1969}, publisher = {Public Library of Science}, title = {An in-depth analysis of a piece of shit: distribution of Schistosoma mansoni and hookworm eggs in human stool}, volume = {6}, year = {2012} } metadata.yaml000066400000000000000000000001411462552636000405460ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-v6/6facaf61-1676-45eb-ada0-d530be678b27/provenanceuuid: 6facaf61-1676-45eb-ada0-d530be678b27 type: IntSequence1 format: IntSequenceDirectoryFormat 
qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/000077500000000000000000000000001462552636000300455ustar00rootroot00000000000000220b8d2a-a951-4930-9591-f76d3071db3d/000077500000000000000000000000001462552636000344055ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-mdVERSION000066400000000000000000000000711462552636000354530ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3dQIIME 2 archive: 6 framework: 2023.5.0+69.g6316efe.dirty checksums.md5000066400000000000000000000042011462552636000367760ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d476f4ac80cf5d54ed7d080461fbafb1d VERSION d5a81e4dfb50224bed02d8f1a8c9bef2 metadata.yaml 5943ea6156feddfd2b51547256f0038c data/ints.txt 476f4ac80cf5d54ed7d080461fbafb1d provenance/VERSION 8823efccad896347fad9cb7236da7928 provenance/citations.bib d5a81e4dfb50224bed02d8f1a8c9bef2 provenance/metadata.yaml 18e12febaaf2de619388644c6b8fc5ff provenance/action/action.yaml 476f4ac80cf5d54ed7d080461fbafb1d provenance/artifacts/0bb6d731-155a-4dd0-8a1e-98827bc4e0bf/VERSION ca04942adda98f77c8c55cdc6f16ad46 provenance/artifacts/0bb6d731-155a-4dd0-8a1e-98827bc4e0bf/citations.bib 7e4678992f0f120e32d4046ca636ef2a provenance/artifacts/0bb6d731-155a-4dd0-8a1e-98827bc4e0bf/metadata.yaml 09d60644328cde75d246974dcd53bbdc provenance/artifacts/0bb6d731-155a-4dd0-8a1e-98827bc4e0bf/action/action.yaml 476f4ac80cf5d54ed7d080461fbafb1d provenance/artifacts/8f71b73d-b028-4cbc-9894-738bdfe718bf/VERSION 63e87f1600d294e90f3613b7bae4a402 provenance/artifacts/8f71b73d-b028-4cbc-9894-738bdfe718bf/citations.bib 1228fb717c8d806690cc1617e92c1ffc provenance/artifacts/8f71b73d-b028-4cbc-9894-738bdfe718bf/metadata.yaml 84d366e8cc95a67f2e00cd7a843ffcb9 provenance/artifacts/8f71b73d-b028-4cbc-9894-738bdfe718bf/action/action.yaml 476f4ac80cf5d54ed7d080461fbafb1d provenance/artifacts/be472b56-d205-43ee-8180-474da575c4d5/VERSION 049ad4ed7120ef1a9bbfaa8cb7e8a57f provenance/artifacts/be472b56-d205-43ee-8180-474da575c4d5/citations.bib 48432017078c1670582edbe665fa1203 provenance/artifacts/be472b56-d205-43ee-8180-474da575c4d5/metadata.yaml a3a3bdd1caef224c8d56ee8b7763ef14 provenance/artifacts/be472b56-d205-43ee-8180-474da575c4d5/action/action.yaml 02601af550914cff73314852de0bf655 provenance/artifacts/be472b56-d205-43ee-8180-474da575c4d5/action/metadata.tsv 476f4ac80cf5d54ed7d080461fbafb1d provenance/artifacts/e6b37bae-3a14-40f7-87b4-52cf5c7c7a1d/VERSION 427c8401eefda7d4f046483924278057 provenance/artifacts/e6b37bae-3a14-40f7-87b4-52cf5c7c7a1d/citations.bib 442f99778614d2bab37538f798eabbdc provenance/artifacts/e6b37bae-3a14-40f7-87b4-52cf5c7c7a1d/metadata.yaml ff423be4f47bbe53c284a1fd0f9f0345 provenance/artifacts/e6b37bae-3a14-40f7-87b4-52cf5c7c7a1d/action/action.yaml data/000077500000000000000000000000001462552636000353165ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3dints.txt000066400000000000000000000000261462552636000370320ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/data1 1 2 1 1 2 3 5 81 64 
metadata.yaml000066400000000000000000000001411462552636000370450ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3duuid: 220b8d2a-a951-4930-9591-f76d3071db3d type: IntSequence1 format: IntSequenceDirectoryFormat provenance/000077500000000000000000000000001462552636000365455ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3dVERSION000066400000000000000000000000711462552636000376130ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenanceQIIME 2 archive: 6 framework: 2023.5.0+69.g6316efe.dirty action/000077500000000000000000000000001462552636000400225ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenanceaction.yaml000066400000000000000000000207201462552636000421640ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/actionexecution: uuid: 63683a81-5118-4a13-9ead-d44df80115d6 runtime: start: 2023-08-25T14:55:38.289078-07:00 end: 2023-08-25T14:55:38.315165-07:00 duration: 26087 microseconds execution_context: type: synchronous action: type: method plugin: !ref 'environment:plugins:dummy-plugin' action: concatenate_ints inputs: - ints1: e6b37bae-3a14-40f7-87b4-52cf5c7c7a1d - ints2: be472b56-d205-43ee-8180-474da575c4d5 - ints3: 0bb6d731-155a-4dd0-8a1e-98827bc4e0bf parameters: - int1: 81 - int2: 64 output-name: concatenated_ints citations: - !cite 'action|dummy-plugin:0.0.0-dev|method:concatenate_ints|0' transformers: inputs: ints1: - from: IntSequenceDirectoryFormat to: builtins:list plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0' ints2: - from: IntSequenceDirectoryFormat to: builtins:list plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0' ints3: - from: IntSequenceV2DirectoryFormat to: builtins:list plugin: !ref 'environment:plugins:dummy-plugin' output: - from: builtins:list to: IntSequenceDirectoryFormat plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceDirectoryFormat|0' - !cite 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0' environment: platform: macosx-10.9-x86_64 python: |- 3.8.16 | packaged by conda-forge | (default, Feb 1 2023, 16:05:36) [Clang 14.0.6 ] framework: version: 2023.5.0+69.g6316efe.dirty website: https://qiime2.org citations: - !cite 'framework|qiime2:2023.5.0+69.g6316efe.dirty|0' plugins: dummy-plugin: version: 0.0.0-dev website: https://github.com/qiime2/qiime2 citations: - !cite 'plugin|dummy-plugin:0.0.0-dev|0' - !cite 'plugin|dummy-plugin:0.0.0-dev|1' python-packages: CacheControl: 0.12.11 Cython: 0.29.35 DendroPy: 4.5.2 Jinja2: 3.1.2 MarkupSafe: 2.1.2 Pillow: 9.5.0 PyJWT: 2.7.0 PyNaCl: 1.5.0 PySocks: 1.7.1 PyYAML: '6.0' Pygments: 2.15.1 Send2Trash: 1.8.2 altair: 5.0.1 anyio: 3.7.0 appdirs: 1.4.4 appnope: 0.1.3 argcomplete: 3.0.8 argon2-cffi: 21.3.0 argon2-cffi-bindings: 21.2.0 astor: 0.8.1 astroid: 2.15.5 asttokens: 2.2.1 atpublic: 3.0.1 attrs: 23.1.0 backcall: 0.2.0 backports.functools-lru-cache: 1.6.4 bcrypt: 3.2.2 
beautifulsoup4: 4.12.2 bibtexparser: 1.4.0 biom-format: 2.1.12 bleach: 6.0.0 bokeh: 3.1.1 cGraph: '0.1' cached-property: 1.5.2 certifi: 2023.5.7 cffi: 1.15.1 charset-normalizer: 3.1.0 click: 8.1.3 code2flow: 2.5.1 colorama: 0.4.6 comm: 0.1.3 contourpy: 1.0.7 cryptography: 40.0.2 cutadapt: '4.4' cycler: 0.11.0 deblur: 1.1.1 debugpy: 1.6.7 decorator: 4.4.2 defusedxml: 0.7.1 dill: 0.3.6 dnaio: 0.10.0 emperor: 1.0.3 entrypoints: '0.4' exceptiongroup: 1.1.1 executing: 1.2.0 fastcluster: 1.2.6 fastjsonschema: 2.17.1 flake8: 6.0.0 flit-core: 3.9.0 flufl.lock: '7.1' fonttools: 4.39.4 formulaic: 0.6.1 future: 0.18.3 globus-sdk: 3.19.0 gneiss: 0.4.6 graphlib-backport: 1.0.3 graphviz: 0.20.1 h5py: 2.10.0 hdmedians: 0.14.2 idna: '3.4' ijson: 3.2.0.post0 importlib-metadata: 4.8.3 importlib-resources: 5.12.0 iniconfig: 2.0.0 interface-meta: 1.3.0 iow: 1.0.5 ipdb: 0.13.13 ipykernel: 6.23.1 ipython: 8.12.2 ipython-genutils: 0.2.0 ipywidgets: 8.0.6 isal: 1.1.0 isort: 5.12.0 jedi: 0.18.2 joblib: 1.2.0 jsonschema: 4.17.3 jupyter-client: 8.2.0 jupyter-core: 5.3.0 jupyter-events: 0.6.3 jupyter-server: 2.6.0 jupyter-server-terminals: 0.4.4 jupyterlab-pygments: 0.2.2 jupyterlab-widgets: 3.0.7 kiwisolver: 1.4.4 lazy-object-proxy: 1.9.0 llvmlite: 0.39.1 lockfile: 0.12.2 lxml: 4.9.2 lz4: 4.3.2 matplotlib: 3.6.0 matplotlib-inline: 0.1.6 mccabe: 0.7.0 mistune: 2.0.5 msgpack: 1.0.5 munkres: 1.1.4 mypy: 1.3.0 mypy-extensions: 1.0.0 natsort: 8.3.1 nbclassic: 1.0.0 nbclient: 0.8.0 nbconvert: 7.4.0 nbformat: 5.8.0 nest-asyncio: 1.5.6 networkx: '3.1' nlopt: 2.7.1 nose: 1.3.7 notebook: 6.5.4 notebook-shim: 0.2.3 numba: 0.56.4 numpy: 1.23.5 overrides: 7.3.1 packaging: '23.1' pandas: 1.5.3 pandocfilters: 1.5.0 paramiko: 3.2.0 parsl: 2023.5.29 parso: 0.8.3 patsy: 0.5.3 pexpect: 4.8.0 pickleshare: 0.7.5 pip: 23.1.2 pipdeptree: 2.9.3 pkgutil-resolve-name: 1.3.10 platformdirs: 3.5.1 pluggy: 1.0.0 prometheus-client: 0.17.0 prompt-toolkit: 3.0.38 provenance-lib: 2023.5.1 psutil: 5.9.5 ptyprocess: 0.7.0 pure-eval: 0.2.2 py2puml: 0.7.2 pyan3: 1.2.0 pycodestyle: 2.10.0 pycparser: '2.21' pyflakes: 3.0.1 pygraphviz: '1.11' pylint: 2.17.4 pynndescent: 0.5.10 pyobjc-core: 9.1.1 pyobjc-framework-Cocoa: 9.1.1 pyparsing: 3.0.9 pyrsistent: 0.19.3 pytest: 7.3.1 python-dateutil: 2.8.2 python-json-logger: 2.0.7 pytz: '2023.3' pyzmq: 25.0.2 q2-alignment: 2023.5.0 q2-composition: 2023.5.0 q2-cutadapt: 2023.5.1 q2-dada2: 2023.5.0 q2-deblur: 2023.5.0 q2-demux: 2023.5.0 q2-diversity: 2023.5.1 q2-diversity-lib: 2023.5.0 q2-emperor: 2023.5.0 q2-feature-classifier: 2023.5.0 q2-feature-table: 2023.5.0 q2-fragment-insertion: 2023.5.0 q2-gneiss: 2023.5.0 q2-longitudinal: 2023.5.0 q2-metadata: 2023.5.0 q2-mystery-stew: 2023.5.0 q2-phylogeny: 2023.5.0 q2-quality-control: 2023.5.0 q2-quality-filter: 2023.5.0 q2-sample-classifier: 2023.5.0 q2-taxa: 2023.5.0 q2-types: 2023.5.0 q2-vsearch: 2023.5.0 q2galaxy: 2023.5.0 q2lint: 0.0.1 q2templates: 2023.5.0 requests: 2.31.0 rfc3339-validator: 0.1.4 rfc3986-validator: 0.1.1 scikit-bio: 0.5.7 scikit-learn: 0.24.1 scipy: 1.8.1 seaborn: 0.12.2 sepp: 4.3.10 setproctitle: 1.3.2 setuptools: 67.7.2 six: 1.16.0 sniffio: 1.3.0 soupsieve: 2.3.2.post1 stack-data: 0.6.2 statsmodels: 0.14.0 tblib: 1.7.0 terminado: 0.17.1 threadpoolctl: 3.1.0 tinycss2: 1.2.1 toml: 0.10.2 tomli: 2.0.1 tomlkit: 0.11.8 toolz: 0.12.0 tornado: 6.3.2 tqdm: 4.65.0 traitlets: 5.9.0 typeguard: 2.13.3 types-cryptography: 3.3.23.2 types-enum34: 1.1.8 types-ipaddress: 1.0.8 types-paramiko: 3.0.0.10 types-requests: 2.31.0.1 types-six: 1.16.21.8 types-urllib3: 
1.26.25.13 typing-extensions: 4.6.2 typing-utils: 0.1.0 tzlocal: '2.1' umap-learn: 0.5.3 unicodedata2: 15.0.0 unifrac: 1.0.0 urllib3: 2.0.2 wcwidth: 0.2.6 webencodings: 0.5.1 websocket-client: 1.5.2 wheel: 0.40.0 widgetsnbextension: 4.0.7 wrapt: 1.15.0 xmltodict: 0.13.0 xopen: 1.7.0 xyzservices: 2023.5.0 yq: 3.2.2 zipp: 3.15.0 zstandard: 0.19.0 q2cli: 2023.5.0.dev0+21.gf9a7a5b qiime2: 2023.5.0+38.g72f1a6c.dirty artifacts/000077500000000000000000000000001462552636000405255ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance0bb6d731-155a-4dd0-8a1e-98827bc4e0bf/000077500000000000000000000000001462552636000454405ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifacts0bb6d731-155a-4dd0-8a1e-98827bc4e0bf/VERSION000066400000000000000000000000711462552636000465060ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifactsQIIME 2 archive: 6 framework: 2023.5.0+69.g6316efe.dirty 0bb6d731-155a-4dd0-8a1e-98827bc4e0bf/action/000077500000000000000000000000001462552636000467155ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifacts0bb6d731-155a-4dd0-8a1e-98827bc4e0bf/action/action.yaml000066400000000000000000000200611462552636000510550ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifactsexecution: uuid: 45d87840-efe3-437a-8544-b86fb5b8fed8 runtime: start: 2023-08-25T14:55:36.189680-07:00 end: 2023-08-25T14:55:36.200404-07:00 duration: 10724 microseconds action: type: import transformers: output: - from: builtins:list to: IntSequenceV2DirectoryFormat plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|0' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|1' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|2' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|3' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|4' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|5' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|6' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|7' - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|8' environment: platform: macosx-10.9-x86_64 python: |- 3.8.16 | packaged by conda-forge | (default, Feb 1 2023, 16:05:36) [Clang 14.0.6 ] framework: version: 2023.5.0+69.g6316efe.dirty website: https://qiime2.org citations: - !cite 'framework|qiime2:2023.5.0+69.g6316efe.dirty|0' plugins: dummy-plugin: version: 0.0.0-dev website: https://github.com/qiime2/qiime2 citations: - !cite 'plugin|dummy-plugin:0.0.0-dev|0' - !cite 'plugin|dummy-plugin:0.0.0-dev|1' python-packages: CacheControl: 0.12.11 Cython: 0.29.35 DendroPy: 4.5.2 Jinja2: 3.1.2 MarkupSafe: 2.1.2 Pillow: 9.5.0 PyJWT: 2.7.0 PyNaCl: 1.5.0 PySocks: 1.7.1 PyYAML: '6.0' Pygments: 2.15.1 
Send2Trash: 1.8.2 altair: 5.0.1 anyio: 3.7.0 appdirs: 1.4.4 appnope: 0.1.3 argcomplete: 3.0.8 argon2-cffi: 21.3.0 argon2-cffi-bindings: 21.2.0 astor: 0.8.1 astroid: 2.15.5 asttokens: 2.2.1 atpublic: 3.0.1 attrs: 23.1.0 backcall: 0.2.0 backports.functools-lru-cache: 1.6.4 bcrypt: 3.2.2 beautifulsoup4: 4.12.2 bibtexparser: 1.4.0 biom-format: 2.1.12 bleach: 6.0.0 bokeh: 3.1.1 cGraph: '0.1' cached-property: 1.5.2 certifi: 2023.5.7 cffi: 1.15.1 charset-normalizer: 3.1.0 click: 8.1.3 code2flow: 2.5.1 colorama: 0.4.6 comm: 0.1.3 contourpy: 1.0.7 cryptography: 40.0.2 cutadapt: '4.4' cycler: 0.11.0 deblur: 1.1.1 debugpy: 1.6.7 decorator: 4.4.2 defusedxml: 0.7.1 dill: 0.3.6 dnaio: 0.10.0 emperor: 1.0.3 entrypoints: '0.4' exceptiongroup: 1.1.1 executing: 1.2.0 fastcluster: 1.2.6 fastjsonschema: 2.17.1 flake8: 6.0.0 flit-core: 3.9.0 flufl.lock: '7.1' fonttools: 4.39.4 formulaic: 0.6.1 future: 0.18.3 globus-sdk: 3.19.0 gneiss: 0.4.6 graphlib-backport: 1.0.3 graphviz: 0.20.1 h5py: 2.10.0 hdmedians: 0.14.2 idna: '3.4' ijson: 3.2.0.post0 importlib-metadata: 4.8.3 importlib-resources: 5.12.0 iniconfig: 2.0.0 interface-meta: 1.3.0 iow: 1.0.5 ipdb: 0.13.13 ipykernel: 6.23.1 ipython: 8.12.2 ipython-genutils: 0.2.0 ipywidgets: 8.0.6 isal: 1.1.0 isort: 5.12.0 jedi: 0.18.2 joblib: 1.2.0 jsonschema: 4.17.3 jupyter-client: 8.2.0 jupyter-core: 5.3.0 jupyter-events: 0.6.3 jupyter-server: 2.6.0 jupyter-server-terminals: 0.4.4 jupyterlab-pygments: 0.2.2 jupyterlab-widgets: 3.0.7 kiwisolver: 1.4.4 lazy-object-proxy: 1.9.0 llvmlite: 0.39.1 lockfile: 0.12.2 lxml: 4.9.2 lz4: 4.3.2 matplotlib: 3.6.0 matplotlib-inline: 0.1.6 mccabe: 0.7.0 mistune: 2.0.5 msgpack: 1.0.5 munkres: 1.1.4 mypy: 1.3.0 mypy-extensions: 1.0.0 natsort: 8.3.1 nbclassic: 1.0.0 nbclient: 0.8.0 nbconvert: 7.4.0 nbformat: 5.8.0 nest-asyncio: 1.5.6 networkx: '3.1' nlopt: 2.7.1 nose: 1.3.7 notebook: 6.5.4 notebook-shim: 0.2.3 numba: 0.56.4 numpy: 1.23.5 overrides: 7.3.1 packaging: '23.1' pandas: 1.5.3 pandocfilters: 1.5.0 paramiko: 3.2.0 parsl: 2023.5.29 parso: 0.8.3 patsy: 0.5.3 pexpect: 4.8.0 pickleshare: 0.7.5 pip: 23.1.2 pipdeptree: 2.9.3 pkgutil-resolve-name: 1.3.10 platformdirs: 3.5.1 pluggy: 1.0.0 prometheus-client: 0.17.0 prompt-toolkit: 3.0.38 provenance-lib: 2023.5.1 psutil: 5.9.5 ptyprocess: 0.7.0 pure-eval: 0.2.2 py2puml: 0.7.2 pyan3: 1.2.0 pycodestyle: 2.10.0 pycparser: '2.21' pyflakes: 3.0.1 pygraphviz: '1.11' pylint: 2.17.4 pynndescent: 0.5.10 pyobjc-core: 9.1.1 pyobjc-framework-Cocoa: 9.1.1 pyparsing: 3.0.9 pyrsistent: 0.19.3 pytest: 7.3.1 python-dateutil: 2.8.2 python-json-logger: 2.0.7 pytz: '2023.3' pyzmq: 25.0.2 q2-alignment: 2023.5.0 q2-composition: 2023.5.0 q2-cutadapt: 2023.5.1 q2-dada2: 2023.5.0 q2-deblur: 2023.5.0 q2-demux: 2023.5.0 q2-diversity: 2023.5.1 q2-diversity-lib: 2023.5.0 q2-emperor: 2023.5.0 q2-feature-classifier: 2023.5.0 q2-feature-table: 2023.5.0 q2-fragment-insertion: 2023.5.0 q2-gneiss: 2023.5.0 q2-longitudinal: 2023.5.0 q2-metadata: 2023.5.0 q2-mystery-stew: 2023.5.0 q2-phylogeny: 2023.5.0 q2-quality-control: 2023.5.0 q2-quality-filter: 2023.5.0 q2-sample-classifier: 2023.5.0 q2-taxa: 2023.5.0 q2-types: 2023.5.0 q2-vsearch: 2023.5.0 q2galaxy: 2023.5.0 q2lint: 0.0.1 q2templates: 2023.5.0 requests: 2.31.0 rfc3339-validator: 0.1.4 rfc3986-validator: 0.1.1 scikit-bio: 0.5.7 scikit-learn: 0.24.1 scipy: 1.8.1 seaborn: 0.12.2 sepp: 4.3.10 setproctitle: 1.3.2 setuptools: 67.7.2 six: 1.16.0 sniffio: 1.3.0 soupsieve: 2.3.2.post1 stack-data: 0.6.2 statsmodels: 0.14.0 tblib: 1.7.0 terminado: 0.17.1 threadpoolctl: 3.1.0 
tinycss2: 1.2.1 toml: 0.10.2 tomli: 2.0.1 tomlkit: 0.11.8 toolz: 0.12.0 tornado: 6.3.2 tqdm: 4.65.0 traitlets: 5.9.0 typeguard: 2.13.3 types-cryptography: 3.3.23.2 types-enum34: 1.1.8 types-ipaddress: 1.0.8 types-paramiko: 3.0.0.10 types-requests: 2.31.0.1 types-six: 1.16.21.8 types-urllib3: 1.26.25.13 typing-extensions: 4.6.2 typing-utils: 0.1.0 tzlocal: '2.1' umap-learn: 0.5.3 unicodedata2: 15.0.0 unifrac: 1.0.0 urllib3: 2.0.2 wcwidth: 0.2.6 webencodings: 0.5.1 websocket-client: 1.5.2 wheel: 0.40.0 widgetsnbextension: 4.0.7 wrapt: 1.15.0 xmltodict: 0.13.0 xopen: 1.7.0 xyzservices: 2023.5.0 yq: 3.2.2 zipp: 3.15.0 zstandard: 0.19.0 q2cli: 2023.5.0.dev0+21.gf9a7a5b qiime2: 2023.5.0+38.g72f1a6c.dirty 0bb6d731-155a-4dd0-8a1e-98827bc4e0bf/citations.bib000066400000000000000000000150611462552636000501160ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifacts@article{framework|qiime2:2023.5.0+69.g6316efe.dirty|0, author = {Bolyen, Evan and Rideout, Jai Ram and Dillon, Matthew R. and Bokulich, Nicholas A. and Abnet, Christian C. and Al-Ghalith, Gabriel A. and Alexander, Harriet and Alm, Eric J. and Arumugam, Manimozhiyan and Asnicar, Francesco and Bai, Yang and Bisanz, Jordan E. and Bittinger, Kyle and Brejnrod, Asker and Brislawn, Colin J. and Brown, C. Titus and Callahan, Benjamin J. and Caraballo-Rodríguez, Andrés Mauricio and Chase, John and Cope, Emily K. and Da Silva, Ricardo and Diener, Christian and Dorrestein, Pieter C. and Douglas, Gavin M. and Durall, Daniel M. and Duvallet, Claire and Edwardson, Christian F. and Ernst, Madeleine and Estaki, Mehrbod and Fouquier, Jennifer and Gauglitz, Julia M. and Gibbons, Sean M. and Gibson, Deanna L. and Gonzalez, Antonio and Gorlick, Kestrel and Guo, Jiarong and Hillmann, Benjamin and Holmes, Susan and Holste, Hannes and Huttenhower, Curtis and Huttley, Gavin A. and Janssen, Stefan and Jarmusch, Alan K. and Jiang, Lingjing and Kaehler, Benjamin D. and Kang, Kyo Bin and Keefe, Christopher R. and Keim, Paul and Kelley, Scott T. and Knights, Dan and Koester, Irina and Kosciolek, Tomasz and Kreps, Jorden and Langille, Morgan G. I. and Lee, Joslynn and Ley, Ruth and Liu, Yong-Xin and Loftfield, Erikka and Lozupone, Catherine and Maher, Massoud and Marotz, Clarisse and Martin, Bryan D. and McDonald, Daniel and McIver, Lauren J. and Melnik, Alexey V. and Metcalf, Jessica L. and Morgan, Sydney C. and Morton, Jamie T. and Naimey, Ahmad Turan and Navas-Molina, Jose A. and Nothias, Louis Felix and Orchanian, Stephanie B. and Pearson, Talima and Peoples, Samuel L. and Petras, Daniel and Preuss, Mary Lai and Pruesse, Elmar and Rasmussen, Lasse Buur and Rivers, Adam and Robeson, Michael S. and Rosenthal, Patrick and Segata, Nicola and Shaffer, Michael and Shiffer, Arron and Sinha, Rashmi and Song, Se Jin and Spear, John R. and Swafford, Austin D. and Thompson, Luke R. and Torres, Pedro J. and Trinh, Pauline and Tripathi, Anupriya and Turnbaugh, Peter J. and Ul-Hasan, Sabah and van der Hooft, Justin J. J. and Vargas, Fernando and Vázquez-Baeza, Yoshiki and Vogtmann, Emily and von Hippel, Max and Walters, William and Wan, Yunhu and Wang, Mingxun and Warren, Jonathan and Weber, Kyle C. and Williamson, Charles H. D. and Willis, Amy D. and Xu, Zhenjiang Zech and Zaneveld, Jesse R. and Zhang, Yilong and Zhu, Qiyun and Knight, Rob and Caporaso, J. 
Gregory}, doi = {10.1038/s41587-019-0209-9}, issn = {1546-1696}, journal = {Nature Biotechnology}, number = {8}, pages = {852-857}, title = {Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2}, url = {https://doi.org/10.1038/s41587-019-0209-9}, volume = {37}, year = {2019} } @article{plugin|dummy-plugin:0.0.0-dev|0, author = {Unger, Donald L}, journal = {Arthritis & Rheumatology}, number = {5}, pages = {949--950}, publisher = {Wiley Online Library}, title = {Does knuckle cracking lead to arthritis of the fingers?}, volume = {41}, year = {1998} } @article{plugin|dummy-plugin:0.0.0-dev|1, author = {Berry, Michael Victor and Geim, Andre Konstantin}, journal = {European Journal of Physics}, number = {4}, pages = {307}, publisher = {IOP Publishing}, title = {Of flying frogs and levitrons}, volume = {18}, year = {1997} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|0, author = {Unger, Donald L}, journal = {Arthritis & Rheumatology}, number = {5}, pages = {949--950}, publisher = {Wiley Online Library}, title = {Does knuckle cracking lead to arthritis of the fingers?}, volume = {41}, year = {1998} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|1, author = {Berry, Michael Victor and Geim, Andre Konstantin}, journal = {European Journal of Physics}, number = {4}, pages = {307}, publisher = {IOP Publishing}, title = {Of flying frogs and levitrons}, volume = {18}, year = {1997} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|2, author = {Mayer, Hans C and Krechetnikov, Rouslan}, journal = {Physical Review E}, number = {4}, pages = {046117}, publisher = {APS}, title = {Walking with coffee: Why does it spill?}, volume = {85}, year = {2012} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|3, author = {Baerheim, Anders and Sandvik, Hogne}, journal = {BMJ}, number = {6970}, pages = {1689}, publisher = {British Medical Journal Publishing Group}, title = {Effect of ale, garlic, and soured cream on the appetite of leeches}, volume = {309}, year = {1994} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|4, author = {Witcombe, Brian and Meyer, Dan}, journal = {BMJ}, number = {7582}, pages = {1285--1287}, publisher = {British Medical Journal Publishing Group}, title = {Sword swallowing and its side effects}, volume = {333}, year = {2006} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|5, author = {Reimers, Eigil and Eftestøl, Sindre}, journal = {Arctic, antarctic, and alpine research}, number = {4}, pages = {483--489}, publisher = {BioOne}, title = {Response behaviors of Svalbard reindeer towards humans and humans disguised as polar bears on Edgeøya}, volume = {44}, year = {2012} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|6, author = {Barbeito, Manuel S and Mathews, Charles T and Taylor, Larry A}, journal = {Applied microbiology}, number = {4}, pages = {899--906}, publisher = {Am Soc Microbiol}, title = {Microbiological laboratory hazard of bearded men}, volume = {15}, year = {1967} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|7, author = {Krauth, Stefanie J and Coulibaly, Jean T and Knopp, Stefanie and Traoré, Mahamadou and N'Goran, Eliézer K and Utzinger, Jürg}, journal = {PLoS neglected tropical diseases}, number = {12}, pages = 
{e1969}, publisher = {Public Library of Science}, title = {An in-depth analysis of a piece of shit: distribution of Schistosoma mansoni and hookworm eggs in human stool}, volume = {6}, year = {2012} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceV2DirectoryFormat|8, author = {Silvers, Vicki L and Kreiner, David S}, journal = {Literacy Research and Instruction}, number = {3}, pages = {217--223}, publisher = {Taylor & Francis}, title = {The effects of pre-existing inappropriate highlighting on reading comprehension}, volume = {36}, year = {1997} } 0bb6d731-155a-4dd0-8a1e-98827bc4e0bf/metadata.yaml000066400000000000000000000001431462552636000501020ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifactsuuid: 0bb6d731-155a-4dd0-8a1e-98827bc4e0bf type: IntSequence2 format: IntSequenceV2DirectoryFormat 8f71b73d-b028-4cbc-9894-738bdfe718bf/000077500000000000000000000000001462552636000454775ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifacts8f71b73d-b028-4cbc-9894-738bdfe718bf/VERSION000066400000000000000000000000711462552636000465450ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifactsQIIME 2 archive: 6 framework: 2023.5.0+69.g6316efe.dirty 8f71b73d-b028-4cbc-9894-738bdfe718bf/action/000077500000000000000000000000001462552636000467545ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifacts8f71b73d-b028-4cbc-9894-738bdfe718bf/action/action.yaml000066400000000000000000000162341462552636000511230ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifactsexecution: uuid: ca4ba114-22de-488a-a204-b72cade32677 runtime: start: 2023-08-25T14:53:24.932336-07:00 end: 2023-08-25T14:53:24.939605-07:00 duration: 7269 microseconds action: type: import transformers: output: - from: builtins:dict to: MappingDirectoryFormat plugin: !ref 'environment:plugins:dummy-plugin' environment: platform: macosx-10.9-x86_64 python: |- 3.8.16 | packaged by conda-forge | (default, Feb 1 2023, 16:05:36) [Clang 14.0.6 ] framework: version: 2023.5.0+69.g6316efe.dirty website: https://qiime2.org citations: - !cite 'framework|qiime2:2023.5.0+69.g6316efe.dirty|0' plugins: dummy-plugin: version: 0.0.0-dev website: https://github.com/qiime2/qiime2 citations: - !cite 'plugin|dummy-plugin:0.0.0-dev|0' - !cite 'plugin|dummy-plugin:0.0.0-dev|1' python-packages: CacheControl: 0.12.11 Cython: 0.29.35 DendroPy: 4.5.2 Jinja2: 3.1.2 MarkupSafe: 2.1.2 Pillow: 9.5.0 PyJWT: 2.7.0 PyNaCl: 1.5.0 PySocks: 1.7.1 PyYAML: '6.0' Pygments: 2.15.1 Send2Trash: 1.8.2 altair: 5.0.1 anyio: 3.7.0 appdirs: 1.4.4 appnope: 0.1.3 argcomplete: 3.0.8 argon2-cffi: 21.3.0 argon2-cffi-bindings: 21.2.0 astor: 0.8.1 astroid: 2.15.5 asttokens: 2.2.1 atpublic: 3.0.1 attrs: 23.1.0 backcall: 0.2.0 backports.functools-lru-cache: 1.6.4 bcrypt: 3.2.2 beautifulsoup4: 4.12.2 bibtexparser: 1.4.0 biom-format: 2.1.12 bleach: 6.0.0 bokeh: 3.1.1 cGraph: '0.1' cached-property: 1.5.2 certifi: 2023.5.7 cffi: 1.15.1 charset-normalizer: 3.1.0 click: 8.1.3 code2flow: 2.5.1 colorama: 0.4.6 comm: 0.1.3 
contourpy: 1.0.7 cryptography: 40.0.2 cutadapt: '4.4' cycler: 0.11.0 deblur: 1.1.1 debugpy: 1.6.7 decorator: 4.4.2 defusedxml: 0.7.1 dill: 0.3.6 dnaio: 0.10.0 emperor: 1.0.3 entrypoints: '0.4' exceptiongroup: 1.1.1 executing: 1.2.0 fastcluster: 1.2.6 fastjsonschema: 2.17.1 flake8: 6.0.0 flit-core: 3.9.0 flufl.lock: '7.1' fonttools: 4.39.4 formulaic: 0.6.1 future: 0.18.3 globus-sdk: 3.19.0 gneiss: 0.4.6 graphlib-backport: 1.0.3 graphviz: 0.20.1 h5py: 2.10.0 hdmedians: 0.14.2 idna: '3.4' ijson: 3.2.0.post0 importlib-metadata: 4.8.3 importlib-resources: 5.12.0 iniconfig: 2.0.0 interface-meta: 1.3.0 iow: 1.0.5 ipdb: 0.13.13 ipykernel: 6.23.1 ipython: 8.12.2 ipython-genutils: 0.2.0 ipywidgets: 8.0.6 isal: 1.1.0 isort: 5.12.0 jedi: 0.18.2 joblib: 1.2.0 jsonschema: 4.17.3 jupyter-client: 8.2.0 jupyter-core: 5.3.0 jupyter-events: 0.6.3 jupyter-server: 2.6.0 jupyter-server-terminals: 0.4.4 jupyterlab-pygments: 0.2.2 jupyterlab-widgets: 3.0.7 kiwisolver: 1.4.4 lazy-object-proxy: 1.9.0 llvmlite: 0.39.1 lockfile: 0.12.2 lxml: 4.9.2 lz4: 4.3.2 matplotlib: 3.6.0 matplotlib-inline: 0.1.6 mccabe: 0.7.0 mistune: 2.0.5 msgpack: 1.0.5 munkres: 1.1.4 mypy: 1.3.0 mypy-extensions: 1.0.0 natsort: 8.3.1 nbclassic: 1.0.0 nbclient: 0.8.0 nbconvert: 7.4.0 nbformat: 5.8.0 nest-asyncio: 1.5.6 networkx: '3.1' nlopt: 2.7.1 nose: 1.3.7 notebook: 6.5.4 notebook-shim: 0.2.3 numba: 0.56.4 numpy: 1.23.5 overrides: 7.3.1 packaging: '23.1' pandas: 1.5.3 pandocfilters: 1.5.0 paramiko: 3.2.0 parsl: 2023.5.29 parso: 0.8.3 patsy: 0.5.3 pexpect: 4.8.0 pickleshare: 0.7.5 pip: 23.1.2 pipdeptree: 2.9.3 pkgutil-resolve-name: 1.3.10 platformdirs: 3.5.1 pluggy: 1.0.0 prometheus-client: 0.17.0 prompt-toolkit: 3.0.38 provenance-lib: 2023.5.1 psutil: 5.9.5 ptyprocess: 0.7.0 pure-eval: 0.2.2 py2puml: 0.7.2 pyan3: 1.2.0 pycodestyle: 2.10.0 pycparser: '2.21' pyflakes: 3.0.1 pygraphviz: '1.11' pylint: 2.17.4 pynndescent: 0.5.10 pyobjc-core: 9.1.1 pyobjc-framework-Cocoa: 9.1.1 pyparsing: 3.0.9 pyrsistent: 0.19.3 pytest: 7.3.1 python-dateutil: 2.8.2 python-json-logger: 2.0.7 pytz: '2023.3' pyzmq: 25.0.2 q2-alignment: 2023.5.0 q2-composition: 2023.5.0 q2-cutadapt: 2023.5.1 q2-dada2: 2023.5.0 q2-deblur: 2023.5.0 q2-demux: 2023.5.0 q2-diversity: 2023.5.1 q2-diversity-lib: 2023.5.0 q2-emperor: 2023.5.0 q2-feature-classifier: 2023.5.0 q2-feature-table: 2023.5.0 q2-fragment-insertion: 2023.5.0 q2-gneiss: 2023.5.0 q2-longitudinal: 2023.5.0 q2-metadata: 2023.5.0 q2-mystery-stew: 2023.5.0 q2-phylogeny: 2023.5.0 q2-quality-control: 2023.5.0 q2-quality-filter: 2023.5.0 q2-sample-classifier: 2023.5.0 q2-taxa: 2023.5.0 q2-types: 2023.5.0 q2-vsearch: 2023.5.0 q2galaxy: 2023.5.0 q2lint: 0.0.1 q2templates: 2023.5.0 requests: 2.31.0 rfc3339-validator: 0.1.4 rfc3986-validator: 0.1.1 scikit-bio: 0.5.7 scikit-learn: 0.24.1 scipy: 1.8.1 seaborn: 0.12.2 sepp: 4.3.10 setproctitle: 1.3.2 setuptools: 67.7.2 six: 1.16.0 sniffio: 1.3.0 soupsieve: 2.3.2.post1 stack-data: 0.6.2 statsmodels: 0.14.0 tblib: 1.7.0 terminado: 0.17.1 threadpoolctl: 3.1.0 tinycss2: 1.2.1 toml: 0.10.2 tomli: 2.0.1 tomlkit: 0.11.8 toolz: 0.12.0 tornado: 6.3.2 tqdm: 4.65.0 traitlets: 5.9.0 typeguard: 2.13.3 types-cryptography: 3.3.23.2 types-enum34: 1.1.8 types-ipaddress: 1.0.8 types-paramiko: 3.0.0.10 types-requests: 2.31.0.1 types-six: 1.16.21.8 types-urllib3: 1.26.25.13 typing-extensions: 4.6.2 typing-utils: 0.1.0 tzlocal: '2.1' umap-learn: 0.5.3 unicodedata2: 15.0.0 unifrac: 1.0.0 urllib3: 2.0.2 wcwidth: 0.2.6 webencodings: 0.5.1 websocket-client: 1.5.2 wheel: 0.40.0 widgetsnbextension: 4.0.7 
wrapt: 1.15.0 xmltodict: 0.13.0 xopen: 1.7.0 xyzservices: 2023.5.0 yq: 3.2.2 zipp: 3.15.0 zstandard: 0.19.0 q2cli: 2023.5.0.dev0+21.gf9a7a5b qiime2: 2023.5.0+38.g72f1a6c.dirty 8f71b73d-b028-4cbc-9894-738bdfe718bf/citations.bib000066400000000000000000000064401462552636000501560ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifacts@article{framework|qiime2:2023.5.0+69.g6316efe.dirty|0, author = {Bolyen, Evan and Rideout, Jai Ram and Dillon, Matthew R. and Bokulich, Nicholas A. and Abnet, Christian C. and Al-Ghalith, Gabriel A. and Alexander, Harriet and Alm, Eric J. and Arumugam, Manimozhiyan and Asnicar, Francesco and Bai, Yang and Bisanz, Jordan E. and Bittinger, Kyle and Brejnrod, Asker and Brislawn, Colin J. and Brown, C. Titus and Callahan, Benjamin J. and Caraballo-Rodríguez, Andrés Mauricio and Chase, John and Cope, Emily K. and Da Silva, Ricardo and Diener, Christian and Dorrestein, Pieter C. and Douglas, Gavin M. and Durall, Daniel M. and Duvallet, Claire and Edwardson, Christian F. and Ernst, Madeleine and Estaki, Mehrbod and Fouquier, Jennifer and Gauglitz, Julia M. and Gibbons, Sean M. and Gibson, Deanna L. and Gonzalez, Antonio and Gorlick, Kestrel and Guo, Jiarong and Hillmann, Benjamin and Holmes, Susan and Holste, Hannes and Huttenhower, Curtis and Huttley, Gavin A. and Janssen, Stefan and Jarmusch, Alan K. and Jiang, Lingjing and Kaehler, Benjamin D. and Kang, Kyo Bin and Keefe, Christopher R. and Keim, Paul and Kelley, Scott T. and Knights, Dan and Koester, Irina and Kosciolek, Tomasz and Kreps, Jorden and Langille, Morgan G. I. and Lee, Joslynn and Ley, Ruth and Liu, Yong-Xin and Loftfield, Erikka and Lozupone, Catherine and Maher, Massoud and Marotz, Clarisse and Martin, Bryan D. and McDonald, Daniel and McIver, Lauren J. and Melnik, Alexey V. and Metcalf, Jessica L. and Morgan, Sydney C. and Morton, Jamie T. and Naimey, Ahmad Turan and Navas-Molina, Jose A. and Nothias, Louis Felix and Orchanian, Stephanie B. and Pearson, Talima and Peoples, Samuel L. and Petras, Daniel and Preuss, Mary Lai and Pruesse, Elmar and Rasmussen, Lasse Buur and Rivers, Adam and Robeson, Michael S. and Rosenthal, Patrick and Segata, Nicola and Shaffer, Michael and Shiffer, Arron and Sinha, Rashmi and Song, Se Jin and Spear, John R. and Swafford, Austin D. and Thompson, Luke R. and Torres, Pedro J. and Trinh, Pauline and Tripathi, Anupriya and Turnbaugh, Peter J. and Ul-Hasan, Sabah and van der Hooft, Justin J. J. and Vargas, Fernando and Vázquez-Baeza, Yoshiki and Vogtmann, Emily and von Hippel, Max and Walters, William and Wan, Yunhu and Wang, Mingxun and Warren, Jonathan and Weber, Kyle C. and Williamson, Charles H. D. and Willis, Amy D. and Xu, Zhenjiang Zech and Zaneveld, Jesse R. and Zhang, Yilong and Zhu, Qiyun and Knight, Rob and Caporaso, J. 
Gregory}, doi = {10.1038/s41587-019-0209-9}, issn = {1546-1696}, journal = {Nature Biotechnology}, number = {8}, pages = {852-857}, title = {Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2}, url = {https://doi.org/10.1038/s41587-019-0209-9}, volume = {37}, year = {2019} } @article{plugin|dummy-plugin:0.0.0-dev|0, author = {Unger, Donald L}, journal = {Arthritis & Rheumatology}, number = {5}, pages = {949--950}, publisher = {Wiley Online Library}, title = {Does knuckle cracking lead to arthritis of the fingers?}, volume = {41}, year = {1998} } @article{plugin|dummy-plugin:0.0.0-dev|1, author = {Berry, Michael Victor and Geim, Andre Konstantin}, journal = {European Journal of Physics}, number = {4}, pages = {307}, publisher = {IOP Publishing}, title = {Of flying frogs and levitrons}, volume = {18}, year = {1997} } 8f71b73d-b028-4cbc-9894-738bdfe718bf/metadata.yaml000066400000000000000000000001301462552636000501350ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifactsuuid: 8f71b73d-b028-4cbc-9894-738bdfe718bf type: Mapping format: MappingDirectoryFormat be472b56-d205-43ee-8180-474da575c4d5/000077500000000000000000000000001462552636000452335ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifactsbe472b56-d205-43ee-8180-474da575c4d5/VERSION000066400000000000000000000000711462552636000463010ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifactsQIIME 2 archive: 6 framework: 2023.5.0+69.g6316efe.dirty be472b56-d205-43ee-8180-474da575c4d5/action/000077500000000000000000000000001462552636000465105ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifactsbe472b56-d205-43ee-8180-474da575c4d5/action/action.yaml000066400000000000000000000176431462552636000506640ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifactsexecution: uuid: 8dae7a81-83ce-48db-9313-6e3131b0933c runtime: start: 2023-08-25T14:54:59.827951-07:00 end: 2023-08-25T14:54:59.848887-07:00 duration: 20936 microseconds execution_context: type: synchronous action: type: method plugin: !ref 'environment:plugins:dummy-plugin' action: identity_with_metadata inputs: - ints: e6b37bae-3a14-40f7-87b4-52cf5c7c7a1d parameters: - metadata: !metadata '8f71b73d-b028-4cbc-9894-738bdfe718bf:metadata.tsv' output-name: out transformers: inputs: ints: - from: IntSequenceDirectoryFormat to: builtins:list plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0' output: - from: builtins:list to: IntSequenceDirectoryFormat plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceDirectoryFormat|0' - !cite 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0' environment: platform: macosx-10.9-x86_64 python: |- 3.8.16 | packaged by conda-forge | (default, Feb 1 2023, 16:05:36) [Clang 14.0.6 ] framework: version: 2023.5.0+69.g6316efe.dirty website: https://qiime2.org citations: - !cite 
'framework|qiime2:2023.5.0+69.g6316efe.dirty|0' plugins: dummy-plugin: version: 0.0.0-dev website: https://github.com/qiime2/qiime2 citations: - !cite 'plugin|dummy-plugin:0.0.0-dev|0' - !cite 'plugin|dummy-plugin:0.0.0-dev|1' python-packages: CacheControl: 0.12.11 Cython: 0.29.35 DendroPy: 4.5.2 Jinja2: 3.1.2 MarkupSafe: 2.1.2 Pillow: 9.5.0 PyJWT: 2.7.0 PyNaCl: 1.5.0 PySocks: 1.7.1 PyYAML: '6.0' Pygments: 2.15.1 Send2Trash: 1.8.2 altair: 5.0.1 anyio: 3.7.0 appdirs: 1.4.4 appnope: 0.1.3 argcomplete: 3.0.8 argon2-cffi: 21.3.0 argon2-cffi-bindings: 21.2.0 astor: 0.8.1 astroid: 2.15.5 asttokens: 2.2.1 atpublic: 3.0.1 attrs: 23.1.0 backcall: 0.2.0 backports.functools-lru-cache: 1.6.4 bcrypt: 3.2.2 beautifulsoup4: 4.12.2 bibtexparser: 1.4.0 biom-format: 2.1.12 bleach: 6.0.0 bokeh: 3.1.1 cGraph: '0.1' cached-property: 1.5.2 certifi: 2023.5.7 cffi: 1.15.1 charset-normalizer: 3.1.0 click: 8.1.3 code2flow: 2.5.1 colorama: 0.4.6 comm: 0.1.3 contourpy: 1.0.7 cryptography: 40.0.2 cutadapt: '4.4' cycler: 0.11.0 deblur: 1.1.1 debugpy: 1.6.7 decorator: 4.4.2 defusedxml: 0.7.1 dill: 0.3.6 dnaio: 0.10.0 emperor: 1.0.3 entrypoints: '0.4' exceptiongroup: 1.1.1 executing: 1.2.0 fastcluster: 1.2.6 fastjsonschema: 2.17.1 flake8: 6.0.0 flit-core: 3.9.0 flufl.lock: '7.1' fonttools: 4.39.4 formulaic: 0.6.1 future: 0.18.3 globus-sdk: 3.19.0 gneiss: 0.4.6 graphlib-backport: 1.0.3 graphviz: 0.20.1 h5py: 2.10.0 hdmedians: 0.14.2 idna: '3.4' ijson: 3.2.0.post0 importlib-metadata: 4.8.3 importlib-resources: 5.12.0 iniconfig: 2.0.0 interface-meta: 1.3.0 iow: 1.0.5 ipdb: 0.13.13 ipykernel: 6.23.1 ipython: 8.12.2 ipython-genutils: 0.2.0 ipywidgets: 8.0.6 isal: 1.1.0 isort: 5.12.0 jedi: 0.18.2 joblib: 1.2.0 jsonschema: 4.17.3 jupyter-client: 8.2.0 jupyter-core: 5.3.0 jupyter-events: 0.6.3 jupyter-server: 2.6.0 jupyter-server-terminals: 0.4.4 jupyterlab-pygments: 0.2.2 jupyterlab-widgets: 3.0.7 kiwisolver: 1.4.4 lazy-object-proxy: 1.9.0 llvmlite: 0.39.1 lockfile: 0.12.2 lxml: 4.9.2 lz4: 4.3.2 matplotlib: 3.6.0 matplotlib-inline: 0.1.6 mccabe: 0.7.0 mistune: 2.0.5 msgpack: 1.0.5 munkres: 1.1.4 mypy: 1.3.0 mypy-extensions: 1.0.0 natsort: 8.3.1 nbclassic: 1.0.0 nbclient: 0.8.0 nbconvert: 7.4.0 nbformat: 5.8.0 nest-asyncio: 1.5.6 networkx: '3.1' nlopt: 2.7.1 nose: 1.3.7 notebook: 6.5.4 notebook-shim: 0.2.3 numba: 0.56.4 numpy: 1.23.5 overrides: 7.3.1 packaging: '23.1' pandas: 1.5.3 pandocfilters: 1.5.0 paramiko: 3.2.0 parsl: 2023.5.29 parso: 0.8.3 patsy: 0.5.3 pexpect: 4.8.0 pickleshare: 0.7.5 pip: 23.1.2 pipdeptree: 2.9.3 pkgutil-resolve-name: 1.3.10 platformdirs: 3.5.1 pluggy: 1.0.0 prometheus-client: 0.17.0 prompt-toolkit: 3.0.38 provenance-lib: 2023.5.1 psutil: 5.9.5 ptyprocess: 0.7.0 pure-eval: 0.2.2 py2puml: 0.7.2 pyan3: 1.2.0 pycodestyle: 2.10.0 pycparser: '2.21' pyflakes: 3.0.1 pygraphviz: '1.11' pylint: 2.17.4 pynndescent: 0.5.10 pyobjc-core: 9.1.1 pyobjc-framework-Cocoa: 9.1.1 pyparsing: 3.0.9 pyrsistent: 0.19.3 pytest: 7.3.1 python-dateutil: 2.8.2 python-json-logger: 2.0.7 pytz: '2023.3' pyzmq: 25.0.2 q2-alignment: 2023.5.0 q2-composition: 2023.5.0 q2-cutadapt: 2023.5.1 q2-dada2: 2023.5.0 q2-deblur: 2023.5.0 q2-demux: 2023.5.0 q2-diversity: 2023.5.1 q2-diversity-lib: 2023.5.0 q2-emperor: 2023.5.0 q2-feature-classifier: 2023.5.0 q2-feature-table: 2023.5.0 q2-fragment-insertion: 2023.5.0 q2-gneiss: 2023.5.0 q2-longitudinal: 2023.5.0 q2-metadata: 2023.5.0 q2-mystery-stew: 2023.5.0 q2-phylogeny: 2023.5.0 q2-quality-control: 2023.5.0 q2-quality-filter: 2023.5.0 q2-sample-classifier: 2023.5.0 q2-taxa: 2023.5.0 q2-types: 
2023.5.0 q2-vsearch: 2023.5.0 q2galaxy: 2023.5.0 q2lint: 0.0.1 q2templates: 2023.5.0 requests: 2.31.0 rfc3339-validator: 0.1.4 rfc3986-validator: 0.1.1 scikit-bio: 0.5.7 scikit-learn: 0.24.1 scipy: 1.8.1 seaborn: 0.12.2 sepp: 4.3.10 setproctitle: 1.3.2 setuptools: 67.7.2 six: 1.16.0 sniffio: 1.3.0 soupsieve: 2.3.2.post1 stack-data: 0.6.2 statsmodels: 0.14.0 tblib: 1.7.0 terminado: 0.17.1 threadpoolctl: 3.1.0 tinycss2: 1.2.1 toml: 0.10.2 tomli: 2.0.1 tomlkit: 0.11.8 toolz: 0.12.0 tornado: 6.3.2 tqdm: 4.65.0 traitlets: 5.9.0 typeguard: 2.13.3 types-cryptography: 3.3.23.2 types-enum34: 1.1.8 types-ipaddress: 1.0.8 types-paramiko: 3.0.0.10 types-requests: 2.31.0.1 types-six: 1.16.21.8 types-urllib3: 1.26.25.13 typing-extensions: 4.6.2 typing-utils: 0.1.0 tzlocal: '2.1' umap-learn: 0.5.3 unicodedata2: 15.0.0 unifrac: 1.0.0 urllib3: 2.0.2 wcwidth: 0.2.6 webencodings: 0.5.1 websocket-client: 1.5.2 wheel: 0.40.0 widgetsnbextension: 4.0.7 wrapt: 1.15.0 xmltodict: 0.13.0 xopen: 1.7.0 xyzservices: 2023.5.0 yq: 3.2.2 zipp: 3.15.0 zstandard: 0.19.0 q2cli: 2023.5.0.dev0+21.gf9a7a5b qiime2: 2023.5.0+38.g72f1a6c.dirty be472b56-d205-43ee-8180-474da575c4d5/action/metadata.tsv000066400000000000000000000000431462552636000510230ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifactsid a #q2:types categorical 0 42 be472b56-d205-43ee-8180-474da575c4d5/citations.bib000066400000000000000000000101001462552636000476760ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifacts@article{framework|qiime2:2023.5.0+69.g6316efe.dirty|0, author = {Bolyen, Evan and Rideout, Jai Ram and Dillon, Matthew R. and Bokulich, Nicholas A. and Abnet, Christian C. and Al-Ghalith, Gabriel A. and Alexander, Harriet and Alm, Eric J. and Arumugam, Manimozhiyan and Asnicar, Francesco and Bai, Yang and Bisanz, Jordan E. and Bittinger, Kyle and Brejnrod, Asker and Brislawn, Colin J. and Brown, C. Titus and Callahan, Benjamin J. and Caraballo-Rodríguez, Andrés Mauricio and Chase, John and Cope, Emily K. and Da Silva, Ricardo and Diener, Christian and Dorrestein, Pieter C. and Douglas, Gavin M. and Durall, Daniel M. and Duvallet, Claire and Edwardson, Christian F. and Ernst, Madeleine and Estaki, Mehrbod and Fouquier, Jennifer and Gauglitz, Julia M. and Gibbons, Sean M. and Gibson, Deanna L. and Gonzalez, Antonio and Gorlick, Kestrel and Guo, Jiarong and Hillmann, Benjamin and Holmes, Susan and Holste, Hannes and Huttenhower, Curtis and Huttley, Gavin A. and Janssen, Stefan and Jarmusch, Alan K. and Jiang, Lingjing and Kaehler, Benjamin D. and Kang, Kyo Bin and Keefe, Christopher R. and Keim, Paul and Kelley, Scott T. and Knights, Dan and Koester, Irina and Kosciolek, Tomasz and Kreps, Jorden and Langille, Morgan G. I. and Lee, Joslynn and Ley, Ruth and Liu, Yong-Xin and Loftfield, Erikka and Lozupone, Catherine and Maher, Massoud and Marotz, Clarisse and Martin, Bryan D. and McDonald, Daniel and McIver, Lauren J. and Melnik, Alexey V. and Metcalf, Jessica L. and Morgan, Sydney C. and Morton, Jamie T. and Naimey, Ahmad Turan and Navas-Molina, Jose A. and Nothias, Louis Felix and Orchanian, Stephanie B. and Pearson, Talima and Peoples, Samuel L. and Petras, Daniel and Preuss, Mary Lai and Pruesse, Elmar and Rasmussen, Lasse Buur and Rivers, Adam and Robeson, Michael S. 
and Rosenthal, Patrick and Segata, Nicola and Shaffer, Michael and Shiffer, Arron and Sinha, Rashmi and Song, Se Jin and Spear, John R. and Swafford, Austin D. and Thompson, Luke R. and Torres, Pedro J. and Trinh, Pauline and Tripathi, Anupriya and Turnbaugh, Peter J. and Ul-Hasan, Sabah and van der Hooft, Justin J. J. and Vargas, Fernando and Vázquez-Baeza, Yoshiki and Vogtmann, Emily and von Hippel, Max and Walters, William and Wan, Yunhu and Wang, Mingxun and Warren, Jonathan and Weber, Kyle C. and Williamson, Charles H. D. and Willis, Amy D. and Xu, Zhenjiang Zech and Zaneveld, Jesse R. and Zhang, Yilong and Zhu, Qiyun and Knight, Rob and Caporaso, J. Gregory}, doi = {10.1038/s41587-019-0209-9}, issn = {1546-1696}, journal = {Nature Biotechnology}, number = {8}, pages = {852-857}, title = {Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2}, url = {https://doi.org/10.1038/s41587-019-0209-9}, volume = {37}, year = {2019} } @article{plugin|dummy-plugin:0.0.0-dev|0, author = {Unger, Donald L}, journal = {Arthritis & Rheumatology}, number = {5}, pages = {949--950}, publisher = {Wiley Online Library}, title = {Does knuckle cracking lead to arthritis of the fingers?}, volume = {41}, year = {1998} } @article{plugin|dummy-plugin:0.0.0-dev|1, author = {Berry, Michael Victor and Geim, Andre Konstantin}, journal = {European Journal of Physics}, number = {4}, pages = {307}, publisher = {IOP Publishing}, title = {Of flying frogs and levitrons}, volume = {18}, year = {1997} } @article{view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0, author = {Mayer, Hans C and Krechetnikov, Rouslan}, journal = {Physical Review E}, number = {4}, pages = {046117}, publisher = {APS}, title = {Walking with coffee: Why does it spill?}, volume = {85}, year = {2012} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceDirectoryFormat|0, author = {Krauth, Stefanie J and Coulibaly, Jean T and Knopp, Stefanie and Traoré, Mahamadou and N'Goran, Eliézer K and Utzinger, Jürg}, journal = {PLoS neglected tropical diseases}, number = {12}, pages = {e1969}, publisher = {Public Library of Science}, title = {An in-depth analysis of a piece of shit: distribution of Schistosoma mansoni and hookworm eggs in human stool}, volume = {6}, year = {2012} } be472b56-d205-43ee-8180-474da575c4d5/metadata.yaml000066400000000000000000000001411462552636000476730ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifactsuuid: be472b56-d205-43ee-8180-474da575c4d5 type: IntSequence1 format: IntSequenceDirectoryFormat e6b37bae-3a14-40f7-87b4-52cf5c7c7a1d/000077500000000000000000000000001462552636000455255ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifactse6b37bae-3a14-40f7-87b4-52cf5c7c7a1d/VERSION000066400000000000000000000000711462552636000465730ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifactsQIIME 2 archive: 6 framework: 2023.5.0+69.g6316efe.dirty 
e6b37bae-3a14-40f7-87b4-52cf5c7c7a1d/action/000077500000000000000000000000001462552636000470025ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifactse6b37bae-3a14-40f7-87b4-52cf5c7c7a1d/action/action.yaml000066400000000000000000000165411462552636000511520ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifactsexecution: uuid: 950f01bc-6d00-495f-a957-b597b140e359 runtime: start: 2023-08-25T14:53:05.138535-07:00 end: 2023-08-25T14:53:05.373871-07:00 duration: 235336 microseconds action: type: import transformers: output: - from: builtins:list to: IntSequenceDirectoryFormat plugin: !ref 'environment:plugins:dummy-plugin' citations: - !cite 'transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceDirectoryFormat|0' - !cite 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0' environment: platform: macosx-10.9-x86_64 python: |- 3.8.16 | packaged by conda-forge | (default, Feb 1 2023, 16:05:36) [Clang 14.0.6 ] framework: version: 2023.5.0+69.g6316efe.dirty website: https://qiime2.org citations: - !cite 'framework|qiime2:2023.5.0+69.g6316efe.dirty|0' plugins: dummy-plugin: version: 0.0.0-dev website: https://github.com/qiime2/qiime2 citations: - !cite 'plugin|dummy-plugin:0.0.0-dev|0' - !cite 'plugin|dummy-plugin:0.0.0-dev|1' python-packages: CacheControl: 0.12.11 Cython: 0.29.35 DendroPy: 4.5.2 Jinja2: 3.1.2 MarkupSafe: 2.1.2 Pillow: 9.5.0 PyJWT: 2.7.0 PyNaCl: 1.5.0 PySocks: 1.7.1 PyYAML: '6.0' Pygments: 2.15.1 Send2Trash: 1.8.2 altair: 5.0.1 anyio: 3.7.0 appdirs: 1.4.4 appnope: 0.1.3 argcomplete: 3.0.8 argon2-cffi: 21.3.0 argon2-cffi-bindings: 21.2.0 astor: 0.8.1 astroid: 2.15.5 asttokens: 2.2.1 atpublic: 3.0.1 attrs: 23.1.0 backcall: 0.2.0 backports.functools-lru-cache: 1.6.4 bcrypt: 3.2.2 beautifulsoup4: 4.12.2 bibtexparser: 1.4.0 biom-format: 2.1.12 bleach: 6.0.0 bokeh: 3.1.1 cGraph: '0.1' cached-property: 1.5.2 certifi: 2023.5.7 cffi: 1.15.1 charset-normalizer: 3.1.0 click: 8.1.3 code2flow: 2.5.1 colorama: 0.4.6 comm: 0.1.3 contourpy: 1.0.7 cryptography: 40.0.2 cutadapt: '4.4' cycler: 0.11.0 deblur: 1.1.1 debugpy: 1.6.7 decorator: 4.4.2 defusedxml: 0.7.1 dill: 0.3.6 dnaio: 0.10.0 emperor: 1.0.3 entrypoints: '0.4' exceptiongroup: 1.1.1 executing: 1.2.0 fastcluster: 1.2.6 fastjsonschema: 2.17.1 flake8: 6.0.0 flit-core: 3.9.0 flufl.lock: '7.1' fonttools: 4.39.4 formulaic: 0.6.1 future: 0.18.3 globus-sdk: 3.19.0 gneiss: 0.4.6 graphlib-backport: 1.0.3 graphviz: 0.20.1 h5py: 2.10.0 hdmedians: 0.14.2 idna: '3.4' ijson: 3.2.0.post0 importlib-metadata: 4.8.3 importlib-resources: 5.12.0 iniconfig: 2.0.0 interface-meta: 1.3.0 iow: 1.0.5 ipdb: 0.13.13 ipykernel: 6.23.1 ipython: 8.12.2 ipython-genutils: 0.2.0 ipywidgets: 8.0.6 isal: 1.1.0 isort: 5.12.0 jedi: 0.18.2 joblib: 1.2.0 jsonschema: 4.17.3 jupyter-client: 8.2.0 jupyter-core: 5.3.0 jupyter-events: 0.6.3 jupyter-server: 2.6.0 jupyter-server-terminals: 0.4.4 jupyterlab-pygments: 0.2.2 jupyterlab-widgets: 3.0.7 kiwisolver: 1.4.4 lazy-object-proxy: 1.9.0 llvmlite: 0.39.1 lockfile: 0.12.2 lxml: 4.9.2 lz4: 4.3.2 matplotlib: 3.6.0 matplotlib-inline: 0.1.6 mccabe: 0.7.0 mistune: 2.0.5 msgpack: 1.0.5 munkres: 1.1.4 mypy: 1.3.0 mypy-extensions: 1.0.0 natsort: 8.3.1 nbclassic: 1.0.0 nbclient: 0.8.0 nbconvert: 7.4.0 nbformat: 5.8.0 nest-asyncio: 1.5.6 networkx: '3.1' nlopt: 2.7.1 nose: 1.3.7 notebook: 6.5.4 notebook-shim: 
0.2.3 numba: 0.56.4 numpy: 1.23.5 overrides: 7.3.1 packaging: '23.1' pandas: 1.5.3 pandocfilters: 1.5.0 paramiko: 3.2.0 parsl: 2023.5.29 parso: 0.8.3 patsy: 0.5.3 pexpect: 4.8.0 pickleshare: 0.7.5 pip: 23.1.2 pipdeptree: 2.9.3 pkgutil-resolve-name: 1.3.10 platformdirs: 3.5.1 pluggy: 1.0.0 prometheus-client: 0.17.0 prompt-toolkit: 3.0.38 provenance-lib: 2023.5.1 psutil: 5.9.5 ptyprocess: 0.7.0 pure-eval: 0.2.2 py2puml: 0.7.2 pyan3: 1.2.0 pycodestyle: 2.10.0 pycparser: '2.21' pyflakes: 3.0.1 pygraphviz: '1.11' pylint: 2.17.4 pynndescent: 0.5.10 pyobjc-core: 9.1.1 pyobjc-framework-Cocoa: 9.1.1 pyparsing: 3.0.9 pyrsistent: 0.19.3 pytest: 7.3.1 python-dateutil: 2.8.2 python-json-logger: 2.0.7 pytz: '2023.3' pyzmq: 25.0.2 q2-alignment: 2023.5.0 q2-composition: 2023.5.0 q2-cutadapt: 2023.5.1 q2-dada2: 2023.5.0 q2-deblur: 2023.5.0 q2-demux: 2023.5.0 q2-diversity: 2023.5.1 q2-diversity-lib: 2023.5.0 q2-emperor: 2023.5.0 q2-feature-classifier: 2023.5.0 q2-feature-table: 2023.5.0 q2-fragment-insertion: 2023.5.0 q2-gneiss: 2023.5.0 q2-longitudinal: 2023.5.0 q2-metadata: 2023.5.0 q2-mystery-stew: 2023.5.0 q2-phylogeny: 2023.5.0 q2-quality-control: 2023.5.0 q2-quality-filter: 2023.5.0 q2-sample-classifier: 2023.5.0 q2-taxa: 2023.5.0 q2-types: 2023.5.0 q2-vsearch: 2023.5.0 q2galaxy: 2023.5.0 q2lint: 0.0.1 q2templates: 2023.5.0 requests: 2.31.0 rfc3339-validator: 0.1.4 rfc3986-validator: 0.1.1 scikit-bio: 0.5.7 scikit-learn: 0.24.1 scipy: 1.8.1 seaborn: 0.12.2 sepp: 4.3.10 setproctitle: 1.3.2 setuptools: 67.7.2 six: 1.16.0 sniffio: 1.3.0 soupsieve: 2.3.2.post1 stack-data: 0.6.2 statsmodels: 0.14.0 tblib: 1.7.0 terminado: 0.17.1 threadpoolctl: 3.1.0 tinycss2: 1.2.1 toml: 0.10.2 tomli: 2.0.1 tomlkit: 0.11.8 toolz: 0.12.0 tornado: 6.3.2 tqdm: 4.65.0 traitlets: 5.9.0 typeguard: 2.13.3 types-cryptography: 3.3.23.2 types-enum34: 1.1.8 types-ipaddress: 1.0.8 types-paramiko: 3.0.0.10 types-requests: 2.31.0.1 types-six: 1.16.21.8 types-urllib3: 1.26.25.13 typing-extensions: 4.6.2 typing-utils: 0.1.0 tzlocal: '2.1' umap-learn: 0.5.3 unicodedata2: 15.0.0 unifrac: 1.0.0 urllib3: 2.0.2 wcwidth: 0.2.6 webencodings: 0.5.1 websocket-client: 1.5.2 wheel: 0.40.0 widgetsnbextension: 4.0.7 wrapt: 1.15.0 xmltodict: 0.13.0 xopen: 1.7.0 xyzservices: 2023.5.0 yq: 3.2.2 zipp: 3.15.0 zstandard: 0.19.0 q2cli: 2023.5.0.dev0+21.gf9a7a5b qiime2: 2023.5.0+38.g72f1a6c.dirty e6b37bae-3a14-40f7-87b4-52cf5c7c7a1d/citations.bib000066400000000000000000000101001462552636000501700ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifacts@article{framework|qiime2:2023.5.0+69.g6316efe.dirty|0, author = {Bolyen, Evan and Rideout, Jai Ram and Dillon, Matthew R. and Bokulich, Nicholas A. and Abnet, Christian C. and Al-Ghalith, Gabriel A. and Alexander, Harriet and Alm, Eric J. and Arumugam, Manimozhiyan and Asnicar, Francesco and Bai, Yang and Bisanz, Jordan E. and Bittinger, Kyle and Brejnrod, Asker and Brislawn, Colin J. and Brown, C. Titus and Callahan, Benjamin J. and Caraballo-Rodríguez, Andrés Mauricio and Chase, John and Cope, Emily K. and Da Silva, Ricardo and Diener, Christian and Dorrestein, Pieter C. and Douglas, Gavin M. and Durall, Daniel M. and Duvallet, Claire and Edwardson, Christian F. and Ernst, Madeleine and Estaki, Mehrbod and Fouquier, Jennifer and Gauglitz, Julia M. and Gibbons, Sean M. and Gibson, Deanna L. 
and Gonzalez, Antonio and Gorlick, Kestrel and Guo, Jiarong and Hillmann, Benjamin and Holmes, Susan and Holste, Hannes and Huttenhower, Curtis and Huttley, Gavin A. and Janssen, Stefan and Jarmusch, Alan K. and Jiang, Lingjing and Kaehler, Benjamin D. and Kang, Kyo Bin and Keefe, Christopher R. and Keim, Paul and Kelley, Scott T. and Knights, Dan and Koester, Irina and Kosciolek, Tomasz and Kreps, Jorden and Langille, Morgan G. I. and Lee, Joslynn and Ley, Ruth and Liu, Yong-Xin and Loftfield, Erikka and Lozupone, Catherine and Maher, Massoud and Marotz, Clarisse and Martin, Bryan D. and McDonald, Daniel and McIver, Lauren J. and Melnik, Alexey V. and Metcalf, Jessica L. and Morgan, Sydney C. and Morton, Jamie T. and Naimey, Ahmad Turan and Navas-Molina, Jose A. and Nothias, Louis Felix and Orchanian, Stephanie B. and Pearson, Talima and Peoples, Samuel L. and Petras, Daniel and Preuss, Mary Lai and Pruesse, Elmar and Rasmussen, Lasse Buur and Rivers, Adam and Robeson, Michael S. and Rosenthal, Patrick and Segata, Nicola and Shaffer, Michael and Shiffer, Arron and Sinha, Rashmi and Song, Se Jin and Spear, John R. and Swafford, Austin D. and Thompson, Luke R. and Torres, Pedro J. and Trinh, Pauline and Tripathi, Anupriya and Turnbaugh, Peter J. and Ul-Hasan, Sabah and van der Hooft, Justin J. J. and Vargas, Fernando and Vázquez-Baeza, Yoshiki and Vogtmann, Emily and von Hippel, Max and Walters, William and Wan, Yunhu and Wang, Mingxun and Warren, Jonathan and Weber, Kyle C. and Williamson, Charles H. D. and Willis, Amy D. and Xu, Zhenjiang Zech and Zaneveld, Jesse R. and Zhang, Yilong and Zhu, Qiyun and Knight, Rob and Caporaso, J. Gregory}, doi = {10.1038/s41587-019-0209-9}, issn = {1546-1696}, journal = {Nature Biotechnology}, number = {8}, pages = {852-857}, title = {Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2}, url = {https://doi.org/10.1038/s41587-019-0209-9}, volume = {37}, year = {2019} } @article{plugin|dummy-plugin:0.0.0-dev|0, author = {Unger, Donald L}, journal = {Arthritis & Rheumatology}, number = {5}, pages = {949--950}, publisher = {Wiley Online Library}, title = {Does knuckle cracking lead to arthritis of the fingers?}, volume = {41}, year = {1998} } @article{plugin|dummy-plugin:0.0.0-dev|1, author = {Berry, Michael Victor and Geim, Andre Konstantin}, journal = {European Journal of Physics}, number = {4}, pages = {307}, publisher = {IOP Publishing}, title = {Of flying frogs and levitrons}, volume = {18}, year = {1997} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceDirectoryFormat|0, author = {Krauth, Stefanie J and Coulibaly, Jean T and Knopp, Stefanie and Traoré, Mahamadou and N'Goran, Eliézer K and Utzinger, Jürg}, journal = {PLoS neglected tropical diseases}, number = {12}, pages = {e1969}, publisher = {Public Library of Science}, title = {An in-depth analysis of a piece of shit: distribution of Schistosoma mansoni and hookworm eggs in human stool}, volume = {6}, year = {2012} } @article{view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0, author = {Mayer, Hans C and Krechetnikov, Rouslan}, journal = {Physical Review E}, number = {4}, pages = {046117}, publisher = {APS}, title = {Walking with coffee: Why does it spill?}, volume = {85}, year = {2012} } 
e6b37bae-3a14-40f7-87b4-52cf5c7c7a1d/metadata.yaml000066400000000000000000000001411462552636000501650ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance/artifactsuuid: e6b37bae-3a14-40f7-87b4-52cf5c7c7a1d type: IntSequence1 format: IntSequenceDirectoryFormat citations.bib000066400000000000000000000106231462552636000412220ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenance@article{framework|qiime2:2023.5.0+69.g6316efe.dirty|0, author = {Bolyen, Evan and Rideout, Jai Ram and Dillon, Matthew R. and Bokulich, Nicholas A. and Abnet, Christian C. and Al-Ghalith, Gabriel A. and Alexander, Harriet and Alm, Eric J. and Arumugam, Manimozhiyan and Asnicar, Francesco and Bai, Yang and Bisanz, Jordan E. and Bittinger, Kyle and Brejnrod, Asker and Brislawn, Colin J. and Brown, C. Titus and Callahan, Benjamin J. and Caraballo-Rodríguez, Andrés Mauricio and Chase, John and Cope, Emily K. and Da Silva, Ricardo and Diener, Christian and Dorrestein, Pieter C. and Douglas, Gavin M. and Durall, Daniel M. and Duvallet, Claire and Edwardson, Christian F. and Ernst, Madeleine and Estaki, Mehrbod and Fouquier, Jennifer and Gauglitz, Julia M. and Gibbons, Sean M. and Gibson, Deanna L. and Gonzalez, Antonio and Gorlick, Kestrel and Guo, Jiarong and Hillmann, Benjamin and Holmes, Susan and Holste, Hannes and Huttenhower, Curtis and Huttley, Gavin A. and Janssen, Stefan and Jarmusch, Alan K. and Jiang, Lingjing and Kaehler, Benjamin D. and Kang, Kyo Bin and Keefe, Christopher R. and Keim, Paul and Kelley, Scott T. and Knights, Dan and Koester, Irina and Kosciolek, Tomasz and Kreps, Jorden and Langille, Morgan G. I. and Lee, Joslynn and Ley, Ruth and Liu, Yong-Xin and Loftfield, Erikka and Lozupone, Catherine and Maher, Massoud and Marotz, Clarisse and Martin, Bryan D. and McDonald, Daniel and McIver, Lauren J. and Melnik, Alexey V. and Metcalf, Jessica L. and Morgan, Sydney C. and Morton, Jamie T. and Naimey, Ahmad Turan and Navas-Molina, Jose A. and Nothias, Louis Felix and Orchanian, Stephanie B. and Pearson, Talima and Peoples, Samuel L. and Petras, Daniel and Preuss, Mary Lai and Pruesse, Elmar and Rasmussen, Lasse Buur and Rivers, Adam and Robeson, Michael S. and Rosenthal, Patrick and Segata, Nicola and Shaffer, Michael and Shiffer, Arron and Sinha, Rashmi and Song, Se Jin and Spear, John R. and Swafford, Austin D. and Thompson, Luke R. and Torres, Pedro J. and Trinh, Pauline and Tripathi, Anupriya and Turnbaugh, Peter J. and Ul-Hasan, Sabah and van der Hooft, Justin J. J. and Vargas, Fernando and Vázquez-Baeza, Yoshiki and Vogtmann, Emily and von Hippel, Max and Walters, William and Wan, Yunhu and Wang, Mingxun and Warren, Jonathan and Weber, Kyle C. and Williamson, Charles H. D. and Willis, Amy D. and Xu, Zhenjiang Zech and Zaneveld, Jesse R. and Zhang, Yilong and Zhu, Qiyun and Knight, Rob and Caporaso, J. 
Gregory}, doi = {10.1038/s41587-019-0209-9}, issn = {1546-1696}, journal = {Nature Biotechnology}, number = {8}, pages = {852-857}, title = {Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2}, url = {https://doi.org/10.1038/s41587-019-0209-9}, volume = {37}, year = {2019} } @article{action|dummy-plugin:0.0.0-dev|method:concatenate_ints|0, author = {Baerheim, Anders and Sandvik, Hogne}, journal = {BMJ}, number = {6970}, pages = {1689}, publisher = {British Medical Journal Publishing Group}, title = {Effect of ale, garlic, and soured cream on the appetite of leeches}, volume = {309}, year = {1994} } @article{plugin|dummy-plugin:0.0.0-dev|0, author = {Unger, Donald L}, journal = {Arthritis & Rheumatology}, number = {5}, pages = {949--950}, publisher = {Wiley Online Library}, title = {Does knuckle cracking lead to arthritis of the fingers?}, volume = {41}, year = {1998} } @article{plugin|dummy-plugin:0.0.0-dev|1, author = {Berry, Michael Victor and Geim, Andre Konstantin}, journal = {European Journal of Physics}, number = {4}, pages = {307}, publisher = {IOP Publishing}, title = {Of flying frogs and levitrons}, volume = {18}, year = {1997} } @article{view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0, author = {Mayer, Hans C and Krechetnikov, Rouslan}, journal = {Physical Review E}, number = {4}, pages = {046117}, publisher = {APS}, title = {Walking with coffee: Why does it spill?}, volume = {85}, year = {2012} } @article{transformer|dummy-plugin:0.0.0-dev|builtins:list->IntSequenceDirectoryFormat|0, author = {Krauth, Stefanie J and Coulibaly, Jean T and Knopp, Stefanie and Traoré, Mahamadou and N'Goran, Eliézer K and Utzinger, Jürg}, journal = {PLoS neglected tropical diseases}, number = {12}, pages = {e1969}, publisher = {Public Library of Science}, title = {An in-depth analysis of a piece of shit: distribution of Schistosoma mansoni and hookworm eggs in human stool}, volume = {6}, year = {2012} } metadata.yaml000066400000000000000000000001411462552636000412050ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/concated-ints-with-md/220b8d2a-a951-4930-9591-f76d3071db3d/provenanceuuid: 220b8d2a-a951-4930-9591-f76d3071db3d type: IntSequence1 format: IntSequenceDirectoryFormat qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/dupes.bib000066400000000000000000000110011462552636000255320ustar00rootroot00000000000000@article{framework|qiime2:2019.10.0|0, author = {Bolyen, Evan and Rideout, Jai Ram and Dillon, Matthew R. and Bokulich, Nicholas A.}, doi = {10.1038/s41587-019-0209-9}, issn = {1546-1696}, journal = {Nature Biotechnology}, number = {8}, pages = {852-857}, title = {Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2}, url = {https://doi.org/10.1038/s41587-019-0209-9}, volume = {37}, year = {2019} } @article{framework|qiime2:2019.4.0|0, author = {Bolyen, Evan and Rideout, Jai Ram and Dillon, Matthew R and Bokulich, Nicholas A}, doi = {10.7287/peerj.preprints.27295v1}, issn = {2167-9843}, journal = {PeerJ Preprints}, month = {oct}, pages = {e27295v1}, title = {QIIME 2: Reproducible, interactive, scalable, and extensible microbiome data science}, url = {https://doi.org/10.7287/peerj.preprints.27295v1}, volume = {6}, year = {2018} } @article{framework|qiime2:2019.7.0|0, author = {Bolyen, Evan and Rideout, Jai Ram and Dillon, Matthew R. 
and Bokulich, Nicholas A}, doi = {10.1038/s41587-019-0209-9}, issn = {1546-1696}, journal = {Nature Biotechnology}, title = {Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2}, url = {https://doi.org/10.1038/s41587-019-0209-9}, year = {2019} } @article{view|types:2019.10.0|BIOMV210Format|0, author = {McDonald, Daniel and Clemente, Jose C and Kuczynski, Justin and Rideout, Jai Ram and Stombaugh, Jesse and Wendel, Doug and Wilke, Andreas and Huse, Susan and Hufnagle, John and Meyer, Folker and Knight, Rob and Caporaso, J Gregory}, doi = {10.1186/2047-217X-1-7}, journal = {GigaScience}, number = {1}, pages = {7}, publisher = {BioMed Central}, title = {The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome}, volume = {1}, year = {2012} } @article{view|types:2019.10.0|BIOMV210DirFmt|0, author = {McDonald, Daniel and Clemente, Jose C and Kuczynski, Justin and Rideout, Jai Ram and Stombaugh, Jesse and Wendel, Doug and Wilke, Andreas and Huse, Susan and Hufnagle, John and Meyer, Folker and Knight, Rob and Caporaso, J Gregory}, doi = {10.1186/2047-217X-1-7}, journal = {GigaScience}, number = {1}, pages = {7}, publisher = {BioMed Central}, title = {The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome}, volume = {1}, year = {2012} } @article{view|types:2019.10.0|biom.table:Table|0, author = {McDonald, Daniel and Clemente, Jose C and Kuczynski, Justin and Rideout, Jai Ram and Stombaugh, Jesse and Wendel, Doug and Wilke, Andreas and Huse, Susan and Hufnagle, John and Meyer, Folker and Knight, Rob and Caporaso, J Gregory}, doi = {10.1186/2047-217X-1-7}, journal = {GigaScience}, number = {1}, pages = {7}, publisher = {BioMed Central}, title = {The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome}, volume = {1}, year = {2012} } @inproceedings{view|types:2019.10.0|pandas.core.series:Series|0, author = { Wes McKinney }, booktitle = { Proceedings of the 9th Python in Science Conference }, editor = { Stéfan van der Walt and Jarrod Millman }, pages = { 51 -- 56 }, title = { Data Structures for Statistical Computing in Python }, year = { 2010 } } @inproceedings{view|types:2020.2.0|pandas.core.frame:DataFrame|0, author = { Wes McKinney }, booktitle = { Proceedings of the 9th Python in Science Conference }, editor = { Stéfan van der Walt and Jarrod Millman }, pages = { 51 -- 56 }, title = { Data Structures for Statistical Computing in Python }, year = { 2010 } } @inproceedings{view|types:2020.2.0|pandas.core.series:Series|0, author = { Wes McKinney }, booktitle = { Proceedings of the 9th Python in Science Conference }, editor = { Stéfan van der Walt and Jarrod Millman }, pages = { 51 -- 56 }, title = { Data Structures for Statistical Computing in Python }, year = { 2010 } } @inproceedings{view|types:2021.2.0|pandas.core.frame:DataFrame|0, author = { Wes McKinney }, booktitle = { Proceedings of the 9th Python in Science Conference }, editor = { Stéfan van der Walt and Jarrod Millman }, pages = { 51 -- 56 }, title = { Data Structures for Statistical Computing in Python }, year = { 2010 } } @inproceedings{view|types:2021.2.0|pandas.core.series:Series|0, author = { Wes McKinney }, booktitle = { Proceedings of the 9th Python in Science Conference }, editor = { Stéfan van der Walt and Jarrod Millman }, pages = { 51 -- 56 }, title = { Data Structures for Statistical Computing in Python }, year = { 2010 } } 
qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/table-v0/000077500000000000000000000000001462552636000253555ustar00rootroot0000000000000089af91c0-033d-4e30-8ac4-f29a3b407dc1/000077500000000000000000000000001462552636000321265ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/table-v0VERSION000066400000000000000000000000441462552636000331740ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/table-v0/89af91c0-033d-4e30-8ac4-f29a3b407dc1QIIME 2 archive: 0 framework: 2.0.5 data/000077500000000000000000000000001462552636000330375ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/table-v0/89af91c0-033d-4e30-8ac4-f29a3b407dc1feature-table.biom000066400000000000000000002502451462552636000364370ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/table-v0/89af91c0-033d-4e30-8ac4-f29a3b407dc1/dataHDF  P`  TREEHEAPX observationsample8 @id ` @type` H format-url` P format-version@ H generated-by` Hcreation-date` H shape@" 0 nnz@`TREE @ GCOL No Table IDhttp://biom-format.orgqiime2 2018.11.02018-11-08T17:53:50.106154 459244e45f29ce28991de0114a16f9fa bb2704c9052362b112da42bf843e7d9b 7595e123b71bdae8a8c1c28b7405a5c0 cbc2f795edfaebaf35d10b85062b426d b1e686a5c5f127a8b92de2863503a9e5 a03aa6a23dc8f296567452ec2242a5fd d32d15407d86a2044eead0d72cd3f9e7 d30b3fb9c1dc7d9f0d4bd6ce8e975238 1e00822422afb942c9955cd450385a8b 12ed71ced0f4e45c0aa5b37144e6f84b 684b032107a3a32f6664dea4ef723a42 3d6f762d8cda1915ad93d014e391c118 87a04eba5a3dc45f9e899d8ab4c050b4 4738a05820f1a7840410c66476953498 0effb4592c11dcdd619141df770563e1 e2cbf79a06f1221cf49d1bef8e5db9cd 3d6555a4886c6e4a89104136dd5ca39e 74106ca246bcf2da5c4f95b82aa6867a 955280212814a2050dd2f640e10fc25d e80d9134b601511a2702b54268b57249 3111520c8424a2a5bdd46558ae7650e8 867e71db0537674539934bf97db51bc0 c961cc0ecd0ac693e2fb7dcf7794c3e5 e6df6369108817c93a749fa640528e92 f351326f9505ffc18da207391818c48e d1c7ee5172f16dda522e46aaa4e5fa79 9548347b69043c274ff7a1ce599a69ef! 02878fe3ccc81d4c884ca5574178d6a0" 9f4cdae8a020a8d9c4f54e5abf5a5769# d9095748835ade1b8914c5f57b6acbcf$ a826883ed572461c97a23561f7ecea3c% 1ca133f4d30ab344f11e6624e841d9cd& 6ffdeb862aed8aedda8a8624fa960c57' 8bcc747f27b10bb3e1182fe530c7ed24( faf2bd479f04891ef2afba2f407749ac) 2ab2f3551b7aa27fd483fdcbaa1a0afb* 6f384c926424d43c215fbe9337d55324+ 831514b8b3617bed49e2a4bd9d3d6268, 53dd9f9a89854c8ebdcfb105b7a26df1- f4eefbd4edf2a372131825289fffcca6. 752621a2dcf600f86357e9c2fa1fc985/ 6a9b2745a31a3e5ae16682a323c28a4d0 c81e310dae79b8767597762a093082c81 9df911e407318b496425698999a1d81f2 5ed23cabf504db9af85bce57e9008ea23 7cd67e7e02db4db794fd67ab72755f8b4 a3dc97e11838f1bf157aa875f2f6367d5 aee707afe3847427a6d3d34e314379236 31ed03e41d53de58a6348a242165c05b7 458462fa1e15c439f12deec2f19af9338 b901052d2411ea898be450af4a8187449 8c5f2b5fce1b357f54ed4c8c047b6845: 33176e81f3e7ef2b9de1b783ff1ca33c; f66effc279917d65747b7c3ef8d62432< 160029903cd9c779390f4719179aeea2= 0741ae033054073fece98b9948ec4387> 7fa8f515ee1ac5cb8b529b13e6a89790? 
ff8c7f94f941a0c647120d4e142db316@ 51588a76b68ff3041e8b272256df5ae2A a54095a893fd5abc045d162bbe20644aB d985172bebaa0bbe61ff5f682d372c6eC fbf110ef5de76f86121c8504a0cb786eD f8fe4ee0ae3da9a1432872f5545a0d42E 9b959f65abdb6cb9adc1d6160a5384ecF cee164362eb3296a3d29e33f7b479d2eG 816e266d71b9a9af9d3d74893dee443eH 480b52afa972e6deb75357ec39db1998I 35b6dde81c92eb7f66ca9f741da3514eJ ae4c61e598c093caa5b66b6919f02ba3K 0a50155a2def454344e1f2305eff7092L e54bcf1c8ec24811e7c355b8d82fa59eM 5b220f66423e4004eb3520bfae1f25d0N 959a178855242ad3833af85d38ccff45O e6dfd15b38618c3c7ff0a025083f56d4P 3253177a44672689e4e154e842d63e35Q b815913c61abb4a540babe5c3fae17e7R 45c36b8b2965484407b8b213419d4c6cS e9f0399c43aff643ceb1480929cf89b2T 2b427cf77172293239bc1c1cdb32ace4U a841dd7e2f38b26bb7bb37179b4141d5V 73291cac0e802b6a1fb25ae7079390efWL5S104HEAPX8metadatagroup-metadatamatrixids SNOD`bTREEHEAPX`PSNOD( 0"a(  " @TREEHEAPX@ P "TREE,HEAPX #dataindicesindptr8 ?@4 4 deflateh$y[PTREE-?y6ySNODX#>Ox^u9U_p FYB1` 0`{q<̡EZ#FB:쐐`  $ފM~uO{uWR;M/S ǿ ,g{ |k;\ <X n ,nɏ?>iW%WA3)qG?݈xL/>%o[#oߩws?8w[- 98)Ln֯10&N_̿zߕ!kyV@C=>ĥ hOuL@[a fo207sHaNOd*3:|]} ߅y`ߔeEnNv g>:_UߙOycWz}x)'?/lkYɧz9gow>q>VTM'&ܟs7f=۹վYqέgu_8~y޸q :̜3ǦǬ:pݬc+ǝS#6Euу'oa߬Cg^ CM=Q?>9sd>, g`OL*G<93~UBS5oXpMmy.T[֏vQ{mմ۾܂N MyAΑ<Î̷ϛ힁5X:e>^9'ϴKwhx7smWSSԍ1R/s7>g?_7<P?cR&N}֭_o:k`9.%Q'];pT72֭VgG|lqN_'uHܦ.898G#u4uq6/o߁vF~~?8vϜӌg}̛vfNW[_֛q2Oz:޸owKi┼M\{AW[;GOhG/h\S>xƥ2~ qs;9Ⱦ?أ~N鿺S|μy3^Mn~V/Y/yg{ٯt/Wyν_~rNî8GwsjW)/k%~k/8/[o\v'3|ASc]ON7n=ZfJ~o|gZO̹6>ځ>dgs{LE5p}#>.X} ~VhK=<|P_|^ݙ8{ް~ZAwsN1NbCQOw :`~OW{o5x^ΔEYt 1BQqPGy/4sN8JٜtUOUvKqϼ=|p^|VݚB~] ~=A߿xO~;v`j]NyaB`sf7A~~{m1><~cq=CO~gϯ_X'OgLی=<㍿0/;(.| ~ ƿGܞd#3Os^t3f?~Zb^ikzӣ{s~;u(3 .uNF]qұo}>ߩ{)/T^=uV&/tk!ȟ|=__u=+}vq䃟OyY)C_:|?MS>C=t_{?^ԃuփ{t_ ?z}?9oa]Gq}3m?h]گ_@|OtF:/Sƣg\zW4.u/v}]~ۺ^ ~,WG/A|AcߛggBu]xܷs{W}s:eyxF?oCsG}mbOil]OvFaoZ||KE GʃF®Ww%]tw4{m߸nlu{[N_i}uA~]><4?j~ݟa[}A8mv/sX<mU^up|" v ぽcQi=4wvr]_;oð7oa  deflate?[XTREE5Gx^F G{WqRϞFBnP0$v޻K?K?K?KK۳pK?W{xW{KK.w'_^wϗr鯗v.}w.}l_[a?[>/{_Grłb='_ǣ>|918=!GX\~q[b۾'[y/`{rpc<.?WOB6X[{S^a3'67u 0`Ŝcc7{ţ?[爧ϗ8q=jcc=ҞSk5bNg  c9Թ)vq!Ⱏ~!ϾgIͻ5߭ {qs{̧S:Vnicm; 9okūgNϮk mskhOK} Nu< r>ͱ>P;!koθz Y\7'f@΃u}-mmZhk֐{Uڰlq,WGNyYvq55==*}'Gψsfcw=Zw~{άU2l{Zk.oCyZl9ZG{/:{=ٰXY[߈U=Ӝqnɴ~óv{̽V@ܕy=՘3d߹~GbIsc}'^Oj=O| b=+/?{=&8oc:Ӌx ls?\o}1~cp _b\Okr7"w=wVNWYĞ4&xޚA03b6utΈS{^/r}}C&:]o㳩>|RؒG1f=5<+1Ny}X;X> y~Z3cb9qvqu^|o=/D{~^Zo|8)%69ST-ܢhM9yj o*9Pıky.wiy<)y:q.9#bz/.Ox^ -d!8@b&5<K8@6kf{rŃ?>pFo>EY =M`쟏{8v營%x^]0 2DQDn O ܗ]]-y,kXDH}b\|YܸN7p.&7p ~7aߖgZКuk_&%  deflateP[XTREEp"Yx^ rrDt ]k%W(*Z"G$a$9œs\C+Jܫfb枻iLZs|APCcЋhF3xS\P(©MiMz1aL$9ē H% 29I6yܢAʃT)jQ4="qLd:d#ߒF&")׹ (]^*ըA}Ҋ6Dҁc FbC , V$v$QUnJQxQg)XN"_5% l&H!A:`7{9AQqS!\Α.R%.s n BŹR,`2d11%qgdbT1d3c> Y") |,a) /|*VoXKX6l!oIa+l4~$"'3c?8!s,r$8/; e9849de9c0f7be74e13c457f05177aad? 
72cbf125e17c4e14d748543564d3d4ec@ b59d5976acdb148c29f724d6b7990410A 7b9b46a7c3fb8b76b1c68911a9b56da1B 4cecaa1cfabe89a905940ae23dd94a64C bf6066f701b31c6bba541503797da013D ca1b627f011b52798bfb8fd2c6b76d20E 029cc71dc93341de90188b686798aa0dF c9ffd432429a7600931358e3a09860f5G 1a2d9f4af510ce624fa0337c3ea9fedcH 64222ae4c3f0f1cc06375a3814da355dI 6967d97d0d19d82633e2d05305850f80J 55f85b9f95aa14453f466b2ee388737bK bae6b03d63955f134b6e523285263226L 8c12a12aedcd680ab7e259bfd627ff9aM 01173073677a287321be3484fbed0007N 33e2cadd9d0b2b4ebeb6261766032e4aO ea646f13713c1f3393a8b66afa03ff4aP 2a010fd0ac0d1761c2e64c9cd3fd97c4Q b8413d82d36f772b9a375231cb504579R 7a9e7bd6bd3444b67f37c95db2960d50S 9abbfb135a6fceeb2ae107b7be7cb133T e7e23553bcb3b809fe953ff90c99da54U b18b389447d758e7cb173b4b4f2ad960V 9815d07f1b564c9be5aab03db01608daW 217845ff2f7195cbc781eba1f7e7eaa1X d37b36cd780fdec9627c298370571ed5Y 9c773cfe6cd90b39267153cd9eac020eZ 5202077471c942e678df50a8e4c706d6[ 648532aadbacef2c7f8f8468a6146fb2\ 04fd81a94c775a5906f2d92cf0548e4d] f1d63da515a0c96eb9f7af5de75655bb^ 167baae5fe04a754df7eb073c07adaa7_ e1e1ece9cc43057ec074912f7545a432` ab402c6b9fa511b0aa2f968cddc364eba 43e4b46e2ba2455e3974874722cbd75ab 76d0a9cc7116aeaa8d26bc09e768cbc4c f7a8c8f87af98f9c5f479beaadc0b892d ef4cafaa5121e47ff7f31d5553a66214e d8adf2a20249cefe6c627f9c17abb202f dd9d9aad1de9345686f5cc614c5616dbg 0524cb6dcc8f60dafc09c46911c9d0ach 63e29d89c820b4baafc3f5622946d6eei 0316af109bb877b5c7a85ddecd6dfd1ej f306a3c555687b1dc2c61372504fbbeak 58946a0b64aa351fd8e7b97d02d6639dl bb823ca6d961133b2cd91e9377cefba7m 8935afd2b1d59a37191411131d75e9f6n 124e839582c0a3329101182cb28619a8o 51b573b10d63a2cbe455deeef5bae002p 6b4a0a64da52980e7a1b1a770978bb88q 952fabf17975467e69af4ebb5efb0fb6r 5cf1581a0381cd73e278302acd8751b1s 3e7f7d8ab3a5fb0e3401a0fda19d8bcbt e83fa923d1e185d9c9de13ca3b37c8f2u a1759b9ca6b3363edacc599498d62b56v 71e9ece7818452114b1e87f8145aefd5w e581d288eadb8d4527694b6fe7aa5df4x 2c9e19a0423a00f8c17d8086a8ab39a5y 4851a0c86e3c1b3f3c67e88d8a38960az 171930820a0db82224a948125002bf46{ f78c2a4fb584f61fc0fbd75c2cc22a99| 8406abe6d9a72018bf32d189d1340472} 5e76bbf2de9c36c591f8aab3784ffd1b~ 50feebeace77894a50a224bf001c3502 a9e0ac523112f40942da575caf1f386e f9686d94e355b66c5324677db52b3aa4 58c63b4710a70a640cb86afa4d851178 9d68bd5a8479880505cf88ab4695b5c5 5c4ad81078908a3b228188a847d12f9e dbd1f3c3e0f401f60d44103595acdb69 b89da5ee2f75672a2656789aea91071d 48a6cea7eaf29dcb52dc2985933f4249 4e34087bb8399deb145896c92e1b58a5 c23ddff0e5a041733d633d7bfb93e0cc 3d3566af17e7e52ae5413956d004e657 81026563293d6ed29eec4d7f73eab34d d45006a487d94665893aa9e759be4c97 5546d0fe4a512f96a6d73509a1ec6e55 40f13b077d41d6837ca6fae65763cf01 26f1b1ff96f049f0e354df4c3d85efb0 25087bd679769914988fc1e7c6d045ff 197807ce20d797d34d428f6de66f7999 6fda1de9604a1be1b99a41028226f4a7 76d2032108822bcfaa90a4873f7a51aa 0bf5530da55bc16c52034db82e5434af 5beb76b0d6631d10fc820c2d128d4cb6 14732e0b78dbc03c20c505d7681b6820 cdf8a49dcdb0256a8ead65f65c752e40 f6bb195f075d463c5e9b620a6a76dbfd d1dbb266dc8841470f3c42fe8f465507 80b9cf8b7fde7d45e0bc84a9cf9ddd56 3b91757d40499fecfd75d1ad215fdd2e 0b0a240a7ae19f3560d04427d753b603 e0a941a7b62e87ae6301bce7185e5f4a 96802b28a3942cb6b7b40f4b9bfc0d99 d4c5acaf755ef50d18008aa6c79f60b0 a0daabc3e11deb9b43aafbf3c8e31e35 e9d3af4175420ffab49c29d1a6bb2030 eb19aca9a5979db01ba9947baa292633 8be3d398e9e6b768ed1aa713f2981204 42f7a4caf54667a21b5de3b888213935 47be834b0f9232d552e00a36fb3c626b 03178a409a41853ea09a94055e138e19 ba3ff6e27b67596844f11db93e44c496 af12fc27ee17864717830cafd06453cd 
1076b2b7de5233cc917c9425aa2ea200 a80183c045cf890899121db3bbcbd39d 6bf95d9914dc404e98ebff522496533d 4042bf819b2707dcb567a04950667ec7 d8469ca44fd367ae2aab986a57833586 95b94ac4c1c3e18210f501268e56e50e 5656d8b980bfee07e29e8fc119850901 d002269a53cf1579adca5f0dadd4f332 498128a80a796620b238d8ad5cd1d2f9 464ff59b632870b5028ed3092b9443ff 15c03c51375c7d03bd3dda24674b9936 a3ebb6445ca32f39df9f84fe48525773 47e6d3cd169787ff35f9a2278b8d4a2f e3f051168532c89bf76e329c715c35f5 47fb251b4679084fee2dafef11c6bc4f 9dd67b89e90b1517d6fc340bb1745e15 75a61e0ca465f169e5d2ea2d19f8aefb b9e3d6fcb82de619e9e574be03ecaa41 93982d4e0f9eb9a6179d6abd57560c26 8cfda573467e6a959664801713cadba1 b61a4c5ec96e7276004c81cc845e2fec 0335b1664150f1e151340b1450eae898 90b16eca6a6edc00f49dfb5f72be8b61 7d893311a14a858907d4c8ca21d32dc4 2d832109f6396e0f93297efd12e1321e 47fe572ed6a6b1d5f4a60a3884ec72e1 77f79d776f3845bc2901cf9c4229de0b 69e4868406c82bdfd67d2b449ca9f1d8 e7709791d7ef6b47747de9fb43880d14 f908d554fc6eeb9fef7a7028fa791475 e3d37109be4a17d8437f6fcb6072982d 09fc3d8a98c2ca72c69b63a51b5b564e 3a31937521e389cbdf89ca01039e5ec3 9f816758294dec368624eabd3595ec6f 2a71af2786dfb7aeb7ac7d5372c41686 59758d94cc7ed9a16526bfc9eb207575 524f22cecc9957e9bf629b5fb0047adc 190dfe815683ee977e120d35b3c83b7e 1d7e45fd568a577b5a6a8b4a20743211 e2f6c7b758ac2c0476a7e31787dbba84 df98cba64f84c4e8b7f7ea8396f4f3f9 88191bcb077f725a8c8e37d74e110baa 3b0e180ea70b1a946a6214eacb539c97 95dff3e66b8bdf4d8e509ceae4373c69 84e6f7df01548135d92febfeae9a1ea6 dd530a50ee0277096b0cbd210cdb80b2 f8fee8796e3188a996dfeca9e93f47e1 033511c7ff4fe93866075cfb9129aa3b 216d5bb567182b6210e2c3dbc455f2b6 a9541756b547061c02922f1a3a913fce 059408ca99f1001f64e6df1e06fb951d 111311cc108c483f3782c9214681ed19 427e48835f176c1a367a14b39b882c58 73bca1b91fb8a80ee698f1833c073664 fc8f87209db793f740eea731e64f0b2a b387961d274808e3ffed8c1594f4cd1c e1b32002ee69c4e3210a2a257511f96a 6e342714744bf1b213ac0767d6c3998d 14374c3f2fe75bb158b439ebe4a4ee87 10ae3ca534ce9bb9f80822b4b4a6375c 30d793622ebd90e7687dd3827701d46e fe72a66c85df0ab333e3a37d0332df67 bc15061b61cf6b5002c58284591f97d4 dbae1b46b67ad6769ea63d33ed3d44bf 99acae6a2b24652d92a1ee4cbf8faca5 d0dee8a51da30ed74bcadcd7d8e43537 8bcec52021ee6467bf008848d55b7a9f 32a4d183139d99ace97a18b276d7c169 2fe6a7331192dc6bf073d906fba93929 be5ed9c02284fa6aa31037f0e2d93470 8f146ce7c43f38ad3673dabe93c7c5bb 8dc69d65ac9e71a130bc6dbeb71d2f03 d30a7f0a26bd40402be57831cfa4da69 26583b57be9de00bee14d30f2dc986d8 895453b352ae060e0ec5f7721547a3d1 254482075b19ff911db5e5cb26817498 21b849f15da11a7cc5122773552e0fef f59aa47df669ef50955ce1a0512d3abf c89812744d8755bef87bedf499bd19cf 5af6e343611d715985e35e7a3e5671cb 317c0beca40f29f3aff863b8f028fd48 85943cff5ec22653bc51c22adeabe685 2c47de0940a8213b1ba751a7cd0b649b 14b20f972cfa94f77bc33eccc5ffca4c 9cd758ad7582755a7f52f2b003787795 5d4a0085030ee808369a8f29be41e0d2 c88977d348c943b60eea8f1571e9a7ee 81e706ae981fcbfceea2ac547fdd7f26 35bfc371d940cffdc527b7b4dc954456 4a146c717d33b2f2719e648bd7d22059 541947958ad35c1591eb3ef433141196 34f493c9b3b75210406fecac2dba0f22 a95851baf426c85eae4419617db902a7 514a4ebb290842347e3f7ac7cbba276e cfce45c3a7af608093fd99677d635891 ef571c059fe495366b01e977f7494fbc c73d4cf2ab0b699fb47d44488b82fd40  f79e815e1e628908e4ab605022b0ac09  23b560e97aacfd0422ed3f8a1104e7d7  eed50a4877de2fd21b146b4f0c7327ac  6d7d9658988c6e2b5e45ea087f1b9155  502b6544978890cda2ece2d14bfdba01 ca08eabd09756731f095632656d45b01 694caec8dad781d338acf4a5b69c3058 14e1a93cf2379225b1fd6ed0f186bacb 6488fe0aaa024ba9c06647578b5a6caf 5ce3a51e54f30fa53b1e56dc64d11340 
4d987a66ee0e67b68c5c4428097eb22e c3fd5b1d03591477953d49f0f19600c4 08bd73939f7bfd85daf2eaee7e9c9bf8 93a769a6b67e5f358eb9ad0e03c53ec3 654f282fc504b4dc9af5b8e66a4d1a87 b29143b255f4b97da62310d85fe9b7f3 da3328e15aae60ea55a8765511d0327e ebcbd71a778e6a1a776d2d50d0801fed c80037fe31359840bb73250f330b60e3 a3a69f621716f9345def32e2b5e7b011 bd759efe26a5b4b869d3fa5e4693080d 723858f72b4469efb79c323f75bc181f 8021f872c32700d46738f3591cb4c58a  47c7eb4df73a5d553cd2f7a360ff5872! ad492bcae03f566b36a19e31f04d659a" eb8ef4756ed538fe480d979e740a04d8# 5db2cf37007f874e25eb2c901917e15a$ fa3729663b98de0c0af7913e9f30c19e% 504572e3afd673db749ee5e8e3e57b97& a6b6f29a1196cacfc392e3d71f55e2a2' 0e5df3d01cc073e3c9674c2534169f03( 06845c67bc4203081a981200f33e87eb) 98d250a339a635f20e26397dafc6ced3* 1830c14ead81ad012f1db0e12f8ab6a4+ fea5c05e9b8245c5ca46f6673f59e0bf, dd222108ed685d6a936d2011ea220998- 30a88dc67e6a522087c3dfc8019550a6. 126d3c82f184c4d9058cab124f287968/ 9616cf56be891f816aa7abc6a13c9e8f0 ae881162d813e498f40bf33b8e42d54a1 32875774cffdc8c820f91d8e86f9a0e62 14d168c89e381de6fc3091837a557e483 868a4fe285b85fa1b8eb40071d5397be4 38aff035451e491e5905508b9a173fb65 201b6484b6abe3d4c587464be6cbd5746 67f9c74eeea5a4609bd1d2fab66ad3cc7 2e202c4e4e803dbc6d54c9e3117afedd8 0c4b17f88dae47697d3bceb3663f47ce9 fc1f49989e78fa71bc6404ff2a148576: c18826df5af5da174f580164c805a38a; 919764896275058362e55f0849d2b38b< 4b13979d719e78a50cb0eac34d72dec1= 0bd7addf1b999f01ad58fec8412b8931> b8b19c4f0719a5d5b965a0f50f05827c? 1c7d1f74c2c43b872972f974eb196915@ d8f7f446987e67092b6256c10b1ee69eA 9e1c6b703a4312cce306c7757688a351B a0b0961c7ba318abff3e5a93f09f495eC 96a7dbb0e2c57e289c05b82e2db880beD 24dfe6a7325e4f2be7dbd522ef613e42E 1edc5497d3daf6d914427d32b71daa52F 6fea19332b7fb980e2124d0339c596c2G 55937e5781c91185e27a91216008e4caH 43a9c44adeac51fd3ab95081da56d02dI f3194c2eb241f821239009ca55dbb30cJ ecf9eb9fa3970ff27a221e626395c75dK 823698689031b0b9c2c5152f92ffc70fL 3b65e1db6742cc44493248fff266f564M 10710d3d706451c3536228e4b7d146dbN b428acd97e6af974db1b053a2bc2b5c8O 7f4909ae8d31a3995fa2c200c690a21fP e7a3af060bcc6686986ddfdf5536fac8Q aadef7fba5a0754fb631409c11fb331fR b00d44992702bd3743d7f353638d42f9S e22afe3cb91239d19aeb01ba79015173T dab4d3e6078cfc510502a47c98e82081U 7dd981802694dec3ecdb61ee53805956V bfdc8d2e7693336b4f3781920d3fa253W 0dcaf9358eddf685864afb883cdf2363X 9b0549bb6e63f681bf133d2c7996055dY ed75725613954c459019eec50b431f9fZ 9f1913b781d2cde1c8a4c57b7dc2ab83[ 4fccea69a702bcb90cc0147c0fff2995\ 7f99061140cbf6c9081a204380efd8e1] 50e96bbd1a8267119529843a1acb43a5^ 2de266b1ffa8e9ec6383c98096caf8f0_ 2ca928ad9749bb9726c35d6528fefec1` 5a02c9985adcd7543dff8c846cff62a5a 9acad60e1ad588fb9250aab68dee645db cfe4948493ea1b183b006512be10924ac 53729ba7429e5bd2fe96229b14452bdcd 9da7a39de7e82006cc9533681adf765ae 7ee56e891d069e18c3c8eb866924a26cf 955d8fcbb35e8bc04c8df0c6aae34d34g c04b03fdea7f81c8c52fc540c50a24a4h f99678fdc17d27a97c79faac3f2980a4i 54a5af5112600a1faf06be268f95616ej 8b111e12f8f73205bac732c7479a97e8k 3add7df323955b22d1cea664ef38165dl a1b971c01ce0b220275c218adbe717c1m 01b99cb344ed2530f7d80897ffe257a9n 4ab53e40e06f3a9b0b9b2669d8a52f65o 66a605fcdaa020dbdd6136f241ea79d9p 0be23adca941c1335eb04db17dc66e98q e73fc2655e1cca4a2b1c419da3630c72r 7a20f7055ca2c7462cfa6a7aef16ec28s 39b5567d2d979a9172c02435a5ede59dt c57b70d65a63c4bf9f12c179c2416839u 5df503c1ddf8f3f4ac9e49e983e3d7fdv 28560cffcd31045fa07dd88557b43e05w da247df6fe91b74bf38d3ecc1c05b4b0x 6be3ad080a12e28c46c9ca4f94710ac7y 95f07780b7e9b8b504c2563e63950663z 7f2aa6f3481a4039dcc8d1b8be32beb7{ 
41cb0c62655aa8da8968a06cde0f14a1| ebd1dcd162acede67bb633c42c7b2ac2} 88b4862756df007aa0c6bc2ba049708b~ 498ae808db957a63e20e836e1bf77f54 3b74a95b189e134bb34c191b7693455d 6be678de197b54f9a04f6c984b91ef22 80b20e907aa4fcf2309796bc303d151d 58d6cee566e26e231916ccc2093ae053 acba6a53153faccd2c02a9c0e8a2d941 f094a383692eb30761eb3d5c93240236 ba0941f03086680ee946a10dfe90cf12 c9f68eb27669d95f1e365fc085b51ee6 7fee9350277038eef98dee70b4190563 1a4ff0ff25ff4e919711422c2f605519 5042942b77f5b924b8f8c11de37a3626 5decdad9c550489629ce441d32a69a1c 101968ec709b68fcd964a68ff226dcd1 bb173654871d294466912ef22f685faa f9bc8716f02b0a22f03b1a686895aae1 34e40677eda6fb724e9323b0835d5ab0 ce0a68ef210192a51f60ec3eb8e3e92e 48eda83fda558aea8fdd32b9c8dfff98 0f55566622d8244073432b2e67b24baa 956d0a6f5a8e1d2fbb39df909cc14894 5fb5ace22c46fefb30969da7bb0dcb74 0f2a155a57fe6d0bf34767080706b3e1 aa9f89f9af07d53e077462471423f3ab 921966b15e5f1508e8650d053443c385 1a5ff65c7f309159df626f810d96ae8b f32674aec6d3fa6a7aa2323cb35699fc ecbf086d6ccbe5e8c2a69d0afb144662 1c40ac02c49ea9e03881b0f183373417 3f448350064694e647b284259bdebb91 c9a1082e87563e01347729352358bbcf 816b94bcd935ca38afe71fd1e3c5356c bfbf3b84b315e741c4345ebb43e2b9cc 37c7d17f9e371c1e11a0e22b1ce15ab0 fe98555e63f547c33b3987391cd34d5d d329ff7ff76454d2282c9f01c2798588 ee7ce85e38500b28d9d79b63c7fbf9a6 8659c05e82a2d1399cbc3fd26ab938e4 3677e15d86603bf0a6bb50f8b010afe7 30ebe1a33c21c25c55e4fdbb5894b832 30273eb317696f87e7b6fd37ada4bb8f 54b4964000ad1631e547c46a828ed1a0 55ec4b057c122957d25013675512c73d 072698f9668d5a094cdc1a672acc4747 8c43291754ba53fa1e206f0ac7a18772 a5185df881810390f29afa3ba3ccf522 1893b0ad352ee7db0f565bc4594e6dc6 d3f59bdb915bb43f2ecd96ad9296ae92 2b148d88c49e316f85b6e7863089a886 cbb7ea319a7633cf0917c6e5cde0e923 a70441600c181b747a763ce5a7414a3f 68c05655a58355526692d3a1f0968aff f8026fa73c97b3f69cf4369cf194fb8b edacf632dcadc21c328669befafe2af6 e2c355360209f0b71b70b85ed0f9532d 97e0a84031c426df75af82aeff951f26 1f926dee6544b252d38067cc7753383a 3b795f0122b28b0e6a07bec73894acee e2f0caba18b002ca9762ed56167c6188 02ef9a59d6da8b642271166d3ffd1b52 3bb6760ebcd348e0c695a5fb5fccf994 42872dc875fef6070dfa78984184c096 7103ca740aeadee8bdf7cedaefac990b de391a8294254c47e8f460325eea85f8 1efd43e712291698f34853dac3b67f54 c7ec57695a6d83b461eba523752755ec b1141c1ab18997c3339e8229e0b9f41a dc7d3610c77e7c118082f25002f26cd4 4516f28133b919f8d8039ece23d8dd2e 2cec2792e54f61e12e1a0b2507826118 66fb1fc4317badb5dfaec2bf8e5055d7 bcb349a220a0c33a6b780dbc29a25f71 ce1d8d111ba2aaec28d68a04e483bf3f 74c2326ccd0bc2bd9e0f23f15b79ea16 ac809fd715cced98911f73f1dfb1ffb9 22ed70e219c67c2bf4e20e0b144b12bb cf8a25fe1a1b134a027842e7769a285b 04c7e0ea3038f942f5a28778a74cd1c0 a875cba4e2e4da128cec805cd014731f b8329e527429f07e08f04ecf93563779 d9efe89c658c7996e7738e1bb777c9f0 e0f1938338e6f9b8cf8408e672ee6cd1 c1dedbc2ee20ed310022b9a1976d56c4 d5ef913e6aee3067001dc610ac8af1d1 a621ed0faf5f6d7f3ad8f5e66cd41faf 1c0a00b159b2154357cc9bc7499c58b2 6de4955c509e8f32f905760cad0beeae 8f0a7d866b4e5ebf1e34d0a44eb950a4 d8db52dac1023a8fa0e56e92670d7ac1 51757b1a0a985640893725af5d87d350 c5dfb8aa2b481cb89e2602fc20941587 726f76e9275bfa07ccb329dda324c58f 41c686088d124f17c99dcf9bf3acf9d5 01ce91fd8dbecf637eb5e67cdab5c5aa c2562f279f2b32428e71ef8e8e1cd3d8 d51cd90e95b57acb614698dc580759fb 8c255f9dc0be58d1389877ec645f1a17 d12759fe8dda1d65fe9077cc1ca9cf28 44f993a57b632b8fde0c34ed075e1d66 6401660183a33dded59f163fc63aab4d 1e9fc37a1112b198ed3d0543db7863c8 685ea779ee012329ec2a171f1823f8a8 b1413b635612c7db2c140bc671789ec7 b0ca0d2568c23c26c6704017eae471e1 
1a8d25f7529d783a219ef689701dd4e2 491ef1635067f62c395bdffd58966092 aa24dbd408e5d69c6b0d62846145a244 5c82bad13c4ebd2ed1bd23a8c2e37fc5 401d7866836c54f3cd18bd0fc814885e 684721318896ed01048f3502333c214d 839fafb762395fab1ba257442ec29e7f 6b82e6cb771d45d6c925405b66299d7f eecc4a4317225eb579540e82ab785716 a895b66cac813ac67ce1c908259b646f f59a4d48a03cc224462bf83e4d4ab126 79dcabe7f92f8cf2723b796dcd2f239f d781fa7e866ac295ce9a56bd97912983 0b925631436d3775231b60f9d0011cae 4075a90811bf9c19f02f5aa2ece6da37 2ff525a1e58c54525e7f2d68a6c0b23e f91c81859bf6eede55eef9c915dd8bab a5551fc60aa8bbaab737487552ea2a75 a4c08e2c3a34038ea877b16d01736837 4980b85712efeef1fc4563a9a949fe2e 5601f094afc823d75ccb34440645a99e 21a0198d0a83a8f3ec0e401067138940 56c98126dde3abb2d263088db55e12c8 393e6e2cbe258ed27a4ca0a2ee0d2cdc a659784c026bd9c8cde94da36aa656d7 f16a387e898dbae6432b2ed0241120bd 3125abe90c7ae91aec06eb104d71845f b383ade799e9c83c649cec37705a251d 1d17561d16d803f652b0d6bdd671c1cc b0f7e29de400c6b517b39230dc1b4365 3bf2fc236887eda11c969b84d30f9fbf 3cdd97ba69a504117de24626ba790e96 4132561a08d25757e4bee9f73ec4a70a 3a1c2d5547ddb7f76306a042f8bef6a4 30faafef155964edd9db9374f36c45bf  b1e22f094d8c64fb56dbacf680dab8c9  b92e89d105e0a6e09e5984879c4c7809  ff3df50eea7490dd794148649c480fd3  3982159c2603cacc4bc4893003bda323  846cbaf603b77be035fe0cb61d14a23c a5df9a825c3d8eeb228a6781c3db134c 8da02c7114ba0d9ee40312926e037a8f 7eedfc5c8d0d6b8852fabb20d52a8c5d 145f11599b92ad3a4c93896b0edf8c0c 87899779aec8d38dfaf48d895cfe8021 9d3ec96491b522546acbdfa3b21b6071 1d284bea12006c939c03d622ce2abd5e d46bfd723f625e317606ae28580eba3e d3e265209c342d15d93a910e38ad9be2 e8fb2b3d01d369dbb5d3764548193ae9 1b963d0d7457ac133aed9497f03b5a64 fb4f7d9d3f91ac2f22a542fbd13019c1 c4f955168102011d272b685aa772c10a 185df2b73b4805164470dfaea2d05223 513de43781dc4c2831eb83b23940d66b 24443b70088636edf08f0c9572d7faf4 475741f076c42d2c75d2d63c14fe4d3f 848587c4a03949500c5815bd6edb7b6e  2f4c5e75c0e4cdfcbddbe93ef1f410a3! ed950935b660fb10449d5f229dc13fb9" 626b8ee112be0ec208bb16445611d001# bb370945a6777f712cfd963c55f2ff54$ 59fefb91cc66ef6930b0201e4bf24cbb% 1f0a072da4906270a684a389c1586582& aa668eb01ca9dcc59aeb1454b600c8c8' 8cd2a871fab61bdea4ef23a295c371cf( 3a69cb2f27ac76a05405efbb5088d249) 44543d9c6bb193c64d38c9bdb6790eec* 5437cb1c86c72638cbbc12896f88d1b1+ 724c43ad3632cb8c5c6230c874551bd2, 9590aed21545774c71a124f2b5200691- cba5f5b1153235425628bd77a28257c7. 2fa276cce32e900f0f68d30e460b8f61/ 24523c45ca39b6c81a01ce286c0c092b0 91d0b3b7c0329b9a164b6e04a181f1d61 c99439f78b73ffcf55858ae828aac6082 84755bc954af6d3b39bb4c369e5992e73 063fa4f5a6fa894f7f8243468324ac5e4 49232f27012731b0a65805c0919e2bbc5 4b8228f4d7d3809aa6827e8d9e11a9ae6 f023384b8f989d014dd2ead7f10db3077 4ce1be9cea00a1eb7f960600e1ff3cfa8 1b75626f6834620dc2c729a1a81f497a9 b5bf6383b53e0add0d6589b73949462a: 344b020d867e3210563eaf40b0c557ef; 136e9f5aa470119165e59d202fdb318c< ca96b8fd01aa6c508305bacac37da6c2= 07b2b1c3771f46d82c90b20a79fb3d04> dd7e1e45831871f719cc3cd68f568e7c? 
51ddb685cfb1775931489ebbd3eef6ca@ 16564bc936c839c893f951bab613a30bA 4a5387c4bc61f2d8f3d9d2de983ba556B c5b61a414bdb4305544bb4f941e0bfc9C 465e9cab9abce306c1e0f7459b325c33D 7cdabbffaaa6cc6d9f478d91f312a8eeE 3e2c7eb0316409806267a8d9845b0b13F 77a517253ff10afeb78a37bc13d649a0G 62bf37c50994a06e6c8c93355ad1caa0H 636c27859c1ac0a7994ca8ed145d03feI 79e9e337b10e2d298bb1b3bde946782dJ f35ce9c514e1398308f5f84ed50b260fK e5c80b4a23b96b2fa35b6da6cac2bf73L 9118606a87ac19f8d041da20f16f3236M d8e8459398f3f096775f4298666880a7N fba3d9b3872ff9021d66434323fb9565O 4fdf8dcd9451f5742982d716fa9f0536P ae27459b7f5465815b1f565d792083ceQ 4ea5683e2c2a183c11c45dc8fbd0c67cR 6edca9464612efff71d8f97299f01663S 41fedaa9200e2933821f84ecdc0b0772T 469e4ff2129066178b74b112829fc03cU 41735c13fb7f63db49b10fba5d959ce3V 76174505aef45f134d8068d5ad2c42aaW b8afb7bb30b7a9170b7c6dcd40251d54X 4c4606a3dbb4f5b8bb1ad1db6dfc5617Y 3ef461f213bfc675ddf0b1bfeef2dc52Z 1f50ac250618ced095a2a5c34bae7651[ b3d4d129c955517b446df657236ba5a2\ fc15824cd9e9c29e1319d32843452b54] b2866fdca13681232057e3e85a1978cf^ f8f937eeee7762f446a0e87292f98eb6_ 052128d7d424728578efe7852b0afd0d` 2514cdaab96a1a42cf15506f2ecf6439a fc5b641a0b0408d99ddfb2a5ba64da59b 958bc15b660bce0b894665d13f910f5bc e063d1c3c9daf7390e5475e48d9b01c5d e0472171de6fa02a1f58e392ad3d000ee 2e55763e77234a7bfc3a311067a0c7e1f 5ada68b9a081358e1a7d5f1d351e656ag 162ed46308ef9dd1600fb0fe7c90b186h 1b448e00c078f81e49a61933072345d2i 9af77e369d2ba4b3bb6e0e808ed84d91j e77bac543df42865af751a32a2942fdbk 3ae3e825c5e3862af6c17798e2b40a37l f8a71161ac98c5ceafcc75e57ace7230m 880968796af86b3d6270c228fe8e7026n 81351ef6b4d49a3a6b2fc24745d70305o 26b8b1509edf09c438db1634308c3c90p 26133b4163fdcd22d86f4e04e74154f1q c4aad64f84f51ff331b986a67af4ed3dr 29a92aecac56c855cbfc0debf9bda0c8s e737a3a864791ec52f1398d83786f4f2t 7a5d5e6f85da7ca91daa2d803e8f9279u acfe4c003905a7074aeaf385b78ad9e0v 75258a676bcd7e73c2437d6a2012d490w 909755d7142ea53c7f61c97c3586f26cx fd86856409fe6433d719558a30910515y 44ff26d8105cf822417b4b0c4eb27c02z ad41fe8f8be5b01c96549309937e3b14{ ab6b063ed43d8cb3008e157a0a305c47| fcd4f95c05b868060121ff709085bf21} 57b34afdff2e84fb9b0befd219edcb38~ 047b7fb62a5e9d2711e639ae1cb1519a 0160e14a78b18b903618f11bc732746e d2613a43f4ed1dda8a673e7fb251ebef e7c8b8c8fced8eead4b767bffe6dd544 e31b5affbe5e23b79a8619589f3b6620 67e23140639733766b7e127f50221289 db793caa585e078770021a7979ae8ea2 7dd9c9124ec3f118a6cef1f0d422c011 01e0b7ac306895be84179f2715af269b 1685b25b43eb070cd7655de8c2d8c0a4 500d6b485cef4c52aad211d6d6e6dbc7 ee0fb26fca3afb05e0b288d6fcab899e d1d750ce3b00503c204516d27ca0d3b0 ebe0e9790089ad4f40151cb4a3231a8f 54d098d009b89779530eaf43a1bb136a 0a69950e803a5d9e7166c450e44f43cf b685f8246fb6fad0fd91574c444b8f51 90d32ffe026535b392c0bad850f213c4 ab4ef4399912b0507d8d1187e874684d d86ef5d6394f5dbeb945f39aa25e7426 a049763053c277b16c2a318f41eb23b4 699071a4a7ce68918e4cd93a61275ecd cdd14cce4ec44b3235923652b71e947e 7e598ad34909a3fb8ff1623f1ede11a6 b44621e5c80607cdfacbf7a81e1cbe41 e8e6b7fc969005938de8ac7ffb94f17c 535696cecf56f03d3401d7d6dbe66d7b 9f718b8f66b79c3de80546ca9a54539c f352c1f1efecf483511c2270aabd0ae6 0305a4993ecf2d8ef4149fdfc7592603 cb2fe0146e2fbcb101050edb996a0ee2 997056ba80681bbbdd5d09aa591eadc0 9079bfebcce01d4b5c758067b1208c31 3c9c437f27aca05f8db167cd080ff1ec bfbed36e63b69fec4627424163d20118 868528ca947bc57b69ffdf83e6b73bae 154709e160e8cada6bfb21115acc80f5 1d2e5f3444ca750c85302ceee2473331 d29fe3c70564fc0f69f2c03e0d1e5561 fe30ff0f71a38a39cf1717ec2be3a2fc 
metadata.yaml000066400000000000000000000001401462552636000345670ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/data/table-v0/89af91c0-033d-4e30-8ac4-f29a3b407dc1
uuid: 89af91c0-033d-4e30-8ac4-f29a3b407dc1
type: FeatureTable[Frequency]
format: BIOMV210DirFmt
qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/test_archive_parser.py000066400000000000000000000625261462552636000274430ustar00rootroot00000000000000# ----------------------------------------------------------------------------
# Copyright (c) 2016-2023, QIIME 2 development team.
#
# Distributed under the terms of the Modified BSD License.
# # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- from datetime import timedelta import os import networkx as nx import pathlib from unittest.mock import MagicMock import pandas as pd import tempfile import unittest import zipfile import pytest from .._checksum_validator import ChecksumDiff, ValidationCode from .testing_utilities import ( DummyArtifacts, is_root_provnode_data, write_zip_archive ) from ..archive_parser import ( ProvNode, Config, _Action, _Citations, _ResultMetadata, ParserResults, ArchiveParser, ParserV0, ParserV1, ParserV2, ParserV3, ParserV4, ParserV5, ParserV6, ) from ...provenance import MetadataInfo from qiime2.core.testing.util import ReallyEqualMixin class ParserVxTests(unittest.TestCase): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir @classmethod def tearDownClass(cls): cls.das.free() def test_parse_root_md(self): for artifact in self.das.all_artifact_versions: fp = artifact.filepath uuid = artifact.uuid parser = ArchiveParser.get_parser(fp) with zipfile.ZipFile(fp) as zf: root_md = parser._parse_root_md(zf, uuid) self.assertEqual(root_md.uuid, uuid) if artifact == self.das.table_v0: self.assertEqual(root_md.format, 'BIOMV210DirFmt') self.assertEqual(root_md.type, 'FeatureTable[Frequency]') else: self.assertEqual(root_md.format, 'IntSequenceDirectoryFormat') self.assertEqual(root_md.type, 'IntSequence1') def test_parse_root_md_no_md_yaml(self): for artifact in self.das.all_artifact_versions: parser = ArchiveParser.get_parser(artifact.filepath) with tempfile.TemporaryDirectory() as tempdir: with zipfile.ZipFile(artifact.filepath) as zf: zf.extractall(tempdir) metadata_path = os.path.join(tempdir, artifact.uuid, 'metadata.yaml') os.remove(metadata_path) fn = os.path.basename(artifact.filepath) fp = os.path.join(tempdir, fn) write_zip_archive(fp, tempdir) with zipfile.ZipFile(fp) as zf: with self.assertRaisesRegex( ValueError, 'Malformed.*metadata' ): parser._parse_root_md(zf, artifact.uuid) @pytest.mark.filterwarnings('ignore::UserWarning') def test_populate_archive(self): for artifact in self.das.all_artifact_versions: parser = ArchiveParser.get_parser(artifact.filepath) fp = artifact.filepath uuid = artifact.uuid version = artifact.archive_version if version == 0: with self.assertWarnsRegex( UserWarning, 'Artifact .*prior to provenance' ): res = parser.parse_prov(Config(), fp) else: res = parser.parse_prov(Config(), fp) self.assertIsInstance(res, ParserResults) pa_uuids = res.parsed_artifact_uuids self.assertIsInstance(pa_uuids, set) self.assertIsInstance(next(iter(pa_uuids)), str) self.assertIsInstance(res.prov_digraph, (type(None), nx.DiGraph)) self.assertIsInstance(res.provenance_is_valid, ValidationCode) if version < 5: self.assertIsInstance(res.checksum_diff, type(None)) else: self.assertIsInstance(res.checksum_diff, ChecksumDiff) self.assertIn(uuid, res.prov_digraph) self.assertIsInstance( res.prov_digraph.nodes[uuid]['node_data'], ProvNode) def test_validate_checksums(self): for artifact in self.das.all_artifact_versions: parser = ArchiveParser.get_parser(artifact.filepath) with zipfile.ZipFile(artifact.filepath) as zf: is_valid, diff = parser._validate_checksums(zf) if artifact.archive_version < 5: self.assertEqual(is_valid, ValidationCode.PREDATES_CHECKSUMS) self.assertEqual(diff, None) else: self.assertEqual(is_valid, ValidationCode.VALID) self.assertEqual(diff, ChecksumDiff({}, {}, {})) 
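    # --- Hedged illustration (added by the editor; not part of the original
    # test suite). The tests above drive the version-specific archive parsers
    # through ArchiveParser.get_parser() and parse_prov(); this helper sketches
    # that workflow end to end outside a test. The path 'my-ints.qza' is a
    # hypothetical saved artifact; all calls mirror those exercised above.
    def _example_parse_single_archive(self):  # pragma: no cover
        fp = 'my-ints.qza'  # hypothetical artifact path, assumed to exist
        parser = ArchiveParser.get_parser(fp)  # selects ParserV0..ParserV6
        results = parser.parse_prov(Config(), fp)
        # ParserResults bundles the parsed artifact UUIDs, a networkx DiGraph
        # of provenance, a ValidationCode, and a ChecksumDiff (None for
        # archives that predate checksums.md5).
        return (results.parsed_artifact_uuids, results.prov_digraph,
                results.provenance_is_valid, results.checksum_diff)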
@pytest.mark.filterwarnings('ignore::UserWarning') def test_correct_validate_checksums_method_called(self): ''' We want to confirm that parse_prov uses the local _validate_checksums even when it calls super().parse_prov() internally ''' for artifact in self.das.all_artifact_versions: parser = ArchiveParser.get_parser(artifact.filepath) if artifact.archive_version < 5: parser._validate_checksums = MagicMock( # return values only here to facilitate normal execution return_value=(ValidationCode.PREDATES_CHECKSUMS, None) ) parser.parse_prov(Config(), artifact.filepath) parser._validate_checksums.assert_called_once() else: parser._validate_checksums = MagicMock( return_value=( ValidationCode.VALID, ChecksumDiff({}, {}, {}) ) ) parser.parse_prov(Config(), artifact.filepath) parser._validate_checksums.assert_called_once() class ArchiveParserTests(unittest.TestCase): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir @classmethod def tearDownClass(cls): cls.das.free() def test_get_parser(self): parsers = [ ParserV0, ParserV1, ParserV2, ParserV3, ParserV4, ParserV5, ParserV6 ] for artifact, parser_version in zip( self.das.all_artifact_versions, parsers ): parser = ArchiveParser.get_parser(artifact.filepath) self.assertEqual(type(parser), parser_version) def test_get_parser_nonexistent_fp(self): fn = 'not_a_filepath.qza' fp = os.path.join(self.tempdir, fn) with self.assertRaises(FileNotFoundError): ArchiveParser.get_parser(fp) def test_artifact_parser_parse_prov(self): with self.assertRaisesRegex(NotImplementedError, "Use a subclass"): ArchiveParser().parse_prov(Config(), 'doesnotmatter.txt') class ResultMetadataTests(unittest.TestCase): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir cls.uuid = cls.das.concated_ints.uuid md_fp = f'{cls.uuid}/provenance/metadata.yaml' with zipfile.ZipFile(cls.das.concated_ints.filepath) as zf: cls.root_md = _ResultMetadata(zf, md_fp) @classmethod def tearDownClass(cls): cls.das.free() def test_smoke(self): self.assertEqual(self.root_md.uuid, self.uuid) self.assertEqual(self.root_md.type, 'IntSequence1') self.assertEqual(self.root_md.format, 'IntSequenceDirectoryFormat') def test_repr(self): exp = (f'UUID:\t\t{self.uuid}\n' 'Type:\t\tIntSequence1\n' 'Data Format:\tIntSequenceDirectoryFormat') self.assertEqual(repr(self.root_md), exp) class ActionTests(unittest.TestCase): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir action_path = os.path.join(cls.das.concated_ints_v6.uuid, 'provenance', 'action', 'action.yaml') with zipfile.ZipFile(cls.das.concated_ints_v6.filepath) as zf: cls.concat_action = _Action(zf, action_path) action_path = os.path.join(cls.das.single_int.uuid, 'provenance', 'action', 'action.yaml') with zipfile.ZipFile(cls.das.single_int.filepath) as zf: cls.import_action = _Action(zf, action_path) action_path = os.path.join(cls.das.pipeline_viz.uuid, 'provenance', 'action', 'action.yaml') with zipfile.ZipFile(cls.das.pipeline_viz.filepath) as zf: cls.pipeline_action = _Action(zf, action_path) @classmethod def tearDownClass(cls): cls.das.free() def test_action_id(self): exp = '5035a60e-6f9a-40d4-b412-48ae52255bb5' self.assertEqual(self.concat_action.action_id, exp) def test_action_type(self): self.assertEqual(self.concat_action.action_type, 'method') self.assertEqual(self.import_action.action_type, 'import') self.assertEqual(self.pipeline_action.action_type, 'pipeline') def test_runtime(self): exp_t = timedelta exp = 
timedelta(microseconds=6840) self.assertIsInstance(self.concat_action.runtime, exp_t) self.assertEqual(self.concat_action.runtime, exp) def test_runtime_str(self): exp = '6840 microseconds' self.assertEqual(self.concat_action.runtime_str, exp) def test_action(self): exp = 'concatenate_ints' self.assertEqual(self.concat_action.action_name, exp) def test_plugin(self): exp = 'dummy_plugin' self.assertEqual(self.concat_action.plugin, exp) ''' Import is not handled by a plugin, so the parser provides values for the action_name and plugin properties not present in action.yaml ''' def test_action_for_import_node(self): exp = 'import' self.assertEqual(self.import_action.action_name, exp) def test_plugin_for_import_node(self): exp = 'framework' self.assertEqual(self.import_action.plugin, exp) def test_inputs(self): exp = { 'ints1': '8dea2f1a-2164-4a85-9f7d-e0641b1db22b', 'ints2': '8dea2f1a-2164-4a85-9f7d-e0641b1db22b', 'ints3': '7727c060-5384-445d-b007-b64b41a090ee' } self.assertEqual(self.concat_action.inputs, exp) exp = {} self.assertEqual(self.import_action.inputs, exp) def test_parameters(self): exp = { 'int1': 7, 'int2': 100, } self.assertEqual(self.concat_action.parameters, exp) exp = {} self.assertEqual(self.import_action.parameters, exp) def test_output_name(self): exp = 'concatenated_ints' self.assertEqual(self.concat_action.output_name, exp) exp = None self.assertEqual(self.import_action.output_name, exp) def test_format(self): exp = None self.assertEqual(self.concat_action.format, exp) self.assertEqual(self.import_action.format, exp) def test_transformers(self): int_seq_dir_citation = ( 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0' ) transformer_citation = ( 'transformer|dummy-plugin:0.0.0-dev|builtins:list' '->IntSequenceDirectoryFormat|0' ) output_citations = [transformer_citation, int_seq_dir_citation] exp = { 'inputs': { 'ints1': [{ 'from': 'IntSequenceDirectoryFormat', 'to': 'builtins:list', 'plugin': 'dummy-plugin', 'citations': [int_seq_dir_citation] }], 'ints2': [{ 'from': 'IntSequenceDirectoryFormat', 'to': 'builtins:list', 'plugin': 'dummy-plugin', 'citations': [int_seq_dir_citation] }], 'ints3': [{ 'from': 'IntSequenceV2DirectoryFormat', 'to': 'builtins:list', 'plugin': 'dummy-plugin', }], }, 'output': [{ 'from': 'builtins:list', 'to': 'IntSequenceDirectoryFormat', 'plugin': 'dummy-plugin', 'citations': output_citations }] } self.assertEqual(self.concat_action.transformers, exp) def test_repr(self): exp = ( '_Action(action_id=5035a60e-6f9a-40d4-b412-48ae52255bb5, ' 'type=method, plugin=dummy_plugin, ' 'action=concatenate_ints)' ) self.assertEqual(repr(self.concat_action), exp) class CitationsTests(unittest.TestCase): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir cite_strs = ['cite_none', 'cite_one', 'cite_many'] cls.bibs = [bib+'.bib' for bib in cite_strs] cls.zips = [ os.path.join(cls.das.datadir, bib+'.zip') for bib in cite_strs ] @classmethod def tearDownClass(cls): cls.das.free() def test_empty_bib(self): with zipfile.ZipFile(self.zips[0]) as zf: citations = _Citations(zf, self.bibs[0]) self.assertEqual(len(citations.citations), 0) def test_citation(self): with zipfile.ZipFile(self.zips[1]) as zf: exp = 'framework' citations = _Citations(zf, self.bibs[1]) for key in citations.citations: self.assertRegex(key, exp) def test_many_citations(self): exp = ['2020.6.0.dev0', 'unweighted_unifrac.+0', 'unweighted_unifrac.+1', 'unweighted_unifrac.+2', 'unweighted_unifrac.+3', 'unweighted_unifrac.+4', 'BIOMV210DirFmt', 
'BIOMV210Format'] with zipfile.ZipFile(self.zips[2]) as zf: citations = _Citations(zf, self.bibs[2]) for i, key in enumerate(citations.citations): self.assertRegex(key, exp[i]) def test_repr(self): exp = ("Citations(['framework|qiime2:2020.6.0.dev0|0'])") with zipfile.ZipFile(self.zips[1]) as zf: citations = _Citations(zf, self.bibs[1]) self.assertEqual(repr(citations), exp) class ProvNodeTests(unittest.TestCase, ReallyEqualMixin): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir # build root nodes for all archive format versions cfg = Config(parse_study_metadata=True) cls.nodes = {} for artifact in cls.das.all_artifact_versions: with zipfile.ZipFile(artifact.filepath) as zf: all_filenames = zf.namelist() root_md_fnames = filter(is_root_provnode_data, all_filenames) root_md_fps = [pathlib.Path(fp) for fp in root_md_fnames] cls.nodes[str(artifact.archive_version)] = \ ProvNode(cfg, zf, root_md_fps) with zipfile.ZipFile(cls.das.concated_ints_with_md.filepath) as zf: root_node_id = cls.das.concated_ints_with_md.uuid all_filenames = zf.namelist() dag = cls.das.concated_ints_with_md.dag for node in dag.nodes: md_path = os.path.join( root_node_id, 'provenance', 'artifacts', node, 'action', 'metadata.tsv' ) if md_path in all_filenames: md_node_id = node else: non_md_node_id = node # build a nonroot node without study metadata node_fps = [ pathlib.Path(fp) for fp in all_filenames if non_md_node_id in fp and ('metadata.yaml' in fp or 'action.yaml' in fp or 'VERSION' in fp) ] cls.nonroot_non_md_node = ProvNode(cfg, zf, node_fps) # build a nonroot node with study metadata all_filenames = zf.namelist() node_fps = [ pathlib.Path(fp) for fp in all_filenames if md_node_id in fp and ('metadata.yaml' in fp or 'action.yaml' in fp or 'VERSION' in fp) ] cls.nonroot_md_node = ProvNode(cfg, zf, node_fps) # build a root node and parse study metadata files root_md_fnames = filter(is_root_provnode_data, all_filenames) root_md_fps = [pathlib.Path(fp) for fp in root_md_fnames] cfg = Config(parse_study_metadata=True) cls.root_node_parse_md = ProvNode(cfg, zf, root_md_fps) # build a root node and don't parse study metadata files cfg = Config(parse_study_metadata=False) cls.root_node_dont_parse_md = ProvNode(cfg, zf, root_md_fps) # build a node with a collection as input with zipfile.ZipFile(cls.das.int_from_collection.filepath) as zf: all_filenames = zf.namelist() root_md_fnames = filter(is_root_provnode_data, all_filenames) root_md_fps = [pathlib.Path(fp) for fp in root_md_fnames] cls.input_collection_node = ProvNode(cfg, zf, root_md_fps) # build a node with an optional input that defaults to None with zipfile.ZipFile(cls.das.int_seq_optional_input.filepath) as zf: all_filenames = zf.namelist() root_md_fnames = filter(is_root_provnode_data, all_filenames) root_md_fps = [pathlib.Path(fp) for fp in root_md_fnames] cls.optional_input_node = ProvNode(cfg, zf, root_md_fps) @classmethod def tearDownClass(cls): cls.das.free() def test_smoke(self): self.assertTrue(True) for node_vzn in self.nodes: self.assertIsInstance(self.nodes[node_vzn], ProvNode) def test_node_properties(self): # hardcoded from test data framework_versions = { '0': '2.0.5', '1': '2017.2.0', '2': '2017.9.0', '3': '2018.2.0', '4': '2018.6.0', '5': '2018.11.0', '6': '2023.5.1', } for node, archive_version in zip( self.nodes, [str(i) for i in range(7)] ): if archive_version == '0': self.assertEqual(self.nodes[node].format, 'BIOMV210DirFmt') self.assertEqual(self.nodes[node].type, 'FeatureTable[Frequency]') 
self.assertEqual(self.nodes[node].has_provenance, False) else: self.assertEqual(self.nodes[node].format, 'IntSequenceDirectoryFormat') self.assertEqual(self.nodes[node].type, 'IntSequence1') if archive_version == '1': self.assertEqual(self.nodes[node].has_provenance, False) else: self.assertEqual(self.nodes[node].has_provenance, True) self.assertEqual(self.nodes[node].archive_version, archive_version) self.assertEqual( self.nodes[node].framework_version, framework_versions[archive_version] ) def test_self_eq(self): self.assertReallyEqual(self.nodes['5'], self.nodes['5']) def test_eq(self): # Mock has no matching UUID mock_node = MagicMock() self.assertNotEqual(self.nodes['5'], mock_node) # Mock has bad UUID mock_node._uuid = 'gerbil' self.assertReallyNotEqual(self.nodes['5'], mock_node) # Matching UUIDs insufficient if classes differ mock_node._uuid = self.das.concated_ints_v5.uuid self.assertReallyNotEqual(self.nodes['5'], mock_node) mock_node.__class__ = ProvNode self.assertReallyEqual(self.nodes['5'], mock_node) def test_is_hashable(self): exp_hash = hash(self.das.concated_ints_v5.uuid) self.assertReallyEqual(hash(self.nodes['5']), exp_hash) def test_str(self): for node_vzn, artifact in zip( self.nodes, self.das.all_artifact_versions ): uuid = artifact.uuid self.assertRegex(str(self.nodes[node_vzn]), f'(?s)UUID:\t\t{uuid}.*Type.*Data Format') def test_repr(self): for node_vzn, artifact in zip( self.nodes, self.das.all_artifact_versions ): uuid = artifact.uuid self.assertRegex(repr(self.nodes[node_vzn]), f'(?s)UUID:\t\t{uuid}.*Type.*Data Format') def test_get_metadata_from_action(self): find_md = self.root_node_parse_md._get_metadata_from_Action # create dummy hash '0', not relevant here md = MetadataInfo( ['d5b4cf78-f5e2-44e0-aa24-66b02564e9f1'], 'metadata.tsv', '0' ) action_details = { 'parameters': [{'metadata': md}] } all_md, artifacts_as_md = find_md(action_details) all_exp = {'metadata': 'metadata.tsv'} a_as_md_exp = [{ 'artifact_passed_as_metadata': 'd5b4cf78-f5e2-44e0-aa24-66b02564e9f1' }] self.assertEqual(all_md, all_exp) self.assertEqual(artifacts_as_md, a_as_md_exp) def test_get_metadata_from_action_with_no_params(self): find_md = self.nodes['5']._get_metadata_from_Action action_details = \ {'parameters': []} all_md, artifacts_as_md = find_md(action_details) self.assertEqual(all_md, {}) self.assertEqual(artifacts_as_md, []) action_details = {'non-parameters-key': 'here is a thing'} all_md, artifacts_as_md = find_md(action_details) self.assertEqual(all_md, {}) self.assertEqual(artifacts_as_md, []) def test_metadata_available_in_property(self): self.assertEqual(type(self.nonroot_md_node.metadata), dict) self.assertIn('metadata', self.nonroot_md_node.metadata) self.assertEqual(type(self.nonroot_md_node.metadata['metadata']), pd.DataFrame) def test_metadata_not_available_in_property_w_opt_out(self): self.assertEqual(self.root_node_dont_parse_md.metadata, None) def test_metadata_is_correct(self): self.assertIn('metadata', self.nonroot_md_node.metadata) md_data = { 'id': ['#q2:types', '0'], 'a': ['categorical', '42'], } md_exp = pd.DataFrame(md_data, columns=md_data.keys()) pd.testing.assert_frame_equal( md_exp, self.nonroot_md_node.metadata['metadata'] ) def test_has_no_provenance_so_no_metadata(self): self.assertEqual(self.nodes['0'].has_provenance, False) self.assertEqual(self.nodes['0'].metadata, None) def test_node_has_provenance_but_no_metadata(self): self.assertEqual(self.nonroot_non_md_node.has_provenance, True) self.assertEqual(self.nonroot_non_md_node.metadata, {}) def 
test_parse_metadata_for_nonroot_node(self): self.assertEqual(self.nonroot_md_node.has_provenance, True) self.assertIn('metadata', self.nonroot_md_node.metadata) def test_parents(self): actual_parent_names = [] for parent in self.nodes['5']._parents: actual_parent_names += parent.keys() self.assertIn('ints1', actual_parent_names) self.assertIn('ints2', actual_parent_names) self.assertIn('ints3', actual_parent_names) self.assertEqual(len(self.nodes['5']._parents), 3) self.assertEqual(len(actual_parent_names), 3) def test_parents_no_prov(self): no_prov_node = self.nodes['0'] self.assertFalse(no_prov_node.has_provenance) self.assertEqual(no_prov_node._parents, None) def test_parents_with_artifact_passed_as_md(self): actual_parent_names = [] for parent in self.nonroot_md_node._parents: actual_parent_names += parent.keys() self.assertIn('ints', actual_parent_names) self.assertIn('artifact_passed_as_metadata', actual_parent_names) self.assertEqual(len(self.nonroot_md_node._parents), 2) self.assertEqual(len(actual_parent_names), 2) def test_parents_for_import_node(self): uuid = self.das.single_int.uuid with zipfile.ZipFile(self.das.single_int.filepath) as zf: required_fps = ('VERSION', 'metadata.yaml', 'action.yaml') import_node_fps = [ pathlib.Path(fp) for fp in zf.namelist() if uuid in fp and any(map(lambda x: x in fp, required_fps)) ] import_node = ProvNode(Config(), zf, import_node_fps) self.assertEqual(import_node._parents, []) def test_parents_collection_of_inputs(self): parents = self.input_collection_node._parents self.assertIn('int1', parents[0].keys()) self.assertIn('int2', parents[1].keys()) self.assertEqual(len(parents), 2) def test_parents_optional_input(self): # NOTE: The None-type input is not captured parents = self.optional_input_node._parents self.assertIn('ints', parents[0].keys()) self.assertEqual(len(parents), 1) qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/test_checksum_validator.py000066400000000000000000000073261462552636000303220ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import os import shutil import tempfile import unittest import zipfile import pytest from qiime2 import Artifact from qiime2.core.archive.archiver import ChecksumDiff from qiime2.sdk.plugin_manager import PluginManager from .._checksum_validator import validate_checksums, ValidationCode from .testing_utilities import write_zip_archive class ValidateChecksumTests(unittest.TestCase): def setUp(self): self.pm = PluginManager() self.dp = self.pm.plugins['dummy-plugin'] self.tempdir = tempfile.mkdtemp( prefix='qiime2-test-checksum-validator-temp-' ) def tearDown(self): shutil.rmtree(self.tempdir) def test_validate_checksums(self): int_seq = Artifact.import_data('IntSequence1', [1, 2, 3]) fp = os.path.join(self.tempdir, 'int-seq.qza') int_seq.save(fp) with zipfile.ZipFile(fp) as zf: is_valid, diff = validate_checksums(zf) self.assertEqual(is_valid, ValidationCode.VALID) self.assertEqual(diff, ChecksumDiff({}, {}, {})) @pytest.mark.filterwarnings('ignore::UserWarning') def test_validate_checksums_invalid(self): ''' Mangle an intact v5 Archive so that its checksums.md5 is invalid, and then confirm that we're catching all the changes we've made Specifically: - remove the root `/metadata.yaml` - add a new file called '/tamper.txt` - overwrite `/provenance/citations.bib` ''' int_seq = Artifact.import_data('IntSequence1', [1, 2, 3]) fp = os.path.join(self.tempdir, 'int-seq-altered.qza') int_seq.save(fp) with tempfile.TemporaryDirectory() as tempdir: with zipfile.ZipFile(fp) as zf: zf.extractall(tempdir) uuid = os.listdir(tempdir)[0] root_dir = os.path.join(tempdir, uuid) os.remove(os.path.join(root_dir, 'metadata.yaml')) with open(os.path.join(root_dir, 'tamper.txt'), 'w') as fh: pass citations_path = \ os.path.join(root_dir, 'provenance', 'citations.bib') with open(citations_path, 'w') as fh: fh.write('file overwritten\n') write_zip_archive(fp, tempdir) with zipfile.ZipFile(fp) as zf: is_valid, diff = validate_checksums(zf) self.assertEqual(is_valid, ValidationCode.INVALID) self.assertEqual(list(diff.added.keys()), ['tamper.txt']) self.assertEqual(list(diff.removed.keys()), ['metadata.yaml']) self.assertEqual(list(diff.changed.keys()), ['provenance/citations.bib']) @pytest.mark.filterwarnings('ignore::UserWarning') def test_validate_checksums_checksums_missing(self): int_seq = Artifact.import_data('IntSequence1', [1, 2, 3]) fp = os.path.join(self.tempdir, 'int-seq-missing-version.qza') int_seq.save(fp) with tempfile.TemporaryDirectory() as tempdir: with zipfile.ZipFile(fp) as zf: zf.extractall(tempdir) uuid = os.listdir(tempdir)[0] os.remove(os.path.join(tempdir, uuid, 'checksums.md5')) write_zip_archive(fp, tempdir) with zipfile.ZipFile(fp) as zf: is_valid, diff = validate_checksums(zf) self.assertEqual(is_valid, ValidationCode.INVALID) self.assertEqual(diff, None) qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/test_parse.py000066400000000000000000001257401462552636000255660ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import io import os import pathlib import random import shutil import tempfile import unittest import zipfile from contextlib import redirect_stdout import networkx as nx from networkx import DiGraph import pytest from .._checksum_validator import ValidationCode from ..parse import ( ProvDAG, DirectoryParser, EmptyParser, ProvDAGParser, select_parser, parse_provenance, UnparseableDataError ) from ..archive_parser import ( ParserV0, ParserV1, ParserV2, ParserV3, ParserV4, ParserV5, ParserV6, Config, ProvNode, ParserResults, ArchiveParser, ) from .testing_utilities import ( is_root_provnode_data, generate_archive_with_file_removed, DummyArtifacts ) from qiime2 import Artifact from qiime2.core.archive.archiver import ChecksumDiff class ProvDAGTests(unittest.TestCase): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir @classmethod def tearDownClass(cls): cls.das.free() def save_artifact_to_dir(self, artifact, directory): dir_fp = os.path.join(self.tempdir, directory) try: os.mkdir(dir_fp) except FileExistsError: pass fp = os.path.join(dir_fp, f'{artifact.name}.qza') artifact.artifact.save(fp) return dir_fp def test_number_of_nodes(self): num_single_int_nodes = len(self.das.single_int.dag.nodes) self.assertEqual(num_single_int_nodes, 1) num_int_seq1_nodes = len(self.das.int_seq1.dag.nodes) self.assertEqual(num_int_seq1_nodes, 1) num_concated_ints_nodes = len(self.das.concated_ints.dag.nodes) self.assertEqual(num_concated_ints_nodes, 3) def test_number_of_nodes_pipeline(self): # input int-seq, input mapping, pipeline output viz, aliased viz, # split int-seq num_pipeline_viz_nodes = len(self.das.pipeline_viz.dag.nodes) self.assertEqual(num_pipeline_viz_nodes, 5) def test_number_of_terminal_nodes(self): num_int_seq1_term_nodes = len(self.das.int_seq1.dag.terminal_nodes) self.assertEqual(num_int_seq1_term_nodes, 1) num_concated_ints_term_nodes = \ len(self.das.concated_ints.dag.terminal_nodes) self.assertEqual(num_concated_ints_term_nodes, 1) self.save_artifact_to_dir(self.das.int_seq1, 'two-ints') dir_fp = self.save_artifact_to_dir(self.das.int_seq2, 'two-ints') dag = ProvDAG(dir_fp) self.assertEqual(len(dag.terminal_nodes), 2) self.save_artifact_to_dir(self.das.int_seq1, 'int-and-concated-ints') dir_fp = self.save_artifact_to_dir(self.das.concated_ints, 'int-and-concated-ints') dag = ProvDAG(dir_fp) self.assertEqual(len(dag.terminal_nodes), 1) def test_number_of_terminal_nodes_pipeline(self): num_pipeline_viz_term_nodes = \ len(self.das.pipeline_viz.dag.terminal_nodes) self.assertEqual(num_pipeline_viz_term_nodes, 1) def test_root_node_is_archive_root(self): with zipfile.ZipFile(self.das.concated_ints.filepath) as zf: all_filenames = zf.namelist() root_filenames = filter(is_root_provnode_data, all_filenames) root_filepaths = [pathlib.Path(fp) for fp in root_filenames] exp_node = ProvNode(Config(), zf, root_filepaths) act_terminal_node, *_ = self.das.concated_ints.dag.terminal_nodes self.assertEqual(exp_node, act_terminal_node) def test_number_of_actions(self): self.assertEqual(self.das.int_seq1.dag.dag.number_of_edges(), 0) self.assertEqual(self.das.concated_ints.dag.dag.number_of_edges(), 2) def test_number_of_actions_pipeline(self): # (1) one edge from input intseq to pipeline output viz # (2) one edge from input mapping to pipeline output viz # (3) one edge from input intseq to left split int # (4) one edge from left split int to true output viz, of which # pipeline output 
viz is an alias self.assertEqual(self.das.pipeline_viz.dag.dag.number_of_edges(), 4) def test_nonexistent_fp(self): fp = os.path.join(self.tempdir, 'does-not-exist.qza') err_msg = 'FileNotFoundError' with self.assertRaisesRegex(UnparseableDataError, err_msg): ProvDAG(fp) def test_insufficient_permissions(self): fp = os.path.join(self.tempdir, 'int-seq-1-permissions-copy.qza') self.das.int_seq1.artifact.save(fp) os.chmod(fp, 0o000) err_msg = 'PermissionError.*Permission denied' with self.assertRaisesRegex(UnparseableDataError, err_msg): ProvDAG(fp) os.chmod(fp, 0o123) err_msg = 'PermissionError.*Permission denied' with self.assertRaisesRegex(UnparseableDataError, err_msg): ProvDAG(fp) def test_not_a_zip_file(self): fp = os.path.join(self.tempdir, 'not-a-zip.txt') with open(fp, 'w') as fh: fh.write("This is just a text file.") err_msg = 'zipfile.BadZipFile.*File is not a zip file' with self.assertRaisesRegex(UnparseableDataError, err_msg): ProvDAG(fp) def test_has_digraph(self): self.assertIsInstance(self.das.int_seq1.dag.dag, DiGraph) self.assertIsInstance(self.das.concated_ints.dag.dag, DiGraph) def test_dag_attributes(self): dag = self.das.int_seq1.dag terminal_node, *_ = dag.terminal_nodes self.assertIsInstance(terminal_node, ProvNode) self.assertEqual(dag.provenance_is_valid, ValidationCode.VALID) empty_checksum_diff = ChecksumDiff(added={}, removed={}, changed={}) self.assertEqual(dag.checksum_diff, empty_checksum_diff) def test_node_action_names(self): int_seq1_node, *_ = self.das.int_seq1.dag.terminal_nodes self.assertEqual(int_seq1_node.action.action_name, 'import') concated_ints_node, *_ = self.das.concated_ints.dag.terminal_nodes self.assertEqual(concated_ints_node.action.action_name, 'concatenate_ints') def test_node_action_names_pipeline(self): pipeline_viz_node, *_ = self.das.pipeline_viz.dag.terminal_nodes self.assertEqual(pipeline_viz_node.action.action_name, 'typical_pipeline') def test_has_correct_edges(self): edges = self.das.concated_ints.dag.dag.edges self.assertIn( (self.das.int_seq1.uuid, self.das.concated_ints.uuid), edges ) self.assertIn( (self.das.int_seq2.uuid, self.das.concated_ints.uuid), edges ) self.assertNotIn( (self.das.int_seq1.uuid, self.das.int_seq2.uuid), edges ) self.assertNotIn( (self.das.concated_ints.uuid, self.das.int_seq1.uuid), edges ) def test_dag_repr(self): exp_repr = f'ProvDAG.*Artifacts.*{self.das.int_seq1.uuid}' self.assertRegex(repr(self.das.int_seq1.dag), exp_repr) def test_node_repr(self): int_seq1_node, *_ = self.das.int_seq1.dag.terminal_nodes uuid = int_seq1_node._uuid type_ = int_seq1_node.type format_ = int_seq1_node.format exp_repr = f'(?s)UUID.*{uuid}.*Type:.*{type_}.*Data Format:.*{format_}' self.assertRegex(repr(int_seq1_node), exp_repr) def test_dag_eq(self): dag = self.das.int_seq1.dag self.assertEqual(dag, dag) fp = self.das.int_seq1.filepath self.assertEqual(ProvDAG(fp), ProvDAG(fp)) # because they are isomorphic self.assertEqual(self.das.int_seq1.dag, self.das.int_seq2.dag) def test_dag_not_eq(self): self.assertNotEqual(self.das.int_seq1.dag, self.das.concated_ints.dag) def test_captures_full_history(self): concat_ints = self.das.dp.actions['concatenate_ints'] next_concated_ints = self.das.concated_ints.artifact iterations = random.randint(1, 10) for _ in range(iterations): next_concated_ints, = concat_ints(next_concated_ints, next_concated_ints, self.das.int_seq2.artifact, 4, 6) fp = os.path.join(self.tempdir, 'very-concated-ints.qza') next_concated_ints.save(fp) dag = ProvDAG(fp) # iterations + o.g. concated_ints + o.g. 
int_seq + o.g. int_seq2 self.assertEqual(len(dag), iterations + 3) def test_get_outer_provenance_nodes(self): fp = os.path.join(self.tempdir, 'disconnected-provenances') os.mkdir(fp) self.das.concated_ints.artifact.save( os.path.join(fp, 'concated-ints.qza')) unattached_int_seq = Artifact.import_data('IntSequence1', [8, 8]) unattached_int_seq.save(os.path.join(fp, 'unattached-int-seq.qza')) dag = ProvDAG(fp) actual = dag.get_outer_provenance_nodes(self.das.concated_ints.uuid) exp = { self.das.concated_ints.uuid, self.das.int_seq1.uuid, self.das.int_seq2.uuid } self.assertEqual(actual, exp) def test_get_outer_provenance_nodes_pipeline(self): dag = self.das.pipeline_viz.dag actual = dag.get_outer_provenance_nodes(self.das.pipeline_viz.uuid) exp = { self.das.pipeline_viz.uuid, self.das.int_seq1.uuid, self.das.mapping1.uuid } self.assertEqual(actual, exp) def test_collapsed_view(self): view = self.das.concated_ints.dag.collapsed_view self.assertIsInstance(view, DiGraph) self.assertEqual(len(view), 3) exp_nodes = [ self.das.concated_ints.uuid, self.das.int_seq1.uuid, self.das.int_seq2.uuid ] for exp_node in exp_nodes: self.assertIn(exp_node, view.nodes) def test_collapsed_view_pipeline(self): view = self.das.pipeline_viz.dag.collapsed_view self.assertIsInstance(view, DiGraph) self.assertEqual(len(view), 3) exp_nodes = [ self.das.pipeline_viz.uuid, self.das.int_seq1.uuid, self.das.mapping1.uuid ] for exp_node in exp_nodes: self.assertIn(exp_node, view.nodes) @pytest.mark.filterwarnings('ignore::UserWarning') def test_invalid_provenance(self): ''' Mangle an intact v5 Archive so that its checksums.md5 is invalid, and then build a ProvDAG with it to confirm the ProvDAG constructor handles broken checksums appropriately ''' uuid = self.das.int_seq1.uuid with generate_archive_with_file_removed( self.das.int_seq1.filepath, uuid, os.path.join('data', 'ints.txt') ) as altered_archive: new_fp = os.path.join(uuid, 'data', 'tamper.txt') overwrite_fp = os.path.join(uuid, 'provenance', 'citations.bib') with zipfile.ZipFile(altered_archive, 'a') as zf: zf.writestr(new_fp, 'added file') with zf.open(overwrite_fp, 'w') as fh: fh.write(b'999\n') expected = ( '(?s)' f'Checksums are invalid for Archive {uuid}.*' 'Archive may be corrupt.*' 'Files added.*tamper.txt.*' 'Files removed.*ints.txt.*' 'Files changed.*provenance.*citations.bib.*' ) with self.assertWarnsRegex(UserWarning, expected): dag = ProvDAG(altered_archive) self.assertEqual(dag.provenance_is_valid, ValidationCode.INVALID) diff = dag.checksum_diff self.assertEqual(list(diff.removed.keys()), ['data/ints.txt']) self.assertEqual(list(diff.added.keys()), ['data/tamper.txt']) self.assertEqual(list(diff.changed.keys()), ['provenance/citations.bib']) def test_missing_checksums_md5(self): uuid = self.das.single_int.uuid with generate_archive_with_file_removed( self.das.single_int.filepath, uuid, 'checksums.md5' ) as altered_archive: expected = ( 'The checksums.md5 file is missing from the archive.*' 'Archive may be corrupt' ) with self.assertWarnsRegex(UserWarning, expected): dag = ProvDAG(altered_archive) self.assertEqual(dag.provenance_is_valid, ValidationCode.INVALID) diff = dag.checksum_diff self.assertEqual(diff, None) @pytest.mark.filterwarnings('ignore::UserWarning') def test_error_if_missing_node_files(self): path_prefix = os.path.join('provenance', 'artifacts') root_uuid = self.das.concated_ints.uuid for removed_file in [ 'metadata.yaml', 'citations.bib', 'VERSION', 'action/action.yaml' ]: for uuid in [self.das.int_seq1.uuid, self.das.int_seq2.uuid]: 
with generate_archive_with_file_removed( self.das.concated_ints.filepath, root_uuid, os.path.join(path_prefix, uuid, removed_file) ) as altered_archive: if removed_file == 'action/action.yaml': file = 'action.yaml' else: file = removed_file expected = ( f'(?s)Malformed.*{file}.*{uuid}.*corrupt.*' ) with self.assertRaisesRegex(ValueError, expected): ProvDAG(altered_archive) def test_v0_archive(self): dag = self.das.table_v0.dag uuid = self.das.table_v0.uuid self.assertEqual( dag.provenance_is_valid, ValidationCode.PREDATES_CHECKSUMS ) self.assertEqual(dag.node_has_provenance(uuid), False) def test_v1_archive(self): dag = self.das.concated_ints_v1.dag uuid = self.das.concated_ints_v1.uuid self.assertEqual( dag.provenance_is_valid, ValidationCode.PREDATES_CHECKSUMS ) self.assertEqual(dag.node_has_provenance(uuid), False) def test_v2_archive(self): dag = self.das.concated_ints_v2.dag uuid = self.das.concated_ints_v2.uuid self.assertEqual( dag.provenance_is_valid, ValidationCode.PREDATES_CHECKSUMS ) self.assertEqual(dag.node_has_provenance(uuid), True) def test_v4_archive(self): dag = self.das.concated_ints_v4.dag uuid = self.das.concated_ints_v4.uuid self.assertEqual( dag.provenance_is_valid, ValidationCode.PREDATES_CHECKSUMS ) self.assertEqual(dag.node_has_provenance(uuid), True) with zipfile.ZipFile(self.das.concated_ints_v4.filepath) as zf: citations_path = os.path.join(uuid, 'provenance', 'citations.bib') self.assertIn(citations_path, zf.namelist()) def test_v5_archive(self): dag = self.das.concated_ints_v5.dag uuid = self.das.concated_ints_v5.uuid self.assertEqual(dag.provenance_is_valid, ValidationCode.VALID) self.assertEqual(dag.node_has_provenance(uuid), True) def test_artifact_passed_as_metadata_archive(self): dag = self.das.mapping1.dag uuid = self.das.mapping1.uuid self.assertEqual(dag.node_has_provenance(uuid), True) self.assertEqual(dag.get_node_data(uuid)._uuid, uuid) self.assertEqual(dag.get_node_data(uuid).type, 'Mapping') def test_artifact_with_collection_of_inputs(self): dag = self.das.merged_mappings.dag uuid = self.das.merged_mappings.uuid root_node = dag.get_node_data(uuid) self.assertEqual(root_node.type, 'Mapping') exp_parents = {self.das.mapping1.uuid, self.das.mapping2.uuid} self.assertEqual(dag.predecessors(uuid), exp_parents) def test_provdag_initialized_from_provdag(self): for dag in [self.das.single_int.dag, self.das.concated_ints.dag, self.das.merged_mappings.dag]: copied = ProvDAG(dag) self.assertEqual(dag, copied) self.assertIsNot(dag, copied) def test_union_zero_or_one_dags(self): with self.assertRaisesRegex(ValueError, "pass.*two ProvDAGs"): ProvDAG.union([]) with self.assertRaisesRegex(ValueError, "pass.*two ProvDAGs"): ProvDAG.union([self.das.single_int.dag]) def test_union_identity(self): dag = self.das.single_int.dag uuid = self.das.single_int.uuid unioned_dag = ProvDAG.union([dag, dag]) self.assertEqual(dag, unioned_dag) self.assertSetEqual({uuid}, unioned_dag._parsed_artifact_uuids) self.assertEqual(unioned_dag.provenance_is_valid, ValidationCode.VALID) self.assertRegex( repr(unioned_dag), f'ProvDAG representing the provenance.*Artifacts.*{uuid}' ) def test_union_two(self): unioned_dag = ProvDAG.union( [self.das.single_int.dag, self.das.int_seq2.dag]) self.assertEqual( {self.das.single_int.uuid, self.das.int_seq2.uuid}, unioned_dag._parsed_artifact_uuids ) self.assertEqual(unioned_dag.provenance_is_valid, ValidationCode.VALID) rep = repr(unioned_dag) self.assertRegex( rep, 'ProvDAG representing the provenance.*Artifacts.' 
) self.assertRegex(rep, f'{self.das.single_int.uuid}') self.assertRegex(rep, f'{self.das.int_seq2.uuid}') self.assertEqual( nx.number_weakly_connected_components(unioned_dag.dag), 2 ) def test_union_many(self): unioned_dag = ProvDAG.union([ self.das.single_int.dag, self.das.int_seq1.dag, self.das.mapping1.dag ]) self.assertEqual( { self.das.single_int.uuid, self.das.int_seq1.uuid, self.das.mapping1.uuid }, unioned_dag._parsed_artifact_uuids ) self.assertEqual(unioned_dag.provenance_is_valid, ValidationCode.VALID) rep = repr(unioned_dag) self.assertRegex( rep, 'ProvDAG representing the provenance.*Artifacts.*' ) self.assertRegex(rep, f'{self.das.single_int.uuid}') self.assertRegex(rep, f'{self.das.int_seq1.uuid}') self.assertRegex(rep, f'{self.das.mapping1.uuid}') self.assertEqual( nx.number_weakly_connected_components(unioned_dag.dag), 3 ) def test_union_self_missing_checksums_md5(self): unioned_dag = ProvDAG.union( [self.das.dag_missing_md5, self.das.single_int.dag] ) self.assertRegex( repr(unioned_dag), 'ProvDAG representing the provenance.*Artifacts.*' f'{self.das.single_int.uuid}' ) # The ChecksumDiff==None from the tinkered dag gets ignored... self.assertEqual(unioned_dag.checksum_diff, ChecksumDiff({}, {}, {})) # ...but this should make clear that the provenance is bad # (or that the user opted out of validation) self.assertEqual( unioned_dag.provenance_is_valid, ValidationCode.INVALID) self.assertEqual( nx.number_weakly_connected_components(unioned_dag.dag), 1 ) def test_union_other_missing_checksums_md5(self): ''' Tests unions of v5 dags where the other ProvDAG is missing its checksums.md5 but the calling ProvDAG is not ''' unioned_dag = ProvDAG.union([self.das.single_int.dag, self.das.dag_missing_md5]) self.assertRegex(repr(unioned_dag), 'ProvDAG representing the provenance.*Artifacts.*' f'{self.das.single_int.uuid}') self.assertEqual(unioned_dag.checksum_diff, ChecksumDiff({}, {}, {})) self.assertEqual( unioned_dag.provenance_is_valid, ValidationCode.INVALID ) self.assertEqual( nx.number_weakly_connected_components(unioned_dag.dag), 1 ) def test_union_both_missing_checksums_md5(self): ''' Tests unions of v5 dags where both artifacts are missing their checksums.md5 files. 
''' unioned_dag = ProvDAG.union( [self.das.dag_missing_md5, self.das.dag_missing_md5]) self.assertRegex( repr(unioned_dag), 'ProvDAG representing the provenance.*Artifacts.*' f'{self.das.single_int.uuid}' ) # Both DAGs have NoneType checksum_diffs, so the ChecksumDiff==None self.assertEqual(unioned_dag.checksum_diff, None) self.assertEqual( unioned_dag.provenance_is_valid, ValidationCode.INVALID ) self.assertEqual( nx.number_weakly_connected_components(unioned_dag.dag), 1 ) def test_union_v0_v1_archives(self): unioned_dag = ProvDAG.union( [self.das.table_v0.dag, self.das.concated_ints_v2.dag] ) self.assertIn(f'{self.das.table_v0.uuid}', repr(unioned_dag)) self.assertIn(f'{self.das.concated_ints_v2.uuid}', repr(unioned_dag)) self.assertEqual( unioned_dag.provenance_is_valid, ValidationCode.PREDATES_CHECKSUMS ) self.assertEqual( nx.number_weakly_connected_components(unioned_dag.dag), 2 ) self.assertFalse( unioned_dag.node_has_provenance(self.das.table_v0.uuid) ) self.assertTrue( unioned_dag.node_has_provenance(self.das.concated_ints_v2.uuid) ) def test_union_v3_v5_archives(self): unioned_dag = ProvDAG.union( [self.das.concated_ints_v3.dag, self.das.concated_ints_v5.dag] ) self.assertIn(f'{self.das.concated_ints_v3.uuid}', repr(unioned_dag)) self.assertIn(f'{self.das.concated_ints_v5.uuid}', repr(unioned_dag)) self.assertEqual( unioned_dag.provenance_is_valid, ValidationCode.PREDATES_CHECKSUMS ) self.assertEqual( nx.number_weakly_connected_components(unioned_dag.dag), 2 ) self.assertTrue( unioned_dag.node_has_provenance(self.das.concated_ints_v3.uuid) ) self.assertTrue( unioned_dag.node_has_provenance(self.das.concated_ints_v5.uuid) ) def test_union_v5_v6_archives(self): unioned_dag = ProvDAG.union( [self.das.concated_ints_v5.dag, self.das.concated_ints_v6.dag] ) self.assertIn(f'{self.das.concated_ints_v5.uuid}', repr(unioned_dag)) self.assertIn(f'{self.das.concated_ints_v6.uuid}', repr(unioned_dag)) self.assertEqual( unioned_dag.provenance_is_valid, ValidationCode.VALID ) self.assertEqual( nx.number_weakly_connected_components(unioned_dag.dag), 2 ) self.assertTrue( unioned_dag.node_has_provenance(self.das.concated_ints_v5.uuid) ) self.assertTrue( unioned_dag.node_has_provenance(self.das.concated_ints_v6.uuid) ) def test_dag_is_superset(self): ''' Tests union of three dags, where one dag is a proper superset of the others. We expect three _parsed_artifact_uuids, one terminal uuid, and one weakly_connected_component. ''' unioned_dag = ProvDAG.union([ self.das.concated_ints.dag, self.das.int_seq1.dag, self.das.int_seq2.dag ]) self.assertIn( self.das.int_seq1.uuid, unioned_dag._parsed_artifact_uuids ) self.assertIn( self.das.int_seq2.uuid, unioned_dag._parsed_artifact_uuids ) self.assertIn( self.das.concated_ints.uuid, unioned_dag._parsed_artifact_uuids ) self.assertEqual(len(unioned_dag._parsed_artifact_uuids), 3) self.assertEqual(len(unioned_dag.terminal_uuids), 1) self.assertEqual( unioned_dag.terminal_uuids, {self.das.concated_ints.uuid} ) self.assertEqual( nx.number_weakly_connected_components(unioned_dag.dag), 1 ) # == tests identity of objects in memory, so we need is_isomorphic self.assertTrue( nx.is_isomorphic(self.das.concated_ints.dag.dag, unioned_dag.dag) ) self.assertEqual(self.das.concated_ints.dag, unioned_dag) def test_three_artifacts_two_terminal_uuids(self): ''' Tests union of four dags, where two artifacts have shared parents but no direct relationship with eachother. We expect four _parsed_artifact_uuids, two terminal uuids, and one weakly_connected_component. 
''' unioned_dag = ProvDAG.union([ self.das.int_seq1.dag, self.das.int_seq2.dag, self.das.concated_ints.dag, self.das.other_concated_ints.dag ]) self.assertIn( self.das.int_seq1.uuid, unioned_dag._parsed_artifact_uuids ) self.assertIn( self.das.int_seq2.uuid, unioned_dag._parsed_artifact_uuids ) self.assertIn( self.das.concated_ints.uuid, unioned_dag._parsed_artifact_uuids ) self.assertIn( self.das.other_concated_ints.uuid, unioned_dag._parsed_artifact_uuids ) self.assertEqual(len(unioned_dag._parsed_artifact_uuids), 4) self.assertEqual(len(unioned_dag.terminal_uuids), 2) self.assertEqual( unioned_dag.terminal_uuids, {self.das.concated_ints.uuid, self.das.other_concated_ints.uuid} ) self.assertEqual( nx.number_weakly_connected_components(unioned_dag.dag), 1 ) def test_one_analysis_two_artifacts(self): ''' In this set of test archives, both artifacts are derived from the same parents, so should produce one connected DAG even though we are missing the parent artifacts used to create them ''' unioned_dag = ProvDAG.union([ self.das.concated_ints.dag, self.das.other_concated_ints.dag ]) self.assertIn( self.das.concated_ints.uuid, unioned_dag._parsed_artifact_uuids ) self.assertIn( self.das.other_concated_ints.uuid, unioned_dag._parsed_artifact_uuids ) self.assertEqual(len(unioned_dag._parsed_artifact_uuids), 2) self.assertEqual(len(unioned_dag.terminal_uuids), 2) self.assertEqual( unioned_dag.terminal_uuids, {self.das.concated_ints.uuid, self.das.other_concated_ints.uuid} ) self.assertEqual( nx.number_weakly_connected_components(unioned_dag.dag), 1 ) def test_no_checksum_validation_on_intact_artifact(self): no_validation_dag = ProvDAG( self.das.concated_ints.filepath, validate_checksums=False ) self.assertEqual(len(no_validation_dag.terminal_uuids), 1) terminal_uuid, *_ = no_validation_dag.terminal_uuids self.assertEqual(terminal_uuid, self.das.concated_ints.uuid) self.assertEqual(len(no_validation_dag), 3) self.assertEqual( no_validation_dag.provenance_is_valid, ValidationCode.VALIDATION_OPTOUT ) self.assertEqual(no_validation_dag.checksum_diff, None) def test_no_checksum_validation_missing_checksums_md5(self): with generate_archive_with_file_removed( self.das.concated_ints.filepath, self.das.concated_ints.uuid, 'checksums.md5' ) as altered_archive: dag = ProvDAG(altered_archive, validate_checksums=False) self.assertEqual( dag.provenance_is_valid, ValidationCode.VALIDATION_OPTOUT ) self.assertEqual(dag.checksum_diff, None) def test_no_checksum_validation_missing_node_files(self): path_prefix = os.path.join('provenance', 'artifacts') root_uuid = self.das.concated_ints.uuid for removed_file in [ 'metadata.yaml', 'citations.bib', 'VERSION', 'action/action.yaml' ]: for uuid in [self.das.int_seq1.uuid, self.das.int_seq2.uuid]: with generate_archive_with_file_removed( self.das.concated_ints.filepath, root_uuid, os.path.join(path_prefix, uuid, removed_file) ) as altered_archive: if removed_file == 'action/action.yaml': file = 'action.yaml' else: file = removed_file expected = (f'(?s)Malformed.*{file}.*{uuid}.*corrupt.*') with self.assertRaisesRegex(ValueError, expected): ProvDAG(altered_archive, validate_checksums=False) class EmptyParserTests(unittest.TestCase): def setUp(self): self.tempdir = tempfile.mkdtemp(prefix='qiime2-test-parse-temp-') def tearDown(self): shutil.rmtree(self.tempdir) def test_get_parser(self): parser = EmptyParser.get_parser(None) self.assertIsInstance(parser, EmptyParser) def test_get_parser_input_data_not_none(self): fn = 'not-a-zip.txt' fp = os.path.join(self.tempdir, fn) 
with open(fp, 'w') as fh: fh.write('some text\n') with self.assertRaisesRegex( TypeError, f"EmptyParser.*{fn} is not None" ): EmptyParser.get_parser(fp) def test_parse_a_nonetype(self): ''' tests that we can actually create empty ProvDAGs ''' parser = EmptyParser() parsed = parser.parse_prov(Config(), None) self.assertIsInstance(parsed, ParserResults) self.assertEqual(parsed.parsed_artifact_uuids, set()) self.assertTrue(nx.is_isomorphic(parsed.prov_digraph, nx.DiGraph())) self.assertEqual(parsed.provenance_is_valid, ValidationCode.VALID) self.assertEqual(parsed.checksum_diff, None) class ProvDAGParserTests(unittest.TestCase): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() @classmethod def tearDownClass(cls): cls.das.free() def test_get_parser(self): for archive in self.das.all_artifact_versions: parser = ProvDAGParser.get_parser(archive.dag) self.assertIsInstance(parser, ProvDAGParser) def test_get_parser_input_data_not_a_provdag(self): fn = 'not_a_zip.txt' fp = os.path.join(self.das.tempdir, fn) with self.assertRaisesRegex( TypeError, f"ProvDAGParser.*{fn} is not a ProvDAG"): ProvDAGParser.get_parser(fp) def test_parse_a_provdag(self): parser = ProvDAGParser() for archive in self.das.all_artifact_versions: dag = archive.dag parsed = parser.parse_prov(Config(), dag) self.assertIsInstance(parsed, ParserResults) self.assertEqual( parsed.parsed_artifact_uuids, dag._parsed_artifact_uuids ) self.assertTrue(nx.is_isomorphic(parsed.prov_digraph, dag.dag)) self.assertEqual( parsed.provenance_is_valid, dag.provenance_is_valid ) self.assertEqual(parsed.checksum_diff, dag.checksum_diff) class SelectParserTests(unittest.TestCase): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir @classmethod def tearDownClass(cls): cls.das.free() def test_correct_parser_type(self): empty = select_parser(None) self.assertIsInstance(empty, EmptyParser) archive = select_parser(self.das.concated_ints.filepath) self.assertIsInstance(archive, ArchiveParser) dag = ProvDAG() pdag = select_parser(dag) self.assertIsInstance(pdag, ProvDAGParser) # check dir_fp as fp test_dir = os.path.join(self.tempdir, 'parse_dir_test') os.mkdir(test_dir) dir_fp = pathlib.Path(test_dir) dir_p = select_parser(dir_fp) self.assertIsInstance(dir_p, DirectoryParser) # check dir_fp as str dir_fp_str = str(dir_fp) dir_p = select_parser(dir_fp_str) self.assertIsInstance(dir_p, DirectoryParser) def test_correct_archive_parser_version(self): parsers = [ ParserV0, ParserV1, ParserV2, ParserV3, ParserV4, ParserV5, ParserV6 ] for archive, parser in zip(self.das.all_artifact_versions, parsers): handler = select_parser(archive.filepath) self.assertEqual(type(handler), parser) class ParseProvenanceTests(unittest.TestCase): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir cls.cfg = Config() @classmethod def tearDownClass(cls): cls.das.free() def test_parse_with_artifact_parser(self): uuid = self.das.concated_ints.uuid fp = self.das.concated_ints.filepath parser_results = parse_provenance(self.cfg, fp) self.assertIsInstance(parser_results, ParserResults) p_a_uuids = parser_results.parsed_artifact_uuids self.assertIsInstance(p_a_uuids, set) self.assertIsInstance(next(iter(p_a_uuids)), str) self.assertEqual(len(parser_results.prov_digraph), 3) self.assertIn(uuid, parser_results.prov_digraph) self.assertIsInstance( parser_results.prov_digraph.nodes[uuid]['node_data'], ProvNode ) self.assertEqual( parser_results.provenance_is_valid, ValidationCode.VALID ) 
self.assertEqual( parser_results.checksum_diff, ChecksumDiff({}, {}, {}) ) def test_parse_with_provdag_parser(self): uuid = self.das.concated_ints.uuid dag = self.das.concated_ints.dag parser_results = parse_provenance(self.cfg, dag) self.assertIsInstance(parser_results, ParserResults) p_a_uuids = parser_results.parsed_artifact_uuids self.assertIsInstance(p_a_uuids, set) self.assertIsInstance(next(iter(p_a_uuids)), str) self.assertEqual(len(parser_results.prov_digraph), 3) self.assertIn(uuid, parser_results.prov_digraph) self.assertIsInstance( parser_results.prov_digraph.nodes[uuid]['node_data'], ProvNode ) self.assertEqual( parser_results.provenance_is_valid, ValidationCode.VALID ) self.assertEqual( parser_results.checksum_diff, ChecksumDiff({}, {}, {}) ) def test_parse_with_empty_parser(self): res = parse_provenance(self.cfg, None) self.assertIsInstance(res, ParserResults) self.assertEqual(res.parsed_artifact_uuids, set()) self.assertTrue(nx.is_isomorphic(res.prov_digraph, nx.DiGraph())) self.assertEqual(res.provenance_is_valid, ValidationCode.VALID) self.assertEqual(res.checksum_diff, None) def test_parse_with_directory_parser(self): # Non-recursive parse_dir_fp = os.path.join(self.tempdir, 'parse_dir') os.mkdir(parse_dir_fp) concated_ints_path = os.path.join(parse_dir_fp, 'concated-ints.qza') shutil.copy(self.das.concated_ints.filepath, concated_ints_path) res = parse_provenance(self.cfg, parse_dir_fp) self.assertEqual(self.cfg.recurse, False) self.assertIsInstance(res, ParserResults) concated_ints_uuid = self.das.concated_ints.uuid self.assertEqual(res.parsed_artifact_uuids, {concated_ints_uuid}) self.assertEqual(len(res.prov_digraph), 3) self.assertEqual(res.provenance_is_valid, ValidationCode.VALID) self.assertEqual(res.checksum_diff, ChecksumDiff({}, {}, {})) # Recursive inner_dir_path = os.path.join(parse_dir_fp, 'inner-dir') os.mkdir(inner_dir_path) mapping_path = os.path.join(inner_dir_path, 'mapping1.qza') shutil.copy(self.das.mapping1.filepath, mapping_path) self.cfg.recurse = True self.assertEqual(self.cfg.recurse, True) res = parse_provenance(self.cfg, parse_dir_fp) self.assertIsInstance(res, ParserResults) mapping_uuid = self.das.mapping1.uuid self.assertEqual( res.parsed_artifact_uuids, {concated_ints_uuid, mapping_uuid} ) self.assertEqual(len(res.prov_digraph), 4) self.assertEqual(res.provenance_is_valid, ValidationCode.VALID) self.assertEqual(res.checksum_diff, ChecksumDiff({}, {}, {})) def test_parse_with_directory_parser_bad_dir_path(self): dir_fp = os.path.join(self.tempdir, 'fake_dir') with self.assertRaisesRegex(Exception, 'not a valid dir'): parse_provenance(self.cfg, dir_fp) def test_no_correct_parser_found_error(self): input_data = {'this': 'is not parseable'} with self.assertRaisesRegex( UnparseableDataError, f'(?s)Input data {input_data}.*not supported.*' 'ArchiveParser expects a string or pathlib.PosixPath.*' 'DirectoryParser.*expects a directory.*' 'ProvDAGParser.*is not a ProvDAG.*' 'EmptyParser.*is not None' ): select_parser(input_data) class DirectoryParserTests(unittest.TestCase): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir cls.cfg = Config() parse_dir_fp = os.path.join(cls.tempdir, 'parse-dir') os.mkdir(parse_dir_fp) concated_ints_path = os.path.join(parse_dir_fp, 'concated-ints.qza') shutil.copy(cls.das.concated_ints.filepath, concated_ints_path) int_seq_path = os.path.join(parse_dir_fp, 'int-seq1.qza') shutil.copy(cls.das.int_seq1.filepath, int_seq_path) @classmethod def tearDownClass(cls): cls.das.free() 
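    # --- Hedged illustration (added by the editor; not part of the original
    # test suite). The directory-parser tests below cover parsing a folder of
    # saved results into a single ProvDAG; this helper sketches that workflow.
    # 'results-dir' is a hypothetical directory of previously saved .qza/.qzv
    # files; the ProvDAG calls mirror those used in the tests in this module.
    def _example_parse_directory(self):  # pragma: no cover
        dag = ProvDAG('results-dir', recurse=True)  # also walk subdirectories
        # terminal_uuids are the end points of the analysis; each maps to a
        # ProvNode carrying the artifact's type, format, and creating action.
        for uuid in dag.terminal_uuids:
            node = dag.get_node_data(uuid)
            print(uuid, node.type, node.format, node.action.action_name)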
def test_parse_empty_dir(self): empty_dir_path = os.path.join(self.tempdir, 'empty-dir') os.mkdir(empty_dir_path) with self.assertRaisesRegex( ValueError, f'No .qza or .qzv files.*{empty_dir_path}' ): ProvDAG(empty_dir_path) def test_directory_parser_works_regardless_trailing_slash(self): parse_dir = os.path.join(self.tempdir, 'parse-dir') dag = ProvDAG(parse_dir) dag2 = ProvDAG(parse_dir) self.assertEqual(dag, dag2) def test_directory_parser_captures_all_parsed_artifact_uuids(self): ''' The test dir contains a concatenated ints artifact and an int sequence. Though the concatenated ints artifact contains the int sequence as one of its parents, both should be present in _parsed_artifact_uuids because both are considered terminal outputs of the analysis, because both are present in the parsed directory. ''' parse_dir = os.path.join(self.tempdir, 'parse-dir') dag = ProvDAG(parse_dir, recurse=True) self.assertEqual( dag._parsed_artifact_uuids, {self.das.concated_ints.uuid, self.das.int_seq1.uuid} ) def test_directory_parser_handles_duplicates(self): with tempfile.TemporaryDirectory() as tempdir: shutil.copy(self.das.concated_ints.filepath, tempdir) copy_path = os.path.join(tempdir, 'concated_ints_2.qza') shutil.copy(self.das.concated_ints.filepath, copy_path) both_dag = ProvDAG(tempdir) self.assertEqual(len(both_dag._parsed_artifact_uuids), 1) one_dag = ProvDAG(copy_path) self.assertEqual(one_dag, both_dag) def test_directory_parser_idempotent_with_parse_and_union(self): # Non-recursive concated_ints_path = os.path.join( self.tempdir, 'parse-dir', 'concated-ints.qza' ) int_seq_path = os.path.join(self.tempdir, 'parse-dir', 'int-seq1.qza') concated_ints_dag = ProvDAG(concated_ints_path) int_seq_dag = ProvDAG(int_seq_path) union_dag = ProvDAG.union([concated_ints_dag, int_seq_dag]) parse_dir_path = os.path.join(self.tempdir, 'parse-dir') dir_dag = ProvDAG(parse_dir_path) self.assertEqual(union_dag, dir_dag) # Recursive inner_dir = os.path.join(self.tempdir, 'parse-dir', 'inner-dir') os.mkdir(inner_dir) mapping_path = os.path.join(inner_dir, 'mapping1.qza') shutil.copy(self.das.mapping1.filepath, mapping_path) mapping_dag = ProvDAG(mapping_path) inner_dir_union_dag = ProvDAG.union([ concated_ints_dag, int_seq_dag, mapping_dag ]) recursive_dir_dag = ProvDAG(parse_dir_path, recurse=True) self.assertEqual(inner_dir_union_dag, recursive_dir_dag) def test_directory_parser_multiple_imports(self): outer_path = os.path.join(self.tempdir, 'mutliple-import-outer') os.mkdir(outer_path) inner_path = os.path.join(outer_path, 'inner') os.mkdir(inner_path) shutil.copy( self.das.mapping1.filepath, os.path.join(outer_path, 'mapping1.qza') ) shutil.copy( self.das.mapping2.filepath, os.path.join(outer_path, 'mapping2.qza') ) shutil.copy( self.das.mapping1.filepath, os.path.join(inner_path, 'duplicate-mapping1.qza') ) shutil.copy( self.das.mapping2.filepath, os.path.join(inner_path, 'duplicate-mapping2.qza') ) inner_dag = ProvDAG(inner_path) self.assertEqual(len(inner_dag), 2) self.assertIn(self.das.mapping1.uuid, inner_dag.dag) self.assertIn(self.das.mapping2.uuid, inner_dag.dag) outer_dag = ProvDAG(outer_path) self.assertEqual(len(inner_dag), 2) self.assertIn(self.das.mapping1.uuid, outer_dag.dag) self.assertIn(self.das.mapping2.uuid, outer_dag.dag) self.assertEqual(inner_dag, outer_dag) recursive_outer_dag = ProvDAG(outer_path, recurse=True) self.assertEqual(len(inner_dag), 2) self.assertIn(self.das.mapping1.uuid, recursive_outer_dag.dag) self.assertIn(self.das.mapping2.uuid, recursive_outer_dag.dag) 
self.assertEqual(inner_dag, outer_dag, recursive_outer_dag) def test_verbose(self): buffer = io.StringIO() concated_ints = 'parse-dir/concated-ints.qza' int_seq = 'parse-dir/int-seq1.qza' with redirect_stdout(buffer): dag = ProvDAG( os.path.join(self.tempdir, 'parse-dir'), verbose=True, recurse=True ) self.assertEqual(dag.cfg.verbose, True) stdout_log = buffer.getvalue() self.assertRegex(stdout_log, f'parsing.*{concated_ints}') self.assertRegex(stdout_log, f'parsing.*{int_seq}') qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/test_replay.py000066400000000000000000001565761462552636000257630ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import bibtexparser as bp import networkx as nx import os import pathlib import shutil import tempfile import unittest from unittest.mock import patch import zipfile from qiime2 import Artifact from qiime2.sdk import PluginManager from qiime2.sdk.usage import Usage, UsageVariable from qiime2.plugins import ArtifactAPIUsageVariable from ..parse import ProvDAG from ..replay import ( ActionCollections, BibContent, ReplayConfig, ReplayNamespaces, UsageVariableRecord, build_no_provenance_node_usage, build_import_usage, build_action_usage, build_usage_examples, collect_citations, dedupe_citations, dump_recorded_md_file, group_by_action, init_md_from_artifacts, init_md_from_md_file, init_md_from_recorded_md, replay_provenance, replay_citations ) from .testing_utilities import CustomAssertions, DummyArtifacts from ..usage_drivers import ReplayPythonUsage from ...provenance import MetadataInfo from qiime2.sdk.util import camel_to_snake class ReplayNamespacesTests(unittest.TestCase): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir @classmethod def tearDownClass(cls): cls.das.free() def test_make_unique_name(self): ns = ReplayNamespaces() self.assertEqual('name_0', ns._make_unique_name('name')) ns.add_usg_var_record('uuid1', 'name') self.assertEqual('name_1', ns._make_unique_name('name')) def test_get_usg_var_uuid(self): ns = ReplayNamespaces() ns.add_usg_var_record('uuid1', 'name') self.assertEqual('uuid1', ns.get_usg_var_uuid('name_0')) def test_add_usage_var_workflow(self): """ Smoke tests a common workflow with this data structure - Create a unique variable name by adding to the namespace - Create a UsageVariable with that name - use the name to get the UUID (when we have Results, we have no UUIDs) - add the correctly-named UsageVariable to the namespace """ use = Usage() uuid = self.das.concated_ints.uuid base_name = 'concated_ints' exp_name = base_name + '_0' ns = ReplayNamespaces() ns.add_usg_var_record(uuid, base_name) self.assertEqual(ns.get_usg_var_record(uuid).name, exp_name) def factory(): # pragma: no cover return Artifact.load(self.das.concated_ints.filepath) u_var = use.init_artifact(ns.get_usg_var_record(uuid).name, factory) self.assertEqual(u_var.name, exp_name) actual_uuid = ns.get_usg_var_uuid(u_var.name) self.assertEqual(actual_uuid, uuid) ns.update_usg_var_record(uuid, u_var) self.assertIsInstance( ns.get_usg_var_record(uuid).variable, UsageVariable ) self.assertEqual(ns.get_usg_var_record(uuid).name, exp_name) class ReplayProvenanceTests(unittest.TestCase): @classmethod 
def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir @classmethod def tearDownClass(cls): cls.das.free() def test_replay_from_fp(self): with tempfile.TemporaryDirectory() as tmpdir: out_fp = pathlib.Path(tmpdir) / 'rendered.txt' out_fn = str(out_fp) in_fn = self.das.concated_ints_with_md.filepath replay_provenance( ReplayPythonUsage, in_fn, out_fn, md_out_dir=tmpdir ) self.assertTrue(out_fp.is_file()) with open(out_fn, 'r') as fp: rendered = fp.read() self.assertIn('from qiime2 import Artifact', rendered) self.assertIn('from qiime2 import Metadata', rendered) self.assertIn( 'import qiime2.plugins.dummy_plugin.actions as ' 'dummy_plugin_actions', rendered ) self.assertIn('mapping_0 = Artifact.import_data(', rendered) self.assertRegex(rendered, 'The following command.*additional metadata') self.assertIn('mapping_0.view(Metadata)', rendered) self.assertIn('dummy_plugin_actions.identity_with_metadata', rendered) self.assertIn('dummy_plugin_actions.concatenate_ints', rendered) def test_replay_from_fp_use_md_without_parse(self): in_fp = self.das.concated_ints.filepath with self.assertRaisesRegex( ValueError, "Metadata not parsed for replay. Re-run" ): replay_provenance( ReplayPythonUsage, in_fp, 'unused_fp', parse_metadata=False, use_recorded_metadata=True ) def test_replay_dump_md_without_parse(self): in_fp = self.das.concated_ints.filepath with self.assertRaisesRegex( ValueError, "(?s)Metadata not parsed,.*dump_recorded_meta" ): replay_provenance( ReplayPythonUsage, in_fp, 'unused_fp', parse_metadata=False, dump_recorded_metadata=True ) def test_replay_md_out_dir_without_parse(self): in_fp = self.das.concated_ints.filepath with self.assertRaisesRegex( ValueError, "(?s)Metadata not parsed,.*not.*metadata output" ): replay_provenance( ReplayPythonUsage, in_fp, 'unused_fp', parse_metadata=False, dump_recorded_metadata=False, md_out_dir='/user/dumb/some_filepath' ) def test_replay_use_md_without_dump_md(self): in_fp = self.das.concated_ints.filepath with self.assertRaisesRegex( NotImplementedError, "(?s)uses.*metadata.*must.*written to disk" ): replay_provenance( ReplayPythonUsage, in_fp, 'unused_fp', use_recorded_metadata=True, dump_recorded_metadata=False ) def test_replay_from_provdag(self): with tempfile.TemporaryDirectory() as tmpdir: out_fp = pathlib.Path(tmpdir) / 'rendered.txt' out_fn = str(out_fp) dag = self.das.concated_ints_with_md.dag replay_provenance(ReplayPythonUsage, dag, out_fn, md_out_dir=tmpdir) self.assertTrue(out_fp.is_file()) with open(out_fn, 'r') as fp: rendered = fp.read() self.assertIn('from qiime2 import Artifact', rendered) self.assertIn('from qiime2 import Metadata', rendered) self.assertIn( 'import qiime2.plugins.dummy_plugin.actions as ' 'dummy_plugin_actions', rendered ) self.assertIn('mapping_0 = Artifact.import_data(', rendered) self.assertRegex(rendered, 'The following command.*additional metadata') self.assertIn('mapping_0.view(Metadata)', rendered) self.assertIn('dummy_plugin_actions.identity_with_metadata', rendered) self.assertIn('dummy_plugin_actions.concatenate_ints', rendered) def test_replay_from_provdag_use_md_without_parse(self): dag = ProvDAG(self.das.concated_ints_with_md.filepath, validate_checksums=False, parse_metadata=False) with self.assertRaisesRegex( ValueError, "Metadata not parsed for replay" ): replay_provenance( ReplayPythonUsage, dag, 'unused', use_recorded_metadata=True ) def test_replay_from_provdag_ns_collision(self): """ This artifact's dag contains a few results with the output-name filtered-table, so is a 
good check for namespace collisions if we're not uniquifying variable names properly. """ with tempfile.TemporaryDirectory() as tempdir: self.das.concated_ints.artifact.save( os.path.join(tempdir, 'c1.qza')) self.das.other_concated_ints.artifact.save( os.path.join(tempdir, 'c2.qza')) dag = ProvDAG(tempdir) exp = ['concatenated_ints_0', 'concatenated_ints_1'] with tempfile.TemporaryDirectory() as tempdir: out_path = pathlib.Path(tempdir) / 'ns_coll.txt' replay_provenance( ReplayPythonUsage, dag, out_path, md_out_dir=tempdir ) with open(out_path, 'r') as fp: rendered = fp.read() for name in exp: self.assertIn(name, rendered) def test_replay_optional_param_is_none(self): dag = self.das.int_seq_optional_input.dag with tempfile.TemporaryDirectory() as tempdir: out_path = pathlib.Path(tempdir) / 'ns_coll.txt' replay_provenance( ReplayPythonUsage, dag, out_path, md_out_dir=tempdir ) with open(out_path, 'r') as fp: rendered = fp.read() self.assertIn('ints=int_sequence1_0', rendered) self.assertIn('num1=', rendered) self.assertNotIn('optional1=', rendered) self.assertNotIn('num2=', rendered) class MultiplePluginTests(unittest.TestCase): @classmethod def setUpClass(cls): from qiime2.sdk.plugin_manager import PluginManager from qiime2 import Artifact cls.pm = PluginManager() cls.dp = cls.pm.plugins['dummy-plugin'] cls.op = cls.pm.plugins['other-plugin'] cls.tempdir = tempfile.mkdtemp(prefix='qiime2-other-plugin-temp-') int_seq = Artifact.import_data('IntSequence1', [1, 2, 3, 4]) concat_ints = cls.op.methods['concatenate_ints'] split_ints = cls.dp.methods['split_ints'] concated_ints, = concat_ints( int_seq, int_seq, int_seq, 5, 6 ) cls.splitted_ints, _ = split_ints(concated_ints) @classmethod def tearDownClass(cls): shutil.rmtree(cls.tempdir) def test_multiple_plugins_in_provenance(self): fp = os.path.join(self.tempdir, 'splitted_ints.qza') self.splitted_ints.save(fp) with tempfile.TemporaryDirectory() as tempdir: out_fp = os.path.join(tempdir, 'rendered.txt') replay_provenance( ReplayPythonUsage, fp, out_fp, md_out_dir=tempdir ) with open(out_fp, 'r') as fp: rendered = fp.read() self.assertIn('from qiime2 import Artifact', rendered) self.assertIn( 'import qiime2.plugins.dummy_plugin.actions as ' 'dummy_plugin_actions', rendered ) self.assertIn( 'import qiime2.plugins.other_plugin.actions as ' 'other_plugin_actions', rendered ) self.assertIn( 'dummy_plugin_actions.split_ints(', rendered ) self.assertIn( 'other_plugin_actions.concatenate_ints(', rendered ) class ReplayProvDAGDirectoryTests(unittest.TestCase): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir @classmethod def tearDownClass(cls): cls.das.free() def test_directory_replay_multiple_imports(self): """ The directory being parsed here contains two pairs of duplicates, and should replay as only two import statements. 
""" outer_dir = os.path.join(self.tempdir, 'outer') inner_dir = os.path.join(outer_dir, 'inner') os.makedirs(inner_dir) for artifact in self.das.single_int, self.das.single_int2: for dir_ in inner_dir, outer_dir: artifact.artifact.save( os.path.join(dir_, f'{artifact.name}.qza') ) dir_dag = ProvDAG(outer_dir) self.assertEqual(len(dir_dag._parsed_artifact_uuids), 2) self.assertIn(self.das.single_int.uuid, dir_dag.dag) self.assertIn(self.das.single_int2.uuid, dir_dag.dag) exp_1 = ( '(?s)from qiime2 import Artifact.*' 'single_int_0 = Artifact.import_data.*' '.*' ) exp_2 = ( '(?s)from qiime2 import Artifact.*' 'single_int_1 = Artifact.import_data.*' '.*' ) with tempfile.TemporaryDirectory() as tempdir: out_path = pathlib.Path(tempdir) / 'rendered.txt' replay_provenance( ReplayPythonUsage, dir_dag, out_path, md_out_dir=tempdir ) self.assertTrue(out_path.is_file()) with open(out_path, 'r') as fp: rendered = fp.read() self.assertRegex(rendered, exp_1) self.assertRegex(rendered, exp_2) class BuildUsageExamplesTests(unittest.TestCase): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir cls.pm = PluginManager() @classmethod def tearDownClass(cls): cls.das.free() @patch('qiime2.core.archive.provenance_lib.replay.build_action_usage') @patch('qiime2.core.archive.provenance_lib.replay.build_import_usage') @patch('qiime2.core.archive.provenance_lib.replay.' 'build_no_provenance_node_usage') def test_build_usage_examples(self, n_p_builder, imp_builder, act_builder): ns = ReplayNamespaces() dag = self.das.concated_ints_with_md.dag cfg = ReplayConfig( use=ReplayPythonUsage(), use_recorded_metadata=False, pm=self.pm ) build_usage_examples(dag, cfg, ns) n_p_builder.assert_not_called() self.assertEqual(imp_builder.call_count, 3) self.assertEqual(act_builder.call_count, 2) @patch('qiime2.core.archive.provenance_lib.replay.build_action_usage') @patch('qiime2.core.archive.provenance_lib.replay.build_import_usage') @patch('qiime2.core.archive.provenance_lib.replay.' 'build_no_provenance_node_usage') def test_build_usage_examples_lone_v0( self, n_p_builder, imp_builder, act_builder ): ns = ReplayNamespaces() uuid = self.das.table_v0.uuid with self.assertWarnsRegex( UserWarning, f'(:?)Art.*{uuid}.*prior.*incomplete' ): dag = ProvDAG(self.das.table_v0.filepath) cfg = ReplayConfig( use=ReplayPythonUsage(), use_recorded_metadata=False, pm=self.pm ) build_usage_examples(dag, cfg, ns) # This is a single v0 archive, so should have only one np node n_p_builder.assert_called_once() imp_builder.assert_not_called() act_builder.assert_not_called() @patch('qiime2.core.archive.provenance_lib.replay.build_action_usage') @patch('qiime2.core.archive.provenance_lib.replay.build_import_usage') @patch('qiime2.core.archive.provenance_lib.replay.' 
'build_no_provenance_node_usage') def test_build_usage_examples_mixed( self, n_p_builder, imp_builder, act_builder ): mixed_dir = os.path.join(self.tempdir, 'mixed-dir') os.mkdir(mixed_dir) shutil.copy(self.das.table_v0.filepath, mixed_dir) shutil.copy(self.das.concated_ints_v6.filepath, mixed_dir) ns = ReplayNamespaces() v0_uuid = self.das.table_v0.uuid with self.assertWarnsRegex( UserWarning, f'(:?)Art.*{v0_uuid}.*prior.*incomplete' ): dag = ProvDAG(mixed_dir) cfg = ReplayConfig( use=ReplayPythonUsage(), use_recorded_metadata=False, pm=self.pm ) build_usage_examples(dag, cfg, ns) n_p_builder.assert_called_once() self.assertEqual(imp_builder.call_count, 2) act_builder.assert_called_once() @patch('qiime2.core.archive.provenance_lib.replay.build_action_usage') @patch('qiime2.core.archive.provenance_lib.replay.build_import_usage') @patch('qiime2.core.archive.provenance_lib.replay.' 'build_no_provenance_node_usage') def test_build_usage_examples_big( self, n_p_builder, imp_builder, act_builder): many_dir = os.path.join(self.tempdir, 'many-dir') os.mkdir(many_dir) shutil.copy(self.das.concated_ints_with_md.filepath, many_dir) shutil.copy(self.das.splitted_ints.filepath, many_dir) shutil.copy(self.das.pipeline_viz.filepath, many_dir) ns = ReplayNamespaces() dag = ProvDAG(many_dir) cfg = ReplayConfig( use=ReplayPythonUsage(), use_recorded_metadata=False, pm=self.pm ) build_usage_examples(dag, cfg, ns) n_p_builder.assert_not_called() # concated_ints_with_md is loaded from disk so imports don't overlap # with splitted_ints and pipeline_viz self.assertEqual(imp_builder.call_count, 6) self.assertEqual(act_builder.call_count, 4) class MiscHelperFnTests(unittest.TestCase): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir cls.pm = PluginManager() @classmethod def tearDownClass(cls): cls.das.free() def test_uniquify_action_name(self): ns = ReplayNamespaces() p1 = 'dummy_plugin' a1 = 'action_jackson' p2 = 'dummy_plugin' a2 = 'missing_in_action' unique1 = ns.uniquify_action_name(p1, a1) self.assertEqual(unique1, 'dummy_plugin_action_jackson_0') unique2 = ns.uniquify_action_name(p2, a2) self.assertEqual(unique2, 'dummy_plugin_missing_in_action_0') duplicate = ns.uniquify_action_name(p1, a1) self.assertEqual(duplicate, 'dummy_plugin_action_jackson_1') def test_dump_recorded_md_file_no_md(self): uuid = self.das.table_v0.uuid dag = self.das.table_v0.dag cfg = ReplayConfig(use=ReplayPythonUsage(), pm=self.pm) provnode = dag.get_node_data(uuid) action_name = 'old_action' md_id = 'metadata' fn = 'metadata.tsv' with self.assertRaisesRegex( ValueError, "should only be called.*if.*metadata" ): dump_recorded_md_file(cfg, provnode, action_name, md_id, fn) class GroupByActionTests(unittest.TestCase): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir cls.pm = PluginManager() @classmethod def tearDownClass(cls): cls.das.free() def test_gba_with_provenance(self): self.maxDiff = None ns = ReplayNamespaces() dag = self.das.concated_ints_v6.dag sorted_nodes = nx.topological_sort(dag.collapsed_view) actual = group_by_action(dag, sorted_nodes, ns) exp = { 'b49e497c-19b2-49f7-b9a2-0d837016c151': { '8dea2f1a-2164-4a85-9f7d-e0641b1db22b': 'int_sequence1' }, '12988290-1ebf-47ad-8c34-5469d42e5ffe': { '7727c060-5384-445d-b007-b64b41a090ee': 'int_sequence2' }, '5035a60e-6f9a-40d4-b412-48ae52255bb5': { '6facaf61-1676-45eb-ada0-d530be678b27': 'concatenated_ints' } } self.assertEqual(actual.std_actions, exp) self.assertEqual(actual.no_provenance_nodes, 
[]) def test_gba_no_provenance(self): ns = ReplayNamespaces() dag = self.das.table_v0.dag uuid = self.das.table_v0.uuid sorted_nodes = nx.topological_sort(dag.collapsed_view) action_collections = group_by_action(dag, sorted_nodes, ns) self.assertEqual(action_collections.std_actions, {}) self.assertEqual(action_collections.no_provenance_nodes, [uuid]) def test_gba_some_nodes_missing_provenance(self): mixed_dir = os.path.join(self.tempdir, 'mixed-dir') os.mkdir(mixed_dir) shutil.copy(self.das.table_v0.filepath, mixed_dir) shutil.copy(self.das.concated_ints_v6.filepath, mixed_dir) ns = ReplayNamespaces() v0_uuid = self.das.table_v0.uuid with self.assertWarnsRegex( UserWarning, f'(:?)Art.*{v0_uuid}.*prior.*incomplete' ): dag = ProvDAG(mixed_dir) sorted_nodes = nx.topological_sort(dag.collapsed_view) action_collections = group_by_action(dag, sorted_nodes, ns) exp = { 'b49e497c-19b2-49f7-b9a2-0d837016c151': { '8dea2f1a-2164-4a85-9f7d-e0641b1db22b': 'int_sequence1' }, '12988290-1ebf-47ad-8c34-5469d42e5ffe': { '7727c060-5384-445d-b007-b64b41a090ee': 'int_sequence2' }, '5035a60e-6f9a-40d4-b412-48ae52255bb5': { '6facaf61-1676-45eb-ada0-d530be678b27': 'concatenated_ints' } } self.assertEqual(action_collections.std_actions, exp) self.assertEqual(action_collections.no_provenance_nodes, [v0_uuid]) class InitializerTests(unittest.TestCase): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir cls.pm = PluginManager() with zipfile.ZipFile(cls.das.concated_ints_with_md.filepath) as zf: root_node_id = cls.das.concated_ints_with_md.uuid all_filenames = zf.namelist() dag = cls.das.concated_ints_with_md.dag for node in dag.nodes: md_path = os.path.join( root_node_id, 'provenance', 'artifacts', node, 'action', 'metadata.tsv' ) if md_path in all_filenames: cls.md_node_id = node else: cls.non_md_node_id = node with zipfile.ZipFile( cls.das.concated_ints_with_md_column.filepath ) as zf: root_node_id = cls.das.concated_ints_with_md_column.uuid all_filenames = zf.namelist() dag = cls.das.concated_ints_with_md_column.dag for node in dag.nodes: md_path = os.path.join( root_node_id, 'provenance', 'artifacts', node, 'action', 'metadata.tsv' ) if md_path in all_filenames: cls.mdc_node_id = node else: cls.non_mdc_node_id = node @classmethod def tearDownClass(cls): cls.das.free() def test_init_md_from_artifacts_no_artifacts(self): cfg = ReplayConfig( use=ReplayPythonUsage(), use_recorded_metadata=False, pm=self.pm ) ns = ReplayNamespaces # create dummy hash '0', not relevant here md_info = MetadataInfo([], 'hmm.tsv', '0') with self.assertRaisesRegex( ValueError, "not.*used.*input_artifact_uuids.*empty" ): init_md_from_artifacts(md_info, ns, cfg) def test_init_md_from_artifacts_one_art(self): # This helper doesn't capture real data, so we're only smoke testing, # checking type, and confirming the repr looks reasonable. 
cfg = ReplayConfig( use=ReplayPythonUsage(), use_recorded_metadata=False, pm=self.pm ) # We expect artifact vars have already been added to the namespace a1 = cfg.use.init_artifact(name='thing1', factory=lambda: None) ns = ReplayNamespaces() ns._usg_var_ns = {'uuid1': UsageVariableRecord('thing1', a1)} # create dummy hash '0', not relevant here md_info = MetadataInfo(['uuid1'], 'hmm.tsv', '0') var = init_md_from_artifacts(md_info, ns, cfg) self.assertIsInstance(var, UsageVariable) self.assertEqual(var.var_type, 'metadata') rendered = var.use.render() self.assertIn('from qiime2 import Metadata', rendered) self.assertIn('thing1_a_0_md = thing1.view(Metadata)', rendered) def test_init_md_from_artifacts_many(self): # This helper doesn't capture real data, so we're only smoke testing, # checking type, and confirming the repr looks reasonable. cfg = ReplayConfig( use=ReplayPythonUsage(), use_recorded_metadata=False, pm=self.pm ) # We expect artifact vars have already been added to the namespace a1 = cfg.use.init_artifact(name='thing1', factory=lambda: None) a2 = cfg.use.init_artifact(name='thing2', factory=lambda: None) a3 = cfg.use.init_artifact(name='thing3', factory=lambda: None) ns = ReplayNamespaces() ns._usg_var_ns = { 'uuid1': UsageVariableRecord('thing1', a1), 'uuid2': UsageVariableRecord('thing2', a2), 'uuid3': UsageVariableRecord('thing3', a3), } # create dummy hash '0', not relevant here md_info = MetadataInfo(['uuid1', 'uuid2', 'uuid3'], 'hmm.tsv', '0') var = init_md_from_artifacts(md_info, ns, cfg) self.assertIsInstance(var, UsageVariable) self.assertEqual(var.var_type, 'metadata') rendered = var.use.render() self.assertIn('from qiime2 import Metadata', rendered) self.assertIn('thing1_a_0_md = thing1.view(Metadata)', rendered) self.assertIn('thing2_a_0_md = thing2.view(Metadata)', rendered) self.assertIn('thing3_a_0_md = thing3.view(Metadata)', rendered) self.assertIn( 'merged_artifacts_0_md = ' 'thing1_a_0_md.merge(thing2_a_0_md, thing3_a_0_md)', rendered ) def test_init_md_from_md_file(self): dag = self.das.concated_ints_with_md.dag md_node = dag.get_node_data(self.md_node_id) md_id = 'whatevs' param_name = 'metadata' ns = ReplayNamespaces() ns.add_usg_var_record(md_id, param_name) cfg = ReplayConfig( use=ReplayPythonUsage(), use_recorded_metadata=False, pm=self.pm ) var = init_md_from_md_file(md_node, param_name, md_id, ns, cfg) rendered = var.use.render() self.assertIn('from qiime2 import Metadata', rendered) self.assertIn( 'metadata_0_md = Metadata.load()', rendered ) def test_init_md_from_recorded_md(self): dag = self.das.concated_ints_with_md.dag no_md_node = dag.get_node_data(self.non_md_node_id) md_node = dag.get_node_data(self.md_node_id) var_name = 'metadata_0' param_name = 'metadata' ns = ReplayNamespaces() ns.add_usg_var_record(var_name, param_name) cfg = ReplayConfig( use=ReplayPythonUsage(), use_recorded_metadata=False, pm=self.pm ) md_fn = 'identity_with_metadata/metadata_0' with self.assertRaisesRegex(ValueError, 'only.*call.*if.*metadata'): init_md_from_recorded_md( no_md_node, param_name, var_name, ns, cfg, md_fn ) var = init_md_from_recorded_md( md_node, param_name, var_name, ns, cfg, md_fn ) self.assertIsInstance(var, UsageVariable) self.assertEqual(var.var_type, 'metadata') rendered = cfg.use.render() self.assertIn('from qiime2 import Metadata', rendered) self.assertIn('metadata_0_md = Metadata.load', rendered) self.assertIn( 'recorded_metadata/identity_with_metadata/metadata_0', rendered ) def test_init_md_from_recorded_mdc(self): dag = 
self.das.concated_ints_with_md_column.dag no_md_node = dag.get_node_data(self.non_mdc_node_id) md_node = dag.get_node_data(self.mdc_node_id) var_name = 'metadata_0' param_name = 'metadata' ns = ReplayNamespaces() ns.add_usg_var_record(var_name, param_name) cfg = ReplayConfig( use=ReplayPythonUsage(), use_recorded_metadata=False, pm=self.pm ) md_fn = 'identity_with_metadata_column/metadata_0' with self.assertRaisesRegex(ValueError, 'only.*call.*if.*metadata'): init_md_from_recorded_md( no_md_node, param_name, var_name, ns, cfg, md_fn ) var = init_md_from_recorded_md( md_node, param_name, var_name, ns, cfg, md_fn ) self.assertIsInstance(var, UsageVariable) self.assertEqual(var.var_type, 'column') rendered = cfg.use.render() self.assertIn('from qiime2 import Metadata', rendered) self.assertIn('metadata_0_md = Metadata.load', rendered) self.assertIn('.get_column(', rendered) self.assertIn('recorded_metadata/identity_with_metadata_column/' 'metadata_0.tsv', rendered) class BuildNoProvenanceUsageTests(CustomAssertions): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir cls.pm = PluginManager() @classmethod def tearDownClass(cls): cls.das.free() def test_build_no_provenance_node_usage_w_complete_node(self): ns = ReplayNamespaces() cfg = ReplayConfig(use=ReplayPythonUsage(), use_recorded_metadata=False, pm=self.pm) uuid = self.das.table_v0.uuid dag = self.das.table_v0.dag v0_node = dag.get_node_data(uuid) build_no_provenance_node_usage(v0_node, uuid, ns, cfg) out_var_name = 'feature_table_frequency_0' self.assertIn(uuid, ns._usg_var_ns) self.assertEqual(ns._usg_var_ns[uuid].name, out_var_name) rendered = cfg.use.render() # Confirm the initial context comment is present once. self.assertREAppearsOnlyOnce(rendered, 'nodes have no provenance') header = '# Original Node ID String Description' self.assertREAppearsOnlyOnce(rendered, header) # Confirm expected values have been rendered exp_v0 = f'# {uuid} feature_table_frequency_0' self.assertRegex(rendered, exp_v0) def test_build_no_provenance_node_usage_uuid_only_node(self): ns = ReplayNamespaces() cfg = ReplayConfig( use=ReplayPythonUsage(), use_recorded_metadata=False, pm=self.pm ) uuid = 'some-uuid' node = None build_no_provenance_node_usage(node, uuid, ns, cfg) out_var_name = 'no-provenance-node_0' self.assertIn(uuid, ns._usg_var_ns) self.assertEqual(ns._usg_var_ns[uuid].name, out_var_name) rendered = cfg.use.render() # Confirm the initial context comment is present once. 
self.assertREAppearsOnlyOnce(rendered, 'nodes have no provenance') header = '# Original Node ID String Description' self.assertREAppearsOnlyOnce(rendered, header) # Confirm expected values have been rendered exp_v0 = f'# {uuid} no_provenance_node_0' self.assertRegex(rendered, exp_v0) def test_build_no_provenance_node_usage_many(self): ns = ReplayNamespaces() cfg = ReplayConfig( use=ReplayPythonUsage(), use_recorded_metadata=False, pm=self.pm ) # This function doesn't actually know about the DAG, so no need to join uuid = self.das.table_v0.uuid dag = self.das.table_v0.dag v0_node = dag.get_node_data(uuid) dummy_node_uuid = uuid + '-dummy' dummy_node = dag.get_node_data(uuid) build_no_provenance_node_usage(v0_node, uuid, ns, cfg) build_no_provenance_node_usage(dummy_node, dummy_node_uuid, ns, cfg) self.assertIn(uuid, ns._usg_var_ns) self.assertIn(dummy_node_uuid, ns._usg_var_ns) self.assertEqual( ns._usg_var_ns[uuid].name, 'feature_table_frequency_0' ) self.assertEqual( ns._usg_var_ns[dummy_node_uuid].name, 'feature_table_frequency_1' ) rendered = cfg.use.render() # Confirm the initial context isn't repeated. self.assertREAppearsOnlyOnce(rendered, 'nodes have no provenance') header = '# Original Node ID String Description' self.assertREAppearsOnlyOnce(rendered, header) # Confirm expected values have been rendered exp_og = f'# {uuid} feature_table_frequency_0' exp_dummy = f'# {uuid}-dummy feature_table_frequency_1' self.assertRegex(rendered, exp_og) self.assertRegex(rendered, exp_dummy) class BuildImportUsageTests(CustomAssertions): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir cls.pm = PluginManager() @classmethod def tearDownClass(cls): cls.das.free() def test_build_import_usage_python(self): ns = ReplayNamespaces() cfg = ReplayConfig( use=ReplayPythonUsage(), use_recorded_metadata=False, pm=self.pm ) dag = self.das.concated_ints_v6.dag import_uuid = '8dea2f1a-2164-4a85-9f7d-e0641b1db22b' import_node = dag.get_node_data(import_uuid) c_to_s_type = camel_to_snake(import_node.type) unq_var_nm = c_to_s_type + '_0' build_import_usage(import_node, ns, cfg) usg_var = ns.get_usg_var_record(import_uuid).variable self.assertIsInstance(usg_var, UsageVariable) self.assertEqual(usg_var.var_type, 'artifact') self.assertEqual(usg_var.name, unq_var_nm) rendered = cfg.use.render() out_name = usg_var.to_interface_name() self.assertRegex(rendered, 'from qiime2 import Artifact') self.assertRegex(rendered, rf'{out_name} = Artifact.import_data\(') self.assertRegex(rendered, import_node.type) self.assertRegex(rendered, '') class ReplayResultCollectionTests(CustomAssertions): ''' One of three structures may be encoutered when parsing the inputs section of action.yaml, described below: case 1 (single artifact case): inputs: - some_input_name: some_uuid - some_other_input_name: some_other_uuid (...) For the single artifact case we may find the artifact in the usage variable namespace or it might only exist in a result collection. - case 1a: The artifact exists in the usage variable namespace so access it from there, not from any result collection that may contain it. - case 1b: The artifact does not exist in the usage variable namespace so find it in a result collection, destructure it, and then use the destructured artifact. case 2 (list of artifacts case): inputs: - some_input_name: - some_uuid - some_other_uuid (...) For the list of artifacts case the list contents may be equivalent to an existing result collection, or it may not be. 
- case 2a: The list contents are not equivalent to any existing result collection so for each member do either case 1a or 1b as described above. - case 2b: The list contents are equivalent to an existing result collection so pass in the result collection directly (which will be cast to a list). case 3 (result collection case): inputs: - result_collection_name: - some_key: some_uuid - some_other_key: some_other_uuid (...) For the result collection case an equivalent existing result collection may exist or it may not. - case 3a: No equivalent result collection is found so for each member follow case 1a or 1b as described above. - case 3b: An equivalent result collection is found so pass it in directly. ''' @classmethod def setUpClass(cls): cls.pm = PluginManager() cls.dp = cls.pm.plugins['dummy-plugin'] cls.single_int = Artifact.import_data('SingleInt', 0) cls.dict_of_ints = cls.dp.methods['dict_of_ints'] cls.list_of_ints = cls.dp.methods['list_of_ints'] def test_cases_1a_1b(self): ''' The `single_int` usage variable is not found before the first call to `optional_artifact_pipeline`, so it is accessed from the result collection it belongs to (case 1b) and added to the usage variable namespace. When we call `optional_artifact_pipeline` again it should not destructure the result collection a second time but instead just use the available usage variable that we added to the namespace previously (case 1a). ''' int_seq = Artifact.import_data('IntSequence1', [1, 1, 2]) opt_art_pipeline = self.dp.pipelines['optional_artifact_pipeline'] rc = {'int1': self.single_int} rc_out, = self.dict_of_ints(rc) int_destructured = rc_out['int1'] int_seq_out, = opt_art_pipeline(int_seq, int_destructured) int_seq_out, = opt_art_pipeline(int_seq_out, int_destructured) with tempfile.TemporaryDirectory() as tempdir: in_fp = pathlib.Path(tempdir) / 'int_seq_out.qza' int_seq_out.save(in_fp) dag = ProvDAG(in_fp) out_fp = pathlib.Path(tempdir) / 'rendered.txt' out_fn = str(out_fp) replay_provenance(ReplayPythonUsage, dag, out_fn) with open(out_fp) as fh: rendered = fh.read() exp = '''\ single_int_1 = output_0_artifact_collection['int1'] ints_1, = dummy_plugin_actions.optional_artifact_pipeline( int_sequence=int_sequence1_0, single_int=single_int_1, ) # SAVE: comment out the following with '# ' to skip saving Results to disk ints_1.save('ints_1') ints_2, = dummy_plugin_actions.optional_artifact_pipeline( int_sequence=ints_1, single_int=single_int_1, ) ''' self.assertIn(exp, rendered) def test_case_2a(self): rc = {'int1': self.single_int, 'int2': self.single_int} list_rc_out, = self.list_of_ints(rc) int1_destructured = list_rc_out['0'] with tempfile.TemporaryDirectory() as tempdir: in_fp = pathlib.Path(tempdir) / 'int1_destructured.qza' int1_destructured.save(in_fp) dag = ProvDAG(in_fp) out_fp = pathlib.Path(tempdir) / 'rendered.txt' out_fn = str(out_fp) replay_provenance(ReplayPythonUsage, dag, out_fn) with open(out_fp) as fh: rendered = fh.read() exp = '''\ output_0_artifact_collection, = dummy_plugin_actions.list_of_ints( ints=[single_int_0, single_int_0], ) ''' self.assertIn(exp, rendered) def test_case_2b(self): rc = {'int1': self.single_int} rc_out, = self.dict_of_ints(rc) list_rc_out, = self.list_of_ints(rc_out) int1_destructured = list_rc_out['0'] with tempfile.TemporaryDirectory() as tempdir: in_fp = pathlib.Path(tempdir) / 'int1_destructured.qza' int1_destructured.save(in_fp) dag = ProvDAG(in_fp) out_fp = pathlib.Path(tempdir) / 'rendered.txt' out_fn = str(out_fp) replay_provenance(ReplayPythonUsage, dag, out_fn) 
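# (Illustrative sketch only -- not part of the original test module. The
# ReplayResultCollectionTests class docstring above describes three shapes
# the `inputs` section of a provenance action.yaml can take; using the
# placeholder names and uuids from that docstring, they might look roughly
# like the following YAML:
#
#   inputs:                          # case 1: single artifacts
#   - some_input_name: some_uuid
#   - some_other_input_name: some_other_uuid
#
#   inputs:                          # case 2: list of artifacts
#   - some_input_name:
#     - some_uuid
#     - some_other_uuid
#
#   inputs:                          # case 3: result collection
#   - result_collection_name:
#     - some_key: some_uuid
#     - some_other_key: some_other_uuid
#
# All identifiers above are placeholders, not values read from real
# provenance.)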
with open(out_fp) as fh: rendered = fh.read() exp = '''\ output_1_artifact_collection, = dummy_plugin_actions.list_of_ints( ints=output_0_artifact_collection, ) ''' self.assertIn(exp, rendered) def test_case_3a(self): ''' The `int1_destructured` artifact won't be found in the usage variable namespace, while `self.single_int` will be, so assert that int1 is destructured and `self.single_int` is just used as is because its variable name is already available. ''' rc1 = {'int1': self.single_int} rc1_out, = self.dict_of_ints(rc1) int1_destructured = rc1_out['int1'] rc2 = {'int1': int1_destructured, 'int2': self.single_int} rc2_out, = self.dict_of_ints(rc2) int3_destructured = rc2_out['int2'] with tempfile.TemporaryDirectory() as tempdir: in_fp = pathlib.Path(tempdir) / 'int2_destructured.qza' int3_destructured.save(in_fp) dag = ProvDAG(in_fp) out_fp = pathlib.Path(tempdir) / 'rendered.txt' out_fn = str(out_fp) replay_provenance(ReplayPythonUsage, dag, out_fn) with open(out_fp) as fh: rendered = fh.read() exp = '''\ ints_1 = output_0_artifact_collection['int1'] ints_2_artifact_collection = ResultCollection({ 'int1': ints_1, 'int2': single_int_0, }) output_1_artifact_collection, = dummy_plugin_actions.dict_of_ints( ints=ints_2_artifact_collection, ) ''' self.assertIn(exp, rendered) self.assertREAppearsOnlyOnce(rendered, r'\[.*\]') def test_case_3b(self): rc = {'int1': self.single_int, 'int2': self.single_int} rc_out, = self.dict_of_ints(rc) rc_out_2, = self.dict_of_ints(rc_out) int1_destructured = rc_out_2['int1'] with tempfile.TemporaryDirectory() as tempdir: in_fp = pathlib.Path(tempdir) / 'int1_destructured.qza' int1_destructured.save(in_fp) dag = ProvDAG(in_fp) out_fp = pathlib.Path(tempdir) / 'rendered.txt' out_fn = str(out_fp) replay_provenance(ReplayPythonUsage, dag, out_fn) with open(out_fp) as fh: rendered = fh.read() exp = '''\ output_1_artifact_collection, = dummy_plugin_actions.dict_of_ints( ints=output_0_artifact_collection, ) ''' self.assertIn(exp, rendered) self.assertREAppearsOnlyOnce(rendered, r'ResultCollection\(') class BuildActionUsageTests(CustomAssertions): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir cls.pm = PluginManager() @classmethod def tearDownClass(cls): cls.das.free() def test_build_action_usage_python(self): plugin = 'dummy_plugin' action = 'concatenate_ints' cfg = ReplayConfig( use=ReplayPythonUsage(), use_recorded_metadata=False, pm=self.pm ) ns = ReplayNamespaces() import_var_1 = ArtifactAPIUsageVariable( 'imported_ints_0', lambda: None, 'artifact', cfg.use ) import_var_2 = ArtifactAPIUsageVariable( 'imported_ints_1', lambda: None, 'artifact', cfg.use ) import_uuid_1 = '8dea2f1a-2164-4a85-9f7d-e0641b1db22b' import_uuid_2 = '7727c060-5384-445d-b007-b64b41a090ee' ns.add_usg_var_record(import_uuid_1, 'imported_ints', import_var_1) ns.add_usg_var_record(import_uuid_2, 'imported_ints', import_var_2) dag = self.das.concated_ints_v6.dag action_uuid = '5035a60e-6f9a-40d4-b412-48ae52255bb5' node_uuid = '6facaf61-1676-45eb-ada0-d530be678b27' node = dag.get_node_data(node_uuid) actions = ActionCollections( std_actions={action_uuid: {node_uuid: 'concatenated_ints'}} ) unique_var_name = node.action.output_name + '_0' build_action_usage(node, ns, actions.std_actions, action_uuid, cfg) usg_var = ns.get_usg_var_record(node_uuid).variable out_name = usg_var.to_interface_name() self.assertIsInstance(usg_var, UsageVariable) self.assertEqual(usg_var.var_type, 'artifact') self.assertEqual(usg_var.name, unique_var_name) rendered = 
cfg.use.render() self.assertRegex( rendered, f"import.*{plugin}.actions as {plugin}_actions" ) self.assertIn( f'{out_name}, = dummy_plugin_actions.{action}(', rendered ) def test_build_action_usage_recorded_md(self): action = 'identity_with_metadata' with tempfile.TemporaryDirectory() as tempdir: cfg = ReplayConfig( use=ReplayPythonUsage(), use_recorded_metadata=False, pm=self.pm, md_out_dir=tempdir ) action_uuid = '8dae7a81-83ce-48db-9313-6e3131b0933c' node_uuid = 'be472b56-d205-43ee-8180-474da575c4d5' dag = self.das.concated_ints_with_md.dag node = dag.get_node_data(node_uuid) ns = ReplayNamespaces() mapping_var = ArtifactAPIUsageVariable( 'imported_mapping_0', lambda: None, 'artifact', cfg.use ) intseq_var_1 = ArtifactAPIUsageVariable( 'imported_ints_0', lambda: None, 'artifact', cfg.use ) intseq_var_2 = ArtifactAPIUsageVariable( 'imported_ints_1', lambda: None, 'artifact', cfg.use ) mapping_import_uuid = '8f71b73d-b028-4cbc-9894-738bdfe718bf' intseq_import_uuid_1 = '0bb6d731-155a-4dd0-8a1e-98827bc4e0bf' intseq_import_uuid_2 = 'e6b37bae-3a14-40f7-87b4-52cf5c7c7a1d' ns.add_usg_var_record( mapping_import_uuid, 'imported_mapping', mapping_var ) ns.add_usg_var_record( intseq_import_uuid_1, 'imported_ints', intseq_var_1 ) ns.add_usg_var_record( intseq_import_uuid_2, 'imported_ints', intseq_var_2 ) actions = ActionCollections( std_actions={action_uuid: {node_uuid: 'out'}} ) build_action_usage(node, ns, actions.std_actions, action_uuid, cfg) usg_var = ns.get_usg_var_record(node_uuid).variable self.assertIsInstance(usg_var, UsageVariable) self.assertEqual(usg_var.var_type, 'artifact') self.assertEqual(usg_var.name, 'out_0') rendered = cfg.use.render() self.assertIn('from qiime2 import Metadata', rendered) self.assertIn('.view(Metadata)', rendered) self.assertIn(f'.{action}(', rendered) class BibContentTests(unittest.TestCase): def test_contents(self): series_21 = { 'year': ' 2010 ', 'title': ' Data Structures for Statistical Computing in Python ', 'pages': ' 51 -- 56 ', 'editor': ' Stéfan van der Walt and Jarrod Millman ', 'booktitle': ' Proceedings of the 9th Python in Science Conferen', 'author': ' Wes McKinney ', 'ENTRYTYPE': 'inproceedings', 'ID': 'view|types:2021.2.0|pandas.core.series:Series|0'} df_20 = { 'year': ' 2010 ', 'title': ' Data Structures for Statistical Computing in Python ', 'pages': ' 51 -- 56 ', 'editor': ' Stéfan van der Walt and Jarrod Millman ', 'booktitle': ' Proceedings of the 9th Python in Science Conferen', 'author': ' Wes McKinney ', 'ENTRYTYPE': 'inproceedings', 'ID': 'view|types:2020.2.0|pandas.core.frame:DataFrame|0'} self.assertEqual(BibContent(series_21), BibContent(df_20)) self.assertEqual(hash(BibContent(series_21)), hash(BibContent(df_20))) # Set membership because these objects are equal and hash-equal self.assertIn(BibContent(series_21), {BibContent(df_20)}) class CitationsTests(unittest.TestCase): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir cls.pm = PluginManager() @classmethod def tearDownClass(cls): cls.das.free() def test_dedupe_citations(self): fn = os.path.join(self.das.datadir, 'dupes.bib') with open(fn) as bibtex_file: bib_db = bp.load(bibtex_file) deduped = dedupe_citations(bib_db.entries) # Dedupe by DOI will preserve only one of the biom.table entries # Dedupe by contents should preserve only one of the pandas entries self.assertEqual(len(deduped), 3) # Confirm each paper is present. 
The len assertion ensures one-to-one lower_keys = [entry['ID'].lower() for entry in deduped] self.assertTrue(any('framework' in key for key in lower_keys)) self.assertTrue(any('biom' in key for key in lower_keys)) self.assertTrue(any('pandas' in key for key in lower_keys)) def test_dedupe_pandas(self): """ No match on ID, framework, or DOI, but matching content should deduplicate these """ series_21 = { 'year': ' 2010 ', 'title': ' Data Structures for Statistical Computing in Python ', 'pages': ' 51 -- 56 ', 'editor': ' Stéfan van der Walt and Jarrod Millman ', 'booktitle': ' Proceedings of the 9th Python in Science Conferen', 'author': ' Wes McKinney ', 'ENTRYTYPE': 'inproceedings', 'ID': 'view|types:2021.2.0|pandas.core.series:Series|0' } df_20 = { 'year': ' 2010 ', 'title': ' Data Structures for Statistical Computing in Python ', 'pages': ' 51 -- 56 ', 'editor': ' Stéfan van der Walt and Jarrod Millman ', 'booktitle': ' Proceedings of the 9th Python in Science Conferen', 'author': ' Wes McKinney ', 'ENTRYTYPE': 'inproceedings', 'ID': 'view|types:2020.2.0|pandas.core.frame:DataFrame|0' } deduped = dedupe_citations([series_21, df_20]) self.assertEqual(len(deduped), 1) def test_dedupe_silva(self): """ These similar publications should not be deduped by content filter """ s0 = { 'year': '2007', 'volume': '35', 'title': 'SILVA: a comprehensive online resource for quality ' 'checked and aligned ribosomal RNA sequence data ' 'compatible with ARB', 'pages': '7188-7196', 'number': '21', 'journal': 'Nucleic Acids Res', 'author': 'Pruesse, Elmar and Quast, Christian and Knittel, Katrin' ' and Fuchs, Bernhard M and Ludwig, Wolfgang and Peplies' ', Jorg and Glockner, Frank Oliver', 'ENTRYTYPE': 'article', 'ID': 'action|rescript:2020.6.0+3.g772294c|' 'method:parse_silva_taxonomy|0' } s1 = { 'year': '2013', 'volume': '41', 'title': 'The SILVA ribosomal RNA gene database project: ' 'improved data processing and web-based tools', 'publisher': 'Oxford University Press', 'pages': 'D590-6', 'number': 'Database issue', 'journal': 'Nucleic Acids Res', 'author': 'Quast, Christian and Pruesse, Elmar and Yilmaz, Pelin ' 'and Gerken, Jan and Schweer, Timmy and Yarza, Pablo and' ' Peplies, Jorg and Glockner, Frank Oliver', 'ENTRYTYPE': 'article', 'ID': 'action|rescript:2020.6.0+3.g772294c|' 'method:parse_silva_taxonomy|1' } deduped = dedupe_citations([s0, s1]) self.assertEqual(len(deduped), 2) def test_collect_citations_no_dedupe(self): dag = self.das.concated_ints_v6.dag exp_keys = { 'framework|qiime2:2023.5.1|0', 'action|dummy-plugin:0.0.0-dev|method:concatenate_ints|0', 'plugin|dummy-plugin:0.0.0-dev|0', 'plugin|dummy-plugin:0.0.0-dev|1', 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0', 'transformer|dummy-plugin:0.0.0-dev|' 'builtins:list->IntSequenceDirectoryFormat|0', 'transformer|dummy-plugin:0.0.0-dev|' 'builtins:list->IntSequenceV2DirectoryFormat|0', 'transformer|dummy-plugin:0.0.0-dev|' 'builtins:list->IntSequenceV2DirectoryFormat|1', 'transformer|dummy-plugin:0.0.0-dev|' 'builtins:list->IntSequenceV2DirectoryFormat|2', 'transformer|dummy-plugin:0.0.0-dev|' 'builtins:list->IntSequenceV2DirectoryFormat|3', 'transformer|dummy-plugin:0.0.0-dev|' 'builtins:list->IntSequenceV2DirectoryFormat|4', 'transformer|dummy-plugin:0.0.0-dev|' 'builtins:list->IntSequenceV2DirectoryFormat|5', 'transformer|dummy-plugin:0.0.0-dev|' 'builtins:list->IntSequenceV2DirectoryFormat|6', 'transformer|dummy-plugin:0.0.0-dev|' 'builtins:list->IntSequenceV2DirectoryFormat|7', 'transformer|dummy-plugin:0.0.0-dev|' 
'builtins:list->IntSequenceV2DirectoryFormat|8', } citations = collect_citations(dag, deduplicate=False) keys = set(citations.entries_dict.keys()) self.assertEqual(len(keys), len(exp_keys)) self.assertEqual(keys, exp_keys) def test_collect_citations_dedupe(self): dag = self.das.concated_ints_v6.dag exp_keys = { 'framework|qiime2:2023.5.1|0', 'action|dummy-plugin:0.0.0-dev|method:concatenate_ints|0', 'plugin|dummy-plugin:0.0.0-dev|0', 'plugin|dummy-plugin:0.0.0-dev|1', 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0', 'transformer|dummy-plugin:0.0.0-dev|' 'builtins:list->IntSequenceDirectoryFormat|0', 'transformer|dummy-plugin:0.0.0-dev|' 'builtins:list->IntSequenceV2DirectoryFormat|4', 'transformer|dummy-plugin:0.0.0-dev|' 'builtins:list->IntSequenceV2DirectoryFormat|5', 'transformer|dummy-plugin:0.0.0-dev|' 'builtins:list->IntSequenceV2DirectoryFormat|6', 'transformer|dummy-plugin:0.0.0-dev|' 'builtins:list->IntSequenceV2DirectoryFormat|8' } citations = collect_citations(dag, deduplicate=True) print(citations.entries_dict.keys()) keys = set(citations.entries_dict.keys()) self.assertEqual(len(keys), len(exp_keys)) self.assertEqual(keys, exp_keys) def test_collect_citations_no_prov(self): dag = self.das.table_v0.dag exp_keys = set() citations = collect_citations(dag) keys = set(citations.entries_dict.keys()) self.assertEqual(len(keys), 0) self.assertEqual(keys, exp_keys) def test_replay_citations(self): dag = self.das.concated_ints_v6.dag exp_keys = { 'framework|qiime2:2023.5.1|0', 'action|dummy-plugin:0.0.0-dev|method:concatenate_ints|0', 'plugin|dummy-plugin:0.0.0-dev|0', 'plugin|dummy-plugin:0.0.0-dev|1', 'view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0', } with tempfile.TemporaryDirectory() as tempdir: out_fp = os.path.join(tempdir, 'citations.bib') replay_citations(dag, out_fp) with open(out_fp, 'r') as fp: written = fp.read() for key in exp_keys: self.assertIn(key, written) def test_replay_citations_no_prov(self): dag = self.das.table_v0.dag exp = "No citations were registered" with tempfile.TemporaryDirectory() as tempdir: out_fp = os.path.join(tempdir, 'citations.bib') replay_citations(dag, out_fp) with open(out_fp, 'r') as fp: written = fp.read() self.assertIn(exp, written) qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/test_usage_drivers.py000066400000000000000000000160661462552636000273160ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import os import shutil import tempfile import unittest from qiime2.sdk.plugin_manager import PluginManager from qiime2.core.testing.type import IntSequence1 from ..usage_drivers import ReplayPythonUsage from ..replay import replay_provenance class ReplayPythonUsageTests(unittest.TestCase): def setUp(self): self.pm = PluginManager() self.dp = self.pm.plugins['dummy-plugin'] self.tempdir = tempfile.mkdtemp( prefix='qiime2-test-usage-drivers-temp-' ) def return_many_ints() -> (list, list, list, list, list, list): return ([1, 2, 3], [4, 5, 6], [7], [4, 4], [0], [9, 8]) self.dp.methods.register_function( function=return_many_ints, inputs={}, parameters={}, outputs=[ ('ints1', IntSequence1), ('ints2', IntSequence1), ('ints3', IntSequence1), ('ints4', IntSequence1), ('ints5', IntSequence1), ('ints6', IntSequence1), ], output_descriptions={ 'ints1': 'ints', 'ints2': 'ints', 'ints3': 'ints', 'ints4': 'ints', 'ints5': 'ints', 'ints6': 'ints', }, name='return_many_ints', description='' ) def return_four_ints() -> (list, list, list, list): return ([1, 2, 3], [4, 5, 6], [7, 8, 9], [4, 4]) self.dp.methods.register_function( function=return_four_ints, inputs={}, parameters={}, outputs=[ ('ints1', IntSequence1), ('ints2', IntSequence1), ('ints3', IntSequence1), ('ints4', IntSequence1), ], output_descriptions={ 'ints1': 'ints', 'ints2': 'ints', 'ints3': 'ints', 'ints4': 'ints', }, name='return_four_ints', description='' ) def tearDown(self): shutil.rmtree(self.tempdir) def test_template_action_lumps_many_outputs(self): """ ReplayPythonUsage._template_action should "lump" multiple outputs from one command into a single Results-like object when the total number of outputs from a single command > 5 In these cases, our rendering should look like: `action_results = plugin_actions.action()...` instead of: `_, _, thing3, _, _, _ = plugin_actions.action()...` """ ints = self.dp.actions['return_many_ints']() first_ints = ints[0] first_ints.save(os.path.join(self.tempdir, 'int-seq.qza')) fp = os.path.join(self.tempdir, 'int-seq.qza') out_fp = os.path.join(self.tempdir, 'action_collection.txt') replay_provenance(ReplayPythonUsage, fp, out_fp) exp = 'action_results = dummy_plugin_actions.return_many_ints' with open(out_fp) as fh: rendered = fh.read() self.assertRegex(rendered, exp) def test_template_action_does_not_lump_four_outputs(self): """ ReplayPythonUsage._template_action should not "lump" multiple outputs one command into a single Results-like object when the total number of outputs from a single command <= 5, unless the total number of results is high (see above). In these cases, our rendering should look like: `_, _, thing3, _ = plugin_actions.action()...` instead of: `action_results = plugin_actions.action()...` """ ints = self.dp.actions['return_four_ints']() first_ints = ints[0] first_ints.save(os.path.join(self.tempdir, 'int-seq.qza')) fp = os.path.join(self.tempdir, 'int-seq.qza') out_fp = os.path.join(self.tempdir, 'action_collection.txt') replay_provenance(ReplayPythonUsage, fp, out_fp) exp = 'ints1_0, _, _, _ = dummy_plugin_actions.return_four_ints' with open(out_fp) as fh: rendered = fh.read() self.assertRegex(rendered, exp) def test_template_action_lumps_three_variables(self): """ ReplayPythonUsage._template_action should "lump" multiple outputs from one command into a single Results-like object when there are more than two usage variables (i.e. 
replay of 3+ results from a single command) In these cases, our rendering should look like: ``` action_results = plugin_actions.action(...) thing1 = action_results.thinga etc. ``` instead of: `thing1, _, thing3, _, thing5, _ = plugin_actions.action()...` """ ints = self.dp.actions['return_four_ints']() os.mkdir(os.path.join(self.tempdir, 'three-ints-dir')) for i in range(3): out_path = os.path.join(self.tempdir, 'three-ints-dir', f'int-seq-{i}.qza') ints[i].save(out_path) fp = os.path.join(self.tempdir, 'three-ints-dir') out_fp = os.path.join(self.tempdir, 'action_collection.txt') replay_provenance(ReplayPythonUsage, fp, out_fp) exp = ( 'action_results = dummy_plugin_actions.return_four_ints', 'ints1_0 = action_results.ints1', 'ints2_0 = action_results.ints2', 'ints3_0 = action_results.ints3' ) with open(out_fp) as fh: rendered = fh.read() for pattern in exp: self.assertRegex(rendered, pattern) def test_template_action_does_not_lump_two_vars(self): """ ReplayPythonUsage._template_action should not "lump" multiple outputs from one command into a single Results-like object when the total count of usage variables (i.e. replayed outputs) from a single command < 3, unless the total number of outputs is high (see above). In these cases, our rendering should look like: `thing1, _, thing3, _ = plugin_actions.action()...` instead of: `action_results = plugin_actions.action()...` """ ints1, ints2, _, _ = self.dp.actions['return_four_ints']() os.mkdir(os.path.join(self.tempdir, 'two-ints-dir')) ints1.save(os.path.join(self.tempdir, 'two-ints-dir', 'int-seq-1.qza')) ints2.save(os.path.join(self.tempdir, 'two-ints-dir', 'int-seq-2.qza')) fp = os.path.join(self.tempdir, 'two-ints-dir') out_fp = os.path.join(self.tempdir, 'action_collection.txt') replay_provenance(ReplayPythonUsage, fp, out_fp) exp = 'ints1_0, ints2_0, _, _ = dummy_plugin_actions.return_four_ints' with open(out_fp) as fh: rendered = fh.read() self.assertRegex(rendered, exp) qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/test_util.py000066400000000000000000000053601462552636000254240ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import pathlib import unittest import zipfile from .testing_utilities import CustomAssertions, DummyArtifacts from ..util import get_root_uuid, get_nonroot_uuid class GetRootUUIDTests(unittest.TestCase): @classmethod def setUpClass(cls): cls.das = DummyArtifacts() cls.tempdir = cls.das.tempdir @classmethod def tearDownClass(cls): cls.das.free() def test_get_root_uuid(self): exp_root_uuids = { '0': '89af91c0-033d-4e30-8ac4-f29a3b407dc1', '1': '5b929500-e4d6-4d3f-8f5f-93fd95d1117d', '2': 'e01f0484-40d4-420e-adcf-ca9be58ed1ee', '3': 'aa960110-4069-4b7c-97a3-8a768875e515', '4': '856502cb-66f2-45aa-a86c-e484cc9bfd57', '5': '48af8384-2b0a-4b26-b85c-11b79c0d6ea6', '6': '6facaf61-1676-45eb-ada0-d530be678b27', } for artifact, exp_uuid in zip( self.das.all_artifact_versions, exp_root_uuids.values() ): with zipfile.ZipFile(artifact.filepath) as zfh: self.assertEqual(exp_uuid, get_root_uuid(zfh)) class GetNonRootUUIDTests(unittest.TestCase): def test_get_nonroot_uuid(self): md_example = pathlib.Path( 'arch_root/provenance/artifacts/uuid123/metadata.yaml') action_example = pathlib.Path( 'arch_root/provenance/artifacts/uuid123/action/action.yaml') exp = 'uuid123' self.assertEqual(get_nonroot_uuid(md_example), exp) self.assertEqual(get_nonroot_uuid(action_example), exp) class CustomAssertionsTests(CustomAssertions): def test_assert_re_appears_only_once(self): t = ("Lick an orange. It tastes like an orange.\n" "The strawberries taste like strawberries!\n" "The snozzberries taste like snozzberries!") self.assertREAppearsOnlyOnce(t, 'Lick an orange') self.assertREAppearsOnlyOnce(t, 'tastes like') with self.assertRaisesRegex(AssertionError, 'Regex.*match.*orange'): self.assertREAppearsOnlyOnce(t, 'orange') with self.assertRaisesRegex(AssertionError, 'Regex.*taste like'): self.assertREAppearsOnlyOnce(t, 'taste like') with self.assertRaisesRegex(AssertionError, 'Regex.*snozzberries'): self.assertREAppearsOnlyOnce(t, 'snozzberries') with self.assertRaisesRegex(AssertionError, 'Regex.*!'): self.assertREAppearsOnlyOnce(t, '!') qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/test_version_parser.py000066400000000000000000000163241462552636000275120ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import codecs import os import shutil import tempfile import unittest import zipfile import qiime2 from qiime2 import Artifact from qiime2.sdk.plugin_manager import PluginManager from qiime2.core.archive.archiver import Archiver from .testing_utilities import ( write_zip_archive, monkeypatch_archive_version, monkeypatch_framework_version ) from ..util import _VERSION_MATCHER, parse_version class TestVersionParser(unittest.TestCase): def setUp(self): self.pm = PluginManager() self.dp = self.pm.plugins['dummy-plugin'] self.tempdir = tempfile.mkdtemp( prefix='qiime2-test-version-parser-temp-' ) self.framework_version_exp = qiime2.__version__ self.archive_version_exp = Archiver.CURRENT_FORMAT_VERSION def tearDown(self): shutil.rmtree(self.tempdir) def test_parse_version(self): int_seq = Artifact.import_data('IntSequence1', [1, 2, 3]) int_seq.save(os.path.join(self.tempdir, 'int-seq.qza')) fp = os.path.join(self.tempdir, 'int-seq.qza') with zipfile.ZipFile(fp) as zf: actual = parse_version(zf) self.assertEqual( actual, (self.archive_version_exp, self.framework_version_exp) ) def test_parse_version_old_archive_format(self): archive_version_exp = '2' fp = os.path.join(self.tempdir, 'int-seq-av2.qza') with monkeypatch_archive_version(archive_version_exp): int_seq = Artifact.import_data('IntSequence1', [1, 2, 3]) int_seq.save(fp) with zipfile.ZipFile(fp) as zf: actual = parse_version(zf) self.assertEqual( actual, (archive_version_exp, self.framework_version_exp) ) def test_artifact_with_commit_version(self): framework_version_exp = '2022.8.0+29.gb053440' fp = os.path.join(self.tempdir, 'int-seq-custom-fv.qza') with monkeypatch_framework_version(framework_version_exp): int_seq = Artifact.import_data('IntSequence1', [1, 2, 3]) int_seq.save(fp) with zipfile.ZipFile(fp) as zf: actual = parse_version(zf) self.assertEqual( actual, (self.archive_version_exp, framework_version_exp) ) def test_parse_version_no_VERSION_file(self): int_seq = Artifact.import_data('IntSequence1', [1, 2, 3]) fp = os.path.join(self.tempdir, 'int-seq-no-v.qza') int_seq.save(fp) with tempfile.TemporaryDirectory() as tempdir: with zipfile.ZipFile(fp) as zf: zf.extractall(tempdir) uuid = os.listdir(tempdir)[0] os.remove(os.path.join(tempdir, uuid, 'VERSION')) write_zip_archive(fp, tempdir) with zipfile.ZipFile(fp) as zf: with self.assertRaisesRegex(ValueError, '(?s)VERSION.*nonexistent.*'): parse_version(zf) def test_parse_version_VERSION_file_missing_archive_field(self): int_seq = Artifact.import_data('IntSequence1', [1, 2, 3]) fp = os.path.join(self.tempdir, 'int-seq-no-af.qza') int_seq.save(fp) with tempfile.TemporaryDirectory() as tempdir: with zipfile.ZipFile(fp) as zf: zf.extractall(tempdir) uuid = os.listdir(tempdir)[0] with open(os.path.join(tempdir, uuid, 'VERSION')) as fh: lines = fh.readlines() missing_archive_lines = [lines[0], lines[2]] with open(os.path.join(tempdir, uuid, 'VERSION'), 'w') as fh: for line in missing_archive_lines: fh.write(line) write_zip_archive(fp, tempdir) with zipfile.ZipFile(fp) as zf: with self.assertRaisesRegex(ValueError, 'VERSION.*out of spec.*'): parse_version(zf) def test_parse_version_VERSION_file_extra_field(self): int_seq = Artifact.import_data('IntSequence1', [1, 2, 3]) fp = os.path.join(self.tempdir, 'int-seq-extra-f.qza') int_seq.save(fp) with tempfile.TemporaryDirectory() as tempdir: with zipfile.ZipFile(fp) as zf: zf.extractall(tempdir) uuid = os.listdir(tempdir)[0] with open(os.path.join(tempdir, uuid, 
'VERSION'), 'a') as fh: fh.write('fourth line\n') write_zip_archive(fp, tempdir) with zipfile.ZipFile(fp) as zf: with self.assertRaisesRegex(ValueError, 'VERSION.*out of spec.*'): parse_version(zf) ''' Tests of the regex match itself below ''' def test_version_too_short(self): short = ( r'QIIME 2\n' r'archive: 4' ) self.assertNotRegex(short, _VERSION_MATCHER) def test_version_too_long(self): long = ( r'QIIME 2\n' r'archive: 4\n' r'framework: 2019.8.1.dev0\n' r'This line should not be here' ) self.assertNotRegex(long, _VERSION_MATCHER) splitvm = codecs.decode(_VERSION_MATCHER.encode('utf-8'), 'unicode-escape').split(sep='\n') re_l1, re_l2, re_l3 = splitvm def test_line1_good(self): self.assertRegex('QIIME 2\n', self.re_l1) def test_line1_bad(self): self.assertNotRegex('SHIMMY 2\n', self.re_l1) def test_archive_version_1digit_numeric(self): self.assertRegex('archive: 1\n', self.re_l2) def test_archive_version_2digit_numeric(self): self.assertRegex('archive: 12\n', self.re_l2) def test_archive_version_bad(self): self.assertNotRegex('agama agama\n', self.re_l2) def test_archive_version_3digit_numeric(self): self.assertNotRegex('archive: 123\n', self.re_l2) def test_archive_version_nonnumeric(self): self.assertNotRegex('archive: 1a\n', self.re_l2) def test_fmwk_version_good_semver(self): self.assertRegex('framework: 2.0.6', self.re_l3) def test_fmwk_version_good_semver_dev(self): self.assertRegex('framework: 2.0.6.dev0', self.re_l3) def test_fmwk_version_good_year_month_patch(self): self.assertRegex('framework: 2020.2.0', self.re_l3) def test_fmwk_version_good_year_month_patch_2digit_month(self): self.assertRegex('framework: 2018.11.0', self.re_l3) def test_fmwk_version_good_year_month_patch_dev(self): self.assertRegex('framework: 2020.2.0.dev1', self.re_l3) def test_fmwk_version_good_ymp_2digit_month_dev(self): self.assertRegex('framework: 2020.11.0.dev0', self.re_l3) def test_fmwk_version_invalid_month(self): self.assertNotRegex('framework: 2020.13.0', self.re_l3) def test_fmwk_version_invalid_month_leading_zero(self): self.assertNotRegex('framework: 2020.03.0', self.re_l3) def test_fmwk_version_invalid_year(self): self.assertNotRegex('framework: 1953.3.0', self.re_l3) qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/test_yaml_constructors.py000066400000000000000000000141041462552636000302350ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import os import tempfile import unittest import warnings import yaml from ...provenance import MetadataInfo from qiime2.core.util import md5sum class YamlConstructorTests(unittest.TestCase): ''' YAML Constructors are used to handle the custom YAML tags defined by the framework. ''' def test_unknown_tag(self): ''' Makes explicit the current handling of unimplemented custom tags. In future, we may want to deal with these more graciously (e.g. warn), but for now we're going to fail fast. 
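        (Concretely, yaml.safe_load raises a ConstructorError for any tag
        that has no registered constructor, which is the behavior asserted
        below.)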
''' tag = r"!foo 'this is not an implemented tag'" with self.assertRaisesRegex( yaml.constructor.ConstructorError, 'could not determine a constructor.*!foo' ): yaml.safe_load(tag) def test_citation_key_constructor(self): tag = r"!cite 'framework|qiime2:2020.6.0.dev0|0'" actual = yaml.safe_load(tag) self.assertEqual(actual, 'framework|qiime2:2020.6.0.dev0|0') def test_color_primitive_constructor(self): tag = r"!color '#57f289'" actual = yaml.safe_load(tag) self.assertEqual(actual, '#57f289') def test_forward_ref_action_plugin_ref(self): tag = r"plugin: !ref 'environment:plugins:diversity'" actual = yaml.safe_load(tag) self.assertEqual(actual, {'plugin': 'diversity'}) def test_forward_ref_generic_ref(self): tag = r"plugin: !ref 'environment:framework:version'" actual = yaml.safe_load(tag) exp = {'plugin': ['environment', 'framework', 'version']} self.assertEqual(exp, actual) def test_metadata_path_constructor(self): tag = r"!metadata 'metadata.tsv'" with tempfile.TemporaryDirectory() as tempdir: md_fp = os.path.join(tempdir, 'metadata.tsv') with open(md_fp, 'w') as fh: fh.write('rows and columns and stuff') action_fp = os.path.join(tempdir, 'action.yaml') with open(action_fp, 'w') as fh: fh.write(f'{tag}\n') with open(action_fp, 'r') as fh: actual = yaml.safe_load(fh) md5sum_hash = md5sum(md_fp) self.assertEqual(actual, MetadataInfo([], 'metadata.tsv', md5sum_hash)) def test_metadata_path_constructor_one_Artifact_as_md(self): tag = r"!metadata '415409a4-stuff-e3eaba5301b4:feature_metadata.tsv'" with tempfile.TemporaryDirectory() as tempdir: md_fp = os.path.join(tempdir, 'feature_metadata.tsv') with open(md_fp, 'w') as fh: fh.write('rows and columns and stuff') action_fp = os.path.join(tempdir, 'action.yaml') with open(action_fp, 'w') as fh: fh.write(f'{tag}\n') with open(action_fp, 'r') as fh: actual = yaml.safe_load(fh) md5sum_hash = md5sum(md_fp) self.assertEqual( actual, MetadataInfo( ['415409a4-stuff-e3eaba5301b4'], 'feature_metadata.tsv', md5sum_hash ) ) def test_metadata_path_constructor_many_Artifacts_as_md(self): tag = ( r"!metadata '415409a4-stuff-e3eaba5301b4,12345-other-stuff-67890" r":feature_metadata.tsv'" ) with tempfile.TemporaryDirectory() as tempdir: md_fp = os.path.join(tempdir, 'feature_metadata.tsv') with open(md_fp, 'w') as fh: fh.write('rows and columns and stuff') action_fp = os.path.join(tempdir, 'action.yaml') with open(action_fp, 'w') as fh: fh.write(f'{tag}\n') with open(action_fp, 'r') as fh: actual = yaml.safe_load(fh) md5sum_hash = md5sum(md_fp) self.assertEqual( actual, MetadataInfo( ['415409a4-stuff-e3eaba5301b4', '12345-other-stuff-67890'], 'feature_metadata.tsv', md5sum_hash ) ) def test_no_provenance_constructor(self): tag = "!no-provenance '34b07e56-27a5-4f03-ae57-ff427b50aaa1'" with self.assertWarnsRegex( UserWarning, 'Artifact 34b07e.*prior to provenance' ): actual = yaml.safe_load(tag) self.assertEqual(actual, '34b07e56-27a5-4f03-ae57-ff427b50aaa1') def test_no_provenance_multiple_warnings_fire(self): tag_list = """ - !no-provenance '34b07e56-27a5-4f03-ae57-ff427b50aaa1' - !no-provenance 'gerbil' """ with warnings.catch_warnings(record=True) as w: # Just in case something else has modified the filter state warnings.simplefilter("default") yaml.safe_load(tag_list) # There should be exactly two warnings self.assertEqual(len(w), 2) # The first should be a Userwarning containing these strings self.assertEqual(UserWarning, w[0].category) self.assertIn('Artifact 34b07e', str(w[0].message)) self.assertIn('prior to provenance', str(w[0].message)) # And the 
second should look similar self.assertEqual(UserWarning, w[1].category) self.assertIn('gerbil', str(w[1].message)) self.assertIn('prior to provenance', str(w[0].message)) def test_set_ref(self): flow_tag = r"!set ['foo', 'bar', 'baz']" flow = yaml.safe_load(flow_tag) self.assertEqual(flow, {'foo', 'bar', 'baz'}) # NOTE: we don't expect duplicate values here (because dumped values # were a set), but it doesn't hurt to test the behavior block_tag = '!set\n- spam\n- egg\n- spam\n' block = yaml.safe_load(block_tag) self.assertEqual(block, {'spam', 'egg'}) qiime2-2024.5.0/qiime2/core/archive/provenance_lib/tests/testing_utilities.py000066400000000000000000000262671462552636000271710ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import os import pathlib import shutil import tempfile import unittest from contextlib import contextmanager from dataclasses import dataclass from typing import Generator import warnings from zipfile import ZipFile, ZIP_DEFLATED from ..parse import ProvDAG import qiime2 from qiime2 import Artifact, Metadata, ResultCollection from qiime2.core.archive import Archiver from qiime2.sdk.plugin_manager import PluginManager @dataclass class DummyArtifact: name: str artifact: Artifact uuid: str filepath: str dag: ProvDAG archive_version: int = 6 class DummyArtifacts: def __init__(self): self.pm = PluginManager() self.dp = self.pm.plugins['dummy-plugin'] self.tempdir = tempfile.mkdtemp(prefix='qiime2-dummy-artifacts-temp-') self.datadir = os.path.join( os.path.dirname(os.path.abspath(__file__)), 'data' ) self.init_import_artifacts() self.init_action_artifacts() self.init_all_version_artifacts() self.init_artifact_with_md_in_provenance() self.init_no_checksum_dag() def init_import_artifacts(self): ''' artifacts with only import in their provenance ''' single_int = Artifact.import_data('SingleInt', 0) single_int2 = Artifact.import_data('SingleInt', 7) int_seq1 = Artifact.import_data('IntSequence1', [1, 1, 2]) int_seq2 = Artifact.import_data('IntSequence2', [3, 5]) mapping1 = Artifact.import_data('Mapping', {'a': 42}) mapping2 = Artifact.import_data('Mapping', {'c': 8, 'd': 13}) for name in ( 'single_int', 'single_int2', 'int_seq1', 'int_seq2', 'mapping1', 'mapping2' ): artifact = locals()[name] fp = os.path.join(self.tempdir, f'{name}.qza') artifact.save(fp) test_artifact = DummyArtifact( name, artifact, str(artifact.uuid), fp, ProvDAG(fp) ) setattr(self, name, test_artifact) def init_action_artifacts(self): ''' artifacts that have at least one non-import action in their provenance ''' concat_ints = self.dp.methods['concatenate_ints'] split_ints = self.dp.methods['split_ints'] merge_mappings = self.dp.methods['merge_mappings'] identity_with_metadata = self.dp.actions['identity_with_metadata'] identity_with_metadata_column = \ self.dp.actions['identity_with_metadata_column'] dict_of_ints = self.dp.actions['dict_of_ints'] optional_artifacts_method = \ self.dp.actions['optional_artifacts_method'] concated_ints, = concat_ints( self.int_seq1.artifact, self.int_seq1.artifact, self.int_seq2.artifact, 7, 13 ) other_concated_ints, = concat_ints( self.int_seq1.artifact, self.int_seq1.artifact, self.int_seq2.artifact, 81, 64 ) splitted_ints, _ = 
split_ints(self.int_seq2.artifact) merged_mappings, = merge_mappings( self.mapping1.artifact, self.mapping2.artifact ) # artifacts with input artifact viewed as metadata int_seq_with_md, = identity_with_metadata( self.int_seq1.artifact, self.mapping1.artifact.view(Metadata) ) int_seq_with_md_column, = identity_with_metadata_column( self.int_seq1.artifact, self.mapping2.artifact.view(Metadata).get_column('c') ) concated_ints_with_md_column, = concat_ints( self.int_seq1.artifact, int_seq_with_md_column, self.int_seq2.artifact, 69, 2001 ) # artifact with input collection ints_dict = { 'int1': self.single_int.artifact, 'int2': self.single_int2.artifact, } ints_collection = ResultCollection(ints_dict) ints_from_collection, = dict_of_ints(ints_collection) int_from_collection = ints_from_collection['int1'] # artifact with optional inputs left to default None int_seq_optional_input, = optional_artifacts_method( self.int_seq1.artifact, 8 ) # artifact from pipeline typical_pipeline = self.dp.pipelines['typical_pipeline'] _, _, _, pipeline_viz, _ = typical_pipeline( self.int_seq1.artifact, self.mapping1.artifact, False ) for name in ( 'concated_ints', 'other_concated_ints', 'splitted_ints', 'merged_mappings', 'pipeline_viz', 'int_seq_with_md', 'concated_ints_with_md_column', 'int_from_collection', 'int_seq_optional_input' ): artifact = locals()[name] if name == 'pipeline_viz': ext = '.qzv' else: ext = '.qza' fp = os.path.join(self.tempdir, f'{name}{ext}') artifact.save(fp) test_artifact = DummyArtifact( name, artifact, str(artifact.uuid), fp, ProvDAG(fp) ) setattr(self, name, test_artifact) def init_all_version_artifacts(self): ''' import artifacts for all archive versions (0-6), which are stored uncompressed in the test data directory--necessary because we can not make artifacts of non-current versions on the fly ''' for version in range(0, 7): if version == 0: dirname = 'table-v0' else: dirname = f'concated-ints-v{version}' versioned_artifact_dir = os.path.join(self.datadir, dirname) temp_zf_path = os.path.join(self.tempdir, 'temp.zip') write_zip_file(temp_zf_path, versioned_artifact_dir) filename = f'{dirname}.qza' fp = os.path.join(self.tempdir, filename) if version == 0: shutil.copy(temp_zf_path, fp) a = None else: a = Artifact.load(temp_zf_path) a.save(fp) with warnings.catch_warnings(): warnings.filterwarnings('ignore', category=UserWarning) dag = ProvDAG(fp) assert len(dag.terminal_nodes) == 1 terminal_node, *_ = dag.terminal_nodes uuid = terminal_node._uuid name = filename.replace('-', '_').replace('.qza', '') da = DummyArtifact(name, a, uuid, fp, dag, version) setattr(self, name, da) def init_artifact_with_md_in_provenance(self): dirname = 'concated-ints-with-md' artifact_dir = os.path.join(self.datadir, dirname) temp_zf_path = os.path.join(self.tempdir, 'temp.zip') write_zip_file(temp_zf_path, artifact_dir) filename = f'{dirname}.qza' fp = os.path.join(self.tempdir, filename) a = Artifact.load(temp_zf_path) a.save(fp) dag = ProvDAG(fp) terminal_node, *_ = dag.terminal_nodes uuid = terminal_node._uuid name = filename.replace('-', '_').replace('.qza', '') da = DummyArtifact(name, a, uuid, fp, dag, 6) setattr(self, name, da) def init_no_checksum_dag(self): ''' create archive with missing checksums.md5 ''' with warnings.catch_warnings(): warnings.filterwarnings('ignore', category=UserWarning) with generate_archive_with_file_removed( self.single_int.filepath, self.single_int.uuid, 'checksums.md5' ) as altered_archive: self.dag_missing_md5 = ProvDAG(altered_archive) @property def 
all_artifact_versions(self): return ( self.table_v0, self.concated_ints_v1, self.concated_ints_v2, self.concated_ints_v3, self.concated_ints_v4, self.concated_ints_v5, self.concated_ints_v6 ) def free(self): shutil.rmtree(self.tempdir) def write_zip_file(zfp, unzipped_dir): zf = ZipFile(zfp, 'w', ZIP_DEFLATED) for root, dirs, files in os.walk(unzipped_dir): for file in files: filepath = os.path.join(root, file) zf.write( filepath, os.path.relpath(filepath, unzipped_dir) ) zf.close() class CustomAssertions(unittest.TestCase): def assertREAppearsOnlyOnce(self, text, only_once, msg=None): appears_once_re = \ (f'(?s)^(?:(?!{only_once}).)*{only_once}(?!.*{only_once}).*$') self.assertRegex(text, appears_once_re, msg) def is_root_provnode_data(fp): ''' a filter predicate which returns metadata, action, citation, and VERSION fps with which we can construct a ProvNode ''' # Handle provenance files... if 'provenance' in fp and 'artifacts' not in fp: if 'action.yaml' in fp or 'citations.bib' in fp: return True # then handle files available at root, which require a cast if pathlib.Path(fp).parts[1] in ( 'VERSION', 'metadata.yaml', 'checksums.md5' ): return True @contextmanager def generate_archive_with_file_removed( qzv_fp: str, root_uuid: str, file_to_drop: pathlib.Path ) -> Generator[pathlib.Path, None, None]: """ Deleting files from zip archives is hard, so this makes a temporary copy of qzf_fp with fp_to_drop removed and returns a handle to this archive file_to_drop should represent the relative path to the file within the zip archive, excluding the root directory (named for the root UUID). e.g. `/d9e080bb-e245-4ab0-a2cf-0a89b63b8050/metadata.yaml` should be passed in as `metadata.yaml` adapted from https://stackoverflow.com/a/513889/9872253 """ with tempfile.TemporaryDirectory() as tmpdir: tmp_arc = pathlib.Path(tmpdir) / 'mangled.qzv' fp_pfx = pathlib.Path(root_uuid) zin = ZipFile(qzv_fp, 'r') zout = ZipFile(str(tmp_arc), 'w') for item in zin.infolist(): buffer = zin.read(item.filename) drop_filename = str(fp_pfx / file_to_drop) if (item.filename != drop_filename): zout.writestr(item, buffer) zout.close() zin.close() yield tmp_arc @contextmanager def monkeypatch_archive_version(patch_version): try: og_version = Archiver.CURRENT_FORMAT_VERSION Archiver.CURRENT_FORMAT_VERSION = patch_version yield finally: Archiver.CURRENT_FORMAT_VERSION = og_version @contextmanager def monkeypatch_framework_version(patch_version): try: og_version = qiime2.__version__ qiime2.__version__ = patch_version yield finally: qiime2.__version__ = og_version def write_zip_archive(zfp, unzipped_dir): with ZipFile(zfp, 'w') as zf: for root, dirs, files in os.walk(unzipped_dir): for file in files: path = os.path.join(root, file) archive_name = os.path.relpath(path, start=unzipped_dir) zf.write(path, arcname=archive_name) qiime2-2024.5.0/qiime2/core/archive/provenance_lib/usage_drivers.py000066400000000000000000000343021462552636000251060ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- from datetime import datetime from importlib.metadata import metadata import pkg_resources import textwrap from typing import Any, Callable, List from .parse import ProvDAG from qiime2.sdk import Action from qiime2.plugins import ArtifactAPIUsage from qiime2.sdk.usage import ( Usage, UsageVariable, UsageInputs, UsageOutputs ) def build_header( shebang: str = '', boundary: str = '', copyright: str = '', extra_text: List[str] = [] ) -> List[str]: ''' Constructs the header contents for a replay script. Parameters ---------- shebang : str The shebang line to add to the rendered script, if any. boundary : str The visual boundary to add to the rendred script, if any. copyright : str The copyright notice to add to the rendered script, if any. extra_text : list of str Extra lines of text to add to the header in the rendered script, if any. Returns ------- list of str The constructed header lines. ''' qiime2_md = metadata('qiime2') vzn = qiime2_md['Version'] ts = datetime.now() header = [] if shebang: header.append(shebang) if boundary: header.append(boundary) header.extend([ f'# Auto-generated by qiime2 v.{vzn} at ' f'{ts.strftime("%I:%M:%S %p")} on {ts.strftime("%d %b, %Y")}', ]) if copyright: header.extend(copyright) header.append( '# For User Support, post to the QIIME2 Forum at ' 'https://forum.qiime2.org.' ) if extra_text: header.extend(extra_text) if boundary: header.append(boundary) return header def build_footer(dag: ProvDAG, boundary: str) -> List[str]: ''' Constructs the footer contents for a replay script. Parameters ---------- dag : ProvDAG The ProvDAG object representing the input artifact(s). boundary : str The visual boundary demarcating the footer. Returns ------- list of str The constructed footer lines as a list of strings. ''' footer = [] pairs = [] uuids = sorted(dag._parsed_artifact_uuids) # two UUIDs fit on a line for idx in range(0, u_len := len(uuids), 2): if idx == u_len - 1: pairs.append(f'# {uuids[idx]}') else: pairs.append(f'# {uuids[idx]} \t {uuids[idx + 1]}') footer.append(boundary) footer.append( '# The following QIIME 2 Results were parsed to produce this script:' ) footer.extend(pairs) footer.append(boundary) footer.append('') return footer class ReplayPythonUsage(ArtifactAPIUsage): shebang = '#!/usr/bin/env python' header_boundary = '# ' + ('-' * 77) copyright = pkg_resources.resource_string( 'qiime2.core.archive.provenance_lib', 'assets/copyright_note.txt' ).decode('utf-8').split('\n') how_to = pkg_resources.resource_string( 'qiime2.core.archive.provenance_lib', 'assets/python_howto.txt' ).decode('utf-8').split('\n') def __init__( self, enable_assertions: bool = False, action_collection_size: int = 2 ): ''' Identical to parent, but with smaller default action_collection_size. Parameters ---------- enable_assertions : bool Whether to render has-line-matching and output type assertions. action_collection_size : int The maximum number of outputs returned by an action above which results are grouped into and destructured from a single variable. ''' super().__init__() self.enable_assertions = enable_assertions self.action_collection_size = action_collection_size self._reset_state(reset_global_imports=True) def _reset_state(self, reset_global_imports=False): ''' Clears all state associated with the usage driver, excepting global imports by default. Parameters ---------- resest_global_imports : bool Whether to reset self.global_imports to an empty set. 
''' self.local_imports = set() self.header = [] self.recorder = [] self.footer = [] self.init_data_refs = dict() if reset_global_imports: self.global_imports = set() def _template_action( self, action: Action, input_opts: UsageInputs, variables: UsageOutputs ): ''' Templates the artifact api python code for the action `action`. Extends the parent method to: - accommodate action signatures that may differ between those found in provenance and those accessible in the currently executing environment - render artifact api code that saves the results to disk. Parameters ---------- action : Action The qiime2 Action object. input_opts : UsageInputs The UsageInputs mapping for the action. variables : UsageOutputs The UsageOutputs object for the action. ''' action_f = action.get_action() if ( len(variables) > self.action_collection_size or len(action_f.signature.outputs) > 5 ): output_vars = 'action_results' else: output_vars = self._template_outputs(action, variables) plugin_id = action.plugin_id action_id = action.action_id lines = [ f'{output_vars} = {plugin_id}_actions.{action_id}(' ] all_inputs = (list(action_f.signature.inputs.keys()) + list(action_f.signature.parameters.keys())) for k, v in input_opts.items(): line = '' if k not in all_inputs: line = self.INDENT + ( '# FIXME: The following parameter name was not found in ' 'your current\n # QIIME 2 environment. This may occur ' 'when the plugin version you have\n # installed does ' 'not match the version used in the original analysis.\n ' ' # Please see the docs and correct the parameter name ' 'before running.\n' ) line += self._template_input(k, v) lines.append(line) lines.append(')') if ( len(variables) > self.action_collection_size or len(action.get_action().signature.outputs) > 5 ): for k, v in variables._asdict().items(): interface_name = v.to_interface_name() lines.append('%s = action_results.%s' % (interface_name, k)) lines.append( '# SAVE: comment out the following with \'# \' to skip saving ' 'Results to disk' ) for k, v in variables._asdict().items(): interface_name = v.to_interface_name() lines.append( '%s.save(\'%s\')' % (interface_name, interface_name,)) lines.append('') self._add(lines) def _template_outputs( self, action: Action, variables: UsageOutputs ) -> str: ''' Extends the parent method to allow the replay an action when a ProvDAG doesn't have a record of all outputs from an action. These unknown outputs are given the conventional '_' variable name. Parameters ---------- action : Action The Action object associated with the output variables. variables : UsageOutputs The UsageOutputs object associated with the action. Returns ------- str The templated output variables names as a comma-separated string. ''' output_vars = [] action_f = action.get_action() # need to coax the outputs into the correct order for unpacking for output in action_f.signature.outputs: try: variable = getattr(variables, output) output_vars.append(str(variable.to_interface_name())) except AttributeError: output_vars.append('_') if len(output_vars) == 1: output_vars.append('') return ', '.join(output_vars).strip() def init_metadata( self, name: str, factory: Callable, dumped_md_fn: str = '' ) -> UsageVariable: ''' Renders the loading of Metadata from disk. Parameters ---------- name : str The name of the created and returned UsageVariable. factory : Callable The factory responsible for constructing the metadata UsageVariable. dumped_md_fn : str Optional. The filename of the dumped metadata. 
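            When a filename is given, the rendered code loads
            '<dumped_md_fn>.tsv' directly; otherwise a bare Metadata.load()
            call is rendered, preceded by a comment asking the user to
            substitute their own already-loaded Metadata (or a
            pandas.DataFrame cast to Metadata).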
Returns ------- UsageVariable The UsageVariable of var_type metadata corresponding to the loaded metadata. ''' var = super().init_metadata(name, factory) self._update_imports(from_='qiime2', import_='Metadata') input_fp = var.to_interface_name() if dumped_md_fn: lines = [f'{input_fp} = Metadata.load("{dumped_md_fn}.tsv")'] else: self.comment( 'NOTE: You may substitute already-loaded Metadata for the ' 'following, or cast a pandas.DataFrame to Metadata as needed.' ) lines = [f'{input_fp} = Metadata.load()'] self._add(lines) return var def import_from_format( self, name: str, semantic_type: str, variable: UsageVariable, view_type: Any = None ): ''' Extends the parent method to: - use '' instead of import_fp for the import filepath - render artifact api code that saves the result to disk. Parameters ---------- name : str The name of the UsageVariable `variable`. semantic_type : str The semantic type of the UsageVariable `variable`. variable : UsageVariable The usage variable object. view_type : str or some format The view type to use for importing. Returns ------- UsageVariable The imported artifact UsageVariable. ''' imported_var = Usage.import_from_format( self, name, semantic_type, variable, view_type=view_type ) interface_name = imported_var.to_interface_name() import_fp = self.repr_raw_variable_name('') lines = [ '%s = Artifact.import_data(' % (interface_name,), self.INDENT + '%r,' % (semantic_type,), self.INDENT + '%r,' % (import_fp,), ] if view_type is not None: if type(view_type) is not str: # Show users where these formats come from when used in the # Python API to make things less 'magical'. import_path = super()._canonical_module(view_type) view_type = view_type.__name__ if import_path is not None: self._update_imports(from_=import_path, import_=view_type) else: # May be in scope already, but something is quite wrong at # this point, so assume the plugin_manager is sufficiently # informed. view_type = repr(view_type) else: view_type = repr(view_type) lines.append(self.INDENT + '%s,' % (view_type,)) lines.extend([ ')', '# SAVE: comment out the following with \'# \' to skip saving this' ' Result to disk', '%s.save(\'%s\')' % (interface_name, interface_name,), '' ]) self._update_imports(from_='qiime2', import_='Artifact') self._add(lines) return imported_var class repr_raw_variable_name: # allows us to repr col name without enclosing quotes # (as in qiime2.qiime2.plugins.ArtifactAPIUsageVariable) def __init__(self, value): self.value = value def __repr__(self): return self.value def comment(self, line: str): ''' Communicate that a comment should be rendered. Parameters ---------- line : str The comment to be rendered. ''' LINE_LEN = 79 lines = textwrap.wrap( line, LINE_LEN, break_long_words=False, initial_indent='# ', subsequent_indent='# ' ) lines.append('') self._add(lines) def render(self, flush: bool = False) -> str: ''' Return a newline-seperated string of Artifact API python code. Parameters ---------- flush : bool Whether to 'flush' the current code. Importantly, this will clear the top-line imports for future invocations. Returns ------- str The rendered string of python code. 
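
        Examples
        --------
        The rendered text is ordinary Artifact API code. For example
        (illustrative only; plugin, action, and variable names depend on the
        provenance being replayed):

            action_results = plugin_actions.some_action(some_input)
            out_0 = action_results.out
            out_0.save('out_0')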
''' sorted_imps = sorted(self.local_imports) if self.header: self.header = self.header + [''] if self.footer: self.footer = [''] + self.footer if sorted_imps: sorted_imps = sorted_imps + [''] rendered = '\n'.join( self.header + sorted_imps + self.recorder + self.footer ) if flush: self._reset_state() return rendered def build_header(self): '''Constructs a renderable header from its components.''' self.header.extend(build_header( self.shebang, self.header_boundary, self.copyright, self.how_to )) def build_footer(self, dag: ProvDAG): ''' Constructs a renderable footer using the terminal uuids of a ProvDAG. ''' self.footer.extend(build_footer(dag, self.header_boundary)) qiime2-2024.5.0/qiime2/core/archive/provenance_lib/util.py000066400000000000000000000056021462552636000232220ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import codecs import pathlib import re import warnings from typing import Tuple from zipfile import ZipFile def get_root_uuid(zf: ZipFile) -> str: ''' Returns the root UUID of a QIIME 2 Archive. Parameters ---------- zf : ZipFile The zipfile object of an archive. Returns ------- str The uuid of the root artifact in the archive. ''' return pathlib.Path(zf.namelist()[0]).parts[0] def get_nonroot_uuid(fp: pathlib.Path) -> str: ''' For non-root provenance files, get the Result's uuid from its path. Parameters ---------- fp : pathlib.Path The path to a file in a non-root artifact inside an archive, relative to archive root. Returns ------- str The uuid of the non-root artifact. ''' if fp.name == 'action.yaml': return fp.parts[-3] return fp.parts[-2] _VERSION_MATCHER = ( r'QIIME 2\n' r'archive: [0-9]{1,2}$\n' r'framework: ' r'(?:20[0-9]{2}|2)\.(?:[1-9][0-2]?|0)\.[0-9](?:\.dev[0-9]?)?' r'(?:\+[.\w]+)?\Z' ) def parse_version(zf: ZipFile) -> Tuple[str, str]: ''' Finds and parses the VERSION file inside of an archive. Parameters ---------- zf : ZipFile The zipfile object of an archive. Returns ------- tuple of (str, str) The archive version and framework version of the archive. ''' uuid = get_root_uuid(zf) version_fp = pathlib.Path(uuid) / 'VERSION' try: with zf.open(str(version_fp)) as v_fp: version_contents = str(v_fp.read().strip(), 'utf-8') except KeyError: raise ValueError( f'Malformed Archive: VERSION file for node {uuid} misplaced ' f'or nonexistent.\nArchive {zf.filename} may be corrupt or ' 'provenance may be false.' 
) if not re.match(_VERSION_MATCHER, version_contents, re.MULTILINE): warnings.filterwarnings( 'ignore', 'invalid escape sequence', DeprecationWarning ) version_match_repr = codecs.decode( _VERSION_MATCHER.encode('utf-8'), 'unicode-escape' ) raise ValueError( f'Malformed Archive: VERSION file out of spec in {zf.filename}.\n' f'Should match this regular expression:\n{version_match_repr}\n' f'Actually looks like:\n{version_contents}\n' ) _, archive_version, framework_version = [ line.strip().split()[-1] for line in version_contents.split(sep='\n') if line ] return (archive_version, framework_version) qiime2-2024.5.0/qiime2/core/archive/tests/000077500000000000000000000000001462552636000200445ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/archive/tests/__init__.py000066400000000000000000000005351462552636000221600ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- qiime2-2024.5.0/qiime2/core/archive/tests/test_archiver.py000066400000000000000000000313421462552636000232630ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import os import tempfile import unittest import uuid import zipfile import pathlib from qiime2.core.archive import Archiver from qiime2.core.archive import ImportProvenanceCapture from qiime2.core.archive.archiver import _ZipArchive, ArchiveCheck from qiime2.core.archive.format.util import artifact_version from qiime2.core.testing.format import IntSequenceDirectoryFormat from qiime2.core.testing.type import IntSequence1 from qiime2.core.testing.util import ArchiveTestingMixin from qiime2.core.util import is_uuid4, set_permissions, OTHER_NO_WRITE class TestArchiver(unittest.TestCase, ArchiveTestingMixin): def setUp(self): prefix = "qiime2-test-temp-" self.temp_dir = tempfile.TemporaryDirectory(prefix=prefix) # Initialize an Archiver. The values passed to the constructor mostly # don't matter to the Archiver, but we'll pass valid Artifact test data # anyways in case Archiver's behavior changes in the future. def data_initializer(data_dir): fp = os.path.join(str(data_dir), 'ints.txt') with open(fp, 'w') as fh: fh.write('1\n') fh.write('2\n') fh.write('3\n') self.archiver = Archiver.from_data( IntSequence1, IntSequenceDirectoryFormat, data_initializer=data_initializer, provenance_capture=ImportProvenanceCapture()) def tearDown(self): self.temp_dir.cleanup() def test_save_invalid_filepath(self): # Empty filepath. with self.assertRaisesRegex(FileNotFoundError, 'No such file'): self.archiver.save('') # Directory. with self.assertRaisesRegex(IsADirectoryError, 'directory'): self.archiver.save(self.temp_dir.name) # Ends with path separator (no basename, e.g. /tmp/foo/). 
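        # Note: the exact error raised for a trailing-separator path under a
        # missing parent directory can differ by platform and Python version,
        # which is presumably why both exception types are accepted below.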
with self.assertRaises((IsADirectoryError, FileNotFoundError)): self.archiver.save(os.path.join(self.temp_dir.name, 'foo', '')) def test_save_excludes_dotfiles_in_data_dir(self): def data_initializer(data_dir): data_dir = str(data_dir) fp = os.path.join(data_dir, 'ints.txt') with open(fp, 'w') as fh: fh.write('1\n') fh.write('2\n') fh.write('3\n') hidden_fp = os.path.join(data_dir, '.hidden-file') with open(hidden_fp, 'w') as fh: fh.write("You can't see me if I can't see you\n") hidden_dir = os.path.join(data_dir, '.hidden-dir') os.mkdir(hidden_dir) with open(os.path.join(hidden_dir, 'ignored-file'), 'w') as fh: fh.write("I'm ignored because I live in a hidden dir :(\n") archiver = Archiver.from_data( IntSequence1, IntSequenceDirectoryFormat, data_initializer=data_initializer, provenance_capture=ImportProvenanceCapture()) fp = os.path.join(self.temp_dir.name, 'archive.zip') archiver.save(fp) root_dir = str(archiver.uuid) expected = { 'VERSION', 'checksums.md5', 'metadata.yaml', 'data/ints.txt', 'provenance/metadata.yaml', 'provenance/VERSION', 'provenance/citations.bib', 'provenance/action/action.yaml' } self.assertArchiveMembers(fp, root_dir, expected) def test_save_archive_members(self): fp = os.path.join(self.temp_dir.name, 'archive.zip') self.archiver.save(fp) root_dir = str(self.archiver.uuid) expected = { 'VERSION', 'checksums.md5', 'metadata.yaml', 'data/ints.txt', 'provenance/metadata.yaml', 'provenance/VERSION', 'provenance/citations.bib', 'provenance/action/action.yaml' } self.assertArchiveMembers(fp, root_dir, expected) def test_load_archive(self): fp = os.path.join(self.temp_dir.name, 'archive.zip') self.archiver.save(fp) archiver = Archiver.load(fp) self.assertEqual(archiver.uuid, self.archiver.uuid) self.assertEqual(archiver.type, IntSequence1) self.assertEqual(archiver.format, IntSequenceDirectoryFormat) self.assertEqual({str(p.relative_to(archiver.data_dir)) for p in archiver.data_dir.iterdir()}, {'ints.txt'}) def test_load_ignores_root_dotfiles(self): fp = os.path.join(self.temp_dir.name, 'archive.zip') self.archiver.save(fp) # Add some dotfiles to the archive. with zipfile.ZipFile(fp, mode='a') as zf: zf.writestr('.DS_Store', "The world's most beloved file\n") zf.writestr('.hidden-file', "You can't see me if I can't see you\n") zf.writestr('.hidden-dir/ignored-file', "I'm ignored because I live in a hidden dir :(\n") # Assert the expected files exist in the archive to verify this test # case is testing what we want it to. with zipfile.ZipFile(fp, mode='r') as zf: root_dir = str(self.archiver.uuid) expected = { '.DS_Store', '.hidden-file', '.hidden-dir/ignored-file', '%s/VERSION' % root_dir, '%s/checksums.md5' % root_dir, '%s/metadata.yaml' % root_dir, '%s/data/ints.txt' % root_dir, '%s/provenance/metadata.yaml' % root_dir, '%s/provenance/VERSION' % root_dir, '%s/provenance/citations.bib' % root_dir, '%s/provenance/action/action.yaml' % root_dir } observed = set(zf.namelist()) # Not using self.assertArchiveMembers() because it accepts paths # relative to root_dir, and we have extra paths at the same level # as root_dir. 
self.assertEqual(observed, expected) archiver = Archiver.load(fp) self.assertEqual(archiver.uuid, self.archiver.uuid) self.assertEqual(archiver.type, IntSequence1) self.assertEqual(archiver.format, IntSequenceDirectoryFormat) self.assertEqual({str(p.relative_to(archiver.data_dir)) for p in archiver.data_dir.iterdir()}, {'ints.txt'}) def test_load_empty_archive(self): fp = os.path.join(self.temp_dir.name, 'empty.zip') with zipfile.ZipFile(fp, mode='w') as zf: pass with zipfile.ZipFile(fp, mode='r') as zf: expected = set() observed = set(zf.namelist()) self.assertEqual(observed, expected) with self.assertRaisesRegex(ValueError, 'visible root directory'): Archiver.load(fp) def test_load_dotfile_only_archive(self): fp = os.path.join(self.temp_dir.name, 'dotfiles-only.zip') with zipfile.ZipFile(fp, mode='w') as zf: zf.writestr('.DS_Store', "The world's most beloved file\n") zf.writestr('.hidden-file', "You can't see me if I can't see you\n") zf.writestr('.hidden-dir/ignored-file', "I'm ignored because I live in a hidden dir :(\n") with zipfile.ZipFile(fp, mode='r') as zf: expected = { '.DS_Store', '.hidden-file', '.hidden-dir/ignored-file' } observed = set(zf.namelist()) self.assertEqual(observed, expected) with self.assertRaisesRegex(ValueError, 'visible root directory'): Archiver.load(fp) def test_load_multiple_root_dirs(self): fp = os.path.join(self.temp_dir.name, 'multiple-root-dirs.zip') self.archiver.save(fp) # Add another semi-valid root dir. second_root_dir = str(uuid.uuid4()) with zipfile.ZipFile(fp, mode='a') as zf: zf.writestr('%s/VERSION' % second_root_dir, "foo") with zipfile.ZipFile(fp, mode='r') as zf: root_dir = str(self.archiver.uuid) expected = { '%s/VERSION' % root_dir, '%s/checksums.md5' % root_dir, '%s/metadata.yaml' % root_dir, '%s/data/ints.txt' % root_dir, '%s/provenance/metadata.yaml' % root_dir, '%s/provenance/VERSION' % root_dir, '%s/provenance/citations.bib' % root_dir, '%s/provenance/action/action.yaml' % root_dir, '%s/VERSION' % second_root_dir } observed = set(zf.namelist()) self.assertEqual(observed, expected) with self.assertRaisesRegex(ValueError, 'multiple root directories'): Archiver.load(fp) def test_load_invalid_uuid4_root_dir(self): _uuid = uuid.uuid4() fp = pathlib.Path(self.temp_dir.name) / 'invalid-uuid4' zp = pathlib.Path(self.temp_dir.name) / 'bad.zip' (fp / str(_uuid)).mkdir(parents=True) # Invalid uuid4 taken from https://gist.github.com/ShawnMilo/7777304 root_dir = '89eb3586-8a82-47a4-c911-758a62601cf7' record = _ZipArchive.setup(_uuid, fp / str(_uuid), 'foo', 'bar') (fp / str(record.uuid)).rename(fp / root_dir) _ZipArchive.save(fp, zp) with self.assertRaisesRegex(ValueError, 'root directory.*valid version 4 UUID'): _ZipArchive(zp) def test_is_uuid4_valid(self): uuid_str = str(uuid.uuid4()) self.assertTrue(is_uuid4(uuid_str)) def test_parse_uuid_invalid(self): # Invalid uuid4 taken from https://gist.github.com/ShawnMilo/7777304 uuid_str = '89eb3586-8a82-47a4-c911-758a62601cf7' self.assertFalse(is_uuid4(uuid_str)) # Not a UUID. uuid_str = 'abc123' self.assertFalse(is_uuid4(uuid_str)) # Other UUID versions. for uuid_ in (uuid.uuid1(), uuid.uuid3(uuid.NAMESPACE_DNS, 'foo'), uuid.uuid5(uuid.NAMESPACE_DNS, 'bar')): uuid_str = str(uuid_) self.assertFalse(is_uuid4(uuid_str)) def test_checksums_match(self): diff = self.archiver.validate_checksums() self.assertEqual(diff.added, {}) self.assertEqual(diff.removed, {}) self.assertEqual(diff.changed, {}) def test_checksums_mismatch(self): # We set everything in the artifact to be read-only. 
This test needs to # mimic if the user were to somehow write it anyway, so we set write # for self and group set_permissions(self.archiver.root_dir, OTHER_NO_WRITE, OTHER_NO_WRITE) with (self.archiver.root_dir / 'data' / 'ints.txt').open('w') as fh: fh.write('999\n') with (self.archiver.root_dir / 'tamper.txt').open('w') as fh: fh.write('extra file') (self.archiver.root_dir / 'VERSION').unlink() diff = self.archiver.validate_checksums() self.assertEqual(diff.added, {'tamper.txt': '296583001b00d2b811b5871b19e0ad28'}) # The contents of most files is either stochastic, or has the current # version (which is an unknown commit sha1), so just check name self.assertEqual(list(diff.removed.keys()), ['VERSION']) self.assertEqual(diff.changed, {'data/ints.txt': ('c0710d6b4f15dfa88f600b0e6b624077', 'f47bc36040d5c7db08e4b3a457dcfbb2') }) def test_checksum_backwards_compat(self): self.tearDown() with artifact_version(4): self.setUp() diff = self.archiver.validate_checksums() self.assertEqual(diff.added, {}) self.assertEqual(diff.removed, {}) self.assertEqual(diff.changed, {}) def test_archive_check(self): """Rough test of our machinery to support showing visualizations in Jupyter notebooks without actually spoofing the notebook """ archive = ArchiveCheck(self.archiver.path) # Make sure this _get_uuid actually works self.assertEqual(archive._get_uuid(), archive.uuid) expected = set([ 'metadata.yaml', 'data', 'checksums.md5', 'provenance', 'VERSION' ]) observed = set(file for file in archive.relative_iterdir()) self.assertEqual(observed, expected) if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/core/archive/tests/test_citations.py000066400000000000000000000071351462552636000234600ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import unittest import qiime2 from qiime2.core.testing.type import IntSequence1 from qiime2.core.testing.util import get_dummy_plugin class TestCitationsTracked(unittest.TestCase): def setUp(self): self.plugin = get_dummy_plugin() def test_import(self): data = qiime2.Artifact.import_data(IntSequence1, [1, 2, 3, 4]) archiver = data._archiver expected = [ ('framework|qiime2:%s|0' % qiime2.__version__, 'Reproducible, interactive, scalable and extensible microbiome ' 'data science using QIIME 2'), ('plugin|dummy-plugin:0.0.0-dev|0', 'Does knuckle cracking lead to arthritis of the fingers?'), ('plugin|dummy-plugin:0.0.0-dev|1', 'Of flying frogs and levitrons'), ('transformer|dummy-plugin:0.0.0-dev|' 'builtins:list->IntSequenceDirectoryFormat|0', 'An in-depth analysis of a piece of shit: distribution of' ' Schistosoma mansoni and hookworm eggs in human stool'), ('view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0', 'Walking with coffee: Why does it spill?')] obs = list(map(lambda item: (item[0], item[1].fields['title']), archiver.citations.items())) self.assertEqual(obs, expected) with (archiver.provenance_dir / 'action' / 'action.yaml').open() as fh: action_yaml = fh.read() for key, _ in expected: self.assertIn('!cite %r' % key, action_yaml) def test_action(self): data = qiime2.Artifact.import_data(IntSequence1, [1, 2, 3, 4]) action = self.plugin.methods['split_ints'] left, right = action(data) archiver = left._archiver expected = [ ('framework|qiime2:%s|0' % qiime2.__version__, 'Reproducible, interactive, scalable and extensible microbiome ' 'data science using QIIME 2'), ('action|dummy-plugin:0.0.0-dev|method:split_ints|0', 'Sword swallowing and its side effects'), ('action|dummy-plugin:0.0.0-dev|method:split_ints|1', 'Response behaviors of Svalbard reindeer towards humans and' ' humans disguised as polar bears on Edge\u00f8ya'), ('plugin|dummy-plugin:0.0.0-dev|0', 'Does knuckle cracking lead to arthritis of the fingers?'), ('plugin|dummy-plugin:0.0.0-dev|1', 'Of flying frogs and levitrons'), ('view|dummy-plugin:0.0.0-dev|IntSequenceDirectoryFormat|0', 'Walking with coffee: Why does it spill?'), ('transformer|dummy-plugin:0.0.0-dev|' 'builtins:list->IntSequenceDirectoryFormat|0', 'An in-depth analysis of a piece of shit: distribution of' ' Schistosoma mansoni and hookworm eggs in human stool')] obs = list(map(lambda item: (item[0], item[1].fields['title']), archiver.citations.items())) self.assertEqual(obs, expected) with (archiver.provenance_dir / 'action' / 'action.yaml').open() as fh: action_yaml = fh.read() for key, _ in expected: self.assertIn('!cite %r' % key, action_yaml) if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/core/archive/tests/test_provenance.py000066400000000000000000000260311462552636000236170ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import unittest import re import unittest.mock as mock import pandas as pd import qiime2 from qiime2.plugins import dummy_plugin from qiime2.core.testing.type import IntSequence1, Mapping import qiime2.core.archive.provenance as provenance class TestProvenanceIntegration(unittest.TestCase): def test_chain_with_metadata(self): df = pd.DataFrame({'a': ['1', '2', '3']}, index=pd.Index(['0', '1', '2'], name='feature ID')) a = qiime2.Artifact.import_data('IntSequence1', [1, 2, 3]) m = qiime2.Metadata(df) mc = qiime2.CategoricalMetadataColumn(df['a']) b = dummy_plugin.actions.identity_with_metadata(a, m).out c = dummy_plugin.actions.identity_with_metadata_column(b, mc).out p_dir = c._archiver.provenance_dir new_m = qiime2.Metadata.load( str(p_dir / 'artifacts' / str(b.uuid) / 'action' / 'metadata.tsv')) pd.testing.assert_frame_equal(m.to_dataframe(), new_m.to_dataframe()) with (p_dir / 'action' / 'metadata.tsv').open() as fh: self.assertEqual( fh.read(), 'feature ID\ta\n#q2:types\tcategorical\n0\t1\n1\t2\n2\t3\n') def test_chain_with_artifact_metadata(self): metadata_artifact_1 = qiime2.Artifact.import_data( 'Mapping', {'a': 'foo', 'b': 'bar'}) metadata_artifact_2 = qiime2.Artifact.import_data( 'Mapping', {'c': 'baz'}) m = metadata_artifact_1.view(qiime2.Metadata) mc = metadata_artifact_2.view(qiime2.Metadata).get_column('c') a = qiime2.Artifact.import_data('IntSequence1', [1, 2, 3]) b = dummy_plugin.actions.identity_with_metadata(a, m).out c = dummy_plugin.actions.identity_with_metadata_column(b, mc).out p_dir = c._archiver.provenance_dir m_yaml_value = "%s:metadata.tsv" % metadata_artifact_1.uuid mc_yaml_value = "%s:metadata.tsv" % metadata_artifact_2.uuid # Check action files for uuid-metadata values with (p_dir / 'action' / 'action.yaml').open() as fh: self.assertIn(mc_yaml_value, fh.read()) with (p_dir / 'artifacts' / str(b.uuid) / 'action' / 'action.yaml').open() as fh: self.assertIn(m_yaml_value, fh.read()) # Check that metadata is written out fully new_m = qiime2.Metadata.load( str(p_dir / 'artifacts' / str(b.uuid) / 'action' / 'metadata.tsv')) pd.testing.assert_frame_equal(m.to_dataframe(), new_m.to_dataframe()) # Check that provenance of originating metadata artifact exists self.assertTrue((p_dir / 'artifacts' / str(metadata_artifact_1.uuid) / 'action' / 'action.yaml').exists()) self.assertTrue((p_dir / 'artifacts' / str(metadata_artifact_2.uuid) / 'action' / 'action.yaml').exists()) def test_chain_with_merged_artifact_metadata(self): md_artifact1 = qiime2.Artifact.import_data( 'Mapping', {'a': 'foo', 'b': 'bar'}) md_artifact2 = qiime2.Artifact.import_data( 'Mapping', {'c': 'baz'}) md1 = md_artifact1.view(qiime2.Metadata) md2 = md_artifact2.view(qiime2.Metadata) merged_md = md1.merge(md2) merged_mdc = merged_md.get_column('c') a = qiime2.Artifact.import_data('IntSequence1', [1, 2, 3]) b = dummy_plugin.actions.identity_with_metadata(a, merged_md).out c = dummy_plugin.actions.identity_with_metadata_column( b, merged_mdc).out p_dir = c._archiver.provenance_dir yaml_value = "%s,%s:metadata.tsv" % (md_artifact1.uuid, md_artifact2.uuid) # Check action files for uuid-metadata values with (p_dir / 'action' / 'action.yaml').open() as fh: self.assertIn(yaml_value, fh.read()) with (p_dir / 'artifacts' / str(b.uuid) / 'action' / 'action.yaml').open() as fh: self.assertIn(yaml_value, fh.read()) # Check that metadata is written out fully with (p_dir / 'action' / 'metadata.tsv').open() as fh: self.assertEqual(fh.read(), 
'id\tc\n#q2:types\tcategorical\n0\tbaz\n') new_merged_md = qiime2.Metadata.load( str(p_dir / 'artifacts' / str(b.uuid) / 'action' / 'metadata.tsv')) pd.testing.assert_frame_equal(new_merged_md.to_dataframe(), merged_md.to_dataframe()) # Check that provenance of originating metadata artifacts exists self.assertTrue((p_dir / 'artifacts' / str(md_artifact1.uuid) / 'action' / 'action.yaml').exists()) self.assertTrue((p_dir / 'artifacts' / str(md_artifact2.uuid) / 'action' / 'action.yaml').exists()) def test_with_optional_artifacts(self): ints1 = qiime2.Artifact.import_data(IntSequence1, [0, 42, 43]) ints2 = qiime2.Artifact.import_data(IntSequence1, [99, -22]) # One optional artifact is provided (`optional1`) while `optional2` is # omitted. obs = dummy_plugin.actions.optional_artifacts_method( ints1, 42, optional1=ints2).output p_dir = obs._archiver.provenance_dir with (p_dir / 'action' / 'action.yaml').open() as fh: yaml = fh.read() self.assertIn('ints: %s' % ints1.uuid, yaml) self.assertIn('optional1: %s' % ints2.uuid, yaml) self.assertIn('optional2: null', yaml) self.assertIn('num1: 42', yaml) self.assertIn('num2: null', yaml) self.assertTrue((p_dir / 'artifacts' / str(ints1.uuid) / 'action' / 'action.yaml').exists()) self.assertTrue((p_dir / 'artifacts' / str(ints2.uuid) / 'action' / 'action.yaml').exists()) def test_output_name_different(self): ints = qiime2.Artifact.import_data(IntSequence1, [0, 1, 2, 3]) left, right = dummy_plugin.actions.split_ints(ints) left_p_dir = left._archiver.provenance_dir right_p_dir = right._archiver.provenance_dir with (left_p_dir / 'action' / 'action.yaml').open() as fh: left_yaml = fh.read() with (right_p_dir / 'action' / 'action.yaml').open() as fh: right_yaml = fh.read() self.assertNotEqual(left_yaml, right_yaml) self.assertIn('output-name: left', left_yaml) self.assertIn('output-name: right', right_yaml) def test_output_name_visualization(self): viz, = dummy_plugin.actions.no_input_viz() viz_p_dir = viz._archiver.provenance_dir with (viz_p_dir / 'action' / 'action.yaml').open() as fh: self.assertIn('output-name: visualization', fh.read()) def test_no_output_name_import(self): ints = qiime2.Artifact.import_data(IntSequence1, [0, 2, 4]) ints_p_dir = ints._archiver.provenance_dir with (ints_p_dir / 'action' / 'action.yaml').open() as fh: self.assertNotIn('output-name:', fh.read()) def test_pipeline_alias_of(self): ints = qiime2.Artifact.import_data(IntSequence1, [1, 2, 3]) mapping = qiime2.Artifact.import_data(Mapping, {'foo': '42'}) r = dummy_plugin.actions.typical_pipeline(ints, mapping, False) # mapping is a pass-through new_mapping = r.out_map new_mapping_p_dir = new_mapping._archiver.provenance_dir with (new_mapping_p_dir / 'action' / 'action.yaml').open() as fh: new_mapping_yaml = fh.read() # Basic sanity check self.assertIn('type: pipeline', new_mapping_yaml) self.assertIn('int_sequence: %s' % ints.uuid, new_mapping_yaml) self.assertIn('mapping: %s' % mapping.uuid, new_mapping_yaml) # Remembers the original mapping uuid self.assertIn('alias-of: %s' % mapping.uuid, new_mapping_yaml) def test_nested_pipeline_alias_of(self): ints = qiime2.Artifact.import_data(IntSequence1, [1, 2, 3]) mapping = qiime2.Artifact.import_data(Mapping, {'foo': '42'}) r = dummy_plugin.actions.pipelines_in_pipeline(ints, mapping) right_p_dir = r.right._archiver.provenance_dir with (right_p_dir / 'action' / 'action.yaml').open() as fh: right_yaml = fh.read() self.assertIn('type: pipeline', right_yaml) self.assertIn('action: pipelines_in_pipeline', right_yaml) 
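        # The remainder of this test walks the alias chain: each pipeline
        # layer's action.yaml records an 'alias-of' UUID pointing at the
        # result it aliases one level down, and each hop is resolved by
        # reading provenance/artifacts/<uuid>/action/action.yaml until the
        # concrete split_ints method record is reached.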
self.assertIn('int_sequence: %s' % ints.uuid, right_yaml) match = re.search(r'alias\-of: ([a-zA-Z0-9\-]+)$', right_yaml, flags=re.MULTILINE) first_alias = match.group(1) with (right_p_dir / 'artifacts' / first_alias / 'action' / 'action.yaml').open() as fh: first_alias_yaml = fh.read() # Should be the same input self.assertIn('type: pipeline', first_alias_yaml) self.assertIn('int_sequence: %s' % ints.uuid, first_alias_yaml) self.assertIn('action: typical_pipeline', first_alias_yaml) match = re.search(r'alias\-of: ([a-zA-Z0-9\-]+)$', first_alias_yaml, flags=re.MULTILINE) second_alias = match.group(1) with (right_p_dir / 'artifacts' / second_alias / 'action' / 'action.yaml').open() as fh: actual_method_yaml = fh.read() self.assertIn('type: method', actual_method_yaml) self.assertIn('ints: %s' % ints.uuid, actual_method_yaml) self.assertIn('action: split_ints', actual_method_yaml) def test_unioned_primitives(self): r = dummy_plugin.actions.unioned_primitives(3, 2) prov_dir = r.out._archiver.provenance_dir with (prov_dir / 'action' / 'action.yaml').open() as fh: prov_yml = fh.read() self.assertIn('foo: 3', prov_yml) self.assertIn('bar: 2', prov_yml) @mock.patch('qiime2.core.archive.provenance.tzlocal.get_localzone', side_effect=ValueError()) def test_ts_to_date(self, mocked_tzlocal): q2_paper_date = 1563984000 obs = str(provenance._ts_to_date(q2_paper_date)) exp = "2019-07-24 16:00:00+00:00" self.assertEqual(obs, exp) self.assertTrue(mocked_tzlocal.called) def test_prov_rename(self): viz, = dummy_plugin.actions.no_input_viz() viz_p_dir = viz._archiver.provenance_dir self.assertTrue(viz_p_dir.exists()) @mock.patch('qiime2.core.path.ProvenancePath.rename', side_effect=FileExistsError) def test_prov_rename_file_exists(self, _): viz, = dummy_plugin.actions.no_input_viz() viz_p_dir = viz._archiver.provenance_dir with (viz_p_dir / 'action' / 'action.yaml').open() as fh: self.assertIn('output-name: visualization', fh.read()) if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/core/cache.py000066400000000000000000002071131462552636000167020ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- """ The cache is used to store unzipped data on disk in a predictable and user controlled location. This allows us to skip constantly zipping and unzipping large amounts of data and taking up CPU time when storage space is not an issue. It also allows us to know exactly what data has been created and where. By default, a cache will be created under $TMPDIR/qiime2/$USER and all intermediate data created by QIIME 2 as it executes will be written into that directory. This means QIIME 2 reserves usage of the $TMPDIR/qiime2 directory. The user may also specify a new location to be used in place of this default directory. This location must meet a few criteria. **1.** It must be writable from any and all locations the QIIME 2 command intending to use it will be running. This means that in an HPC context, the location specified for the cache must be writable from the node QIIME 2 will be executing on. **2.** It must either not exist or already be a cache. The first time a directory is specified to be used as a cache, it should not exist. 
QIIME 2 will create a cache structure on disk at that location. Any existing directory you attempt to use as a cache should have been created as a cache by QIIME 2. """ import re import os import stat import yaml import time import atexit import psutil import shutil import getpass import pathlib import weakref import tempfile import warnings import threading from sys import maxsize from random import randint from datetime import timedelta import flufl.lock import qiime2 from .path import ArchivePath from qiime2.sdk.result import Result from qiime2.core.util import (is_uuid4, set_permissions, touch_under_path, load_action_yaml, READ_ONLY_FILE, READ_ONLY_DIR, USER_GROUP_RWX) from qiime2.core.archive.archiver import Archiver from qiime2.core.type import HashableInvocation, IndexedCollectionElement _VERSION_TEMPLATE = """\ QIIME 2 cache: %s framework: %s """ # Thread local indicating the cache to use _CACHE = threading.local() _CACHE.cache = None _CACHE.temp_cache = None # TODO: Do we want this on the thread local? I feel like maybe we do # Keep track of every cache used by this process for cleanup later USED_CACHES = set() # These permissions are directory with sticky bit and rwx for all set EXPECTED_PERMISSIONS = 0o41777 def get_cache(): """Gets the cache we have instructed QIIME 2 to use in this invocation. By default this is a cache located at $TMPDIR/qiime2/$USER, but if the user has set a cache it is the cache they set. This is used by various parts of the framework to determine what cache they should be saving to/loading from. Returns ------- Cache The cache QIIME 2 is using for the current invocation. Examples -------- >>> test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') >>> cache_path = os.path.join(test_dir.name, 'cache') >>> cache = Cache(cache_path) >>> # get_cache() will return the temp cache, not the one we just made. >>> get_cache() == cache False >>> # After withing in the cache we just made, get_cache() will return it. >>> with cache: ... get_cache() == cache True >>> # Now that we have exited our cache, we will get the temp cache again. >>> get_cache() == cache False >>> test_dir.cleanup() """ # If we are on a new thread we may in fact not have a cache attribute here # at all if not hasattr(_CACHE, 'cache') or _CACHE.cache is None: if not hasattr(_CACHE, 'temp_cache') or _CACHE.temp_cache is None: _CACHE.temp_cache = Cache() return _CACHE.temp_cache return _CACHE.cache def _get_temp_path(): """Get path to temp cache if the user did not specify a named cache. This function will create the path if it does not exist and ensure it is suitable for use as a cache if it does. Returns ------- str The path created for the temp cache. """ tmpdir = tempfile.gettempdir() cache_dir = os.path.join(tmpdir, 'qiime2') # Make sure the sticky bit is set on the cache directory. Documentation on # what a sticky bit is can be found here # https://docs.python.org/3/library/stat.html#stat.S_ISVTX We also set # read/write/execute permissions for everyone on this directory. We only do # this if we are the owner of the /tmp/qiime2 directory or in other words # the first person to run QIIME 2 with this /tmp since the /tmp was wiped if not os.path.exists(cache_dir): try: os.mkdir(cache_dir) except FileExistsError: # we know that it didn't exist a moment ago, so we're probably # about to set it up in a different process time.sleep(0.5) # this sleep is to give the first process enough time to create # a cache object which we will then re-use. 
Ideally this would # be handled with a lock, but we don't have anywhere to put it # yet. Since this is the kind of thing that can only happen when # QIIME 2 has to create a new temp cache and there's a race for it # this small hack seems not too bad. else: # skip this if there was an error we ignored sticky_permissions = stat.S_ISVTX | stat.S_IRWXU | stat.S_IRWXG \ | stat.S_IRWXO os.chmod(cache_dir, sticky_permissions) elif os.stat(cache_dir).st_mode != EXPECTED_PERMISSIONS: raise ValueError(f"Directory '{cache_dir}' already exists without " f"proper permissions '{oct(EXPECTED_PERMISSIONS)}' " "set. Current permissions are " f"'{oct(os.stat(cache_dir).st_mode)}.' This most " "likely means something other than QIIME 2 created " f"the directory '{cache_dir}' or QIIME 2 failed " f"between creating '{cache_dir}' and setting " "permissions on it.") user = _get_user() user_dir = os.path.join(cache_dir, user) # It is conceivable that we already have a path matching this username that # belongs to another uid, if we do then we want to create a garbage name # for the temp cache that will be used by this user if os.path.exists(user_dir) and os.stat(user_dir).st_uid != os.getuid(): uid_name = _get_uid_cache_name() # This really shouldn't happen if user == uid_name: raise ValueError(f'Temp cache for uid path {user} already exists ' 'but does not belong to us.') user_dir = os.path.join(cache_dir, uid_name) return user_dir def _get_user(): """Get the uname for our default cache. Internally getpass.getuser is getting the uid then looking up the username associated with it. This could fail it we are running inside a container because the container is looking for its parent's uid in its own /etc/passwd which is unlikely to contain a user associated with that uid. If that failure does occur, we create an alternate default username. Returns ------- str The value we will be using as uname for our default cache. """ try: return getpass.getuser() except KeyError: return _get_uid_cache_name() def _get_uid_cache_name(): """Create an esoteric name that is unlikely to be the name of a real user in cases were getpass.getuser fails. This name is of the form uid=# which should be consistent across invocations of this function by the same user. Returns ------- str The aforementioned stand in name. """ return f'uid=#{os.getuid()}' @atexit.register def _exit_cleanup(): """When the process ends, for each cache used by this process we remove the process pool created by this process then run garbage collection. """ for cache in USED_CACHES: target = cache.processes / os.path.basename(cache.process_pool.path) # There are several legitimate reasons the path could not exist. It # happens during our cache tests when the entire cache is nuked in the # end. It also happens in asynchronous runs where the worker process # does not create a process pool (on Mac this atexit is invoked on # workers). It could also happen if someone deleted the process pool # but... They probably shouldn't do that try: cache.lock.__enter__() except Exception: continue else: try: if os.path.exists(target): shutil.rmtree(target) cache.garbage_collection() finally: cache.lock.__exit__() def monitor_thread(cache_dir, is_done): """MacOS reaps temp files that are three days old or older. This function will be running in a separate daemon and making sure MacOS doesn't cull anything still needed by a long running process by periodically updating the last accessed times on all files in the cache by touching them every six hours. 
The daemon running this function will be terminated when the process that invoked it ends. Parameters ---------- cache_dir : str or PathLike object The path to the cache that invoked this daemon. is_done : threading.Event The process that invoked this daemon sets this flag on exit to notify this daemon to terminate. """ while not is_done.is_set(): touch_under_path(cache_dir) time.sleep(60 * 60 * 6) # This is very important to our trademark tm = object class MEGALock(tm): """We need to lock out other processes with flufl, but we also need to lock out other threads with a Python thread lock (because parsl threadpools), so we put them together in one MEGALock(tm) """ def __init__(self, flufl_fp, lifetime): self.flufl_fp = flufl_fp self.lifetime = lifetime self.re_entries = 0 self.thread_lock = threading.Lock() self.flufl_lock = flufl.lock.Lock(flufl_fp, lifetime=lifetime) def __enter__(self): """ We acquire the thread lock first because the flufl lock isn't thread-safe which is why we need both locks in the first place """ if self.re_entries == 0: self.thread_lock.acquire() try: self.flufl_lock.lock() except Exception: self.thread_lock.release() raise self.re_entries += 1 def __exit__(self, *args): if self.re_entries > 0: self.re_entries -= 1 if self.re_entries == 0: self.flufl_lock.unlock() self.thread_lock.release() def __getstate__(self): lockless_dict = self.__dict__.copy() del lockless_dict['thread_lock'] del lockless_dict['flufl_lock'] return lockless_dict def __setstate__(self, state): self.__dict__.update(state) self.thread_lock = threading.Lock() self.flufl_lock = \ flufl.lock.Lock(self.flufl_fp, lifetime=self.lifetime) class Cache: """General structure of the cache: :: artifact_cache/ ├── data/ │ ├── uuid1/ │ ├── uuid2/ │ ├── uuid3/ │ └── uuid4/ ├── keys/ │ ├── bar.yaml │ ├── baz.yaml │ └── foo.yaml ├── pools/ │ └── puuid1/ │ ├── uuid1 -> ../../data/uuid1/ │ └── uuid2 -> ../../data/uuid2/ ├── processes/ │ └── -@/ │ ├── uuid3 -> ../../data/uuid3/ │ └── uuid4 -> ../../data/uuid4/ └── VERSION **Data:** The data directory contains all of the artifacts in the cache in unzipped form. **Keys:** The keys directory contains yaml files that refer to either a piece of data or a pool. The data/pool referenced by the key will be kept as long as the key exists. **Pools:** The pools directory contains all named (keyed) pools in the cache. Each pool contains symlinks to all of the data it contains. **Processes:** The processes directory contains process pools of the format -@ for each process that has used this cache. Each pool contains symlinks to each element in the data directory the process that created the pool has used in some way (created, loaded, etc.). These symlinks are ephemeral and have lifetimes <= the lifetime of the process that created them. More permanent storage is done using keys. **VERSION:** This file contains some information QIIME 2 uses to determine what version of QIIME 2 was used to create the cache and what version of cache it is (if we make breaking changes in the future this version number will allow for backwards compatibility). 
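    Examples
    --------
    A minimal, illustrative sketch of creating and inspecting a cache; it
    assumes a writable temporary directory and simply mirrors the
    per-method doctests found elsewhere in this module:

    >>> test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-')
    >>> cache_path = os.path.join(test_dir.name, 'cache')
    >>> cache = Cache(cache_path)
    >>> Cache.is_cache(cache_path)
    True
    >>> cache.get_keys() == set()
    True
    >>> test_dir.cleanup()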
""" CURRENT_FORMAT_VERSION = '1' # The files and folders you expect to see at the top level of a cache base_cache_contents = \ set(('data', 'keys', 'pools', 'processes', 'VERSION')) def __new__(cls, path=None): if path is None: path = _get_temp_path() # We have to ensure we really have the same path here because otherwise # something as simple as path='/tmp/qiime2/x' and path='/tmp/qiime2/x/' # would create two different Cache objects for cache in USED_CACHES: if os.path.exists(path) and os.path.exists(cache.path) and \ os.path.samefile(path, cache.path): return cache return super(Cache, cls).__new__(cls) def __init__(self, path=None, process_pool_lifespan=45): """Creates a Cache object backed by the directory specified by path. If no path is provided, it gets a path to a temp cache. Warning ------- If no path is provided and the path $TMPDIR/qiime2/$USER exists but is not a valid cache, we remove the directory and create a cache there. Parameters ---------- path : str or PathLike object Should point either to a non-existent writable directory to be created as a cache or to an existing writable cache. Defaults to None which creates the cache at $TMPDIR/qiime2/$USER. process_pool_lifespan : int The number of days we should allow process pools to exist for before culling them. """ # If this is a new cache or if the cache somehow got invalidated # (MacOS culling) then we need to re-init the cache. This could # theoretically cause us to end up with two Cache instances pointing at # the same path again should a cache be in some way invalidated during # the lifetime of a process with an existing Cache instance pointing to # it, but if that happens you're probably in trouble anyway. if self not in USED_CACHES or not self.is_cache(self.path): self.__init(path=path, process_pool_lifespan=process_pool_lifespan) def __init(self, path=None, process_pool_lifespan=45): created_path = False temp_cache_path = pathlib.Path(_get_temp_path()) if path is not None: self.path = pathlib.Path(path) else: self.path = temp_cache_path # We need this directory to exist so we can lock it, but we also want # to keep track of whether we created it or not so if we didn't create # it and it isn't a cache we can handle that if not os.path.exists(self.path): # We could have another thread/process creating the cache at this # directory in which case the above check might say it does not # exist then it could be created elsewhere before we get here. We # cannot lock this because we don't have a place to put the lock # yet, and it isn't really a big enough deal to create a new lock # elsewhere just for this because we don't really care if another # instance makes the path out from under us try: os.makedirs(self.path) except FileExistsError: pass # We don't actually care if this is the specific thread/process # that created the path, we only care that some instance of QIIME 2 # must have just done it. Now any instance should treat it as a # QIIME 2 created path created_path = True self.lock = \ MEGALock(str(self.lockfile), lifetime=timedelta(minutes=10)) # We need to lock here to ensure that if we have multiple processes # trying to create the same cache one of them can actually succeed at # creating the cache contents without interference from the other # processes. 
with self.lock: # If the path already existed and wasn't a cache then we don't want # to create the cache contents here if not Cache.is_cache(self.path) and not created_path: # We own the temp_cache_path, so we can recreate it if there # was something wrong with it if self.path == temp_cache_path: set_permissions(self.path, USER_GROUP_RWX, USER_GROUP_RWX, skip_root=True) self._remove_cache_contents() self._create_cache_contents() warnings.warn( "Your temporary cache was found to be in an " "inconsistent state. It has been recreated.") else: raise ValueError(f"Path: '{self.path}' already exists and" " is not a cache.") elif not Cache.is_cache(self.path): self._create_cache_contents() # else: it was a cache with the contents already in it self.process_pool = self._create_process_pool() # Lifespan is supplied in days and converted to seconds for internal # use self.process_pool_lifespan = process_pool_lifespan * 3600 * 24 # This is set if a named pool is created on this cache and withed in self.named_pool = None # We were used by this process USED_CACHES.add(self) # Start thread that pokes things in the cache to ensure they aren't # culled for being too old (only if we are in a temp cache) if path == temp_cache_path: self._thread_is_done = threading.Event() self._thread_destructor = \ weakref.finalize(self, self._thread_is_done.set) self._thread = threading.Thread( target=monitor_thread, args=(self.path, self._thread_is_done), daemon=True) self._thread.start() def __enter__(self): """Tell QIIME 2 to use this cache in its current invocation (see get_cache). """ if hasattr(_CACHE, 'cache') and _CACHE.cache is not None \ and _CACHE.cache.path != self.path: raise ValueError("You cannot enter multiple caches at once, " "currently entered cache is located at: " f"'{_CACHE.cache.path}'") _CACHE.cache = self def __exit__(self, *args): """Tell QIIME 2 to go back to using the default cache. """ _CACHE.cache = None def __getstate__(self): """Tell the cache not to pickle anything related to the daemon that keeps files around on MacOS because it can't pickle, and we don't need it after pickling and rehydrating. It will already be managed by the original process. """ threadless_dict = self.__dict__.copy() # This will only even exist if we are a temp cache not a named cache. # If _thread exists the others should as well if '_thread' in threadless_dict: del threadless_dict['_thread_is_done'] del threadless_dict['_thread_destructor'] del threadless_dict['_thread'] return threadless_dict @classmethod def is_cache(cls, path): """Tells us if the path we were given is a cache. Parameters ---------- path : str or PathLike object The path to the cache we are checking. Returns ------- bool Whether the path we were given is a cache or not. Examples -------- >>> test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') >>> cache_path = os.path.join(test_dir.name, 'cache') >>> cache = Cache(cache_path) >>> Cache.is_cache(cache_path) True >>> test_dir.cleanup() """ path = pathlib.Path(path) contents = set(os.listdir(path)) if not contents.issuperset(cls.base_cache_contents): return False regex = \ re.compile( r"QIIME 2\ncache: \d+\nframework: 20\d\d\.") with open(path / 'VERSION') as fh: version_file = fh.read() return regex.match(version_file) is not None @classmethod def validate_key(cls, key): """Validates that the given key is a valid Python idenitifier with the exception that - is allowed. Parameters ---------- key : str The name of the key to validate. 
Raises ------ ValueError If the key passed in is not a valid Python identifier. We enforce this to ensure no one creates keys that cause issues when we try to load them. """ validation_key = key.replace('-', '_') if not validation_key.isidentifier(): raise ValueError(f"Key '{key}' is not a valid Python identifier. " "Keys may contain '-' characters but must " "otherwise be valid Python identifiers. Python " "identifier rules may be found here " "https://www.askpython.com/python/" "python-identifiers-rules-best-practices") def _create_cache_contents(self): """Create the cache directory, all sub directories, and the version file. """ os.mkdir(self.data) os.mkdir(self.keys) os.mkdir(self.pools) os.mkdir(self.processes) self.version.write_text( _VERSION_TEMPLATE % (self.CURRENT_FORMAT_VERSION, qiime2.__version__)) def _remove_cache_contents(self): """Removes everything in a cache that isn't a lock file. If you want to completely remove a cache, just use shutil.rmtree (make sure you have permissions). Note ---- We ignore lock files because we want the process that is running this method to maintain its lock on the cache. """ for elem in os.listdir(self.path): if 'LOCK' not in elem: fp = os.path.join(self.path, elem) if os.path.isdir(fp): shutil.rmtree(os.path.join(self.path, fp)) else: os.unlink(fp) def create_pool(self, key, reuse=False): """Used to create named pools. Named pools can be used by pipelines to store all intermediate results created by the pipeline and prevent it from being reaped. This allows us to resume failed pipelines by collecting all of the data the pipeline saved to the named pool before it crashed and reusing it so we don't need to run the steps that created it again and can instead rerun the pipeline from where it failed. Once the pipeline completes, all of its final results will be saved to the pool as well with the idea being that the user can then reuse the pool keys to refer to the final data and get rid of the pool now that the pipeline that created it has completed. Parameters ---------- key : str The key to use to reference the pool. reuse : bool Whether to reuse a pool if a pool with the given keys already exists. Returns ------- Pool The pool we created. Examples -------- >>> test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') >>> cache_path = os.path.join(test_dir.name, 'cache') >>> cache = Cache(cache_path) >>> pool = cache.create_pool(key='key') >>> cache.get_keys() == set(['key']) True >>> cache.get_pools() == set(['key']) True >>> test_dir.cleanup() """ pool = Pool(self, name=key, reuse=reuse) self._register_key(key, key, pool=True) return pool def _create_process_pool(self): """Creates a process pool which is identical in function to a named pool, but it lives in the processes subdirectory not the pools subdirectory, and is handled differently by garbage collection due to being un-keyed. Process pools are used to keep track of results for currently running processes and are removed when the process that created them ends. Returns ------- Pool The pool we created. """ return Pool(self, reuse=True) def _create_collection_pool(self, ref_collection, key): pool = Pool(self, name=key, reuse=False) self._register_key( key, key, pool=True, collection=ref_collection) return pool def _create_pool_keys(self, pool_name, keys): """A pool can have many keys referring to it. This function creates all of the keys referring to the pool. Parameters ---------- pool_name : str The name of the pool we are keying. 
keys : List[str] A list of all the keys to create referring to the pool. """ for key in keys: self._register_key(key, pool_name, pool=True) def garbage_collection(self): """Runs garbage collection on the cache in the following steps: **1.** Iterate over all keys and log all data and pools referenced by the keys. **2.** Iterate over all named pools and delete any that were not referred to by a key while logging all data in pools that were referred to by keys. **3.** Iterate over all process pools and log all data they refer to. **4.** Iterate over all data and remove any that was not referenced. This process destroys data and named pools that do not have keys along with process pools older than the process_pool_lifespan on the cache which defaults to 45 days. It removes keys and warns the user about the removal if the data referenced by the keys does not exist. We lock out other processes and threads from accessing the cache while garbage collecting to ensure the cache remains in a consistent state. """ referenced_pools = set() referenced_data = set() # Walk over keys and track all pools and data referenced # This needs to be locked so we ensure that we don't have other threads # or processes writing refs that we don't see leading to us deleting # their data with self.lock: for key in self.get_keys(): loaded_key = self.read_key(key) # If the data/pool referenced by the key actually exists then # track it. Otherwise remove the dangling reference if (data := loaded_key.get('data')) is not None: if not self._check_dangling_reference( self.data / data, self.keys / key): referenced_data.add(data) elif (pool := loaded_key.get('pool')) is not None: if not self._check_dangling_reference( self.pools / pool, self.keys / key): referenced_pools.add(pool) # This really should never be happening unless someone messes # with things manually else: raise ValueError(f"The key '{key}' in the cache" f" '{self.path}' does not point to" " anything") # Walk over pools and remove any that were not referred to by keys # while tracking all data within those that were referenced for pool in self.get_pools(): if pool not in referenced_pools: shutil.rmtree(self.pools / pool) else: for data in os.listdir(self.pools / pool): if not self._check_dangling_reference( self.data / data, self.pools / pool / data): referenced_data.add(data) # Add references to data in process pools for process_pool in self.get_processes(): # Pick the creation time out of the pool name of format # -@ create_time = float(process_pool.split('-')[1].split('@')[0]) if time.time() - create_time >= self.process_pool_lifespan: shutil.rmtree(self.processes / process_pool) else: for data in os.listdir(self.processes / process_pool): referenced_data.add(data.split('.')[0]) # Walk over all data and remove any that was not referenced for data in self.get_data(): # If this assert is ever tripped something real bad happened assert is_uuid4(data) if data not in referenced_data: target = self.data / data set_permissions(target, None, USER_GROUP_RWX) shutil.rmtree(target) def _check_dangling_reference(self, data_path, key_path): """ If the data specified does not exist then we have a dangling reference and we warn them about it and remove the reference. Parameters ---------- data_path : pathlib.Path The path we are expecting to see the data at. key_path : pathlib.Path The path to the key referencing the data path that will be removed if the data is missing. 
Returns ------- Boolean True if reference was dangling False if not """ if not os.path.exists(data_path): warnings.warn(f"Dangling reference {key_path}. Data at {data_path}" " does not exist. Reference will be removed.") os.remove(key_path) return True return False def save(self, ref, key): """Saves data into the cache by creating a key referring to the data then copying the data if it is not already in the cache. Parameters ---------- ref : Result The QIIME 2 result we are saving into the cache. key : str The key we are saving the result under. Returns ------- Result A Result backed by the data in the cache. Examples -------- >>> from qiime2.sdk.result import Artifact >>> from qiime2.core.testing.type import IntSequence1 >>> test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') >>> cache_path = os.path.join(test_dir.name, 'cache') >>> cache = Cache(cache_path) >>> artifact = Artifact.import_data(IntSequence1, [0, 1, 2]) >>> saved_artifact = cache.save(artifact, 'key') >>> # save returned an artifact that is backed by the data in the cache >>> str(saved_artifact._archiver.path) == \ str(cache.data / str(artifact.uuid)) True >>> cache.get_keys() == set(['key']) True >>> test_dir.cleanup() """ # Create the key before the data, this is so that if another thread or # process is running garbage collection it doesn't see our un-keyed # data and remove it leaving us with a dangling reference and no data with self.lock: self._register_key(key, str(ref.uuid)) self._copy_to_data(ref) return self.load(key) def save_collection(self, ref_collection, key): """Saves a Collection to a pool in the cache with the given key. This pool's key file will keep track of the order of the Collection. """ if isinstance(ref_collection, qiime2.sdk.Results): ref_collection = ref_collection.output with self.lock: pool = self._create_collection_pool(ref_collection, key) for ref in ref_collection.values(): pool.save(ref) return self.load_collection(key) def _register_key(self, key, value, pool=False, collection=None): """Creates a key file pointing at the specified data or pool. Parameters ---------- key : str The name of the key to create. value : str The path to the data or pool we are keying. pool : bool Whether we are keying a pool or not. """ self.validate_key(key) key_fp = self.keys / key key_dict = {} key_dict['origin'] = key if pool: key_dict['pool'] = value if collection is not None: key_dict['order'] = \ [{k: str(v.uuid)} for k, v in collection.items()] else: key_dict['data'] = value if collection is not None: raise ValueError('An ordered Collection key can only be made' ' for a pool.') with open(key_fp, 'w') as fh: yaml.safe_dump(key_dict, fh) def read_key(self, key): """Reads the contents of a given key. Parameters ---------- key : str The name of the key to read Returns ------- dict Maps 'data' -> the data referenced or 'pool' -> the pool referenced. Only 'data' or 'pool' will have a value the other will be none. Raises ------ KeyError If the key does not exists in the cache. """ with self.lock: try: with open(self.keys / key) as fh: return yaml.safe_load(fh) except FileNotFoundError as e: raise KeyError(f"The cache '{self.path}' does not contain the " f"key '{key}'") from e def load(self, key): """Loads the data pointed to by a key. Will defer to Cache.load_collection if the key contains 'order'. Will error on keys that refer to pools without order. Parameters ---------- key : str The key to the data we are loading. Returns ------- Result The loaded data pointed to by the key. 
Raises ------ ValueError If the key does not reference any data meaning you probably tried to load a pool. Examples -------- >>> from qiime2.sdk.result import Artifact >>> from qiime2.core.testing.type import IntSequence1 >>> test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') >>> cache_path = os.path.join(test_dir.name, 'cache') >>> cache = Cache(cache_path) >>> artifact = Artifact.import_data(IntSequence1, [0, 1, 2]) >>> saved_artifact = cache.save(artifact, 'key') >>> loaded_artifact = cache.load('key') >>> loaded_artifact == saved_artifact == artifact True >>> str(loaded_artifact._archiver.path) == \ str(cache.data / str(artifact.uuid)) True >>> test_dir.cleanup() """ with self.lock: key_values = self.read_key(key) if 'order' in key_values: return self.load_collection(key) if 'data' not in key_values: raise ValueError(f"The key file '{key}' does not point to any " "data. This most likely occurred because you " "tried to load a pool which is not " "supported.") path = self.data / key_values['data'] archiver = Archiver.load_raw(path, self) return Result._from_archiver(archiver) def load_collection(self, key): """Loads a pool referenced by a given key as a Collection. The pool loaded must have been created using Cache.save_collection. """ collection = {} with self.lock: loaded_key = self.read_key(key) if 'order' not in loaded_key: raise KeyError(f"The key file '{self.keys / key}' does not" " contain an order which is necessary for a" " collection.") for artifact in loaded_key['order']: # We created a list of one element dicts to make sure the yaml # looked as expected. This is how we parse one of those dicts # into its key value pair. k, v = list(artifact.items())[0] collection[k] = Result._from_archiver(self._load_uuid(v)) return collection def _load_uuid(self, uuid): """Load raw from the cache if the uuid is in the cache. Return None if it isn't. This is done so if someone already has an artifact in the cache then tries to use their qza for the artifact we can use the already cached version instead. """ path = self.data / str(uuid) with self.lock: if os.path.exists(path): return Archiver.load_raw(path, self) else: return None def remove(self, key): """Removes a key from the cache then runs garbage collection to remove anything the removed key was referencing and any other loose data. Parameters ---------- key : str The key we are removing. Raises ------ KeyError If the key does not exist in the cache. Examples -------- >>> from qiime2.sdk.result import Artifact >>> from qiime2.core.testing.type import IntSequence1 >>> test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') >>> cache_path = os.path.join(test_dir.name, 'cache') >>> cache = Cache(cache_path) >>> artifact = Artifact.import_data(IntSequence1, [0, 1, 2]) >>> saved_artifact = cache.save(artifact, 'key') >>> cache.get_keys() == set(['key']) True >>> cache.remove('key') >>> cache.get_keys() == set() True >>> # Note that the data is still in the cache due to our >>> # saved_artifact causing the process pool to keep a reference to it >>> cache.get_data() == set([str(saved_artifact.uuid)]) True >>> del saved_artifact >>> # The data is still there even though the reference is gone because >>> # the cache has not run its own garbage collection yet. For various >>> # reasons, it is not feasible for us to safely garbage collect the >>> # cache when a reference in memory is deleted. 
Note also that >>> # "artifact" is not backed by the data in the cache, it only lives >>> # in memory, but it does have the same uuid as "saved_artifact." >>> cache.get_data() == set([str(artifact.uuid)]) True >>> cache.garbage_collection() >>> # Now it is gone >>> cache.get_data() == set() True >>> test_dir.cleanup() """ with self.lock: try: os.remove(self.keys / key) except FileNotFoundError as e: raise KeyError(f"The cache '{self.path}' does not contain the" f" key '{key}'") from e self.garbage_collection() def clear_lock(self): """Clears the flufl lock on the cache. This exists in case something goes horribly wrong and we end up in an unrecoverable state. It's easy to tell the user "Recreate the failed cache (use the same path) and run this method on it." Note ---- Forcibly removes the lock outside of the locking library's API. """ if os.path.exists(self.lockfile): os.remove(self.lockfile) def _copy_to_data(self, ref): """If the data does not already exist in the cache, it will copy the data into the cache's data directory and set the appropriate permissions on the data. If the data does already exist in the cache, it will do nothing. This is generally used to copy data from outside the cache into the cache. Parameters ---------- ref : Result The data we are copying into the cache's data directory. """ destination = self.data / str(ref.uuid) with self.lock: if not os.path.exists(destination): # We need to actually create the cache/data/uuid directory # manually because the uuid isn't a part of the ArchivePath if not isinstance(ref._archiver.path, ArchivePath): os.mkdir(destination) shutil.copytree( ref._archiver.path, destination, dirs_exist_ok=True) # Otherwise, the path we are copying should already contain the # uuid, so we don't need to manually create the uuid directory else: shutil.copytree( ref._archiver.path, self.data, dirs_exist_ok=True) set_permissions(destination, READ_ONLY_FILE, READ_ONLY_DIR) def _rename_to_data(self, uuid, src): """Takes some data in src and renames it into the cache's data dir. It then ensures there are symlinks for this data in the process pool and the named pool if one exists. This is generally used to move data from temporary per thread mount points in the process pool into the cache's data directory in one atomic action. Parameters ---------- uuid : str or uuid4 The uuid of the artifact whose data we are renaming into self.data src : str or Pathlike object The location of the data we are renaming into self.data. Returns ------- str The alias we created for the artifact in the cache's process pool. pathlib.Path The location we renamed the data into. """ uuid = str(uuid) dest = self.data / uuid alias = os.path.split(src)[0] with self.lock: # Rename errors if the destination already exists if not os.path.exists(dest): os.rename(src, dest) set_permissions(dest, READ_ONLY_FILE, READ_ONLY_DIR) # Create a new alias whether we renamed or not because this is # still loading a new reference to the data even if the data is # already there process_alias = self._alias(uuid) # Remove the aliased directory above the one we renamed. We need to do # this whether we renamed or not because we aren't renaming this # directory but the one beneath it shutil.rmtree(alias) return process_alias, dest def _alias(self, uuid): """Creates an alias and a symlink for the artifact with the given uuid in both the cache's process pool and its named pool if it has one. Parameters ---------- uuid : str or uuid4 The uuid of the artifact we are aliasing. 
Returns ------- str The alias we created for the artifact. """ with self.lock: process_alias = self.process_pool._alias(uuid) self.process_pool._make_symlink(uuid, process_alias) # Named pool links are not aliased if self.named_pool is not None: self.named_pool._make_symlink(uuid, uuid) return process_alias def _deallocate(self, symlink): """Removes a specific symlink from the process pool. This happens when an archiver goes out of scope. We remove that archiver's reference to the data from the process pool. We do this to prevent the cache from growing wildly during long running processes. Parameters ---------- symlink : str The basename of the symlink we are going to be removing from the process pool. """ # NOTE: Beware locking inside of this method. This method is called by # Python's garbage collector and that seems to cause deadlocks when # acquiring the thread lock target = self.process_pool.path / symlink if target.exists(): os.remove(target) @property def data(self): """The directory in the cache that stores the data. """ return self.path / 'data' def get_data(self): """Returns a set of all data in the cache. Returns ------- set[str] All of the data in the cache in the form of the top level directories which will be the uuids of the artifacts. """ with self.lock: return set(os.listdir(self.data)) @property def keys(self): """The directory in the cache that stores the keys. """ return self.path / 'keys' def get_keys(self): """Returns a set of all keys in the cache. Returns ------- set[str] All of the keys in the cache. Just the names now what they refer to. """ with self.lock: return set(os.listdir(self.keys)) @property def lockfile(self): """The path to the flufl lock file. """ return self.path / 'LOCK' @property def pools(self): """The directory in the cache that stores the named pools. """ return self.path / 'pools' def get_pools(self): """Returns a set of all pools in the cache. Returns ------- set[str] The names of all of the named pools in the cache. """ with self.lock: return set(os.listdir(self.pools)) @property def processes(self): """The directory in the cache that stores the process pools. """ return self.path / 'processes' def get_processes(self): """Returns a set of all process pools in the cache. Returns ------- set[str] The names of all of the process pools in the cache. """ with self.lock: return set(os.listdir(self.processes)) @property def version(self): """The path to the version file. """ return self.path / 'VERSION' class Pool: """Pools are folders in the cache that contain many symlinks to many different piece of data. There are two types of pool: **Process Pools:** These pools have names of the form -@ based on the process that created them. They only exist for the length of the process that created them and ensure data that process is using stays in the cache. **Named Pools:** Named pools are keyed just like individual pieces of data. They exist for as long as they have a key, and all of the data they symlink to is retained in the cache for as long as the pool exists. """ def __init__(self, cache, name=None, reuse=False): """Used with name=None and reuse=True to create a process pool. Used with a name to create named pools. Note ---- In general, you should not invoke this constructor directly and should instead use qiime2.core.cache.Cache.create_pool to create a pool properly on a given cache. Parameters ---------- cache : Cache The cache this pool will be created under. named : str The name of the pool we are creating if it is a named pool. 
reuse : bool Whether we will be reusing this pool if it already exists. Raises ------ ValueError If the pool already exists and reuse is False. """ # The pool keeps track of the cache it belongs to self.cache = cache # If they are creating a named pool, we already have this info if name: self.name = name self.path = cache.pools / name # The alternative is that we have a process pool. We want this pool to # exist in the process directory under the cache not the pools # directory. The name follows the scheme # -@ else: self.name = self._get_process_pool_name() self.path = cache.processes / self.name # Raise a value error if we thought we were making a new pool but # actually are not if not reuse and os.path.exists(self.path): raise ValueError("Pool already exists, please use reuse=True to " "reuse existing pool, or remove all keys " "indicating this pool to remove the pool") if not os.path.exists(self.path): os.mkdir(self.path) def __enter__(self): """Tells the currently set cache to use this named pool. If there is no cache set then set the cache this named pool is on as well. Note ---- If you have already set a cache then you cannot set a named pool that belongs to a different cache. Raises ------ ValueError If you try to set a pool that is not on the currently set cache. ValueError If you have already set a pool and try to set another. Examples -------- >>> test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') >>> cache_path = os.path.join(test_dir.name, 'cache') >>> cache = Cache(cache_path) >>> pool = cache.create_pool(key='pool') >>> # When we with in the pool the set cache will be the cache the pool >>> # belongs to, and the named pool on that cache will be the pool >>> # we withed in >>> with pool: ... current_cache = get_cache() ... cache.named_pool == pool True >>> current_cache == cache True >>> # Now that we have exited the with, both cache and pool are unset >>> get_cache() == cache False >>> cache.named_pool == pool False >>> test_dir.cleanup() """ # This threadlocal may not even have a cache attribute has_cache = hasattr(_CACHE, 'cache') if has_cache and _CACHE.cache is not None \ and _CACHE.cache.path != self.cache.path: raise ValueError('Cannot enter a pool that is not on the ' 'currently set cache. The current cache is ' f'located at: {_CACHE.cache.path}') else: self.previously_entered_cache = _CACHE.cache if has_cache else None _CACHE.cache = self.cache if self.cache.named_pool is not None: raise ValueError("You cannot enter multiple pools at once, " "currently entered pool is located at: " f"'{self.cache.named_pool.path}'") self.cache.named_pool = self def __exit__(self, *args): """Unsets the named pool on the currently set cache. If there was no cache set before setting this named pool then unset the cache as well. Note ---- self.previously_entered_cache will either be None or the cache this named pool belongs to. It will be None if there was no cache set when we set this named pool. It will be this named pool's cache if that cache was already set when we set this named pool. If there was a different cache set when we set this named pool, we would have errored in __enter__. """ _CACHE.cache = self.previously_entered_cache self.cache.named_pool = None def _get_process_pool_name(self): """Creates a process pool name of the format -@ for the process that invoked this function. Returns ------- str The name of this process pool. 
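        For example (illustrative values only), a pool created by pid 1234
        whose process creation time is 1700000000.0, running as user
        ``alice``, would be named ``1234-1700000000.0@alice``.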
""" pid = os.getpid() user = _get_user() process = psutil.Process(pid) time = process.create_time() return f'{pid}-{time}@{user}' def save(self, ref): """Saves the data into the pool then loads a new ref backed by the data in the pool. Parameters ---------- ref : Result The QIIME 2 result we are saving into this pool. Returns ------- Result A QIIME 2 result backed by the data in the cache the pool belongs to. Examples -------- >>> from qiime2.sdk.result import Artifact >>> from qiime2.core.testing.type import IntSequence1 >>> test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') >>> cache_path = os.path.join(test_dir.name, 'cache') >>> cache = Cache(cache_path) >>> pool = cache.create_pool(key='pool') >>> artifact = Artifact.import_data(IntSequence1, [0, 1, 2]) >>> pool_artifact = pool.save(artifact) >>> # The data itself resides in the cache this pool belongs to >>> str(pool_artifact._archiver.path) == \ str(cache.data / str(artifact.uuid)) True >>> # The pool now contains a symlink to the data. The symlink is named >>> # after the uuid of the data. >>> pool.get_data() == set([str(artifact.uuid)]) True >>> test_dir.cleanup() """ uuid = str(ref.uuid) if self.path == self.cache.process_pool.path: alias = self._alias(uuid) else: alias = uuid self._make_symlink(uuid, alias) self.cache._copy_to_data(ref) return self.load(ref) def _alias(self, uuid): """We may want to create multiple references to a single artifact in a process pool, but we cannot create multiple symlinks with the same name, so we take the uuid and add a random number to the end of it and use uuid.random_number as the name of the symlink. This means when you look in a process pool you may see the same uuid multiple times with different random numbers appended. This means there are multiple references to the artifact with that uuid in the process poole because it was loaded multiple times in the process. Parameters ---------- uuid : str or uuid4 The uuid we are creating an alias for. Returns ------- str The aliased uuid. """ MAX_RETRIES = 5 uuid = str(uuid) with self.cache.lock: for _ in range(MAX_RETRIES): alias = uuid + '.' + str(randint(0, maxsize)) path = self.path / alias # os.path.exists returns false on broken symlinks if not os.path.exists(path) and not os.path.islink(path): break else: raise ValueError(f'Too many collisions ({MAX_RETRIES}) ' 'occurred while trying to save artifact ' f'<{uuid}> to process pool {self.path}. It ' 'is likely you have attempted to load the ' 'same artifact a very large number of times.') return alias def _allocate(self, uuid): """Allocate an empty directory under the process pool to extract to. This directory is of the form alias / uuid and provides a per thread mount location for artifacts. Parameters ---------- uuid : str or uuid4 The uuid of the artifact we are creating an extract location for. Returns ------- pathlib.Path The path we allocated to extract the artifact into. """ uuid = str(uuid) # We want to extract artifacts to this thread unique location in the # process pool before using Cache.rename to put them into Cache.data. # We need to do this in order to ensure that if a uuid exists in # Cache.data, it is actually populated with data and is usable as an # artifact. Otherwise it could just be an empty directory (or only # contain part of the artifact) when another thread/process tries to # access it. 
with self.cache.lock: alias = self._alias(uuid) allocated_path = self.path / alias / uuid os.makedirs(allocated_path) return allocated_path def _make_symlink(self, uuid, alias): """Symlinks self.path / alias to self.cache.data / uuid. This creates a reference to the artifact with the given uuid in the cache. Parameters ---------- uuid : str or uuid4 The uuid of the artifact we are creating a symlink reference for. alias : str The alias we are using as the actual name of the symlink. """ uuid = str(uuid) src = self.cache.data / uuid dest = self.path / alias # Symlink will error if the location we are creating the link at # already exists. This could happen legitimately from trying to save # the same thing to a named pool several times. with self.cache.lock: if not os.path.lexists(dest): os.symlink(src, dest) def _rename_to_collection_pool(self, uuid, src): uuid = str(uuid) dest = self.data / uuid alias = os.path.split(src)[0] with self.lock: # Rename errors if the destination already exists if not os.path.exists(dest): os.rename(src, dest) set_permissions(dest, READ_ONLY_FILE, READ_ONLY_DIR) # Create a new alias whether we renamed or not because this is # still loading a new reference to the data even if the data is # already there process_alias = self._alias(uuid) # Remove the aliased directory above the one we renamed. We need to do # this whether we renamed or not because we aren't renaming this # directory but the one beneath it shutil.rmtree(alias) return process_alias, dest def load(self, ref): """Loads a reference to an element in the pool. Parameters ---------- ref : str or Result The result we are loading out of this pool, or just its uuid as a string. Returns ------- Result A result backed by the data in the cache that this pool belongs to. Examples -------- >>> from qiime2.sdk.result import Artifact >>> from qiime2.core.testing.type import IntSequence1 >>> test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') >>> cache_path = os.path.join(test_dir.name, 'cache') >>> cache = Cache(cache_path) >>> pool = cache.create_pool(key='pool') >>> artifact = Artifact.import_data(IntSequence1, [0, 1, 2]) >>> pool_artifact = pool.save(artifact) >>> loaded_artifact = pool.load(str(artifact.uuid)) >>> artifact == pool_artifact == loaded_artifact True >>> str(loaded_artifact._archiver.path) == \ str(cache.data / str(artifact.uuid)) True >>> test_dir.cleanup() """ # Could receive an artifact or just a string uuid if isinstance(ref, str): uuid = ref else: uuid = str(ref.uuid) path = self.cache.data / uuid archiver = Archiver.load_raw(path, self.cache) return Result._from_archiver(archiver) def remove(self, ref): """Removes an element from the pool. The element can be just the uuid of the data as a string, or it can be a Result object referencing the data we are trying to remove. Parameters ---------- ref : str or Result The result we are removing from this pool, or just its uuid as a string. 
Examples -------- >>> from qiime2.sdk.result import Artifact >>> from qiime2.core.testing.type import IntSequence1 >>> test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') >>> cache_path = os.path.join(test_dir.name, 'cache') >>> cache = Cache(cache_path) >>> pool = cache.create_pool('pool') >>> artifact = Artifact.import_data(IntSequence1, [0, 1, 2]) >>> pool_artifact = pool.save(artifact) >>> pool.get_data() == set([str(artifact.uuid)]) True >>> pool.remove(str(artifact.uuid)) >>> pool.get_data() == set() True >>> # Note that the data is still in the cache due to our >>> # pool_artifact causing the process pool to keep a reference to it >>> cache.get_data() == set([str(pool_artifact.uuid)]) True >>> del pool_artifact >>> # The data is still there even though the reference is gone because >>> # the cache has not run its own garbage collection yet. For various >>> # reasons, it is not feasible for us to safely garbage collect the >>> # cache when a reference in memory is deleted. Note also that >>> # "artifact" is not backed by the data in the cache, it only lives >>> # in memory, but it does have the same uuid as "pool_artifact." >>> cache.get_data() == set([str(artifact.uuid)]) True >>> cache.garbage_collection() >>> # Now it is gone >>> cache.get_data() == set() True >>> test_dir.cleanup() """ # Could receive an artifact or just a string uuid if isinstance(ref, str): uuid = ref else: uuid = str(ref.uuid) target = self.path / uuid with self.cache.lock: if target.exists(): if os.path.islink(target): os.remove(target) else: shutil.rmtree(target) self.cache.garbage_collection() def get_data(self): """Returns a set of all data in the pool. Returns ------- set[str] The uuids of all of the data in the pool. """ return set(os.listdir(self.path)) def create_index(self): """Indexes all artifacts in this cache's data directory mapping the QIIME 2 invocations that made the given artifacts to the given artifacts in a dictionary with the following structure: { HashableInvocation(plugin_action=f'{plugin}:{action}', arguments=[input_uuids + parameters]): { output1_name: output1_uuid, output2_name: output2_uuid, ... }, ... } Where the output uuids are the uuids of the artifacts in the actual data directory. This information is parsed out of these artifacts' provenance. This index is used for pipeline resumption. We can tell if an artifact in the cache was created by an invocation of a pipeline that is identical to the one we are currently executing and take that cached artifact instead of recreating it. """ # Keep track of all invocations -> outputs self.index = {} with self.cache.lock: for _uuid in self.get_data(): # Make sure the process that indexed this artifact will still # have access to it if it is otherwise removed from the cache # by retaining a reference to it in our process pool alias = self.cache.process_pool._alias(_uuid) self.cache.process_pool._make_symlink(_uuid, alias) # Get action.yaml from this artifact's provenance path = self.cache.data / _uuid action_yaml = load_action_yaml(path) action = action_yaml['action'] # This means the artifact was created in the pipeline by # ctx.make_artifact, we don't index those because we cannot # guarantee that whatever view they imported was hashable. 
it # would be better to create an action that produces the # artifact rather than using make_artifact if 'type' in action and action['type'] == 'import': continue plugin_action = action['plugin'] + ':' + action['action'] arguments = action['inputs'] arguments.extend(action['parameters']) invocation = HashableInvocation(plugin_action, arguments) if invocation not in self.index: self.index[invocation] = {} self._add_index_output( self.index[invocation], action['output-name'], _uuid) def _add_index_output(self, outputs, name, value): """Adds a given output to the cache's index under the invocation that created it. Dispatches to _add_collection_index_output for output collections. Parameters ---------- outputs : dict A dictionary mapping the names of the outputs of the given invocation to their uuids. name : str The name of the output we are indexing. value : str The value of the output we are indexing Note ---- Modifies the output dictionary in place """ if isinstance(name, list): self._add_collection_index_output(outputs, name, value) else: outputs[name] = value def _add_collection_index_output(self, outputs, name, value): """Adds a given output collection to the cache's index under the invocation that created it. Parameters ---------- outputs : dict A dictionary mapping the names of the outputs of the given invocation to their uuids. name : tuple The name of the output, the key for this element in the output collection, the index of this element in the output collection in the form of index/num_elements. value : str The uuid of this element in the output collection. """ output_name, item_name, idx_out_of = name idx, total = idx_out_of.split('/') if output_name not in outputs: outputs[output_name] = {} item = IndexedCollectionElement(item_name, int(idx), int(total)) outputs[output_name][item] = value qiime2-2024.5.0/qiime2/core/cite.py000066400000000000000000000045251462552636000165650ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import os import pkg_resources import collections import bibtexparser as bp CitationRecord = collections.namedtuple('CitationRecord', ['type', 'fields']) class Citations(collections.OrderedDict): @classmethod def load(cls, path, package=None): if package is not None: root = pkg_resources.resource_filename(package, '.') root = os.path.abspath(root) path = os.path.join(root, path) parser = bp.bparser.BibTexParser() # Downstream tooling is much easier with unicode. 
For actual latex # users, use the modern biber backend instead of bibtex parser.customization = bp.customization.convert_to_unicode with open(path) as fh: try: db = bp.load(fh, parser=parser) except Exception as e: raise ValueError("There was a problem loading the BiBTex file:" "%r" % path) from e entries = collections.OrderedDict() for entry in db.entries: id_ = entry.pop('ID') type_ = entry.pop('ENTRYTYPE') if id_ in entries: raise ValueError("Duplicate entry-key found in BibTex file: %r" % id_) entries[id_] = CitationRecord(type_, entry) return cls(entries) def __iter__(self): return iter(self.values()) def save(self, f): entries = [] for key, citation in self.items(): entry = citation.fields.copy() entry['ID'] = key entry['ENTRYTYPE'] = citation.type entries.append(entry) db = bp.bibdatabase.BibDatabase() db.entries = entries writer = bp.bwriter.BibTexWriter() writer.order_entries_by = tuple(self.keys()) owned = False if type(f) is str: f = open(f, 'w') owned = True try: bp.dump(db, f, writer=writer) finally: if owned: f.close() qiime2-2024.5.0/qiime2/core/enan.py000066400000000000000000000071171462552636000165620ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import struct def _float_to_int(number: float) -> int: # convert float to a native-endian 8-byte sequence (a double): # (alignment doesn't matter because this isn't a struct, and since we are # on the same hardware when we go to int, the endian-ness doesn't matter # either) bytes_ = struct.pack('=d', number) integer, = struct.unpack('=Q', bytes_) return integer def _int_to_float(number: int) -> float: bytes_ = struct.pack('=Q', number) float_, = struct.unpack('=d', bytes_) return float_ # 1954 is an homage to R's NA value which is a quiet NaN with a mantissa which # appears to represent 1954, birth-year of Ross Ihaka (source: Hadley Wickham) # http://www.markvanderloo.eu # /yaRb/2012/07/08/representation-of-numerical-nas-in-r-and-the-1954-enigma/ # It also happens to let us tolerate small negative values (such as -1 used # in pd.Categorical.codes) without trashing the entire NaN. # ('Q' from struct does fortunately catch this issue before it becomes a # larger problem) _R_OFFSET = 1954 _DEFAULT_NAN_INT = _float_to_int(float('nan')) # at this point, calling `bin(_DEFAULT_NAN_INT)` should produce a # 64-bit positive quiet nan: # 0 11111111111 1000000000000000000000000000000000000000000000000000 # https://www.csee.umbc.edu/courses/undergraduate/CMSC211/spring03 # /burt/tech_help/IEEE-754references.html # unless Python changes some implementation detail, which isn't a problem so # long as XOR is used instead of AND def make_nan_with_payload(payload: int, namespace: int = 255): """Construct a NaN with a namespace and payload. The payload must be in the range [-1953, 2141] The namespace must be in the range [0, 255] sign exp mantissa v v---------v v----------------------------------------------------------v +qNaN "header" (includes 1 bit of the mantissa) namespace payload v-------------v v-------v v------------v 0 11111111111 10000000 00000000 00000000 00000000 0000 0000 0000 0000 0000 The namespace + payload requires 20 bits of the mantissa, which will support both 32-bit floats and 64-bit doubles. 
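    For example, an illustrative round trip through the two functions in
    this module recovers the payload and namespace that were encoded:

    >>> get_payload_from_nan(make_nan_with_payload(3, namespace=0))
    (3, 0)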
The purpose is to allow enumerations to be identified and values preserved. Custom enumerations will have a namespace of 255 and will require user guidance. Other built-in schemes should organize themselves within an unsigned byte. The enumeration values are then stored in the payload which uses an offset of 1954 to distinguish between default +qNaN and an enumeration scheme. This also permits small negative values in the payload. """ # To be safe, we will XOR our payload (instead of AND) so that we can take # the difference later, even if the default NaN changes to include a # mantissa payload for some reason nan_int = _DEFAULT_NAN_INT ^ (namespace << 12) nan_int = nan_int ^ (_R_OFFSET + payload) return _int_to_float(nan_int) def get_payload_from_nan(nan: float): nan_int = _float_to_int(nan) namespaced_payload = nan_int ^ _DEFAULT_NAN_INT if namespaced_payload == 0: return (None, None) namespace = namespaced_payload >> 12 payload = namespaced_payload - (namespace << 12) return (payload - _R_OFFSET, namespace) qiime2-2024.5.0/qiime2/core/exceptions.py000066400000000000000000000006731462552636000200220ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- class ValidationError(Exception): pass class ImplementationError(Exception): pass qiime2-2024.5.0/qiime2/core/format.py000066400000000000000000000021441462552636000171240ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import qiime2.core.path as qpath class FormatBase: def __init__(self, path=None, mode='w'): import qiime2.plugin.model as model if path is None: if mode != 'w': raise ValueError("A path must be provided when reading.") else: if mode != 'r': raise ValueError("A path must be omitted when writing.") if mode == 'w': self.path = qpath.OutPath( # TODO: parents shouldn't know about their children dir=isinstance(self, model.DirectoryFormat), prefix='q2-%s-' % self.__class__.__name__) else: self.path = qpath.InPath(path) self._mode = mode def __str__(self): return str(self.path) qiime2-2024.5.0/qiime2/core/missing.py000066400000000000000000000061171462552636000173110ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import pandas as pd import numpy as np from .enan import make_nan_with_payload as _make_nan_with_payload from .enan import get_payload_from_nan as _get_payload_from_nan def _encode_terms(namespace): enum = _MISSING_ENUMS[namespace] namespace = _NAMESPACE_LOOKUP.index(namespace) def encode(x): if type(x) is not str: return x try: code = enum.index(x) except ValueError: return x return _make_nan_with_payload(code, namespace=namespace) return encode def _handle_insdc_missing(series): return series.apply(_encode_terms('INSDC:missing')) def _handle_blank(series): return series def _handle_no_missing(series): if series.isna().any(): raise ValueError("Missing values are not allowed in series/column" " (name=%r) when using scheme 'no-missing'." % series.name) return series BUILTIN_MISSING = { 'INSDC:missing': _handle_insdc_missing, 'blank': _handle_blank, 'no-missing': _handle_no_missing } _MISSING_ENUMS = { 'INSDC:missing': ( 'not applicable', 'missing', 'not collected', 'not provided', 'restricted access') } # list index reflects the nan namespace, the "blank"/"no-missing" enums don't # apply here, since they aren't actually encoded in the NaNs _NAMESPACE_LOOKUP = ['INSDC:missing'] DEFAULT_MISSING = 'blank' def series_encode_missing(series: pd.Series, enumeration: str) -> pd.Series: if type(enumeration) is not str: TypeError("Wrong type for `enumeration`, expected string") try: encoder = BUILTIN_MISSING[enumeration] except KeyError: raise ValueError("Unknown enumeration: %r, (available: %r)" % (enumeration, list(BUILTIN_MISSING.keys()))) new = encoder(series) if series.dtype == object and new.isna().all(): # return to categorical of all missing values return new.astype(object) return new def series_extract_missing(series: pd.Series) -> pd.Series: def _decode(x): if np.issubdtype(type(x), np.floating) and np.isnan(x): code, namespace = _get_payload_from_nan(x) if namespace is None: return x elif namespace == 255: raise ValueError("Custom enumerations are not yet supported") else: try: enum = _MISSING_ENUMS[_NAMESPACE_LOOKUP[namespace]] except (IndexError, KeyError): return x try: return enum[code] except IndexError: return x return x missing = series[series.isna()] missing = missing.apply(_decode) return missing.astype(object) qiime2-2024.5.0/qiime2/core/path.py000066400000000000000000000125341462552636000165740ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import os import pathlib import shutil import distutils import tempfile import weakref from qiime2.core.util import set_permissions, USER_GROUP_RWX _ConcretePath = type(pathlib.Path()) def _party_parrot(self, *args): raise TypeError("Cannot mutate %r." 
% self) class OwnedPath(_ConcretePath): def __new__(cls, *args, **kwargs): self = super().__new__(cls, *args, **kwargs) self._user_owned = True return self def _copy_dir_or_file(self, other): if self.is_dir(): return distutils.dir_util.copy_tree(str(self), str(other)) else: return shutil.copy(str(self), str(other)) def _destruct(self): if self.is_dir(): distutils.dir_util.remove_tree(str(self)) else: self.unlink() def _move_or_copy(self, other): if self._user_owned: return self._copy_dir_or_file(other) else: # Certain networked filesystems will experience a race # condition on `rename`, so fall back to copying. try: return _ConcretePath.rename(self, other) except (FileExistsError, OSError) as e: # OSError errno 18 is cross device link, if we have this error # we can solve it by copying. If we have a different OSError we # still want to explode. FileExistsErrors are apparently # instances of OSError, so we also make sure we don't have one # of them when we explode if isinstance(e, OSError) and e.errno != 18 and \ not isinstance(e, FileExistsError): raise e copied = self._copy_dir_or_file(other) self._destruct() return copied class InPath(OwnedPath): def __new__(cls, path): self = super().__new__(cls, path) self.__backing_path = path if hasattr(path, '_user_owned'): self._user_owned = path._user_owned return self chmod = lchmod = rename = replace = rmdir = symlink_to = touch = unlink = \ write_bytes = write_text = _party_parrot def open(self, mode='r', buffering=-1, encoding=None, errors=None, newline=None): if 'w' in mode or '+' in mode or 'a' in mode: _party_parrot(self) return super().open(mode=mode, buffering=buffering, encoding=encoding, errors=errors, newline=newline) class OutPath(OwnedPath): @classmethod def _destruct(cls, path): if not os.path.exists(path): return if os.path.isdir(path): shutil.rmtree(path) else: os.unlink(path) def __new__(cls, dir=False, **kwargs): """ Create a tempfile, return pathlib.Path reference to it. """ if dir: name = tempfile.mkdtemp(**kwargs) else: fd, name = tempfile.mkstemp(**kwargs) # fd is now assigned to our process table, but we don't need to do # anything with the file. We will call `open` on the `name` later # producing a different file descriptor, so close this one to # prevent a resource leak. os.close(fd) obj = super().__new__(cls, name) obj._destructor = weakref.finalize(obj, cls._destruct, str(obj)) return obj def __exit__(self, t, v, tb): self._destructor() class InternalDirectory(_ConcretePath): DEFAULT_PREFIX = 'qiime2-' @classmethod def _destruct(cls, path): """DO NOT USE DIRECTLY, use `_destructor()` instead""" if os.path.exists(path): set_permissions(path, None, USER_GROUP_RWX) shutil.rmtree(path) @classmethod def __new(cls, *args): self = super().__new__(cls, *args) self._destructor = weakref.finalize(self, self._destruct, str(self)) return self def __new__(cls, *args, prefix=None): if args and prefix is not None: raise TypeError("Cannot pass a path and a prefix at the same time") elif args: # This happens when the base-class's __reduce__ method is invoked # for pickling. 
return cls.__new(*args) else: if prefix is None: prefix = cls.DEFAULT_PREFIX elif not prefix.startswith(cls.DEFAULT_PREFIX): prefix = cls.DEFAULT_PREFIX + prefix # TODO: normalize when temp-directories are configurable path = tempfile.mkdtemp(prefix=prefix) return cls.__new(path) def __truediv__(self, path): # We don't want to create self-destructing paths when using the join # operator return _ConcretePath(str(self), path) def __rtruediv__(self, path): # Same reasoning as truediv return _ConcretePath(path, str(self)) class ArchivePath(InternalDirectory): DEFAULT_PREFIX = 'qiime2-archive-' class ProvenancePath(InternalDirectory): DEFAULT_PREFIX = 'qiime2-provenance-' qiime2-2024.5.0/qiime2/core/testing/000077500000000000000000000000001462552636000167365ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/testing/__init__.py000066400000000000000000000005351462552636000210520ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- qiime2-2024.5.0/qiime2/core/testing/citations.bib000066400000000000000000000052451462552636000214170ustar00rootroot00000000000000@article{unger1998does, title={Does knuckle cracking lead to arthritis of the fingers?}, author={Unger, Donald L}, journal={Arthritis \& Rheumatology}, volume={41}, number={5}, pages={949--950}, year={1998}, publisher={Wiley Online Library} } @article{berry1997flying, title={Of flying frogs and levitrons}, author={Berry, Michael Victor and Geim, Andre Konstantin}, journal={European Journal of Physics}, volume={18}, number={4}, pages={307}, year={1997}, publisher={IOP Publishing} } @article{mayer2012walking, title={Walking with coffee: Why does it spill?}, author={Mayer, Hans C and Krechetnikov, Rouslan}, journal={Physical Review E}, volume={85}, number={4}, pages={046117}, year={2012}, publisher={APS} } @article{baerheim1994effect, title={Effect of ale, garlic, and soured cream on the appetite of leeches}, author={Baerheim, Anders and Sandvik, Hogne}, journal={BMJ}, volume={309}, number={6970}, pages={1689}, year={1994}, publisher={British Medical Journal Publishing Group} } @article{witcombe2006sword, title={Sword swallowing and its side effects}, author={Witcombe, Brian and Meyer, Dan}, journal={BMJ}, volume={333}, number={7582}, pages={1285--1287}, year={2006}, publisher={British Medical Journal Publishing Group} } @article{reimers2012response, title={Response behaviors of Svalbard reindeer towards humans and humans disguised as polar bears on Edge{\o}ya}, author={Reimers, Eigil and Eftest{\o}l, Sindre}, journal={Arctic, antarctic, and alpine research}, volume={44}, number={4}, pages={483--489}, year={2012}, publisher={BioOne} } @article{barbeito1967microbiological, title={Microbiological laboratory hazard of bearded men}, author={Barbeito, Manuel S and Mathews, Charles T and Taylor, Larry A}, journal={Applied microbiology}, volume={15}, number={4}, pages={899--906}, year={1967}, publisher={Am Soc Microbiol} } @article{krauth2012depth, title={An in-depth analysis of a piece of shit: distribution of Schistosoma mansoni and hookworm eggs in human stool}, author={Krauth, Stefanie J and Coulibaly, Jean T and Knopp, Stefanie and Traor{\'e}, Mahamadou and N'Goran, Eli{\'e}zer K and Utzinger, J{\"u}rg}, journal={PLoS 
neglected tropical diseases}, volume={6}, number={12}, pages={e1969}, year={2012}, publisher={Public Library of Science} } @article{silvers1997effects, title={The effects of pre-existing inappropriate highlighting on reading comprehension}, author={Silvers, Vicki L and Kreiner, David S}, journal={Literacy Research and Instruction}, volume={36}, number={3}, pages={217--223}, year={1997}, publisher={Taylor \& Francis} } qiime2-2024.5.0/qiime2/core/testing/examples.py000066400000000000000000000223641462552636000211350ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import pandas as pd from qiime2 import Artifact, Metadata, ResultCollection from .type import IntSequence1, IntSequence2, Mapping, SingleInt def ints1_factory(): return Artifact.import_data(IntSequence1, [0, 1, 2]) def ints2_factory(): return Artifact.import_data(IntSequence1, [3, 4, 5]) def ints3_factory(): return Artifact.import_data(IntSequence2, [6, 7, 8]) def artifact_collection_factory(): return ResultCollection({'Foo': Artifact.import_data(SingleInt, 1), 'Bar': Artifact.import_data(SingleInt, 2)}) def mapping1_factory(): return Artifact.import_data(Mapping, {'a': 42}) def md1_factory(): return Metadata(pd.DataFrame({'a': ['1', '2', '3']}, index=pd.Index(['0', '1', '2'], name='id'))) def md2_factory(): return Metadata(pd.DataFrame({'b': ['4', '5', '6']}, index=pd.Index(['0', '1', '2'], name='id'))) def single_int1_factory(): return Artifact.import_data(SingleInt, 10) def single_int2_factory(): return Artifact.import_data(SingleInt, 11) def concatenate_ints_simple(use): ints_a = use.init_artifact('ints_a', ints1_factory) ints_b = use.init_artifact('ints_b', ints2_factory) ints_c = use.init_artifact('ints_c', ints3_factory) use.comment('This example demonstrates basic usage.') ints_d, = use.action( use.UsageAction(plugin_id='dummy_plugin', action_id='concatenate_ints'), use.UsageInputs(ints1=ints_a, ints2=ints_b, ints3=ints_c, int1=4, int2=2), use.UsageOutputNames(concatenated_ints='ints_d'), ) def concatenate_ints_complex(use): ints_a = use.init_artifact('ints_a', ints1_factory) ints_b = use.init_artifact('ints_b', ints2_factory) ints_c = use.init_artifact('ints_c', ints3_factory) use.comment('This example demonstrates chained usage (pt 1).') ints_d, = use.action( use.UsageAction(plugin_id='dummy_plugin', action_id='concatenate_ints'), use.UsageInputs(ints1=ints_a, ints2=ints_b, ints3=ints_c, int1=4, int2=2), use.UsageOutputNames(concatenated_ints='ints_d'), ) use.comment('This example demonstrates chained usage (pt 2).') concatenated_ints, = use.action( use.UsageAction(plugin_id='dummy_plugin', action_id='concatenate_ints'), use.UsageInputs(ints1=ints_d, ints2=ints_b, ints3=ints_c, int1=41, int2=0), use.UsageOutputNames(concatenated_ints='concatenated_ints'), ) def typical_pipeline_simple(use): ints = use.init_artifact('ints', ints1_factory) mapper = use.init_artifact('mapper', mapping1_factory) use.action( use.UsageAction(plugin_id='dummy_plugin', action_id='typical_pipeline'), use.UsageInputs(int_sequence=ints, mapping=mapper, do_extra_thing=True), use.UsageOutputNames(out_map='out_map', left='left', right='right', left_viz='left_viz', right_viz='right_viz') ) def typical_pipeline_complex(use): ints1 
= use.init_artifact('ints1', ints1_factory) mapper1 = use.init_artifact('mapper1', mapping1_factory) mapper2, ints2, *_ = use.action( use.UsageAction(plugin_id='dummy_plugin', action_id='typical_pipeline'), use.UsageInputs(int_sequence=ints1, mapping=mapper1, do_extra_thing=True), use.UsageOutputNames(out_map='out_map1', left='left1', right='right1', left_viz='left_viz1', right_viz='right_viz1') ) _, _, right2, *_ = use.action( use.UsageAction(plugin_id='dummy_plugin', action_id='typical_pipeline'), use.UsageInputs(int_sequence=ints2, mapping=mapper2, do_extra_thing=False), use.UsageOutputNames(out_map='out_map2', left='left2', right='right2', left_viz='left_viz2', right_viz='right_viz2') ) right2.assert_has_line_matching( path='ints.txt', expression='1', ) # test that the non-string type works right2.assert_output_type(semantic_type=IntSequence1) # test that the string type works mapper2.assert_output_type(semantic_type='Mapping') def comments_only(use): use.comment('comment 1') use.comment('comment 2') def comments_only_factory(): def comments_only_closure(use): use.comment('comment 1') use.comment('comment 2') return comments_only_closure def identity_with_metadata_simple(use): ints = use.init_artifact('ints', ints1_factory) md = use.init_metadata('md', md1_factory) use.action( use.UsageAction(plugin_id='dummy_plugin', action_id='identity_with_metadata'), use.UsageInputs(ints=ints, metadata=md), use.UsageOutputNames(out='out'), ) def identity_with_metadata_merging(use): ints = use.init_artifact('ints', ints1_factory) md1 = use.init_metadata('md1', md1_factory) md2 = use.init_metadata('md2', md2_factory) md3 = use.merge_metadata('md3', md1, md2) use.action( use.UsageAction(plugin_id='dummy_plugin', action_id='identity_with_metadata'), use.UsageInputs(ints=ints, metadata=md3), use.UsageOutputNames(out='out'), ) def identity_with_metadata_column_get_mdc(use): ints = use.init_artifact('ints', ints1_factory) md = use.init_metadata('md', md1_factory) mdc = use.get_metadata_column('mdc', 'a', md) out, = use.action( use.UsageAction(plugin_id='dummy_plugin', action_id='identity_with_metadata_column'), use.UsageInputs(ints=ints, metadata=mdc), use.UsageOutputNames(out='out'), ) def variadic_input_simple(use): ints_a = use.init_artifact('ints_a', ints1_factory) ints_b = use.init_artifact('ints_b', ints2_factory) single_int1 = use.init_artifact('single_int1', single_int1_factory) single_int2 = use.init_artifact('single_int2', single_int2_factory) use.action( use.UsageAction(plugin_id='dummy_plugin', action_id='variadic_input_method'), use.UsageInputs(ints=[ints_a, ints_b], int_set={single_int1, single_int2}, nums={7, 8, 9}), use.UsageOutputNames(output='out'), ) def optional_inputs(use): ints = use.init_artifact('ints', ints1_factory) output1, = use.action( use.UsageAction(plugin_id='dummy_plugin', action_id='optional_artifacts_method'), use.UsageInputs(ints=ints, num1=1), use.UsageOutputNames(output='output1'), ) output2, = use.action( use.UsageAction(plugin_id='dummy_plugin', action_id='optional_artifacts_method'), use.UsageInputs(ints=ints, num1=1, num2=2), use.UsageOutputNames(output='output2'), ) output3, = use.action( use.UsageAction(plugin_id='dummy_plugin', action_id='optional_artifacts_method'), use.UsageInputs(ints=ints, num1=1, num2=None), use.UsageOutputNames(output='output3'), ) output4, = use.action( use.UsageAction(plugin_id='dummy_plugin', action_id='optional_artifacts_method'), use.UsageInputs(ints=ints, optional1=output3, num1=3, num2=4), use.UsageOutputNames(output='output4'), ) 
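# --- Illustrative sketch (not part of the original examples) ----------------
# Every example in this module follows the same pattern: build inputs with the
# `use.init_*` helpers, invoke an action through `use.action` with
# UsageAction/UsageInputs/UsageOutputNames, and optionally assert on the
# returned variables. As a hedged sketch of that pattern, an example for the
# plugin's parameter-only action could look like the function below; the
# function name is hypothetical and it is not registered in plugin.py.
def params_only_method_sketch(use):
    out, = use.action(
        use.UsageAction(plugin_id='dummy_plugin',
                        action_id='params_only_method'),
        use.UsageInputs(name='dummy', age=5),
        use.UsageOutputNames(out='out'),
    )
    out.assert_output_type(semantic_type='Mapping')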
def collection_list_of_ints(use): ints = use.init_artifact_collection('ints', artifact_collection_factory) out, = use.action( use.UsageAction(plugin_id='dummy_plugin', action_id='list_of_ints'), use.UsageInputs(ints=ints), use.UsageOutputNames(output='out'), ) def collection_dict_of_ints(use): ints = use.init_artifact_collection('ints', artifact_collection_factory) out, = use.action( use.UsageAction(plugin_id='dummy_plugin', action_id='dict_of_ints'), use.UsageInputs(ints=ints), use.UsageOutputNames(output='out'), ) out.assert_output_type(semantic_type='SingleInt', key='Foo') def construct_and_access_collection(use): ints_a = use.init_artifact('ints_a', single_int1_factory) ints_b = use.init_artifact('ints_b', single_int2_factory) rc_in = use.construct_artifact_collection( 'rc_in', {'a': ints_a, 'b': ints_b} ) rc_out, = use.action( use.UsageAction(plugin_id='dummy_plugin', action_id='dict_of_ints'), use.UsageInputs(ints=rc_in), use.UsageOutputNames(output='rc_out') ) ints_b_from_collection = use.get_artifact_collection_member( # noqa: F841 'ints_b_from_collection', rc_out, 'b' ) qiime2-2024.5.0/qiime2/core/testing/format.py000066400000000000000000000126001462552636000205770ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- from qiime2.plugin import TextFileFormat, ValidationError import qiime2.plugin.model as model class IntSequenceFormat(TextFileFormat): """ A sequence of integers stored on new lines in a file. Since this is a sequence, the integers have an order and repetition of elements is allowed. Sequential values must have an inter-value distance other than 3 to be valid. """ def _validate_n_ints(self, n): with self.open() as fh: last_val = None for idx, line in enumerate(fh, 1): if n is not None and idx >= n: break try: val = int(line.rstrip('\n')) except (TypeError, ValueError): raise ValidationError("Line %d is not an integer." % idx) if last_val is not None and last_val + 3 == val: raise ValidationError("Line %d is 3 more than line %d" % (idx, idx-1)) last_val = val def _validate_(self, level): record_map = {'min': 5, 'max': None} self._validate_n_ints(record_map[level]) class IntSequenceFormatV2(IntSequenceFormat): """ Same as IntSequenceFormat, but has a header "VERSION 2" """ def _validate_(self, level): with self.open() as fh: if fh.readline() != 'VERSION 2\n': raise ValidationError("Missing header: VERSION 2") class MappingFormat(TextFileFormat): """ A mapping of keys to values stored in a TSV file. Since this is a mapping, key-value pairs do not have an order and duplicate keys are disallowed. """ def _validate_(self, level): with self.open() as fh: for line, idx in zip(fh, range(1, 6)): cells = line.rstrip('\n').split('\t') if len(cells) != 2: raise ValidationError("Line %d does not have exactly 2 " "elements seperated by a tab." % idx) class SingleIntFormat(TextFileFormat): """ Exactly one int on a single line in the file. 
""" def _validate_(self, level): with self.open() as fh: try: int(fh.readline().rstrip('\n')) except (TypeError, ValueError): raise ValidationError("File does not contain an integer") if fh.readline(): raise ValidationError("Too many lines in file.") IntSequenceDirectoryFormat = model.SingleFileDirectoryFormat( 'IntSequenceDirectoryFormat', 'ints.txt', IntSequenceFormat) IntSequenceV2DirectoryFormat = model.SingleFileDirectoryFormat( 'IntSequenceV2DirectoryFormat', 'integers.txt', IntSequenceFormatV2) class IntSequenceMultiFileDirectoryFormat(model.DirectoryFormat): pass # This could have been a `SingleFileDirectoryFormat`, but isn't for testing # purposes class MappingDirectoryFormat(model.DirectoryFormat): mapping = model.File('mapping.tsv', format=MappingFormat) class FourIntsDirectoryFormat(model.DirectoryFormat): """ A sequence of exactly four integers stored across multiple files, some of which are in a nested directory. Each file contains a single integer. Since this is a sequence, the integers have an order (corresponding to filename) and repetition of elements is allowed. """ single_ints = model.FileCollection( r'file[1-2]\.txt|nested/file[3-4]\.txt', format=SingleIntFormat) @single_ints.set_path_maker def single_ints_path_maker(self, num): if not 0 < num < 5: raise ValueError("`num` must be 1-4, not %r." % num) if num > 2: return 'nested/file%d.txt' % num else: return 'file%d.txt' % num class RedundantSingleIntDirectoryFormat(model.DirectoryFormat): """ Two files of SingleIntFormat which are exactly the same. """ int1 = model.File('file1.txt', format=SingleIntFormat) int2 = model.File('file2.txt', format=SingleIntFormat) def _validate_(self, level): if self.int1.view(int) != self.int2.view(int): raise ValidationError("file1.txt does not match file2.txt") class UnimportableFormat(TextFileFormat): """ Unimportable format used for testing. """ UnimportableDirectoryFormat = model.SingleFileDirectoryFormat( 'UnimportableDirectoryFormat', 'ints.txt', UnimportableFormat) class EchoFormat(TextFileFormat): def _validate_(self, level): pass # Anything is a valid echo file EchoDirectoryFormat = model.SingleFileDirectoryFormat( 'EchoDirectoryFormat', 'echo.txt', EchoFormat) class Cephalapod(TextFileFormat): """ Class that inherits from text file format. Used for testing validator sorting. """ CephalapodDirectoryFormat = model.SingleFileDirectoryFormat( 'CephalapodDirectoryFormat', 'squids.tsv', Cephalapod) class ImportableOnlyFormat(TextFileFormat): """ A format that can only be transformed from. """ class ExportableOnlyFormat(TextFileFormat): """ A format that can only be transformed to. """ qiime2-2024.5.0/qiime2/core/testing/mapped.py000066400000000000000000000066571462552636000205740ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import os from .plugin import dummy_plugin, C1, C2, C3, Foo, Bar, Baz, EchoFormat from qiime2.plugin import ( TypeMap, TypeMatch, Properties, Visualization, Bool, Choices) def constrained_input_visualization(output_dir: str, a: EchoFormat, b: EchoFormat): with open(os.path.join(output_dir, 'index.html'), 'w') as fh: fh.write("
<p>%s</p>" % a.path.read_text())
        fh.write("<p>%s</p>
" % b.path.read_text()) T, U, V = TypeMap({ (Foo, Foo): Visualization, (Bar, Bar): Visualization, (Baz, Baz): Visualization, (C1[Foo], C1[Foo]): Visualization, (C1[Bar], C1[Bar]): Visualization, (C1[Baz], C1[Baz]): Visualization }) dummy_plugin.visualizers.register_function( function=constrained_input_visualization, inputs={ 'a': T, 'b': U }, parameters={}, name="Constrained Input Visualization", description="Ensure Foo/Bar/Baz match" ) del T, U, V def combinatorically_mapped_method(a: EchoFormat, b: EchoFormat ) -> (EchoFormat, EchoFormat): return a, b T, R = TypeMap({ Foo: Bar, Bar: Baz, Baz: Foo }) X, Y = TypeMap({ C3[Foo | Bar | Baz, Foo | Bar | Baz, Foo]: Foo, C3[Foo | Bar | Baz, Foo | Bar | Baz, Bar]: Bar, C3[Foo | Bar | Baz, Foo | Bar | Baz, Baz]: Baz }) dummy_plugin.methods.register_function( function=combinatorically_mapped_method, inputs={ 'a': C1[T], 'b': X }, parameters={}, outputs=[ ('x', C2[R, R]), ('y', Y) ], name="Combinatorically Mapped Method", description="Test that multiple typemaps can be used" ) del T, R, X, Y def double_bound_variable_method(a: EchoFormat, b: EchoFormat, extra: EchoFormat) -> EchoFormat: return extra T, R = TypeMap({ Foo: Bar, Bar: Baz, Baz: Foo }) dummy_plugin.methods.register_function( function=double_bound_variable_method, inputs={ 'a': T, 'b': T, 'extra': Foo }, parameters={}, outputs=[ ('x', R) ], name="Double Bound Variable Method", description="Test reuse of variables" ) del T, R def bool_flag_swaps_output_method(a: EchoFormat, b: bool) -> EchoFormat: return a P, R = TypeMap({ Choices(True): C1[Foo], Choices(False): Foo }) dummy_plugin.methods.register_function( function=bool_flag_swaps_output_method, inputs={ 'a': Bar }, parameters={ 'b': Bool % P }, outputs=[ ('x', R) ], name='Bool Flag Swaps Output Method', description='Test if a parameter can change output' ) del P, R def predicates_preserved_method(a: EchoFormat) -> EchoFormat: return a P = TypeMatch([Properties('A'), Properties('B'), Properties('C'), Properties('X', 'Y')]) dummy_plugin.methods.register_function( function=predicates_preserved_method, inputs={ 'a': Foo % P }, parameters={}, outputs=[ ('x', Foo % P) ], name='Predicates Preserved Method', description='Test that predicates are preserved' ) del P qiime2-2024.5.0/qiime2/core/testing/method.py000066400000000000000000000120511462552636000205670ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- from typing import Union import qiime2 # Artifacts and parameters. def concatenate_ints(ints1: list, ints2: list, ints3: list, int1: int, int2: int) -> list: return ints1 + ints2 + ints3 + [int1] + [int2] # Multiple output artifacts. def split_ints(ints: list) -> (list, list): middle = int(len(ints) / 2) left = ints[:middle] right = ints[middle:] return left, right # No parameters, only artifacts. def merge_mappings(mapping1: dict, mapping2: dict) -> dict: merged = mapping1.copy() for key, value in mapping2.items(): if key in merged and merged[key] != value: raise ValueError( "Key %r exists in `mapping1` and `mapping2` with conflicting " "values: %r != %r" % (key, merged[key], value)) merged[key] = value return merged # No input artifacts, only parameters. 
def params_only_method(name: str, age: int) -> dict: return {name: age} # Unioned primitives def unioned_primitives(foo: int, bar: str = 'auto_bar') -> dict: return {'foo': foo, 'bar': bar} # No input artifacts or parameters. def no_input_method() -> dict: return {'foo': 42} def deprecated_method() -> dict: return {'foo': 43} def long_description_method(mapping1: dict, name: str, age: int) -> dict: return {name: age} def docstring_order_method(req_input: dict, req_param: str, opt_input: dict = None, opt_param: int = None) -> dict: return {req_param: opt_param} def identity_with_metadata(ints: list, metadata: qiime2.Metadata) -> list: assert isinstance(metadata, qiime2.Metadata) return ints # TODO unit tests (test_method.py) for 3 variations of MetadataColumn methods # below def identity_with_metadata_column(ints: list, metadata: qiime2.MetadataColumn) -> list: assert isinstance(metadata, (qiime2.CategoricalMetadataColumn, qiime2.NumericMetadataColumn)) return ints def identity_with_categorical_metadata_column( ints: list, metadata: qiime2.CategoricalMetadataColumn) -> list: assert isinstance(metadata, qiime2.CategoricalMetadataColumn) return ints def identity_with_numeric_metadata_column( ints: list, metadata: qiime2.NumericMetadataColumn) -> list: assert isinstance(metadata, qiime2.NumericMetadataColumn) return ints def identity_with_optional_metadata(ints: list, metadata: qiime2.Metadata = None) -> list: assert isinstance(metadata, (qiime2.Metadata, type(None))) return ints def identity_with_optional_metadata_column( ints: list, metadata: qiime2.MetadataColumn = None) -> list: assert isinstance(metadata, (qiime2.CategoricalMetadataColumn, qiime2.NumericMetadataColumn, type(None))) return ints def optional_artifacts_method(ints: list, num1: int, optional1: list = None, optional2: list = None, num2: int = None) -> list: result = ints + [num1] if optional1 is not None: result += optional1 if optional2 is not None: result += optional2 if num2 is not None: result += [num2] return result def variadic_input_method(ints: list, int_set: int, nums: int, opt_nums: int = None) -> list: results = [] for int_list in ints: results += int_list results += sorted(int_set) results += nums if opt_nums: results += opt_nums return results def type_match_list_and_set(ints: list, strs1: list, strs2: set) -> list: return [0] def union_inputs(ints1: Union[dict, list], ints2: list) -> list: return [0] def list_of_ints(ints: int) -> int: assert isinstance(ints, list) return ints def dict_of_ints(ints: int) -> int: assert isinstance(ints, qiime2.sdk.result.ResultCollection) return ints def returns_int(int: int) -> int: return int def collection_inner_union(ints: list) -> list: return [[0]] def collection_outer_union(ints: list) -> list: return [[0]] def dict_params(ints: dict) -> int: assert isinstance(ints, dict) return ints def list_params(ints: list) -> int: assert isinstance(ints, list) return ints def varied_method(ints1: int, ints2: list, int1: int = None, string: str = "NO") -> (int, list, int): if int1 is None: int1 = 1 assert isinstance(ints1, list) assert isinstance(ints2, qiime2.sdk.result.ResultCollection) assert isinstance(int1, int) assert isinstance(string, str) return ints1, ints2, int1 def _underscore_method() -> int: return 42 qiime2-2024.5.0/qiime2/core/testing/pipeline.py000066400000000000000000000241231462552636000211170ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. 
# # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- from .type import SingleInt, Mapping from qiime2.sdk.result import ResultCollection from qiime2.core.testing.util import PipelineError def parameter_only_pipeline(ctx, int1, int2=2, metadata=None): identity_with_optional_metadata = ctx.get_action( 'dummy_plugin', 'identity_with_optional_metadata') concatenate_ints = ctx.get_action('dummy_plugin', 'concatenate_ints') ints1 = ctx.make_artifact('IntSequence2', [int1, int2, 3]) ints2, = identity_with_optional_metadata(ints1, metadata) ints3, = identity_with_optional_metadata(ints1, metadata) more_ints, = concatenate_ints(ints3, ints2, ints1, int1=int1, int2=int2) return ints1, more_ints def typical_pipeline(ctx, int_sequence, mapping, do_extra_thing, add=1): split_ints = ctx.get_action('dummy_plugin', 'split_ints') most_common_viz = ctx.get_action('dummy_plugin', 'most_common_viz') left, right = split_ints(int_sequence) if do_extra_thing: left = ctx.make_artifact( 'IntSequence1', [i + add for i in left.view(list)]) val, = mapping.view(dict).values() # Some kind of runtime failure if val != '42': raise ValueError("Bad mapping") left_viz, = most_common_viz(left) right_viz, = most_common_viz(right) return mapping, left, right, left_viz, right_viz def optional_artifact_pipeline(ctx, int_sequence, single_int=None): optional_artifact_method = ctx.get_action( 'dummy_plugin', 'optional_artifacts_method') if single_int is None: # not a nested pipeline, just sharing the ctx object single_int = pointless_pipeline(ctx) num1 = single_int.view(int) ints, = optional_artifact_method(int_sequence, num1) return ints def visualizer_only_pipeline(ctx, mapping): no_input_viz = ctx.get_action('dummy_plugin', 'no_input_viz') mapping_viz = ctx.get_action('dummy_plugin', 'mapping_viz') viz1, = no_input_viz() viz2, = mapping_viz(mapping, mapping, 'foo', 'bar') return viz1, viz2 def pipelines_in_pipeline(ctx, int_sequence, mapping): pointless_pipeline = ctx.get_action('dummy_plugin', 'pointless_pipeline') typical_pipeline = ctx.get_action('dummy_plugin', 'typical_pipeline') visualizer_only_pipeline = ctx.get_action( 'dummy_plugin', 'visualizer_only_pipeline') results = [] results += pointless_pipeline() typical_results = typical_pipeline(int_sequence, mapping, True) results += typical_results results += visualizer_only_pipeline(typical_results[0]) return tuple(results) def resumable_pipeline(ctx, int_list, int_dict, fail=False): """ This pipeline is designed to be called first with fail=True then a second time with fail=False. 
The second call is meant to reuse cached results from the first call """ list_of_ints = ctx.get_action('dummy_plugin', 'list_of_ints') dict_of_ints = ctx.get_action('dummy_plugin', 'dict_of_ints') list_return, = list_of_ints(int_list) dict_return, = dict_of_ints(int_dict) if fail: list_uuids = [str(result.uuid) for result in list_return.values()] dict_uuids = [str(result.uuid) for result in dict_return.values()] raise ValueError(f'{list_uuids}_{dict_uuids}') return list_return, dict_return # Either both int1 and string should be default or neither should be def resumable_varied_pipeline(ctx, ints1, ints2, metadata, int1=None, string='None', fail=False): varied_method = ctx.get_action('dummy_plugin', 'varied_method') list_of_ints = ctx.get_action('dummy_plugin', 'list_of_ints') dict_of_ints = ctx.get_action('dummy_plugin', 'dict_of_ints') identity_with_metadata = ctx.get_action('dummy_plugin', 'identity_with_metadata') most_common_viz = ctx.get_action('dummy_plugin', 'most_common_viz') if int1 is None and string == 'None': ints1_ret, ints2_ret, int1_ret = varied_method(ints1, ints2) else: ints1_ret, ints2_ret, int1_ret = varied_method( ints1, ints2, int1, string) list_ret, = list_of_ints(ints1_ret) dict_ret, = dict_of_ints(ints1) identity_ret, = identity_with_metadata(ints2[0], metadata) viz_ret, = most_common_viz(ints2[1]) if fail: uuids = [] uuids.append([str(result.uuid) for result in ints1_ret.values()]) uuids.append([str(result.uuid) for result in ints2_ret.values()]) uuids.append(str(int1_ret.uuid)) uuids.append([str(result.uuid) for result in list_ret.values()]) uuids.append([str(result.uuid) for result in dict_ret.values()]) uuids.append(str(identity_ret.uuid)) uuids.append(str(viz_ret.uuid)) raise PipelineError(uuids) return (ints1_ret, ints2_ret, int1_ret, list_ret, dict_ret, identity_ret, viz_ret) # Either both int1 and string should be default or neither should be def resumable_nested_varied_pipeline(ctx, ints1, ints2, metadata, int1=None, string='None', fail=False): internal_pipeline = ctx.get_action('dummy_plugin', 'internal_fail_pipeline') list_of_ints = ctx.get_action('dummy_plugin', 'list_of_ints') dict_of_ints = ctx.get_action('dummy_plugin', 'dict_of_ints') identity_with_metadata = ctx.get_action('dummy_plugin', 'identity_with_metadata') most_common_viz = ctx.get_action('dummy_plugin', 'most_common_viz') list_ret, = list_of_ints(ints1) dict_ret, = dict_of_ints(ints1) identity_ret, = identity_with_metadata(ints2[0], metadata) viz_ret, = most_common_viz(ints2[1]) try: if int1 is None and string == 'None': ints1_ret, ints2_ret, int1_ret = internal_pipeline( ints1, ints2, fail=fail)._result() else: ints1_ret, ints2_ret, int1_ret = internal_pipeline( ints1, ints2, int1, string, fail=fail)._result() except PipelineError as e: uuids = [uuid for uuid in e.uuids] uuids.append([str(result.uuid) for result in list_ret.values()]) uuids.append([str(result.uuid) for result in dict_ret.values()]) uuids.append(str(identity_ret.uuid)) uuids.append(str(viz_ret.uuid)) raise PipelineError(uuids) return (ints1_ret, ints2_ret, int1_ret, list_ret, dict_ret, identity_ret, viz_ret) # Either both int1 and string should be default or neither should be def internal_fail_pipeline(ctx, ints1, ints2, int1=None, string='None', fail=False): varied_method = ctx.get_action('dummy_plugin', 'varied_method') if int1 is None and string == 'None': ints1_ret, ints2_ret, int1_ret = varied_method( ints1, ints2) else: ints1_ret, ints2_ret, int1_ret = varied_method( ints1, ints2, int1, string) if fail: uuids = [] 
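        # The UUIDs of every output are packed into the raised exception so
        # that a second invocation (with fail=False) can verify the cached
        # results from this failed run are reused rather than recomputed.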
uuids.append([str(result.uuid) for result in ints1_ret.values()]) uuids.append([str(result.uuid) for result in ints2_ret.values()]) uuids.append(str(int1_ret.uuid)) raise PipelineError(uuids) return ints1_ret, ints2_ret, int1_ret def de_facto_list_pipeline(ctx, kwarg=False, non_proxies=False): returns_int = ctx.get_action('dummy_plugin', 'returns_int') list_of_ints = ctx.get_action('dummy_plugin', 'list_of_ints') num_ints = 3 ints = [] for i in range(num_ints): ints_ret, = returns_int(i) ints.append(ints_ret) if non_proxies: ints.append(ctx.make_artifact(SingleInt, num_ints + 1)) if kwarg: ret, = list_of_ints(ints=ints) else: ret, = list_of_ints(ints) return ret def de_facto_dict_pipeline(ctx, kwarg=False, non_proxies=False): returns_int = ctx.get_action('dummy_plugin', 'returns_int') dict_of_ints = ctx.get_action('dummy_plugin', 'dict_of_ints') num_ints = 3 ints = {} for i in range(num_ints): ints_ret, = returns_int(i) ints[str(i + 1)] = ints_ret if non_proxies: ints[str(num_ints + 2)] = ctx.make_artifact(SingleInt, num_ints + 1) if kwarg: ret, = dict_of_ints(ints=ints) else: ret, = dict_of_ints(ints) return ret def list_pipeline(ctx, ints): assert isinstance(ints, list) return ([ctx.make_artifact(SingleInt, 4), ctx.make_artifact(SingleInt, 5)]) def collection_pipeline(ctx, ints): assert isinstance(ints, ResultCollection) return {'key1': ctx.make_artifact(SingleInt, 4), 'key2': ctx.make_artifact(SingleInt, 5)} def de_facto_collection_pipeline(ctx): method = ctx.get_action('dummy_plugin', 'no_input_method') art1, = method() art2, = method() return [art1, art2] def pointless_pipeline(ctx): # Use a real type expression instead of a string. return ctx.make_artifact(SingleInt, 4) def failing_pipeline(ctx, int_sequence, break_from='arity'): merge_mappings = ctx.get_action('dummy_plugin', 'merge_mappings') list_ = int_sequence.view(list) if list_: integer = list_[0] else: integer = 0 # Made here so that we can make sure it gets cleaned up wrong_output = ctx.make_artifact(SingleInt, integer) if break_from == 'arity': return int_sequence, int_sequence, int_sequence elif break_from == 'return-view': return None elif break_from == 'type': return wrong_output elif break_from == 'method': a = ctx.make_artifact(Mapping, {'foo': 'a'}) b = ctx.make_artifact(Mapping, {'foo': 'b'}) # has the same key merge_mappings(a, b) elif break_from == 'no-plugin': ctx.get_action('not%a$plugin', 'foo') elif break_from == 'no-action': ctx.get_action('dummy_plugin', 'not%a$method') else: raise ValueError('this never works') qiime2-2024.5.0/qiime2/core/testing/plugin.py000066400000000000000000001004221462552636000206050ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- from importlib import import_module from qiime2.plugin import (Plugin, Bool, Int, Str, Choices, Range, List, Set, Collection, Visualization, Metadata, MetadataColumn, Categorical, Numeric, TypeMatch) from .format import ( IntSequenceFormat, IntSequenceFormatV2, IntSequenceMultiFileDirectoryFormat, MappingFormat, SingleIntFormat, IntSequenceDirectoryFormat, IntSequenceV2DirectoryFormat, MappingDirectoryFormat, FourIntsDirectoryFormat, RedundantSingleIntDirectoryFormat, UnimportableFormat, UnimportableDirectoryFormat, EchoFormat, EchoDirectoryFormat, Cephalapod, CephalapodDirectoryFormat, ImportableOnlyFormat, ExportableOnlyFormat ) from .type import (IntSequence1, IntSequence2, IntSequence3, Mapping, FourInts, SingleInt, Kennel, Dog, Cat, C1, C2, C3, Foo, Bar, Baz, AscIntSequence, Squid, Octopus, Cuttlefish) from .method import (concatenate_ints, split_ints, merge_mappings, identity_with_metadata, identity_with_metadata_column, identity_with_categorical_metadata_column, identity_with_numeric_metadata_column, identity_with_optional_metadata, identity_with_optional_metadata_column, params_only_method, no_input_method, deprecated_method, optional_artifacts_method, long_description_method, docstring_order_method, variadic_input_method, unioned_primitives, type_match_list_and_set, union_inputs, list_of_ints, dict_of_ints, returns_int, varied_method, collection_inner_union, collection_outer_union, dict_params, list_params, _underscore_method) from .visualizer import (most_common_viz, mapping_viz, params_only_viz, no_input_viz) from .pipeline import (parameter_only_pipeline, typical_pipeline, optional_artifact_pipeline, visualizer_only_pipeline, pipelines_in_pipeline, resumable_pipeline, resumable_varied_pipeline, resumable_nested_varied_pipeline, internal_fail_pipeline, de_facto_list_pipeline, de_facto_dict_pipeline, de_facto_collection_pipeline, list_pipeline, collection_pipeline, pointless_pipeline, failing_pipeline) from ..cite import Citations from .examples import (concatenate_ints_simple, concatenate_ints_complex, typical_pipeline_simple, typical_pipeline_complex, comments_only, identity_with_metadata_simple, identity_with_metadata_merging, identity_with_metadata_column_get_mdc, variadic_input_simple, optional_inputs, comments_only_factory, collection_list_of_ints, collection_dict_of_ints, construct_and_access_collection ) citations = Citations.load('citations.bib', package='qiime2.core.testing') dummy_plugin = Plugin( name='dummy-plugin', description='Description of dummy plugin.', short_description='Dummy plugin for testing.', version='0.0.0-dev', website='https://github.com/qiime2/qiime2', package='qiime2.core.testing', user_support_text='For help, see https://qiime2.org', citations=[citations['unger1998does'], citations['berry1997flying']] ) import_module('qiime2.core.testing.transformer') import_module('qiime2.core.testing.validator') # Register semantic types dummy_plugin.register_semantic_types(IntSequence1, IntSequence2, IntSequence3, Mapping, FourInts, Kennel, Dog, Cat, SingleInt, C1, C2, C3, Foo, Bar, Baz, AscIntSequence, Squid, Octopus, Cuttlefish) # Register formats dummy_plugin.register_formats( IntSequenceFormatV2, MappingFormat, IntSequenceV2DirectoryFormat, IntSequenceMultiFileDirectoryFormat, MappingDirectoryFormat, EchoDirectoryFormat, EchoFormat, Cephalapod, CephalapodDirectoryFormat, ImportableOnlyFormat, ExportableOnlyFormat) dummy_plugin.register_formats( FourIntsDirectoryFormat, 
UnimportableDirectoryFormat, UnimportableFormat, citations=[citations['baerheim1994effect']]) dummy_plugin.register_views( int, IntSequenceFormat, IntSequenceDirectoryFormat, SingleIntFormat, RedundantSingleIntDirectoryFormat, citations=[citations['mayer2012walking']]) # Create IntSequence1 import example usage example def is1_use(use): def factory(): from qiime2.core.testing.format import IntSequenceFormat from qiime2.plugin.util import transform ff = transform([1, 2, 3], to_type=IntSequenceFormat) ff.validate() return ff to_import = use.init_format('to_import', factory, ext='.hello') use.import_from_format('ints', semantic_type='IntSequence1', variable=to_import, view_type='IntSequenceFormat') dummy_plugin.register_artifact_class( IntSequence1, directory_format=IntSequenceDirectoryFormat, description="The first IntSequence", examples={'IntSequence1 import example': is1_use} ) # Create IntSequence2 import example usage example def is2_use(use): def factory(): from qiime2.core.testing.format import IntSequenceFormatV2 from qiime2.plugin.util import transform ff = transform([1, 2, 3], to_type=IntSequenceFormatV2) ff.validate() return ff to_import = use.init_format('to_import', factory, ext='.hello') use.import_from_format('ints', semantic_type='IntSequence2', variable=to_import, view_type='IntSequenceFormatV2') dummy_plugin.register_artifact_class( IntSequence2, directory_format=IntSequenceV2DirectoryFormat, description="The second IntSequence", examples={'IntSequence2 import example': is2_use} ) dummy_plugin.register_semantic_type_to_format( IntSequence3, directory_format=IntSequenceMultiFileDirectoryFormat ) dummy_plugin.register_semantic_type_to_format( Mapping, directory_format=MappingDirectoryFormat ) dummy_plugin.register_semantic_type_to_format( FourInts, directory_format=FourIntsDirectoryFormat ) dummy_plugin.register_semantic_type_to_format( SingleInt, directory_format=RedundantSingleIntDirectoryFormat ) dummy_plugin.register_semantic_type_to_format( Kennel[Dog | Cat], directory_format=MappingDirectoryFormat ) dummy_plugin.register_semantic_type_to_format( C3[C1[Foo | Bar | Baz] | Foo | Bar | Baz, C1[Foo | Bar | Baz] | Foo | Bar | Baz, C1[Foo | Bar | Baz] | Foo | Bar | Baz] | C2[Foo | Bar | Baz, Foo | Bar | Baz] | C1[Foo | Bar | Baz | C2[Foo | Bar | Baz, Foo | Bar | Baz]] | Foo | Bar | Baz, directory_format=EchoDirectoryFormat) dummy_plugin.register_semantic_type_to_format( AscIntSequence, directory_format=IntSequenceDirectoryFormat) dummy_plugin.register_semantic_type_to_format( Squid | Octopus | Cuttlefish, directory_format=CephalapodDirectoryFormat) # TODO add an optional parameter to this method when they are supported dummy_plugin.methods.register_function( function=concatenate_ints, inputs={ 'ints1': IntSequence1 | IntSequence2, 'ints2': IntSequence1, 'ints3': IntSequence2 }, parameters={ 'int1': Int, 'int2': Int }, outputs={ 'concatenated_ints': IntSequence1 }, name='Concatenate integers', description='This method concatenates integers into' ' a single sequence in the order they are provided.', citations=[citations['baerheim1994effect']], examples={'concatenate_ints_simple': concatenate_ints_simple, 'concatenate_ints_complex': concatenate_ints_complex, 'comments_only': comments_only, # execute factory to make a closure to test pickling 'comments_only_factory': comments_only_factory(), }, ) T = TypeMatch([IntSequence1, IntSequence2]) dummy_plugin.methods.register_function( function=split_ints, inputs={ 'ints': T }, parameters={}, outputs={ 'left': T, 'right': T }, name='Split 
sequence of integers in half', description='This method splits a sequence of integers in half, returning ' 'the two halves (left and right). If the input sequence\'s ' 'length is not evenly divisible by 2, the right half will ' 'have one more element than the left.', citations=[ citations['witcombe2006sword'], citations['reimers2012response']] ) dummy_plugin.methods.register_function( function=merge_mappings, inputs={ 'mapping1': Mapping, 'mapping2': Mapping }, input_descriptions={ 'mapping1': 'Mapping object to be merged' }, parameters={}, outputs=[ ('merged_mapping', Mapping) ], output_descriptions={ 'merged_mapping': 'Resulting merged Mapping object'}, name='Merge mappings', description='This method merges two mappings into a single new mapping. ' 'If a key is shared between mappings and the values differ, ' 'an error will be raised.' ) dummy_plugin.methods.register_function( function=identity_with_metadata, inputs={ 'ints': IntSequence1 | IntSequence2 }, parameters={ 'metadata': Metadata }, outputs=[ ('out', IntSequence1) ], name='Identity', description='This method does nothing, but takes metadata', examples={ 'identity_with_metadata_simple': identity_with_metadata_simple, 'identity_with_metadata_merging': identity_with_metadata_merging}, ) dummy_plugin.methods.register_function( function=long_description_method, inputs={ 'mapping1': Mapping }, input_descriptions={ 'mapping1': ("This is a very long description. If asked about its " "length, I would have to say it is greater than 79 " "characters.") }, parameters={ 'name': Str, 'age': Int }, parameter_descriptions={ 'name': ("This is a very long description. If asked about its length," " I would have to say it is greater than 79 characters.") }, outputs=[ ('out', Mapping) ], output_descriptions={ 'out': ("This is a very long description. If asked about its length," " I would have to say it is greater than 79 characters.") }, name="Long Description", description=("This is a very long description. If asked about its length," " I would have to say it is greater than 79 characters.") ) dummy_plugin.methods.register_function( function=docstring_order_method, inputs={ 'req_input': Mapping, 'opt_input': Mapping }, input_descriptions={ 'req_input': "This should show up first.", 'opt_input': "This should show up third." }, parameters={ 'req_param': Str, 'opt_param': Int }, parameter_descriptions={ 'req_param': "This should show up second.", 'opt_param': "This should show up fourth." }, outputs=[ ('out', Mapping) ], output_descriptions={ 'out': "This should show up last, in it's own section." 
}, name="Docstring Order", description=("Tests whether inputs and parameters are rendered in " "signature order") ) dummy_plugin.methods.register_function( function=identity_with_metadata_column, inputs={ 'ints': IntSequence1 | IntSequence2 }, parameters={ 'metadata': MetadataColumn[Categorical | Numeric] }, outputs=[ ('out', IntSequence1) ], name='Identity', description='This method does nothing, ' 'but takes a generic metadata column', examples={ 'identity_with_metadata_column_get_mdc': identity_with_metadata_column_get_mdc, }, ) dummy_plugin.methods.register_function( function=identity_with_categorical_metadata_column, inputs={ 'ints': IntSequence1 | IntSequence2 }, parameters={ 'metadata': MetadataColumn[Categorical] }, outputs=[ ('out', IntSequence1) ], name='Identity', description='This method does nothing, but takes a categorical metadata ' 'column' ) dummy_plugin.methods.register_function( function=identity_with_numeric_metadata_column, inputs={ 'ints': IntSequence1 | IntSequence2 }, parameters={ 'metadata': MetadataColumn[Numeric] }, outputs=[ ('out', IntSequence1) ], name='Identity', description='This method does nothing, but takes a numeric metadata column' ) dummy_plugin.methods.register_function( function=identity_with_optional_metadata, inputs={ 'ints': IntSequence1 | IntSequence2 }, parameters={ 'metadata': Metadata }, outputs=[ ('out', IntSequence1) ], name='Identity', description='This method does nothing, but takes optional metadata' ) dummy_plugin.methods.register_function( function=identity_with_optional_metadata_column, inputs={ 'ints': IntSequence1 | IntSequence2 }, parameters={ 'metadata': MetadataColumn[Numeric | Categorical] }, outputs=[ ('out', IntSequence1) ], name='Identity', description='This method does nothing, but takes an optional generic ' 'metadata column' ) dummy_plugin.methods.register_function( function=params_only_method, inputs={}, parameters={ 'name': Str, 'age': Int }, outputs=[ ('out', Mapping) ], name='Parameters only method', description='This method only accepts parameters.', ) dummy_plugin.methods.register_function( function=unioned_primitives, inputs={}, parameters={ 'foo': Int % Range(1, None) | Str % Choices(['auto_foo']), 'bar': Int % Range(1, None) | Str % Choices(['auto_bar']), }, outputs=[ ('out', Mapping) ], name='Unioned primitive parameter', description='This method has a unioned primitive parameter' ) dummy_plugin.methods.register_function( function=no_input_method, inputs={}, parameters={}, outputs=[ ('out', Mapping) ], name='No input method', description='This method does not accept any type of input.' 
) dummy_plugin.methods.register_function( function=deprecated_method, inputs={}, parameters={}, outputs=[ ('out', Mapping) ], name='A deprecated method', description='This deprecated method does not accept any type of input.', deprecated=True, ) dummy_plugin.methods.register_function( function=optional_artifacts_method, inputs={ 'ints': IntSequence1, 'optional1': IntSequence1, 'optional2': IntSequence1 | IntSequence2 }, parameters={ 'num1': Int, 'num2': Int }, outputs=[ ('output', IntSequence1) ], name='Optional artifacts method', description='This method declares optional artifacts and concatenates ' 'whatever integers are supplied as input.', examples={'optional_inputs': optional_inputs}, ) dummy_plugin.methods.register_function( function=variadic_input_method, inputs={ 'ints': List[IntSequence1 | IntSequence2], 'int_set': Set[SingleInt] }, parameters={ 'nums': Set[Int], 'opt_nums': List[Int % Range(10, 20)] }, outputs=[ ('output', IntSequence1) ], name='Test variadic inputs', description='This method concatenates all of its variadic inputs', input_descriptions={ 'ints': 'A list of int artifacts', 'int_set': 'A set of int artifacts' }, parameter_descriptions={ 'nums': 'A set of ints', 'opt_nums': 'An optional list of ints' }, output_descriptions={ 'output': 'All of the above mashed together' }, examples={'variadic_input_simple': variadic_input_simple}, ) T = TypeMatch([IntSequence1, IntSequence2]) dummy_plugin.methods.register_function( function=type_match_list_and_set, inputs={ 'ints': T }, parameters={ 'strs1': List[Str], 'strs2': Set[Str] }, outputs=[ ('output', T) ], name='TypeMatch with list and set params', description='Just a method with a TypeMatch and list/set params', input_descriptions={ 'ints': 'An int artifact' }, parameter_descriptions={ 'strs1': 'A list of strings', 'strs2': 'A set of strings' }, output_descriptions={ 'output': '[0]' } ) dummy_plugin.visualizers.register_function( function=params_only_viz, inputs={}, parameters={ 'name': Str, 'age': Int % Range(0, None) }, name='Parameters only viz', description='This visualizer only accepts parameters.' ) dummy_plugin.visualizers.register_function( function=no_input_viz, inputs={}, parameters={}, name='No input viz', description='This visualizer does not accept any type of input.' ) dummy_plugin.visualizers.register_function( function=most_common_viz, inputs={ 'ints': IntSequence1 | IntSequence2 }, parameters={}, name='Visualize most common integers', description='This visualizer produces HTML and TSV outputs containing the ' 'input sequence of integers ordered from most- to ' 'least-frequently occurring, along with their respective ' 'frequencies.', citations=[citations['barbeito1967microbiological']] ) # TODO add optional parameters to this method when they are supported dummy_plugin.visualizers.register_function( function=mapping_viz, inputs={ 'mapping1': Mapping, 'mapping2': Mapping }, parameters={ 'key_label': Str, 'value_label': Str }, name='Visualize two mappings', description='This visualizer produces an HTML visualization of two ' 'key-value mappings, each sorted in alphabetical order by key.' 
) dummy_plugin.pipelines.register_function( function=parameter_only_pipeline, inputs={}, parameters={ 'int1': Int, 'int2': Int, 'metadata': Metadata }, outputs=[ ('foo', IntSequence2), ('bar', IntSequence1) ], name='Do multiple things', description='This pipeline only accepts parameters', parameter_descriptions={ 'int1': 'An integer, the first one in fact', 'int2': 'An integer, the second one', 'metadata': 'Very little is done with this' }, output_descriptions={ 'foo': 'Foo - "The Integers of 2"', 'bar': 'Bar - "What a sequences"' }, ) dummy_plugin.pipelines.register_function( function=typical_pipeline, inputs={ 'int_sequence': IntSequence1, 'mapping': Mapping }, parameters={ 'do_extra_thing': Bool, 'add': Int }, outputs=[ ('out_map', Mapping), ('left', IntSequence1), ('right', IntSequence1), ('left_viz', Visualization), ('right_viz', Visualization) ], input_descriptions={ 'int_sequence': 'A sequence of ints', 'mapping': 'A map to a number other than 42 will fail' }, parameter_descriptions={ 'do_extra_thing': 'Increment `left` by `add` if true', 'add': 'Unused if `do_extra_thing` is false' }, output_descriptions={ 'out_map': 'Same as input', 'left': 'Left side of `int_sequence` unless `do_extra_thing`', 'right': 'Right side of `int_sequence`', 'left_viz': '`left` visualized', 'right_viz': '`right` visualized' }, name='A typical pipeline with the potential to raise an error', description='Waste some time shuffling data around for no reason', citations=citations, # ALL of them. examples={'typical_pipeline_simple': typical_pipeline_simple, 'typical_pipeline_complex': typical_pipeline_complex}, ) dummy_plugin.pipelines.register_function( function=optional_artifact_pipeline, inputs={ 'int_sequence': IntSequence1, 'single_int': SingleInt }, parameters={}, outputs=[ ('ints', IntSequence1) ], input_descriptions={ 'int_sequence': 'Some integers', 'single_int': 'An integer' }, output_descriptions={ 'ints': 'More integers' }, name='Do stuff normally, but override this one step sometimes', description='Creates its own single_int, unless provided' ) dummy_plugin.pipelines.register_function( function=visualizer_only_pipeline, inputs={ 'mapping': Mapping }, parameters={}, outputs=[ ('viz1', Visualization), ('viz2', Visualization) ], input_descriptions={ 'mapping': 'A mapping to look at twice' }, output_descriptions={ 'viz1': 'The no input viz', 'viz2': 'Our `mapping` seen through the lense of "foo" *and* "bar"' }, name='Visualize many things', description='Looks at both nothing and a mapping' ) dummy_plugin.pipelines.register_function( function=pipelines_in_pipeline, inputs={ 'int_sequence': IntSequence1, 'mapping': Mapping }, parameters={}, outputs=[ ('int1', SingleInt), ('out_map', Mapping), ('left', IntSequence1), ('right', IntSequence1), ('left_viz', Visualization), ('right_viz', Visualization), ('viz1', Visualization), ('viz2', Visualization) ], name='Do a great many things', description=('Mapping is chained from typical_pipeline into ' 'visualizer_only_pipeline') ) dummy_plugin.pipelines.register_function( function=resumable_pipeline, inputs={ 'int_list': List[SingleInt], 'int_dict': Collection[SingleInt], }, parameters={ 'fail': Bool }, outputs=[ ('list_return', Collection[SingleInt]), ('dict_return', Collection[SingleInt]), ], name='To be resumed', description=('Called first with fail=True then again with fail=False ' 'meant to reuse results from first run durng second run') ) T = TypeMatch([IntSequence1, IntSequence2]) dummy_plugin.pipelines.register_function( function=resumable_varied_pipeline, 
inputs={ 'ints1': Collection[SingleInt], 'ints2': List[T], 'int1': SingleInt, }, parameters={ 'string': Str, 'metadata': Metadata, 'fail': Bool, }, outputs=[ ('ints1_ret', Collection[SingleInt]), ('ints2_ret', Collection[T]), ('int1_ret', SingleInt), ('list_ret', Collection[SingleInt]), ('dict_ret', Collection[SingleInt]), ('identity_ret', T), ('viz', Visualization), ], name='To be resumed', description=('Called first with fail=True then again with fail=False ' 'meant to reuse results from first run durng second run') ) dummy_plugin.pipelines.register_function( function=resumable_nested_varied_pipeline, inputs={ 'ints1': Collection[SingleInt], 'ints2': List[T], 'int1': SingleInt, }, parameters={ 'string': Str, 'metadata': Metadata, 'fail': Bool, }, outputs=[ ('ints1_ret', Collection[SingleInt]), ('ints2_ret', Collection[T]), ('int1_ret', SingleInt), ('list_ret', Collection[SingleInt]), ('dict_ret', Collection[SingleInt]), ('identity_ret', T), ('viz', Visualization), ], name='To be resumed', description=('Called first with fail=True then again with fail=False ' 'meant to reuse results from first run durng second run') ) dummy_plugin.pipelines.register_function( function=internal_fail_pipeline, inputs={ 'ints1': Collection[SingleInt], 'ints2': List[T], 'int1': SingleInt, }, parameters={ 'string': Str, 'fail': Bool, }, outputs=[ ('ints1_ret', Collection[SingleInt]), ('ints2_ret', Collection[T]), ('int1_ret', SingleInt), ], name='Internal fail pipeline', description=('This pipeline is called inside of ' 'resumable_nested_variable_pipeline to mimic a nested ' 'pipeline failing') ) dummy_plugin.pipelines.register_function( function=de_facto_list_pipeline, inputs={}, parameters={ 'kwarg': Bool, 'non_proxies': Bool }, outputs=[ ('output', Collection[SingleInt]), ], name='Pipeline that creates a de facto list of artifacts.', description=('This pipeline is supposed to be run in parallel to assert ' 'that we can handle a de facto list of proxies.') ) dummy_plugin.pipelines.register_function( function=de_facto_dict_pipeline, inputs={}, parameters={ 'kwarg': Bool, 'non_proxies': Bool }, outputs=[ ('output', Collection[SingleInt]), ], name='Pipeline that creates a de facto dict of artifacts.', description=('This pipeline is supposed to be run in parallel to assert ' 'that we can handle a de facto dict of proxies.') ) dummy_plugin.pipelines.register_function( function=list_pipeline, inputs={'ints': List[IntSequence1]}, parameters={}, outputs=[('output', Collection[SingleInt])], name='Takes a list and returns a collection', description='Takes a list and returns a collection' ) dummy_plugin.pipelines.register_function( function=collection_pipeline, inputs={'ints': Collection[IntSequence1]}, parameters={}, outputs=[('output', Collection[SingleInt])], name='Takes a collection and returns a collection', description='Takes a collection and returns a collection' ) dummy_plugin.pipelines.register_function( function=de_facto_collection_pipeline, inputs={}, parameters={}, outputs=[('output', Collection[Mapping])], name='Returns de facto ResultCollection', description='Takes nothing and returns de facto ResultCollection' ) dummy_plugin.pipelines.register_function( function=pointless_pipeline, inputs={}, parameters={}, outputs=[('random_int', SingleInt)], name='Get an integer', description='Integer was chosen to be 4 by a random dice roll' ) dummy_plugin.pipelines.register_function( function=failing_pipeline, inputs={ 'int_sequence': IntSequence1 }, parameters={ 'break_from': Str % Choices( {'arity', 'return-view', 
'type', 'method', 'internal', 'no-plugin', 'no-action'}) }, outputs=[('mapping', Mapping)], name='Test different ways of failing', description=('This is useful to make sure all of the intermediate stuff is' ' cleaned up the way it should be.') ) dummy_plugin.methods.register_function( function=union_inputs, inputs={ 'ints1': IntSequence1, 'ints2': IntSequence2, }, parameters={}, outputs=[ ('ints', IntSequence1) ], name='Inputs with typing.Union', input_descriptions={ 'ints1': 'An int artifact', 'ints2': 'An int artifact' }, output_descriptions={ 'ints': '[0]', }, description='This method accepts a list or dict as first input.' ) dummy_plugin.methods.register_function( function=list_of_ints, inputs={ 'ints': List[SingleInt] }, parameters={}, outputs=[ ('output', Collection[SingleInt]) ], name='Reverses list of inputs', description='Some description', input_descriptions={ 'ints': 'Collection of ints' }, output_descriptions={ 'output': 'Reversed Collection of ints' }, examples={'collection_list_of_ints': collection_list_of_ints} ) dummy_plugin.methods.register_function( function=returns_int, inputs={}, parameters={ 'int': Int }, outputs=[ ('output', SingleInt) ], name='Returns int', description='Just returns an int', ) dummy_plugin.methods.register_function( function=dict_of_ints, inputs={ 'ints': Collection[SingleInt] }, parameters={}, outputs=[ ('output', Collection[SingleInt]) ], name='Takes ints', description='Some description', input_descriptions={ 'ints': 'Collection of ints' }, output_descriptions={ 'output': 'Collection of ints' }, examples={ 'collection_dict_of_ints': collection_dict_of_ints, 'construct_and_access_collection': construct_and_access_collection } ) dummy_plugin.methods.register_function( function=collection_inner_union, inputs={ 'ints': Collection[IntSequence1 | IntSequence2] }, parameters={}, outputs=[ ('output', Collection[IntSequence1]) ], name='Takes ints', description='Some description', input_descriptions={ 'ints': 'Collection of ints' }, output_descriptions={ 'output': 'Collection of ints' } ) dummy_plugin.methods.register_function( function=collection_outer_union, inputs={ 'ints': Collection[IntSequence1] | Collection[IntSequence2] }, parameters={}, outputs=[ ('output', Collection[IntSequence1]) ], name='Takes ints', description='Some description', input_descriptions={ 'ints': 'Collection of ints' }, output_descriptions={ 'output': 'Collection of ints' } ) dummy_plugin.methods.register_function( function=dict_params, inputs={}, parameters={ 'ints': Collection[Int], }, outputs=[ ('output', Collection[SingleInt]) ], name='Parameters only method', description='This method only accepts parameters.', ) dummy_plugin.methods.register_function( function=list_params, inputs={}, parameters={ 'ints': List[Int], }, outputs=[ ('output', Collection[SingleInt]) ], name='Parameters only method', description='This method only accepts parameters.', ) dummy_plugin.methods.register_function( function=varied_method, inputs={ 'ints1': List[SingleInt], 'ints2': Collection[IntSequence1], 'int1': SingleInt }, parameters={ 'string': Str, }, outputs=[ ('ints', Collection[SingleInt]), ('sequences', Collection[IntSequence1]), ('int', SingleInt) ], name='Takes and returns a combination of colletions and non collections', description='Takes and returns a combination of colletions and non' ' collections' ) dummy_plugin.methods.register_function( function=_underscore_method, inputs={}, parameters={}, outputs=[ ('int', SingleInt) ], name='Starts with an underscore', description='Exists to test 
that the cli does not render actions that' ' start with an underscore by default' ) import_module('qiime2.core.testing.mapped') other_plugin = Plugin( name='other-plugin', description='', short_description='', version='0.0.0-dev', website='', package='qiime2.core.archive.provenance_lib.tests', user_support_text='', citations=[] ) other_plugin.methods.register_function( function=concatenate_ints, inputs={ 'ints1': IntSequence1, 'ints2': IntSequence1, 'ints3': IntSequence1, }, parameters={ 'int1': Int, 'int2': Int }, outputs={ 'concatenated_ints': IntSequence1 }, name='Concatenate integers', description='Some description' ) @other_plugin.register_transformer def _9999999(ff: SingleIntFormat) -> str: with ff.open() as fh: return fh.read() qiime2-2024.5.0/qiime2/core/testing/tests/000077500000000000000000000000001462552636000201005ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/testing/tests/__init__.py000066400000000000000000000005351462552636000222140ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- qiime2-2024.5.0/qiime2/core/testing/tests/test_mapped_actions.py000066400000000000000000000164661462552636000245140ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import unittest from qiime2 import Artifact from qiime2.core.testing.util import get_dummy_plugin from ..type import IntSequence1, IntSequence2 class ActionTester(unittest.TestCase): ACTION = 'N/A' def setUp(self): plugin = get_dummy_plugin() self.action = plugin.actions[self.ACTION] def run_action(self, **inputs): results = self.action(**inputs) async_results = self.action.asynchronous(**inputs).result() for a, b in zip(results, async_results): self.assertEqual(a.type, b.type) return results class TestConstrainedInputVisualization(ActionTester): ACTION = 'constrained_input_visualization' def test_match_foo(self): a = Artifact.import_data('Foo', "element 1", view_type=str) b = Artifact.import_data('Foo', "element 2", view_type=str) viz, = self.run_action(a=a, b=b) contents = (viz._archiver.data_dir / 'index.html').read_text() self.assertIn('element 1', contents) self.assertIn('element 2', contents) def test_match_nested(self): a = Artifact.import_data('C1[Baz]', "element 1", view_type=str) b = Artifact.import_data('C1[Baz]', "element 2", view_type=str) viz, = self.run_action(a=a, b=b) contents = (viz._archiver.data_dir / 'index.html').read_text() self.assertIn('element 1', contents) self.assertIn('element 2', contents) def test_mismatch_foo_bar(self): a = Artifact.import_data('Foo', "element 1", view_type=str) b = Artifact.import_data('Bar', "element 2", view_type=str) with self.assertRaisesRegex(ValueError, 'No solution.*Foo'): viz, = self.run_action(a=a, b=b) def test_mismatch_nested(self): a = Artifact.import_data('C1[Foo]', "element 1", view_type=str) b = Artifact.import_data('Foo', "element 2", view_type=str) with self.assertRaisesRegex(ValueError, 'No solution.*C1'): viz, = 
self.run_action(a=a, b=b) class TestCombinatoricallyMappedMethod(ActionTester): ACTION = 'combinatorically_mapped_method' def test_match_foo(self): a = Artifact.import_data('C1[Foo]', 'element 1', view_type=str) b = Artifact.import_data('C3[Foo, Foo, Foo]', 'element 2', view_type=str) x, y = self.run_action(a=a, b=b) self.assertEqual(repr(x.type), 'C2[Bar, Bar]') self.assertEqual(repr(y.type), 'Foo') def test_match_bar_foo(self): a = Artifact.import_data('C1[Bar]', 'element 1', view_type=str) b = Artifact.import_data('C3[Foo, Foo, Foo]', 'element 2', view_type=str) x, y = self.run_action(a=a, b=b) self.assertEqual(repr(x.type), 'C2[Baz, Baz]') self.assertEqual(repr(y.type), 'Foo') def test_match_baz_misc(self): a = Artifact.import_data('C1[Baz]', 'element 1', view_type=str) b = Artifact.import_data('C3[Foo, Bar, Baz]', 'element 2', view_type=str) x, y = self.run_action(a=a, b=b) self.assertEqual(repr(x.type), 'C2[Foo, Foo]') self.assertEqual(repr(y.type), 'Baz') def test_mismatch(self): a = Artifact.import_data('Bar', 'element 1', view_type=str) b = Artifact.import_data('C3[Foo, Foo, Foo]', 'element 2', view_type=str) with self.assertRaises(TypeError): self.run_action(a=a, b=b) class TestDoubleBoundVariableMethod(ActionTester): ACTION = 'double_bound_variable_method' def test_predicate_on_second(self): a = Artifact.import_data('Bar', 'element 1', view_type=str) b = Artifact.import_data('Bar % Properties("A")', 'element 2', view_type=str) extra = Artifact.import_data('Foo', 'always foo', view_type=str) x, = self.run_action(a=a, b=b, extra=extra) self.assertEqual(repr(x.type), 'Baz') def test_mismatch(self): a = Artifact.import_data('Foo', 'element 1', view_type=str) b = Artifact.import_data('Bar', 'element 2', view_type=str) extra = Artifact.import_data('Foo', 'always foo', view_type=str) with self.assertRaisesRegex(ValueError, 'match.*same output'): self.run_action(a=a, b=b, extra=extra) class TestBoolFlagSwapsOutputMethod(ActionTester): ACTION = 'bool_flag_swaps_output_method' def test_true(self): a = Artifact.import_data('Bar', 'element', view_type=str) x, = self.run_action(a=a, b=True) self.assertEqual(repr(x.type), 'C1[Foo]') def test_false(self): a = Artifact.import_data('Bar', 'element', view_type=str) x, = self.run_action(a=a, b=False) self.assertEqual(repr(x.type), 'Foo') class TestPredicatesPreservedMethod(ActionTester): ACTION = 'predicates_preserved_method' def test_simple(self): a = Artifact.import_data("Foo % Properties('A')", 'element 1', view_type=str) x, = self.run_action(a=a) self.assertEqual(repr(x.type), "Foo % Properties('A')") def test_mismatch(self): a = Artifact.import_data("Foo % Properties('X')", 'element 1', view_type=str) with self.assertRaises(TypeError): self.run_action(a=a) def test_combinations_preserved(self): a = Artifact.import_data("Foo % Properties('A', 'B')", 'element 1', view_type=str) x, = self.run_action(a=a) self.assertEqual(repr(x.type), "Foo % Properties('A', 'B')") def test_extra_dropped(self): a = Artifact.import_data("Foo % Properties('Extra', 'A', 'B')", 'element 1', view_type=str) x, = self.run_action(a=a) self.assertEqual(repr(x.type), "Foo % Properties('A', 'B')") class TestTypeMatchWithListAndSet(ActionTester): ACTION = 'type_match_list_and_set' def test_intsequence1(self): a = Artifact.import_data('IntSequence1', [1]) x = self.run_action(ints=a, strs1=['a'], strs2={'a'}) self.assertEqual(x.output.type, IntSequence1) def test_intsequence2(self): a = Artifact.import_data('IntSequence2', [1]) x = self.run_action(ints=a, strs1=['a'], 
strs2={'a'}) self.assertEqual(x.output.type, IntSequence2) class TestUnionedPrimitiveDecode(ActionTester): ACTION = 'unioned_primitives' def test_decode_int(self): exp = dict(foo=1, bar=1) res = self.action.signature.decode_parameters(foo='1', bar='1') self.assertEqual(res, exp) def test_decode_str(self): exp = dict(foo='auto_foo', bar='auto_bar') res = self.action.signature.decode_parameters(**exp) self.assertEqual(res, exp) def test_decode_mix(self): exp = dict(foo=1, bar='auto_bar') res = self.action.signature.decode_parameters(foo='1', bar='auto_bar') self.assertEqual(res, exp) if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/core/testing/transformer.py000066400000000000000000000135311462552636000216550ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import collections from qiime2 import Metadata import pandas as pd from .format import ( FourIntsDirectoryFormat, MappingDirectoryFormat, IntSequenceFormat, IntSequenceFormatV2, IntSequenceDirectoryFormat, IntSequenceV2DirectoryFormat, IntSequenceMultiFileDirectoryFormat, SingleIntFormat, MappingFormat, UnimportableFormat, RedundantSingleIntDirectoryFormat, EchoFormat, ImportableOnlyFormat, ExportableOnlyFormat ) from .plugin import dummy_plugin, citations @dummy_plugin.register_transformer def _2(data: int) -> SingleIntFormat: ff = SingleIntFormat() with ff.open() as fh: fh.write('%d\n' % data) return ff @dummy_plugin.register_transformer def _5(ff: SingleIntFormat) -> int: with ff.open() as fh: return int(fh.read()) @dummy_plugin.register_transformer(citations=[citations['krauth2012depth']]) def _7(data: list) -> IntSequenceFormat: ff = IntSequenceFormat() with ff.open() as fh: for int_ in data: fh.write(f'{int_}\n') return ff @dummy_plugin.register_transformer(citations=citations) def _77(data: list) -> IntSequenceFormatV2: ff = IntSequenceFormatV2() with ff.open() as fh: fh.write('VERSION 2\n') for int_ in data: fh.write('%d\n' % int_) return ff @dummy_plugin.register_transformer def _9(ff: IntSequenceFormat) -> list: with ff.open() as fh: return list(map(int, fh.readlines())) @dummy_plugin.register_transformer def _99(ff: IntSequenceFormatV2) -> list: with ff.open() as fh: fh.readline() # skip header return list(map(int, fh.readlines())) @dummy_plugin.register_transformer def _10(ff: IntSequenceFormat) -> collections.Counter: with ff.open() as fh: return collections.Counter(map(int, fh.readlines())) @dummy_plugin.register_transformer def _1010(ff: IntSequenceFormatV2) -> collections.Counter: with ff.open() as fh: fh.readline() # skip header return collections.Counter(map(int, fh.readlines())) @dummy_plugin.register_transformer def _1000(ff: IntSequenceFormat) -> IntSequenceFormatV2: new_ff = IntSequenceFormatV2() with new_ff.open() as new_fh, ff.open() as fh: new_fh.write("VERSION 2\n") for line in fh: new_fh.write(line) return new_ff # This only exists to test `get_formats` and is functionally useless otherwise @dummy_plugin.register_transformer def _1100(df: IntSequenceMultiFileDirectoryFormat) -> \ IntSequenceDirectoryFormat: return IntSequenceDirectoryFormat() @dummy_plugin.register_transformer def _1001(df: IntSequenceV2DirectoryFormat) -> \ IntSequenceMultiFileDirectoryFormat: return 
IntSequenceMultiFileDirectoryFormat() @dummy_plugin.register_transformer def _0202(data: int) -> RedundantSingleIntDirectoryFormat: df = RedundantSingleIntDirectoryFormat() df.int1.write_data(data, int) df.int2.write_data(data, int) return df @dummy_plugin.register_transformer def _2020(ff: RedundantSingleIntDirectoryFormat) -> int: return ff.int1.view(int) # int2 must be the same for this format @dummy_plugin.register_transformer def _11(data: dict) -> MappingDirectoryFormat: df = MappingDirectoryFormat() df.mapping.write_data(data, dict) return df @dummy_plugin.register_transformer def _12(data: dict) -> MappingFormat: ff = MappingFormat() with ff.open() as fh: for key, value in data.items(): fh.write('%s\t%s\n' % (key, value)) return ff @dummy_plugin.register_transformer(citations=[citations['silvers1997effects']]) def _13(df: MappingDirectoryFormat) -> dict: # If this had been a `SingleFileDirectoryFormat` then this entire # transformer would have been redundant (the framework could infer it). return df.mapping.view(dict) @dummy_plugin.register_transformer def _14(ff: MappingFormat) -> dict: data = {} with ff.open() as fh: for line in fh: key, value = line.rstrip('\n').split('\t') if key in data: raise ValueError( "mapping.txt file must have unique keys. Key %r was " "observed more than once." % key) data[key] = value return data @dummy_plugin.register_transformer def _15(df: MappingDirectoryFormat) -> Metadata: d = df.mapping.view(dict) return Metadata(pd.DataFrame(d, index=pd.Index(["0"], name='id'))) @dummy_plugin.register_transformer def _3(df: FourIntsDirectoryFormat) -> list: # Note: most uses of `iter_views` will need to look at the first element # of the series of tuples provided by iter_views return [x for _, x in df.single_ints.iter_views(int)] @dummy_plugin.register_transformer def _1(data: list) -> FourIntsDirectoryFormat: df = FourIntsDirectoryFormat() for i, int_ in enumerate(data, 1): df.single_ints.write_data(int_, int, num=i) return df @dummy_plugin.register_transformer def _4(ff: UnimportableFormat) -> int: return 1 @dummy_plugin.register_transformer def _a1(data: str) -> EchoFormat: ff = EchoFormat() with ff.open() as fh: fh.write(data) return ff # only for testing get_formats - otherwise useless @dummy_plugin.register_transformer() def _4242(data: ImportableOnlyFormat) -> IntSequenceDirectoryFormat: return IntSequenceDirectoryFormat() # only for testing get_formats - otherwise useless @dummy_plugin.register_transformer() def _4243(data: IntSequenceDirectoryFormat) -> ExportableOnlyFormat: return ExportableOnlyFormat() qiime2-2024.5.0/qiime2/core/testing/type.py000066400000000000000000000037651462552636000203040ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
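# Illustrative sketch: how the transformers registered above are exercised.
# This is a minimal usage example, not one of the registered transformers; it
# assumes the dummy plugin has been loaded (QIIMETEST set) and uses only the
# public framework calls Artifact.import_data and Artifact.view.
def _example_transformer_roundtrip():
    from qiime2 import Artifact
    from qiime2.core.testing.type import IntSequence1

    # list -> IntSequenceFormat on import, IntSequenceFormat -> list on view
    art = Artifact.import_data(IntSequence1, [1, 2, 3])
    assert art.view(list) == [1, 2, 3]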
# ---------------------------------------------------------------------------- import qiime2.plugin as plugin IntSequence1 = plugin.SemanticType('IntSequence1') IntSequence2 = plugin.SemanticType('IntSequence2') IntSequence3 = plugin.SemanticType('IntSequence3') AscIntSequence = plugin.SemanticType('AscIntSequence') Mapping = plugin.SemanticType('Mapping') FourInts = plugin.SemanticType('FourInts') SingleInt = plugin.SemanticType('SingleInt') Kennel = plugin.SemanticType('Kennel', field_names='pet') Dog = plugin.SemanticType('Dog', variant_of=Kennel.field['pet']) Cat = plugin.SemanticType('Cat', variant_of=Kennel.field['pet']) # Kennel[Dog | Cat] C1 = plugin.SemanticType('C1', field_names='first') C2 = plugin.SemanticType('C2', field_names=['first', 'second'], variant_of=C1.field['first'], field_members={'first': [C1], 'second': [C1]}) C3 = plugin.SemanticType('C3', field_names=['first', 'second', 'third'], variant_of=[C1.field['first'], C2.field['first'], C2.field['second']], field_members={'first': [C1, C2], 'second': [C1, C2], 'third': [C1, C2]}) _variants = [ C1.field['first'], C2.field['first'], C3.field['first'], C2.field['second'], C3.field['second'], C3.field['third'] ] # C1[C2[C3[Foo, Bar, Baz], C1[Foo]]] ... etc Foo = plugin.SemanticType('Foo', variant_of=_variants) Bar = plugin.SemanticType('Bar', variant_of=_variants) Baz = plugin.SemanticType('Baz', variant_of=_variants) Squid = plugin.SemanticType('Squid') Octopus = plugin.SemanticType('Octopus') Cuttlefish = plugin.SemanticType('Cuttlefish') qiime2-2024.5.0/qiime2/core/testing/util.py000066400000000000000000000106441462552636000202720ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import qiime2 import os import os.path import zipfile import qiime2.sdk def get_dummy_plugin(): plugin_manager = qiime2.sdk.PluginManager() if 'dummy-plugin' not in plugin_manager.plugins: raise RuntimeError( "When running QIIME 2 unit tests, the QIIMETEST environment " "variable must be defined so that plugins required by unit tests " "are loaded. The value of the QIIMETEST environment variable can " "be anything. Example command: QIIMETEST=1 nosetests") return plugin_manager.plugins['dummy-plugin'] class ArchiveTestingMixin: """Mixin for testing properties of archives created by Archiver.""" def assertArchiveMembers(self, archive_filepath, root_dir, expected): """Assert members are in an archive. Parameters ---------- archive_filepath : str or Path Filepath to archive whose members will be verified against the `expected` members. root_dir : str or Path Root directory of the archive. Will be prepended to the member paths in `expected`. This is useful when the archive's root directory is not known ahead of time (e.g. when it is a random UUID) and the caller is determining the root directory dynamically. expected : set of str Set of expected archive members stored as paths relative to `root_dir`. """ archive_filepath = str(archive_filepath) root_dir = str(root_dir) with zipfile.ZipFile(archive_filepath, mode='r') as zf: observed = set(zf.namelist()) # Path separator '/' is hardcoded because paths in the zipfile will # always use this separator. 
expected = {root_dir + '/' + member for member in expected} self.assertEqual(observed, expected) def assertExtractedArchiveMembers(self, extract_dir, root_dir, expected): """Assert an archive's members are extracted to a directory. Parameters ---------- extract_dir : str or Path Path to directory the archive was extracted to. root_dir : str or Path Root directory of the archive that was extracted to `extract_dir`. This is useful when the archive's root directory is not known ahead of time (e.g. when it is a random UUID) and the caller is determining the root directory dynamically. expected : set of str Set of expected archive members extracted to `extract_dir`. Stored as paths relative to `root_dir`. """ extract_dir = str(extract_dir) root_dir = str(root_dir) observed = set() for root, _, filenames in os.walk(extract_dir): for filename in filenames: observed.add(os.path.join(root, filename)) expected = {os.path.join(extract_dir, root_dir, member) for member in expected} self.assertEqual(observed, expected) class ReallyEqualMixin: """Mixin for testing implementations of __eq__/__ne__. Based on this public domain code (also explains why the mixin is useful): https://ludios.org/testing-your-eq-ne-cmp/ """ def assertReallyEqual(self, a, b): # assertEqual first, because it will have a good message if the # assertion fails. self.assertEqual(a, b) self.assertEqual(b, a) self.assertTrue(a == b) self.assertTrue(b == a) self.assertFalse(a != b) self.assertFalse(b != a) def assertReallyNotEqual(self, a, b): # assertNotEqual first, because it will have a good message if the # assertion fails. self.assertNotEqual(a, b) self.assertNotEqual(b, a) self.assertFalse(a == b) self.assertFalse(b == a) self.assertTrue(a != b) self.assertTrue(b != a) class PipelineError(Exception): """This error is raised by the dummy-plugin pipelines that are designed to fail and be rerun to test pipeline resumption. """ def __init__(self, uuids): self.uuids = uuids qiime2-2024.5.0/qiime2/core/testing/validator.py000066400000000000000000000030771462552636000213040ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
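# Illustrative sketch: typical use of the helper defined above. The QIIMETEST
# environment variable must be set before qiime2.sdk.PluginManager is first
# instantiated, otherwise get_dummy_plugin() raises the RuntimeError shown
# above. The action id 'concatenate_ints' is registered by the dummy plugin;
# the rest of this never-called helper is an assumed usage pattern only.
def _example_get_dummy_plugin_usage():
    import os
    os.environ.setdefault('QIIMETEST', '1')

    from qiime2.core.testing.util import get_dummy_plugin
    plugin = get_dummy_plugin()
    concatenate_ints = plugin.actions['concatenate_ints']
    return concatenate_ints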
# ---------------------------------------------------------------------------- from qiime2 import Metadata from qiime2.plugin import ValidationError from .type import (Kennel, Dog, Cat, AscIntSequence, Squid, Octopus, Cuttlefish) from .format import Cephalapod from .plugin import dummy_plugin @dummy_plugin.register_validator(Kennel[Dog | Cat]) def validator_example_null1(data: dict, level): pass @dummy_plugin.register_validator(Kennel[Dog]) def validator_example_null2(data: Metadata, level): pass @dummy_plugin.register_validator(AscIntSequence) def validate_ascending_seq(data: list, level): # landmine for testing if data == [2021, 8, 24]: raise KeyError prev = float('-inf') for number in data: if not number > prev: raise ValidationError("%s is not greater than %s" % (number, prev)) @dummy_plugin.register_validator(Squid | Cuttlefish) def validator_sort_middle_b(data: Cephalapod, level): pass @dummy_plugin.register_validator(Squid) def validator_sort_last(data: Cephalapod, level): pass @dummy_plugin.register_validator(Squid | Octopus | Cuttlefish) def validator_sort_first(data: Cephalapod, level): pass @dummy_plugin.register_validator(Squid | Octopus) def validator_sort_middle(data: Cephalapod, level): pass qiime2-2024.5.0/qiime2/core/testing/visualizer.py000066400000000000000000000066411462552636000215140ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import collections import os import os.path import pandas as pd # Multiple types of visualizations (index.html, index.tsv). def most_common_viz(output_dir: str, ints: collections.Counter) -> None: df = pd.DataFrame(ints.most_common(), columns=["Integer", "Frequency"]) with open(os.path.join(output_dir, 'index.html'), 'w') as fh: fh.write('\n') fh.write('

<h3>Most common integers:</h3>

\n') fh.write(df.to_html(index=False)) fh.write('') with open(os.path.join(output_dir, 'index.tsv'), 'w') as fh: fh.write(df.to_csv(sep='\t', index=False)) # Multiple html files (index.1.html, index.2.html) def multi_html_viz(output_dir: str, ints: list) -> None: ints = [str(i) for i in ints] with open(os.path.join(output_dir, 'index.1.html'), 'w') as fh: fh.write('\n') fh.write(' '.join(ints)) fh.write('') with open(os.path.join(output_dir, 'index.2.html'), 'w') as fh: fh.write('\n') fh.write(' '.join(reversed(ints))) fh.write('') # No input artifacts, only parameters. def params_only_viz(output_dir: str, name: str = 'Foo Bar', age: int = 42): with open(os.path.join(output_dir, 'index.html'), 'w') as fh: fh.write('\n') fh.write('Name: %s\n' % name) fh.write('Age: %s\n' % age) fh.write('') # No input artifacts or parameters. def no_input_viz(output_dir: str): with open(os.path.join(output_dir, 'index.html'), 'w') as fh: fh.write('\n') fh.write('Hello, World!\n') fh.write('') # Multiple input artifacts and parameters, and a nested directory with required # resources for rendering. def mapping_viz(output_dir: str, mapping1: dict, mapping2: dict, key_label: str, value_label: str) -> None: df1 = _dict_to_dataframe(mapping1, key_label, value_label) df2 = _dict_to_dataframe(mapping2, key_label, value_label) with open(os.path.join(output_dir, 'index.html'), 'w') as fh: fh.write('') fh.write('') fh.write('\n') fh.write('\n') fh.write('

<h3>mapping1:</h3>

\n') fh.write(df1.to_html(index=False, classes='dummy-class')) fh.write('

<h3>mapping2:</h3>

\n') fh.write(df2.to_html(index=False, classes='dummy-class')) fh.write('') css_dir = os.path.join(output_dir, 'css') os.mkdir(css_dir) with open(os.path.join(css_dir, 'style.css'), 'w') as fh: fh.write(_css) def _dict_to_dataframe(dict_, key_label, value_label): return pd.DataFrame(sorted(dict_.items()), columns=[key_label, value_label]) # Example table styling taken from http://www.w3schools.com/css/css_table.asp _css = """ .dummy-class { border-collapse: collapse; width: 100%; } .dummy-class th, td { padding: 8px; text-align: left; border-bottom: 1px solid #ddd; } """ qiime2-2024.5.0/qiime2/core/tests/000077500000000000000000000000001462552636000164235ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/tests/__init__.py000066400000000000000000000005351462552636000205370ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- qiime2-2024.5.0/qiime2/core/tests/test_cache.py000066400000000000000000000547251462552636000211140ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import os import gc import pwd import crypt import shutil import string import atexit import psutil import random import platform import tempfile import unittest from contextlib import contextmanager import pytest from flufl.lock import LockState import qiime2 from qiime2.core.cache import Cache, _exit_cleanup, get_cache, _get_user from qiime2.core.testing.type import IntSequence1, IntSequence2, SingleInt from qiime2.core.testing.util import get_dummy_plugin from qiime2.sdk.result import Artifact from qiime2.core.util import load_action_yaml # NOTE: If you see an error after all of your tests have ran saying that a pool # called __TEST_FAILURE__ doesn't exist and you were running tests in multiple # processes concurrently that is normal. The process that finished first would # have killed the pool so the ones that finished later wouldn't have it. 
If you # see that when you are only running tests in one process there is likely a # problem TEST_POOL = '__TEST_FAILURE__' def _get_cache_contents(cache): """Gets contents of cache not including contents of the artifacts themselves relative to the root of the cache """ cache_contents = set() rel_keys = os.path.relpath(cache.keys, cache.path) rel_data = os.path.relpath(cache.data, cache.path) rel_pools = os.path.relpath(cache.pools, cache.path) rel_cache = os.path.relpath(cache.path, cache.path) for key in os.listdir(cache.keys): cache_contents.add(os.path.join(rel_keys, key)) for art in os.listdir(cache.data): cache_contents.add(os.path.join(rel_data, art)) for pool in os.listdir(cache.pools): for link in os.listdir(os.path.join(cache.pools, pool)): cache_contents.add(os.path.join(rel_pools, pool, link)) for elem in os.listdir(cache.path): if os.path.isfile(os.path.join(cache.path, elem)): cache_contents.add(os.path.join(rel_cache, elem)) return cache_contents def _on_exit_validate(cache, expected): observed = _get_cache_contents(cache) cache.remove(TEST_POOL) assert expected.issubset(observed) @contextmanager def _fake_user_for_cache(cache_prefix, i_acknowledge_this_is_dangerous=False): """Creates a fake user with a uname that is 8 random alphanumeric characters that we ensure does not collide with an existing uname and create a cache for said user under cache_prefix """ if not i_acknowledge_this_is_dangerous: raise ValueError('YOU MUST ACCEPT THE DANGER OF LETTING THIS SCRIPT ' 'MAKE AND REMOVE A USER') if not os.getegid() == 0: raise ValueError('This action requires super user permissions which ' 'you do not have') user_list = psutil.users() uname = ''.join(random.choices(string.ascii_letters + string.digits, k=8)) # Highly unlikely this will ever happen, but we really don't want to # have collisions here while uname in user_list: uname = ''.join( random.choices(string.ascii_letters + string.digits, k=8)) password = crypt.crypt('test', '22') os.system(f'useradd -p {password} {uname}') os.seteuid(pwd.getpwnam(uname).pw_uid) # seteuid does not convice getpass.getuser we are not root because it uses # getuid not geteuid. I cannot use setuid because then I would not be able # to get root permissions back, so I give it the cache path manually under # tmp. This should be functionally no different as far as permissions on # /tmp/qiime2 are concerned. 
It still thinks we are not root as far as # file system operations go user_cache = Cache(os.path.join(cache_prefix, uname)) try: yield (uname, user_cache) finally: os.seteuid(0) os.system(f'userdel {uname}') shutil.rmtree(user_cache.path) def _load_outputs(collection): outputs = [] for result in collection.values(): output = load_action_yaml( result._archiver.path)['action']['output-name'] outputs.append(output) return outputs class TestCache(unittest.TestCase): def setUp(self): self.plugin = get_dummy_plugin() # Create temp test dir self.test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') # Create artifact and cache self.art1 = Artifact.import_data(IntSequence1, [0, 1, 2]) self.art2 = Artifact.import_data(IntSequence1, [3, 4, 5]) self.art3 = Artifact.import_data(IntSequence1, [6, 7, 8]) self.art4 = Artifact.import_data(IntSequence2, [9, 10, 11]) self.cache = Cache(os.path.join(self.test_dir.name, 'new_cache')) self.not_cache_path = os.path.join(self.test_dir.name, 'not_cache') os.mkdir(self.not_cache_path) def tearDown(self): """Remove our cache and all that from last test """ self.test_dir.cleanup() def test_is_cache(self): """Verifies that is_cache is identifying a cache """ self.assertTrue(Cache.is_cache(self.cache.path)) def test_is_not_cache(self): """Verifies that is_cache is identifying when things aren't caches """ self.assertFalse(Cache.is_cache(self.not_cache_path)) def test_cache_manually_V1(self): """This test manually asserts the cache created by the constructor looks exactly as expected. """ self.assertTrue(os.path.exists(self.cache.path)) contents = set(os.listdir(self.cache.path)) self.assertEqual(Cache.base_cache_contents, contents) # Assert version file looks how we want with open(self.cache.version) as fh: lines = fh.readlines() self.assertEqual(lines[0], 'QIIME 2\n') self.assertEqual(lines[1], f'cache: {self.cache.CURRENT_FORMAT_VERSION}\n') self.assertEqual(lines[2], f'framework: {qiime2.__version__}\n') def test_roundtrip(self): # Save artifact to cache art1 = Artifact.import_data(IntSequence1, [0, 1, 2]) expected = art1.view(list) self.cache.save(art1, 'foo') # Delete artifact del art1 # Load artifact from cache art2 = self.cache.load('foo') # Ensure our data is correct self.assertEqual(expected, art2.view(list)) def test_remove(self): # Save our artifact self.cache.save(self.art1, 'foo') # Show that we can load our artifact self.cache.load('foo') # remove our artifact self.cache.remove('foo') # Show that we can no longer load our artifact with self.assertRaisesRegex( KeyError, f"'{self.cache.path}' does not contain the key " "'foo'"): self.cache.load('foo') def test_invalid_keys(self): # Invalid data key with self.assertRaisesRegex(ValueError, 'valid Python identifier'): self.cache.save(self.art1, '1') # Invalid pool key with self.assertRaisesRegex(ValueError, 'valid Python identifier'): self.cache.create_pool('1') def test_kebab_key(self): self.cache.save(self.art1, '-kebab-case-key-') artifact = self.cache.load('-kebab-case-key-') artifact.validate() def test_remove_locks(self): """Create some locks then see if we can remove them """ self.cache.lock.flufl_lock.lock() self.assertEqual(self.cache.lock.flufl_lock.state, LockState.ours) self.cache.clear_lock() self.assertEqual(self.cache.lock.flufl_lock.state, LockState.unlocked) # Might create another class for garbage collection tests to test more # cases with shared boilerplate def test_garbage_collection(self): # Data referenced directly by key self.cache.save(self.art1, 'foo') # Data referenced by 
pool that is referenced by key pool = self.cache.create_pool('bar') pool.save(self.art2) # We will be manually deleting the keys that back these two self.cache.save(self.art3, 'baz') pool = self.cache.create_pool('qux') pool.save(self.art4) # What we expect to see before and after gc expected_pre_gc_contents = \ set(('./VERSION', 'keys/foo', 'keys/bar', 'keys/baz', 'keys/qux', f'pools/bar/{self.art2.uuid}', f'pools/qux/{self.art4.uuid}', f'data/{self.art1.uuid}', f'data/{self.art2.uuid}', f'data/{self.art3.uuid}', f'data/{self.art4.uuid}')) expected_post_gc_contents = \ set(('./VERSION', 'keys/foo', 'keys/bar', f'pools/bar/{self.art2.uuid}', f'data/{self.art1.uuid}', f'data/{self.art2.uuid}')) # Assert cache looks how we want pre gc pre_gc_contents = _get_cache_contents(self.cache) self.assertEqual(expected_pre_gc_contents, pre_gc_contents) # Delete keys self.cache.remove(self.cache.keys / 'baz') self.cache.remove(self.cache.keys / 'qux') # Make sure Python's garbage collector gets the process pool symlinks # to the artifact that was keyed on baz and the one in the qux pool gc.collect() self.cache.garbage_collection() # Assert cache looks how we want post gc post_gc_contents = _get_cache_contents(self.cache) self.assertEqual(expected_post_gc_contents, post_gc_contents) def test_asynchronous(self): concatenate_ints = self.plugin.methods['concatenate_ints'] with self.cache: future = concatenate_ints.asynchronous(self.art1, self.art2, self.art4, 4, 5) result = future.result() result = result[0] expected = set(('./VERSION', f'data/{result._archiver.uuid}')) observed = _get_cache_contents(self.cache) self.assertEqual(expected, observed) def test_asynchronous_pool(self): concatenate_ints = self.plugin.methods['concatenate_ints'] test_pool = self.cache.create_pool(TEST_POOL) with self.cache: with test_pool: future = concatenate_ints.asynchronous(self.art1, self.art2, self.art4, 4, 5) result = future.result() result = result[0] expected = set(( './VERSION', f'data/{result._archiver.uuid}', f'keys/{TEST_POOL}', f'pools/{TEST_POOL}/{result._archiver.uuid}' )) observed = _get_cache_contents(self.cache) self.assertEqual(expected, observed) def test_no_dangling_ref(self): ref = self.cache.save(self.art1, 'foo') ref.validate() # This would create a dangling ref if we were not properly saving # things to the process pool when we save them with a cache key self.cache.remove('foo') ref.validate() def test_no_dangling_ref_pool(self): pool = self.cache.create_pool('pool') ref = pool.save(self.art1) ref.validate() # This would create a dangling ref if we were not properly saving # things to the process pool when we save them to a named pool self.cache.remove('pool') ref.validate() def test_pool(self): pool = self.cache.create_pool('pool') # Create an artifact in the cache and the pool with self.cache: with pool: ref = Artifact.import_data(IntSequence1, [0, 1, 2]) uuid = str(ref.uuid) self.assertIn(uuid, os.listdir(self.cache.data)) self.assertIn(uuid, os.listdir(self.cache.pools / 'pool')) def test_pool_no_cache_set(self): pool = self.cache.create_pool('pool') with pool: ref = Artifact.import_data(IntSequence1, [0, 1, 2]) uuid = str(ref.uuid) self.assertIn(uuid, os.listdir(self.cache.data)) self.assertIn(uuid, os.listdir(self.cache.pools / 'pool')) def test_pool_wrong_cache_set(self): cache = Cache(os.path.join(self.test_dir.name, 'cache')) pool = self.cache.create_pool('pool') with cache: with self.assertRaisesRegex(ValueError, 'pool that is not on the currently ' f'set cache.*{cache.path}'): with pool: 
Artifact.import_data(IntSequence1, [0, 1, 2]) def test_enter_multiple_caches(self): cache = Cache(os.path.join(self.test_dir.name, 'cache')) with self.cache: with self.assertRaisesRegex(ValueError, 'cannot enter multiple caches.*' f'{self.cache.path}'): with cache: pass def test_enter_multiple_pools(self): pool1 = self.cache.create_pool('pool1') pool2 = self.cache.create_pool('pool2') with pool1: with self.assertRaisesRegex(ValueError, 'cannot enter multiple pools.*' f'{pool1.path}'): with pool2: pass def test_loading_pool(self): self.cache.create_pool('pool') with self.assertRaisesRegex( ValueError, "'pool' does not point to any data"): self.cache.load('pool') def test_access_data_with_deleted_key(self): pool = self.cache.create_pool('pool') with self.cache: with pool: art = Artifact.import_data(IntSequence1, [0, 1, 2]) uuid = str(art.uuid) art = self.cache.save(art, 'a') art.validate() self.assertIn(uuid, os.listdir(self.cache.data)) self.assertIn(uuid, os.listdir(self.cache.pools / 'pool')) art = self.cache.load('a') art.validate() self.assertIn(uuid, os.listdir(self.cache.data)) self.assertIn(uuid, os.listdir(self.cache.pools / 'pool')) self.cache.remove('a') art.validate() self.assertIn(uuid, os.listdir(self.cache.data)) self.assertIn(uuid, os.listdir(self.cache.pools / 'pool')) def test_collection_list_input_cache(self): list_method = self.plugin.methods['list_of_ints'] dict_method = self.plugin.methods['dict_of_ints'] int_list = [Artifact.import_data(SingleInt, 0), Artifact.import_data(SingleInt, 1)] list_out = list_method(int_list) dict_out = dict_method(int_list) pre_cache_list = list_out.output pre_cache_dict = dict_out.output cache_list_out = self.cache.save_collection(list_out, 'list_out') cache_dict_out = self.cache.save_collection(dict_out, 'dict_out') self.assertEqual(pre_cache_list, cache_list_out) self.assertEqual(pre_cache_dict, cache_dict_out) def test_collection_dict_input_cache(self): list_method = self.plugin.methods['list_of_ints'] dict_method = self.plugin.methods['dict_of_ints'] int_dict = {'1': Artifact.import_data(SingleInt, 0), '2': Artifact.import_data(SingleInt, 1)} list_out = list_method(int_dict) dict_out = dict_method(int_dict) pre_cache_list = list_out.output pre_cache_dict = dict_out.output cache_list_out = self.cache.save_collection(list_out, 'list_out') cache_dict_out = self.cache.save_collection(dict_out, 'dict_out') self.assertEqual(pre_cache_list, cache_list_out) self.assertEqual(pre_cache_dict, cache_dict_out) def test_dangling_reference(self): ref = self.cache.save(self.art1, 'foo') ref.validate() shutil.rmtree(self.cache.data / str(ref.uuid)) with self.assertWarnsRegex(Warning, 'Dangling reference .*foo'): self.cache.garbage_collection() def test_dangling_reference_in_pool(self): pool = self.cache.create_pool('pool') ref = pool.save(self.art1) ref.validate() shutil.rmtree(self.cache.data / str(ref.uuid)) with self.assertWarnsRegex(Warning, f'Dangling reference .*{ref.uuid}'): self.cache.garbage_collection() def test_dangling_reference_pool(self): self.cache.create_pool('pool') shutil.rmtree(self.cache.pools / 'pool') with self.assertWarnsRegex(Warning, 'Dangling reference .*pool'): self.cache.garbage_collection() # This test has zzz in front of it because unittest.Testcase runs the tests # in alphabetical order, and we want this test to run last def test_zzz_asynchronous_pool_post_exit(self): """This test determines if all of the data is still in the cache when we are getting ready to exit. 
This was put here when ensuring we do not destroy our data when running asynchronous actions, and it can probably be removed once Archiver is reworked """ concatenate_ints = self.plugin.methods['concatenate_ints'] # This test needs to use a cache that exists past the lifespan of the # function cache = get_cache() test_pool = cache.create_pool(TEST_POOL, reuse=True) with test_pool: future = concatenate_ints.asynchronous(self.art1, self.art2, self.art4, 4, 5) result = future.result() result = result[0] expected = set(( './VERSION', f'data/{result._archiver.uuid}', f'keys/{TEST_POOL}', f'pools/{TEST_POOL}/{result._archiver.uuid}' )) atexit.unregister(_exit_cleanup) atexit.register(_on_exit_validate, cache, expected) atexit.register(_exit_cleanup) @pytest.mark.skipif(os.geteuid() == 0, reason="super user always wins") def test_surreptitiously_write_artifact(self): """Test temporarily no-oped because behavior is temporarily no-oped """ return # self.cache.save(self.art1, 'a') # target = self.cache.data / str(self.art1.uuid) / 'metadata.yaml' # with self.assertRaisesRegex(PermissionError, # f"Permission denied: '{target}'"): # with open(target, mode='a') as fh: # fh.write('gonna mess up ur metadata') @pytest.mark.skipif(os.geteuid() == 0, reason="super user always wins") def test_surreptitiously_add_file(self): """Test temporarily no-oped because behavior is temporarily no-oped """ return # self.cache.save(self.art1, 'a') # target = self.cache.data / str(self.art1.uuid) / 'extra.file' # with self.assertRaisesRegex(PermissionError, # f"Permission denied: '{target}'"): # with open(target, mode='w') as fh: # fh.write('extra file') @pytest.mark.skipif( os.geteuid() != 0, reason="only sudo can mess with users") @pytest.mark.skipif( platform.system() == "Darwin", reason="Mac clusters not really a thing") def test_multi_user(self): """This test determines if we can have multiple users successfully accessing the cache under the /tmp/qiime2 directory. This test came from this issue https://github.com/qiime2/qiime2/issues/639. 
It should only run as root because only root can create and delete users, and for now at least it won't run on Mac """ concatenate_ints = self.plugin.methods['concatenate_ints'] root_cache = get_cache() root_user = _get_user() # This should ensure that the /tmp/qiime2/root cache exists and has # things in it with root_cache: root_result = \ concatenate_ints(self.art1, self.art2, self.art4, 4, 5)[0] root_expected = set(( './VERSION', f'data/{root_result._archiver.uuid}' )) # The location we put the root cache in is also where we want the fake # user cache cache_prefix = os.path.split(root_cache.path)[0] # Temporarily create a new user and user cache for multi-user testing # purposes with _fake_user_for_cache( cache_prefix, i_acknowledge_this_is_dangerous=True) as (uname, user_cache): with user_cache: # We can't use the artifacts that are on the class here anymore # because they exist in root's temp cache and this user no # longer has access to it (which is good honestly) art1 = Artifact.import_data(IntSequence1, [0, 1, 2]) art2 = Artifact.import_data(IntSequence1, [3, 4, 5]) art4 = Artifact.import_data(IntSequence2, [9, 10, 11]) user_result = concatenate_ints(art1, art2, art4, 4, 5)[0] user_expected = set(( './VERSION', f'data/{user_result._archiver.uuid}', )) self.assertEqual(os.path.basename(root_cache.path), root_user) self.assertEqual(os.path.basename(user_cache.path), uname) root_observed = _get_cache_contents(root_cache) user_observed = _get_cache_contents(user_cache) self.assertTrue(root_expected.issubset(root_observed)) self.assertTrue(user_expected.issubset(user_observed)) def test_inconsistent_cache(self): cache = Cache() (cache.path / 'VERSION').unlink() del cache with self.assertWarnsRegex(UserWarning, "in an inconsistent state"): Cache() def test_output_collection_provenance(self): """ This is really a prov test, but it's here because the infrastructure to do it already exists in this class """ collection_pipeline = self.plugin.pipelines['collection_pipeline'] input_ = [Artifact.import_data(IntSequence1, [0, 1, 2])] with self.cache: output = collection_pipeline(input_).output expected = [['output', 'key1', '1/2'], ['output', 'key2', '2/2']] observed = _load_outputs(output) self.assertEqual(observed, expected) def test_cache_existing_dir(self): with self.assertRaisesRegex( ValueError, f"Path: '{self.not_cache_path}' already exists"): Cache(self.not_cache_path) qiime2-2024.5.0/qiime2/core/tests/test_enan.py000066400000000000000000000025301462552636000207550ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
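# Illustrative sketch: the key/data/pool layout exercised by the cache tests
# above, condensed into one never-called helper. The path and key name here
# are made up for illustration; the method names (save/load/remove/
# garbage_collection) are the ones used in TestCache above.
def _example_cache_usage():
    import tempfile
    from qiime2 import Artifact
    from qiime2.core.cache import Cache
    from qiime2.core.testing.type import IntSequence1

    tmp = tempfile.mkdtemp()
    cache = Cache(tmp + '/example_cache')   # creates keys/, data/, pools/
    art = Artifact.import_data(IntSequence1, [0, 1, 2])

    cache.save(art, 'example_key')          # data/<uuid> plus keys/example_key
    reloaded = cache.load('example_key')    # backed by the cache on disk
    cache.remove('example_key')             # key gone; data kept while referenced
    cache.garbage_collection()
    return reloaded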
# ---------------------------------------------------------------------------- import unittest from qiime2.core.enan import get_payload_from_nan, make_nan_with_payload class TestNanPayloads(unittest.TestCase): def test_normal_nan(self): normal_nan = float('nan') payload, namespace = get_payload_from_nan(normal_nan) self.assertIs(payload, None) self.assertIs(namespace, None) def test_roundtrip_payload(self): for namespace in range(0, 256): for payload in range(-50, 500): nan = make_nan_with_payload(payload, namespace) new_payload, new_namespace = get_payload_from_nan(nan) self.assertEqual(namespace, new_namespace) self.assertEqual(payload, new_payload) self.assertNotEqual(nan, nan) def test_user_namespace_default(self): nan = make_nan_with_payload(42) payload, namespace = get_payload_from_nan(nan) self.assertEqual(42, payload) self.assertEqual(255, namespace) self.assertNotEqual(nan, nan) qiime2-2024.5.0/qiime2/core/tests/test_missing.py000066400000000000000000000061301462552636000215050ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import unittest import pandas as pd import pandas.testing as pdt from qiime2.core.missing import series_encode_missing, series_extract_missing class RoundTripMixin: def check_roundtrip(self, real_value, dtype): notna_exp = [real_value] series = pd.Series(notna_exp + self.missing_terms) encoded = series_encode_missing(series, self.enum) missing = series_extract_missing(encoded) self.assertEqual(encoded.dtype, dtype) # the non-null side of the series self.assertEqual(list(encoded[encoded.notna()]), notna_exp) # the null end (but in the orginal vocabulary) pdt.assert_series_equal(missing, series[1:].astype(object)) def test_roundtrip_float(self): self.check_roundtrip(0.05, float) def test_roundtrip_string(self): self.check_roundtrip('hello', object) def test_roundtrip_int(self): self.check_roundtrip(42, float) def test_roundtrip_bool(self): self.check_roundtrip(True, object) def test_roundtrip_all_missing_object(self): expected = [None, float('nan')] + self.missing_terms series = pd.Series(expected, dtype=object) encoded = series_encode_missing(series, self.enum) missing = series_extract_missing(encoded) self.assertEqual(encoded.dtype, object) pdt.assert_series_equal(missing, series.astype(object)) class TestISNDC(RoundTripMixin, unittest.TestCase): def setUp(self): self.enum = 'INSDC:missing' self.missing_terms = ['not applicable', 'missing', 'not collected', 'not provided', 'restricted access'] class TestOmitted(RoundTripMixin, unittest.TestCase): def setUp(self): self.enum = 'blank' self.missing_terms = [None, float('nan')] # test_roundtrip_all_missing_float is not possible with other schemes def test_roundtrip_all_missing_float(self): expected = [None, float('nan')] + self.missing_terms series = pd.Series(expected, dtype=float) encoded = series_encode_missing(series, self.enum) missing = series_extract_missing(encoded) self.assertEqual(encoded.dtype, float) pdt.assert_series_equal(missing, series.astype(object)) class TestError(RoundTripMixin, unittest.TestCase): def setUp(self): self.enum = 'no-missing' self.missing_terms = [] # no missing values, so bool and int are not object and float def test_roundtrip_bool(self): 
self.check_roundtrip(True, bool) def test_roundtrip_int(self): self.check_roundtrip(42, int) def test_roundtrip_all_missing_object(self): with self.assertRaisesRegex(ValueError, 'Missing values.*name=None'): super().test_roundtrip_all_missing_object() qiime2-2024.5.0/qiime2/core/tests/test_path.py000066400000000000000000000065161462552636000210000ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import os import pathlib import shutil import tempfile import unittest from qiime2.core.path import OwnedPath, OutPath class TestOwnedPath(unittest.TestCase): def setUp(self): self.from_dir = tempfile.mkdtemp() (pathlib.Path(self.from_dir) / 'foo.txt').touch() self.to_dir = tempfile.mkdtemp() # assume to_dir is empty for all tests def test_move_or_copy_owned(self): d = OwnedPath(self.from_dir) # ensure that we are owned d._user_owned = True d._move_or_copy(self.to_dir) # since from_dir is owned, _move_or_copy should copy, not move self.assertTrue(os.path.exists(os.path.join(self.from_dir, 'foo.txt'))) self.assertTrue(os.path.exists(os.path.join(self.to_dir, 'foo.txt'))) shutil.rmtree(self.from_dir) shutil.rmtree(self.to_dir) def test_move_or_copy_not_owned_rename(self): d = OwnedPath(self.from_dir) # ensure that we are not owned d._user_owned = False d._move_or_copy(self.to_dir) # since from_dir is not owned, _move_or_copy should move, not copy self.assertFalse(os.path.exists(os.path.join(self.from_dir, 'foo.txt'))) self.assertTrue(os.path.exists(os.path.join(self.to_dir, 'foo.txt'))) with self.assertRaises(FileNotFoundError): shutil.rmtree(self.from_dir) shutil.rmtree(self.to_dir) @unittest.mock.patch('pathlib.Path.rename', side_effect=FileExistsError) def test_move_or_copy_not_owned_copy(self, _): d = OwnedPath(self.from_dir) # ensure that we are not owned d._user_owned = False d._move_or_copy(self.to_dir) # since from_dir is not owned, but the network fs race condition crops # up, _move_or_copy should copy, not move, but then we still ensure # that the original path has been cleaned up self.assertFalse(os.path.exists(os.path.join(self.from_dir, 'foo.txt'))) self.assertTrue(os.path.exists(os.path.join(self.to_dir, 'foo.txt'))) with self.assertRaises(FileNotFoundError): shutil.rmtree(self.from_dir) shutil.rmtree(self.to_dir) class TestOutPath(unittest.TestCase): def test_new_outpath(self): f = OutPath() self.assertIsInstance(f, OutPath) self.assertTrue(f.is_file()) g = OutPath(dir=True) self.assertIsInstance(g, OutPath) self.assertTrue(g.is_dir()) def test_new_outpath_context_mgr(self): with OutPath() as f: path = str(f) self.assertIsInstance(f, OutPath) self.assertTrue(os.path.isfile(path)) self.assertFalse(os.path.isfile(path)) def test_destructor(self): f = OutPath() path = str(f) self.assertTrue(os.path.isfile(path)) f._destructor() self.assertFalse(os.path.isfile(path)) if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/core/tests/test_pipeline_resumption.py000066400000000000000000001020221462552636000241230ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. 
# # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import os import tempfile import unittest import pandas as pd import qiime2 from qiime2.core.cache import Cache from qiime2.core.testing.type import IntSequence1, SingleInt from qiime2.core.testing.util import get_dummy_plugin, PipelineError from qiime2.sdk.result import Artifact from qiime2.sdk.parallel_config import ParallelConfig from qiime2.core.util import load_action_yaml def _load_alias_uuid(result): return load_action_yaml(result._archiver.path)['action']['alias-of'] def _load_nested_alias_uuid(result, cache): alias_uuid = _load_alias_uuid(result) aliased_result = qiime2.sdk.Result.load( os.path.join(cache.data, alias_uuid)) return _load_alias_uuid(aliased_result) def _load_alias_uuids(collection): uuids = [] for result in collection.values(): uuids.append(_load_alias_uuid(result)) return uuids def _load_nested_alias_uuids(collection, cache): alias_uuids = _load_alias_uuids(collection) # load_alias_uuids is expecting a dictionary, so just make a dictionary # here so it gets what it wants alias_results = {} for idx, alias_uuid in enumerate(alias_uuids): alias_results[idx] = \ qiime2.sdk.Result.load(os.path.join(cache.data, alias_uuid)) return _load_alias_uuids(alias_results) class TestPipelineResumption(unittest.TestCase): def setUp(self): # Get our pipeline self.plugin = get_dummy_plugin() self.pipeline = self.plugin.pipelines['resumable_varied_pipeline'] self.nested_pipeline = \ self.plugin.pipelines['resumable_nested_varied_pipeline'] # Create temp test dir self.test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') # Create cache and pool self.cache = Cache(os.path.join(self.test_dir.name, 'cache')) self.pool = self.cache.create_pool('pool') # Create artifacts self.ints1 = {'1': Artifact.import_data(SingleInt, 0), '2': Artifact.import_data(SingleInt, 1)} self.ints1_2 = {'3': Artifact.import_data(SingleInt, 1), '4': Artifact.import_data(SingleInt, 2)} self.ints2 = [Artifact.import_data(IntSequence1, [0, 1, 2]), Artifact.import_data(IntSequence1, [3, 4, 5])] self.int1 = Artifact.import_data(SingleInt, 42) self.int2 = Artifact.import_data(SingleInt, 43) # Create metadata df1 = pd.DataFrame({'a': ['1', '2', '3']}, index=pd.Index(['0', '1', '2'], name='feature ID')) self.md1 = qiime2.Metadata(df1) df2 = pd.DataFrame({'b': ['4', '5', '6']}, index=pd.Index(['0', '1', '2'], name='feature ID')) self.md2 = qiime2.Metadata(df2) def tearDown(self): """Remove our cache and all that from last test """ self.test_dir.cleanup() def test_resumable_pipeline_no_pool(self): with self.cache: with self.assertRaises(PipelineError) as e: self.pipeline( self.ints1, self.ints2, self.md1, self.int1, 'Hi', fail=True) ints1_uuids, ints2_uuids, int1_uuid, list_uuids, dict_uuids, \ identity_uuid, viz_uuid = e.exception.uuids ints1_ret, ints2_ret, int1_ret, list_ret, dict_ret, identity_ret, \ viz_ret = self.pipeline( self.ints1, self.ints2, self.md1, self.int1, 'Hi') complete_ints1_uuids = _load_alias_uuids(ints1_ret) complete_ints2_uuids = _load_alias_uuids(ints2_ret) complete_int1_uuid = _load_alias_uuid(int1_ret) complete_list_uuids = _load_alias_uuids(list_ret) complete_dict_uuids = _load_alias_uuids(dict_ret) complete_identity_uuid = _load_alias_uuid(identity_ret) complete_viz_uuid = _load_alias_uuid(viz_ret) # Nothing should have been recycled because we didn't use a pool self.assertNotEqual(ints1_uuids, complete_ints1_uuids) 
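# [Editorial note, hedged] The pool created in setUp above is what makes
# recycling possible: when a pipeline is run inside `with self.pool:` its
# intermediate results are retained in the cache pool, so a re-run after a
# failure can alias them instead of recomputing (see the pool-based tests
# below). A minimal usage sketch, reusing only names defined in setUp:
#
#     with self.pool:
#         results = self.pipeline(self.ints1, self.ints2, self.md1,
#                                 self.int1, 'Hi')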
self.assertNotEqual(ints2_uuids, complete_ints2_uuids) self.assertNotEqual(int1_uuid, complete_int1_uuid) self.assertNotEqual(list_uuids, complete_list_uuids) self.assertNotEqual(dict_uuids, complete_dict_uuids) self.assertNotEqual(identity_uuid, complete_identity_uuid) self.assertNotEqual(viz_uuid, complete_viz_uuid) def test_resumable_pipeline(self): with self.pool: with self.assertRaises(PipelineError) as e: self.pipeline( self.ints1, self.ints2, self.md1, self.int1, 'Hi', fail=True) ints1_uuids, ints2_uuids, int1_uuid, list_uuids, dict_uuids, \ identity_uuid, viz_uuid = e.exception.uuids ints1_ret, ints2_ret, int1_ret, list_ret, dict_ret, \ identity_ret, viz_ret = self.pipeline( self.ints1, self.ints2, self.md1, self.int1, 'Hi') complete_ints1_uuids = _load_alias_uuids(ints1_ret) complete_ints2_uuids = _load_alias_uuids(ints2_ret) complete_int1_uuid = _load_alias_uuid(int1_ret) complete_list_uuids = _load_alias_uuids(list_ret) complete_dict_uuids = _load_alias_uuids(dict_ret) complete_identity_uuid = _load_alias_uuid(identity_ret) complete_viz_uuid = _load_alias_uuid(viz_ret) # Assert that the artifacts returned by the completed run of # the pipeline are aliases of the artifacts created by the # first failed run self.assertEqual(ints1_uuids, complete_ints1_uuids) self.assertEqual(ints2_uuids, complete_ints2_uuids) self.assertEqual(int1_uuid, complete_int1_uuid) self.assertEqual(list_uuids, complete_list_uuids) self.assertEqual(dict_uuids, complete_dict_uuids) self.assertEqual(identity_uuid, complete_identity_uuid) self.assertEqual(viz_uuid, complete_viz_uuid) def test_resumable_pipeline_parallel(self): with self.pool: with self.assertRaises(PipelineError) as e: with ParallelConfig(): future = self.pipeline.parallel( self.ints1, self.ints2, self.md1, self.int1, 'Hi', fail=True) future._result() ints1_uuids, ints2_uuids, int1_uuid, list_uuids, dict_uuids, \ identity_uuid, viz_uuid = e.exception.uuids with ParallelConfig(): future = self.pipeline.parallel( self.ints1, self.ints2, self.md1, self.int1, 'Hi') ints1_ret, ints2_ret, int1_ret, list_ret, dict_ret, \ identity_ret, viz_ret = future._result() complete_ints1_uuids = _load_alias_uuids(ints1_ret) complete_ints2_uuids = _load_alias_uuids(ints2_ret) complete_int1_uuid = _load_alias_uuid(int1_ret) complete_list_uuids = _load_alias_uuids(list_ret) complete_dict_uuids = _load_alias_uuids(dict_ret) complete_identity_uuid = _load_alias_uuid(identity_ret) complete_viz_uuid = _load_alias_uuid(viz_ret) # Assert that the artifacts returned by the completed run of # the pipeline are aliases of the artifacts created by the # first failed run self.assertEqual(ints1_uuids, complete_ints1_uuids) self.assertEqual(ints2_uuids, complete_ints2_uuids) self.assertEqual(int1_uuid, complete_int1_uuid) self.assertEqual(list_uuids, complete_list_uuids) self.assertEqual(dict_uuids, complete_dict_uuids) self.assertEqual(identity_uuid, complete_identity_uuid) self.assertEqual(viz_uuid, complete_viz_uuid) def test_resumable_pipeline_artifact_varies(self): with self.pool: with self.assertRaises(PipelineError) as e: self.pipeline( self.ints1, self.ints2, self.md1, self.int1, 'Hi', fail=True) ints1_uuids, ints2_uuids, int1_uuid, list_uuids, dict_uuids, \ identity_uuid, viz_uuid = e.exception.uuids # Pass int2 instead of int1 ints1_ret, ints2_ret, int1_ret, list_ret, dict_ret, \ identity_ret, viz_ret = self.pipeline( self.ints1, self.ints2, self.md1, self.int2, 'Hi') complete_ints1_uuids = _load_alias_uuids(ints1_ret) complete_ints2_uuids = 
_load_alias_uuids(ints2_ret) complete_int1_uuid = _load_alias_uuid(int1_ret) complete_list_uuids = _load_alias_uuids(list_ret) complete_dict_uuids = _load_alias_uuids(dict_ret) complete_identity_uuid = _load_alias_uuid(identity_ret) complete_viz_uuid = _load_alias_uuid(viz_ret) # Assert that the artifacts returned by the completed pipeline that # are implicated by the changed input are not aliases while the # others are self.assertNotEqual(ints1_uuids, complete_ints1_uuids) self.assertNotEqual(ints2_uuids, complete_ints2_uuids) self.assertNotEqual(int1_uuid, complete_int1_uuid) self.assertNotEqual(list_uuids, complete_list_uuids) self.assertEqual(dict_uuids, complete_dict_uuids) self.assertEqual(identity_uuid, complete_identity_uuid) self.assertEqual(viz_uuid, complete_viz_uuid) def test_resumable_pipeline_artifact_varies_parallel(self): with self.pool: with self.assertRaises(PipelineError) as e: with ParallelConfig(): future = self.pipeline.parallel( self.ints1, self.ints2, self.md1, self.int1, 'Hi', fail=True) future._result() ints1_uuids, ints2_uuids, int1_uuid, list_uuids, dict_uuids, \ identity_uuid, viz_uuid = e.exception.uuids # Pass int2 instead of int1 with ParallelConfig(): future = self.pipeline.parallel( self.ints1, self.ints2, self.md1, self.int2, 'Hi') ints1_ret, ints2_ret, int1_ret, list_ret, dict_ret, \ identity_ret, viz_ret = future._result() complete_ints1_uuids = _load_alias_uuids(ints1_ret) complete_ints2_uuids = _load_alias_uuids(ints2_ret) complete_int1_uuid = _load_alias_uuid(int1_ret) complete_list_uuids = _load_alias_uuids(list_ret) complete_dict_uuids = _load_alias_uuids(dict_ret) complete_identity_uuid = _load_alias_uuid(identity_ret) complete_viz_uuid = _load_alias_uuid(viz_ret) # Assert that the artifacts returned by the completed pipeline that # are implicated by the changed input are not aliases while the # others are self.assertNotEqual(ints1_uuids, complete_ints1_uuids) self.assertNotEqual(ints2_uuids, complete_ints2_uuids) self.assertNotEqual(int1_uuid, complete_int1_uuid) self.assertNotEqual(list_uuids, complete_list_uuids) self.assertEqual(dict_uuids, complete_dict_uuids) self.assertEqual(identity_uuid, complete_identity_uuid) self.assertEqual(viz_uuid, complete_viz_uuid) def test_resumable_pipeline_collection_varies(self): with self.pool: with self.assertRaises(PipelineError) as e: self.pipeline( self.ints1, self.ints2, self.md1, self.int1, 'Hi', fail=True) ints1_uuids, ints2_uuids, int1_uuid, list_uuids, dict_uuids, \ identity_uuid, viz_uuid = e.exception.uuids # Pass ints1_2 instead of ints1 ints1_ret, ints2_ret, int1_ret, list_ret, dict_ret, \ identity_ret, viz_ret = self.pipeline( self.ints1_2, self.ints2, self.md1, self.int1, 'Hi') complete_ints1_uuids = _load_alias_uuids(ints1_ret) complete_ints2_uuids = _load_alias_uuids(ints2_ret) complete_int1_uuid = _load_alias_uuid(int1_ret) complete_list_uuids = _load_alias_uuids(list_ret) complete_dict_uuids = _load_alias_uuids(dict_ret) complete_identity_uuid = _load_alias_uuid(identity_ret) complete_viz_uuid = _load_alias_uuid(viz_ret) # Assert that the artifacts returned by the completed pipeline that # are implicated by the changed input are not aliases while the # others are self.assertNotEqual(ints1_uuids, complete_ints1_uuids) self.assertNotEqual(ints2_uuids, complete_ints2_uuids) self.assertNotEqual(int1_uuid, complete_int1_uuid) self.assertNotEqual(list_uuids, complete_list_uuids) self.assertNotEqual(dict_uuids, complete_dict_uuids) self.assertEqual(identity_uuid, complete_identity_uuid) 
self.assertEqual(viz_uuid, complete_viz_uuid) def test_resumable_pipeline_collection_varies_parallel(self): with self.pool: with self.assertRaises(PipelineError) as e: with ParallelConfig(): future = self.pipeline.parallel( self.ints1, self.ints2, self.md1, self.int1, 'Hi', fail=True) future._result() ints1_uuids, ints2_uuids, int1_uuid, list_uuids, dict_uuids, \ identity_uuid, viz_uuid = e.exception.uuids # Pass ints1_2 instead of ints1 with ParallelConfig(): future = self.pipeline.parallel( self.ints1_2, self.ints2, self.md1, self.int2, 'Hi') ints1_ret, ints2_ret, int1_ret, list_ret, dict_ret, \ identity_ret, viz_ret = future._result() complete_ints1_uuids = _load_alias_uuids(ints1_ret) complete_ints2_uuids = _load_alias_uuids(ints2_ret) complete_int1_uuid = _load_alias_uuid(int1_ret) complete_list_uuids = _load_alias_uuids(list_ret) complete_dict_uuids = _load_alias_uuids(dict_ret) complete_identity_uuid = _load_alias_uuid(identity_ret) complete_viz_uuid = _load_alias_uuid(viz_ret) # Assert that the artifacts returned by the completed pipeline that # are implicated by the changed input are not aliases while the # others are self.assertNotEqual(ints1_uuids, complete_ints1_uuids) self.assertNotEqual(ints2_uuids, complete_ints2_uuids) self.assertNotEqual(int1_uuid, complete_int1_uuid) self.assertNotEqual(list_uuids, complete_list_uuids) self.assertNotEqual(dict_uuids, complete_dict_uuids) self.assertEqual(identity_uuid, complete_identity_uuid) self.assertEqual(viz_uuid, complete_viz_uuid) def test_resumable_pipeline_str_varies(self): with self.pool: with self.assertRaises(PipelineError) as e: self.pipeline( self.ints1, self.ints2, self.md1, self.int1, 'Hi', fail=True) ints1_uuids, ints2_uuids, int1_uuid, list_uuids, dict_uuids, \ identity_uuid, viz_uuid = e.exception.uuids # Pass in Bye instead of Hi ints1_ret, ints2_ret, int1_ret, list_ret, dict_ret, \ identity_ret, viz_ret = self.pipeline( self.ints1, self.ints2, self.md1, self.int1, 'Bye') complete_ints1_uuids = _load_alias_uuids(ints1_ret) complete_ints2_uuids = _load_alias_uuids(ints2_ret) complete_int1_uuid = _load_alias_uuid(int1_ret) complete_list_uuids = _load_alias_uuids(list_ret) complete_dict_uuids = _load_alias_uuids(dict_ret) complete_identity_uuid = _load_alias_uuid(identity_ret) complete_viz_uuid = _load_alias_uuid(viz_ret) # Assert that the artifacts returned by the completed pipeline that # are implicated by the changed input are not aliases while the # others are self.assertNotEqual(ints1_uuids, complete_ints1_uuids) self.assertNotEqual(ints2_uuids, complete_ints2_uuids) self.assertNotEqual(int1_uuid, complete_int1_uuid) self.assertNotEqual(list_uuids, complete_list_uuids) self.assertEqual(dict_uuids, complete_dict_uuids) self.assertEqual(identity_uuid, complete_identity_uuid) self.assertEqual(viz_uuid, complete_viz_uuid) def test_resumable_pipeline_str_varies_parallel(self): with self.pool: with self.assertRaises(PipelineError) as e: with ParallelConfig(): future = self.pipeline.parallel( self.ints1, self.ints2, self.md1, self.int1, 'Hi', fail=True) future._result() ints1_uuids, ints2_uuids, int1_uuid, list_uuids, dict_uuids, \ identity_uuid, viz_uuid = e.exception.uuids # Pass in Bye instead of Hi with ParallelConfig(): future = self.pipeline.parallel( self.ints1, self.ints2, self.md1, self.int1, 'Bye') ints1_ret, ints2_ret, int1_ret, list_ret, dict_ret, \ identity_ret, viz_ret = future._result() complete_ints1_uuids = _load_alias_uuids(ints1_ret) complete_ints2_uuids = _load_alias_uuids(ints2_ret) complete_int1_uuid = 
_load_alias_uuid(int1_ret) complete_list_uuids = _load_alias_uuids(list_ret) complete_dict_uuids = _load_alias_uuids(dict_ret) complete_identity_uuid = _load_alias_uuid(identity_ret) complete_viz_uuid = _load_alias_uuid(viz_ret) # Assert that the artifacts returned by the completed pipeline that # are implicated by the changed input are not aliases while the # others are self.assertNotEqual(ints1_uuids, complete_ints1_uuids) self.assertNotEqual(ints2_uuids, complete_ints2_uuids) self.assertNotEqual(int1_uuid, complete_int1_uuid) self.assertNotEqual(list_uuids, complete_list_uuids) self.assertEqual(dict_uuids, complete_dict_uuids) self.assertEqual(identity_uuid, complete_identity_uuid) self.assertEqual(viz_uuid, complete_viz_uuid) def test_resumable_pipeline_md_varies(self): with self.pool: with self.assertRaises(PipelineError) as e: self.pipeline( self.ints1, self.ints2, self.md1, self.int1, 'Hi', fail=True) ints1_uuids, ints2_uuids, int1_uuid, list_uuids, dict_uuids, \ identity_uuid, viz_uuid = e.exception.uuids # Pass in md2 instead of md1 ints1_ret, ints2_ret, int1_ret, list_ret, dict_ret, \ identity_ret, viz_ret = self.pipeline( self.ints1, self.ints2, self.md2, self.int1, 'Hi') complete_ints1_uuids = _load_alias_uuids(ints1_ret) complete_ints2_uuids = _load_alias_uuids(ints2_ret) complete_int1_uuid = _load_alias_uuid(int1_ret) complete_list_uuids = _load_alias_uuids(list_ret) complete_dict_uuids = _load_alias_uuids(dict_ret) complete_identity_uuid = _load_alias_uuid(identity_ret) complete_viz_uuid = _load_alias_uuid(viz_ret) # Assert that the artifacts returned by the completed pipeline that # are implicated by the changed input are not aliases while the # others are self.assertEqual(ints1_uuids, complete_ints1_uuids) self.assertEqual(ints2_uuids, complete_ints2_uuids) self.assertEqual(int1_uuid, complete_int1_uuid) self.assertEqual(list_uuids, complete_list_uuids) self.assertEqual(dict_uuids, complete_dict_uuids) self.assertNotEqual(identity_uuid, complete_identity_uuid) self.assertEqual(viz_uuid, complete_viz_uuid) def test_resumable_pipeline_md_varies_parallel(self): with self.pool: with self.assertRaises(PipelineError) as e: with ParallelConfig(): future = self.pipeline.parallel( self.ints1, self.ints2, self.md1, self.int1, 'Hi', fail=True) future._result() ints1_uuids, ints2_uuids, int1_uuid, list_uuids, dict_uuids, \ identity_uuid, viz_uuid = e.exception.uuids # Pass in md2 instead of md1 with ParallelConfig(): future = self.pipeline.parallel( self.ints1, self.ints2, self.md2, self.int1, 'Hi') ints1_ret, ints2_ret, int1_ret, list_ret, dict_ret, \ identity_ret, viz_ret = future._result() complete_ints1_uuids = _load_alias_uuids(ints1_ret) complete_ints2_uuids = _load_alias_uuids(ints2_ret) complete_int1_uuid = _load_alias_uuid(int1_ret) complete_list_uuids = _load_alias_uuids(list_ret) complete_dict_uuids = _load_alias_uuids(dict_ret) complete_identity_uuid = _load_alias_uuid(identity_ret) complete_viz_uuid = _load_alias_uuid(viz_ret) # Assert that the artifacts returned by the completed pipeline that # are implicated by the changed input are not aliases while the # others are self.assertEqual(ints1_uuids, complete_ints1_uuids) self.assertEqual(ints2_uuids, complete_ints2_uuids) self.assertEqual(int1_uuid, complete_int1_uuid) self.assertEqual(list_uuids, complete_list_uuids) self.assertEqual(dict_uuids, complete_dict_uuids) self.assertNotEqual(identity_uuid, complete_identity_uuid) self.assertEqual(viz_uuid, complete_viz_uuid) def test_nested_resumable_pipeline(self): with 
self.pool: with self.assertRaises(PipelineError) as e: self.nested_pipeline( self.ints1, self.ints2, self.md1, self.int1, 'Hi', fail=True) ints1_uuids, ints2_uuids, int1_uuid, list_uuids, dict_uuids, \ identity_uuid, viz_uuid = e.exception.uuids # We now run the not nested version. This will be able to reuse the # returns from varied_method ints1_ret, ints2_ret, int1_ret, list_ret, dict_ret, \ identity_ret, viz_ret = self.nested_pipeline( self.ints1, self.ints2, self.md1, self.int1, 'Hi') complete_ints1_uuids = _load_nested_alias_uuids( ints1_ret, self.cache) complete_ints2_uuids = _load_nested_alias_uuids( ints2_ret, self.cache) complete_int1_uuid = _load_nested_alias_uuid(int1_ret, self.cache) complete_list_uuids = _load_alias_uuids(list_ret) complete_dict_uuids = _load_alias_uuids(dict_ret) complete_identity_uuid = _load_alias_uuid(identity_ret) complete_viz_uuid = _load_alias_uuid(viz_ret) # Assert that the artifacts returned by the completed run of the # pipeline are aliases of the artifacts created by the first failed # run self.assertEqual(ints1_uuids, complete_ints1_uuids) self.assertEqual(ints2_uuids, complete_ints2_uuids) self.assertEqual(int1_uuid, complete_int1_uuid) self.assertEqual(list_uuids, complete_list_uuids) self.assertEqual(dict_uuids, complete_dict_uuids) self.assertEqual(identity_uuid, complete_identity_uuid) self.assertEqual(viz_uuid, complete_viz_uuid) def test_nested_resumable_pipeline_parallel(self): with self.pool: with self.assertRaises(PipelineError) as e: with ParallelConfig(): future = self.nested_pipeline.parallel( self.ints1, self.ints2, self.md1, self.int1, 'Hi', fail=True) future._result() ints1_uuids, ints2_uuids, int1_uuid, list_uuids, dict_uuids, \ identity_uuid, viz_uuid = e.exception.uuids with ParallelConfig(): future = self.nested_pipeline.parallel( self.ints1, self.ints2, self.md1, self.int1, 'Hi') ints1_ret, ints2_ret, int1_ret, list_ret, dict_ret, \ identity_ret, viz_ret = future._result() complete_ints1_uuids = _load_nested_alias_uuids( ints1_ret, self.cache) complete_ints2_uuids = _load_nested_alias_uuids( ints2_ret, self.cache) complete_int1_uuid = _load_nested_alias_uuid(int1_ret, self.cache) complete_list_uuids = _load_alias_uuids(list_ret) complete_dict_uuids = _load_alias_uuids(dict_ret) complete_identity_uuid = _load_alias_uuid(identity_ret) complete_viz_uuid = _load_alias_uuid(viz_ret) # Assert that the artifacts returned by the completed run of the # pipeline are aliases of the artifacts created by the first failed # run self.assertEqual(ints1_uuids, complete_ints1_uuids) self.assertEqual(ints2_uuids, complete_ints2_uuids) self.assertEqual(int1_uuid, complete_int1_uuid) self.assertEqual(list_uuids, complete_list_uuids) self.assertEqual(dict_uuids, complete_dict_uuids) self.assertEqual(identity_uuid, complete_identity_uuid) self.assertEqual(viz_uuid, complete_viz_uuid) def test_resumable_pipeline_default_args(self): with self.pool: with self.assertRaises(PipelineError) as e: self.pipeline( self.ints1, self.ints2, self.md1, fail=True) ints1_uuids, ints2_uuids, int1_uuid, list_uuids, dict_uuids, \ identity_uuid, viz_uuid = e.exception.uuids ints1_ret, ints2_ret, int1_ret, list_ret, dict_ret, \ identity_ret, viz_ret = self.pipeline( self.ints1, self.ints2, self.md1) complete_ints1_uuids = _load_alias_uuids(ints1_ret) complete_ints2_uuids = _load_alias_uuids(ints2_ret) complete_int1_uuid = _load_alias_uuid(int1_ret) complete_list_uuids = _load_alias_uuids(list_ret) complete_dict_uuids = _load_alias_uuids(dict_ret) complete_identity_uuid = 
_load_alias_uuid(identity_ret) complete_viz_uuid = _load_alias_uuid(viz_ret) # Assert that the artifacts returned by the completed run of # the pipeline are aliases of the artifacts created by the # first failed run self.assertEqual(ints1_uuids, complete_ints1_uuids) self.assertEqual(ints2_uuids, complete_ints2_uuids) self.assertEqual(int1_uuid, complete_int1_uuid) self.assertEqual(list_uuids, complete_list_uuids) self.assertEqual(dict_uuids, complete_dict_uuids) self.assertEqual(identity_uuid, complete_identity_uuid) self.assertEqual(viz_uuid, complete_viz_uuid) def test_resumable_pipeline_default_args_parallel(self): with self.pool: with self.assertRaises(PipelineError) as e: with ParallelConfig(): future = self.pipeline.parallel( self.ints1, self.ints2, self.md1, fail=True) future._result() ints1_uuids, ints2_uuids, int1_uuid, list_uuids, dict_uuids, \ identity_uuid, viz_uuid = e.exception.uuids with ParallelConfig(): future = self.pipeline.parallel( self.ints1, self.ints2, self.md1) ints1_ret, ints2_ret, int1_ret, list_ret, dict_ret, \ identity_ret, viz_ret = future._result() complete_ints1_uuids = _load_alias_uuids(ints1_ret) complete_ints2_uuids = _load_alias_uuids(ints2_ret) complete_int1_uuid = _load_alias_uuid(int1_ret) complete_list_uuids = _load_alias_uuids(list_ret) complete_dict_uuids = _load_alias_uuids(dict_ret) complete_identity_uuid = _load_alias_uuid(identity_ret) complete_viz_uuid = _load_alias_uuid(viz_ret) # Assert that the artifacts returned by the completed run of # the pipeline are aliases of the artifacts created by the # first failed run self.assertEqual(ints1_uuids, complete_ints1_uuids) self.assertEqual(ints2_uuids, complete_ints2_uuids) self.assertEqual(int1_uuid, complete_int1_uuid) self.assertEqual(list_uuids, complete_list_uuids) self.assertEqual(dict_uuids, complete_dict_uuids) self.assertEqual(identity_uuid, complete_identity_uuid) self.assertEqual(viz_uuid, complete_viz_uuid) def test_nested_resumable_pipeline_default_args(self): with self.pool: with self.assertRaises(PipelineError) as e: self.nested_pipeline( self.ints1, self.ints2, self.md1, fail=True) ints1_uuids, ints2_uuids, int1_uuid, list_uuids, dict_uuids, \ identity_uuid, viz_uuid = e.exception.uuids # We now run the not nested version. 
This will be able to reuse the # returns from varied_method ints1_ret, ints2_ret, int1_ret, list_ret, dict_ret, \ identity_ret, viz_ret = self.nested_pipeline( self.ints1, self.ints2, self.md1) complete_ints1_uuids = _load_nested_alias_uuids( ints1_ret, self.cache) complete_ints2_uuids = _load_nested_alias_uuids( ints2_ret, self.cache) complete_int1_uuid = _load_nested_alias_uuid(int1_ret, self.cache) complete_list_uuids = _load_alias_uuids(list_ret) complete_dict_uuids = _load_alias_uuids(dict_ret) complete_identity_uuid = _load_alias_uuid(identity_ret) complete_viz_uuid = _load_alias_uuid(viz_ret) # Assert that the artifacts returned by the completed run of the # pipeline are aliases of the artifacts created by the first failed # run self.assertEqual(ints1_uuids, complete_ints1_uuids) self.assertEqual(ints2_uuids, complete_ints2_uuids) self.assertEqual(int1_uuid, complete_int1_uuid) self.assertEqual(list_uuids, complete_list_uuids) self.assertEqual(dict_uuids, complete_dict_uuids) self.assertEqual(identity_uuid, complete_identity_uuid) self.assertEqual(viz_uuid, complete_viz_uuid) def test_nested_resumable_pipeline_parallel_default_args(self): with self.pool: with self.assertRaises(PipelineError) as e: with ParallelConfig(): future = self.nested_pipeline.parallel( self.ints1, self.ints2, self.md1, fail=True) future._result() ints1_uuids, ints2_uuids, int1_uuid, list_uuids, dict_uuids, \ identity_uuid, viz_uuid = e.exception.uuids with ParallelConfig(): future = self.nested_pipeline.parallel( self.ints1, self.ints2, self.md1) ints1_ret, ints2_ret, int1_ret, list_ret, dict_ret, \ identity_ret, viz_ret = future._result() complete_ints1_uuids = _load_nested_alias_uuids( ints1_ret, self.cache) complete_ints2_uuids = _load_nested_alias_uuids( ints2_ret, self.cache) complete_int1_uuid = _load_nested_alias_uuid(int1_ret, self.cache) complete_list_uuids = _load_alias_uuids(list_ret) complete_dict_uuids = _load_alias_uuids(dict_ret) complete_identity_uuid = _load_alias_uuid(identity_ret) complete_viz_uuid = _load_alias_uuid(viz_ret) # Assert that the artifacts returned by the completed run of the # pipeline are aliases of the artifacts created by the first failed # run self.assertEqual(ints1_uuids, complete_ints1_uuids) self.assertEqual(ints2_uuids, complete_ints2_uuids) self.assertEqual(int1_uuid, complete_int1_uuid) self.assertEqual(list_uuids, complete_list_uuids) self.assertEqual(dict_uuids, complete_dict_uuids) self.assertEqual(identity_uuid, complete_identity_uuid) self.assertEqual(viz_uuid, complete_viz_uuid) qiime2-2024.5.0/qiime2/core/tests/test_util.py000066400000000000000000000357371462552636000210300ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import shutil import unittest import tempfile import pathlib import collections import datetime import dateutil.relativedelta as relativedelta import pytest import qiime2.core.util as util from qiime2.core.testing.type import Foo, Bar, Baz class TestFindDuplicates(unittest.TestCase): # NOTE: wrapping each input in `iter()` because that is the interface # expected by `find_duplicates`, and avoids the need to test other iterable # types, e.g. list, tuples, generators, etc. 
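# [Editorial sketch] The tests below exercise util.find_duplicates as a
# single-pass duplicate detector over an iterator. A minimal reference
# implementation consistent with these expectations might look like the
# commented helper below; this is an illustration only, not the actual
# qiime2.core.util code.
#
#     def find_duplicates(iterator):
#         seen, duplicates = set(), set()
#         for item in iterator:
#             if item in seen:
#                 duplicates.add(item)
#             seen.add(item)
#         return duplicates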
def test_empty_iterable(self): obs = util.find_duplicates(iter([])) self.assertEqual(obs, set()) def test_single_value(self): obs = util.find_duplicates(iter(['foo'])) self.assertEqual(obs, set()) def test_multiple_values_no_duplicates(self): obs = util.find_duplicates(iter(['foo', 'bar'])) self.assertEqual(obs, set()) def test_one_duplicate(self): obs = util.find_duplicates(iter(['foo', 'bar', 'foo'])) self.assertEqual(obs, {'foo'}) def test_multiple_duplicates(self): obs = util.find_duplicates( iter(['foo', 'bar', 'foo', 'baz', 'foo', 'bar'])) self.assertEqual(obs, {'foo', 'bar'}) def test_all_duplicates(self): obs = util.find_duplicates( iter(['foo', 'bar', 'baz', 'baz', 'bar', 'foo'])) self.assertEqual(obs, {'foo', 'bar', 'baz'}) def test_different_hashables(self): iterable = iter(['foo', 42, -9.999, 'baz', ('a', 'b'), 42, 'foo', ('a', 'b', 'c'), ('a', 'b')]) obs = util.find_duplicates(iterable) self.assertEqual(obs, {'foo', 42, ('a', 'b')}) class TestDurationTime(unittest.TestCase): def test_time_travel(self): start = datetime.datetime(1987, 10, 27, 1, 21, 2, 50) end = datetime.datetime(1985, 10, 26, 1, 21, 0, 0) reldelta = relativedelta.relativedelta(end, start) self.assertEqual( util.duration_time(reldelta), '-2 years, -1 days, -3 seconds, and 999950 microseconds') def test_no_duration(self): time = datetime.datetime(1985, 10, 26, 1, 21, 0) reldelta = relativedelta.relativedelta(time, time) self.assertEqual(util.duration_time(reldelta), '0 microseconds') def test_singular(self): start = datetime.datetime(1985, 10, 26, 1, 21, 0, 0) end = datetime.datetime(1986, 11, 27, 2, 22, 1, 1) reldelta = relativedelta.relativedelta(end, start) self.assertEqual( util.duration_time(reldelta), '1 year, 1 month, 1 day, 1 hour, 1 minute, 1 second,' ' and 1 microsecond') def test_plural(self): start = datetime.datetime(1985, 10, 26, 1, 21, 0, 0) end = datetime.datetime(1987, 12, 28, 3, 23, 2, 2) reldelta = relativedelta.relativedelta(end, start) self.assertEqual( util.duration_time(reldelta), '2 years, 2 months, 2 days, 2 hours, 2 minutes, 2 seconds,' ' and 2 microseconds') def test_missing(self): start = datetime.datetime(1985, 10, 26, 1, 21, 0, 0) end = datetime.datetime(1987, 10, 27, 1, 21, 2, 50) reldelta = relativedelta.relativedelta(end, start) self.assertEqual( util.duration_time(reldelta), '2 years, 1 day, 2 seconds, and 50 microseconds') def test_unusually_round_number(self): start = datetime.datetime(1985, 10, 26, 1, 21, 0, 0) end = datetime.datetime(1985, 10, 27, 1, 21, 0, 0) reldelta = relativedelta.relativedelta(end, start) self.assertEqual( util.duration_time(reldelta), '1 day') def test_microseconds(self): start = datetime.datetime(1985, 10, 26, 1, 21, 0, 0) end = datetime.datetime(1985, 10, 26, 1, 21, 0, 1955) reldelta = relativedelta.relativedelta(end, start) self.assertEqual( util.duration_time(reldelta), '1955 microseconds') class TestMD5Sum(unittest.TestCase): # All expected results where generated via GNU coreutils md5sum def setUp(self): self.test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') self.test_path = pathlib.Path(self.test_dir.name) def tearDown(self): self.test_dir.cleanup() def make_file(self, bytes_): path = self.test_path / 'file' with path.open(mode='wb') as fh: fh.write(bytes_) return path def test_empty_file(self): self.assertEqual(util.md5sum(self.make_file(b'')), 'd41d8cd98f00b204e9800998ecf8427e') def test_single_byte_file(self): self.assertEqual(util.md5sum(self.make_file(b'a')), '0cc175b9c0f1b6a831c399e269772661') def test_large_file(self): 
path = self.make_file(b'verybigfile' * (1024 * 50)) self.assertEqual(util.md5sum(path), '27d64211ee283283ad866c18afa26611') def test_can_use_string(self): string_path = str(self.make_file(b'Normal text\nand things\n')) self.assertEqual(util.md5sum(string_path), '93b048d0202e4b06b658f3aef1e764d3') class TestMD5SumDirectory(unittest.TestCase): # All expected results where generated via GNU coreutils md5sum def setUp(self): self.test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') self.test_path = pathlib.Path(self.test_dir.name) def tearDown(self): self.test_dir.cleanup() def make_file(self, bytes_, relpath): path = self.test_path / relpath with path.open(mode='wb') as fh: fh.write(bytes_) return path def test_empty_directory(self): self.assertEqual(util.md5sum_directory(self.test_path), collections.OrderedDict()) def test_nested_empty_directories(self): (self.test_path / 'foo').mkdir() (self.test_path / 'foo' / 'bar').mkdir() (self.test_path / 'baz').mkdir() self.assertEqual(util.md5sum_directory(self.test_path), collections.OrderedDict()) def test_single_file(self): self.make_file(b'Normal text\nand things\n', 'foobarbaz.txt') self.assertEqual( util.md5sum_directory(self.test_path), collections.OrderedDict([ ('foobarbaz.txt', '93b048d0202e4b06b658f3aef1e764d3') ])) def test_single_file_md5sum_python(self): with tempfile.TemporaryDirectory() as tempdir: fp = pathlib.Path(tempdir) / 'test.txt' with open(fp, 'w') as fh: fh.write('contents\n') exp = 'e66545a2155380046fce3fdbd32a6b4f' obs = util.md5sum_python(fp) self.assertEqual(exp, obs) @pytest.mark.skipif( shutil.which('md5sum') is None, reason='md5sum executable is required' ) def test_single_file_md5sum_native(self): with tempfile.TemporaryDirectory() as tempdir: fp = pathlib.Path(tempdir) / 'test.txt' with open(fp, 'w') as fh: fh.write('contents\n') exp = 'e66545a2155380046fce3fdbd32a6b4f' obs = util.md5sum_native(fp) self.assertEqual(exp, obs) def test_single_file_nested(self): nested_dir = self.test_path / 'bar' nested_dir.mkdir() filepath = (nested_dir / 'foo.baz').relative_to(self.test_path) self.make_file(b'anything at all', filepath) self.assertEqual( util.md5sum_directory(self.test_path), collections.OrderedDict([ ('bar/foo.baz', 'dcc0975b66728be0315abae5968379cb') ])) def test_sorted_decent(self): nested_dir = self.test_path / 'beta' nested_dir.mkdir() filepath = (nested_dir / '10').relative_to(self.test_path) self.make_file(b'10', filepath) filepath = (nested_dir / '1').relative_to(self.test_path) self.make_file(b'1', filepath) filepath = (nested_dir / '2').relative_to(self.test_path) self.make_file(b'2', filepath) nested_dir = self.test_path / 'alpha' nested_dir.mkdir() filepath = (nested_dir / 'foo').relative_to(self.test_path) self.make_file(b'foo', filepath) filepath = (nested_dir / 'bar').relative_to(self.test_path) self.make_file(b'bar', filepath) self.make_file(b'z', 'z') self.assertEqual( list(util.md5sum_directory(self.test_path).items()), [ ('z', 'fbade9e36a3f36d3d676c1b808451dd7'), ('alpha/bar', '37b51d194a7513e45b56f6524f2d51f2'), ('alpha/foo', 'acbd18db4cc2f85cedef654fccc4a4d8'), ('beta/1', 'c4ca4238a0b923820dcc509a6f75849b'), ('beta/10', 'd3d9446802a44259755d38e6d163e820'), ('beta/2', 'c81e728d9d4c2f636f067f89cc14862c'), ]) def test_can_use_string(self): nested_dir = self.test_path / 'bar' nested_dir.mkdir() filepath = (nested_dir / 'foo.baz').relative_to(self.test_path) self.make_file(b'anything at all', filepath) self.assertEqual( util.md5sum_directory(str(self.test_path)), collections.OrderedDict([ 
('bar/foo.baz', 'dcc0975b66728be0315abae5968379cb') ])) class TestChecksumFormat(unittest.TestCase): def test_to_simple(self): line = util.to_checksum_format('this/is/a/filepath', 'd9724aeba59d8cea5265f698b2c19684') self.assertEqual( line, 'd9724aeba59d8cea5265f698b2c19684 this/is/a/filepath') def test_from_simple(self): fp, chks = util.from_checksum_format( 'd9724aeba59d8cea5265f698b2c19684 this/is/a/filepath') self.assertEqual(fp, 'this/is/a/filepath') self.assertEqual(chks, 'd9724aeba59d8cea5265f698b2c19684') def test_to_hard(self): # two kinds of backslash n to trip up the escaping: line = util.to_checksum_format('filepath/\n/with/\\newline', '939aaaae6098ebdab049b0f3abe7b68c') # Note raw string self.assertEqual( line, r'\939aaaae6098ebdab049b0f3abe7b68c filepath/\n/with/\\newline') def test_from_hard(self): fp, chks = util.from_checksum_format( r'\939aaaae6098ebdab049b0f3abe7b68c filepath/\n/with/\\newline' + '\n') # newline from a checksum "file" self.assertEqual(fp, 'filepath/\n/with/\\newline') self.assertEqual(chks, '939aaaae6098ebdab049b0f3abe7b68c') def test_filepath_with_leading_backslash(self): line = r'\d41d8cd98f00b204e9800998ecf8427e \\.qza' fp, chks = util.from_checksum_format(line) self.assertEqual(chks, 'd41d8cd98f00b204e9800998ecf8427e') self.assertEqual(fp, r'\.qza') def test_filepath_with_leading_backslashes(self): line = r'\d41d8cd98f00b204e9800998ecf8427e \\\\\\.qza' fp, chks = util.from_checksum_format(line) self.assertEqual(fp, r'\\\.qza') self.assertEqual(chks, 'd41d8cd98f00b204e9800998ecf8427e') def test_impossible_backslash(self): # It may be impossible to generate a single '\' in the md5sum digest, # because each '\' is escaped (as '\\') in the digest. We'll # test for it anyway, for full coverage. fp, _ = util.from_checksum_format( r'fake_checksum \.qza' ) fp2, _ = util.from_checksum_format( r'\fake_checksum \.qza' ) self.assertEqual(fp, r'\.qza') self.assertEqual(fp2, r'\.qza') def test_from_legacy_format(self): fp, chks = util.from_checksum_format( r'0ed29022ace300b4d96847882daaf0ef *this/means/binary/mode') self.assertEqual(fp, 'this/means/binary/mode') self.assertEqual(chks, '0ed29022ace300b4d96847882daaf0ef') def check_roundtrip(self, filepath, checksum): line = util.to_checksum_format(filepath, checksum) new_fp, new_chks = util.from_checksum_format(line) self.assertEqual(new_fp, filepath) self.assertEqual(new_chks, checksum) def test_nonsense(self): self.check_roundtrip( r'^~gpfh)bU)WvN/;3jR6H-*={iEBM`(flY2>_|5mp8{-h>Ou\{{ImLT>h;XuC,.#', '89241859050e5a43ccb5f7aa0bca7a3a') self.check_roundtrip( r"l5AAPGKLP5Mcv0b`@zDR\XTTnF;[2M>O/>,d-^Nti'vpH\{>q)/4&CuU/xQ}z,O", 'c47d43cadb60faf30d9405a3e2592b26') self.check_roundtrip( r'FZ\rywG:7Q%"J@}Rk>\&zbWdS0nhEl_k1y1cMU#Lk_"*#*/uGi>Evl7M1suNNVE', '9c7753f252116473994e8bffba2c620b') class TestSortedPoset(unittest.TestCase): def test_already_sorted_incomparable(self): a = [Foo, Bar, Baz] r = util.sorted_poset(a) # Incomparable elements, so as long as they # are present, any order is valid. 
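# [Editorial note] Foo, Bar and Baz are mutually incomparable semantic types,
# so any permutation is a valid sort; an equivalent, more compact check using
# the standard unittest API would be:
#
#     self.assertCountEqual(r, [Foo, Bar, Baz])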
self.assertEqual(len(r), 3) self.assertIn(Foo, r) self.assertIn(Bar, r) self.assertIn(Baz, r) def test_already_sorted_all_comparable(self): a = [Foo, Foo | Bar, Foo | Bar | Baz] r = util.sorted_poset(a) self.assertEqual(a, r) def test_already_sorted_all_comparable_reverse(self): a = [Foo, Foo | Bar, Foo | Bar | Baz] r = util.sorted_poset(a, reverse=True) self.assertEqual(list(reversed(a)), r) def test_mixed_elements(self): a = [Foo | Bar, Foo | Baz, Foo] r = util.sorted_poset(a) self.assertEqual(r[0], Foo) # Order of others won't matter def test_mxed_elements_diamond(self): a = [Foo | Bar, Foo, Bar | Baz | Foo, Baz | Foo] r = util.sorted_poset(a) self.assertEqual(r[0], Foo) self.assertEqual(r[-1], Bar | Baz | Foo) def test_multiple_minimums(self): a = [Foo | Bar, Foo, Bar | Baz | Foo, Bar, Baz] r = util.sorted_poset(a) idx_foo = r.index(Foo) idx_bar = r.index(Bar) idx_foobar = r.index(Foo | Bar) self.assertLess(idx_foo, idx_foobar) self.assertLess(idx_bar, idx_foobar) self.assertEqual(r[-1], Bar | Baz | Foo) def test_multiple_equivalents(self): a = [Baz, Foo | Bar, Foo, Bar | Foo, Bar] r = util.sorted_poset(a) idx_foo = r.index(Foo) idx_bar = r.index(Bar) idx_barfoo = r.index(Bar | Foo) idx_foobar = r.index(Foo | Bar) adjacent = -1 <= idx_barfoo - idx_foobar <= 1 self.assertTrue(adjacent) self.assertLess(idx_foo, idx_barfoo) self.assertLess(idx_foo, idx_foobar) self.assertLess(idx_bar, idx_barfoo) self.assertLess(idx_bar, idx_foobar) if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/core/tests/test_validate.py000066400000000000000000000256171462552636000216400ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- from qiime2.core.exceptions import ValidationError, ImplementationError import unittest from qiime2.core.validate import ValidationObject from qiime2.sdk import PluginManager from qiime2.plugin.plugin import ValidatorRecord, Plugin from qiime2.core.testing.type import (IntSequence1, AscIntSequence, Kennel, Dog, Squid, Octopus) from qiime2.core.testing.format import IntSequenceFormat, Cephalapod class TestValidationObject(unittest.TestCase): def setUp(self): self.simple_int_seq = IntSequenceFormat() with self.simple_int_seq.open() as fh: fh.write('\n'.join(map(str, range(3)))) self.simple_int_seq.validate(level='max') def test_initialization(self): validator_object = ValidationObject(IntSequence1) self.assertEqual(validator_object.concrete_type, IntSequence1) def test_add_validator(self): def test_validator_method(data: list, level): pass test_record = ValidatorRecord(validator=test_validator_method, view=list, plugin='this_plugin', context=IntSequence1) validator_object = ValidationObject(IntSequence1) validator_object.add_validator(test_record) self.assertEqual(validator_object._validators, [test_record]) def test_add_validation_object(self): first_VO = ValidationObject(IntSequence1) second_VO = ValidationObject(IntSequence1) def first_validator(data: list, level): pass def second_validator(data: list, level): pass first_record = ValidatorRecord(validator=first_validator, view=list, plugin='this_plugin', context=IntSequence1) second_record = ValidatorRecord(validator=second_validator, view=list, plugin='this_plugin', context=IntSequence1) first_VO.add_validator(first_record) second_VO.add_validator(second_record) # Allows us to demonstrate add_validation_object sets _is_sorted to # false first_VO._sort_validators() first_VO.add_validation_object(second_VO) self.assertEqual(first_VO._validators, [first_record, second_record]) self.assertFalse(first_VO._is_sorted) def test_catch_different_concrete_types(self): squid_vo = ValidationObject(Squid) octopus_vo = ValidationObject(Octopus) def squid_validator(data: Cephalapod, level): pass def octopus_validator(data: Cephalapod, level): pass squid_record = ValidatorRecord(validator=squid_validator, view=Cephalapod, plugin='ocean_plugin', context=Squid) octopus_record = ValidatorRecord(validator=octopus_validator, view=Cephalapod, plugin='sea_plugin', context=Octopus) squid_vo.add_validator(squid_record) octopus_vo.add_validator(octopus_record) with self.assertRaisesRegex(TypeError, "Unable to add"): squid_vo.add_validation_object(octopus_vo) def test_public_validators_generation(self): validator_object = ValidationObject(IntSequence1) def first_validator(data: list, level): pass def second_validator(data: list, level): pass first_record = ValidatorRecord(validator=first_validator, view=list, plugin='this_plugin', context=IntSequence1) second_record = ValidatorRecord(validator=second_validator, view=list, plugin='this_plugin', context=IntSequence1) validator_object.add_validator(first_record) validator_object.add_validator(second_record) self.assertEqual(validator_object.validators, [first_record, second_record]) self.assertTrue(validator_object._is_sorted) def test_run_validators(self): validator_object = ValidationObject(IntSequence1) has_run = False has_also_run = False def test_validator_method(data: list, level): nonlocal has_run has_run = True self.assertEqual(data, [0, 1, 2]) self.assertEqual(level, 'max') def test_another_validator(data: IntSequenceFormat, 
level): nonlocal has_also_run has_also_run = True self.assertEqual(level, 'max') test_record1 = ValidatorRecord(validator=test_validator_method, view=list, plugin='this_plugin', context=IntSequence1 | AscIntSequence) test_record2 = ValidatorRecord(validator=test_another_validator, view=IntSequenceFormat, plugin='this_plugin', context=IntSequence1) validator_object.add_validator(test_record1) validator_object.add_validator(test_record2) validator_object(self.simple_int_seq, level='max') self.assertTrue(has_run) self.assertTrue(has_also_run) def test_run_validators_validation_exception(self): validator_object = ValidationObject(AscIntSequence) def test_raising_validation_exception(data: list, level): raise ValidationError("2021-08-24") test_record = ValidatorRecord( validator=test_raising_validation_exception, view=list, plugin='this_plugin', context=AscIntSequence) validator_object.add_validator(test_record) with self.assertRaisesRegex(ValidationError, "2021-08-24"): validator_object(data=[], level=None) def test_run_validators_unknown_exception(self): validator_object = ValidationObject(AscIntSequence) def test_raising_validation_exception(data: list, level): raise KeyError("2021-08-24") test_record = ValidatorRecord( validator=test_raising_validation_exception, view=list, plugin='this_plugin', context=AscIntSequence) validator_object.add_validator(test_record) with self.assertRaisesRegex(ImplementationError, "attempted to validate"): validator_object(data=[], level=None) def test_validator_sorts(self): self.pm = PluginManager() test_object = self.pm.validators[Squid] self.assertFalse(test_object._is_sorted) exp = ['validator_sort_first', 'validator_sort_middle', 'validator_sort_middle_b', 'validator_sort_last'] exp2 = ['validator_sort_first', 'validator_sort_middle_b', 'validator_sort_middle', 'validator_sort_last'] obs = [record.validator.__name__ for record in test_object.validators] self.assertIn(obs, [exp, exp2]) self.assertTrue(test_object._is_sorted) class TestValidatorIntegration(unittest.TestCase): def setUp(self): # setup test plugin self.test_plugin = Plugin(name='validator_test_plugin', version='0.0.1', website='test.com', package='qiime2.core.tests', project_name='validator_test') self.pm = PluginManager() # setup test data self.simple_int_seq = IntSequenceFormat() with self.simple_int_seq.open() as fh: fh.write('\n'.join(map(str, range(3)))) self.simple_int_seq.validate(level='max') def tearDown(self): # This is a deadman switch to ensure that the test_plugin has been # added self.assertIn(self.test_plugin.name, self.pm.plugins) self.pm.forget_singleton() def test_validator_from_each_type_in_expression(self): @self.test_plugin.register_validator(IntSequence1 | AscIntSequence) def blank_validator(data: list, level): pass self.pm.add_plugin(self.test_plugin) def test_no_transformer_available(self): @self.test_plugin.register_validator(IntSequence1 | Kennel[Dog]) def blank_validator(data: list, level): pass with self.assertRaisesRegex( AssertionError, r"Kennel\[Dog\].*blank_validator.*transform.*builtins:list"): self.pm.add_plugin(self.test_plugin) class TestValidatorRegistration(unittest.TestCase): def setUp(self): self.test_plugin = Plugin(name='validator_test_plugin', version='0.0.1', website='test.com', package='qiime2.core.tests', project_name='validator_test') def test_catch_missing_validator_arg(self): run_checker = False with self.assertRaisesRegex(TypeError, "does not contain the" " required arguments"): run_checker = True 
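# [Editorial sketch] For contrast with the failing registration below, a
# validator that satisfies the signature check mirrors the ones registered
# earlier in this file: it must accept exactly `data` (annotated with a view
# type) and `level`, e.g.
#
#     @self.test_plugin.register_validator(IntSequence1)
#     def well_formed_validator(data: list, level):
#         pass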
@self.test_plugin.register_validator(IntSequence1) def validator_missing_level(data: list): pass assert run_checker def test_catch_extra_validator_arg(self): run_checker = False with self.assertRaisesRegex(TypeError, "does not contain the" " required arguments"): run_checker = True @self.test_plugin.register_validator(IntSequence1) def validator_extra_arg(data: list, level, spleen): pass assert run_checker def test_catch_no_data_annotation_in_validator(self): run_checker = False with self.assertRaisesRegex(TypeError, "No expected view type" " provided as annotation for `data`" " variable"): run_checker = True @self.test_plugin.register_validator(IntSequence1) def validator_no_view_annotation(data, level): pass assert run_checker qiime2-2024.5.0/qiime2/core/transform.py000066400000000000000000000203251462552636000176500ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import pathlib from qiime2 import sdk from qiime2.plugin import model from qiime2.core import util def identity_transformer(view): return view class ModelType: @staticmethod def from_view_type(view_type): if issubclass(view_type, model.base.FormatBase): if issubclass(view_type, model.SingleFileDirectoryFormatBase): # HACK: this is necessary because we need to be able to "act" # like a FileFormat when looking up transformers, but our # input/output coercion still needs to bridge the # transformation as we do not have transitivity # In other words we have DX and we have transformers of X # In a perfect world we would automatically define DX -> X and # let transitivity handle it, but since that doesn't exist, we # need to treat DX as if it were X and coerce behind the scenes # TODO: redo this when transformers are transitive return SingleFileDirectoryFormatType(view_type) # Normal format type return FormatType(view_type) else: # TODO: supporting stdlib.typing may require an alternate # model type as `isinstance` is a meaningless operation # for them so validation would need to be handled differently return ObjectType(view_type) def __init__(self, view_type): self._pm = sdk.PluginManager() self._view_type = view_type self._view_name = util.get_view_name(self._view_type) self._record = None if self._view_name in self._pm.views: self._record = self._pm.views[self._view_name] def make_transformation(self, other, recorder=None): # record may be None in case of identity transformer transformer, transformer_record = self._get_transformer_to(other) if transformer is None: raise Exception("No transformation from %r to %r" % (self._view_type, other._view_type)) if recorder is not None: recorder(transformer_record, input_name=self._view_name, input_record=self._record, output_name=other._view_name, output_record=other._record) def transformation(view, validate_level='min'): view = self.coerce_view(view) self.validate(view, level=validate_level) new_view = transformer(view) new_view = other.coerce_view(new_view) other.validate(new_view) if transformer is not identity_transformer: other.set_user_owned(new_view, False) return new_view return transformation def _get_transformer_to(self, other): transformer, record = self._lookup_transformer(self._view_type, other._view_type) if transformer is None: return 
other._get_transformer_from(self) return transformer, record def has_transformation(self, other): """ Checks to see if there exist transformers for other Parameters ---------- other : ModelType subclass The object being checked for transformer Returns ------- bool Does the specified transformer exist for other? """ transformer, _ = self._get_transformer_to(other) return transformer is not None def _get_transformer_from(self, other): return None, None def coerce_view(self, view): return view def _lookup_transformer(self, from_, to_): if from_ == to_: return identity_transformer, None try: record = self._pm.transformers[from_][to_] return record.transformer, record except KeyError: return None, None def set_user_owned(self, view, value): pass class FormatType(ModelType): def coerce_view(self, view): if type(view) is str or isinstance(view, pathlib.Path): return self._view_type(view, mode='r') if isinstance(view, self._view_type): # wrap original path (inheriting the lifetime) and return a # read-only instance return self._view_type(view.path, mode='r') return view def validate(self, view, level='min'): if not isinstance(view, self._view_type): raise TypeError("%r is not an instance of %r." % (view, self._view_type)) # Formats have a validate method, so defer to it view.validate(level) def set_user_owned(self, view, value): view.path._user_owned = value class SingleFileDirectoryFormatType(FormatType): def __init__(self, view_type): # Single file directory formats have only one file named `file` # allowing us construct a model type from the format of `file` self._wrapped_view_type = view_type.file.format super().__init__(view_type) def _get_transformer_to(self, other): # Legend: # - Dx: single directory format of x # - Dy: single directory format of y # - x: input format x # - y: output format y # - ->: implicit transformer # - =>: registered transformer # - :> final transformation # - |: or, used when multiple situation are possible # It looks like all permutations because it is... 
# Dx :> y | Dy via Dx => y | Dy transformer, record = self._wrap_transformer(self, other) if transformer is not None: return transformer, record # Dx :> Dy via Dx -> x => y | Dy transformer, record = self._wrap_transformer(self, other, wrap_input=True) if transformer is not None: return transformer, record if type(other) is type(self): # Dx :> Dy via Dx -> x => y -> Dy transformer, record = self._wrap_transformer( self, other, wrap_input=True, wrap_output=True) if transformer is not None: return transformer, record # Out of options, try for Dx :> Dy via Dx => y -> Dy return other._get_transformer_from(self) # record is included def _get_transformer_from(self, other): # x | Dx :> Dy via x | Dx => y -> Dy # IMPORTANT: reverse other and self, this method is like __radd__ return self._wrap_transformer(other, self, wrap_output=True) def _wrap_transformer(self, in_, out_, wrap_input=False, wrap_output=False): input = in_._wrapped_view_type if wrap_input else in_._view_type output = out_._wrapped_view_type if wrap_output else out_._view_type transformer, record = self._lookup_transformer(input, output) if transformer is None: return None, None if wrap_input: transformer = in_._wrap_input(transformer) if wrap_output: transformer = out_._wrap_output(transformer) return transformer, record def _wrap_input(self, transformer): def wrapped(view): return transformer(view.file.view(self._wrapped_view_type)) return wrapped def _wrap_output(self, transformer): def wrapped(view): new_view = self._view_type() file_view = transformer(view) if transformer is not identity_transformer: self.set_user_owned(file_view, False) new_view.file.write_data(file_view, self._wrapped_view_type) return new_view return wrapped class ObjectType(ModelType): def validate(self, view, level=None): if not isinstance(view, self._view_type): raise TypeError("%r is not of type %r, cannot transform further." % (view, self._view_type)) qiime2-2024.5.0/qiime2/core/type/000077500000000000000000000000001462552636000162425ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/type/__init__.py000066400000000000000000000036241462552636000203600ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- from .collection import List, Set, Collection from .semantic import SemanticType, Properties from .primitive import (Str, Int, Float, Metadata, Bool, MetadataColumn, Categorical, Numeric, Range, Start, End, Choices, Jobs, Threads) from .visualization import Visualization from .signature import (PipelineSignature, MethodSignature, VisualizerSignature, IndexedCollectionElement, HashableInvocation) from .meta import TypeMap, TypeMatch from .util import (is_primitive_type, is_semantic_type, is_metadata_type, is_collection_type, is_visualization_type, interrogate_collection_type, parse_primitive, is_union, is_metadata_column_type, is_parallel_type) __all__ = [ # Type Helpers 'is_semantic_type', 'is_visualization_type', 'is_primitive_type', 'is_metadata_type', 'is_collection_type', 'interrogate_collection_type', 'parse_primitive', 'is_union', 'is_metadata_column_type', 'is_parallel_type', # Collection Types 'Set', 'List', 'Collection', # Semantic Types 'SemanticType', 'Properties', # Primitive Types 'Str', 'Int', 'Float', 'Bool', 'Metadata', 'MetadataColumn', 'Categorical', 'Numeric', 'Range', 'Start', 'End', 'Choices', 'Jobs', 'Threads', # Visualization Type 'Visualization', # Signatures 'PipelineSignature', 'MethodSignature', 'VisualizerSignature', 'IndexedCollectionElement', 'HashableInvocation', # Variables 'TypeMap', 'TypeMatch' ] qiime2-2024.5.0/qiime2/core/type/collection.py000066400000000000000000000077531462552636000207630ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import json from qiime2.core.type.template import TypeTemplate class _CollectionBase(TypeTemplate): public_proxy = 'encode', 'decode' def __init__(self): # For semantic types self.variant_of = frozenset() def __eq__(self, other): return type(self) is type(other) def get_name(self): return self.__class__.__name__[1:] # drop `_` def get_kind_expr(self, self_expr): if self_expr.fields: return self_expr.fields[0].kind return "" def get_kind(self): raise NotImplementedError def is_variant(self, self_expr, varfield): return False def validate_predicate(self, predicate): raise TypeError("Predicates cannot be applied to %r" % self.get_name()) def is_element_expr(self, self_expr, value): contained_expr = self_expr.fields[0] if isinstance(value, self._view) and len(value) > 0: return all(v in contained_expr for v in value) return False def is_element(self, value): raise NotImplementedError def get_union_membership_expr(self, self_expr): return self.get_name() + '-' + self.get_kind_expr(self_expr) # For primitive types def encode(self, value): return json.dumps(list(value)) def decode(self, string): return self._view(json.loads(string)) class _1DCollectionBase(_CollectionBase): def validate_field(self, name, field): if isinstance(field, _1DCollectionBase): raise TypeError("Cannot nest collection types.") if field.get_name() in {'MetadataColumn', 'Metadata'}: raise TypeError("Cannot use %r with metadata." 
% self.get_name()) def get_field_names(self): return ['type'] class _Set(_1DCollectionBase): _view = set class _List(_1DCollectionBase): _view = list def is_element_expr(self, self_expr, value): """Since this is a dictionary, we often need to make sure to use its values not its keys. """ from qiime2 import ResultCollection contained_expr = self_expr.fields[0] if isinstance(value, self._view) and len(value) > 0: return all(v in contained_expr for v in value) # The List's default type is unsurprisingly list, but we also want to # be able to pass in a Collection (dict) elif (isinstance(value, ResultCollection) or isinstance(value, dict)) \ and len(value) > 0: return all(v in contained_expr for v in value.values()) return False class _Tuple(_CollectionBase): _view = tuple def get_kind_expr(self, self_expr): return "" def get_field_names(self): return ['*types'] def validate_field_count(self, count): if not count: raise TypeError("Tuple type must contain at least one element.") def validate_field(self, name, field): # Tuples may contain anything, and as many fields as desired pass class _Collection(_1DCollectionBase): _view = dict def is_element_expr(self, self_expr, value): """Since this is a dictionary, we often need to make sure to use its values not its keys. """ from qiime2 import ResultCollection contained_expr = self_expr.fields[0] if (isinstance(value, ResultCollection) or isinstance(value, self._view)) and len(value) > 0: return all(v in contained_expr for v in value.values()) elif isinstance(value, list) and len(value) > 0: return all(v in contained_expr for v in value) return False Set = _Set() List = _List() Tuple = _Tuple() Collection = _Collection() qiime2-2024.5.0/qiime2/core/type/grammar.py000066400000000000000000000474351462552636000202570ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import itertools from abc import ABCMeta, abstractmethod from qiime2.core.util import tuplize, ImmutableBase def maximal_antichain(*types): maximal_elements = {} # easy to delete, retains order for t in types: placed = False for e in list(maximal_elements): if e <= t: # Delete first! Duplicate keys would disappear otherwise del maximal_elements[e] maximal_elements[t] = None placed = True if not placed: maximal_elements[t] = None return tuple(maximal_elements) def minimal_antichain(*types): minimal_elements = {} # easy to delete, retains order for t in types: placed = False for e in list(minimal_elements): if t <= e: # Delete first! 
Duplicate keys would disappear otherwise del minimal_elements[e] minimal_elements[t] = None placed = True if not placed: minimal_elements[t] = None return tuple(minimal_elements) class _ExpBase(ImmutableBase, metaclass=ABCMeta): def __init__(self, template): # Super basic smoke-test assert template is None or template.is_template self.template = template def __getattr__(self, name): if ('template' in self.__dict__ and self.template is not None and name in self.template.public_proxy): return getattr(self.template, name) raise AttributeError("%r object has no attribute %r" % (type(self), name)) # Prevent infinite recursion when pickling due to __getattr__ def __getstate__(self): return self.__dict__ def __setstate__(self, state): self.__dict__ = state @property def name(self): return self.template.get_name_expr(self) @property def kind(self): return self.template.get_kind_expr(self) @abstractmethod def __eq__(self, other): raise NotImplementedError def __ne__(self, other): return not self == other @abstractmethod def __le__(self, other): raise NotImplementedError @abstractmethod def __ge__(self, other): raise NotImplementedError @abstractmethod def __or__(self, other): raise NotImplementedError def __ror__(self, other): return self | other @abstractmethod def __and__(self, other): raise NotImplementedError def __rand__(self, other): return self & other @abstractmethod def equals(self, other): raise NotImplementedError def is_concrete(self): return False def iter_symbols(self): yield self.name class IncompleteExp(_ExpBase): def __init__(self, template): super().__init__(template) if (self.template is None or not list(self.template.get_field_names_expr(self))): raise ValueError("Template %r has no fields, should not be used" " with a IncompleteExp." % (template,)) def __eq__(self, other): if type(self) is not type(other): return NotImplemented return (self.name == other.name and tuple(self.template.get_field_names_expr(self)) == tuple(other.template.get_field_names_expr(self))) def __hash__(self): return (hash(type(self)) ^ hash(self.name) ^ hash(tuple(self.template.get_field_names_expr(self)))) def __repr__(self): fields = ', '.join( '{%s}' % f for f in self.template.get_field_names_expr(self)) return self.name + ('[%s]' % fields) def __le__(self, other): raise TypeError("Cannot compare subtype, %r is missing arguments" " for its fields." % (self,)) def __ge__(self, other): raise TypeError("Cannot compare supertype, %r is missing arguments" " for its fields." % (self,)) def __contains__(self, value): raise TypeError("Cannot check membership of %r, %r is missing" " arguments for its fields." % (value, self)) def __mod__(self, predicate): raise TypeError("Cannot apply predicate %r, %r is missing arguments" " for its fields." % (predicate, self)) def __or__(self, other): raise TypeError("Cannot union with %r, %r is missing arguments" " for its fields." % (other, self)) def __and__(self, other): raise TypeError("Cannot intersect with %r, %r is missing arguments" " for its fields." % (other, self)) def __getitem__(self, fields): fields = tuplize(fields) for field in fields: if not isinstance(field, _AlgebraicExpBase): raise TypeError("Field %r is not complete type expression." 
% (field,)) self.template.validate_fields_expr(self, fields) return TypeExp(self.template, fields=fields) def equals(self, other): return self == other class _AlgebraicExpBase(_ExpBase): def __le__(self, other): first = self._is_subtype_(other) if first is not NotImplemented: return first second = other._is_supertype_(self) if second is not NotImplemented: return second return False def __ge__(self, other): first = self._is_supertype_(other) if first is not NotImplemented: return first second = other._is_subtype_(self) if second is not NotImplemented: return second return False def __or__(self, other): if not ((self.is_bottom() or other.is_bottom()) or (self.get_union_membership() == other.get_union_membership() and self.get_union_membership() is not None)): raise TypeError("Cannot union %r and %r" % (self, other)) if self >= other: return self if self <= other: return other union = UnionExp((*self.unpack_union(), *other.unpack_union())) return union.normalize() def __and__(self, other): if (not self.can_intersect() or not other.can_intersect() or (self.kind != other.kind and not (self.is_top() or other.is_top()))): raise TypeError("Cannot intersect %r and %r" % (self, other)) # inverse of __or__ if self >= other: return other if self <= other: return self # Distribute over union if isinstance(self, UnionExp) or isinstance(other, UnionExp): m = [] for s, o in itertools.product(self.unpack_union(), other.unpack_union()): m.append(s & o) return UnionExp(m).normalize() elements = list(itertools.chain(self.unpack_intersection(), other.unpack_intersection())) if len(elements) > 1: # Give the expression a chance to collapse, as many intersections # are contradictions collapse = elements[0]._collapse_intersection_(elements[1]) if collapse is not NotImplemented: for e in elements[2:]: collapse = collapse._collapse_intersection_(e) return collapse # Back to the regularly scheduled inverse of __or__ members = minimal_antichain(*self.unpack_intersection(), *other.unpack_intersection()) return IntersectionExp(members) def _collapse_intersection_(self, other): return NotImplemented def equals(self, other): return self <= other <= self def is_concrete(self): return len(list(self.unpack_union())) == 1 def get_union_membership(self): if self.template is not None: return self.template.get_union_membership_expr(self) return True def can_intersect(self): return True # These methods are to be overridden by UnionExp def is_bottom(self): return False def unpack_union(self): yield self # These methods are to be overridden by IntersectionExp def is_top(self): return False def unpack_intersection(self): yield self class TypeExp(_AlgebraicExpBase): def __init__(self, template, fields=(), predicate=None): super().__init__(template) if predicate is not None and predicate.is_top(): predicate = None self.fields = tuple(fields) self.predicate = predicate super()._freeze_() @property def full_predicate(self): if self.predicate is None: return IntersectionExp() return self.predicate def __eq__(self, other): if type(self) is not type(other): return NotImplemented return (self.kind == other.kind and self.name == other.name and self.fields == other.fields and self.full_predicate == other.full_predicate) def __hash__(self): return (hash(type(self)) ^ hash(self.kind) ^ hash(self.name) ^ hash(self.fields) ^ hash(self.predicate)) def __repr__(self): result = self.name if self.fields: result += '[%s]' % ', '.join(repr(f) for f in self.fields) if self.predicate: predicate = repr(self.predicate) if self.predicate.template is None: 
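                # (Editor's note, not in the original: a template-less
                #  predicate is a bare UnionExp/IntersectionExp of
                #  predicates, so it is parenthesized to keep the rendered
                #  type unambiguous, e.g. "Foo % (Bar | Baz)" with
                #  hypothetical names Foo, Bar, Baz.)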
# is _IdentityExpBase predicate = '(%s)' % predicate result += ' % ' + predicate return result def __getitem__(self, fields): raise TypeError("Cannot apply fields (%r) to %r," " fields already present." % (fields, self)) def __contains__(self, value): return (self.template.is_element_expr(self, value) and value in self.full_predicate) def __iter__(self): yield from {self.duplicate(fields=fields) for fields in itertools.product(*self.fields)} def iter_symbols(self): yield self.name for field in self.fields: yield from field.iter_symbols() def _is_subtype_(self, other): if other.template is None: return NotImplemented if not self.template.is_symbol_subtype_expr(self, other): return False for f1, f2 in itertools.zip_longest(self.fields, other.fields, # more fields = more specific fillvalue=IntersectionExp()): if not (f1 <= f2): return False if not (self.full_predicate <= other.full_predicate): return False return True def _is_supertype_(self, other): return NotImplemented def __mod__(self, predicate): if self.predicate: raise TypeError("%r already has a predicate, will not add %r" % (self, predicate)) if predicate is None or predicate.is_top(): return self return self.duplicate(predicate=predicate) def __rmod__(self, other): raise TypeError("Predicate (%r) must be applied to the right-hand side" " of a type expression." % (other,)) def duplicate(self, fields=(), predicate=None): if fields == (): fields = self.fields else: self.template.validate_fields_expr(self, fields) if predicate is None: predicate = self.predicate elif predicate.is_top(): predicate = None elif predicate.template is not None: self.template.validate_predicate_expr(self, predicate) return self.__class__(self.template, fields=fields, predicate=predicate) def _collapse_intersection_(self, other): if self.name != other.name: return UnionExp() new_fields = tuple( s & o for s, o in itertools.zip_longest(self.fields, other.fields, # same as a type mismatch fillvalue=UnionExp())) if any(f.is_bottom() for f in new_fields): return UnionExp() new_predicate = self.full_predicate & other.full_predicate if new_predicate.is_bottom(): return UnionExp() return self.duplicate(fields=new_fields, predicate=new_predicate) def is_concrete(self): return self._bool_attr_method('is_concrete') def _bool_attr_method(self, method_name): def method(s): return getattr(s, method_name)() if any(not method(f) for f in self.fields): return False if not method(self.full_predicate): return False return True def to_ast(self): ast = { "type": "expression", "builtin": True, "name": self.name, "predicate": self.predicate.to_ast() if self.predicate else None, "fields": [field.to_ast() for field in self.fields] } self.template.update_ast_expr(self, ast) return ast class PredicateExp(_AlgebraicExpBase): def __init__(self, template): super().__init__(template) super()._freeze_() def __eq__(self, other): return self.template == other.template def __hash__(self): return hash(self.template) def __contains__(self, value): return self.template.is_element_expr(self, value) def __repr__(self): return repr(self.template) def _is_subtype_(self, other): if (other.template is not None and self.template.is_symbol_subtype_expr(self, other)): return True return NotImplemented def _is_supertype_(self, other): if (other.template is not None and self.template.is_symbol_supertype_expr(self, other)): return True return NotImplemented def _collapse_intersection_(self, other): first = self.template.collapse_intersection(other.template) if first is None: return UnionExp() elif first is 
not NotImplemented: return self.__class__(first) second = other.template.collapse_intersection(self.template) if second is None: return UnionExp() elif second is not NotImplemented: return self.__class__(second) return NotImplemented def to_ast(self): ast = { "type": "predicate", "name": self.name, } self.template.update_ast_expr(self, ast) return ast class _IdentityExpBase(_AlgebraicExpBase): """ Base class for IntersectionExp and UnionExp. If there are no members, then they are Top or Bottom types respectively and represent identity values (like 1 for mul and 0 for add) for the type algebra. There is no template object for these expressions. That property will always be `None`. """ _operator = ' ? ' def __init__(self, members=()): super().__init__(template=None) self.members = tuple(members) super()._freeze_() @property def kind(self): if not self.members: return "identity" return self.members[0].kind @property def name(self): return "" def __eq__(self, other): return (type(self) is type(other) and set(self.members) == set(other.members)) def __hash__(self): return hash(type(self)) ^ hash(frozenset(self.members)) def __repr__(self): if not self.members: return self.__class__.__name__ + "()" return self._operator.join(repr(m) for m in self.members) def __iter__(self): for m in self.unpack_union(): yield from m def iter_symbols(self): for m in self.unpack_union(): yield from m.iter_symbols() def get_union_membership(self): if self.members: return self.members[0].get_union_membership() class UnionExp(_IdentityExpBase): _operator = ' | ' # used by _IdentityExpBase.__repr__ def __contains__(self, value): return any(value in s for s in self.members) def _is_subtype_(self, other): if (isinstance(other, self.__class__) and type(other) is not self.__class__): # other is subclass return NotImplemented # if other isn't a union, becomes all(s <= other for s in self.members) return all(any(s <= o for o in other.unpack_union()) for s in self.unpack_union()) def _is_supertype_(self, other): return all(any(s >= o for s in self.unpack_union()) for o in other.unpack_union()) def is_bottom(self): return not self.members def unpack_union(self): yield from self.members def to_ast(self): return { "type": "union", "members": [m.to_ast() for m in self.members] } def normalize(self): elements = self.members groups = {} for e in elements: if type(e) is TypeExp: candidate = e.duplicate(predicate=IntersectionExp()) if candidate in groups: groups[candidate].append(e) else: groups[candidate] = [e] else: # groups should be empty already, but don't even attempt # collapsing if its a union of type expressions and "other" groups = {} break if groups: elements = [] for candidate, group in groups.items(): if len(group) == 1: elements.append(group[0]) else: predicate = UnionExp([t.full_predicate for t in group]) predicate = predicate.normalize() elements.append(candidate.duplicate(predicate=predicate)) if len(elements) < 20: members = maximal_antichain(*elements) else: members = elements if len(members) == 1: return members[0] return UnionExp(members) class IntersectionExp(_IdentityExpBase): _operator = ' & ' # used by _IdentityExpBase.__repr__ def __contains__(self, value): return all(value in s for s in self.members) def _is_subtype_(self, other): if isinstance(other, UnionExp): # Union will treat `self` as an atomic type, comparing # its elements against `self`. This in turn will recurse back to # `self` allowing it to check if it is a subtype of the union # elements. 
That check will ultimately compare the elements of # `self` against a single element of the union. return NotImplemented return all(any(s <= o for s in self.unpack_intersection()) for o in other.unpack_intersection()) def _is_supertype_(self, other): if isinstance(other, UnionExp): return NotImplemented return all(any(s >= o for o in other.unpack_intersection()) for s in self.unpack_intersection()) def is_top(self): return not self.members def unpack_intersection(self): yield from self.members def to_ast(self): return { "type": "intersection", "members": [m.to_ast() for m in self.members] } qiime2-2024.5.0/qiime2/core/type/meta.py000066400000000000000000000251651462552636000175530ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import itertools from types import MappingProxyType from ..util import superscript, tuplize, ImmutableBase from .grammar import UnionExp, TypeExp from .collection import Tuple class TypeVarExp(UnionExp): def __init__(self, members, tmap, input=False, output=False, index=None): self.mapping = tmap self.input = input self.output = output self.index = index super().__init__(members) def __repr__(self): numbers = {} for idx, m in enumerate(self.members, 1): if m in numbers: numbers[m] += superscript(',' + str(idx)) else: numbers[m] = superscript(idx) return " | ".join([repr(k) + v for k, v in numbers.items()]) def uniq_upto_sub(self, a_expr, b_expr): """ Two elements are unique up to a subtype if they are indistinguishable with respect to that subtype. In the case of a type var, that means the same branches must be "available" in the type map. This means that A or B may have additional refinements (or may even be subtypes of each other), so long as that does not change the branch chosen by the type map. """ a_branches = [m for m in self.members if a_expr <= m] b_branches = [m for m in self.members if b_expr <= m] return a_branches == b_branches def __eq__(self, other): return (type(self) is type(other) and self.index == other.index and self.mapping == other.mapping) def __hash__(self): return hash(self.index) ^ hash(self.mapping) def is_concrete(self): return False def can_intersect(self): return False def get_union_membership_expr(self, self_expr): return None def _is_subtype_(self, other): return all(m <= other for m in self.members) def _is_supertype_(self, other): return any(m >= other for m in self.members) def __iter__(self): yield from self.members def unpack_union(self): yield self def to_ast(self): return { "type": "variable", "index": self.index, "group": id(self.mapping), "outputs": self.mapping.input_width(), "mapping": [ ([k.to_ast() for k in key.fields] + [v.to_ast() for v in value.fields]) for key, value in self.mapping.lifted.items()] } class TypeMap(ImmutableBase): def __init__(self, mapping): mapping = {Tuple[tuplize(k)]: Tuple[tuplize(v)] for k, v in mapping.items()} branches = list(mapping) for i, a in enumerate(branches): for j in range(i, len(branches)): b = branches[j] try: intersection = a & b except TypeError: raise ValueError("Cannot place %r and %r in the same " "type variable." 
% (a, b)) if (intersection.is_bottom() or intersection is a or intersection is b): continue for k in range(i): if intersection <= branches[k]: break else: raise ValueError( "Ambiguous resolution for invocations with type %r." " Could match %r or %r, add a new branch ABOVE these" " two (or modify these branches) to correct this." % (intersection.fields, a.fields, b.fields)) self.__lifted = mapping super()._freeze_() @property def lifted(self): return MappingProxyType(self.__lifted) def __eq__(self, other): return self is other def __hash__(self): return hash(id(self)) def __iter__(self): for idx, members in enumerate( zip(*(k.fields for k in self.lifted.keys()))): yield TypeVarExp(members, self, input=True, index=idx) yield from self.iter_outputs() def solve(self, *inputs): inputs = Tuple[inputs] for branch, outputs in self.lifted.items(): if inputs <= branch: return outputs.fields def input_width(self): return len(next(iter(self.lifted.keys())).fields) def iter_outputs(self, *, _double_as_input=False): start = self.input_width() for idx, members in enumerate( zip(*(v.fields for v in self.lifted.values())), start): yield TypeVarExp(members, self, output=True, index=idx, input=_double_as_input) def _get_intersections(listing): intersections = [] for a, b in itertools.combinations(listing, 2): i = a & b if i.is_bottom() or i is a or i is b: continue intersections.append(i) return intersections def TypeMatch(listing): listing = list(listing) intersections = _get_intersections(listing) to_add = [] while intersections: to_add.extend(intersections) intersections = _get_intersections(intersections) mapping = TypeMap({x: x for x in list(reversed(to_add)) + listing}) # TypeMatch only produces a single variable # iter_outputs is used by match for solving, so the index must match return next(iter(mapping.iter_outputs(_double_as_input=True))) def select_variables(expr): """When called on an expression, will yield selectors to the variable. A selector will either return the variable (or equivalent fragment) in an expression, or will return an entirely new expression with the fragment replaced with the value of `swap`. e.g. >>> from qiime2.core.type.tests.test_grammar import (MockTemplate, ... 
MockPredicate) >>> Example = MockTemplate('Example', fields=('x',)) >>> Foo = MockTemplate('Foo') >>> Bar = MockPredicate('Bar') >>> T = TypeMatch([Foo]) >>> U = TypeMatch([Bar]) >>> select_u, select_t = select_variables(Example[T] % U) >>> t = select_t(Example[T] % U) >>> assert T is t >>> u = select_u(Example[T] % U) >>> assert U is u >>> frag = select_t(Example[Foo] % Bar) >>> assert frag is Foo >>> new_expr = select_t(Example[T] % U, swap=frag) >>> assert new_expr == Example[Foo] % U """ if type(expr) is TypeVarExp: def select(x, swap=None): if swap is not None: return swap return x yield select return if type(expr) is not TypeExp: return if type(expr.full_predicate) is TypeVarExp: def select(x, swap=None): if swap is not None: return x.duplicate(predicate=swap) return x.full_predicate yield select for idx, field in enumerate(expr.fields): for sel in select_variables(field): # Without this closure, the idx in select will be the last # value of the enumerate, same for sel # (Same problem as JS with callbacks inside a loop) def closure(idx, sel): def select(x, swap=None): if swap is not None: new_fields = list(x.fields) new_fields[idx] = sel(x.fields[idx], swap) return x.duplicate(fields=tuple(new_fields)) return sel(x.fields[idx]) return select yield closure(idx, sel) def match(provided, inputs, outputs): provided_binding = {} error_map = {} for key, expr in inputs.items(): for selector in select_variables(expr): var = selector(expr) provided_fragment = selector(provided[key]) try: current_binding = provided_binding[var] except KeyError: provided_binding[var] = provided_fragment error_map[var] = provided[key] else: if not var.uniq_upto_sub(current_binding, provided_fragment): raise ValueError("Received %r and %r, but expected %r" " and %r to match (or to select the same" " output)." % (error_map[var], provided[key], current_binding, provided_fragment)) # provided_binding now maps TypeVarExp instances to a TypeExp instance # which is the relevent fragment from the provided input types grouped_maps = {} for item in provided_binding.items(): var = item[0] if var.mapping not in grouped_maps: grouped_maps[var.mapping] = [item] else: grouped_maps[var.mapping].append(item) # grouped_maps now maps a TypeMap instance to tuples of # (TypeVarExp, TypeExp) which are the items of provided_binding # i.e. all of the bindings are now grouped under their shared type maps output_fragments = {} for mapping, group in grouped_maps.items(): if len(group) != mapping.input_width(): raise ValueError("Missing input variables") inputs = [x[1] for x in sorted(group, key=lambda x: x[0].index)] solved = mapping.solve(*inputs) if solved is None: provided = tuple(error_map[x[0]] for x in sorted(group, key=lambda x: x[0].index)) raise ValueError("No solution for inputs: %r, check the signature " "to see valid combinations." 
% (provided,)) # type vars share identity by instance of map and index, so we will # be able to see the "same" vars again when looking up the outputs for var, out in zip(mapping.iter_outputs(), solved): output_fragments[var] = out # output_fragments now maps a TypeVarExp to a TypeExp which is the solved # fragment for the given output type variable results = {} for key, expr in outputs.items(): r = expr # output may not have a typevar, so default is the expr for selector in select_variables(expr): var = selector(expr) r = selector(r, swap=output_fragments[var]) results[key] = r # results now maps a key to a full TypeExp as solved by the inputs return results qiime2-2024.5.0/qiime2/core/type/parse.py000066400000000000000000000172071462552636000177350ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import ast from . import grammar, meta, collection, primitive, semantic, visualization def string_to_ast(type_expr): try: parsed = ast.parse(type_expr) except SyntaxError: raise ValueError("%r could not be parsed, it may not be a QIIME 2 type" " or it may not be an atomic type. Use" " `ast_to_type` instead." % (type_expr,)) if type(parsed) is not ast.Module: # I don't think this branch *can* be hit raise ValueError("%r is not a type expression." % (type_expr,)) try: expr, = parsed.body except ValueError: raise ValueError("Only one type expression may be parse at a time, got" ": %r" % (type_expr,)) return _expr(expr.value) def _expr(expr): node = type(expr) if node is ast.Name: return _build_atomic(expr.id) if node is ast.Call: args = _parse_args(expr.args) kwargs = _parse_kwargs(expr.keywords) return _build_predicate(expr.func.id, args, kwargs) if node is ast.Subscript: field_expr = expr.slice if type(field_expr) is ast.Tuple: field_expr = field_expr.elts else: field_expr = (field_expr,) base = _expr(expr.value) base['fields'] = [_expr(e) for e in field_expr] return base if node is ast.BinOp: op = type(expr.op) left = _expr(expr.left) right = _expr(expr.right) if op is ast.Mod: left['predicate'] = right return left if op is ast.BitOr: return _build_union(left, right) if op is ast.BitAnd: return _build_intersection(left, right) raise ValueError("Unknown expression: %r" % node) def _convert_literals(expr): node = type(expr) if node is ast.List: return [_convert_literals(e) for e in expr.elts] if node is ast.Set: return {_convert_literals(e) for e in expr.elts} if node is ast.Tuple: return tuple(_convert_literals(e) for e in expr.elts) if node is ast.Dict: return {_convert_literals(k): _convert_literals(v) for k, v in zip(expr.keys, expr.values)} if node is ast.Constant: return expr.value if node is ast.Name and expr.id == 'inf': return float('inf') raise ValueError("Unknown literal: %r" % node) def _parse_args(args): return tuple(_convert_literals(e) for e in args) def _parse_kwargs(kwargs): return {e.arg: _convert_literals(e.value) for e in kwargs} def _build_predicate(name, args, kwargs): base = { 'type': 'predicate', 'name': name } if name == 'Properties': return _build_properties(base, args, kwargs) if name == 'Range': return _build_range(base, args, kwargs) if name == 'Choices': return _build_choices(base, args, kwargs) def _normalize_input_collection(args): if len(args) 
== 1 and isinstance(args[0], (list, set, tuple)): return tuple(args[0]) return args def _build_choices(base, args, kwargs): if 'choices' in kwargs: args = (kwargs['choices'],) args = _normalize_input_collection(args) base['choices'] = list(args) return base def _build_range(base, args, kwargs): inclusive_start = kwargs.get('inclusive_start', True) inclusive_end = kwargs.get('inclusive_end', False) start = None end = None if len(args) == 1: end = args[0] elif len(args) != 0: start, end = args if start == float('-inf'): start = None if end == float('inf'): end = None base['range'] = [start, end] base['inclusive'] = [inclusive_start, inclusive_end] return base def _build_properties(base, args, kwargs): exclude = kwargs.get('exclude', []) if 'include' in kwargs: args = (kwargs['include'],) args = _normalize_input_collection(args) base['include'] = list(args) base['exclude'] = list(exclude) return base def _build_atomic(name): return { 'type': 'expression', 'builtin': name in {'Str', 'Int', 'Float', 'Bool', 'List', 'Set', 'Tuple', 'Visualization', 'Metadata', 'MetadataColumn', 'Numeric', 'Categorical'}, 'name': name, 'predicate': None, 'fields': [] } def _build_union(left, right): return _build_ident(left, right, 'union') def _build_intersection(left, right): return _build_ident(left, right, 'intersection') def _build_ident(left, right, type): members = [] if left['type'] == type: members.extend(left['members']) else: members.append(left) if right['type'] == type: members.extend(right['members']) else: members.append(right) return { 'type': type, 'members': members } def ast_to_type(json_ast, scope=None): if scope is None: scope = {} type_ = json_ast['type'] if type_ == 'expression': predicate = json_ast['predicate'] if predicate is not None: predicate = ast_to_type(predicate, scope=scope) fields = json_ast['fields'] if len(fields) > 0: fields = [ast_to_type(f, scope=scope) for f in fields] name = json_ast['name'] if not json_ast['builtin']: base_template = semantic.SemanticType( name, field_names=['field' + str(i) for i in range(len(fields))], field_members={ ('field' + str(i)): [child] for i, child in enumerate(fields) }, ).template elif name == 'Visualization': return visualization.Visualization elif name in {'List', 'Set', 'Tuple', 'Collection'}: base_template = getattr(collection, name).template else: base_template = getattr(primitive, name).template return grammar.TypeExp(base_template, fields=fields, predicate=predicate) if type_ == 'predicate': name = json_ast['name'] if name == 'Choices': return primitive.Choices(json_ast['choices']) if name == 'Range': return primitive.Range(*json_ast['range'], inclusive_start=json_ast['inclusive'][0], inclusive_end=json_ast['inclusive'][1]) if name == 'Properties': return semantic.Properties(json_ast['include'], exclude=json_ast['exclude']) if type_ == 'union': members = [ast_to_type(m, scope=scope) for m in json_ast['members']] return grammar.UnionExp(members) if type_ == 'intersection': members = [ast_to_type(m, scope=scope) for m in json_ast['members']] return grammar.IntersectionExp(members) if type_ == 'variable': var_group = json_ast['group'] if var_group not in scope: mapping = {} out_idx = json_ast['outputs'] for entry in json_ast['mapping']: entry = [ast_to_type(e) for e in entry] mapping[tuple(entry[:out_idx])] = tuple(entry[out_idx:]) scope[var_group] = list(meta.TypeMap(mapping)) return scope[var_group][json_ast['index']] 
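# ---------------------------------------------------------------------------
# Illustrative usage sketch (editor's addition, not part of the original
# module). It round-trips one expression through string_to_ast/ast_to_type as
# defined above; the expected dict shape follows _build_atomic and
# _build_range in this file, and the only assumption is that this module is
# importable as qiime2.core.type.parse. The block is guarded so it only runs
# when the module is executed directly (e.g. `python -m
# qiime2.core.type.parse`), never on import.
if __name__ == '__main__':
    ast_ = string_to_ast('Int % Range(0, 10)')
    assert ast_['name'] == 'Int' and ast_['builtin'] and ast_['fields'] == []
    assert ast_['predicate'] == {'type': 'predicate', 'name': 'Range',
                                 'range': [0, 10],
                                 'inclusive': [True, False]}

    expr = ast_to_type(ast_)  # rebuilds a type expression from the dict
    print(repr(expr))         # expected rendering: Int % Range(0, 10)
# ---------------------------------------------------------------------------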
qiime2-2024.5.0/qiime2/core/type/primitive.py000066400000000000000000000353311462552636000206310ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import numbers import itertools from qiime2.core.type.template import TypeTemplate, PredicateTemplate import qiime2.metadata as metadata import qiime2.core.util as util _RANGE_DEFAULT_START = float('-inf') _RANGE_DEFAULT_END = float('inf') _RANGE_DEFAULT_INCLUSIVE_START = True _RANGE_DEFAULT_INCLUSIVE_END = False class _PrimitivePredicateBase(PredicateTemplate): def get_kind(self): return 'primitive' def get_name(self): return self.__class__.__name__ class Range(_PrimitivePredicateBase): def __init__(self, *args, inclusive_start=_RANGE_DEFAULT_INCLUSIVE_START, inclusive_end=_RANGE_DEFAULT_INCLUSIVE_END): if len(args) == 2: self.start, self.end = args elif len(args) == 1: self.start = _RANGE_DEFAULT_START self.end, = args elif len(args) == 0: self.start = _RANGE_DEFAULT_START self.end = _RANGE_DEFAULT_END else: raise ValueError("Too many arguments passed, expected 0, 1, or 2.") self.inclusive_start = inclusive_start self.inclusive_end = inclusive_end if self.start is None: self.start = _RANGE_DEFAULT_START if self.end is None: self.end = _RANGE_DEFAULT_END if self.end < self.start: raise ValueError("End of range precedes start.") def __hash__(self): return (hash(type(self)) ^ hash(self.start) ^ hash(self.end) ^ hash(self.inclusive_start) ^ hash(self.inclusive_end)) def __eq__(self, other): return (type(self) is type(other) and self.start == other.start and self.end == other.end and self.inclusive_start == other.inclusive_start and self.inclusive_end == other.inclusive_end) def __repr__(self): args = [] start = self.start if start == float('-inf'): start = None end = self.end if end == float('inf'): end = None args.append(repr(start)) args.append(repr(end)) if self.inclusive_start is not _RANGE_DEFAULT_INCLUSIVE_START: args.append('inclusive_start=%r' % self.inclusive_start) if self.inclusive_end is not _RANGE_DEFAULT_INCLUSIVE_END: args.append('inclusive_end=%r' % self.inclusive_end) return "Range(%s)" % (', '.join(args),) def is_element(self, value): if self.inclusive_start: if value < self.start: return False elif value <= self.start: return False if self.inclusive_end: if value > self.end: return False elif value >= self.end: return False return True def is_symbol_subtype(self, other): if type(self) is not type(other): return False if other.start > self.start: return False elif (other.start == self.start and (not other.inclusive_start) and self.inclusive_start): return False if other.end < self.end: return False elif (other.end == self.end and (not other.inclusive_end) and self.inclusive_end): return False return True def is_symbol_supertype(self, other): if type(self) is not type(other): return False if other.start < self.start: return False elif (other.start == self.start and (not self.inclusive_start) and other.inclusive_start): return False if other.end > self.end: return False elif (other.end == self.end and (not self.inclusive_end) and other.inclusive_end): return False return True def collapse_intersection(self, other): if type(self) is not type(other): return None if self.start < other.start: new_start = 
other.start new_inclusive_start = other.inclusive_start elif other.start < self.start: new_start = self.start new_inclusive_start = self.inclusive_start else: new_start = self.start new_inclusive_start = ( self.inclusive_start and other.inclusive_start) if self.end > other.end: new_end = other.end new_inclusive_end = other.inclusive_end elif other.end > self.end: new_end = self.end new_inclusive_end = self.inclusive_end else: new_end = self.end new_inclusive_end = self.inclusive_end and other.inclusive_end if new_end < new_start: return None if (new_start == new_end and not (new_inclusive_start and new_inclusive_end)): return None return self.__class__(new_start, new_end, inclusive_start=new_inclusive_start, inclusive_end=new_inclusive_end).template def iter_boundaries(self): if self.start != float('-inf'): yield self.start if self.end != float('inf'): yield self.end def update_ast(self, ast): start = self.start if start == float('-inf'): start = None end = self.end if end == float('inf'): end = None ast['range'] = [start, end] ast['inclusive'] = [self.inclusive_start, self.inclusive_end] def Start(start, inclusive=_RANGE_DEFAULT_INCLUSIVE_START): return Range(start, _RANGE_DEFAULT_END, inclusive_start=inclusive) def End(end, inclusive=_RANGE_DEFAULT_INCLUSIVE_END): return Range(_RANGE_DEFAULT_START, end, inclusive_end=inclusive) class Choices(_PrimitivePredicateBase): def __init__(self, *choices): if not choices: raise ValueError("'Choices' cannot be instantiated with an empty" " set.") # Backwards compatibility with old Choices({1, 2, 3}) syntax if len(choices) == 1: if not isinstance(choices[0], (bool, str)): choices = choices[0] if type(choices) is set: # Choices should sort the provided set so interfaces which # cache the type have a predictable order choices = sorted(choices) self.choices = choices = tuple(choices) if len(choices) != len(set(choices)): raise ValueError("Duplicates found in choices: %r" % util.find_duplicates(choices)) def __hash__(self): return hash(type(self)) ^ hash(frozenset(self.choices)) def __eq__(self, other): return (type(self) is type(other) and set(self.choices) == set(other.choices)) def __repr__(self): return "%s(%s)" % (self.__class__.__name__, repr(list(self.choices))[1:-1]) def is_element(self, value): return value in self.choices def is_symbol_subtype(self, other): if type(self) is not type(other): return False return set(self.choices) <= set(other.choices) def is_symbol_supertype(self, other): if type(self) is not type(other): return False return set(self.choices) >= set(other.choices) def collapse_intersection(self, other): if type(self) is not type(other): return None new_choices_set = set(self.choices) & set(other.choices) if not new_choices_set: return None # order by appearance: new_choices = [] for c in itertools.chain(self.choices, other.choices): if c in new_choices_set: new_choices.append(c) new_choices_set.remove(c) return self.__class__(new_choices).template def iter_boundaries(self): yield from self.choices def update_ast(self, ast): ast['choices'] = list(self.choices) def unpack_union(self): for c in self.choices: yield self.__class__(c) class _PrimitiveTemplateBase(TypeTemplate): public_proxy = 'encode', 'decode' def __eq__(self, other): return type(self) is type(other) def __hash__(self): return hash(type(self)) def get_name(self): return self.__class__.__name__[1:] # drop `_` def get_kind(self): return 'primitive' def get_field_names(self): return [] def validate_field(self, name, field): raise NotImplementedError def 
validate_predicate_expr(self, self_expr, predicate_expr): predicate = predicate_expr.template if type(predicate) not in self._valid_predicates: raise TypeError(str(predicate_expr)) for bound in predicate.iter_boundaries(): if not self.is_element_expr(self_expr, bound): raise TypeError(bound) def validate_predicate(self, predicate): raise NotImplementedError class _Int(_PrimitiveTemplateBase): _valid_predicates = {Range} def is_element(self, value): return (value is not True and value is not False and isinstance(value, numbers.Integral)) def is_symbol_subtype(self, other): if other.get_name() == 'Float': return True return super().is_symbol_subtype(other) def decode(self, string): return int(string) def encode(self, value): return str(value) class _Str(_PrimitiveTemplateBase): _valid_predicates = {Choices} def is_element(self, value): return isinstance(value, str) def decode(self, string): return str(string) def encode(self, value): return str(value) class _Float(_PrimitiveTemplateBase): _valid_predicates = {Range} def is_symbol_supertype(self, other): if other.get_name() == 'Int': return True return super().is_symbol_supertype(other) def is_element(self, value): # Works with numpy just fine. return (value is not True and value is not False and isinstance(value, numbers.Real)) def decode(self, string): return float(string) def encode(self, value): return str(value) class _Bool(_PrimitiveTemplateBase): _valid_predicates = {Choices} def is_element(self, value): return value is True or value is False def validate_predicate(self, predicate): if type(predicate) is Choices: if set(predicate.iter_boundaries()) == {True, False}: raise TypeError("Choices should be ommitted when " "Choices(True, False).") def decode(self, string): if string not in ('false', 'true'): raise TypeError("%s is neither 'true' or 'false'" % string) return string == 'true' def encode(self, value): if value: return 'true' else: return 'false' class _Metadata(_PrimitiveTemplateBase): _valid_predicates = set() def is_element(self, value): return isinstance(value, metadata.Metadata) def decode(self, metadata): # This interface should have already retrieved this object. if not self.is_element(metadata): raise TypeError("`Metadata` must be provided by the interface" " directly.") return metadata def encode(self, value): # TODO: Should this be the provenance representation? Does that affect # decode? return value class _MetadataColumn(_PrimitiveTemplateBase): _valid_predicates = set() def is_element_expr(self, self_expr, value): return value in self_expr.fields[0] def is_element(self, value): raise NotImplementedError def get_field_names(self): return ["type"] def validate_field(self, name, field): if field.get_name() not in ("Numeric", "Categorical"): raise TypeError("Unsupported type in field: %r" % (field.get_name(),)) def decode(self, value): # This interface should have already retrieved this object. if not isinstance(value, metadata.MetadataColumn): raise TypeError("`Metadata` must be provided by the interface" " directly.") return value def encode(self, value): # TODO: Should this be the provenance representation? Does that affect # decode? 
return value class _Categorical(_PrimitiveTemplateBase): _valid_predicates = set() def get_union_membership_expr(self, self_expr): return 'metadata-column' def is_element(self, value): return isinstance(value, metadata.CategoricalMetadataColumn) class _Numeric(_PrimitiveTemplateBase): _valid_predicates = set() def get_union_membership_expr(self, self_expr): return 'metadata-column' def is_element(self, value): return isinstance(value, metadata.NumericMetadataColumn) class _Jobs(_PrimitiveTemplateBase): def is_element(self, value): return ( value is not True and value is not False and isinstance(value, numbers.Integral) and value >= 1 ) def decode(self, string): return int(string) def encode(self, value): return str(value) class _Threads(_PrimitiveTemplateBase): def is_element(self, value): if value == 'auto': return True return ( value is not True and value is not False and isinstance(value, numbers.Integral) and value >= 0 ) def decode(self, string): if string == 'auto': return string return int(string) def encode(self, value): return str(value) Int = _Int() Float = _Float() Bool = _Bool() Str = _Str() Metadata = _Metadata() MetadataColumn = _MetadataColumn() Categorical = _Categorical() Numeric = _Numeric() Jobs = _Jobs() Threads = _Threads() def infer_primitive_type(value): for t in (Int, Float): if value in t: return t % Range(value, value, inclusive_end=True) for t in (Bool, Str): if value in t: return t % Choices(value) for t in (Metadata, MetadataColumn[Categorical], MetadataColumn[Numeric]): if value in t: return t raise ValueError("Unknown primitive type: %r" % (value,)) qiime2-2024.5.0/qiime2/core/type/semantic.py000066400000000000000000000264201462552636000204230ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import types import collections.abc import itertools from qiime2.core.type.grammar import IncompleteExp, UnionExp, IntersectionExp from qiime2.core.type.template import TypeTemplate, PredicateTemplate from qiime2.core.type.util import is_semantic_type, is_qiime_type _RESERVED_NAMES = { # Predicates: 'range', 'choice', 'properties', 'arguments', # Primitives: 'integer', 'int', 'string', 'str', 'metadata', 'metadatacolumn', 'categoricalmetadatacolumn', 'numericmetadatacolumn', 'column', 'categoricalcolumn', 'numericcolumn', 'metacol', 'categoricalmetacol', 'numericmetacol', 'metadatacategory', 'float', 'double', 'number', 'set', 'list', 'bag', 'multiset', 'map', 'dict', 'nominal', 'ordinal', 'categorical', 'numeric', 'interval', 'ratio', 'continuous', 'discrete', 'tuple', 'row', 'record', # Type System: 'semantictype', 'propertymap', 'propertiesmap', 'typemap', 'typevariable', 'predicate' } def _validate_name(name): if type(name) is not str: raise TypeError("Names of semantic types must be strings, not %r." % name) if name.lower() in _RESERVED_NAMES: raise ValueError("%r is a reserved name." % name) def SemanticType(name, field_names=None, field_members=None, variant_of=None): """Create a new semantic type. Parameters ---------- name : str The name of the semantic type: this should match the variable to which the semantic type is assigned. field_names : str, iterable of str, optional Name(s) of the fields where member types can be placed. 
This makes the type a composite type, meaning that fields must be provided to produce realized semantic types. These names will define ad-hoc variant types accessible as `name`.field[`field_names` member]. field_members : mapping, optional A mapping of strings in `field_names` to one or more semantic types which are known to be members of the field (the variant type). variant_of : VariantField, iterable of VariantField, optional Define the semantic type to be a member of one or more variant types allowing it to be placed in the respective fields defined by those variant types. Returns ------- A Semantic Type There are several (private) types which may be returned, but anything returned by this factory will cause `is_semantic_type` to return True. """ _validate_name(name) variant_of = _munge_variant_of(variant_of) field_names = _munge_field_names(field_names) field_members = _munge_field_members(field_names, field_members) return SemanticTemplate(name, field_names, field_members, variant_of) def _munge_variant_of(variant_of): if variant_of is None: variant_of = () elif isinstance(variant_of, VariantField): variant_of = (variant_of,) else: variant_of = tuple(variant_of) for variant in variant_of: if not isinstance(variant, VariantField): raise ValueError("Element %r of %r is not a variant field" " (ExampleType.field['name'])." % (variant, variant_of)) return variant_of def _munge_field_names(field_names): if field_names is None: return () if type(field_names) is str: return (field_names,) field_names = tuple(field_names) for field_name in field_names: if type(field_name) is not str: raise ValueError("Field name %r from %r is not a string." % (field_name, field_names)) if len(set(field_names)) != len(field_names): raise ValueError("Duplicate field names in %r." % field_names) return field_names def _munge_field_members(field_names, field_members): if field_names is None: return {} fixed = {k: () for k in field_names} if field_members is None: return fixed if not isinstance(field_members, collections.abc.Mapping): raise ValueError("") fixed.update(field_members) for key, value in field_members.items(): if key not in field_names: raise ValueError("Field member key: %r is not in `field_names`" " (%r)." % (key, field_names)) if is_qiime_type(value) and is_semantic_type(value): fixed[key] = (value,) else: value = tuple(value) for v in value: if not is_semantic_type(v): raise ValueError("Field member: %r (of field %r) is not a" " semantic type." % (v, key)) fixed[key] = value return fixed class VariantField: def __init__(self, type_name, field_name, field_members): self.type_name = type_name self.field_name = field_name self.field_members = field_members def is_member(self, semantic_type): for field_member in self.field_members: if isinstance(field_member, IncompleteExp): # Pseudo-subtyping like Foo[X] <= Foo[Any]. # (IncompleteExp will never have __le__ because you # are probably doing something wrong with it (this totally # doesn't count!)) if semantic_type.name == field_member.name: return True # ... it doesn't count because this is a way of restricting our # ontology and isn't really crucial. Where it matters would be # in function application where the semantics must be defined # precisely and Foo[Any] is anything but precise. 
else: if semantic_type <= field_member: return True return False def __repr__(self): return "%s.field[%r]" % (self.type_name, self.field_name) class SemanticTemplate(TypeTemplate): public_proxy = 'field', def __init__(self, name, field_names, field_members, variant_of): self.name = name self.field_names = field_names self.__field = {f: VariantField(name, f, field_members[f]) for f in self.field_names} self.variant_of = variant_of @property def field(self): return types.MappingProxyType(self.__field) def __eq__(self, other): return (type(self) is type(other) and self.name == other.name and self.fields == other.fields and self.variant_of == other.variant_of) def __hash__(self): return (hash(type(self)) ^ hash(self.name) ^ hash(self.fields) ^ hash(self.variant_of)) def get_kind(self): return 'semantic-type' def get_name(self): return self.name def get_field_names(self): return self.field_names def is_element_expr(self, self_expr, value): import qiime2.sdk if not isinstance(value, qiime2.sdk.Artifact): return False return value.type <= self_expr def is_element(self, value): raise NotImplementedError def validate_field(self, name, field): raise NotImplementedError def validate_fields_expr(self, self_expr, fields_expr): self.validate_field_count(len(fields_expr)) for expr, varf in zip(fields_expr, [self.field[n] for n in self.field_names]): if (expr.template is not None and hasattr(expr.template, 'is_variant')): check = expr.template.is_variant else: check = self.is_variant if not check(expr, varf): raise TypeError("%r is not a variant of %r" % (expr, varf)) @classmethod def is_variant(cls, expr, varf): if isinstance(expr, UnionExp): return all(cls.is_variant(e, varf) for e in expr.members) if isinstance(expr, IntersectionExp): return any(cls.is_variant(e, varf) for e in expr.members) return varf.is_member(expr) or varf in expr.template.variant_of def validate_predicate(self, predicate): if not isinstance(predicate, Properties): raise TypeError() def update_ast(self, ast): ast['builtin'] = False class Properties(PredicateTemplate): def __init__(self, *include, exclude=()): if len(include) == 1 and isinstance(include[0], (list, tuple, set, frozenset)): include = tuple(include[0]) if type(exclude) is str: exclude = (exclude,) self.include = tuple(include) self.exclude = tuple(exclude) for prop in itertools.chain(self.include, self.exclude): if type(prop) is not str: raise TypeError("%r in %r is not a string." 
% (prop, self)) def __hash__(self): return hash(frozenset(self.include)) ^ hash(frozenset(self.exclude)) def __eq__(self, other): return (type(self) is type(other) and set(self.include) == set(other.include) and set(self.exclude) == set(other.exclude)) def __repr__(self): args = [] if self.include: args.append(', '.join(repr(s) for s in self.include)) if self.exclude: args.append("exclude=%r" % list(self.exclude)) return "%s(%s)" % (self.__class__.__name__, ', '.join(args)) def is_symbol_subtype(self, other): if type(self) is not type(other): return False return (set(other.include) <= set(self.include) and set(other.exclude) <= set(self.exclude)) def is_symbol_supertype(self, other): if type(self) is not type(other): return False return (set(other.include) >= set(self.include) and set(other.exclude) >= set(self.exclude)) def collapse_intersection(self, other): if type(self) is not type(other): return None new_include_set = set(self.include) | set(other.include) new_exclude_set = set(self.exclude) | set(other.exclude) new_include = [] new_exclude = [] for inc in itertools.chain(self.include, other.include): if inc in new_include_set: new_include.append(inc) new_include_set.remove(inc) for exc in itertools.chain(self.exclude, other.exclude): if exc in new_exclude_set: new_exclude.append(exc) new_exclude_set.remove(exc) return self.__class__(*new_include, exclude=new_exclude).template def get_kind(self): return 'semantic-type' def get_name(self): return self.__class__.__name__ def is_element(self, expr): return True # attached TypeExp checks this def get_union_membership_expr(self, self_expr): return 'predicate-' + self.get_name() def update_ast(self, ast): ast['include'] = list(self.include) ast['exclude'] = list(self.exclude) qiime2-2024.5.0/qiime2/core/type/signature.py000066400000000000000000000766361462552636000206370ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import collections import inspect import copy import itertools import tempfile import qiime2.sdk import qiime2.core.type as qtype from qiime2.core.archive.provenance import MetadataInfo from .grammar import TypeExp, UnionExp from .meta import TypeVarExp from .collection import List, Set, Collection from .primitive import infer_primitive_type from .visualization import Visualization from . 
import meta from .util import (is_semantic_type, is_collection_type, is_primitive_type, parse_primitive) from ..util import ImmutableBase, md5sum, create_collection_name class __NoValueMeta(type): def __repr__(self): return "NOVALUE" # This sentinel is a class so that it retains the correct memory address when # pickled class _NOVALUE(metaclass=__NoValueMeta): pass class ParameterSpec(ImmutableBase): NOVALUE = _NOVALUE def __init__(self, qiime_type=NOVALUE, view_type=NOVALUE, default=NOVALUE, description=NOVALUE): self.qiime_type = qiime_type self.view_type = view_type self.default = default self.description = description self._freeze_() def has_qiime_type(self): return self.qiime_type is not self.NOVALUE def has_view_type(self): return self.view_type is not self.NOVALUE def has_default(self): return self.default is not self.NOVALUE def has_description(self): return self.description is not self.NOVALUE def duplicate(self, **kwargs): qiime_type = kwargs.pop('qiime_type', self.qiime_type) view_type = kwargs.pop('view_type', self.view_type) default = kwargs.pop('default', self.default) description = kwargs.pop('description', self.description) if kwargs: raise TypeError("Unknown arguments: %r" % kwargs) return ParameterSpec(qiime_type, view_type, default, description) def __repr__(self): return ("ParameterSpec(qiime_type=%r, view_type=%r, default=%r, " "description=%r)" % (self.qiime_type, self.view_type, self.default, self.description)) def __eq__(self, other): return (self.qiime_type == other.qiime_type and self.view_type == other.view_type and self.default == other.default and self.description == other.description) def __ne__(self, other): return not (self == other) class PipelineSignature: builtin_args = ('ctx',) def __init__(self, callable, inputs, parameters, outputs, input_descriptions=None, parameter_descriptions=None, output_descriptions=None): """ Parameters ---------- callable : callable Callable with view type annotations on parameters and return. inputs : dict Parameter name to semantic type. parameters : dict Parameter name to primitive type. outputs : dict or list of tuples Each pair/tuple contains the name of the output (str) and its QIIME type. input_descriptions : dict, optional Input name to description string. parameter_descriptions : dict, optional Parameter name to description string. output_descriptions : dict, optional Output name to description string. """ # update type of outputs if needed if type(outputs) is list: outputs = dict(outputs) elif type(outputs) is set: raise ValueError("Plugin registration for %r cannot use a set()" " to define the outputs, as the order is random." % callable.__name__) inputs, parameters, outputs, signature_order = \ self._parse_signature(callable, inputs, parameters, outputs, input_descriptions, parameter_descriptions, output_descriptions) self._assert_valid_inputs(inputs) self._assert_valid_parameters(parameters) self._assert_valid_outputs(outputs) self._assert_valid_views(inputs, parameters, outputs) self.inputs = inputs self.parameters = parameters self.outputs = outputs self.signature_order = signature_order def _parse_signature(self, callable, inputs, parameters, outputs, input_descriptions=None, parameter_descriptions=None, output_descriptions=None): # Initialize dictionaries if non-existant. 
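# ------------------------------------------------------------------------
# Illustrative sketch (added commentary -- the callable, type names and
# descriptions are hypothetical, not part of signature.py): how the
# registration data described in the docstring above maps onto the
# PipelineSignature constructor.  Plugin authors normally reach this code
# through Plugin.pipelines.register_function rather than building the
# signature by hand.
# ------------------------------------------------------------------------
from qiime2.core.type.signature import PipelineSignature
from qiime2.plugin import Int, SemanticType

Foo = SemanticType('Foo')


def my_pipeline(ctx, table, depth=10):       # 'ctx' is the builtin argument
    pass


sig = PipelineSignature(
    my_pipeline,
    inputs={'table': Foo},                   # parameter name -> semantic type
    parameters={'depth': Int},               # parameter name -> primitive type
    outputs=[('result', Foo)],               # ordered (name, type) pairs
    input_descriptions={'table': 'A table.'},
    parameter_descriptions={'depth': 'Sampling depth.'},
    output_descriptions={'result': 'The result.'},
)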
if input_descriptions is None: input_descriptions = {} if parameter_descriptions is None: parameter_descriptions = {} if output_descriptions is None: output_descriptions = {} # Copy so we can "exhaust" the collections and check for missing params inputs = copy.copy(inputs) parameters = copy.copy(parameters) input_descriptions = copy.copy(input_descriptions) parameter_descriptions = copy.copy(parameter_descriptions) output_descriptions = copy.copy(output_descriptions) builtin_args = list(self.builtin_args) annotated_inputs = collections.OrderedDict() annotated_parameters = collections.OrderedDict() annotated_outputs = collections.OrderedDict() signature_order = collections.OrderedDict() for name, parameter in inspect.signature(callable).parameters.items(): if (parameter.kind == parameter.VAR_POSITIONAL or parameter.kind == parameter.VAR_KEYWORD): raise TypeError("Variadic definitions are unsupported: %r" % name) if builtin_args: if builtin_args[0] != name: raise TypeError("Missing builtin argument %r, got %r" % (builtin_args[0], name)) builtin_args = builtin_args[1:] continue view_type = ParameterSpec.NOVALUE if parameter.annotation is not parameter.empty: view_type = parameter.annotation default = ParameterSpec.NOVALUE if parameter.default is not parameter.empty: default = parameter.default if name in inputs: description = input_descriptions.pop(name, ParameterSpec.NOVALUE) param_spec = ParameterSpec( qiime_type=inputs.pop(name), view_type=view_type, default=default, description=description) annotated_inputs[name] = param_spec signature_order[name] = param_spec elif name in parameters: description = parameter_descriptions.pop(name, ParameterSpec.NOVALUE) param_spec = ParameterSpec( qiime_type=parameters.pop(name), view_type=view_type, default=default, description=description) annotated_parameters[name] = param_spec signature_order[name] = param_spec elif name not in self.builtin_args: raise TypeError("Parameter in callable without QIIME type:" " %r" % name) # we should have popped both of these empty by this point if inputs or parameters: raise TypeError("Callable does not have parameter(s): %r" % (list(inputs) + list(parameters))) if 'return' in callable.__annotations__: output_views = qiime2.core.util.tuplize( callable.__annotations__['return']) if len(output_views) != len(outputs): raise TypeError("Number of registered outputs (%r) does not" " match annotation (%r)" % (len(outputs), len(output_views))) for (name, qiime_type), view_type in zip(outputs.items(), output_views): description = output_descriptions.pop(name, ParameterSpec.NOVALUE) annotated_outputs[name] = ParameterSpec( qiime_type=qiime_type, view_type=view_type, description=description) else: for name, qiime_type in outputs.items(): description = output_descriptions.pop(name, ParameterSpec.NOVALUE) annotated_outputs[name] = ParameterSpec( qiime_type=qiime_type, description=description) # we should have popped the descriptions empty by this point if input_descriptions or parameter_descriptions or output_descriptions: raise TypeError( "Callable does not have parameter(s)/output(s) found in " "descriptions: %r" % [*input_descriptions, *parameter_descriptions, *output_descriptions]) return (annotated_inputs, annotated_parameters, annotated_outputs, signature_order) def collate_inputs(self, *args, **kwargs): # Collate positional inputs collated_inputs = {name: value for name, value in zip(self.signature_order, args)} collated_inputs.update(kwargs) return collated_inputs def _assert_valid_inputs(self, inputs): for input_name, spec 
in inputs.items(): if not is_semantic_type(spec.qiime_type): raise TypeError( "Input %r must be a semantic QIIME type, not %r" % (input_name, spec.qiime_type)) if not isinstance(spec.qiime_type, (TypeExp, UnionExp)): raise TypeError( "Input %r must be a complete semantic type expression, " "not %r" % (input_name, spec.qiime_type)) if spec.has_default() and spec.default is not None: raise ValueError( "Input %r has a default value of %r. Only a default " "value of `None` is supported for inputs." % (input_name, spec.default)) for var_selector in meta.select_variables(spec.qiime_type): var = var_selector(spec.qiime_type) if not var.input: raise TypeError("An output variable has been associated" " with an input type: %r" % spec.qiime_type) def _assert_valid_parameters(self, parameters): for param_name, spec in parameters.items(): if not is_primitive_type(spec.qiime_type): raise TypeError( "Parameter %r must be a primitive QIIME type, not %r" % (param_name, spec.qiime_type)) if not isinstance(spec.qiime_type, (TypeExp, UnionExp)): raise TypeError( "Parameter %r must be a complete primitive type " "expression, not %r" % (param_name, spec.qiime_type)) if (spec.has_default() and spec.default is not None and spec.default not in spec.qiime_type): raise TypeError("Default value for parameter %r is not of " "semantic QIIME type %r or `None`." % (param_name, spec.qiime_type)) for var_selector in meta.select_variables(spec.qiime_type): var = var_selector(spec.qiime_type) if not var.input: raise TypeError("An output variable has been associated" " with an input type: %r" % spec.qiime_type) def _assert_valid_outputs(self, outputs): if len(outputs) == 0: raise TypeError("%s requires at least one output" % self.__class__.__name__) for output_name, spec in outputs.items(): if not (is_semantic_type(spec.qiime_type) or spec.qiime_type == Visualization or spec.qiime_type == Collection[Visualization]): raise TypeError( "Output %r must be a semantic QIIME type or " "Visualization, not %r" % (output_name, spec.qiime_type)) if not isinstance(spec.qiime_type, (TypeVarExp, TypeExp)): raise TypeError( "Output %r must be a complete type expression, not %r" % (output_name, spec.qiime_type)) for var_selector in meta.select_variables(spec.qiime_type): var = var_selector(spec.qiime_type) if not var.output: raise TypeError("An input variable has been associated" " with an input type: %r") def _assert_valid_views(self, inputs, parameters, outputs): for name, spec in itertools.chain(inputs.items(), parameters.items(), outputs.items()): if spec.has_view_type(): raise TypeError( " Pipelines do not support function annotations (found one" " for parameter: %r)." % name) def coerce_user_input(self, **user_input): """ Coerce user inputs to be appropriate for callable """ callable_args = {} for name, spec in self.signature_order.items(): # Some arguments may be optional and won't be present here. 
Whether # they passed all mandatory arguments or not is validated elsewhere if name in user_input: arg = user_input[name] if name in self.inputs: callable_args[name] = self._coerce_given_input(arg, spec) else: callable_args[name] = \ self._coerce_given_parameter(arg, spec) return callable_args def _coerce_given_input(self, _input, spec): """ Coerce input to be appropriate for callable """ _, qiime_name = self._get_qiime_type_and_name(spec) # Transform collection from list to dict and vice versa if needed if qiime_name == 'Collection' and isinstance(_input, list): _input = self._list_to_dict(_input) elif qiime_name == 'List' and \ (isinstance(_input, dict) or isinstance(_input, qiime2.sdk.ResultCollection)): _input = self._dict_to_list(_input) if isinstance(_input, dict): _input = qiime2.sdk.ResultCollection(_input) return _input def _coerce_given_parameter(self, param, spec): """ Coerce parameter to be appropriate for callable """ view_type = spec.view_type qiime_type = spec.qiime_type if view_type == dict and isinstance(param, list): param = self._list_to_dict(param) elif view_type == list and isinstance(param, dict): param = self._dict_to_list(param) if qiime_type is qtype.Threads and param == 'auto': param = 0 return param def transform_and_add_callable_args_to_prov(self, provenance, **callable_args): """ Transform inputs to views and add all callable arguments to provenance. Needs to be done together so we can add transformation records to provenance and because we want transformers to run outside the DFK in parsl """ for name, spec in self.signature_order.items(): arg = callable_args[name] if name in self.inputs: callable_args[name] = \ self._transform_and_add_input_to_prov( provenance, name, spec, arg) else: provenance.add_parameter(name, spec.qiime_type, arg) return callable_args def _transform_and_add_input_to_prov(self, provenance, name, spec, _input): """ Transform the input and add both the input and the transformation record to provenance """ transformed_input = None # Add input to provenance after creating the correct collection # type provenance.add_input(name, _input) qiime_type, _ = self._get_qiime_type_and_name(spec) # Transform artifacts to view types as necessary if _input is None: transformed_input = None elif spec.has_view_type(): recorder = provenance.transformation_recorder(name) # Transform all members of collection into view type if qtype.is_collection_type(qiime_type): if isinstance(_input, qiime2.sdk.result.ResultCollection): transformed_input = qiime2.sdk.result.ResultCollection( {k: v._view(spec.view_type, recorder) for k, v in _input.items()}) else: transformed_input = [ i._view(spec.view_type, recorder) for i in _input] else: transformed_input = _input._view(spec.view_type, recorder) else: transformed_input = _input return transformed_input def _get_qiime_type_and_name(self, spec): """ Get concrete qiime type and name from nested spec """ qiime_type = spec.qiime_type qiime_name = spec.qiime_type.name # I don't think this will necessarily work if we nest collection # types in the future if qiime_name == '': # If we have an outer union as our semantic type, the name will # be the empty string, and the type will be the entire union # expression. In order to get a meaningful name and a type # that tells us if we have a collection, we unpack the union # and grab that info from the first element. 
All subsequent # elements will share this same basic information because we # do not allow # List[TypeA] | Collection[TypeA] qiime_type = next(iter(spec.qiime_type)) qiime_name = qiime_type.name return qiime_type, qiime_name def coerce_given_outputs(self, output_views, output_types, scope, provenance): """ Coerce the outputs produced by the method into the desired types if possible. Primarily useful to create collections of outputs """ outputs = [] for output_view, (name, spec) in zip(output_views, output_types.items()): if spec.qiime_type.name == 'Collection': output = qiime2.sdk.ResultCollection() size = len(output_view) if isinstance(output_view, qiime2.sdk.ResultCollection) or \ isinstance(output_view, dict): keys = list(output_view.keys()) values = list(output_view.values()) else: keys = None values = output_view for idx, view in enumerate(values): if keys is not None: key = str(keys[idx]) else: key = str(idx) collection_name = create_collection_name( name=name, key=key, idx=idx, size=size) output[key] = self._create_output_artifact( provenance, collection_name, scope, spec, view) elif type(output_view) is not spec.view_type: raise TypeError( "Expected output view type %r, received %r" % (spec.view_type.__name__, type(output_view).__name__)) else: output = self._create_output_artifact( provenance, name, scope, spec, output_view) outputs.append(output) return outputs def _create_output_artifact(self, provenance, name, scope, spec, view): """ Create an output artifact from a view and add it to provenance """ prov = provenance.fork(name) qiime_type = spec.qiime_type # If we have a collection we need to get a concrete qiime_type to # instantiate each artifact as. # # For instance, we cannot instantiate a Collection[SingleInt] from an # integer. We want to instantiate a SingleInt that will be put into a # ResultCollection outside of this method. if is_collection_type(qiime_type): qiime_type = qiime_type.fields[0] scope.add_reference(prov) artifact = qiime2.sdk.Artifact._from_view( qiime_type, view, spec.view_type, prov) artifact = scope.add_parent_reference(artifact) return artifact def decode_parameters(self, **kwargs): params = {} for key, spec in self.parameters.items(): if (spec.has_default() and spec.default is None and kwargs[key] is None): params[key] = None else: params[key] = parse_primitive(spec.qiime_type, kwargs[key]) return params def _dict_to_list(self, _input): """ Turn dict to list """ return list(_input.values()) def _list_to_dict(self, _input): """ Turn list to dict """ return {str(idx): v for idx, v in enumerate(_input)} def check_types(self, **kwargs): for name, spec in self.signature_order.items(): parameter = kwargs[name] # A type mismatch is unacceptable unless the value is None # and this parameter's default value is None. if ((parameter not in spec.qiime_type) and not (spec.has_default() and spec.default is None and parameter is None)): if isinstance(parameter, qiime2.sdk.Visualization): raise TypeError( "Parameter %r received a Visualization as an " "argument. Visualizations may not be used as inputs." % name) elif isinstance(parameter, qiime2.sdk.Artifact): raise TypeError( "Parameter %r requires an argument of type %r. An " "argument of type %r was passed." 
% ( name, spec.qiime_type, parameter.type)) elif isinstance(parameter, qiime2.Metadata): raise TypeError( "Parameter %r received Metadata as an " "argument, which is incompatible with parameter " "type: %r" % (name, spec.qiime_type)) else: # handle primitive types raise TypeError( "Parameter %r received %r as an argument, which is " "incompatible with parameter type: %r" % (name, parameter, spec.qiime_type)) def solve_output(self, **kwargs): solved_outputs = None for _, spec in itertools.chain(self.inputs.items(), self.parameters.items(), self.outputs.items()): if list(meta.select_variables(spec.qiime_type)): break # a variable exists, do the hard work else: # no variables solved_outputs = self.outputs if solved_outputs is None: inputs = {**{k: s.qiime_type for k, s in self.inputs.items()}, **{k: s.qiime_type for k, s in self.parameters.items()}} outputs = {k: s.qiime_type for k, s in self.outputs.items()} input_types = { k: self._infer_type(k, v) for k, v in kwargs.items()} solved = meta.match(input_types, inputs, outputs) solved_outputs = collections.OrderedDict( (k, s.duplicate(qiime_type=solved[k])) for k, s in self.outputs.items()) for output_name, spec in solved_outputs.items(): if not spec.qiime_type.is_concrete(): raise TypeError( "Solved output %r must be a concrete type, not %r" % (output_name, spec.qiime_type)) return solved_outputs def _infer_type(self, key, value): if value is None: if key in self.inputs: return self.inputs[key].qiime_type elif key in self.parameters: return self.parameters[key].qiime_type # Shouldn't happen: raise ValueError("Parameter passed not consistent with signature.") if type(value) is list: inner = UnionExp((self._infer_type(key, v) for v in value)) return List[inner.normalize()] if type(value) is set: inner = UnionExp((self._infer_type(key, v) for v in value)) return Set[inner.normalize()] if type(value) is dict or \ isinstance(value, qiime2.sdk.ResultCollection): inner = UnionExp( (self._infer_type(key, v) for v in value.values())) return Collection[inner.normalize()] if isinstance( value, (qiime2.sdk.Artifact, qiime2.sdk.proxy.ProxyArtifact)): return value.type else: return infer_primitive_type(value) def __repr__(self): lines = [] for group in 'inputs', 'parameters', 'outputs': lookup = getattr(self, group) lines.append('%s:' % group) for name, spec in lookup.items(): lines.append(' %s: %r' % (name, spec)) return '\n'.join(lines) def __eq__(self, other): return (type(self) is type(other) and self.inputs == other.inputs and self.parameters == other.parameters and self.outputs == other.outputs and self.signature_order == other.signature_order) def __ne__(self, other): return not (self == other) class MethodSignature(PipelineSignature): builtin_args = () def _assert_valid_outputs(self, outputs): super()._assert_valid_outputs(outputs) # Assert all output types are semantic types. The parent class is less # strict in its output type requirements. 
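# ------------------------------------------------------------------------
# Illustrative sketch (added commentary, not part of signature.py):
# check_types() above leans on the `in` operator of QIIME 2 type
# expressions -- an argument type-checks exactly when it is an element of
# the declared parameter type.  A few concrete membership checks:
# ------------------------------------------------------------------------
from qiime2.plugin import Int, List, Range

assert 3 in Int % Range(0, 5)        # value satisfies the range predicate
assert 7 not in Int % Range(0, 5)    # out of range -> check_types would raise
assert [1, 2, 3] in List[Int]        # collection types check every element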
for output_name, spec in outputs.items(): if not is_semantic_type(spec.qiime_type): raise TypeError( "Output %r must be a semantic QIIME type, not %r" % (output_name, spec.qiime_type)) def _assert_valid_views(self, inputs, parameters, outputs): for name, spec in itertools.chain(inputs.items(), parameters.items(), outputs.items()): if not spec.has_view_type(): raise TypeError("Method is missing a function annotation for" " parameter: %r" % name) class VisualizerSignature(PipelineSignature): builtin_args = ('output_dir',) def __init__(self, callable, inputs, parameters, input_descriptions=None, parameter_descriptions=None): outputs = {'visualization': Visualization} output_descriptions = None super().__init__(callable, inputs, parameters, outputs, input_descriptions, parameter_descriptions, output_descriptions) def _assert_valid_outputs(self, outputs): super()._assert_valid_outputs(outputs) output = outputs['visualization'] if output.has_view_type() and output.view_type is not None: raise TypeError( "Visualizer callable cannot return anything. Its return " "annotation must be `None`, not %r. Write output to " "`output_dir`." % output.view_type) def _assert_valid_views(self, inputs, parameters, outputs): for name, spec in itertools.chain(inputs.items(), parameters.items()): if not spec.has_view_type(): raise TypeError("Visualizer is missing a function annotation" " for parameter: %r" % name) IndexedCollectionElement = collections.namedtuple( 'IndexedCollectionElement', ['item_name', 'idx', 'total']) class HashableInvocation(): def __init__(self, plugin_action, arguments): self.plugin_action = plugin_action unified_arguments = self._unify_dicts(arguments) self.arguments = self._make_hashable(unified_arguments) def __eq__(self, other): return (self.plugin_action == other.plugin_action) \ and (self.arguments == other.arguments) def __hash__(self): return hash((self.plugin_action, self.arguments)) def __repr__(self): return (f'\nPLUGIN_ACTION: {self.plugin_action}\nARGUMENTS:' f' {self.arguments}\n') def _unify_dicts(self, arguments): """Check if action.yaml gave us any lists of single element dicts to unify """ for idx, argument in enumerate(arguments): name, value = list(argument.items())[0] if isinstance(value, list) and \ all(isinstance(x, dict) for x in value): arguments[idx] = {name: self._unify_dict(value)} return arguments def _unify_dict(self, collection): """If we do have a list of single element dicts, turn it into one dict """ unified_dict = {} for elem in collection: for k, v in elem.items(): unified_dict[k] = v return unified_dict def _make_hashable(self, collection): """Take an arbitrarily nested collection and turn it into a hashable arbitrarily nested tuple. 
Turns Artifacts into their uuid and Metadata into their md5sum """ from qiime2 import Artifact from qiime2.sdk import ResultCollection from qiime2.metadata.metadata import _MetadataBase new_collection = [] if isinstance(collection, dict) or \ isinstance(collection, ResultCollection): for k, v in collection.items(): new_collection.append((k, self._make_hashable(v))) elif isinstance(collection, list): for elem in collection: new_collection.append(self._make_hashable(elem)) elif isinstance(collection, Artifact): return str(collection.uuid) elif isinstance(collection, _MetadataBase): with tempfile.NamedTemporaryFile('w') as fh: fp = fh.name collection.save(fp) collection = md5sum(fp) return collection elif isinstance(collection, MetadataInfo): return collection.md5sum_hash else: return collection return tuple(new_collection) qiime2-2024.5.0/qiime2/core/type/template.py000066400000000000000000000110771462552636000204350ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- from abc import ABCMeta, abstractmethod import itertools import inspect from qiime2.core.type.grammar import (IncompleteExp, TypeExp, PredicateExp, IntersectionExp) class _BaseTemplate(metaclass=ABCMeta): public_proxy = () is_template = True # for smoke-testing @property def __signature__(self): return inspect.signature(self.__init__) @abstractmethod def __eq__(self, other): raise NotImplementedError def __hash__(self, other): return 0 def get_name_expr(self, self_expr): return self.get_name() @abstractmethod def get_name(self): raise NotImplementedError def get_kind_expr(self, self_expr): return self.get_kind() @abstractmethod def get_kind(self): raise NotImplementedError def get_union_membership_expr(self, self_expr): return self.get_kind_expr(self_expr) def is_element_expr(self, self_expr, value): return self.is_element(value) @abstractmethod def is_element(self, value): raise NotImplementedError def collapse_intersection(self, other): return NotImplemented def is_symbol_subtype_expr(self, self_expr, other_expr): return self.is_symbol_subtype(other_expr.template) def is_symbol_subtype(self, other): return self.get_name() == other.get_name() def is_symbol_supertype_expr(self, self_expr, other_expr): return self.is_symbol_supertype(other_expr.template) def is_symbol_supertype(self, other): return self.get_name() == other.get_name() def update_ast_expr(self, self_expr, ast): self.update_ast(ast) def update_ast(self, ast): pass # default is to do nothing class TypeTemplate(_BaseTemplate): def __new__(cls, *args, _pickle=False, **kwargs): self = super().__new__(cls) if _pickle: return self self.__init__(*args, **kwargs) if list(self.get_field_names()): return IncompleteExp(self) else: return TypeExp(self) def __getnewargs_ex__(self): return ((), {'_pickle': True}) def get_field_names_expr(self, expr): return self.get_field_names() @abstractmethod def get_field_names(self): raise NotImplementedError def validate_fields_expr(self, self_expr, fields_expr): self.validate_field_count(len(fields_expr)) for expr, name in itertools.zip_longest( fields_expr, self.get_field_names_expr(self_expr), fillvalue=IntersectionExp()): if expr.template is None: for exp in expr.members: if exp.template is None: for ex in 
exp.members: self.validate_field(name, ex.template) else: self.validate_field(name, exp.template) else: self.validate_field(name, expr.template) def validate_field_count(self, count): exp = len(self.get_field_names()) if count != exp: raise TypeError("Expected only %r fields, got %r" % (exp, count)) @abstractmethod def validate_field(self, name, field): raise NotImplementedError def validate_predicate_expr(self, self_expr, predicate_expr): if predicate_expr.template is None: for predicate in predicate_expr.members: self.validate_predicate_expr(self_expr, predicate) else: self.validate_predicate(predicate_expr.template) @abstractmethod def validate_predicate(self, predicate): raise NotImplementedError class PredicateTemplate(_BaseTemplate): def __new__(cls, *args, _pickle=False, **kwargs): self = super().__new__(cls) if _pickle: return self self.__init__(*args, **kwargs) return PredicateExp(self) def __getnewargs_ex__(self): return ((), {'_pickle': True}) @abstractmethod def __hash__(self, other): raise NotImplementedError @abstractmethod def is_symbol_subtype(self, other): raise NotImplementedError @abstractmethod def is_symbol_supertype(self, other): raise NotImplementedError qiime2-2024.5.0/qiime2/core/type/tests/000077500000000000000000000000001462552636000174045ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/core/type/tests/__init__.py000066400000000000000000000005351462552636000215200ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- qiime2-2024.5.0/qiime2/core/type/tests/test_collection.py000066400000000000000000000123741462552636000231570ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import unittest from qiime2.core.type import ( is_collection_type, is_primitive_type, is_semantic_type, Set, List, SemanticType, Int, Metadata, MetadataColumn, Categorical, Numeric, Range) class TestIsTypes(unittest.TestCase): def test_list_semantic_type(self): Foo = SemanticType('Foo') self.assertTrue(is_collection_type(List[Foo])) self.assertTrue(is_semantic_type(List[Foo])) self.assertFalse(is_primitive_type(List[Foo])) def test_set_semantic_type(self): Foo = SemanticType('Foo') self.assertTrue(is_collection_type(Set[Foo])) self.assertTrue(is_semantic_type(Set[Foo])) self.assertFalse(is_primitive_type(Set[Foo])) def test_list_primitive_type(self): self.assertTrue(is_collection_type(List[Int % Range(5)])) self.assertTrue(is_primitive_type(List[Int % Range(5)])) self.assertFalse(is_semantic_type(List[Int % Range(5)])) def test_set_primitive_type(self): self.assertTrue(is_collection_type(Set[Int % Range(5)])) self.assertTrue(is_primitive_type(Set[Int % Range(5)])) self.assertFalse(is_semantic_type(Set[Int % Range(5)])) class TestCollectionBase(unittest.TestCase): def test_no_list_metadata(self): with self.assertRaisesRegex(TypeError, 'metadata'): List[Metadata] def test_no_set_metadata(self): with self.assertRaisesRegex(TypeError, 'metadata'): List[Metadata] def test_no_list_metadata_column(self): with self.assertRaisesRegex(TypeError, 'metadata'): List[MetadataColumn[Categorical]] with self.assertRaisesRegex(TypeError, 'metadata'): List[MetadataColumn[Numeric]] def test_no_set_metadata_column(self): with self.assertRaisesRegex(TypeError, 'metadata'): Set[MetadataColumn[Categorical]] with self.assertRaisesRegex(TypeError, 'metadata'): Set[MetadataColumn[Numeric]] def test_no_nesting_list_list(self): with self.assertRaisesRegex(TypeError, 'nest'): List[List[Int]] def test_no_nesting_set_set(self): with self.assertRaisesRegex(TypeError, 'nest'): Set[Set[Int]] def test_no_nesting_mixed(self): with self.assertRaisesRegex(TypeError, 'nest'): List[Set[Int]] class TestCollectionExpression(unittest.TestCase): def test_bad_union(self): with self.assertRaisesRegex(TypeError, 'not union'): List[Int] | Set[Int] def test_union_inside_collection(self): Foo = SemanticType('Foo') Bar = SemanticType('Bar') self.assertTrue(List[Foo] <= List[Foo | Bar]) def test_no_predicate(self): with self.assertRaisesRegex(TypeError, 'cannot be applied'): List[Int] % Range(5) def is_concrete(self): Foo = SemanticType('Foo') self.assertFalse(List[Foo].is_concrete()) self.assertFalse(Set[Int].is_concrete()) def test_to_ast_semantic(self): Foo = SemanticType('Foo') ast = List[Foo].to_ast() self.assertEqual(ast['fields'][0], Foo.to_ast()) def test_to_ast_primitive(self): ast = List[Int % Range(5)].to_ast() self.assertEqual(ast['fields'][0], (Int % Range(5)).to_ast()) def test_contains_list_primitive(self): self.assertTrue([1, 2, 3] in List[Int]) self.assertTrue([-1, 2, 3] in List[Int]) self.assertFalse([-1, 2, 3] in List[Int % Range(0, 5)]) self.assertFalse([1, 1.1, 1.11] in List[Int]) self.assertFalse({1, 2, 3} in List[Int]) self.assertFalse(object() in List[Int]) def test_contains_set_primitive(self): self.assertTrue({1, 2, 3} in Set[Int]) self.assertTrue({-1, 2, 3} in Set[Int]) self.assertFalse({-1, 2, 3} in Set[Int % Range(0, 5)]) self.assertFalse({1, 1.1, 1.11} in Set[Int]) self.assertFalse([1, 2, 3] in Set[Int]) self.assertFalse(object() in Set[Int]) def test_variant_of_field_members(self): Bar = SemanticType('Bar') Foo = SemanticType('Foo', 
field_names='foo', field_members={'foo': List[Bar]}) with self.assertRaisesRegex(TypeError, 'is not a variant'): Foo[List[Bar]] def test_variant_of_alt(self): Foo = SemanticType('Foo', field_names='foo') Bar = SemanticType('Bar', variant_of=Foo.field['foo']) with self.assertRaisesRegex(TypeError, 'is not a variant'): Foo[Set[Bar]] def test_encode_decode_set(self): value = List[Int].decode("[1, 2, 3]") self.assertEqual(value, [1, 2, 3]) json = List[Int].encode(value) self.assertEqual(json, "[1, 2, 3]") def test_encode_decode_list(self): value = Set[Int].decode("[1, 2, 3]") self.assertEqual(value, {1, 2, 3}) json = Set[Int].encode(value) self.assertEqual(json, "[1, 2, 3]") if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/core/type/tests/test_grammar.py000066400000000000000000000563471462552636000224620ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import pickle import unittest import collections.abc import qiime2.core.type.grammar as grammar import qiime2.core.type.template as template class _MockBase: public_proxy = 'example', def __init__(self, name, fields=()): self.test_data = {} self.name = name self.fields = fields def track_call(func): def wrapped(self, *args, **kwargs): self.test_data[func.__name__] = True return func(self, *args, **kwargs) return wrapped @track_call def __eq__(self, other): return id(self) == id(other) @track_call def __hash__(self): return hash(id(self)) @track_call def get_field_names(self): return self.fields @track_call def get_name(self): return self.name @track_call def get_kind(self): return "tester" @track_call def validate_union(self, other): pass @track_call def validate_intersection(self, other): pass @track_call def is_element(self, value): return self.name.startswith(value) @track_call def collapse_intersection(self, other): return super().collapse_intersection(other) @track_call def is_symbol_subtype(self, other): return self.name == other.name @track_call def is_symbol_supertype(self, other): return self.name == other.name @track_call def update_ast(self, ast): ast['extra_junk'] = self.name def validate_field(self, name, field): self.test_data['validate_field'] = name if field.name == 'InvalidMember': raise TypeError('InvalidMember cannot be used') @track_call def validate_predicate(self, predicate): pass @track_call def example(self): return ... 
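# Illustrative aside (hypothetical class, not one of the mocks used below):
# the track_call decorator defined in _MockBase above records, per instance,
# which template hook the grammar actually invoked, so tests can later assert
# e.g. ``expr.template.test_data['is_element']``.  The same pattern in
# miniature:
class _Recorder:
    def __init__(self):
        self.test_data = {}

    def track_call(func):
        def wrapped(self, *args, **kwargs):
            self.test_data[func.__name__] = True
            return func(self, *args, **kwargs)
        return wrapped

    @track_call
    def ping(self):
        return 'pong'


_recorder = _Recorder()
assert _recorder.ping() == 'pong'
assert _recorder.test_data == {'ping': True}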
class MockTemplate(_MockBase, template.TypeTemplate): pass class MockPredicate(_MockBase, template.PredicateTemplate): def __init__(self, name, alphabetize=False): self.alphabetize = alphabetize super().__init__(name) def __repr__(self): return self.name def is_symbol_subtype(self, other): if not self.alphabetize or not other.alphabetize: return super().is_symbol_subtype(other) return self.name <= other.name def is_symbol_supertype(self, other): if not self.alphabetize or not other.alphabetize: return super().is_symbol_supertype(other) return self.name >= other.name def get_kind(self): return "tester-predicate" class TestIncompleteExp(unittest.TestCase): def IncompleteExp(self, name, fields): expr = MockTemplate(name, fields) self.assertIsInstance(expr, grammar.IncompleteExp) return expr def test_construction_sanity(self): expr = MockTemplate('foo') # TypeExpr with self.assertRaisesRegex(ValueError, "no fields"): # template has no fields, so putting it in an IncompleteExp # doesn't make sense expr = grammar.IncompleteExp(expr.template) def test_mod(self): with self.assertRaisesRegex(TypeError, 'predicate'): self.IncompleteExp('foo', ('a',)) % ... def test_or(self): with self.assertRaisesRegex(TypeError, 'union'): self.IncompleteExp('foo', ('a',)) | ... def test_and(self): with self.assertRaisesRegex(TypeError, 'intersect'): self.IncompleteExp('foo', ('a',)) & ... def test_repr(self): self.assertEqual(repr(self.IncompleteExp('Example', ('foo',))), 'Example[{foo}]') self.assertEqual(repr(self.IncompleteExp('Example', ('f', 'b'))), 'Example[{f}, {b}]') def test_le(self): expr_a = self.IncompleteExp('Foo', ('a',)) expr_b = self.IncompleteExp('Bar', ('b',)) with self.assertRaisesRegex(TypeError, 'missing arguments'): expr_a <= expr_b def test_ge(self): expr_a = self.IncompleteExp('Foo', ('a',)) expr_b = self.IncompleteExp('Bar', ('b',)) with self.assertRaisesRegex(TypeError, 'missing arguments'): expr_a >= expr_b def test_in(self): expr_a = self.IncompleteExp('Foo', ('a',)) with self.assertRaisesRegex(TypeError, 'missing arguments'): ... in expr_a def test_field_w_typeexp(self): expr_a = self.IncompleteExp('Foo', ('baz',)) expr_inner = MockTemplate('Bar') result = expr_a[expr_inner] self.assertEqual(repr(result), 'Foo[Bar]') self.assertIsInstance(result, grammar.TypeExp) self.assertEqual(expr_a.template.test_data['validate_field'], 'baz') def test_field_w_incompleteexp(self): expr_a = self.IncompleteExp('Foo', ('a',)) expr_b = self.IncompleteExp('Bar', ('b',)) with self.assertRaisesRegex(TypeError, 'complete type expression'): expr_a[expr_b] def test_field_w_nonsense(self): expr_a = self.IncompleteExp('Foo', ('a',)) with self.assertRaisesRegex(TypeError, 'complete type expression'): expr_a[...] 
def test_field_wrong_length(self): X = MockTemplate('X') C = self.IncompleteExp('C', ['foo', 'bar']) with self.assertRaisesRegex(TypeError, '1'): C[X] C = self.IncompleteExp('C', ['foo']) with self.assertRaisesRegex(TypeError, '2'): C[X, X] def test_field_nested_expression(self): X = MockTemplate('X') C = self.IncompleteExp('C', ['foo', 'bar']) self.assertEqual(repr(C[X, C[C[X, X], X]]), 'C[X, C[C[X, X], X]]') def test_field_invalid_member(self): C = self.IncompleteExp('C', ['foo']) InvalidMember = MockTemplate('InvalidMember') with self.assertRaisesRegex(TypeError, 'InvalidMember'): C[InvalidMember] def test_field_union(self): X = MockTemplate('X') Y = MockTemplate('Y') Z = MockTemplate('Z') C = self.IncompleteExp('C', ['foo']) result = C[X | Y | Z] self.assertEqual(repr(result), "C[X | Y | Z]") def test_field_invalid_union(self): X = MockTemplate('X') InvalidMember = MockTemplate('InvalidMember') Z = MockTemplate('Z') C = self.IncompleteExp('C', ['foo']) with self.assertRaisesRegex(TypeError, 'InvalidMember'): C[X | InvalidMember | Z] def test_field_insane(self): X = MockTemplate('X') Y = MockTemplate('Y') Z = MockTemplate('Z') InvalidIntersection = grammar.IntersectionExp( members=(MockTemplate('InvalidMember'), Y)) C = self.IncompleteExp('C', ['foo']) with self.assertRaisesRegex(TypeError, 'InvalidMember'): C[X | InvalidIntersection | Z] def test_iter_symbols(self): expr = self.IncompleteExp('Example', ('foo',)) self.assertEqual(list(expr.iter_symbols()), ['Example']) def test_is_concrete(self): expr = self.IncompleteExp('Example', ('foo',)) self.assertFalse(expr.is_concrete()) def test_pickle(self): expr = self.IncompleteExp('Example', ('foo',)) clone = pickle.loads(pickle.dumps(expr)) self.assertEqual(expr, clone) def test_proxy(self): expr = self.IncompleteExp('Example', ('foo',)) self.assertIs(expr.example(), ...) 
self.assertTrue(expr.template.test_data['example']) def test_eq_nonsense(self): expr_a = self.IncompleteExp('Example', ('foo',)) self.assertEqual(expr_a.__eq__(...), NotImplemented) def test_hash_eq_equals(self): expr_a = self.IncompleteExp('Example', ('foo',)) expr_b = self.IncompleteExp('Example', ('foo',)) self.assertEqual(hash(expr_a), hash(expr_b)) self.assertEqual(expr_a, expr_b) self.assertTrue(expr_a.equals(expr_b)) def test_not_hash_eq_equals_field_mismatch(self): expr_a = self.IncompleteExp('Example', ('foo',)) expr_b = self.IncompleteExp('Example', ('something_else',)) self.assertNotEqual(hash(expr_a), hash(expr_b)) self.assertNotEqual(expr_a, expr_b) self.assertFalse(expr_a.equals(expr_b)) class TestTypeExp(unittest.TestCase): def test_hashable(self): X = MockTemplate('X') Y = MockTemplate('Y', fields=('a',)) Z = MockTemplate('Z') P = MockPredicate('P') self.assertIsInstance(X, collections.abc.Hashable) # There really shouldn't be a collision between these: self.assertNotEqual(hash(X), hash(Z % P)) self.assertEqual(Y[X], Y[X]) self.assertEqual(hash(Y[X]), hash(Y[X])) def test_eq_nonsense(self): X = MockTemplate('X') self.assertIs(X.__eq__(42), NotImplemented) self.assertFalse(X == 42) def test_eq_different_instances(self): X = MockTemplate('X') X_ = MockTemplate('X') self.assertIsNot(X, X_) self.assertEqual(X, X_) def test_field(self): X = MockTemplate('X') with self.assertRaisesRegex(TypeError, 'fields'): X['scikit-bio/assets/.no.gif'] Y = MockTemplate('Y', fields=('foo',))[X] with self.assertRaisesRegex(TypeError, 'fields'): Y[';-)'] def test_repr(self): Face0 = MockTemplate('(o_-)') Face1 = MockTemplate('-_-') Exclaim0 = MockTemplate('!') Exclaim1 = MockTemplate('!', fields=('a',)) Exclaim2 = MockTemplate('!', fields=('a', 'b')) Face2 = MockPredicate('(o_o)') Face3 = grammar.IntersectionExp( (MockPredicate('='), MockPredicate('='))) # repr -> "= & =" Face4 = grammar.UnionExp( (MockPredicate('<'), MockPredicate('<'))) self.assertEqual(repr(Exclaim0), '!') self.assertEqual(repr(Exclaim1[Face1]), '![-_-]') self.assertEqual(repr(Exclaim2[Face1, Exclaim0]), '![-_-, !]') self.assertEqual(repr(Exclaim2[Face1, Exclaim0] % Face2), '![-_-, !] % (o_o)') self.assertEqual(repr(Face0 % Face2), '(o_-) % (o_o)') self.assertEqual(repr(Face0 % Face3), '(o_-) % (= & =)') self.assertEqual(repr(Exclaim2[Face1, Exclaim0] % Face3), '![-_-, !] % (= & =)') self.assertEqual(repr(Exclaim2[Face1, Exclaim0] % Face4), '![-_-, !] 
% (< | <)') def test_full_predicate(self): expr = MockTemplate('Foo') predicate = MockPredicate('Bar') self.assertIs((expr % predicate).full_predicate, predicate) self.assertTrue(expr.full_predicate.is_top()) def test_in(self): expr = MockTemplate('Foo') self.assertIn('Foo', expr) self.assertTrue(expr.template.test_data['is_element']) expr = MockTemplate('Bar') % MockPredicate('Baz') self.assertIn('Ba', expr) self.assertTrue(expr.template.test_data['is_element']) self.assertTrue(expr.predicate.template.test_data['is_element']) def test_not_in(self): expr = MockTemplate('Foo') self.assertNotIn('Bar', expr) expr = MockTemplate('Bar') % MockPredicate('Baz') self.assertNotIn('Bar', expr) # Bar not a substring of Baz def test_mod(self): Bar = MockTemplate('Bar') Baz = MockPredicate('Baz') noop = Bar % grammar.IntersectionExp() self.assertIs(Bar, noop) with self.assertRaisesRegex(TypeError, 'predicate'): (Bar % Baz) % Baz with self.assertRaisesRegex(TypeError, 'right-hand'): Baz % Bar def test_iter(self): Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') C2 = MockTemplate('C2', fields=('a', 'b')) P = MockPredicate('P') Q = MockPredicate('Q') self.assertEqual( { Foo, C2[Foo, Foo], C2[Foo, C2[Foo % (P & Q), Bar]], C2[Foo, C2[Foo % (P & Q), Foo]], C2[Foo, C2[Bar, Bar]], C2[Foo, C2[Bar, Foo]] }, set( Foo | C2[Foo, Foo | C2[Foo % (P & Q) | Bar, Bar | Foo]] ) ) def test_iter_symbols(self): Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') C2 = MockTemplate('C2', fields=('a', 'b')) P = MockPredicate('P') Q = MockPredicate('Q') self.assertEqual( {'Foo', 'C2', 'Bar'}, set((Foo | C2[Foo, Foo | C2[Foo % (P & Q) | Bar, Bar | Foo]] ).iter_symbols())) def test_is_concrete(self): Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') C2 = MockTemplate('C2', fields=('a', 'b')) P = MockPredicate('P') Q = MockPredicate('Q') self.assertTrue(Foo.is_concrete()) self.assertTrue(C2[Foo, Bar].is_concrete()) self.assertTrue((Foo % P).is_concrete()) self.assertTrue((C2[Foo % P, Bar] % Q).is_concrete()) def test_not_concrete(self): Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') C2 = MockTemplate('C2', fields=('a', 'b')) P = MockPredicate('P') Q = MockPredicate('Q') AnnoyingToMake = grammar.TypeExp( Foo.template, predicate=grammar.UnionExp((P, Q))) self.assertFalse((Foo | Bar).is_concrete()) self.assertFalse(C2[Foo | Bar, Bar].is_concrete()) self.assertFalse(C2[Foo, Bar | Foo].is_concrete()) self.assertFalse(AnnoyingToMake.is_concrete()) def test_to_ast(self): Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') C2 = MockTemplate('C2', fields=('a', 'b')) P = MockPredicate('P') Q = MockPredicate('Q') self.assertEqual( (C2[Foo | Bar % P, Foo % (P & Q)] % (P | Q)).to_ast(), { 'type': 'expression', 'builtin': True, 'name': 'C2', 'predicate': { 'type': 'union', 'members': [ { 'type': 'predicate', 'name': 'P', 'extra_junk': 'P' }, { 'type': 'predicate', 'name': 'Q', 'extra_junk': 'Q' } ] }, 'fields': [ { 'type': 'union', 'members': [ { 'type': 'expression', 'builtin': True, 'name': 'Foo', 'predicate': None, 'fields': [], 'extra_junk': 'Foo' }, { 'type': 'expression', 'builtin': True, 'name': 'Bar', 'predicate': { 'type': 'predicate', 'name': 'P', 'extra_junk': 'P' }, 'fields': [], 'extra_junk': 'Bar' } ] }, { 'type': 'expression', 'builtin': True, 'name': 'Foo', 'predicate': { 'type': 'intersection', 'members': [ { 'type': 'predicate', 'name': 'P', 'extra_junk': 'P' }, { 'type': 'predicate', 'name': 'Q', 'extra_junk': 'Q' } ] }, 'fields': [], 'extra_junk': 'Foo' } ], 'extra_junk': 'C2' }) class 
TestIntersection(unittest.TestCase): def test_basic(self): P = MockPredicate('P') Q = MockPredicate('Q') result = P & Q self.assertEqual(repr(result), "P & Q") def test_subtype(self): P = MockPredicate('P', alphabetize=True) Q = MockPredicate('Q', alphabetize=True) self.assertIs(Q & P, P) self.assertIs(P & Q, P) def test_identity(self): x = grammar.IntersectionExp() self.assertTrue(x.is_top()) self.assertEqual(repr(x), 'IntersectionExp()') self.assertEqual(x.kind, 'identity') self.assertEqual(x.name, '') def test_in(self): Tree = MockPredicate('Tree') Trick = MockPredicate('Trick') Trek = MockPredicate('Trek') Truck = MockPredicate('Truck') self.assertIn('Tr', Tree & Trick & Trek & Truck) self.assertNotIn('Tre', Tree & Trick & Trek & Truck) self.assertIn('Tre', Tree & Trek) self.assertNotIn('Tree', Tree & Trek) self.assertNotIn('Nope', Tree & Truck) def test_distribution(self): C2 = MockTemplate('C2', fields=('a', 'b')) Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') P = MockPredicate('P') Q = MockPredicate('Q') R = MockPredicate('R') S = MockPredicate('S') self.assertTrue( ((P | Q) & (R | S)).equals(P & R | P & S | Q & R | Q & S)) self.assertTrue( ((P | Q) & (R & S)).equals(P & R & S | Q & R & S)) self.assertEqual(Foo & Bar, grammar.UnionExp()) self.assertEqual(C2[Foo, Bar] & C2[Foo, Foo], grammar.UnionExp()) self.assertTrue((C2[Foo % P, Bar] & C2[Foo % Q, Bar]).equals( C2[Foo % (P & Q), Bar])) class TestUnion(unittest.TestCase): def test_basic(self): P = MockPredicate('P') Q = MockPredicate('Q') result = P | Q self.assertEqual(repr(result), "P | Q") def test_subtype(self): P = MockPredicate('P', alphabetize=True) Q = MockPredicate('Q', alphabetize=True) self.assertIs(Q | P, Q) self.assertIs(P | Q, Q) def test_identity(self): x = grammar.UnionExp() self.assertTrue(x.is_bottom()) self.assertEqual(repr(x), 'UnionExp()') self.assertEqual(x.kind, 'identity') self.assertEqual(x.name, '') def test_in(self): Bat = MockTemplate('Bat') Cat = MockTemplate('Cat') self.assertIn('C', Bat | Cat) self.assertIn('B', Bat | Cat) self.assertNotIn('D', Bat | Cat) def test_distribution(self): Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') P = MockPredicate('P') Q = MockPredicate('Q') R = MockPredicate('R') S = MockPredicate('S') self.assertTrue( ((P | Q) | (R | S)).equals(P | Q | R | S)) self.assertTrue( (P | (Q | R)).equals(P | Q | R)) self.assertEqual( repr(Foo % P | Bar % Q | Foo % R | Bar % S), 'Foo % (P | R) | Bar % (Q | S)') self.assertEqual( repr(grammar.UnionExp( [Foo % P, Bar % Q, Foo % R, Bar % S]).normalize()), 'Foo % (P | R) | Bar % (Q | S)') def test_maximum_antichain(self): P = MockPredicate('P', alphabetize=True) Q = MockPredicate('Q', alphabetize=True) X = MockPredicate('X') Y = MockPredicate('Y') self.assertEqual(repr((P | X) | (Q | Y)), 'X | Q | Y') self.assertTrue(repr(X & Y | (P & X | Q) | P & X & Q & Y), 'X & Y | Q') self.assertTrue(repr(X & Y | P & X | (X | Q)), 'X | Q') class TestSubtyping(unittest.TestCase): def assertStrongSubtype(self, X, Y): self.assertLessEqual(X, Y) # Should be the same in either direction self.assertGreaterEqual(Y, X) # X and Y would be equal otherwise self.assertFalse(X >= Y) def assertNoRelation(self, X, Y): XsubY = X <= Y self.assertEqual(XsubY, Y >= X) YsubX = Y <= X self.assertEqual(YsubX, X >= Y) self.assertFalse(XsubY or YsubX) def test_equal(self): Foo = MockTemplate('Foo') Foo2 = MockTemplate('Foo') self.assertTrue(Foo.equals(Foo2)) self.assertTrue(Foo2.equals(Foo)) def test_symbol_subtype(self): P = MockPredicate('P', alphabetize=True) Q = 
MockPredicate('Q', alphabetize=True) self.assertStrongSubtype(P, Q) def test_field(self): C2 = MockTemplate('C2', fields=('a', 'b')) Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') Baz = MockTemplate('Baz') self.assertStrongSubtype(C2[Foo, Bar], C2[Foo | Bar, Bar | Baz]) self.assertNoRelation(C2[Baz, Bar], C2[Foo | Bar, Bar | Baz]) self.assertStrongSubtype(C2[Foo, Bar], Bar | C2[Foo, Bar]) self.assertNoRelation(Baz | C2[Foo, Bar], Bar | C2[Foo, Bar]) self.assertStrongSubtype(C2[Foo, Bar | Baz], Bar | C2[Foo, Foo | Bar | Baz]) self.assertNoRelation(C2[Foo | Baz, Bar | Baz], Bar | C2[Foo, Bar]) def test_generic_subtype(self): C1 = MockTemplate('C1', fields=('a',)) Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') self.assertStrongSubtype(C1[Foo] | C1[Bar], C1[Foo | Bar]) self.assertStrongSubtype(C1[C1[Foo] | C1[Bar]], C1[C1[Foo | Bar]]) self.assertStrongSubtype(C1[C1[Foo]] | C1[C1[Bar]], C1[C1[Foo] | C1[Bar]]) self.assertStrongSubtype(C1[C1[Foo]] | C1[C1[Bar]], C1[C1[Foo | Bar]]) def test_predicate_intersection(self): Foo = MockTemplate('Foo') P = MockPredicate('P') Q = MockPredicate('Q') self.assertStrongSubtype(Foo % (P & Q), Foo) self.assertStrongSubtype(Foo % (P & Q), Foo % P) self.assertStrongSubtype(Foo % (P & Q), Foo % Q) def test_union_of_intersections(self): P = MockPredicate('P') Q = MockPredicate('Q') R = MockPredicate('R') self.assertStrongSubtype(P & Q | Q & R, Q | P | R) self.assertStrongSubtype(P & Q | Q & R, Q | P & R) self.assertStrongSubtype(P & Q & R, P & Q | Q & R) self.assertStrongSubtype(P & Q & R, P & Q | Q & R | R & P) def test_type_union(self): Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') Baz = MockTemplate('Baz') self.assertStrongSubtype(Foo | Bar, Foo | Bar | Baz) self.assertNoRelation(Foo | Baz, Baz | Bar) def test_predicate_union(self): P = MockPredicate('P') Q = MockPredicate('Q') R = MockPredicate('R') self.assertStrongSubtype(P | Q, P | Q | R) self.assertNoRelation(P | R, P | Q) if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/core/type/tests/test_meta.py000066400000000000000000000374511462552636000217550ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import unittest import pickle import qiime2.core.type.collection as col import qiime2.core.type.meta as meta from qiime2.core.type.tests.test_grammar import MockTemplate, MockPredicate class TestSelect(unittest.TestCase): def test_select_simple(self): Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') X, Y = meta.TypeMap({ Foo: Bar }) sel, = meta.select_variables(X) self.assertIs(sel(X), X) self.assertIs(sel(Foo), Foo) self.assertIs(sel(X, swap=Foo), Foo) def test_select_inside_field(self): C2 = MockTemplate('C2', fields=('a', 'b')) Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') X, Y = meta.TypeMap({ Foo: Bar }) sel, = meta.select_variables(C2[X, Foo]) self.assertIs(sel(C2[X, Bar]), X) self.assertIs(sel(C2[Bar, Foo]), Bar) self.assertEqual(sel(C2[X, Foo], swap=Foo), C2[Foo, Foo]) def test_select_predicate(self): Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') P = MockPredicate('P') Q = MockPredicate('Q') X, Y = meta.TypeMap({ P & Q: Foo, P: Bar, Q: Foo }) sel, = meta.select_variables(Foo % X) self.assertIs(sel(Foo % X), X) self.assertIs(sel(Foo % P), P) self.assertEqual(sel(Foo % X, swap=Q), Foo % Q) def test_multiple_select(self): C2 = MockTemplate('C2', fields=('a', 'b')) Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') P = MockPredicate('P') Q = MockPredicate('Q') R = MockPredicate('R') X1, Y1 = meta.TypeMap({ Foo: Bar }) X2, Y2 = meta.TypeMap({ P & Q: Foo, P: Bar, Q: Foo }) expr = C2[X1, Foo % X2] % X2 pred_sel, field_sel, field_pred_sel = meta.select_variables(expr) self.assertIs(pred_sel(expr), X2) self.assertIs(pred_sel(C2[Bar, Foo % Q] % P), P) self.assertEqual(pred_sel(expr, swap=R), C2[X1, Foo % X2] % R) self.assertIs(field_sel(expr), X1) self.assertIs(field_sel(C2[Bar, Foo]), Bar) self.assertEqual(field_sel(expr, swap=Foo), C2[Foo, Foo % X2] % X2) self.assertIs(field_pred_sel(expr), X2) self.assertIs(field_pred_sel(C2[Bar, Foo % Q] % P), Q) self.assertEqual(field_pred_sel(expr, swap=R), C2[X1, Foo % R] % X2) class TestTypeMap(unittest.TestCase): def test_missing_branch_requested(self): P = MockPredicate('P') Q = MockPredicate('Q') with self.assertRaisesRegex(ValueError, 'Ambiguous'): meta.TypeMap({P: P, Q: Q}) def test_mismatched_pieces(self): P = MockPredicate('P') Bar = MockTemplate('Bar') with self.assertRaisesRegex(ValueError, 'in the same'): meta.TypeMap({P: P, Bar: Bar}) def test_iter_sorted(self): P = MockPredicate('P', alphabetize=True) Q = MockPredicate('Q', alphabetize=True) Other = MockPredicate('Other') X, Y = meta.TypeMap({ P & Other: Other, P: P, Q & Other: Other, Q: Q, Other: Other }) mapping = X.mapping self.assertEqual( list(mapping.lifted), [col.Tuple[P & Other], col.Tuple[P], col.Tuple[Q & Other], col.Tuple[Q], col.Tuple[Other]]) def test_variables(self): P = MockPredicate('P') Q = MockPredicate('Q') R = MockPredicate('R') S = MockPredicate('S') X, Y = meta.TypeMap({ P & Q: R & S, P: R, Q: S, }) self.assertEqual(X.members, (P & Q, P, Q)) self.assertEqual(Y.members, (R & S, R, S)) self.assertEqual(X.index, 0) self.assertEqual(Y.index, 1) self.assertTrue(X.input) self.assertTrue(Y.output) self.assertFalse(X.output) self.assertFalse(Y.input) # subtyping self.assertFalse(S <= X) self.assertFalse(R <= X) self.assertLessEqual(P, X) self.assertLessEqual(Q, X) self.assertLessEqual(P & Q, X) self.assertLessEqual(P & S | P & R, X) def test_pickle(self): Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') X, Y = meta.TypeMap({ Foo: Bar }) X1, Y1 = pickle.loads(pickle.dumps((X, 
Y))) # Pickled together self.assertIs(X1.mapping, Y1.mapping) self.assertEqual(X1.index, X.index) self.assertEqual(Y1.index, Y.index) def test_subtype(self): P = MockPredicate('P') Q = MockPredicate('Q') R = MockPredicate('R') S = MockPredicate('S') T, U, Y = meta.TypeMap({ (P & Q, P & Q): R & S, (P & Q, Q): R & S, (P, P): R, (Q, Q): S }) self.assertLessEqual(P, T) self.assertLessEqual(Q, T) self.assertFalse(P | Q <= T) self.assertLessEqual(T, U) class TestTypeMatch(unittest.TestCase): def test_missing_branch_provided(self): P = MockPredicate('P') Q = MockPredicate('Q') T = meta.TypeMatch([P, Q]) self.assertEqual(T.members, (P & Q, P, Q)) def test_variable(self): P = MockPredicate('P') Q = MockPredicate('Q') R = MockPredicate('R') # This strange list is for branch coverage mostly T = meta.TypeMatch([P & Q, P, Q, R]) self.assertTrue(T.input) self.assertTrue(T.output) self.assertEqual(T.index, 1) # it really is supposed to be 1 class TestMatch(unittest.TestCase): def test_single_variable(self): Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') P = MockPredicate('P') X, Y = meta.TypeMap({ Foo % P: Foo, Bar % P: Foo % P, Foo: Bar, Bar: Bar % P }) input_signature = dict(input1=X) output_signature = dict(output1=Y) foop = dict(input1=Foo % P) barp = dict(input1=Bar % P) foo = dict(input1=Foo) bar = dict(input1=Bar) self.assertEqual(meta.match(foop, input_signature, output_signature), dict(output1=Foo)) self.assertEqual(meta.match(barp, input_signature, output_signature), dict(output1=Foo % P)) self.assertEqual(meta.match(foo, input_signature, output_signature), dict(output1=Bar)) self.assertEqual(meta.match(bar, input_signature, output_signature), dict(output1=Bar % P)) def test_nested_match(self): C2 = MockTemplate('C2', fields=('a', 'b')) Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') P = MockPredicate('P') X, Y = meta.TypeMap({ Foo % P: Foo, Bar % P: Foo % P, Foo: Bar, Bar: Bar % P }) input_signature = dict(input1=C2[X, Bar]) output_signature = dict(output1=C2[Bar, Y]) foop = dict(input1=C2[Foo % P, Bar]) barp = dict(input1=C2[Bar % P, Bar]) foo = dict(input1=C2[Foo, Foo]) bar = dict(input1=C2[Bar, Foo]) self.assertEqual(meta.match(foop, input_signature, output_signature), dict(output1=C2[Bar, Foo])) self.assertEqual(meta.match(barp, input_signature, output_signature), dict(output1=C2[Bar, Foo % P])) self.assertEqual(meta.match(foo, input_signature, output_signature), dict(output1=C2[Bar, Bar])) self.assertEqual(meta.match(bar, input_signature, output_signature), dict(output1=C2[Bar, Bar % P])) def test_multiple_variables(self): C2 = MockTemplate('C2', fields=('a', 'b')) Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') P = MockPredicate('P') A, B, C, Y, Z = meta.TypeMap({ (Foo % P, Bar, Bar): (Foo, Foo), (Foo, Bar % P, Foo): (Bar, Foo), (Foo, Foo, Bar): (Foo, Bar), (Bar, Bar % P, Foo): (Bar, Bar) }) input_signature = dict(input1=C2[A, B], input2=C) output_signature = dict(output1=C2[Y, Z]) fbb = dict(input1=C2[Foo % P, Bar], input2=Bar) fbf = dict(input1=C2[Foo, Bar % P], input2=Foo) ffb = dict(input1=C2[Foo, Foo], input2=Bar % P) # subtype on in2! 
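# Added note (commentary, not part of the original test): ffb above and bbf
# below hand match() strict subtypes of the branch types (e.g.
# Bar % P <= Bar); the assertions that follow confirm the corresponding
# branch of the TypeMap is still selected.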
bbf = dict(input1=C2[Bar % P, Bar % P], input2=Foo) # subtype on in1 self.assertEqual(meta.match(fbb, input_signature, output_signature), dict(output1=C2[Foo, Foo])) self.assertEqual(meta.match(fbf, input_signature, output_signature), dict(output1=C2[Bar, Foo])) self.assertEqual(meta.match(ffb, input_signature, output_signature), dict(output1=C2[Foo, Bar])) self.assertEqual(meta.match(bbf, input_signature, output_signature), dict(output1=C2[Bar, Bar])) def test_multiple_mappings(self): C2 = MockTemplate('C2', fields=('a', 'b')) Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') P = MockPredicate('P') X, Y = meta.TypeMap({ Foo % P: Foo, Bar % P: Foo % P, Foo: Bar, Bar: Bar % P }) T, R = meta.TypeMap({ Bar % P: Foo, Foo % P: Foo % P, Bar: Bar, Foo: Bar % P }) input_signature = dict(input1=C2[X, T]) output_signature = dict(output1=C2[R, Y]) foop = dict(input1=C2[Foo % P, Bar]) barp = dict(input1=C2[Bar % P, Bar % P]) foo = dict(input1=C2[Foo, Foo]) bar = dict(input1=C2[Bar, Foo]) self.assertEqual(meta.match(foop, input_signature, output_signature), dict(output1=C2[Bar, Foo])) self.assertEqual(meta.match(barp, input_signature, output_signature), dict(output1=C2[Foo, Foo % P])) self.assertEqual(meta.match(foo, input_signature, output_signature), dict(output1=C2[Bar % P, Bar])) self.assertEqual(meta.match(bar, input_signature, output_signature), dict(output1=C2[Bar % P, Bar % P])) def test_no_solution(self): C2 = MockTemplate('C2', fields=('a', 'b')) Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') P = MockPredicate('P') A, B, C, Y, Z = meta.TypeMap({ (Foo % P, Bar, Bar): (Foo, Foo), (Foo, Bar % P, Foo): (Bar, Foo), (Foo, Foo, Bar): (Foo, Bar), (Bar, Bar % P, Foo): (Bar, Bar) }) input_signature = dict(input1=C2[A, B], input2=C) output_signature = dict(output1=C2[Y, Z]) with self.assertRaisesRegex(ValueError, 'No solution'): meta.match(dict(input1=C2[Foo, Foo], input2=Foo), input_signature, output_signature) def test_inconsistent_binding(self): C2 = MockTemplate('C2', fields=('a', 'b')) Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') P = MockPredicate('P') A, B, C, Y, Z = meta.TypeMap({ (Foo % P, Bar, Bar): (Foo, Foo), (Foo, Bar % P, Foo): (Bar, Foo), (Foo, Foo, Bar): (Foo, Bar), (Bar, Bar % P, Foo): (Bar, Bar) }) input_signature = dict(input1=C2[A, B], input2=C2[C, C]) output_signature = dict(output1=C2[Y, Z]) with self.assertRaisesRegex(ValueError, 'to match'): meta.match(dict(input1=C2[Foo, Bar % P], input2=C2[Foo, Bar]), input_signature, output_signature) def test_consistent_subtype_binding(self): C2 = MockTemplate('C2', fields=('a', 'b')) Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') P = MockPredicate('P') A, B, C, Y, Z = meta.TypeMap({ (Foo % P, Bar, Bar): (Foo, Foo), (Foo, Bar % P, Foo): (Bar, Foo), (Foo, Foo, Bar): (Foo, Bar), (Bar, Bar % P, Foo): (Bar, Bar) }) input_signature = dict(input1=C2[A, B], input2=C2[C, C]) output_signature = dict(output1=C2[Y, Z]) cons = dict(input1=C2[Foo, Bar % P], input2=C2[Foo, Foo % P]) self.assertEqual(meta.match(cons, input_signature, output_signature), dict(output1=C2[Bar, Foo])) def test_missing_variables(self): C2 = MockTemplate('C2', fields=('a', 'b')) Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') P = MockPredicate('P') A, B, C, Y, Z = meta.TypeMap({ (Foo % P, Bar, Bar): (Foo, Foo), (Foo, Bar % P, Foo): (Bar, Foo), (Foo, Foo, Bar): (Foo, Bar), (Bar, Bar % P, Foo): (Bar, Bar) }) input_signature = dict(input1=C2[A, B], input2=Foo) output_signature = dict(output1=C2[Y, Z]) with self.assertRaisesRegex(ValueError, 'Missing'): 
meta.match(dict(input1=C2[Foo, Foo], input2=Foo), input_signature, output_signature) def test_no_variables(self): Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') P = MockPredicate('P') input_signature = dict(input1=Foo, input2=Bar) output_signature = dict(output1=Bar % P, output2=Foo % P) given = dict(input1=Foo % P, input2=Bar) self.assertEqual(meta.match(given, input_signature, output_signature), output_signature) def test_type_match(self): Foo = MockTemplate('Foo') Bar = MockTemplate('Bar') Baz = MockTemplate('Baz') P = MockPredicate('P') T = meta.TypeMatch([Baz, Foo, Bar]) input_signature = dict(input1=T) output_signature = dict(output1=T) foop = dict(input1=Foo % P) barp = dict(input1=Bar % P) foo = dict(input1=Foo) bar = dict(input1=Bar) self.assertEqual(meta.match(foop, input_signature, output_signature), dict(output1=Foo)) self.assertEqual(meta.match(barp, input_signature, output_signature), dict(output1=Bar)) self.assertEqual(meta.match(foo, input_signature, output_signature), dict(output1=Foo)) self.assertEqual(meta.match(bar, input_signature, output_signature), dict(output1=Bar)) def test_type_match_auto_intersect(self): C1 = MockTemplate('C1', fields=('a',)) Foo = MockTemplate('Foo') P = MockPredicate('P') Q = MockPredicate('Q') R = MockPredicate('R') S = MockPredicate('S') T = meta.TypeMatch([P, Q, R, S]) input_signature = dict(input1=C1[Foo] % T) output_signature = dict(output1=Foo % T) pqrs = dict(input1=C1[Foo] % (P & Q & R & S)) p = dict(input1=C1[Foo] % P) pr = dict(input1=C1[Foo] % (P & R)) qs = dict(input1=C1[Foo] % (Q & S)) self.assertEqual(meta.match(pqrs, input_signature, output_signature), dict(output1=Foo % (P & Q & R & S))) self.assertEqual(meta.match(p, input_signature, output_signature), dict(output1=Foo % P)) self.assertEqual(meta.match(pr, input_signature, output_signature), dict(output1=Foo % (P & R))) self.assertEqual(meta.match(qs, input_signature, output_signature), dict(output1=Foo % (Q & S))) if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/core/type/tests/test_parse.py000066400000000000000000000124551462552636000221360ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import unittest from qiime2.core.type.parse import ast_to_type, string_to_ast from qiime2.core.testing.type import Foo, Bar, C1, C2 from qiime2.plugin import (Int, Float, Str, Bool, Range, Choices, TypeMap, Properties, List, Set, Visualization, Metadata, MetadataColumn, Categorical, Numeric) class TestParsing(unittest.TestCase): def assert_roundtrip(self, type): ast = string_to_ast(repr(type)) type1 = ast_to_type(ast) type2 = ast_to_type(type1.to_ast()) self.assertEqual(type, type1) self.assertEqual(ast, type1.to_ast()) self.assertEqual(type1, type2) def test_simple_semantic_type(self): self.assert_roundtrip(Foo) self.assert_roundtrip(Bar) self.assert_roundtrip(C1[Foo]) def test_union_semantic_type(self): self.assert_roundtrip(Foo | Bar) self.assert_roundtrip(C1[Foo | Bar]) def test_complicated_semantic_type(self): self.assert_roundtrip(C2[C1[Foo % Properties(["A", "B"]) | Bar], Foo % Properties("A") ] % Properties(exclude=["B", "C"])) def test_collection_semantic_type(self): self.assert_roundtrip(List[Foo | Bar]) self.assert_roundtrip(Set[Bar]) def test_visualization(self): self.assert_roundtrip(Visualization) def test_primitive_simple(self): self.assert_roundtrip(Int) self.assert_roundtrip(Float) self.assert_roundtrip(Str) self.assert_roundtrip(Bool) def test_primitive_predicate(self): self.assert_roundtrip(Int % Range(0, 10)) self.assert_roundtrip( Int % (Range(0, 10) | Range(50, 100, inclusive_end=True))) self.assert_roundtrip(Float % Range(None, 10)) self.assert_roundtrip(Float % Range(0, None)) self.assert_roundtrip(Str % Choices("A")) self.assert_roundtrip(Str % Choices(["A"])) self.assert_roundtrip(Str % Choices("A", "B")) self.assert_roundtrip(Str % Choices(["A", "B"])) self.assert_roundtrip(Bool % Choices(True)) self.assert_roundtrip(Bool % Choices(False)) def test_collection_primitive(self): self.assert_roundtrip(Set[Str % Choices('A', 'B', 'C')]) self.assert_roundtrip(List[Int % Range(1, 3, inclusive_end=True) | Str % Choices('A', 'B', 'C')]) def test_metadata_primitive(self): self.assert_roundtrip(Metadata) self.assert_roundtrip(MetadataColumn[Numeric]) self.assert_roundtrip(MetadataColumn[Categorical]) self.assert_roundtrip(MetadataColumn[Numeric | Categorical]) def test_typevars(self): T, U, V, W, X = TypeMap({ (Foo, Bar, Str % Choices('A', 'B')): (C1[Foo], C1[Bar]), (Foo | Bar, Foo, Str): (C1[Bar], C1[Foo]) }) scope = {} T1 = ast_to_type(T.to_ast(), scope=scope) U1 = ast_to_type(U.to_ast(), scope=scope) V1 = ast_to_type(V.to_ast(), scope=scope) W1 = ast_to_type(W.to_ast(), scope=scope) X1 = ast_to_type(X.to_ast(), scope=scope) self.assertEqual(len(scope), 1) self.assertEqual(scope[id(T.mapping)], [T1, U1, V1, W1, X1]) self.assertEqual(T1.mapping.lifted, T.mapping.lifted) self.assertIs(T1.mapping, U1.mapping) self.assertIs(U1.mapping, V1.mapping) self.assertIs(V1.mapping, W1.mapping) self.assertIs(W1.mapping, X1.mapping) def test_TypeMap_with_properties(self): I, OU = TypeMap({ C1[Foo % Properties(['A', 'B', 'C'])]: Str, C1[Foo % Properties(['A', 'B'])]: Str, C1[Foo % Properties(['A', 'C'])]: Str, C1[Foo % Properties(['B', 'C'])]: Str, C1[Foo % Properties(['A'])]: Str, C1[Foo % Properties(['B'])]: Str, C1[Foo % Properties(['C'])]: Str, }) scope = {} i = ast_to_type(I.to_ast(), scope=scope) o = ast_to_type(OU.to_ast(), scope=scope) self.assertEqual(scope[id(I.mapping)], [i, o]) self.assertEqual(len(scope), 1) # Assert mapping is the same after ast_to_type call self.assertEqual(I.mapping.lifted, 
i.mapping.lifted) # Assert that the mapping object is the same in both i and o self.assertIs(i.mapping, o.mapping) def test_syntax_error(self): with self.assertRaisesRegex(ValueError, "could not be parsed"): string_to_ast('$') def test_bad_juju(self): with self.assertRaisesRegex(ValueError, "one type expression"): string_to_ast('import os; os.rmdir("something-important")') def test_more_bad(self): with self.assertRaisesRegex(ValueError, "Unknown expression"): string_to_ast('lambda x: x') def test_weird(self): with self.assertRaisesRegex(ValueError, "Unknown literal"): string_to_ast('FeatureTable(Foo + Bar)') if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/core/type/tests/test_primitive.py000066400000000000000000000071411462552636000230300ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import unittest import pandas as pd import qiime2.metadata as metadata import qiime2.core.type.primitive as primitive import qiime2.core.type.grammar as grammar class TestIntersectTwoRanges(unittest.TestCase): def assertIntersectEqual(self, a, b, exp): r1 = a & b r2 = b & a self.assertEqual(r1, r2) self.assertEqual(r1, exp) def test_overlap_simple(self): a = primitive.Range(0, 10) b = primitive.Range(3, 7) self.assertIntersectEqual(a, b, b) def test_overlap_inclusive_point(self): a = primitive.Range(0, 5, inclusive_end=True) b = primitive.Range(5, 10) exp = primitive.Range(5, 5, inclusive_start=True, inclusive_end=True) self.assertIntersectEqual(a, b, exp) def test_disjoint_far(self): a = primitive.Range(-10, -5) b = primitive.Range(5, 10) self.assertIntersectEqual(a, b, grammar.UnionExp()) def test_disjoint_exclusive_point(self): a = primitive.Range(0, 5, inclusive_end=False) b = primitive.Range(5, 9, inclusive_start=False) self.assertIntersectEqual(a, b, grammar.UnionExp()) class TestChoices(unittest.TestCase): def test_list_constructor(self): choices = primitive.Choices(['a', 'b', 'c']) self.assertEqual(choices.template.choices, ('a', 'b', 'c')) self.assertIn('a', choices) self.assertNotIn('x', choices) def test_set_constructor(self): choices = primitive.Choices({'a', 'b', 'c'}) self.assertEqual(choices.template.choices, ('a', 'b', 'c')) self.assertIn('a', choices) self.assertNotIn('x', choices) def test_varargs_constructor(self): choices = primitive.Choices('a', 'b', 'c') self.assertEqual(choices.template.choices, ('a', 'b', 'c')) self.assertIn('a', choices) self.assertNotIn('x', choices) def test_union(self): a = primitive.Choices('a', 'b', 'c') b = primitive.Choices('x', 'y', 'z') r = a | b self.assertIn('a', r) self.assertIn('x', r) self.assertNotIn('foo', r) def test_intersection(self): a = primitive.Choices('a', 'b', 'c') b = primitive.Choices('a', 'c', 'z') r = a & b self.assertIn('a', r) self.assertIn('c', r) self.assertNotIn('b', r) self.assertNotIn('z', r) class TestMetadataColumn(unittest.TestCase): def test_decode_categorical_value(self): value = pd.Series({'a': 'a', 'b': 'b', 'c': 'c'}, name='foo') value.index.name = 'id' cat_md = metadata.CategoricalMetadataColumn(value) res = primitive.MetadataColumn[primitive.Categorical].decode(cat_md) self.assertIs(res, cat_md) def test_decode_numeric_value(self): value = pd.Series({'a': 1, 'b': 2, 'c': 3}, 
name='foo') value.index.name = 'id' num_md = metadata.NumericMetadataColumn(value) res = primitive.MetadataColumn[primitive.Categorical].decode(num_md) self.assertIs(res, num_md) def test_decode_other(self): with self.assertRaisesRegex(TypeError, 'provided.*directly'): primitive.MetadataColumn[primitive.Categorical].decode( "") if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/core/type/tests/test_semantic.py000066400000000000000000000037761462552636000226350ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import unittest import qiime2.core.type.semantic as semantic import qiime2.core.type.grammar as grammar import qiime2.core.type.primitive as primitive import qiime2.core.type.visualization as visualization class TestIsSemanticType(unittest.TestCase): def test_primitives_not_semantic(self): looped = False for element in dir(primitive): looped = True element = getattr(primitive, element) if isinstance(element, grammar._ExpBase): self.assertFalse(semantic.is_semantic_type(element)) self.assertTrue(looped) def test_visualization_not_semantic(self): self.assertFalse( semantic.is_semantic_type(visualization.Visualization)) def test_type_expr_not_semantic(self): TypeExpr = grammar.TypeExp(None) self.assertFalse(semantic.is_semantic_type(TypeExpr)) def test_simple_semantic_type(self): A = semantic.SemanticType('A') X = semantic.SemanticType('X') Foo = semantic.SemanticType('Foo', field_names=['a', 'b']) self.assertTrue(semantic.is_semantic_type(A)) self.assertTrue(semantic.is_semantic_type(X)) self.assertTrue(semantic.is_semantic_type(Foo)) def test_composite_semantic_type(self): Foo = semantic.SemanticType('Foo', field_names=['a', 'b']) A = semantic.SemanticType('A', variant_of=Foo.field['a']) B = semantic.SemanticType('B', variant_of=Foo.field['b']) self.assertTrue(semantic.is_semantic_type(A)) self.assertTrue(semantic.is_semantic_type(B)) self.assertTrue(semantic.is_semantic_type(Foo[A, B])) if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/core/type/tests/test_util.py000066400000000000000000001044771462552636000220070ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import unittest from qiime2.core.type import ( parse_primitive, Int, Float, Bool, Str, List, Set, Collection, Metadata, MetadataColumn) class TestParsePrimitiveNonCollectionsSimple(unittest.TestCase): def test_metadata_expr(self): with self.assertRaisesRegex(ValueError, 'Metadata may not be parsed'): parse_primitive(Metadata, '42') def test_metadata_column_expr(self): with self.assertRaisesRegex(ValueError, 'MetadataColumn.* may not be parsed'): parse_primitive(MetadataColumn, '42') def test_int_type_int_value(self): obs = parse_primitive(Int, '42') self.assertEqual(obs, 42) self.assertIsInstance(obs, int) def test_float_type_int_value(self): obs = parse_primitive(Float, '42') self.assertEqual(obs, 42) # ints are a subtype of float, so if it can be parsed as an int, fine. self.assertIsInstance(obs, int) def test_bool_type_int_value(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(Bool, '42') def test_str_type_int_value(self): obs = parse_primitive(Str, '42') self.assertEqual(obs, '42') self.assertIsInstance(obs, str) def test_int_type_float_value(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(Int, '42.0') def test_float_type_float_value(self): obs = parse_primitive(Float, '42.0') self.assertEqual(obs, 42.0) self.assertIsInstance(obs, float) def test_bool_type_float_value(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(Bool, '42.0') def test_str_type_float_value(self): obs = parse_primitive(Str, '42.0') self.assertEqual(obs, '42.0') self.assertIsInstance(obs, str) def test_int_type_bool_value(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(Int, 'True') def test_float_type_bool_value(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(Float, 'True') def test_bool_type_bool_value(self): obs = parse_primitive(Bool, 'True') self.assertEqual(obs, True) self.assertIsInstance(obs, bool) def test_str_type_bool_value(self): obs = parse_primitive(Str, 'True') self.assertEqual(obs, 'True') self.assertIsInstance(obs, str) def test_int_type_str_value(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(Int, 'peanut') def test_float_type_str_value(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(Float, 'peanut') def test_bool_type_str_value(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(Bool, 'peanut') def test_str_type_str_value(self): obs = parse_primitive(Str, 'peanut') self.assertEqual(obs, 'peanut') self.assertIsInstance(obs, str) class TestParsePrimitiveNonCollectionNonStringInputs(unittest.TestCase): def test_int_type_int_value(self): obs = parse_primitive(Int, 1) self.assertEqual(obs, 1) self.assertIsInstance(obs, int) def test_float_type_float_value(self): obs = parse_primitive(Float, 3.3) self.assertEqual(obs, 3.3) self.assertIsInstance(obs, float) def test_bool_type_bool_value(self): obs = parse_primitive(Bool, True) self.assertEqual(obs, True) self.assertIsInstance(obs, bool) def test_str_type_str_value(self): obs = parse_primitive(Str, 'peanut') self.assertEqual(obs, 'peanut') self.assertIsInstance(obs, str) def test_int_type_bool_value(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(Int, True) class TestParsePrimitiveNonCollectionsSimpleUnions(unittest.TestCase): def setUp(self): super().setUp() 
self.exprs = [ Int | Bool, Int | Str, Float | Bool, Float | Str, Bool | Str, ] def test_int_union_float_expr_int_value(self): # Int | Float == Float obs = parse_primitive(Int | Float, '42') self.assertEqual(obs, 42) self.assertIsInstance(obs, int) def test_int_union_float_expr_float_value(self): # Int | Float == Float obs = parse_primitive(Int | Float, '42.1') self.assertEqual(obs, 42.1) self.assertIsInstance(obs, float) def test_int_union_float_expr_bool_value(self): # Int | Float == Float with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(Int | Float, 'True') def test_int_union_float_expr_str_value(self): # Int | Float == Float with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(Int | Float, 'peanut') def test_simple_unions_with_int_value(self): for expr in self.exprs: with self.subTest(expr=expr): obs = parse_primitive(expr, '42') self.assertEqual(obs, 42) self.assertIsInstance(obs, int) def test_simple_unions_with_float_value(self): for expr in self.exprs: with self.subTest(expr=expr): obs = parse_primitive(expr, '42.1') self.assertEqual(obs, 42.1) self.assertIsInstance(obs, float) def test_simple_unions_with_bool_value(self): for expr in self.exprs: with self.subTest(expr=expr): obs = parse_primitive(expr, 'True') self.assertEqual(obs, True) self.assertIsInstance(obs, bool) def test_simple_unions_with_str_value(self): for expr in self.exprs: with self.subTest(expr=expr): obs = parse_primitive(expr, 'peanut') self.assertEqual(obs, 'peanut') self.assertIsInstance(obs, str) class TestParsePrimitiveCollectionsSimple(unittest.TestCase): def test_list_of_int(self): obs = parse_primitive(List[Int], ('1', '2', '3')) self.assertEqual(obs, [1, 2, 3]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], int) def test_list_of_int_bad_value_variant_a(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(List[Int], ('True', '2', '3')) def test_list_of_int_bad_value_variant_b(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(List[Int], ('1', '2', 'False')) def test_set_of_int(self): obs = parse_primitive(Set[Int], ('1', '2', '3')) self.assertEqual(obs, {1, 2, 3}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), int) def test_collection_of_int_given_list(self): obs = parse_primitive(Collection[Int], ('1', '2', '3')) self.assertEqual(obs, {'0': 1, '1': 2, '2': 3}) self.assertIsInstance(obs, dict) self.assertIsInstance(obs['0'], int) def test_collection_of_int_given_dict(self): obs = parse_primitive(Collection[Int], {'1': '1', '2': '2', '3': '3'}) self.assertEqual(obs, {'1': 1, '2': 2, '3': 3}) self.assertIsInstance(obs, dict) self.assertIsInstance(obs['1'], int) def test_list_of_float(self): obs = parse_primitive(List[Float], ('1.0', '2.0', '3.0')) self.assertEqual(obs, [1.0, 2.0, 3.0]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], float) def test_set_of_float(self): obs = parse_primitive(Set[Float], ('1.0', '2.0', '3.0')) self.assertEqual(obs, {1.0, 2.0, 3.0}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), float) def test_collection_of_float_given_list(self): obs = parse_primitive(Collection[Float], ('1.0', '2.0', '3.0')) self.assertEqual(obs, {'0': 1.0, '1': 2.0, '2': 3.0}) self.assertIsInstance(obs, dict) self.assertIsInstance(obs['0'], float) def test_collection_of_float_given_dict(self): obs = parse_primitive(Collection[Float], {'1': '1.0', '2': '2.0', '3': '3.0'}) self.assertEqual(obs, {'1': 1, '2': 2, '3': 3}) 
self.assertIsInstance(obs, dict) self.assertIsInstance(obs['1'], float) def test_list_of_bool(self): obs = parse_primitive(List[Bool], ('True', 'False', 'True')) self.assertEqual(obs, [True, False, True]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], bool) def test_set_of_bool(self): obs = parse_primitive(Set[Bool], ('True', 'False')) self.assertEqual(obs, {True, False}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), bool) def test_collection_of_bool_given_list(self): obs = parse_primitive(Collection[Bool], ('True', 'False', 'True')) self.assertEqual(obs, {'0': True, '1': False, '2': True}) self.assertIsInstance(obs, dict) self.assertIsInstance(obs['0'], bool) def test_collection_of_bool_given_dict(self): obs = parse_primitive(Collection[Bool], {'1': 'True', '2': 'False', '3': 'True'}) self.assertEqual(obs, {'1': True, '2': False, '3': True}) self.assertIsInstance(obs, dict) self.assertIsInstance(obs['1'], int) def test_list_of_str(self): obs = parse_primitive(List[Str], ('peanut', 'the', 'dog')) self.assertEqual(obs, ['peanut', 'the', 'dog']) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], str) def test_set_of_str(self): obs = parse_primitive(Set[Str], ('peanut', 'the', 'dog')) self.assertEqual(obs, {'peanut', 'the', 'dog'}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), str) def test_collection_of_str_given_list(self): obs = parse_primitive(Collection[Str], ('peanut', 'the', 'dog')) self.assertEqual(obs, {'0': 'peanut', '1': 'the', '2': 'dog'}) self.assertIsInstance(obs, dict) self.assertIsInstance(obs['0'], str) def test_collection_of_str_given_dict(self): obs = parse_primitive(Collection[Str], {'1': 'peanut', '2': 'the', '3': 'dog'}) self.assertEqual(obs, {'1': 'peanut', '2': 'the', '3': 'dog'}) self.assertIsInstance(obs, dict) self.assertIsInstance(obs['1'], str) # The next tests _aren't_ monomorphic, because unions of Int and Float # always yield a Float (List[Int] | List[Float] == List[Float]). 
def test_list_int_or_float_with_int_value(self): obs = parse_primitive(List[Int] | List[Float], ('1', '2', '3')) self.assertEqual(obs, [1, 2, 3]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], int) def test_set_int_or_float_with_int_value(self): obs = parse_primitive(Set[Int] | Set[Float], ('1', '2', '3')) self.assertEqual(obs, {1, 2, 3}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), int) def test_list_int_or_float_with_float_value(self): obs = parse_primitive(List[Int] | List[Float], ('1.1', '2.2', '3.3')) self.assertEqual(obs, [1.1, 2.2, 3.3]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], float) def test_set_int_or_float_with_float_value(self): obs = parse_primitive(Set[Int] | Set[Float], ('1.1', '2.2', '3.3')) self.assertEqual(obs, {1.1, 2.2, 3.3}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), float) def test_list_int_or_float_int_value(self): obs = parse_primitive(List[Int | Float], ('1', '2', '3')) self.assertEqual(obs, [1, 2, 3]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], int) def test_set_int_or_float_int_value(self): obs = parse_primitive(Set[Int | Float], ('1', '2', '3')) self.assertEqual(obs, {1, 2, 3}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), int) class TestParsePrimitiveCollectionsMonomorphic(unittest.TestCase): def test_list_int_or_bool_with_int_value(self): obs = parse_primitive(List[Int] | List[Bool], ('1', '2', '3')) self.assertEqual(obs, [1, 2, 3]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], int) def test_list_int_or_bool_with_bool_value(self): obs = parse_primitive(List[Int] | List[Bool], ('True', 'False', 'True')) self.assertEqual(obs, [True, False, True]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], bool) def test_list_int_or_bool_with_mixed_value_variant_a(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(List[Int] | List[Bool], ('True', '2', '3')) def test_list_int_or_bool_with_mixed_value_variant_b(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(List[Int] | List[Bool], ('1', '2', 'True')) def test_list_int_or_bool_with_mixed_value_variant_c(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(List[Int] | List[Bool], ('False', '2', 'True')) def test_set_int_or_bool_with_int_value(self): obs = parse_primitive(Set[Int] | Set[Bool], ('1', '2', '3')) self.assertEqual(obs, {1, 2, 3}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), int) def test_set_int_or_bool_with_bool_value(self): obs = parse_primitive(Set[Int] | Set[Bool], ('True', 'False')) self.assertEqual(obs, {True, False}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), bool) def test_list_int_or_str_with_int_value(self): obs = parse_primitive(List[Int] | List[Str], ('1', '2', '3')) self.assertEqual(obs, [1, 2, 3]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], int) def test_list_int_or_str_with_str_value(self): obs = parse_primitive(List[Int] | List[Str], ('peanut', 'the', 'dog')) self.assertEqual(obs, ['peanut', 'the', 'dog']) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], str) def test_list_int_or_str_with_mixed_value_variant_a(self): obs = parse_primitive(List[Int] | List[Str], ('1', 'the', 'dog')) self.assertEqual(obs, ['1', 'the', 'dog']) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], str) self.assertIsInstance(obs[1], str) def 
test_list_int_or_str_with_mixed_value_variant_b(self): obs = parse_primitive(List[Int] | List[Str], ('peanut', 'the', '1')) self.assertEqual(obs, ['peanut', 'the', '1']) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], str) self.assertIsInstance(obs[2], str) def test_set_int_or_str_with_int_value(self): obs = parse_primitive(Set[Int] | Set[Str], ('1', '2', '3')) self.assertEqual(obs, {1, 2, 3}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), int) def test_set_int_or_str_with_str_value(self): obs = parse_primitive(Set[Int] | Set[Str], ('peanut', 'the', 'dog')) self.assertEqual(obs, {'peanut', 'the', 'dog'}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), str) def test_list_float_or_bool_with_float_value(self): obs = parse_primitive(List[Float] | List[Bool], ('1.1', '2.2', '3.3')) self.assertEqual(obs, [1.1, 2.2, 3.3]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], float) def test_list_float_or_bool_with_bool_value(self): obs = parse_primitive(List[Float] | List[Bool], ('True', 'False', 'True')) self.assertEqual(obs, [True, False, True]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], bool) def test_list_float_or_bool_with_mixed_value_variant_a(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(List[Float] | List[Bool], ('1.1', 'False', 'True')) def test_list_float_or_bool_with_mixed_value_variant_b(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(List[Float] | List[Bool], ('True', 'False', '3.3')) def test_set_float_or_bool_with_float_value(self): obs = parse_primitive(Set[Float] | Set[Bool], ('1.1', '2.2', '3.3')) self.assertEqual(obs, {1.1, 2.2, 3.3}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), float) def test_set_float_or_bool_with_bool_value(self): obs = parse_primitive(Set[Float] | Set[Bool], ('True', 'False', 'True')) self.assertEqual(obs, {True, False}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), bool) def test_list_float_or_str_with_float_value(self): obs = parse_primitive(List[Float] | List[Str], ('1.1', '2.2', '3.3')) self.assertEqual(obs, [1.1, 2.2, 3.3]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], float) def test_list_float_or_str_with_str_value(self): obs = parse_primitive(List[Float] | List[Str], ('peanut', 'the', 'dog')) self.assertEqual(obs, ['peanut', 'the', 'dog']) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], str) def test_list_float_or_str_with_mixed_value_variant_a(self): obs = parse_primitive(List[Float] | List[Str], ('1.1', 'the', 'dog')) self.assertEqual(obs, ['1.1', 'the', 'dog']) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], str) def test_list_float_or_str_with_mixed_value_variant_b(self): obs = parse_primitive(List[Float] | List[Str], ('peanut', 'the', '3.3')) self.assertEqual(obs, ['peanut', 'the', '3.3']) self.assertIsInstance(obs, list) self.assertIsInstance(obs[-1], str) def test_set_float_or_str_with_float_value(self): obs = parse_primitive(Set[Float] | Set[Str], ('1.1', '2.2', '3.3')) self.assertEqual(obs, {1.1, 2.2, 3.3}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), float) def test_set_float_or_str_with_str_value(self): obs = parse_primitive(Set[Float] | Set[Str], ('peanut', 'the', 'dog')) self.assertEqual(obs, {'peanut', 'the', 'dog'}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), str) def test_list_bool_or_str_with_bool_value(self): obs = parse_primitive(List[Bool] | List[Str], 
('True', 'False', 'True')) self.assertEqual(obs, [True, False, True]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], bool) def test_list_bool_or_str_with_str_value(self): obs = parse_primitive(List[Bool] | List[Str], ('peanut', 'the', 'dog')) self.assertEqual(obs, ['peanut', 'the', 'dog']) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], str) def test_list_bool_or_str_with_mixed_value_variant_a(self): obs = parse_primitive(List[Bool] | List[Str], ('True', 'the', 'dog')) self.assertEqual(obs, ['True', 'the', 'dog']) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], str) def test_list_bool_or_str_with_mixed_value_variant_b(self): obs = parse_primitive(List[Bool] | List[Str], ('peanut', 'the', 'True')) self.assertEqual(obs, ['peanut', 'the', 'True']) self.assertIsInstance(obs, list) self.assertIsInstance(obs[-1], str) def test_set_bool_or_str_with_bool_value(self): obs = parse_primitive(Set[Bool] | Set[Str], ('True', 'False', 'True')) self.assertEqual(obs, {True, False}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), bool) def test_set_bool_or_str_with_str_value(self): obs = parse_primitive(Set[Bool] | Set[Str], ('peanut', 'the', 'dog')) self.assertEqual(obs, {'peanut', 'the', 'dog'}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), str) def test_list_bool_or_str_with_mixed_value(self): obs = parse_primitive(List[Bool] | List[Str], ('peanut', 'the', 'True')) self.assertEqual(obs, ['peanut', 'the', 'True']) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], str) self.assertIsInstance(obs[-1], str) class TestParsePrimitiveCollectionsComposite(unittest.TestCase): def test_list_int_or_bool_with_int_value(self): obs = parse_primitive(List[Int | Bool], ('1', '2', '3')) self.assertEqual(obs, [1, 2, 3]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], int) def test_list_int_or_bool_with_float_value(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(List[Int | Bool], ('1.1', '2.2', '3.3')) def test_list_int_or_bool_with_bool_value(self): obs = parse_primitive(List[Int | Bool], ('True', 'False', 'True')) self.assertEqual(obs, [True, False, True]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], bool) def test_list_int_or_bool_with_str_value(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(List[Int | Bool], ('peanut', 'the', 'dog')) def test_list_int_or_bool_with_mixed_value(self): obs = parse_primitive(List[Int | Bool], ('1', 'False', '2', 'True')) self.assertEqual(obs, [1, False, 2, True]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], int) self.assertIsInstance(obs[1], bool) def test_list_int_or_bool_with_mixed_value_variant_a(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(List[Int | Bool], ('peanut', 'False', '2', 'True')) def test_list_int_or_bool_with_mixed_value_variant_b(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(List[Int | Bool], ('1', 'False', '2', 'peanut')) def test_list_int_or_bool_with_bad_mix_value(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(List[Int | Bool], ('1', 'True', 'dog')) def test_set_int_or_bool_with_int_value(self): obs = parse_primitive(Set[Int | Bool], ('1', '2', '3')) self.assertEqual(obs, {1, 2, 3}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), int) def test_set_int_or_bool_with_bool_value(self): obs = parse_primitive(Set[Int | Bool], 
('True', 'False', 'True')) self.assertEqual(obs, {True, False}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), bool) def test_set_int_or_bool_with_mixed_value(self): obs = parse_primitive(Set[Int | Bool], ('1', 'False', '2', 'True')) self.assertEqual(obs, {1, False, 2, True}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), bool) self.assertIsInstance(obs.pop(), int) def test_list_int_or_str_with_int_value(self): obs = parse_primitive(List[Int | Str], ('1', '2', '3')) self.assertEqual(obs, [1, 2, 3]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], int) def test_list_int_or_str_with_str_value(self): obs = parse_primitive(List[Int | Str], ('peanut', 'the', 'dog')) self.assertEqual(obs, ['peanut', 'the', 'dog']) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], str) def test_list_int_or_str_with_mixed_value_variant_a(self): obs = parse_primitive(List[Int | Str], ('1', 'the', 'dog')) self.assertEqual(obs, [1, 'the', 'dog']) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], int) self.assertIsInstance(obs[1], str) def test_list_int_or_str_with_mixed_value_variant_b(self): obs = parse_primitive(List[Int | Str], ('peanut', 'the', '1')) self.assertEqual(obs, ['peanut', 'the', 1]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], str) self.assertIsInstance(obs[2], int) def test_set_int_or_str_with_int_value(self): obs = parse_primitive(Set[Int | Str], ('1', '2', '3')) self.assertEqual(obs, {1, 2, 3}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), int) def test_set_int_or_str_with_str_value(self): obs = parse_primitive(Set[Int | Str], ('peanut', 'the', 'dog')) self.assertEqual(obs, {'peanut', 'the', 'dog'}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), str) def test_set_int_or_str_with_mixed_value(self): obs = parse_primitive(Set[Int | Str], ('1', 'the', '2', 'dog')) self.assertEqual(obs, {1, 'the', 2, 'dog'}) self.assertIsInstance(obs, set) def test_list_float_or_bool_with_float_value(self): obs = parse_primitive(List[Float | Bool], ('1.1', '2.2', '3.3')) self.assertEqual(obs, [1.1, 2.2, 3.3]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], float) def test_list_float_or_bool_with_bool_value(self): obs = parse_primitive(List[Float | Bool], ('True', 'False', 'True')) self.assertEqual(obs, [True, False, True]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], bool) def test_list_float_or_bool_with_mixed_value_variant_a(self): obs = parse_primitive(List[Float | Bool], ('True', '2.2', '3.3')) self.assertEqual(obs, [True, 2.2, 3.3]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], bool) self.assertIsInstance(obs[1], float) def test_list_float_or_bool_with_mixed_value_variant_b(self): obs = parse_primitive(List[Float | Bool], ('1.1', '2.2', 'False')) self.assertEqual(obs, [1.1, 2.2, False]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], float) self.assertIsInstance(obs[-1], bool) def test_list_float_or_bool_with_bad_mix_value(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(List[Float | Bool], ('1.1', '2.2', 'peanut')) def test_set_float_or_bool_with_float_value(self): obs = parse_primitive(Set[Float | Bool], ('1.1', '2.2', '3.3')) self.assertEqual(obs, {1.1, 2.2, 3.3}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), float) def test_set_float_or_bool_with_bool_value(self): obs = parse_primitive(Set[Float | Bool], ('True', 'False', 'True')) self.assertEqual(obs, {True, False}) 
self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), bool) def test_list_float_or_str_with_float_value(self): obs = parse_primitive(List[Float | Str], ('1.1', '2.2', '3.3')) self.assertEqual(obs, [1.1, 2.2, 3.3]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], float) def test_list_float_or_str_with_str_value(self): obs = parse_primitive(List[Float | Str], ('peanut', 'the', 'dog')) self.assertEqual(obs, ['peanut', 'the', 'dog']) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], str) def test_list_float_or_str_with_mixed_value_variant_a(self): obs = parse_primitive(List[Float | Str], ('peanut', '2.2', '3.3')) self.assertEqual(obs, ['peanut', 2.2, 3.3]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], str) self.assertIsInstance(obs[1], float) def test_list_float_or_str_with_mixed_value_variant_b(self): obs = parse_primitive(List[Float | Str], ('1.1', '2.2', 'dog')) self.assertEqual(obs, [1.1, 2.2, 'dog']) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], float) self.assertIsInstance(obs[-1], str) def test_set_float_or_str_with_float_value(self): obs = parse_primitive(Set[Float | Str], ('1.1', '2.2', '3.3')) self.assertEqual(obs, {1.1, 2.2, 3.3}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), float) def test_set_float_or_str_with_str_value(self): obs = parse_primitive(Set[Float | Str], ('peanut', 'the', 'dog')) self.assertEqual(obs, {'peanut', 'the', 'dog'}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), str) def test_list_bool_or_str_with_bool_value(self): obs = parse_primitive(List[Bool | Str], ('True', 'False', 'True')) self.assertEqual(obs, [True, False, True]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], bool) def test_list_bool_or_str_with_str_value(self): obs = parse_primitive(List[Bool | Str], ('peanut', 'the', 'dog')) self.assertEqual(obs, ['peanut', 'the', 'dog']) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], str) def test_list_bool_or_str_with_mixed_value_variant_a(self): obs = parse_primitive(List[Bool | Str], ('True', 'the', 'dog')) self.assertEqual(obs, [True, 'the', 'dog']) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], bool) self.assertIsInstance(obs[-1], str) def test_list_bool_or_str_with_mixed_value_variant_b(self): obs = parse_primitive(List[Bool | Str], ('peanut', 'the', 'True')) self.assertEqual(obs, ['peanut', 'the', True]) self.assertIsInstance(obs, list) self.assertIsInstance(obs[0], str) self.assertIsInstance(obs[-1], bool) def test_set_bool_or_str_with_bool_value(self): obs = parse_primitive(Set[Bool | Str], ('True', 'False', 'True')) self.assertEqual(obs, {True, False}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), bool) def test_set_bool_or_str_with_str_value(self): obs = parse_primitive(Set[Bool | Str], ('peanut', 'the', 'dog')) self.assertEqual(obs, {'peanut', 'the', 'dog'}) self.assertIsInstance(obs, set) self.assertIsInstance(obs.pop(), str) class TestParsePrimitiveCollectionsComplex(unittest.TestCase): def test_list_int_bool_or_list_float_with_bool_int_value(self): obs = parse_primitive(List[Int | Bool] | List[Float], ('1', '2', 'True', 'False')) self.assertEqual(obs, [1, 2, True, False]) def test_list_int_bool_or_list_float_with_float_value(self): obs = parse_primitive(List[Int | Bool] | List[Float], ('1.1', '2.2', '3.3', '4.4')) self.assertEqual(obs, [1.1, 2.2, 3.3, 4.4]) def test_list_int_bool_or_list_float_with_bad_value(self): with self.assertRaisesRegex(ValueError, 'Could not 
coerce'): parse_primitive(List[Int | Bool] | List[Float], ('1', '2.2', 'True', 'False')) def test_list_int_str_or_list_float_with_str_int_value(self): obs = parse_primitive(List[Int | Str] | List[Float], ('1', '2', 'peanut', 'the')) self.assertEqual(obs, [1, 2, 'peanut', 'the']) def test_list_int_str_or_list_float_with_float_value(self): obs = parse_primitive(List[Int | Str] | List[Float], ('1.1', '2.2', '3.3', '4.4')) self.assertEqual(obs, [1.1, 2.2, 3.3, 4.4]) def test_list_int_str_or_list_float_str_with_float_value(self): obs = parse_primitive(List[Int | Str] | List[Float | Str], ('1.1', '2.2', '3.3', '4.4')) self.assertEqual(obs, [1.1, 2.2, 3.3, 4.4]) def test_list_int_str_or_list_float_str_bool_with_float_value(self): obs = parse_primitive(List[Int | Str] | List[Float | Str | Bool], ('1.1', '2.2', '3.3', '4.4')) self.assertEqual(obs, [1.1, 2.2, 3.3, 4.4]) def test_list_int_str_or_list_float_str_bool_with_float_str_value(self): obs = parse_primitive(List[Int | Str] | List[Float | Str | Bool], ('1.1', '2.2', 'the', 'peanut')) self.assertEqual(obs, [1.1, 2.2, 'the', 'peanut']) def test_list_int_str_or_list_float_str_bool_with_float_bool_value(self): obs = parse_primitive(List[Int | Str] | List[Float | Str | Bool], ('1.1', '2.2', 'True', 'False')) self.assertEqual(obs, [1.1, 2.2, True, False]) def test_list_int_str_or_list_float_with_mixed_value(self): obs = parse_primitive(List[Int | Str] | List[Float], ('1.1', '2', 'True', 'peanut')) self.assertEqual(obs, ['1.1', 2, 'True', 'peanut']) def test_list_float_bool_or_list_str_with_float_bool_value(self): obs = parse_primitive(List[Float | Bool] | List[Int], ('1', '2', 'True', 'False')) self.assertEqual(obs, [1, 2, True, False]) def test_list_float_bool_or_list_str_with_int_value(self): obs = parse_primitive(List[Float | Bool] | List[Int], ('1', '2', '3', '4')) self.assertEqual(obs, [1, 2, 3, 4]) def test_list_float_bool_or_list_str_with_bad_value(self): with self.assertRaisesRegex(ValueError, 'Could not coerce'): parse_primitive(List[Float | Bool] | List[Int], ('1', '2.2', 'True', 'peanut')) def test_set_int_bool_or_list_float_with_bool_int_value(self): obs = parse_primitive(Set[Int | Bool] | Set[Float], ('1', '2', 'True', 'False')) self.assertEqual(obs, {1, 2, True, False}) if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/core/type/util.py000066400000000000000000000203261462552636000175740ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import collections from qiime2.core.util import tuplize from qiime2.core.type.collection import List, Set, Collection from qiime2.core.type.primitive import Int, Float, Bool, Str, Jobs, Threads from qiime2.core.type.grammar import UnionExp, _ExpBase, IntersectionExp from qiime2.core.type.parse import ast_to_type def _strip_predicates(expr): if isinstance(expr, UnionExp): return UnionExp(_strip_predicates(m) for m in expr.members).normalize() if hasattr(expr, 'fields'): new_fields = tuple(_strip_predicates(f) for f in expr.fields) return expr.duplicate(fields=new_fields, predicate=IntersectionExp()) def val_to_bool(value): if type(value) is bool: return value elif str(value).lower() == 'true': return True elif str(value).lower() == 'false': return False else: raise ValueError('Could not cast to bool') def val_to_int(v): type_ = type(v) if type_ is int: return v elif type_ is str: return int(v) else: raise ValueError('Could not cast to int') def val_to_float(v): type_ = type(v) if type_ is float: return v elif type_ is str: return float(v) else: raise ValueError('Could not cast to float') VariadicRecord = collections.namedtuple('VariadicRecord', ['pytype', 'q2type']) _VARIADIC = { 'List': VariadicRecord(pytype=list, q2type=List), 'Set': VariadicRecord(pytype=set, q2type=Set), 'Collection': VariadicRecord(pytype=dict, q2type=Collection), } CoercionRecord = collections.namedtuple('CoercionRecord', ['func', 'pytype']) # Beware visitor, order matters in this here mapper _COERCION_MAPPER = { Int: CoercionRecord(pytype=int, func=val_to_int), Float: CoercionRecord(pytype=float, func=val_to_float), Bool: CoercionRecord(pytype=bool, func=val_to_bool), Str: CoercionRecord(pytype=str, func=str), } _COERCE_ERROR = ValueError( 'Could not coerce value based on expression provided.') CollectionStyle = collections.namedtuple( 'CollectionStyle', ['style', 'members', 'view', 'expr', 'base']) def _norm_input(t): if type(t) is dict: return ast_to_type(t) elif not isinstance(t, _ExpBase): raise TypeError("%r is not a QIIME 2 type" % (t,)) return t def is_qiime_type(t): try: _norm_input(t) except Exception: return False else: return True def is_primitive_type(t): expr = _norm_input(t) return hasattr(expr, 'kind') and expr.kind == 'primitive' def is_metadata_type(t): expr = _norm_input(t) return is_primitive_type(t) and expr.name.startswith('Metadata') def is_metadata_column_type(t): expr = _norm_input(t) return is_primitive_type(t) and expr.name.endswith('MetadataColumn') def is_semantic_type(t): expr = _norm_input(t) return hasattr(expr, 'kind') and expr.kind == 'semantic-type' def is_visualization_type(t): expr = _norm_input(t) return hasattr(expr, 'kind') and expr.kind == 'visualization' def is_union(t): expr = _norm_input(t) return isinstance(expr, UnionExp) def is_collection_type(t): expr = _norm_input(t) if expr.name in _VARIADIC: return True if is_union(expr): for m in expr.members: if m.name in _VARIADIC: return True return False def is_parallel_type(t): expr = _norm_input(t) return is_primitive_type(t) and expr in (Jobs, Threads) def interrogate_collection_type(t): expr = _norm_input(t) style = None # simple, monomorphic, composite, complex members = None # T , [T1, T2] , [T1, T2], [[T1], [T2, T3]] view = None # set, list base = None if expr.name in _VARIADIC: view, base = _VARIADIC[expr.name] field, = expr.fields if isinstance(field, UnionExp): style = 'composite' members = list(field.members) else: style = 'simple' members = 
field elif isinstance(expr, UnionExp): if expr.members[0].name in _VARIADIC: members = [] for member in expr.members: field, = member.fields if isinstance(field, UnionExp): style = 'complex' members.append(list(field.members)) else: members.append([field]) if style != 'complex': style = 'monomorphic' # use last iteration view, base = _VARIADIC[member.name] if style == 'monomorphic': members = [m[0] for m in members] return CollectionStyle(style=style, members=members, view=view, expr=expr, base=base) def _ordered_coercion(types): types = tuple(types) return tuple(k for k in _COERCION_MAPPER.keys() if k in types) def _interrogate_types(allowed, value): ordered_allowed = _ordered_coercion(allowed) for coerce_type in (_COERCION_MAPPER[x].func for x in ordered_allowed): try: return coerce_type(value) except ValueError: pass raise _COERCE_ERROR def parse_primitive(t, value): expr = _norm_input(t) result = [] allowed = None homogeneous = True keys = None if isinstance(value, dict): keys = list(value.keys()) value = list(value.values()) if is_metadata_type(expr): raise ValueError('%r may not be parsed with this util.' % (expr,)) expr = _strip_predicates(expr) collection_style = interrogate_collection_type(expr) if collection_style.style in ('simple', 'monomorphic', 'composite'): allowed = list(collection_style.members) if collection_style.style == 'composite': homogeneous = False elif collection_style.style == 'complex': # Sort here so that we can start with any simple lists in the memberset for subexpr in sorted(collection_style.members, key=len): expr = collection_style.base[UnionExp(subexpr)] try: return parse_primitive(expr, value) except ValueError: pass raise _COERCE_ERROR elif collection_style.style is None: value = tuplize(value) if expr in (Int, Float, Bool, Str): # No sense in walking over all options when we know # what it should be allowed = [expr] else: allowed = list(_COERCION_MAPPER.keys()) else: pass assert allowed is not None # Int <= Float, make sure its added in if Float in allowed and Int not in allowed: allowed.append(Int) for v in value: result.append(_interrogate_types(allowed, v)) # Some exprs require homogeneous values, make it so if homogeneous: all_matching = False for member in allowed: if all(type(x) is _COERCION_MAPPER[member].pytype for x in result): all_matching = True break if not all_matching and collection_style.style == 'monomorphic': for subexpr in allowed: expr = collection_style.base[subexpr] try: return parse_primitive(expr, value) except ValueError: pass raise _COERCE_ERROR if collection_style.view is None: return result[0] else: # If we are supposed to have a dict we need to give it key, value # pairs. We end up here when we invoke a command that takes a # Collection on the command line if collection_style.view is dict: # If we have keys, value was originally a dict, and we want to # reattach the keys to the values if keys is not None: return {k: v for k, v in zip(keys, result)} else: return {str(k): v for k, v in enumerate(result)} return collection_style.view(result) qiime2-2024.5.0/qiime2/core/type/visualization.py000066400000000000000000000020231462552636000215120ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- from qiime2.core.type.template import TypeTemplate class _Visualization(TypeTemplate): def get_kind(self): return "visualization" def __eq__(self, other): return type(self) is type(other) def get_field_names(self): return [] def get_name(self): return "Visualization" def is_element(self, value): import qiime2.sdk return isinstance(value, qiime2.sdk.Visualization) def validate_field(self, name, field): raise TypeError def get_union_membership_expr(self, self_expr): return None def validate_predicate(self, predicate, expr): raise TypeError Visualization = _Visualization() qiime2-2024.5.0/qiime2/core/util.py000066400000000000000000000316301462552636000166130ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import contextlib import warnings import hashlib import stat import os import io import collections import uuid as _uuid import yaml import zipfile import pathlib import shutil import subprocess import decorator READ_ONLY_FILE = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH READ_ONLY_DIR = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH | stat.S_IRUSR \ | stat.S_IRGRP | stat.S_IROTH USER_GROUP_RWX = stat.S_IRWXU | stat.S_IRWXG OTHER_NO_WRITE = stat.S_IRWXU | stat.S_IRWXG | stat.S_IROTH | stat.S_IXOTH def get_view_name(view): from .format import FormatBase if not isinstance(view, type): view = view.__class__ if issubclass(view, FormatBase): # Not qualname because we don't have a notion of "nested" formats return view.__name__ return ':'.join([view.__module__, view.__qualname__]) def tuplize(x): if type(x) is not tuple: return (x,) return x def overrides(cls): def decorator(func): if not hasattr(cls, func.__name__): raise AssertionError("%r does not override %r" % (func, cls.__name__)) return func return decorator def superscript(number): table = { '0': chr(8304), '1': chr(185), '2': chr(178), '3': chr(179), **{str(i): chr(x) for i, x in enumerate(range(8308, 8314), 4)}, 'a': chr(7491), 'e': chr(7497), 'f': chr(7584), 'i': chr(8305), 'n': chr(8319), '-': chr(8315), '.': chr(39), ',': chr(39) } return ''.join([table[d] for d in str(number)]) def find_duplicates(iterable): """Find duplicate values in an iterable. Parameters ---------- iterable : iterable Iterable to search for duplicates. Returns ------- set Values that are duplicated in `iterable`. Notes ----- Values in `iterable` must be hashable. """ # Modified from https://stackoverflow.com/a/9835819/3776794 to return # duplicates instead of remove duplicates from an iterable. seen = set() duplicates = set() for value in iterable: if value in seen: duplicates.add(value) else: seen.add(value) return duplicates # Concept from: http://stackoverflow.com/a/11157649/579416 def duration_time(relative_delta): attrs = ['years', 'months', 'days', 'hours', 'minutes', 'seconds', 'microseconds'] results = [] for attr in attrs: value = getattr(relative_delta, attr) if value != 0: if value == 1: # Remove plural 's' attr = attr[:-1] results.append("%d %s" % (value, attr)) if results: text = results[-1] if results[:-1]: text = ', and '.join([', '.join(results[:-1]), text]) return text else: # Great Scott! No time has passed! 
return '0 %s' % attrs[-1] def has_md5sum_native(): return shutil.which('md5sum') is not None def md5sum(filepath): if has_md5sum_native(): return md5sum_native(filepath) else: return md5sum_python(filepath) def md5sum_python(filepath): md5 = hashlib.md5() with open(str(filepath), mode='rb') as fh: for chunk in iter(lambda: fh.read(io.DEFAULT_BUFFER_SIZE), b""): md5.update(chunk) return md5.hexdigest() def md5sum_native(filepath): result = subprocess.run(['md5sum', str(filepath)], check=True, capture_output=True, text=True) _, digest = from_checksum_format(result.stdout) return digest def md5sum_zip(zf: zipfile.ZipFile, filepath: str) -> str: """ Given a ZipFile object and relative filepath within the zip archive, returns the md5sum of the file """ md5 = hashlib.md5() with zf.open(filepath) as fh: for chunk in iter(lambda: fh.read(io.DEFAULT_BUFFER_SIZE), b""): md5.update(chunk) return md5.hexdigest() def md5sum_directory(directory): if has_md5sum_native(): md5sum = md5sum_native else: md5sum = md5sum_python directory = str(directory) sums = collections.OrderedDict() for root, dirs, files in os.walk(directory, topdown=True): dirs[:] = sorted([d for d in dirs if not d[0] == '.']) for file in sorted(files): if file[0] == '.': continue path = os.path.join(root, file) sums[os.path.relpath(path, start=directory)] = md5sum(path) return sums def md5sum_directory_zip(zf: zipfile.ZipFile) -> dict: """ Returns a mapping of fp/checksum pairs for all files in zf. The root dir has been removed from these filepaths. This mimics the output in checksums.md5 (without sorted descent), but is not generalizable beyond QIIME 2 archives. """ sums = dict() for file in zf.namelist(): fp = pathlib.Path(file) if fp.name != 'checksums.md5': file_parts = list(fp.parts) fp_w_o_root_uuid = pathlib.Path(*(file_parts[1:])) sums[str(fp_w_o_root_uuid)] = md5sum_zip(zf, file) return sums def to_checksum_format(filepath, checksum): # see https://www.gnu.org # /software/coreutils/manual/html_node/md5sum-invocation.html if '\\' in filepath or '\n' in filepath: filepath = filepath.replace('\\', '\\\\').replace('\n', '\\n') checksum = '\\' + checksum return '%s %s' % (checksum, filepath) def from_checksum_format(line): line = line.rstrip('\n') parts = line.split(' ', 1) if len(parts) < 2: parts = line.split(' *', 1) checksum, filepath = parts if checksum[0] == '\\': chars = '' escape = False # Gross, but regular `.replace` will overlap with itself and # negative lookbehind in regex is *probably* harder than scanning for char in filepath: # 1) Escape next character if not escape and char == '\\': escape = True continue # 2) Handle escape sequence if escape: try: chars += {'\\': '\\', 'n': '\n'}[char] except KeyError: chars += '\\' + char # Wasn't an escape after all escape = False continue # 3) Nothing interesting chars += char checksum = checksum[1:] filepath = chars return filepath, checksum @contextlib.contextmanager def warning(): def _warnformat(msg, category, filename, lineno, file=None, line=None): return '%s:%s: %s: %s\n' % (filename, lineno, category.__name__, msg) default_warn_format = warnings.formatwarning try: warnings.formatwarning = _warnformat warnings.filterwarnings('always') yield warnings.warn finally: warnings.formatwarning = default_warn_format # Descriptor protocol for creating an attribute that is bound to an # (arbitrarily nested) attribute accessible to the instance at runtime. 
class LateBindingAttribute: def __init__(self, attribute): self._attribute = attribute def __get__(self, obj, cls=None): attrs = self._attribute.split('.') curr_attr = obj for attr in attrs: curr_attr = getattr(curr_attr, attr) return staticmethod(curr_attr).__get__(obj, cls) # Removes the first parameter from a callable's signature. class DropFirstParameter(decorator.FunctionMaker): @classmethod def from_function(cls, function): return cls.create(function, "return None", {}) def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self.signature = self._remove_first_arg(self.signature) self.shortsignature = self._remove_first_arg(self.shortsignature) def _remove_first_arg(self, string): return ",".join(string.split(',')[1:])[1:] def _immutable_error(obj, *args): raise TypeError('%s is immutable.' % obj.__class__.__name__) class ImmutableBase: def _freeze_(self): """Disables __setattr__ when called. It is idempotent.""" self._frozen = True # The particular value doesn't matter __delattr__ = __setitem__ = __delitem__ = _immutable_error def __setattr__(self, *args): # This doesn't stop silly things like # object.__setattr__(obj, ...), but that's a pretty rude thing # to do anyways. We are just trying to avoid accidental mutation. if hasattr(self, '_frozen'): _immutable_error(self) super().__setattr__(*args) def sorted_poset(iterable, *, key=None, reverse=False): values = list(iterable) elements = values if key is not None: elements = [key(x) for x in values] result = [] sorted_elements = [] for value, element in zip(values, elements): idx = 0 for idx, placed in enumerate(sorted_elements, 1): if element <= placed: idx -= 1 break result.insert(idx, value) sorted_elements.insert(idx, element) if reverse: result = list(reversed(result)) return result def is_uuid4(uuid_str): # Adapted from https://gist.github.com/ShawnMilo/7777304 try: uuid = _uuid.UUID(hex=uuid_str, version=4) except ValueError: # The string is not a valid hex code for a UUID. return False # If uuid_str is a valid hex code, but an invalid uuid4, UUID.__init__ # will convert it to a valid uuid4. return str(uuid) == uuid_str def set_permissions(path, file_permissions=None, dir_permissions=None, skip_root=False): """Set permissions on all directories and files under and including path """ # Panfs is currently causing issues for us setting permissions. We still # want to set rwx for user and group before we remove things to ensure we # can remove them, but we want to temporarily no-op other permission # changes if file_permissions != USER_GROUP_RWX: file_permissions = None if dir_permissions != USER_GROUP_RWX: dir_permissions = None # Just get out if we aren't doing anything if file_permissions is None and dir_permissions is None: return for directory, _, files in os.walk(path): # We may want to set permissions under a directory but not on the # directory itself. 
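        # Illustrative: set_permissions(path, None, USER_GROUP_RWX,
        # skip_root=True) opens up every directory below `path` for
        # user/group while leaving the mode of `path` itself untouched.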
if dir_permissions and not (skip_root and directory == str(path)): try: os.chmod(directory, dir_permissions) except FileNotFoundError: pass for file in files: if file_permissions: try: os.chmod(os.path.join(directory, file), file_permissions) except FileNotFoundError: pass def touch_under_path(path): """Touches everything under a given path to ensure they don't get culled by Mac """ for directory, _, files in os.walk(path): try: os.utime(directory, None, follow_symlinks=False) except FileNotFoundError: pass for file in files: try: os.utime( os.path.join(directory, file), None, follow_symlinks=False) except FileNotFoundError: pass def load_action_yaml(path): """Takes a path to an unzipped Aritfact and loads its action.yaml with yaml.safe_load """ # TODO: Make these actually do something useful at least for the tags # that are relevant to what we need out of provenance (this is partially # done) def ref_constructor(loader, node): # We only care about the name of the thing we are referencing which # is at the end of this list return node.value.split(':')[-1] def cite_constructor(loader, node): return node.value def metadata_constructor(loader, node): # Use the md5sum of the metadata as its identifier, so we can tell # if two artifacts used the same metadata input metadata_path = prov_path / node.value return md5sum(metadata_path) yaml.constructor.SafeConstructor.add_constructor('!ref', ref_constructor) yaml.constructor.SafeConstructor.add_constructor('!cite', cite_constructor) yaml.constructor.SafeConstructor.add_constructor( '!metadata', metadata_constructor) prov_path = path / 'provenance' / 'action' action_path = prov_path / 'action.yaml' with open(action_path) as fh: prov = yaml.safe_load(fh) return prov def create_collection_name(*, name, key, idx, size): """ Only accepts kwargs. Creates a name for a collection item in a standardized way. Assumes 0 based indexing. """ return [name, key, f'{idx + 1}/{size}'] qiime2-2024.5.0/qiime2/core/validate.py000066400000000000000000000142141462552636000174260ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- from qiime2.core.exceptions import ValidationError, ImplementationError from qiime2.core.transform import ModelType from qiime2.core.util import sorted_poset class ValidationObject: r""" Store, sort and run all semantic validators for a for a single, complete semantic type(a `concrete type`). Attributes ---------- concrete_type: SemanticType The semantic type for which the validators are valid for. """ def __init__(self, concrete_type): r""" Create a new ValidationObject to add ValidatorRecords to. Parameters ---------- concrete_type: semantic type The single, complete semantic type that the validators are to be associated with. """ # Private Attributes # ------------------ # _validators: list # A list of ValidatorRecords # _is_sorted: Bool # Tracks whether or not `_validators` has been sorted or not. self._validators = [] self.concrete_type = concrete_type self._is_sorted = False def add_validator(self, validator_record): r""" Adds new validator record to plugin. Parameters ---------- validator_record: ValidatorRecord ValidatorRecord is a collections.namedtuple found in `qiime2/plugin/plugin.py`. 
Notes ----- Used by Plugin to add a `ValidatorRecord` for a new validator to a plugin. Usually called through the `register_validator` decorator. """ self._validators.append(validator_record) self._is_sorted = False def add_validation_object(self, *others): r""" Incorporates another validation object of the same concrete type. Parameters ---------- *others: Any number of validation objects of the same concrete type. Notes ----- Used to combine validation objects from different plugins. This is done non-heirarchically by `PluginManager` by creating a new, blank object for each `concrete_type` that it encounters, then adds the objects from each plugin. """ for other in others: if self.concrete_type != other.concrete_type: raise TypeError('Unable to add ValidationObject of' ' `concrete_type: %s to ValidationObject of' ' `concrete_type: %s`' % (other.concrete_type, self.concrete_type)) self._validators += other._validators self._is_sorted = False @property def validators(self) -> list: r""" Public access method for the validators stored in ValidationObject. Returns ------- list A sorted list of validator records. """ if not self._is_sorted: self._sort_validators() return self._validators def _sort_validators(self): r""" Sorts validators Notes ----- A partial order sort of the validators. The runtime for this sort is :math:`\theta(n^2)`. This is not a concern, as the number of validators present for any particular type is expected to remain trivially low. The validators are sorted from general to specific. """ self._validators = sorted_poset( iterable=self._validators, key=lambda record: record.context, reverse=True) self._is_sorted = True def __call__(self, data, level): r""" Validates that provided data meets the conditions of a semantic type. Parameters ---------- data: A view of the data to be validated. level: {'min', 'max'} specifies the level validation occurs at. Notes ----- Use of `level` is required but the behaviour is defined in the individual validators. """ from_mt = ModelType.from_view_type(type(data)) for record in self.validators: to_mt = ModelType.from_view_type(record.view) transformation = from_mt.make_transformation(to_mt) new_data = transformation(data) try: record.validator(data=new_data, level=level) except ValidationError: raise except Exception as e: raise ImplementationError("An unexpected error occured when %r" " from %r attempted to validate %r" % (record.validator.__name__, record.plugin, new_data)) from e def assert_transformation_available(self, data): r""" Checks that required transformations exist. Parameters ---------- data: view view type of input data. Raises ------ AssertionError If no transformation exists from the data view to the view expected by a particular validator. Notes ----- Called by `qiime2.sdk.PluginManager._consistency_check` to ensure the transformers required to run the validators are defined. 
""" mt = ModelType.from_view_type(data) for record in self._validators: mt_other = ModelType.from_view_type(record.view) if not mt.has_transformation(mt_other): raise AssertionError( 'Could not validate %s using %r because there was no' ' transformation from %r to %r' % (self.concrete_type, record.validator.__name__, mt._view_name, mt_other._view_name) ) qiime2-2024.5.0/qiime2/jupyter/000077500000000000000000000000001462552636000160335ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/jupyter/__init__.py000066400000000000000000000007511462552636000201470ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- from .hooks import load_jupyter_server_extension from .template import make_html __all__ = ['make_html', 'load_jupyter_server_extension'] qiime2-2024.5.0/qiime2/jupyter/handlers.py000066400000000000000000000036651462552636000202170ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import os import pathlib import tornado.web as web from notebook.base.handlers import IPythonHandler from qiime2.core.archive.archiver import ArchiveCheck class QIIME2RedirectHandler(IPythonHandler): """Add a location to location_store for later retrieval""" def initialize(self, result_store): self.result_store = result_store def get(self): location = self.get_query_argument('location') if not os.path.exists(location): # Client DOM should explain that the user should re-run the cell self.send_error(409) # Conflict return # is it actually a QIIME 2 result, or a random part of the filesystem archive = ArchiveCheck(pathlib.Path(location)) self.result_store[archive.uuid] = os.path.join(location, 'data') self.redirect('view/%s/' % archive.uuid) class QIIME2ResultHandler(web.StaticFileHandler): def initialize(self, path, default_filename): super().initialize(path, default_filename) self.result_store = path # path is actually result_store @classmethod def get_absolute_path(cls, root, path): uuid, path = path.split('/', 1) root = root[uuid] # This is janky, but validate_absolute_path is the only thing # that will use this data, so it can know to unpack the tuple again return (super().get_absolute_path(root, path), uuid) def validate_absolute_path(self, root, abspath_uuid): absolute_path, uuid = abspath_uuid root = self.result_store[uuid] return super().validate_absolute_path(root, absolute_path) qiime2-2024.5.0/qiime2/jupyter/hooks.py000066400000000000000000000017401462552636000175320ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- def load_jupyter_server_extension(nb_server): from .handlers import QIIME2RedirectHandler, QIIME2ResultHandler from notebook.utils import url_path_join result_store = {} app = nb_server.web_app def route(path): return url_path_join(app.settings['base_url'], 'qiime2', path) app.add_handlers(r'.*', [ (route(r'redirect'), QIIME2RedirectHandler, {'result_store': result_store}), (route(r'view/(.*)'), QIIME2ResultHandler, # This *is* odd, but it's because we are tricking StaticFileHandler {'path': result_store, 'default_filename': 'index.html'}) ]) qiime2-2024.5.0/qiime2/jupyter/template.py000066400000000000000000000050461462552636000202250ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import urllib.parse def make_html(location): url = "/qiime2/redirect?location={location}".format( location=urllib.parse.quote(location)) # This is dark magic. An image has an onload handler, which let's me # grab the parent dom in an anonymous way without needing to scope the # output cells of Jupyter with some kind of random ID. # Using transparent pixel from: https://stackoverflow.com/a/14115340/579416 return ('
'.format( anon_func=_anonymous_function, url=url)) # 404 - the extension isn't installed # 428 - the result went out of scope, re-run cell # 302->200 - set up the iframe for that location _anonymous_function = '''\ function(div, url){ if (typeof require !== 'undefined') { var baseURL = require.toUrl('').split('/').slice(0, -2).join('/'); } else { var baseURL = JSON.parse( document.getElementById('jupyter-config-data').innerHTML ).baseUrl.slice(0, -1); } url = baseURL + url; fetch(url).then(function(res) { if (res.status === 404) { div.innerHTML = 'Install QIIME 2 Jupyter extension with:
' + 'jupyter serverextension enable --py qiime2' + ' --sys-prefix
then restart your server.' + '

(Interactive output not available on ' + 'static notebook viewer services like nbviewer.)'; } else if (res.status === 409) { div.innerHTML = 'Visualization no longer in scope. Re-run this cell' + ' to see the visualization.'; } else if (res.ok) { url = res.url; div.innerHTML = '
Open in a: new window' } else { div.innerHTML = 'Something has gone wrong. Check notebook server for' + ' errors.'; } }); }''' qiime2-2024.5.0/qiime2/metadata/000077500000000000000000000000001462552636000161115ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/metadata/__init__.py000066400000000000000000000011731462552636000202240ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- from .metadata import (Metadata, MetadataColumn, NumericMetadataColumn, CategoricalMetadataColumn) from .io import MetadataFileError __all__ = ['Metadata', 'MetadataColumn', 'NumericMetadataColumn', 'CategoricalMetadataColumn', 'MetadataFileError'] qiime2-2024.5.0/qiime2/metadata/base.py000066400000000000000000000040111462552636000173710ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- SUPPORTED_COLUMN_TYPES = {'categorical', 'numeric'} SUPPORTED_ID_HEADERS = { 'case_insensitive': { 'id', 'sampleid', 'sample id', 'sample-id', 'featureid', 'feature id', 'feature-id' }, # For backwards-compatibility with existing formats. 'exact_match': { # QIIME 1 mapping files. "#Sample ID" was never supported, but # we're including it here for symmetry with the other supported # headers that allow a space between words. '#SampleID', '#Sample ID', # biom-format: observation metadata and "classic" (TSV) OTU tables. '#OTUID', '#OTU ID', # Qiita sample/prep information files. 'sample_name' } } FORMATTED_ID_HEADERS = "Case-insensitive: %s\n\nCase-sensitive: %s" % ( ', '.join(repr(e) for e in sorted( SUPPORTED_ID_HEADERS['case_insensitive'])), ', '.join(repr(e) for e in sorted( SUPPORTED_ID_HEADERS['exact_match'])) ) def is_id_header(name): """Determine if a name is a valid ID column header. This function may be used to determine if a value in a metadata file is a valid ID column header, or if a pandas ``Index.name`` matches the ID header requirements. The "ID header" corresponds to the ``Metadata.id_header`` and ``MetadataColumn.id_header`` properties. Parameters ---------- name : string or None Name to check against ID header requirements. Returns ------- bool ``True`` if `name` is a valid ID column header, ``False`` otherwise. """ return name and (name in SUPPORTED_ID_HEADERS['exact_match'] or name.lower() in SUPPORTED_ID_HEADERS['case_insensitive']) qiime2-2024.5.0/qiime2/metadata/io.py000066400000000000000000000472511462552636000171030ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import csv import itertools import os.path import re import numpy as np import pandas as pd from qiime2.core.util import find_duplicates import qiime2.core.missing as _missing from .base import SUPPORTED_COLUMN_TYPES, FORMATTED_ID_HEADERS, is_id_header from .metadata import Metadata, MetadataColumn class MetadataFileError(Exception): _suffix = ( "There may be more errors present in the metadata file. To get a full " "report, sample/feature metadata files can be validated with Keemei: " "https://keemei.qiime2.org\n\nFind details on QIIME 2 metadata " "requirements here: https://docs.qiime2.org/%s/tutorials/metadata/") def __init__(self, message, include_suffix=True): # Lazy import because `qiime2.__release__` is available at runtime but # not at import time (otherwise the release value could be interpolated # into `_suffix` in the class definition above). import qiime2 if include_suffix: message = message + '\n\n' + self._suffix % qiime2.__release__ super().__init__(message) class MetadataReader: def __init__(self, filepath): if not os.path.isfile(filepath): raise MetadataFileError( "Metadata file path doesn't exist, or the path points to " "something other than a file. Please check that the path " "exists, has read permissions, and points to a regular file " "(not a directory): %s" % filepath) self._filepath = filepath # Used by `read()` to store an iterator yielding rows with # leading/trailing whitespace stripped from their cells (this is a # preprocessing step that should happen with *every* row). The iterator # protocol is the only guaranteed API on this object. self._reader = None def read(self, into, column_types=None, column_missing_schemes=None, default_missing_scheme=_missing.DEFAULT_MISSING): if column_types is None: column_types = {} try: # Newline settings based on recommendation from csv docs: # https://docs.python.org/3/library/csv.html#id3 # Ignore BOM on read (but do not write BOM) with open(self._filepath, 'r', newline='', encoding='utf-8-sig') as fh: tsv_reader = csv.reader(fh, dialect='excel-tab', strict=True) self._reader = (self._strip_cell_whitespace(row) for row in tsv_reader) header = self._read_header() directives = self._read_directives(header) ids, data = self._read_data(header) except UnicodeDecodeError as e: if ('0xff in position 0' in str(e) or '0xfe in position 0' in str(e)): raise MetadataFileError( "Metadata file must be encoded as UTF-8 or ASCII, found " "UTF-16. If this file is from Microsoft Excel, save " "as a plain text file, not 'UTF-16 Unicode'") raise MetadataFileError( "Metadata file must be encoded as UTF-8 or ASCII. The " "following error occurred when decoding the file:\n\n%s" % e) finally: self._reader = None index = pd.Index(ids, name=header[0], dtype=object) df = pd.DataFrame(data, columns=header[1:], index=index, dtype=object) # TODO: move these checks over to Metadata.__init__() so that you can # pass column_types with an untyped dataframe. This would require a bit # of a refactor and doesn't buy a whole lot at the moment, hence the # TODO. for name, type in column_types.items(): if name not in df.columns: raise MetadataFileError( "Column name %r specified in `column_types` is not a " "column in the metadata file." % name) if type not in SUPPORTED_COLUMN_TYPES: fmt_column_types = ', '.join( repr(e) for e in sorted(SUPPORTED_COLUMN_TYPES)) raise MetadataFileError( "Column name %r specified in `column_types` has an " "unrecognized column type %r. 
Supported column types: %s" % (name, type, fmt_column_types)) resolved_column_types = directives.get('types', {}) resolved_column_types.update(column_types) if column_missing_schemes is None: column_missing_schemes = {} resolved_missing = {c: default_missing_scheme for c in df.columns} resolved_missing.update(directives.get('missing', {})) resolved_missing.update(column_missing_schemes) try: # Cast each column to the appropriate dtype based on column type. df = df.apply(self._cast_column, axis='index', column_types=resolved_column_types, missing_schemes=resolved_missing) except MetadataFileError as e: # HACK: If an exception is raised within `DataFrame.apply`, pandas # adds an extra tuple element to `e.args`, making the original # error message difficult to read because a tuple is repr'd instead # of a string. To work around this, we catch and reraise a # MetadataFileError with the original error message. We use # `include_suffix=False` to avoid adding another suffix to the # error message we're reraising. msg = e.args[0] raise MetadataFileError(msg, include_suffix=False) try: return into(df, column_missing_schemes=resolved_missing, default_missing_scheme=default_missing_scheme) except Exception as e: raise MetadataFileError( "There was an issue with loading the metadata file:\n\n%s" % e) def _read_header(self): header = None for row in self._reader: if self._is_header(row): header = row break elif self._is_comment(row): continue elif self._is_empty(row): continue elif self._is_directive(row): raise MetadataFileError( "Found directive %r while searching for header. " "Directives may only appear immediately after the header." % row[0]) else: raise MetadataFileError( "Found unrecognized ID column name %r while searching for " "header. The first column name in the header defines the " "ID column, and must be one of these values:\n\n%s\n\n" "NOTE: Metadata files must contain tab-separated values." % (row[0], FORMATTED_ID_HEADERS)) if header is None: raise MetadataFileError( "Failed to locate header. The metadata file may be empty, or " "consists only of comments or empty rows.") # Trim trailing empty cells from header. data_extent = None for idx, cell in enumerate(header): if cell != '': data_extent = idx header = header[:data_extent+1] # Basic validation to 1) fail early before processing entire file; and # 2) make some basic guarantees about the header for things in this # class that use the header as part of reading the file. column_names = set(header) if '' in column_names: raise MetadataFileError( "Found at least one column without a name in the header. Each " "column must be named.") elif len(header) != len(column_names): duplicates = find_duplicates(header) raise MetadataFileError( "Column names must be unique. The following column names are " "duplicated: %s" % (', '.join(repr(e) for e in sorted(duplicates)))) # Skip the first element of the header because we know it is a valid ID # header. The other column names are validated to ensure they *aren't* # valid ID headers. for column_name in header[1:]: if is_id_header(column_name): raise MetadataFileError( "Metadata column name %r conflicts with a name reserved " "for the ID column header. 
Reserved ID column headers:" "\n\n%s" % (column_name, FORMATTED_ID_HEADERS)) return header def _read_directives(self, header): directives = {} for row in self._reader: directive_kind = None if not self._is_directive(row): self._reader = itertools.chain([row], self._reader) break if self._is_column_types_directive(row): directive_kind = 'types' elif self._is_missing_directive(row): directive_kind = 'missing' else: raise MetadataFileError( "Unrecognized directive %r. Only the #q2:types" " and #q2:missing directives are supported at this" " time." % row[0]) if directive_kind in directives: raise MetadataFileError( "Found duplicate directive %r. Each directive may " "only be specified a single time." % row[0]) row = self._match_header_len(row, header) collected = {name: arg for name, arg in zip(header[1:], row[1:]) if arg} directives[directive_kind] = collected if 'types' in directives: column_types = directives['types'] for column_name, column_type in column_types.items(): type_nocase = column_type.lower() if type_nocase in SUPPORTED_COLUMN_TYPES: column_types[column_name] = type_nocase else: fmt_column_types = ', '.join( repr(e) for e in sorted(SUPPORTED_COLUMN_TYPES)) raise MetadataFileError( "Column %r has an unrecognized column type %r " "specified in its #q2:types directive. " "Supported column types (case-insensitive): %s" % (column_name, column_type, fmt_column_types)) if 'missing' in directives: for column_name, column_missing in directives['missing'].items(): if column_missing not in _missing.BUILTIN_MISSING: raise MetadataFileError( "Column %r has an unrecognized missing value scheme %r" " specified in its #q2:missing directive." " Supported missing value schemes (case-sensitive): %s" % (column_name, column_missing, list(_missing.BUILTIN_MISSING)) ) return directives def _read_data(self, header): ids = [] data = [] for row in self._reader: if self._is_comment(row): continue elif self._is_empty(row): continue elif self._is_directive(row): raise MetadataFileError( "Found directive %r outside of the directives section of " "the file. Directives may only appear immediately after " "the header." % row[0]) elif self._is_header(row): raise MetadataFileError( "Metadata ID %r conflicts with a name reserved for the ID " "column header. Reserved ID column headers:\n\n%s" % (row[0], FORMATTED_ID_HEADERS)) row = self._match_header_len(row, header) ids.append(row[0]) data.append(row[1:]) return ids, data def _strip_cell_whitespace(self, row): return [cell.strip() for cell in row] def _match_header_len(self, row, header): row_len = len(row) header_len = len(header) if row_len < header_len: # Pad row with empty cells to match header length. row = row + [''] * (header_len - row_len) elif row_len > header_len: trailing_row = row[header_len:] if not self._is_empty(trailing_row): raise MetadataFileError( "Metadata row contains more cells than are declared by " "the header. The row has %d cells, while the header " "declares %d cells." % (row_len, header_len)) row = row[:header_len] return row def _is_empty(self, row): # `all` returns True for an empty iterable, so this check works for a # row of zero elements (corresponds to a blank line in the file). 
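        # Illustrative: ['', '', ''] and [] both count as empty; ['a', '']
        # does not.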
return all((cell == '' for cell in row)) def _is_comment(self, row): return ( len(row) > 0 and row[0].startswith('#') and not self._is_directive(row) and not self._is_header(row) ) def _is_header(self, row): if len(row) == 0: return False return is_id_header(row[0]) def _is_directive(self, row): return len(row) > 0 and row[0].startswith('#q2:') def _is_column_types_directive(self, row): return len(row) > 0 and row[0].split(' ')[0] == '#q2:types' def _is_missing_directive(self, row): return len(row) > 0 and row[0].split(' ')[0] == '#q2:missing' def _cast_column(self, series, column_types, missing_schemes): if series.name in missing_schemes: scheme = missing_schemes[series.name] series = _missing.series_encode_missing(series, scheme) if series.name in column_types: if column_types[series.name] == 'numeric': return self._to_numeric(series) else: # 'categorical' return self._to_categorical(series) else: # Infer type try: return self._to_numeric(series) except MetadataFileError: return self._to_categorical(series) def _to_categorical(self, series): # Replace empty strings with `None` to force the series to remain # dtype=object (this only matters if the series consists solely of # missing data). Replacing with np.nan and casting to dtype=object # won't retain the correct dtype in the resulting dataframe # (`DataFrame.apply` seems to force series consisting solely of np.nan # to dtype=float64, even if dtype=object is specified. # # To replace a value with `None`, the following invocation of # `Series.replace` must be used because `None` is a sentinel: # https://stackoverflow.com/a/17097397/3776794 return series.replace([''], [None]) def _to_numeric(self, series): series = series.replace('', np.nan) is_numeric = series.apply(self._is_numeric) if is_numeric.all(): return pd.to_numeric(series, errors='raise') else: non_numerics = series[~is_numeric].unique() raise MetadataFileError( "Cannot convert metadata column %r to numeric. 
The following " "values could not be interpreted as numeric: %s" % (series.name, ', '.join(repr(e) for e in sorted(non_numerics)))) def _is_numeric(self, value): return (isinstance(value, float) or len(_numeric_regex.findall(value)) == 1) class MetadataWriter: def __init__(self, metadata): self._metadata = metadata def write(self, filepath): # Newline settings based on recommendation from csv docs: # https://docs.python.org/3/library/csv.html#id3 # Do NOT write a BOM, hence utf-8 not utf-8-sig with open(filepath, 'w', newline='', encoding='utf-8') as fh: tsv_writer = csv.writer(fh, dialect='excel-tab', strict=True) md = self._metadata header = [md.id_header] types_directive = ['#q2:types'] missing_directive = ['#q2:missing'] if isinstance(md, Metadata): for name, props in md.columns.items(): header.append(name) types_directive.append(props.type) missing_directive.append(props.missing_scheme) elif isinstance(md, MetadataColumn): header.append(md.name) types_directive.append(md.type) missing_directive.append(md.missing_scheme) else: raise NotImplementedError tsv_writer.writerow(header) tsv_writer.writerow(types_directive) if self._non_default_missing(missing_directive): tsv_writer.writerow(missing_directive) df = md.to_dataframe(encode_missing=True) df.fillna('', inplace=True) df = df.map(self._format) tsv_writer.writerows(df.itertuples(index=True)) def _non_default_missing(self, missing_directive): missing = missing_directive[1:] result = False for m in missing: if m != _missing.DEFAULT_MISSING: result = True break return result def _format(self, value): if isinstance(value, str): return value elif isinstance(value, float): # Use fixed precision or scientific notation as necessary (both are # roundtrippable in the metadata file format), with up to 15 digits # *total* precision (i.e. before and after the decimal point), # rounding if necessary. Trailing zeros or decimal points will not # be included in the formatted string (e.g. 42.0 will be formatted # as "42"). A precision of 15 digits is used because that is within # the 64-bit floating point spec (things get weird after that). # # Using repr() and str() each have their own predefined precision # which varies across Python versions. Using the string formatting # presentation types (e.g. %g, %f) without specifying a precision # will usually default to 6 digits past the decimal point, which # seems a little low. # # References: # # - https://stackoverflow.com/a/2440786/3776794 # - https://stackoverflow.com/a/2440708/3776794 # - https://docs.python.org/3/library/string.html# # format-specification-mini-language # - https://stackoverflow.com/a/20586479/3776794 # - https://drj11.wordpress.com/2007/07/03/python-poor-printing- # of-floating-point/ return '{0:.15g}'.format(value) else: raise NotImplementedError # Credit: https://stackoverflow.com/a/4703508/3776794 _numeric_pattern = r""" ^[-+]? # optional sign (?: (?: \d* \. \d+ ) # .1 .12 .123 etc 9.1 etc 98.1 etc | (?: \d+ \.? ) # 1. 12. 123. etc 1 12 123 etc ) # followed by optional exponent part if desired (?: [Ee] [+-]? \d+ ) ?$ """ _numeric_regex = re.compile(_numeric_pattern, re.VERBOSE) qiime2-2024.5.0/qiime2/metadata/metadata.py000066400000000000000000001311441462552636000202470ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import abc import collections import itertools import sqlite3 import types import warnings import pandas as pd import numpy as np import qiime2 from qiime2.core.util import find_duplicates import qiime2.core.missing as _missing from .base import SUPPORTED_COLUMN_TYPES, FORMATTED_ID_HEADERS, is_id_header DEFAULT_MISSING = _missing.DEFAULT_MISSING class _MetadataBase: """Base class for functionality shared between Metadata and MetadataColumn. Parameters ---------- index : pandas.Index IDs associated with the metadata. """ @property def id_header(self): """Name identifying the IDs associated with the metadata. This property is read-only. Returns ------- str Name of IDs associated with the metadata. """ return self._id_header @property def ids(self): """IDs associated with the metadata. This property is read-only. Returns ------- tuple of str Metadata IDs. """ return self._ids @property def id_count(self): """Number of metadata IDs. This property is read-only. Returns ------- int Number of metadata IDs. """ return len(self._ids) @property def artifacts(self): """Artifacts that are the source of the metadata. This property is read-only. Returns ------- tuple of qiime2.Artifact Source artifacts of the metadata. """ return tuple(self._artifacts) def __init__(self, index): if index.empty: raise ValueError( "%s must contain at least one ID." % self.__class__.__name__) id_header = index.name self._assert_valid_id_header(id_header) self._id_header = id_header self._validate_index(index, axis='id') self._ids = tuple(index) self._artifacts = [] def __eq__(self, other): return ( isinstance(other, self.__class__) and self._id_header == other._id_header and self._artifacts == other._artifacts ) def __ne__(self, other): return not (self == other) def _add_artifacts(self, artifacts): deduped = set(self._artifacts) for artifact in artifacts: if not isinstance(artifact, qiime2.Artifact): raise TypeError( "Expected Artifact object, received %r" % artifact) if artifact in deduped: raise ValueError( "Duplicate source artifacts are not supported on %s " "objects. The following artifact is a duplicate of " "another source artifact: %r" % (self.__class__.__name__, artifact)) deduped.add(artifact) self._artifacts.extend(artifacts) # Static helpers below for code reuse in Metadata and MetadataColumn @classmethod def _assert_valid_id_header(cls, name): if not is_id_header(name): raise ValueError( "pandas index name (`Index.name`) must be one of the " "following values, not %r:\n\n%s" % (name, FORMATTED_ID_HEADERS)) @classmethod def _validate_index(cls, index, *, axis): if axis == 'id': label = 'ID' elif axis == 'column': label = 'column name' else: raise NotImplementedError for value in index: if not isinstance(value, str): raise TypeError( "Detected non-string metadata %s of type %r: %r" % (label, type(value), value)) if not value: raise ValueError( "Detected empty metadata %s. %ss must consist of at least " "one character." % (label, label)) if axis == 'id' and value.startswith('#'): raise ValueError( "Detected metadata %s that begins with a pound sign " "(#): %r" % (label, value)) if is_id_header(value): raise ValueError( "Detected metadata %s %r that conflicts with a name " "reserved for the ID header. Reserved ID headers:\n\n%s" % (label, value, FORMATTED_ID_HEADERS)) if len(index) != len(set(index)): duplicates = find_duplicates(index) raise ValueError( "Metadata %ss must be unique. 
The following %ss are " "duplicated: %s" % (label, label, ', '.join(repr(e) for e in sorted(duplicates)))) @classmethod def _filter_ids_helper(cls, df_or_series, ids, ids_to_keep): # `ids_to_keep` can be any iterable, so turn it into a list so that it # can be iterated over multiple times below (and length-checked). ids_to_keep = list(ids_to_keep) if len(ids_to_keep) == 0: raise ValueError("`ids_to_keep` must contain at least one ID.") duplicates = find_duplicates(ids_to_keep) if duplicates: raise ValueError( "`ids_to_keep` must contain unique IDs. The following IDs are " "duplicated: %s" % (', '.join(repr(e) for e in sorted(duplicates)))) ids_to_keep = set(ids_to_keep) missing_ids = ids_to_keep - ids if missing_ids: raise ValueError( "The following IDs are not present in the metadata: %s" % (', '.join(repr(e) for e in sorted(missing_ids)))) # While preserving order, get rid of any IDs not contained in # `ids_to_keep`. ids_to_discard = ids - ids_to_keep return df_or_series.drop(labels=ids_to_discard, axis='index', inplace=False, errors='raise') def save(self, filepath, ext=None): """Save a TSV metadata file. The TSV metadata file format is described at https://docs.qiime2.org in the Metadata Tutorial. The file will always include the ``#q2:types`` directive in order to make the file roundtrippable without relying on column type inference. Parameters ---------- filepath : str Path to save TSV metadata file at. ext : str Preferred file extension (.tsv, .txt, etc). Will be left blank if no extension is included. Including a period in the extension is optional, and any additional periods delimiting the filepath and the extension will be reduced to a single period. Returns ------- str Filepath and extension (if provided) that the file was saved to. See Also -------- Metadata.load """ from .io import MetadataWriter if ext is None: ext = '' else: ext = '.' + ext.lstrip('.') filepath = filepath.rstrip('.') if not filepath.endswith(ext): filepath += ext MetadataWriter(self).write(filepath) return filepath # Other properties such as units can be included here in the future! ColumnProperties = collections.namedtuple('ColumnProperties', ['type', 'missing_scheme']) class Metadata(_MetadataBase): """Store metadata associated with identifiers in a study. Metadata is tabular in nature, mapping study identifiers (e.g. sample or feature IDs) to columns of metadata associated with each ID. For more details about metadata in QIIME 2, including the TSV metadata file format, see the Metadata Tutorial at https://docs.qiime2.org. The following text focuses on design and considerations when working with ``Metadata`` objects at the API level. A ``Metadata`` object is composed of zero or more ``MetadataColumn`` objects. A ``Metadata`` object always contains at least one ID, regardless of the number of columns. Each column in the ``Metadata`` object has an associated column type representing either *categorical* or *numeric* data. Each metadata column is represented by an object corresponding to the column's type: ``CategoricalMetadataColumn`` or ``NumericMetadataColumn``, respectively. A ``Metadata`` object is closely linked to its corresponding TSV metadata file format described at https://docs.qiime2.org. Therefore, certain requirements present in the file format are also enforced on the in-memory object in order to make serialized ``Metadata`` objects roundtrippable when loaded from disk again. 
For example, IDs cannot begin with a pound character (``#``) because those IDs would be interpreted as comment rows when written to disk as TSV. See the metadata file format spec for more details about data formatting requirements. In addition to being loaded from or saved to disk, a ``Metadata`` object can be constructed from a ``pandas.DataFrame`` object. See the *Parameters* section below for details on how to construct ``Metadata`` objects from dataframes. ``Metadata`` objects have various methods to access, filter, and merge data. A dataframe can be retrieved from the ``Metadata`` object for further data manipulation using the pandas API. Individual ``MetadataColumn`` objects can be retrieved to gain access to APIs applicable to a single metadata column. Missing values may be encoded in one of the following schemes: 'blank' The default, which treats `None`/`NaN` as the only valid missing values. 'no-missing' Indicates there are no missing values in a column, any `None`/`NaN` values should be considered an error. If a scheme other than 'blank' is used by default, this scheme can be provided to preserve strings as categorical terms. 'INSDC:missing' The INSDC vocabulary for missing values. The current implementation supports only lower-case terms which match exactly: 'not applicable', 'missing', 'not provided', 'not collected', and 'restricted access'. Parameters ---------- dataframe : pandas.DataFrame Dataframe containing metadata. The dataframe's index defines the IDs, and the index name (``Index.name``) must match one of the required ID headers described in the metadata file format spec. Each column in the dataframe defines a metadata column, and the metadata column's type (i.e. *categorical* or *numeric*) is determined based on the column's dtype. If a column has ``dtype=object``, it may contain strings or pandas missing values (e.g. ``np.nan``, ``None``). Columns matching this requirement are assumed to be *categorical*. If a column in the dataframe has ``dtype=float`` or ``dtype=int``, it may contain floating point numbers or integers, as well as pandas missing values (e.g. ``np.nan``). Columns matching this requirement are assumed to be *numeric*. Regardless of column type (categorical vs numeric), the dataframe stored within the ``Metadata`` object will have any missing values normalized to ``np.nan``. Columns with ``dtype=int`` will be cast to ``dtype=float``. To obtain a dataframe from the ``Metadata`` object containing these normalized data types and values, use ``Metadata.to_dataframe()``. column_missing_schemes : dict, optional Describe the metadata column handling for missing values described in the dataframe. This is a dict mapping column names (str) to missing-value schemes (str). Valid values are 'blank', 'no-missing', and 'INSDC:missing'. Column names may be omitted. default_missing_scheme : str, optional The missing scheme to use when none has been provided in the file or in `column_missing_schemes`. """ @classmethod def load(cls, filepath, column_types=None, column_missing_schemes=None, default_missing_scheme=DEFAULT_MISSING): """Load a TSV metadata file. The TSV metadata file format is described at https://docs.qiime2.org in the Metadata Tutorial. Parameters ---------- filepath : str Path to TSV metadata file to be loaded. column_types : dict, optional Override metadata column types specified or inferred in the file. This is a dict mapping column names (str) to column types (str). Valid column types are 'categorical' and 'numeric'. 
Column names may be omitted from this dict to use the column types read from the file. column_missing_schemes : dict, optional Override the metadata column handling for missing values described in the file. This is a dict mapping column names (str) to missing-value schemes (str). Valid values are 'blank', 'no-missing', and 'INSDC:missing'. Column names may be omitted. default_missing_scheme : str, optional The missing scheme to use when none has been provided in the file or in `column_missing_schemes`. Returns ------- Metadata Metadata object loaded from `filepath`. Raises ------ MetadataFileError If the metadata file is invalid in any way (e.g. doesn't meet the file format's requirements). See Also -------- save """ from .io import MetadataReader return MetadataReader(filepath).read( into=cls, column_types=column_types, column_missing_schemes=column_missing_schemes, default_missing_scheme=default_missing_scheme) @property def columns(self): """Ordered mapping of column names to ColumnProperties. The mapping that is returned is read-only. This property is also read-only. Returns ------- types.MappingProxyType Ordered mapping of column names to ColumnProperties. """ # Read-only proxy to the OrderedDict mapping column names to # ColumnProperties. return types.MappingProxyType(self._columns) @property def column_count(self): """Number of metadata columns. This property is read-only. Returns ------- int Number of metadata columns. Notes ----- Zero metadata columns are allowed. See Also -------- id_count """ return len(self._columns) def __init__(self, dataframe, column_missing_schemes=None, default_missing_scheme=DEFAULT_MISSING): if not isinstance(dataframe, pd.DataFrame): raise TypeError( "%s constructor requires a pandas.DataFrame object, not " "%r" % (self.__class__.__name__, type(dataframe))) super().__init__(dataframe.index) if column_missing_schemes is None: column_missing_schemes = {} self._dataframe, self._columns = self._normalize_dataframe( dataframe, column_missing_schemes, default_missing_scheme) self._validate_index(self._dataframe.columns, axis='column') def _normalize_dataframe(self, dataframe, column_missing_schemes, default_missing_scheme): norm_df = dataframe.copy() # Do not attempt to strip empty metadata if not norm_df.columns.empty: norm_df.columns = norm_df.columns.str.strip() norm_df.index = norm_df.index.str.strip() columns = collections.OrderedDict() for column_name, series in norm_df.items(): missing_scheme = column_missing_schemes.get(column_name, default_missing_scheme) metadata_column = self._metadata_column_factory(series, missing_scheme) norm_df[column_name] = metadata_column.to_series() properties = ColumnProperties(type=metadata_column.type, missing_scheme=missing_scheme) columns[column_name] = properties return norm_df, columns def _metadata_column_factory(self, series, missing_scheme): series = _missing.series_encode_missing(series, missing_scheme) # Collapse dtypes except for all NaN columns so that we can preserve # empty categorical columns. Empty numeric columns will already have # the expected dtype and values if not series.isna().all(): series = series.infer_objects() dtype = series.dtype if NumericMetadataColumn._is_supported_dtype(dtype): column = NumericMetadataColumn(series, missing_scheme) elif CategoricalMetadataColumn._is_supported_dtype(dtype): column = CategoricalMetadataColumn(series, missing_scheme) else: raise TypeError( "Metadata column %r has an unsupported pandas dtype of %s. 
" "Supported dtypes: float, int, object" % (series.name, dtype)) column._add_artifacts(self.artifacts) return column def __repr__(self): """String summary of the metadata and its columns.""" lines = [] # Header lines.append(self.__class__.__name__) lines.append('-' * len(self.__class__.__name__)) # Dimensions lines.append('%d ID%s x %d column%s' % ( self.id_count, '' if self.id_count == 1 else 's', self.column_count, '' if self.column_count == 1 else 's', )) # Column properties if self.column_count != 0: max_name_len = max((len(name) for name in self.columns)) for name, props in self.columns.items(): padding = ' ' * ((max_name_len - len(name)) + 1) lines.append('%s:%s%r' % (name, padding, props)) # Epilogue lines.append('') lines.append('Call to_dataframe() for a tabular representation.') return '\n'.join(lines) def __eq__(self, other): """Determine if this metadata is equal to another. ``Metadata`` objects are equal if their IDs, columns (including column names, types, and ordering), ID headers, source artifacts, and metadata values are equal. Parameters ---------- other : Metadata Metadata to test for equality. Returns ------- bool Indicates whether this ``Metadata`` object is equal to `other`. See Also -------- __ne__ """ return ( super().__eq__(other) and self._columns == other._columns and self._dataframe.equals(other._dataframe) ) def __ne__(self, other): """Determine if this metadata is not equal to another. ``Metadata`` objects are not equal if their IDs, columns (including column names, types, or ordering), ID headers, source artifacts, or metadata values are not equal. Parameters ---------- other : Metadata Metadata to test for inequality. Returns ------- bool Indicates whether this ``Metadata`` object is not equal to `other`. See Also -------- __eq__ """ return not (self == other) def to_dataframe(self, encode_missing=False): """Create a pandas dataframe from the metadata. The dataframe's index name (``Index.name``) will match this metadata object's ``id_header``, and the index will contain this metadata object's IDs. The dataframe's column names will match the column names in this metadata. Categorical columns will be stored as ``dtype=object`` (containing strings), and numeric columns will be stored as ``dtype=float``. Parameters ---------- encode_missing : bool, optional Whether to convert missing values (NaNs) back into their original vocabulary (strings) if a missing scheme was used. Returns ------- pandas.DataFrame Dataframe constructed from the metadata. """ df = self._dataframe.copy() if encode_missing: def replace_nan(series): missing = _missing.series_extract_missing(series) # avoid dtype changing if there's no missing values if not missing.empty: series[missing.index] = missing return series df = df.apply(replace_nan) return df def get_column(self, name): """Retrieve metadata column based on column name. Parameters ---------- name : str Name of the metadata column to retrieve. Returns ------- MetadataColumn Requested metadata column (``CategoricalMetadataColumn`` or ``NumericMetadataColumn``). See Also -------- get_ids """ try: series = self._dataframe[name] missing_scheme = self._columns[name].missing_scheme except KeyError: raise ValueError( '%r is not a column in the metadata. Available columns: ' '%s' % (name, ', '.join(repr(c) for c in self.columns))) return self._metadata_column_factory(series, missing_scheme) def get_ids(self, where=None): """Retrieve IDs matching search criteria. 
Parameters ---------- where : str, optional SQLite WHERE clause specifying criteria IDs must meet to be included in the results. All IDs are included by default. Returns ------- set IDs matching search criteria specified in `where`. See Also -------- ids filter_ids get_column Notes ----- The ID header (``Metadata.id_header``) may be used in the `where` clause to query the table's ID column. """ if where is None: return set(self._ids) conn = sqlite3.connect(':memory:') conn.row_factory = lambda cursor, row: row[0] # https://github.com/pandas-dev/pandas/blob/ # 7c7bd569ce8e0f117c618d068e3d2798134dbc73/pandas/io/sql.py#L1306 with warnings.catch_warnings(): warnings.filterwarnings( 'ignore', 'The spaces in these column names will not.*') self._dataframe.to_sql('metadata', conn, index=True, index_label=self.id_header) c = conn.cursor() # In general we wouldn't want to format our query in this way because # it leaves us open to sql injection, but it seems acceptable here for # a few reasons: # 1) This is a throw-away database which we're just creating to have # access to the query language, so any malicious behavior wouldn't # impact any data that isn't temporary # 2) The substitution syntax recommended in the docs doesn't allow # us to specify complex `where` statements, which is what we need to # do here. For example, we need to specify things like: # WHERE Subject='subject-1' AND SampleType='gut' # but their qmark/named-style syntaxes only supports substition of # variables, such as: # WHERE Subject=? # 3) sqlite3.Cursor.execute will only execute a single statement so # inserting multiple statements # (e.g., "Subject='subject-1'; DROP...") will result in an # OperationalError being raised. query = ('SELECT "{0}" FROM metadata WHERE {1} GROUP BY "{0}" ' 'ORDER BY "{0}";'.format(self.id_header, where)) try: c.execute(query) except sqlite3.OperationalError as e: conn.close() raise ValueError("Selection of IDs failed with query:\n %s\n\n" "If one of the metadata column names specified " "in the `where` statement is on this list " "of reserved keywords " "(http://www.sqlite.org/lang_keywords.html), " "please ensure it is quoted appropriately in the " "`where` statement." % query) from e ids = set(c.fetchall()) conn.close() return ids def merge(self, *others): """Merge this ``Metadata`` object with other ``Metadata`` objects. Returns a new ``Metadata`` object containing the merged contents of this ``Metadata`` object and `others`. The merge is not in-place and will always return a **new** merged ``Metadata`` object. The merge will include only those IDs that are shared across **all** ``Metadata`` objects being merged (i.e. the merge is an *inner join*). Each metadata column being merged must have a unique name; merging metadata with overlapping column names will result in an error. Parameters ---------- others : tuple One or more ``Metadata`` objects to merge with this ``Metadata`` object. Returns ------- Metadata New object containing merged metadata. The merged IDs will be in the same relative order as the IDs in this ``Metadata`` object after performing the inner join. The merged column order will match the column order of ``Metadata`` objects being merged from left to right. Raises ------ ValueError If zero ``Metadata`` objects are provided in `others` (there is nothing to merge in this case). Notes ----- The merged ``Metadata`` object will always have its ``id_header`` property set to ``'id'``, regardless of the ``id_header`` values on the ``Metadata`` objects being merged. 
The merged ``Metadata`` object tracks all source artifacts that it was built from to preserve provenance (i.e. the ``.artifacts`` property on all ``Metadata`` objects is merged). """ if len(others) < 1: raise ValueError( "At least one Metadata object must be provided to merge into " "this Metadata object (otherwise there is nothing to merge).") dfs = [] columns = [] artifacts = [] for md in itertools.chain([self], others): df = md._dataframe dfs.append(df) columns.extend(df.columns.tolist()) artifacts.extend(md.artifacts) columns = pd.Index(columns) if columns.has_duplicates: raise ValueError( "Cannot merge metadata with overlapping columns. The " "following columns overlap: %s" % ', '.join([repr(e) for e in columns[columns.duplicated()].unique()])) merged_df = dfs[0].join(dfs[1:], how='inner') # Not using DataFrame.empty because empty columns are allowed in # Metadata. if merged_df.index.empty: raise ValueError( "Cannot merge because there are no IDs shared across metadata " "objects.") merged_df.index.name = 'id' merged_md = self.__class__(merged_df) merged_md._add_artifacts(artifacts) return merged_md def filter_ids(self, ids_to_keep): """Filter metadata by IDs. Parameters ---------- ids_to_keep : iterable of str IDs that should be retained in the filtered ``Metadata`` object. If any IDs in `ids_to_keep` are not contained in this ``Metadata`` object, a ``ValueError`` will be raised. The filtered ``Metadata`` object will retain the same relative ordering of IDs in this ``Metadata`` object. Thus, the ordering of IDs in `ids_to_keep` does not determine the ordering of IDs in the filtered ``Metadata`` object. Returns ------- Metadata The metadata filtered by IDs. See Also -------- get_ids filter_columns """ filtered_df = self._filter_ids_helper(self._dataframe, self.get_ids(), ids_to_keep) filtered_md = self.__class__(filtered_df) filtered_md._add_artifacts(self.artifacts) return filtered_md def filter_columns(self, *, column_type=None, drop_all_unique=False, drop_zero_variance=False, drop_all_missing=False): """Filter metadata by columns. Parameters ---------- column_type : str, optional If supplied, will retain only columns of this type. The currently supported column types are 'numeric' and 'categorical'. drop_all_unique : bool, optional If ``True``, columns that contain a unique value for every ID will be dropped. Missing data (``np.nan``) are ignored when determining unique values. If a column consists solely of missing data, it will be dropped. drop_zero_variance : bool, optional If ``True``, columns that contain the same value for every ID will be dropped. Missing data (``np.nan``) are ignored when determining variance. If a column consists solely of missing data, it will be dropped. drop_all_missing : bool, optional If ``True``, columns that have a missing value (``np.nan``) for every ID will be dropped. Returns ------- Metadata The metadata filtered by columns. See Also -------- filter_ids """ if (column_type is not None and column_type not in SUPPORTED_COLUMN_TYPES): raise ValueError( "Unknown column type %r. Supported column types: %s" % (column_type, ', '.join(sorted(SUPPORTED_COLUMN_TYPES)))) # Build up a set of columns to drop. Short-circuit as soon as we know a # given column can be dropped (no need to apply further filters to it). 
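        # Illustrative: with column_type='numeric', a categorical column is
        # dropped by the first check below and the unique/variance/missing
        # filters are never evaluated for it.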
columns_to_drop = set() for column, props in self.columns.items(): if column_type is not None and props.type != column_type: columns_to_drop.add(column) continue series = self._dataframe[column] if drop_all_unique or drop_zero_variance: # Ignore nans in the unique count, and compare to the number of # non-nan values in the series. num_unique = series.nunique(dropna=True) if drop_all_unique and num_unique == series.count(): columns_to_drop.add(column) continue # If num_unique == 0, the series was empty (all nans). If # num_unique == 1, the series contained only a single unique # value (ignoring nans). if drop_zero_variance and num_unique < 2: columns_to_drop.add(column) continue if drop_all_missing and series.isna().all(): columns_to_drop.add(column) continue filtered_df = self._dataframe.drop(columns_to_drop, axis=1, inplace=False) filtered_md = self.__class__(filtered_df) filtered_md._add_artifacts(self.artifacts) return filtered_md class MetadataColumn(_MetadataBase, metaclass=abc.ABCMeta): """Abstract base class representing a single metadata column. Concrete subclasses represent specific metadata column types, e.g. ``CategoricalMetadataColumn`` and ``NumericMetadataColumn``. See the ``Metadata`` class docstring for details about ``Metadata`` and ``MetadataColumn`` objects, including a description of column types. The main difference in constructing ``MetadataColumn`` vs ``Metadata`` objects is that ``MetadataColumn`` objects are constructed from a ``pandas.Series`` object instead of a ``pandas.DataFrame``. Otherwise, the same restrictions, considerations, and data normalization are applied as with ``Metadata`` objects. Parameters ---------- series : pd.Series The series to construct a column from. missing_scheme : "blank", "no-missing", "INSDC:missing" How to interpret terms for missing values. These will be converted to NaN. The default treatment is to take no action. """ # Abstract, must be defined by subclasses. type = None @classmethod @abc.abstractmethod def _is_supported_dtype(cls, dtype): """ Contract: Return ``True`` if the series `dtype` is supported by this object and can be handled appropriately by ``_normalize_``. Return ``False`` otherwise. """ raise NotImplementedError @classmethod @abc.abstractmethod def _normalize_(cls, series): """ Contract: Return a copy of `series` that has been converted to the appropriate internal dtype and has any other necessary normalization or validation applied (e.g. missing value representations, disallowing certain values, etc). Raise an error with a detailed error message if the operation cannot be completed. """ raise NotImplementedError @property def name(self): """Metadata column name. This property is read-only. Returns ------- str Metadata column name. """ return self._series.name @property def missing_scheme(self): """The vocabulary used to encode missing values This property is read-only. Returns ------- str "blank", "no-missing", or "INSDC:missing" """ return self._missing_scheme def __init__(self, series, missing_scheme=DEFAULT_MISSING): if not isinstance(series, pd.Series): raise TypeError( "%s constructor requires a pandas.Series object, not %r" % (self.__class__.__name__, type(series))) super().__init__(series.index) series = _missing.series_encode_missing(series, missing_scheme) # if the series has values with a consistent dtype, make the series # that dtype. 
Don't change the dtype if there is a column of all NaN if not series.isna().all(): series = series.infer_objects() if not self._is_supported_dtype(series.dtype): raise TypeError( "%s %r does not support a pandas.Series object with dtype %s" % (self.__class__.__name__, series.name, series.dtype)) self._missing_scheme = missing_scheme self._series = self._normalize_(series) self._validate_index([self._series.name], axis='column') def __repr__(self): """String summary of the metadata column.""" return '<%s name=%r id_count=%d>' % (self.__class__.__name__, self.name, self.id_count) def __eq__(self, other): """Determine if this metadata column is equal to another. ``MetadataColumn`` objects are equal if their IDs, column names, column types, ID headers, source artifacts, and metadata values are equal. Parameters ---------- other : MetadataColumn Metadata column to test for equality. Returns ------- bool Indicates whether this ``MetadataColumn`` object is equal to `other`. See Also -------- __ne__ """ return ( super().__eq__(other) and self.name == other.name and self._series.equals(other._series) ) def __ne__(self, other): """Determine if this metadata column is not equal to another. ``MetadataColumn`` objects are not equal if their IDs, column names, column types, ID headers, source artifacts, or metadata values are not equal. Parameters ---------- other : MetadataColumn Metadata column to test for inequality. Returns ------- bool Indicates whether this ``MetadataColumn`` object is not equal to `other`. See Also -------- __eq__ """ return not (self == other) def to_series(self, encode_missing=False): """Create a pandas series from the metadata column. The series index name (``Index.name``) will match this metadata column's ``id_header``, and the index will contain this metadata column's IDs. The series name will match this metadata column's name. Parameters ---------- encode_missing : bool, optional Whether to convert missing values (NaNs) back into their original vocabulary (strings) if a missing scheme was used. Returns ------- pandas.Series Series constructed from the metadata column. See Also -------- to_dataframe """ series = self._series.copy() if encode_missing: missing = self.get_missing() if not missing.empty: series[missing.index] = missing return series def to_dataframe(self, encode_missing=False): """Create a pandas dataframe from the metadata column. The dataframe will contain exactly one column. The dataframe's index name (``Index.name``) will match this metadata column's ``id_header``, and the index will contain this metadata column's IDs. The dataframe's column name will match this metadata column's name. Parameters ---------- encode_missing : bool, optional Whether to convert missing values (NaNs) back into their original vocabulary (strings) if a missing scheme was used. Returns ------- pandas.DataFrame Dataframe constructed from the metadata column. See Also -------- to_series """ return self.to_series(encode_missing=encode_missing).to_frame() def get_missing(self): """Return a series containing only missing values (with an index). If the column was constructed with a missing scheme, then the values of the series will be the original terms instead of NaN. """ return _missing.series_extract_missing(self._series) def get_value(self, id): """Retrieve metadata column value associated with an ID. Parameters ---------- id : str ID corresponding to the metadata column value to retrieve. Returns ------- object Value associated with the provided `id`. 
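Raises ------ ValueError If `id` is not present in the metadata column.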
""" if id not in self._series.index: raise ValueError("ID %r is not present in %r" % (id, self)) return self._series.loc[id] def has_missing_values(self): """Determine if the metadata column has one or more missing values. Returns ------- bool ``True`` if the metadata column has one or more missing values (``np.nan``), ``False`` otherwise. See Also -------- drop_missing_values get_ids """ return len(self.get_ids(where_values_missing=True)) > 0 def drop_missing_values(self): """Filter out missing values from the metadata column. Returns ------- MetadataColumn Metadata column with missing values removed. See Also -------- has_missing_values get_ids """ missing = self.get_ids(where_values_missing=True) present = self.get_ids() - missing return self.filter_ids(present) def get_ids(self, where_values_missing=False): """Retrieve IDs matching search criteria. Parameters ---------- where_values_missing : bool, optional If ``True``, only return IDs that are associated with missing values (``np.nan``). If ``False`` (the default), return all IDs in the metadata column. Returns ------- set IDs matching search criteria. See Also -------- ids filter_ids has_missing_values drop_missing_values """ if where_values_missing: ids = self._series.index[self._series.isna()] else: ids = self._ids return set(ids) def filter_ids(self, ids_to_keep): """Filter metadata column by IDs. Parameters ---------- ids_to_keep : iterable of str IDs that should be retained in the filtered ``MetadataColumn`` object. If any IDs in `ids_to_keep` are not contained in this ``MetadataColumn`` object, a ``ValueError`` will be raised. The filtered ``MetadataColumn`` object will retain the same relative ordering of IDs in this ``MetadataColumn`` object. Thus, the ordering of IDs in `ids_to_keep` does not determine the ordering of IDs in the filtered ``MetadataColumn`` object. Returns ------- MetadataColumn The metadata column filtered by IDs. See Also -------- get_ids """ filtered_series = self._filter_ids_helper(self._series, self.get_ids(), ids_to_keep) filtered_mdc = self.__class__(filtered_series) filtered_mdc._add_artifacts(self.artifacts) return filtered_mdc class CategoricalMetadataColumn(MetadataColumn): """A single metadata column containing categorical data. See the ``Metadata`` class docstring for details about ``Metadata`` and ``MetadataColumn`` objects, including a description of column types and supported data formats. """ type = 'categorical' @classmethod def _is_supported_dtype(cls, dtype): return dtype == 'object' @classmethod def _normalize_(cls, series): def normalize(value): if isinstance(value, str): value = value.strip() if value == '': raise ValueError( "%s does not support empty strings as values. Use an " "appropriate pandas missing value type " "(e.g. `numpy.nan`) or supply a non-empty string as " "the value in column %r." % (cls.__name__, series.name)) else: return value elif pd.isna(value): # permits np.nan, Python float nan, None if type(value) is float and np.isnan(value): return value return np.nan else: raise TypeError( "%s only supports strings or missing values. Found value " "%r of type %r in column %r." % (cls.__name__, value, type(value), series.name)) norm_series = series.apply(normalize).astype(object) norm_series.index = norm_series.index.str.strip() norm_series.name = norm_series.name.strip() return norm_series class NumericMetadataColumn(MetadataColumn): """A single metadata column containing numeric data. 
See the ``Metadata`` class docstring for details about ``Metadata`` and ``MetadataColumn`` objects, including a description of column types and supported data formats. """ type = 'numeric' @classmethod def _is_supported_dtype(cls, dtype): return dtype == 'float' or dtype == 'int' or dtype == 'int64' @classmethod def _normalize_(cls, series): series = series.astype(float, copy=True, errors='raise') if np.isinf(series).any(): raise ValueError( "%s does not support positive or negative infinity as a " "floating point value in column %r." % (cls.__name__, series.name)) return series qiime2-2024.5.0/qiime2/metadata/tests/000077500000000000000000000000001462552636000172535ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/metadata/tests/__init__.py000066400000000000000000000005351462552636000213670ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- qiime2-2024.5.0/qiime2/metadata/tests/data/000077500000000000000000000000001462552636000201645ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/000077500000000000000000000000001462552636000216125ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/column-name-conflicts-with-id-header.tsv000066400000000000000000000001001462552636000313250ustar00rootroot00000000000000sampleid col1 featureid col3 id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/comments-and-empty-rows-only.tsv000066400000000000000000000000561462552636000300410ustar00rootroot00000000000000# # Hello, World! 
# Foo, # Bar, # Baz qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/data-longer-than-header.tsv000066400000000000000000000001041462552636000267160ustar00rootroot00000000000000sampleid col1 col2 col3 id1 1 a foo id2 2 b bar overflow id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/directive-after-directives-section.tsv000066400000000000000000000002231462552636000312230ustar00rootroot00000000000000id col1 col2 col3 # directives must appear *immediately* below header #q2:types numeric categorical categorical id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/directive-before-header.tsv000066400000000000000000000001371462552636000270150ustar00rootroot00000000000000#q2:types numeric categorical categorical id col1 col2 col3 id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/directive-longer-than-header.tsv000066400000000000000000000001551462552636000277710ustar00rootroot00000000000000sampleid col1 col2 col3 #q2:types numeric categorical categorical numeric id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/duplicate-column-names-with-whitespace.tsv000066400000000000000000000000771462552636000320250ustar00rootroot00000000000000id " col1 " col2 col1 id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/duplicate-column-names.tsv000066400000000000000000000000651462552636000267170ustar00rootroot00000000000000id col1 col2 col1 id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/duplicate-directives.tsv000066400000000000000000000002151462552636000264570ustar00rootroot00000000000000id col1 col2 col3 #q2:types numeric categorical categorical #q2:types categorical categorical categorical id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/duplicate-ids-with-whitespace.tsv000066400000000000000000000000741462552636000302030ustar00rootroot00000000000000id col1 col2 col3 id1 1 a foo id2 2 b bar "id1 " 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/duplicate-ids.tsv000066400000000000000000000000651462552636000251000ustar00rootroot00000000000000id col1 col2 col3 id1 1 a foo id2 2 b bar id1 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/empty-column-name.tsv000066400000000000000000000000611462552636000257140ustar00rootroot00000000000000id col1 col3 id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/empty-file000066400000000000000000000000001462552636000235760ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/empty-id.tsv000066400000000000000000000000621462552636000240760ustar00rootroot00000000000000ID col1 col2 col3 id1 1 a foo 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/header-only-with-comments-and-empty-rows.tsv000066400000000000000000000000761462552636000322420ustar00rootroot00000000000000# Hello, World! 
id col1 col2 col3 # Foo, # Bar, # Baz qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/header-only.tsv000066400000000000000000000000221462552636000245510ustar00rootroot00000000000000id col1 col2 col3 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/id-conflicts-with-id-header.tsv000066400000000000000000000000721462552636000275160ustar00rootroot00000000000000sampleid col1 col2 col3 id1 1 a foo id 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/invalid-header.tsv000066400000000000000000000001041462552636000252170ustar00rootroot00000000000000invalid_id_header col1 col2 col3 id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/missing-unknown-scheme.tsv000066400000000000000000000004061462552636000267600ustar00rootroot00000000000000id col1 col2 col3 #q2:types numeric categorical categorical #q2:missing BAD:SCHEME INSDC:missing no-missing id1 1 a foo id2 2 b bar id3 3 c 42 id4 not applicable missing anything id5 not collected not provided whatever id6 restricted access restricted access 10 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/non-utf-8.tsv000066400000000000000000000001541462552636000241030ustar00rootroot00000000000000id col1 col2 col3 id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/qiime1-empty.tsv000066400000000000000000000001301462552636000246630ustar00rootroot00000000000000#SampleID col1 col2 col3 # A QIIME 1 mapping file can have comments # below the header. qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/simple-utf-16be.txt000066400000000000000000000001541462552636000251730ustar00rootroot00000000000000id col1 col2 col3 id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/simple-utf-16le.txt000066400000000000000000000001541462552636000252050ustar00rootroot00000000000000id col1 col2 col3 id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/types-directive-non-numeric.tsv000066400000000000000000000007451462552636000277260ustar00rootroot00000000000000# All sorts of strings that shouldn't be interpreted as numbers in `col2`! # Note that the first few values in `col2` *can* be interpreted as numbers, # just to have a mixed set of numeric and non-numeric values. 
id col1 col2 #q2:types numeric numeric id1 1 42 id2 1 -42.50 id3 1 id4 1 a id5 1 foo id6 1 1,000 id7 1 1.000.0 id8 1 $42 id9 1 nan id10 1 NaN id11 1 NA id12 1 inf id13 1 +inf id14 1 -inf id15 1 Infinity id16 1 1_000_000 id17 1 0xAF id18 1 1e3e4 id19 1 e3 id20 1 sample-1 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/unrecognized-column-type.tsv000066400000000000000000000001271462552636000273160ustar00rootroot00000000000000id col1 col2 col3 #q2:types numeric foo categorical id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/unrecognized-directive.tsv000066400000000000000000000001111462552636000270110ustar00rootroot00000000000000id col1 col2 col3 #q2:foo bar baz bar id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/whitespace-only-column-name.tsv000066400000000000000000000000701462552636000276710ustar00rootroot00000000000000id col1 " " col3 id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/invalid/whitespace-only-id.tsv000066400000000000000000000000701462552636000260520ustar00rootroot00000000000000ID col1 col2 col3 id1 1 a foo " " 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/000077500000000000000000000000001462552636000212635ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/metadata/tests/data/valid/BOM-simple.txt000066400000000000000000000000701462552636000237250ustar00rootroot00000000000000id col1 col2 col3 id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/all-cells-padded.tsv000066400000000000000000000000361462552636000251070ustar00rootroot00000000000000id col1 col2 col3 id1 id2 id3 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/biom-observation-metadata.tsv000066400000000000000000000001561462552636000270600ustar00rootroot00000000000000#OTUID taxonomy confidence # optional comments OTU_1 k__Bacteria;p__Firmicutes 0.890 OTU_2 k__Bacteria 0.9999 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/case-insensitive-types-directive.tsv000066400000000000000000000001361462552636000304100ustar00rootroot00000000000000id col1 col2 col3 #q2:types CATEGORICAL CategoricaL NuMeRiC id1 1 a -5 id2 2 b 0.0 id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/column-order.tsv000066400000000000000000000000541462552636000244260ustar00rootroot00000000000000id z y x id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/comments.tsv000066400000000000000000000010461462552636000236470ustar00rootroot00000000000000# pre-header # comment id col1 col2 col3 # post-header # comment id1 1 a foo id2 2 b bar # intra-data comment with another # sign # ## # comment with leading whitespace is still a comment. # comment with tab characters is also a comment! 
"# if the first cell is quoted, the parsing rules first process and strip double quotes, then check if the first cell begins with a pound sign" " # same rule applies if the de-quoted cell has leading whitespace (leading/trailing whitespace is *always* ignored)" id3 3 c 42 # trailing # comment qiime2-2024.5.0/qiime2/metadata/tests/data/valid/complete-types-directive.tsv000066400000000000000000000001431462552636000267450ustar00rootroot00000000000000id col1 col2 col3 #q2:types categorical categorical categorical id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/empty-rows.tsv000066400000000000000000000002571462552636000241530ustar00rootroot00000000000000 id col1 col2 col3 id1 1 a foo id2 2 b bar " " " " " " " " " " id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/empty-types-directive.tsv000066400000000000000000000000771462552636000263010ustar00rootroot00000000000000id col1 col2 col3 #q2:types id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/jagged-trailing-columns.tsv000066400000000000000000000000701462552636000265240ustar00rootroot00000000000000id col1 col2 col3 id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/jagged.tsv000066400000000000000000000000551462552636000232420ustar00rootroot00000000000000id col1 col2 col3 id1 1 a id2 2 b bar id3 3 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/leading-trailing-whitespace.tsv000066400000000000000000000006271462552636000273720ustar00rootroot00000000000000 # Leading/trailing whitespace is ignored in *any* type of cell, including # comments, empty rows, headers, directives, and data cells. # Double-quotes are always processed prior to stripping leading/trailing # whitespace within the cell. id "col1 " " col2" col3 #q2:types " numeric " categorical " categorical " id1 " 1 " a foo " " " id2 " 2 b "bar " id3 3 "c " 42 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/mac-line-endings.tsv000066400000000000000000000000651462552636000251340ustar00rootroot00000000000000id col1 col2 col3 id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/minimal.tsv000066400000000000000000000000051462552636000234420ustar00rootroot00000000000000id a qiime2-2024.5.0/qiime2/metadata/tests/data/valid/missing-data.tsv000066400000000000000000000010051462552636000243750ustar00rootroot00000000000000# Missing data can be represented with empty cells or whitespace-only cells. # Test that values used to represent missing data in other programs # (e.g. pandas) are not treated as missing (e.g. "NA", "N/A"). Also test # columns that consist solely of missing data. By default, an empty column will # be treated as numeric data (column "NA" in this example). "col4" overrides # this behavior to make its empty column categorical. 
id col1 NA col3 col4 #q2:types categorical None 1 null nan N/A NA " " NA qiime2-2024.5.0/qiime2/metadata/tests/data/valid/missing-insdc-no-directive.tsv000066400000000000000000000002541462552636000271570ustar00rootroot00000000000000id col1 col2 col3 id1 1 a foo id2 2 b bar id3 3 c 42 id4 not applicable missing anything id5 not collected not provided whatever id6 restricted access restricted access 10 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/missing-insdc.tsv000066400000000000000000000004111462552636000245640ustar00rootroot00000000000000id col1 col2 col3 #q2:types numeric categorical categorical #q2:missing INSDC:missing INSDC:missing no-missing id1 1 a foo id2 2 b bar id3 3 c 42 id4 not applicable missing anything id5 not collected not provided whatever id6 restricted access restricted access 10 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/no-columns.tsv000066400000000000000000000000151462552636000241070ustar00rootroot00000000000000id a b my-id qiime2-2024.5.0/qiime2/metadata/tests/data/valid/no-id-or-column-name-type-cast.tsv000066400000000000000000000001101462552636000275550ustar00rootroot00000000000000id 42.0 1000 -4.2 0.000001 2 b 2.5 0.004000 1 b 4.2 0.000000 3 c -9.999 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/no-newline-at-eof.tsv000066400000000000000000000000641462552636000252450ustar00rootroot00000000000000id col1 col2 col3 id1 1 a foo id2 2 b bar id3 3 c 42qiime2-2024.5.0/qiime2/metadata/tests/data/valid/non-standard-characters.tsv000066400000000000000000000013731462552636000265320ustar00rootroot00000000000000# See the corresponding unit test case for the goals of this file. The file # tests the following cases for IDs, column names, and cells. Many of the # choices are based on use-cases/bugs reported on the forum, Slack, etc. # # - Unicode characters # - Parentheses, underscores, less than (<), and greater than (>) # - Single and double quotes. Double quotes must be escaped according to the # Excel TSV dialect's double quote escaping rules. # - Escaped newlines (\n), carriage returns (\r), tabs (\t), and spaces # - Inline comment characters aren't treated as comments id ↩c@l1™ col(#2) #col'3 """""" "col 5" ©id##1 ƒoo (foo) #f o #o "fo o" ((id))2 ''2'' b#r "ba r" 'id_3<>' "b""ar" "c d" "4 2" """id#4""" b__a_z <42> >42 "i d 5" baz 42 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/numeric-column.tsv000066400000000000000000000001571462552636000247610ustar00rootroot00000000000000id col1 id1 0 id2 2.0 id3 0.00030 id4 -4.2 id5 1e-4 id6 1e4 id7 +1.5E+2 id8 id9 1. id10 .5 id11 1e-08 id12 -0 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/override-insdc.tsv000066400000000000000000000001371462552636000247370ustar00rootroot00000000000000id col1 #q2:missing no-missing id1 collected id2 not collected id3 not collected id4 collected qiime2-2024.5.0/qiime2/metadata/tests/data/valid/partial-types-directive.tsv000066400000000000000000000001131462552636000265660ustar00rootroot00000000000000id col1 col2 col3 #q2:types categorical id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/qiime1.tsv000066400000000000000000000001731462552636000232070ustar00rootroot00000000000000#SampleID col1 col2 col3 # A QIIME 1 mapping file can have comments # below the header. 
id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/qiita-preparation-information.tsv000066400000000000000000000001541462552636000277750ustar00rootroot00000000000000sample_name BARCODE EXPERIMENT_DESIGN_DESCRIPTION id.1 ACGT longitudinal study id.2 TGCA longitudinal study qiime2-2024.5.0/qiime2/metadata/tests/data/valid/qiita-sample-information.tsv000066400000000000000000000001321462552636000267260ustar00rootroot00000000000000sample_name DESCRIPTION TITLE id.1 description 1 A Title id.2 description 2 Another Title qiime2-2024.5.0/qiime2/metadata/tests/data/valid/recommended-ids.tsv000066400000000000000000000000731462552636000250600ustar00rootroot00000000000000id col1 c6ca034a-223f-40b4-a0e0-45942912a5ea foo My.ID bar qiime2-2024.5.0/qiime2/metadata/tests/data/valid/rows-shorter-than-header.tsv000066400000000000000000000000441462552636000266530ustar00rootroot00000000000000id col1 col2 col3 id1 1 a id2 2 id3 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/simple-with-directive.tsv000066400000000000000000000001371462552636000262400ustar00rootroot00000000000000id col1 col2 col3 #q2:types numeric categorical categorical id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/simple.tsv000066400000000000000000000000651462552636000233130ustar00rootroot00000000000000id col1 col2 col3 id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/simple.txt000066400000000000000000000000651462552636000233160ustar00rootroot00000000000000id col1 col2 col3 id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/single-column.tsv000066400000000000000000000000321462552636000245700ustar00rootroot00000000000000id col1 id1 1 id2 2 id3 3 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/single-id.tsv000066400000000000000000000000361462552636000236730ustar00rootroot00000000000000id col1 col2 col3 id1 1 a foo qiime2-2024.5.0/qiime2/metadata/tests/data/valid/trailing-columns.tsv000066400000000000000000000000751462552636000253120ustar00rootroot00000000000000id col1 col2 col3 id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/unix-line-endings.tsv000066400000000000000000000000651462552636000253570ustar00rootroot00000000000000id col1 col2 col3 id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/data/valid/windows-line-endings.tsv000066400000000000000000000000711462552636000260630ustar00rootroot00000000000000id col1 col2 col3 id1 1 a foo id2 2 b bar id3 3 c 42 qiime2-2024.5.0/qiime2/metadata/tests/test_io.py000066400000000000000000001544151462552636000213050ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import collections import os.path import pkg_resources import tempfile import unittest import numpy as np import pandas as pd from qiime2.metadata import (Metadata, CategoricalMetadataColumn, NumericMetadataColumn, MetadataFileError) def get_data_path(filename): return pkg_resources.resource_filename('qiime2.metadata.tests', 'data/%s' % filename) # NOTE: many of the test files in the `data` directory intentionally have # leading/trailing whitespace characters on some lines, as well as mixed usage # of spaces, tabs, carriage returns, and newlines. When editing these files, # please make sure your code editor doesn't strip these leading/trailing # whitespace characters (e.g. Atom does this by default), nor automatically # modify the files in some other way such as converting Windows-style CRLF # line terminators to Unix-style newlines. # # When committing changes to the files, carefully review the diff to make sure # unintended changes weren't introduced. class TestLoadErrors(unittest.TestCase): def test_path_does_not_exist(self): with self.assertRaisesRegex(MetadataFileError, "Metadata file path doesn't exist"): Metadata.load( '/qiime2/unit/tests/hopefully/this/path/does/not/exist') def test_path_is_directory(self): fp = get_data_path('valid') with self.assertRaisesRegex(MetadataFileError, "path points to something other than a " "file"): Metadata.load(fp) def test_non_utf_8_file(self): fp = get_data_path('invalid/non-utf-8.tsv') with self.assertRaisesRegex(MetadataFileError, 'encoded as UTF-8 or ASCII'): Metadata.load(fp) def test_utf_16_le_file(self): fp = get_data_path('invalid/simple-utf-16le.txt') with self.assertRaisesRegex(MetadataFileError, 'UTF-16 Unicode'): Metadata.load(fp) def test_utf_16_be_file(self): fp = get_data_path('invalid/simple-utf-16be.txt') with self.assertRaisesRegex(MetadataFileError, 'UTF-16 Unicode'): Metadata.load(fp) def test_empty_file(self): fp = get_data_path('invalid/empty-file') with self.assertRaisesRegex(MetadataFileError, 'locate header.*file may be empty'): Metadata.load(fp) def test_comments_and_empty_rows_only(self): fp = get_data_path('invalid/comments-and-empty-rows-only.tsv') with self.assertRaisesRegex(MetadataFileError, 'locate header.*only of comments or empty ' 'rows'): Metadata.load(fp) def test_header_only(self): fp = get_data_path('invalid/header-only.tsv') with self.assertRaisesRegex(MetadataFileError, 'at least one ID'): Metadata.load(fp) def test_header_only_with_comments_and_empty_rows(self): fp = get_data_path( 'invalid/header-only-with-comments-and-empty-rows.tsv') with self.assertRaisesRegex(MetadataFileError, 'at least one ID'): Metadata.load(fp) def test_qiime1_empty_mapping_file(self): fp = get_data_path('invalid/qiime1-empty.tsv') with self.assertRaisesRegex(MetadataFileError, 'at least one ID'): Metadata.load(fp) def test_invalid_header(self): fp = get_data_path('invalid/invalid-header.tsv') with self.assertRaisesRegex(MetadataFileError, 'unrecognized ID column name.*' 'invalid_id_header'): Metadata.load(fp) def test_empty_id(self): fp = get_data_path('invalid/empty-id.tsv') with self.assertRaisesRegex(MetadataFileError, 'empty metadata ID'): Metadata.load(fp) def test_whitespace_only_id(self): fp = get_data_path('invalid/whitespace-only-id.tsv') with self.assertRaisesRegex(MetadataFileError, 'empty metadata ID'): Metadata.load(fp) def test_empty_column_name(self): fp = get_data_path('invalid/empty-column-name.tsv') with 
self.assertRaisesRegex(MetadataFileError, 'column without a name'): Metadata.load(fp) def test_whitespace_only_column_name(self): fp = get_data_path('invalid/whitespace-only-column-name.tsv') with self.assertRaisesRegex(MetadataFileError, 'column without a name'): Metadata.load(fp) def test_duplicate_ids(self): fp = get_data_path('invalid/duplicate-ids.tsv') with self.assertRaisesRegex(MetadataFileError, 'IDs must be unique.*id1'): Metadata.load(fp) def test_duplicate_ids_with_whitespace(self): fp = get_data_path('invalid/duplicate-ids-with-whitespace.tsv') with self.assertRaisesRegex(MetadataFileError, 'IDs must be unique.*id1'): Metadata.load(fp) def test_duplicate_column_names(self): fp = get_data_path('invalid/duplicate-column-names.tsv') with self.assertRaisesRegex(MetadataFileError, 'Column names must be unique.*col1'): Metadata.load(fp) def test_duplicate_column_names_with_whitespace(self): fp = get_data_path( 'invalid/duplicate-column-names-with-whitespace.tsv') with self.assertRaisesRegex(MetadataFileError, 'Column names must be unique.*col1'): Metadata.load(fp) def test_id_conflicts_with_id_header(self): fp = get_data_path('invalid/id-conflicts-with-id-header.tsv') with self.assertRaisesRegex(MetadataFileError, "ID 'id' conflicts.*ID column header"): Metadata.load(fp) def test_column_name_conflicts_with_id_header(self): fp = get_data_path( 'invalid/column-name-conflicts-with-id-header.tsv') with self.assertRaisesRegex(MetadataFileError, "column name 'featureid' conflicts.*ID " "column header"): Metadata.load(fp) def test_column_types_unrecognized_column_name(self): fp = get_data_path('valid/simple.tsv') with self.assertRaisesRegex(MetadataFileError, 'not_a_column.*column_types.*not a column ' 'in the metadata file'): Metadata.load(fp, column_types={'not_a_column': 'numeric'}) def test_column_types_unrecognized_column_type(self): fp = get_data_path('valid/simple.tsv') with self.assertRaisesRegex(MetadataFileError, 'col2.*column_types.*unrecognized column ' 'type.*CATEGORICAL'): Metadata.load(fp, column_types={'col1': 'numeric', 'col2': 'CATEGORICAL'}) def test_column_types_not_convertible_to_numeric(self): fp = get_data_path('valid/simple.tsv') with self.assertRaisesRegex(MetadataFileError, "column 'col3' to numeric.*could not be " "interpreted as numeric: 'bar', 'foo'"): Metadata.load(fp, column_types={'col1': 'numeric', 'col2': 'categorical', 'col3': 'numeric'}) def test_column_types_override_directive_not_convertible_to_numeric(self): fp = get_data_path('valid/simple-with-directive.tsv') with self.assertRaisesRegex(MetadataFileError, "column 'col3' to numeric.*could not be " "interpreted as numeric: 'bar', 'foo'"): Metadata.load(fp, column_types={'col3': 'numeric'}) def test_directive_before_header(self): fp = get_data_path('invalid/directive-before-header.tsv') with self.assertRaisesRegex(MetadataFileError, 'directive.*#q2:types.*searching for ' 'header'): Metadata.load(fp) def test_unrecognized_directive(self): fp = get_data_path('invalid/unrecognized-directive.tsv') with self.assertRaisesRegex(MetadataFileError, 'Unrecognized directive.*#q2:foo.*' '#q2:types.*#q2:missing.*directive'): Metadata.load(fp) def test_duplicate_directives(self): fp = get_data_path('invalid/duplicate-directives.tsv') with self.assertRaisesRegex(MetadataFileError, 'duplicate directive.*#q2:types'): Metadata.load(fp) def test_unrecognized_column_type_in_directive(self): fp = get_data_path('invalid/unrecognized-column-type.tsv') with self.assertRaisesRegex(MetadataFileError, 'col2.*unrecognized column 
type.*foo.*' '#q2:types directive'): Metadata.load(fp) def test_column_types_directive_not_convertible_to_numeric(self): fp = get_data_path('invalid/types-directive-non-numeric.tsv') # This error message regex is intentionally verbose because we want to # assert that many different types of non-numeric strings aren't # interpreted as numbers. The error message displays a sorted list of # all values that couldn't be converted to numbers, making it possible # to test a variety of non-numeric strings in a single test case. msg = (r"column 'col2' to numeric.*could not be interpreted as " r"numeric: '\$42', '\+inf', '-inf', '0xAF', '1,000', " r"'1\.000\.0', '1_000_000', '1e3e4', 'Infinity', 'NA', 'NaN', " "'a', 'e3', 'foo', 'inf', 'nan', 'sample-1'") with self.assertRaisesRegex(MetadataFileError, msg): Metadata.load(fp) def test_directive_after_directives_section(self): fp = get_data_path( 'invalid/directive-after-directives-section.tsv') with self.assertRaisesRegex(MetadataFileError, '#q2:types.*outside of the directives ' 'section'): Metadata.load(fp) def test_directive_longer_than_header(self): fp = get_data_path('invalid/directive-longer-than-header.tsv') with self.assertRaisesRegex(MetadataFileError, 'row has 5 cells.*header declares 4 ' 'cells'): Metadata.load(fp) def test_data_longer_than_header(self): fp = get_data_path('invalid/data-longer-than-header.tsv') with self.assertRaisesRegex(MetadataFileError, 'row has 5 cells.*header declares 4 ' 'cells'): Metadata.load(fp) def test_unknown_missing_scheme(self): fp = get_data_path('invalid/missing-unknown-scheme.tsv') with self.assertRaisesRegex(MetadataFileError, 'col1.*BAD:SCHEME.*#q2:missing'): Metadata.load(fp) class TestLoadSuccess(unittest.TestCase): def setUp(self): self.temp_dir_obj = tempfile.TemporaryDirectory( prefix='qiime2-metadata-tests-temp-') self.temp_dir = self.temp_dir_obj.name # This Metadata object is compared against observed Metadata objects in # many of the tests, so just define it once here. self.simple_md = Metadata( pd.DataFrame({'col1': [1.0, 2.0, 3.0], 'col2': ['a', 'b', 'c'], 'col3': ['foo', 'bar', '42']}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) # Basic sanity check to make sure the columns are ordered and typed as # expected. It'd be unfortunate to compare observed results to expected # results that aren't representing what we think they are! obs_columns = [(name, props.type) for name, props in self.simple_md.columns.items()] exp_columns = [('col1', 'numeric'), ('col2', 'categorical'), ('col3', 'categorical')] self.assertEqual(obs_columns, exp_columns) def tearDown(self): self.temp_dir_obj.cleanup() def test_simple(self): # Simple metadata file without comments, empty rows, jaggedness, # missing data, odd IDs or column names, directives, etc. The file has # multiple column types (numeric, categorical, and something that has # mixed numbers and strings, which must be interpreted as categorical). 
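# For reference, valid/simple.tsv has a header row (id, col1, col2, col3) and # three data rows: id1/1/a/foo, id2/2/b/bar, and id3/3/c/42.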
fp = get_data_path('valid/simple.tsv') obs_md = Metadata.load(fp) self.assertEqual(obs_md, self.simple_md) def test_bom_simple_txt(self): # This is the encoding that notepad.exe will use most commonly fp = get_data_path('valid/BOM-simple.txt') obs_md = Metadata.load(fp) self.assertEqual(obs_md, self.simple_md) def test_different_file_extension(self): fp = get_data_path('valid/simple.txt') obs_md = Metadata.load(fp) self.assertEqual(obs_md, self.simple_md) def test_no_newline_at_eof(self): fp = get_data_path('valid/no-newline-at-eof.tsv') obs_md = Metadata.load(fp) self.assertEqual(obs_md, self.simple_md) def test_unix_line_endings(self): fp = get_data_path('valid/unix-line-endings.tsv') obs_md = Metadata.load(fp) self.assertEqual(obs_md, self.simple_md) def test_windows_line_endings(self): fp = get_data_path('valid/windows-line-endings.tsv') obs_md = Metadata.load(fp) self.assertEqual(obs_md, self.simple_md) def test_mac_line_endings(self): fp = get_data_path('valid/mac-line-endings.tsv') obs_md = Metadata.load(fp) self.assertEqual(obs_md, self.simple_md) def test_no_source_artifacts(self): fp = get_data_path('valid/simple.tsv') metadata = Metadata.load(fp) self.assertEqual(metadata.artifacts, ()) def test_retains_column_order(self): # Explicitly test that the file's column order is retained in the # Metadata object. Many of the test cases use files with column names # in alphabetical order (e.g. "col1", "col2", "col3"), which matches # how pandas orders columns in a DataFrame when supplied with a dict # (many of the test cases use this feature of the DataFrame # constructor when constructing the expected DataFrame). fp = get_data_path('valid/column-order.tsv') obs_md = Metadata.load(fp) # Supply DataFrame constructor with explicit column ordering instead of # a dict. 
exp_index = pd.Index(['id1', 'id2', 'id3'], name='id') exp_columns = ['z', 'y', 'x'] exp_data = [ [1.0, 'a', 'foo'], [2.0, 'b', 'bar'], [3.0, 'c', '42'] ] exp_df = pd.DataFrame(exp_data, index=exp_index, columns=exp_columns) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) def test_leading_trailing_whitespace(self): fp = get_data_path('valid/leading-trailing-whitespace.tsv') obs_md = Metadata.load(fp) self.assertEqual(obs_md, self.simple_md) def test_comments(self): fp = get_data_path('valid/comments.tsv') obs_md = Metadata.load(fp) self.assertEqual(obs_md, self.simple_md) def test_empty_rows(self): fp = get_data_path('valid/empty-rows.tsv') obs_md = Metadata.load(fp) self.assertEqual(obs_md, self.simple_md) def test_qiime1_mapping_file(self): fp = get_data_path('valid/qiime1.tsv') obs_md = Metadata.load(fp) exp_index = pd.Index(['id1', 'id2', 'id3'], name='#SampleID') exp_df = pd.DataFrame({'col1': [1.0, 2.0, 3.0], 'col2': ['a', 'b', 'c'], 'col3': ['foo', 'bar', '42']}, index=exp_index) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) def test_qiita_sample_information_file(self): fp = get_data_path('valid/qiita-sample-information.tsv') obs_md = Metadata.load(fp) exp_index = pd.Index(['id.1', 'id.2'], name='sample_name') exp_df = pd.DataFrame({ 'DESCRIPTION': ['description 1', 'description 2'], 'TITLE': ['A Title', 'Another Title']}, index=exp_index) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) def test_qiita_preparation_information_file(self): fp = get_data_path('valid/qiita-preparation-information.tsv') obs_md = Metadata.load(fp) exp_index = pd.Index(['id.1', 'id.2'], name='sample_name') exp_df = pd.DataFrame({ 'BARCODE': ['ACGT', 'TGCA'], 'EXPERIMENT_DESIGN_DESCRIPTION': ['longitudinal study', 'longitudinal study']}, index=exp_index) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) def test_biom_observation_metadata_file(self): fp = get_data_path('valid/biom-observation-metadata.tsv') obs_md = Metadata.load(fp) exp_index = pd.Index(['OTU_1', 'OTU_2'], name='#OTUID') exp_df = pd.DataFrame([['k__Bacteria;p__Firmicutes', 0.890], ['k__Bacteria', 0.9999]], columns=['taxonomy', 'confidence'], index=exp_index) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) def test_supported_id_headers(self): case_insensitive = { 'id', 'sampleid', 'sample id', 'sample-id', 'featureid', 'feature id', 'feature-id' } exact_match = { '#SampleID', '#Sample ID', '#OTUID', '#OTU ID', 'sample_name' } # Build a set of supported headers, including exact matches and headers # with different casing. headers = set() for header in case_insensitive: headers.add(header) headers.add(header.upper()) headers.add(header.title()) for header in exact_match: headers.add(header) fp = os.path.join(self.temp_dir, 'metadata.tsv') count = 0 for header in headers: with open(fp, 'w') as fh: fh.write('%s\tcolumn\nid1\tfoo\nid2\tbar\n' % header) obs_md = Metadata.load(fp) exp_index = pd.Index(['id1', 'id2'], name=header) exp_df = pd.DataFrame({'column': ['foo', 'bar']}, index=exp_index) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) count += 1 # Since this test case is a little complicated, make sure that the # expected number of comparisons are happening. 
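# (The 7 case-insensitive headers are each tested in 3 casings, giving 21 # comparisons, plus the 5 exact-match headers, for 26 total.)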
self.assertEqual(count, 26) def test_recommended_ids(self): fp = get_data_path('valid/recommended-ids.tsv') obs_md = Metadata.load(fp) exp_index = pd.Index(['c6ca034a-223f-40b4-a0e0-45942912a5ea', 'My.ID'], name='id') exp_df = pd.DataFrame({'col1': ['foo', 'bar']}, index=exp_index) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) def test_non_standard_characters(self): # Test that non-standard characters in IDs, column names, and cells are # handled correctly. The test case isn't exhaustive (e.g. it doesn't # test every Unicode character; that would be a nice additional test # case to have in the future). Instead, this test aims to be more of an # integration test for the robustness of the reader to non-standard # data. Many of the characters and their placement within the data file # are based on use-cases/bugs reported on the forum, Slack, etc. The # data file has comments explaining these test case choices in more # detail. fp = get_data_path('valid/non-standard-characters.tsv') obs_md = Metadata.load(fp) exp_index = pd.Index(['©id##1', '((id))2', "'id_3<>'", '"id#4"', 'i d\r\t\n5'], name='id') exp_columns = ['↩c@l1™', 'col(#2)', "#col'3", '""', 'col\t \r\n5'] exp_data = [ ['ƒoo', '(foo)', '#f o #o', 'fo\ro', np.nan], ["''2''", 'b#r', 'ba\nr', np.nan, np.nan], ['b"ar', 'c\td', '4\r\n2', np.nan, np.nan], ['b__a_z', '<42>', '>42', np.nan, np.nan], ['baz', np.nan, '42', np.nan, np.nan] ] exp_df = pd.DataFrame(exp_data, index=exp_index, columns=exp_columns) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) def test_missing_data(self): fp = get_data_path('valid/missing-data.tsv') obs_md = Metadata.load(fp) exp_index = pd.Index(['None', 'nan', 'NA'], name='id') exp_df = pd.DataFrame(collections.OrderedDict([ ('col1', [1.0, np.nan, np.nan]), ('NA', [np.nan, np.nan, np.nan]), ('col3', ['null', 'N/A', 'NA']), ('col4', np.array([np.nan, np.nan, np.nan], dtype=object))]), index=exp_index) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) # Test that column types are correct (mainly for the two empty columns; # one should be numeric, the other categorical).
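# (Per the comments in missing-data.tsv, an all-missing column is treated as # numeric by default, which is why column 'NA' is numeric here, while the # #q2:types directive overrides 'col4' to categorical.)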
obs_columns = [(name, props.type) for name, props in obs_md.columns.items()] exp_columns = [('col1', 'numeric'), ('NA', 'numeric'), ('col3', 'categorical'), ('col4', 'categorical')] self.assertEqual(obs_columns, exp_columns) def test_missing_insdc(self): fp = get_data_path('valid/missing-insdc.tsv') obs_md = Metadata.load(fp) exp_index = pd.Index(['id1', 'id2', 'id3', 'id4', 'id5', 'id6'], name='id') exp_df = pd.DataFrame({'col1': [1, 2, 3] + ([float('nan')] * 3), 'col2': ['a', 'b', 'c'] + ([float('nan')] * 3), 'col3': ['foo', 'bar', '42', 'anything', 'whatever', '10']}, index=exp_index) # not testing column_missing_schemes here on purpose, externally the # nan's shouldn't be meaningfully different exp_md = Metadata(exp_df) pd.testing.assert_frame_equal(obs_md.to_dataframe(), exp_md.to_dataframe()) obs_columns = [(name, props.type, props.missing_scheme) for name, props in obs_md.columns.items()] exp_columns = [ ('col1', 'numeric', 'INSDC:missing'), ('col2', 'categorical', 'INSDC:missing'), ('col3', 'categorical', 'no-missing') ] self.assertEqual(obs_columns, exp_columns) def test_insdc_no_directives(self): fp = get_data_path('valid/missing-insdc-no-directive.tsv') obs_md = Metadata.load(fp, default_missing_scheme='INSDC:missing') exp_index = pd.Index(['id1', 'id2', 'id3', 'id4', 'id5', 'id6'], name='id') exp_df = pd.DataFrame({'col1': [1, 2, 3] + ([float('nan')] * 3), 'col2': ['a', 'b', 'c'] + ([float('nan')] * 3), 'col3': ['foo', 'bar', '42', 'anything', 'whatever', '10']}, index=exp_index) # not testing column_missing_schemes here on purpose, externally the # nan's shouldn't be meaningfully different exp_md = Metadata(exp_df) pd.testing.assert_frame_equal(obs_md.to_dataframe(), exp_md.to_dataframe()) obs_columns = [(name, props.type, props.missing_scheme) for name, props in obs_md.columns.items()] exp_columns = [ ('col1', 'numeric', 'INSDC:missing'), ('col2', 'categorical', 'INSDC:missing'), ('col3', 'categorical', 'INSDC:missing') ] self.assertEqual(obs_columns, exp_columns) def test_insdc_override(self): fp = get_data_path('valid/override-insdc.tsv') # This file has INSDC terms, but they aren't missing values. obs_md = Metadata.load(fp, default_missing_scheme='INSDC:missing') exp_index = pd.Index(['id1', 'id2', 'id3', 'id4'], name='id') exp_df = pd.DataFrame({'col1': ['collected', 'not collected', 'not collected', 'collected']}, index=exp_index) pd.testing.assert_frame_equal(obs_md.to_dataframe(), exp_df) obs_columns = [(name, props.type, props.missing_scheme) for name, props in obs_md.columns.items()] exp_columns = [ ('col1', 'categorical', 'no-missing'), ] self.assertEqual(obs_columns, exp_columns) def test_minimal_file(self): # Simplest possible metadata file consists of one ID and zero columns. 
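# (valid/minimal.tsv is just the header cell 'id' followed by a single row # containing the ID 'a'.)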
fp = get_data_path('valid/minimal.tsv') obs_md = Metadata.load(fp) exp_index = pd.Index(['a'], name='id') exp_df = pd.DataFrame({}, index=exp_index) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) def test_single_id(self): fp = get_data_path('valid/single-id.tsv') obs_md = Metadata.load(fp) exp_index = pd.Index(['id1'], name='id') exp_df = pd.DataFrame({'col1': [1.0], 'col2': ['a'], 'col3': ['foo']}, index=exp_index) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) def test_no_columns(self): fp = get_data_path('valid/no-columns.tsv') obs_md = Metadata.load(fp) exp_index = pd.Index(['a', 'b', 'my-id'], name='id') exp_df = pd.DataFrame({}, index=exp_index) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) def test_single_column(self): fp = get_data_path('valid/single-column.tsv') obs_md = Metadata.load(fp) exp_index = pd.Index(['id1', 'id2', 'id3'], name='id') exp_df = pd.DataFrame({'col1': [1.0, 2.0, 3.0]}, index=exp_index) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) def test_trailing_columns(self): fp = get_data_path('valid/trailing-columns.tsv') obs_md = Metadata.load(fp) self.assertEqual(obs_md, self.simple_md) def test_jagged_trailing_columns(self): # Test case based on https://github.com/qiime2/qiime2/issues/335 fp = get_data_path('valid/jagged-trailing-columns.tsv') obs_md = Metadata.load(fp) self.assertEqual(obs_md, self.simple_md) def test_padding_rows_shorter_than_header(self): fp = get_data_path('valid/rows-shorter-than-header.tsv') obs_md = Metadata.load(fp) exp_index = pd.Index(['id1', 'id2', 'id3'], name='id') exp_df = pd.DataFrame({'col1': [1.0, 2.0, np.nan], 'col2': ['a', np.nan, np.nan], 'col3': [np.nan, np.nan, np.nan]}, index=exp_index) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) def test_all_cells_padded(self): fp = get_data_path('valid/all-cells-padded.tsv') obs_md = Metadata.load(fp) exp_index = pd.Index(['id1', 'id2', 'id3'], name='id') exp_df = pd.DataFrame({'col1': [np.nan, np.nan, np.nan], 'col2': [np.nan, np.nan, np.nan], 'col3': [np.nan, np.nan, np.nan]}, index=exp_index) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) def test_does_not_cast_ids_or_column_names(self): fp = get_data_path('valid/no-id-or-column-name-type-cast.tsv') obs_md = Metadata.load(fp) exp_index = pd.Index(['0.000001', '0.004000', '0.000000'], dtype=object, name='id') exp_columns = ['42.0', '1000', '-4.2'] exp_data = [ [2.0, 'b', 2.5], [1.0, 'b', 4.2], [3.0, 'c', -9.999] ] exp_df = pd.DataFrame(exp_data, index=exp_index, columns=exp_columns) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) def test_numeric_column(self): fp = get_data_path('valid/numeric-column.tsv') obs_md = Metadata.load(fp) exp_index = pd.Index(['id1', 'id2', 'id3', 'id4', 'id5', 'id6', 'id7', 'id8', 'id9', 'id10', 'id11', 'id12'], name='id') exp_df = pd.DataFrame({'col1': [0.0, 2.0, 0.0003, -4.2, 1e-4, 1e4, 1.5e2, np.nan, 1.0, 0.5, 1e-8, -0.0]}, index=exp_index) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) def test_numeric_column_as_categorical(self): fp = get_data_path('valid/numeric-column.tsv') obs_md = Metadata.load(fp, column_types={'col1': 'categorical'}) exp_index = pd.Index(['id1', 'id2', 'id3', 'id4', 'id5', 'id6', 'id7', 'id8', 'id9', 'id10', 'id11', 'id12'], name='id') exp_df = pd.DataFrame({'col1': ['0', '2.0', '0.00030', '-4.2', '1e-4', '1e4', '+1.5E+2', np.nan, '1.', '.5', '1e-08', '-0']}, index=exp_index) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) def test_with_complete_types_directive(self): fp = 
get_data_path('valid/complete-types-directive.tsv') obs_md = Metadata.load(fp) exp_index = pd.Index(['id1', 'id2', 'id3'], name='id') exp_df = pd.DataFrame({'col1': ['1', '2', '3'], 'col2': ['a', 'b', 'c'], 'col3': ['foo', 'bar', '42']}, index=exp_index) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) def test_with_partial_types_directive(self): fp = get_data_path('valid/partial-types-directive.tsv') obs_md = Metadata.load(fp) exp_index = pd.Index(['id1', 'id2', 'id3'], name='id') exp_df = pd.DataFrame({'col1': ['1', '2', '3'], 'col2': ['a', 'b', 'c'], 'col3': ['foo', 'bar', '42']}, index=exp_index) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) def test_with_empty_types_directive(self): fp = get_data_path('valid/empty-types-directive.tsv') obs_md = Metadata.load(fp) self.assertEqual(obs_md, self.simple_md) def test_with_case_insensitive_types_directive(self): fp = get_data_path('valid/case-insensitive-types-directive.tsv') obs_md = Metadata.load(fp) exp_index = pd.Index(['id1', 'id2', 'id3'], name='id') exp_df = pd.DataFrame({'col1': ['1', '2', '3'], 'col2': ['a', 'b', 'c'], 'col3': [-5.0, 0.0, 42.0]}, index=exp_index) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) def test_column_types_without_directive(self): fp = get_data_path('valid/simple.tsv') obs_md = Metadata.load(fp, column_types={'col1': 'categorical'}) exp_index = pd.Index(['id1', 'id2', 'id3'], name='id') exp_df = pd.DataFrame({'col1': ['1', '2', '3'], 'col2': ['a', 'b', 'c'], 'col3': ['foo', 'bar', '42']}, index=exp_index) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) def test_column_types_override_directive(self): fp = get_data_path('valid/simple-with-directive.tsv') obs_md = Metadata.load(fp, column_types={'col1': 'categorical', 'col2': 'categorical'}) exp_index = pd.Index(['id1', 'id2', 'id3'], name='id') exp_df = pd.DataFrame({'col1': ['1', '2', '3'], 'col2': ['a', 'b', 'c'], 'col3': ['foo', 'bar', '42']}, index=exp_index) exp_md = Metadata(exp_df) self.assertEqual(obs_md, exp_md) class TestSave(unittest.TestCase): def setUp(self): self.temp_dir_obj = tempfile.TemporaryDirectory( prefix='qiime2-metadata-tests-temp-') self.temp_dir = self.temp_dir_obj.name self.filepath = os.path.join(self.temp_dir, 'metadata.tsv') def tearDown(self): self.temp_dir_obj.cleanup() def test_simple(self): md = Metadata(pd.DataFrame( {'col1': [1.0, 2.0, 3.0], 'col2': ['a', 'b', 'c'], 'col3': ['foo', 'bar', '42']}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) md.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "id\tcol1\tcol2\tcol3\n" "#q2:types\tnumeric\tcategorical\tcategorical\n" "id1\t1\ta\tfoo\n" "id2\t2\tb\tbar\n" "id3\t3\tc\t42\n" ) self.assertEqual(obs, exp) def test_save_metadata_auto_extension(self): md = Metadata(pd.DataFrame( {'col1': [1.0, 2.0, 3.0], 'col2': ['a', 'b', 'c'], 'col3': ['foo', 'bar', '42']}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) # Filename & extension endswith is matching (non-default). fp = os.path.join(self.temp_dir, 'metadatatsv') obs_md = md.save(fp, '.tsv') obs_filename = os.path.basename(obs_md) self.assertEqual(obs_filename, 'metadatatsv.tsv') # No period in filename; no extension included. fp = os.path.join(self.temp_dir, 'metadata') obs_md = md.save(fp) obs_filename = os.path.basename(obs_md) self.assertEqual(obs_filename, 'metadata') # No period in filename; no period in extension. 
fp = os.path.join(self.temp_dir, 'metadata') obs_md = md.save(fp, 'tsv') obs_filename = os.path.basename(obs_md) self.assertEqual(obs_filename, 'metadata.tsv') # No period in filename; multiple periods in extension. fp = os.path.join(self.temp_dir, 'metadata') obs_md = md.save(fp, '..tsv') obs_filename = os.path.basename(obs_md) self.assertEqual(obs_filename, 'metadata.tsv') # Single period in filename; no period in extension. fp = os.path.join(self.temp_dir, 'metadata.') obs_md = md.save(fp, 'tsv') obs_filename = os.path.basename(obs_md) self.assertEqual(obs_filename, 'metadata.tsv') # Single period in filename; single period in extension. fp = os.path.join(self.temp_dir, 'metadata.') obs_md = md.save(fp, '.tsv') obs_filename = os.path.basename(obs_md) self.assertEqual(obs_filename, 'metadata.tsv') # Single period in filename; multiple periods in extension. fp = os.path.join(self.temp_dir, 'metadata.') obs_md = md.save(fp, '..tsv') obs_filename = os.path.basename(obs_md) self.assertEqual(obs_filename, 'metadata.tsv') # Multiple periods in filename; single period in extension. fp = os.path.join(self.temp_dir, 'metadata..') obs_md = md.save(fp, '.tsv') obs_filename = os.path.basename(obs_md) self.assertEqual(obs_filename, 'metadata.tsv') # Multiple periods in filename; multiple periods in extension. fp = os.path.join(self.temp_dir, 'metadata..') obs_md = md.save(fp, '..tsv') obs_filename = os.path.basename(obs_md) self.assertEqual(obs_filename, 'metadata.tsv') # No extension in filename; no extension input. fp = os.path.join(self.temp_dir, 'metadata') obs_md = md.save(fp) obs_filename = os.path.basename(obs_md) self.assertEqual(obs_filename, 'metadata') # No extension in filename; extension input. fp = os.path.join(self.temp_dir, 'metadata') obs_md = md.save(fp, '.tsv') obs_filename = os.path.basename(obs_md) self.assertEqual(obs_filename, 'metadata.tsv') # Extension in filename; no extension input. fp = os.path.join(self.temp_dir, 'metadata.tsv') obs_md = md.save(fp) obs_filename = os.path.basename(obs_md) self.assertEqual(obs_filename, 'metadata.tsv') # Extension in filename; extension input (non-matching). fp = os.path.join(self.temp_dir, 'metadata.tsv') obs_md = md.save(fp, '.txt') obs_filename = os.path.basename(obs_md) self.assertEqual(obs_filename, 'metadata.tsv.txt') # Extension in filename; extension input (matching). 
fp = os.path.join(self.temp_dir, 'metadata.tsv') obs_md = md.save(fp, '.tsv') obs_filename = os.path.basename(obs_md) self.assertEqual(obs_filename, 'metadata.tsv') def test_no_bom(self): md = Metadata(pd.DataFrame( {'col1': [1.0, 2.0, 3.0], 'col2': ['a', 'b', 'c'], 'col3': ['foo', 'bar', '42']}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) md.save(self.filepath) with open(self.filepath, 'rb') as fh: obs = fh.read(2) self.assertEqual(obs, b'id') def test_different_file_extension(self): md = Metadata(pd.DataFrame( {'col1': [1.0, 2.0, 3.0], 'col2': ['a', 'b', 'c'], 'col3': ['foo', 'bar', '42']}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) filepath = os.path.join(self.temp_dir, 'metadata.txt') md.save(filepath) with open(filepath, 'r') as fh: obs = fh.read() exp = ( "id\tcol1\tcol2\tcol3\n" "#q2:types\tnumeric\tcategorical\tcategorical\n" "id1\t1\ta\tfoo\n" "id2\t2\tb\tbar\n" "id3\t3\tc\t42\n" ) self.assertEqual(obs, exp) def test_some_missing_data(self): md = Metadata( pd.DataFrame({'col1': [42.0, np.nan, -3.5], 'col2': ['a', np.nan, np.nan]}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) md.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "id\tcol1\tcol2\n" "#q2:types\tnumeric\tcategorical\n" "id1\t42\ta\n" "id2\t\t\n" "id3\t-3.5\t\n" ) self.assertEqual(obs, exp) def test_all_missing_data(self): # nan-only columns that are numeric or categorical. md = Metadata( pd.DataFrame({'col1': [np.nan, np.nan, np.nan], 'col2': np.array([np.nan, np.nan, np.nan], dtype=object)}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) md.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "id\tcol1\tcol2\n" "#q2:types\tnumeric\tcategorical\n" "id1\t\t\n" "id2\t\t\n" "id3\t\t\n" ) self.assertEqual(obs, exp) def test_missing_schemes(self): md = Metadata( pd.DataFrame({'col1': [42.0, np.nan, -3.5], 'col2': ['a', 'not applicable', 'restricted access']}, index=pd.Index(['id1', 'id2', 'id3'], name='id')), column_missing_schemes={ 'col1': 'blank', 'col2': 'INSDC:missing'} ) md.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "id\tcol1\tcol2\n" "#q2:types\tnumeric\tcategorical\n" "#q2:missing\tblank\tINSDC:missing\n" "id1\t42\ta\n" "id2\t\tnot applicable\n" "id3\t-3.5\trestricted access\n" ) self.assertEqual(obs, exp) def test_default_missing_scheme(self): md = Metadata( pd.DataFrame({'col1': [42.0, np.nan, -3.5], 'col2': ['a', 'not applicable', 'restricted access']}, index=pd.Index(['id1', 'id2', 'id3'], name='id')), default_missing_scheme='INSDC:missing') md.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "id\tcol1\tcol2\n" "#q2:types\tnumeric\tcategorical\n" "#q2:missing\tINSDC:missing\tINSDC:missing\n" "id1\t42\ta\n" "id2\t\tnot applicable\n" "id3\t-3.5\trestricted access\n" ) self.assertEqual(obs, exp) def test_default_missing_scheme_override(self): md = Metadata( pd.DataFrame({'col1': [42.0, np.nan, -3.5], 'col2': ['a', 'not applicable', 'restricted access']}, index=pd.Index(['id1', 'id2', 'id3'], name='id')), default_missing_scheme='q2:error', column_missing_schemes=dict(col1='INSDC:missing', col2='INSDC:missing')) md.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "id\tcol1\tcol2\n" "#q2:types\tnumeric\tcategorical\n" "#q2:missing\tINSDC:missing\tINSDC:missing\n" "id1\t42\ta\n" "id2\t\tnot applicable\n" "id3\t-3.5\trestricted access\n" ) self.assertEqual(obs, exp) def test_unsorted_column_order(self): index = pd.Index(['id1', 'id2', 
'id3'], name='id') columns = ['z', 'b', 'y'] data = [ [1.0, 'a', 'foo'], [2.0, 'b', 'bar'], [3.0, 'c', '42'] ] md = Metadata(pd.DataFrame(data, index=index, columns=columns)) md.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "id\tz\tb\ty\n" "#q2:types\tnumeric\tcategorical\tcategorical\n" "id1\t1\ta\tfoo\n" "id2\t2\tb\tbar\n" "id3\t3\tc\t42\n" ) self.assertEqual(obs, exp) def test_alternate_id_header(self): md = Metadata(pd.DataFrame( {'col1': [1.0, 2.0, 3.0], 'col2': ['a', 'b', 'c'], 'col3': ['foo', 'bar', '42']}, index=pd.Index(['id1', 'id2', 'id3'], name='#SampleID'))) md.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "#SampleID\tcol1\tcol2\tcol3\n" "#q2:types\tnumeric\tcategorical\tcategorical\n" "id1\t1\ta\tfoo\n" "id2\t2\tb\tbar\n" "id3\t3\tc\t42\n" ) self.assertEqual(obs, exp) def test_various_numbers(self): numbers = [ 0.0, -0.0, np.nan, 1.0, 42.0, -33.0, 1e-10, 1.5e15, 0.0003, -4.234, # This last number should be rounded because it exceeds 15 digits # of precision. 12.34567891234567 ] index = pd.Index(['id1', 'id2', 'id3', 'id4', 'id5', 'id6', 'id7', 'id8', 'id9', 'id10', 'id11'], name='ID') md = Metadata(pd.DataFrame({'numbers': numbers}, index=index)) md.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "ID\tnumbers\n" "#q2:types\tnumeric\n" "id1\t0\n" "id2\t-0\n" "id3\t\n" "id4\t1\n" "id5\t42\n" "id6\t-33\n" "id7\t1e-10\n" "id8\t1.5e+15\n" "id9\t0.0003\n" "id10\t-4.234\n" "id11\t12.3456789123457\n" ) self.assertEqual(obs, exp) def test_minimal(self): md = Metadata(pd.DataFrame({}, index=pd.Index(['my-id'], name='id'))) md.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "id\n" "#q2:types\n" "my-id\n" ) self.assertEqual(obs, exp) def test_single_id(self): md = Metadata(pd.DataFrame( {'col1': ['foo'], 'col2': [4.002]}, index=pd.Index(['my-id'], name='featureid'))) md.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "featureid\tcol1\tcol2\n" "#q2:types\tcategorical\tnumeric\n" "my-id\tfoo\t4.002\n" ) self.assertEqual(obs, exp) def test_no_columns(self): md = Metadata(pd.DataFrame( {}, index=pd.Index(['foo', 'bar', 'baz'], name='id'))) md.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "id\n" "#q2:types\n" "foo\n" "bar\n" "baz\n" ) self.assertEqual(obs, exp) def test_single_column(self): md = Metadata(pd.DataFrame( {'col1': ['42', '4.3', '4.4000']}, index=pd.Index(['foo', 'bar', 'baz'], name='id'))) md.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "id\tcol1\n" "#q2:types\tcategorical\n" "foo\t42\n" "bar\t4.3\n" "baz\t4.4000\n" ) self.assertEqual(obs, exp) def test_ids_and_column_names_as_numeric_strings(self): index = pd.Index(['0.000001', '0.004000', '0.000000'], dtype=object, name='id') columns = ['42.0', '1000', '-4.2'] data = [ [2.0, 'b', 2.5], [1.0, 'b', 4.2], [3.0, 'c', -9.999] ] df = pd.DataFrame(data, index=index, columns=columns) md = Metadata(df) md.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "id\t42.0\t1000\t-4.2\n" "#q2:types\tnumeric\tcategorical\tnumeric\n" "0.000001\t2\tb\t2.5\n" "0.004000\t1\tb\t4.2\n" "0.000000\t3\tc\t-9.999\n" ) self.assertEqual(obs, exp) # A couple of basic tests for CategoricalMetadataColumn and # NumericMetadataColumn below. Those classes simply transform themselves # into single-column Metadata objects within `MetadataColumn.save()` and # use the same writer code from there on. 
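# Illustrative sketch (not part of the original suite) of the equivalence
# those tests rely on; `path` is a hypothetical filepath and the exact
# internals of `MetadataColumn.save()` may differ:
#
#     series = pd.Series(['foo', 'bar'], name='col1',
#                        index=pd.Index(['id1', 'id2'], name='id'))
#     CategoricalMetadataColumn(series).save(path)
#     # ...is expected to write the same TSV as saving the equivalent
#     # single-column Metadata object:
#     Metadata(series.to_frame()).save(path)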
def test_categorical_metadata_column(self): mdc = CategoricalMetadataColumn(pd.Series( ['foo', 'bar', '42.50'], name='categorical-column', index=pd.Index(['id1', 'id2', 'id3'], name='id'))) mdc.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "id\tcategorical-column\n" "#q2:types\tcategorical\n" "id1\tfoo\n" "id2\tbar\n" "id3\t42.50\n" ) self.assertEqual(obs, exp) def test_categorical_metadata_column_insdc_no_missing(self): mdc = CategoricalMetadataColumn(pd.Series( ['foo', 'bar', '42.50'], name='categorical-column', index=pd.Index(['id1', 'id2', 'id3'], name='id')), missing_scheme='INSDC:missing') mdc.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "id\tcategorical-column\n" "#q2:types\tcategorical\n" "#q2:missing\tINSDC:missing\n" "id1\tfoo\n" "id2\tbar\n" "id3\t42.50\n" ) self.assertEqual(obs, exp) def test_categorical_metadata_column_insdc_missing(self): mdc = CategoricalMetadataColumn(pd.Series( ['foo', 'missing', '42.50'], name='categorical-column', index=pd.Index(['id1', 'id2', 'id3'], name='id')), missing_scheme='INSDC:missing') mdc.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "id\tcategorical-column\n" "#q2:types\tcategorical\n" "#q2:missing\tINSDC:missing\n" "id1\tfoo\n" "id2\tmissing\n" "id3\t42.50\n" ) self.assertEqual(obs, exp) def test_numeric_metadata_column(self): mdc = NumericMetadataColumn(pd.Series( [1e-15, 42.50, -999.0], name='numeric-column', index=pd.Index(['id1', 'id2', 'id3'], name='#OTU ID'))) mdc.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "#OTU ID\tnumeric-column\n" "#q2:types\tnumeric\n" "id1\t1e-15\n" "id2\t42.5\n" "id3\t-999\n" ) self.assertEqual(obs, exp) def test_numeric_metadata_column_insdc_no_missing(self): mdc = NumericMetadataColumn(pd.Series( [1e-15, 42.50, -999.0], name='numeric-column', index=pd.Index(['id1', 'id2', 'id3'], name='#OTU ID')), missing_scheme='INSDC:missing') mdc.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "#OTU ID\tnumeric-column\n" "#q2:types\tnumeric\n" "#q2:missing\tINSDC:missing\n" "id1\t1e-15\n" "id2\t42.5\n" "id3\t-999\n" ) self.assertEqual(obs, exp) def test_numeric_metadata_column_insdc_missing(self): mdc = NumericMetadataColumn(pd.Series( [1e-15, 'missing', -999.0], name='numeric-column', index=pd.Index(['id1', 'id2', 'id3'], name='#OTU ID')), missing_scheme='INSDC:missing') mdc.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "#OTU ID\tnumeric-column\n" "#q2:types\tnumeric\n" "#q2:missing\tINSDC:missing\n" "id1\t1e-15\n" "id2\tmissing\n" "id3\t-999\n" ) self.assertEqual(obs, exp) # TODO this class spot-checks some of the more "difficult" valid files to make # sure they can be read, written to disk, and read again in a lossless way. # A more complete strategy (with fewer test cases) would be performing a # roundtrip test on every valid file under the `data` directory (e.g. with a # `glob` and for loop). 
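# A minimal sketch of that glob-based strategy (illustrative only, not one of
# the original tests; it would need to run inside a test case that provides
# `self.filepath`). It also assumes every file under `data/valid/` loads with
# the default `Metadata.load()` arguments, which may not hold for files that
# require a non-default missing scheme or other loader options:
#
#     import glob
#     import os
#
#     valid_dir = os.path.dirname(get_data_path('valid/simple.tsv'))
#     for fp in glob.glob(os.path.join(valid_dir, '*.tsv')):
#         md1 = Metadata.load(fp)
#         md1.save(self.filepath)
#         self.assertEqual(Metadata.load(self.filepath), md1)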
class TestRoundtrip(unittest.TestCase): def setUp(self): self.temp_dir_obj = tempfile.TemporaryDirectory( prefix='qiime2-metadata-tests-temp-') self.temp_dir = self.temp_dir_obj.name self.filepath = os.path.join(self.temp_dir, 'metadata.tsv') def tearDown(self): self.temp_dir_obj.cleanup() def test_simple(self): fp = get_data_path('valid/simple.tsv') md1 = Metadata.load(fp) md1.save(self.filepath) md2 = Metadata.load(self.filepath) self.assertEqual(md1, md2) def test_non_standard_characters(self): fp = get_data_path('valid/non-standard-characters.tsv') md1 = Metadata.load(fp) md1.save(self.filepath) md2 = Metadata.load(self.filepath) self.assertEqual(md1, md2) def test_missing_data(self): fp = get_data_path('valid/missing-data.tsv') md1 = Metadata.load(fp) md1.save(self.filepath) md2 = Metadata.load(self.filepath) self.assertEqual(md1, md2) def test_missing_insdc(self): fp = get_data_path('valid/missing-insdc.tsv') md1 = Metadata.load(fp) md1.save(self.filepath) md2 = Metadata.load(self.filepath) self.assertEqual(md1, md2) def test_minimal_file(self): fp = get_data_path('valid/minimal.tsv') md1 = Metadata.load(fp) md1.save(self.filepath) md2 = Metadata.load(self.filepath) self.assertEqual(md1, md2) def test_numeric_column(self): fp = get_data_path('valid/numeric-column.tsv') md1 = Metadata.load(fp) md1.save(self.filepath) md2 = Metadata.load(self.filepath) self.assertEqual(md1, md2) def test_all_cells_padded(self): fp = get_data_path('valid/all-cells-padded.tsv') md1 = Metadata.load(fp) md1.save(self.filepath) md2 = Metadata.load(self.filepath) self.assertEqual(md1, md2) def test_categorical_metadata_column(self): fp = get_data_path('valid/simple.tsv') md1 = Metadata.load(fp) mdc1 = md1.get_column('col2') self.assertIsInstance(mdc1, CategoricalMetadataColumn) mdc1.save(self.filepath) md2 = Metadata.load(self.filepath) mdc2 = md2.get_column('col2') self.assertIsInstance(mdc1, CategoricalMetadataColumn) self.assertEqual(mdc1, mdc2) def test_numeric_metadata_column(self): fp = get_data_path('valid/simple.tsv') md1 = Metadata.load(fp) mdc1 = md1.get_column('col1') self.assertIsInstance(mdc1, NumericMetadataColumn) mdc1.save(self.filepath) md2 = Metadata.load(self.filepath) mdc2 = md2.get_column('col1') self.assertIsInstance(mdc1, NumericMetadataColumn) self.assertEqual(mdc1, mdc2) if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/metadata/tests/test_metadata.py000066400000000000000000002030331462552636000224450ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import collections import unittest import warnings import pandas as pd import numpy as np from qiime2 import Artifact from qiime2.metadata import (Metadata, CategoricalMetadataColumn, NumericMetadataColumn) from qiime2.core.testing.util import get_dummy_plugin, ReallyEqualMixin class TestInvalidMetadataConstruction(unittest.TestCase): def test_non_dataframe(self): with self.assertRaisesRegex( TypeError, 'Metadata constructor.*DataFrame.*not.*Series'): Metadata(pd.Series([1, 2, 3], name='col', index=pd.Index(['a', 'b', 'c'], name='id'))) def test_no_ids(self): with self.assertRaisesRegex(ValueError, 'Metadata.*at least one ID'): Metadata(pd.DataFrame({}, index=pd.Index([], name='id'))) with self.assertRaisesRegex(ValueError, 'Metadata.*at least one ID'): Metadata(pd.DataFrame({'column': []}, index=pd.Index([], name='id'))) def test_invalid_id_header(self): # default index name with self.assertRaisesRegex(ValueError, r'Index\.name.*None'): Metadata(pd.DataFrame( {'col': [1, 2, 3]}, index=pd.Index(['a', 'b', 'c']))) with self.assertRaisesRegex(ValueError, r'Index\.name.*my-id-header'): Metadata(pd.DataFrame( {'col': [1, 2, 3]}, index=pd.Index(['a', 'b', 'c'], name='my-id-header'))) def test_non_str_id(self): with self.assertRaisesRegex( TypeError, 'non-string metadata ID.*type.*float.*nan'): Metadata(pd.DataFrame( {'col': [1, 2, 3]}, index=pd.Index(['a', np.nan, 'c'], name='id'))) def test_non_str_column_name(self): with self.assertRaisesRegex( TypeError, 'non-string metadata column name.*type.*' 'float.*nan'): Metadata(pd.DataFrame( {'col': [1, 2, 3], np.nan: [4, 5, 6]}, index=pd.Index(['a', 'b', 'c'], name='id'))) def test_empty_id(self): with self.assertRaisesRegex( ValueError, 'empty metadata ID.*at least one character'): Metadata(pd.DataFrame( {'col': [1, 2, 3]}, index=pd.Index(['a', '', 'c'], name='id'))) def test_empty_column_name(self): with self.assertRaisesRegex( ValueError, 'empty metadata column name.*' 'at least one character'): Metadata(pd.DataFrame( {'col': [1, 2, 3], '': [4, 5, 6]}, index=pd.Index(['a', 'b', 'c'], name='id'))) def test_pound_sign_id(self): with self.assertRaisesRegex( ValueError, "metadata ID.*begins with a pound sign.*'#b'"): Metadata(pd.DataFrame( {'col': [1, 2, 3]}, index=pd.Index(['a', '#b', 'c'], name='id'))) def test_id_conflicts_with_id_header(self): with self.assertRaisesRegex( ValueError, "metadata ID 'sample-id'.*conflicts.*reserved.*" "ID header"): Metadata(pd.DataFrame( {'col': [1, 2, 3]}, index=pd.Index(['a', 'sample-id', 'c'], name='id'))) def test_column_name_conflicts_with_id_header(self): with self.assertRaisesRegex( ValueError, "metadata column name 'featureid'.*conflicts.*" "reserved.*ID header"): Metadata(pd.DataFrame( {'col': [1, 2, 3], 'featureid': [4, 5, 6]}, index=pd.Index(['a', 'b', 'c'], name='id'))) def test_duplicate_ids(self): with self.assertRaisesRegex(ValueError, "Metadata IDs.*unique.*'a'"): Metadata(pd.DataFrame( {'col': [1, 2, 3]}, index=pd.Index(['a', 'b', 'a'], name='id'))) def test_duplicate_column_names(self): data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] with self.assertRaisesRegex(ValueError, "Metadata column names.*unique.*'col1'"): Metadata(pd.DataFrame(data, columns=['col1', 'col2', 'col1'], index=pd.Index(['a', 'b', 'c'], name='id'))) def test_unsupported_column_dtype(self): with self.assertRaisesRegex( TypeError, "Metadata column 'col2'.*unsupported.*dtype.*bool"): Metadata(pd.DataFrame( {'col1': [1, 2, 3], 'col2': [True, False, True]}, 
index=pd.Index(['a', 'b', 'c'], name='id'))) def test_categorical_column_unsupported_type(self): with self.assertRaisesRegex( TypeError, "CategoricalMetadataColumn.*strings or missing " r"values.*42\.5.*float.*'col2'"): Metadata(pd.DataFrame( {'col1': [1, 2, 3], 'col2': ['foo', 'bar', 42.5]}, index=pd.Index(['a', 'b', 'c'], name='id'))) def test_categorical_column_empty_str(self): with self.assertRaisesRegex( ValueError, "CategoricalMetadataColumn.*empty strings.*" "column 'col2'"): Metadata(pd.DataFrame( {'col1': [1, 2, 3], 'col2': ['foo', '', 'bar']}, index=pd.Index(['a', 'b', 'c'], name='id'))) def test_numeric_column_infinity(self): with self.assertRaisesRegex( ValueError, "NumericMetadataColumn.*positive or negative " "infinity.*column 'col2'"): Metadata(pd.DataFrame( {'col1': ['foo', 'bar', 'baz'], 'col2': [42, float('+inf'), 4.3]}, index=pd.Index(['a', 'b', 'c'], name='id'))) def test_unknown_missing_scheme(self): with self.assertRaisesRegex(ValueError, "BAD:SCHEME"): Metadata(pd.DataFrame( {'col1': [1, 2, 3], 'col2': ['foo', 'bar', 'bar']}, index=pd.Index(['a', 'b', 'c'], name='id')), default_missing_scheme='BAD:SCHEME') def test_missing_q2_error(self): index = pd.Index(['None', 'nan', 'NA', 'foo'], name='id') df = pd.DataFrame(collections.OrderedDict([ ('col1', [1.0, np.nan, np.nan, np.nan]), ('NA', [np.nan, np.nan, np.nan, np.nan]), ('col3', ['null', 'N/A', np.nan, 'NA']), ('col4', np.array([np.nan, np.nan, np.nan, np.nan], dtype=object))]), index=index) with self.assertRaisesRegex(ValueError, 'col1.*no-missing'): Metadata(df, default_missing_scheme='no-missing') class TestMetadataConstructionAndProperties(unittest.TestCase): def assertEqualColumns(self, obs_columns, exp): obs = [(name, props.type) for name, props in obs_columns.items()] self.assertEqual(obs, exp) def test_minimal(self): md = Metadata(pd.DataFrame({}, index=pd.Index(['a'], name='id'))) self.assertEqual(md.id_count, 1) self.assertEqual(md.column_count, 0) self.assertEqual(md.id_header, 'id') self.assertEqual(md.ids, ('a',)) self.assertEqualColumns(md.columns, []) def test_single_id(self): index = pd.Index(['id1'], name='id') df = pd.DataFrame({'col1': [1.0], 'col2': ['a'], 'col3': ['foo']}, index=index) md = Metadata(df) self.assertEqual(md.id_count, 1) self.assertEqual(md.column_count, 3) self.assertEqual(md.id_header, 'id') self.assertEqual(md.ids, ('id1',)) self.assertEqualColumns(md.columns, [('col1', 'numeric'), ('col2', 'categorical'), ('col3', 'categorical')]) def test_no_columns(self): index = pd.Index(['id1', 'id2', 'foo'], name='id') df = pd.DataFrame({}, index=index) md = Metadata(df) self.assertEqual(md.id_count, 3) self.assertEqual(md.column_count, 0) self.assertEqual(md.id_header, 'id') self.assertEqual(md.ids, ('id1', 'id2', 'foo')) self.assertEqualColumns(md.columns, []) def test_single_column(self): index = pd.Index(['id1', 'a', 'my-id'], name='id') df = pd.DataFrame({'column': ['foo', 'bar', 'baz']}, index=index) md = Metadata(df) self.assertEqual(md.id_count, 3) self.assertEqual(md.column_count, 1) self.assertEqual(md.id_header, 'id') self.assertEqual(md.ids, ('id1', 'a', 'my-id')) self.assertEqualColumns(md.columns, [('column', 'categorical')]) def test_retains_column_order(self): # Supply DataFrame constructor with explicit column ordering instead of # a dict. 
index = pd.Index(['id1', 'id2', 'id3'], name='id') columns = ['z', 'a', 'ch'] data = [ [1.0, 'a', 'foo'], [2.0, 'b', 'bar'], [3.0, 'c', '42'] ] df = pd.DataFrame(data, index=index, columns=columns) md = Metadata(df) self.assertEqual(md.id_count, 3) self.assertEqual(md.column_count, 3) self.assertEqual(md.id_header, 'id') self.assertEqual(md.ids, ('id1', 'id2', 'id3')) self.assertEqualColumns(md.columns, [('z', 'numeric'), ('a', 'categorical'), ('ch', 'categorical')]) def test_supported_id_headers(self): case_insensitive = { 'id', 'sampleid', 'sample id', 'sample-id', 'featureid', 'feature id', 'feature-id' } exact_match = { '#SampleID', '#Sample ID', '#OTUID', '#OTU ID', 'sample_name' } # Build a set of supported headers, including exact matches and headers # with different casing. headers = set() for header in case_insensitive: headers.add(header) headers.add(header.upper()) headers.add(header.title()) for header in exact_match: headers.add(header) count = 0 for header in headers: index = pd.Index(['id1', 'id2'], name=header) df = pd.DataFrame({'column': ['foo', 'bar']}, index=index) md = Metadata(df) self.assertEqual(md.id_header, header) count += 1 # Since this test case is a little complicated, make sure that the # expected number of comparisons are happening. self.assertEqual(count, 26) def test_recommended_ids(self): index = pd.Index(['c6ca034a-223f-40b4-a0e0-45942912a5ea', 'My.ID'], name='id') df = pd.DataFrame({'col1': ['foo', 'bar']}, index=index) md = Metadata(df) self.assertEqual(md.id_count, 2) self.assertEqual(md.column_count, 1) self.assertEqual(md.id_header, 'id') self.assertEqual(md.ids, ('c6ca034a-223f-40b4-a0e0-45942912a5ea', 'My.ID')) self.assertEqualColumns(md.columns, [('col1', 'categorical')]) def test_non_standard_characters(self): index = pd.Index(['©id##1', '((id))2', "'id_3<>'", '"id#4"', 'i d\r\t\n5'], name='id') columns = ['↩c@l1™', 'col(#2)', "#col'3", '""', 'col\t \r\n5'] data = [ ['ƒoo', '(foo)', '#f o #o', 'fo\ro', np.nan], ["''2''", 'b#r', 'ba\nr', np.nan, np.nan], ['b"ar', 'c\td', '4\r\n2', np.nan, np.nan], ['b__a_z', '<42>', '>42', np.nan, np.nan], ['baz', np.nan, '42'] ] df = pd.DataFrame(data, index=index, columns=columns) md = Metadata(df) self.assertEqual(md.id_count, 5) self.assertEqual(md.column_count, 5) self.assertEqual(md.id_header, 'id') self.assertEqual( md.ids, ('©id##1', '((id))2', "'id_3<>'", '"id#4"', 'i d\r\t\n5')) self.assertEqualColumns(md.columns, [('↩c@l1™', 'categorical'), ('col(#2)', 'categorical'), ("#col'3", 'categorical'), ('""', 'categorical'), ('col\t \r\n5', 'numeric')]) def test_missing_data(self): index = pd.Index(['None', 'nan', 'NA', 'foo'], name='id') df = pd.DataFrame(collections.OrderedDict([ ('col1', [1.0, np.nan, np.nan, np.nan]), ('NA', [np.nan, np.nan, np.nan, np.nan]), ('col3', ['null', 'N/A', np.nan, 'NA']), ('col4', np.array([np.nan, np.nan, np.nan, np.nan], dtype=object))]), index=index) md = Metadata(df) self.assertEqual(md.id_count, 4) self.assertEqual(md.column_count, 4) self.assertEqual(md.id_header, 'id') self.assertEqual(md.ids, ('None', 'nan', 'NA', 'foo')) self.assertEqualColumns(md.columns, [('col1', 'numeric'), ('NA', 'numeric'), ('col3', 'categorical'), ('col4', 'categorical')]) def test_missing_data_insdc(self): index = pd.Index(['None', 'nan', 'NA', 'foo'], name='id') df = pd.DataFrame(collections.OrderedDict([ ('col1', [1.0, np.nan, 'missing', np.nan]), # TODO: it is not currently possible to have an ENTIRELY numeric # column from missing terms, as the dtype of the series is object # and there is 
no way to indicate the dtype beyond that. # ('NA', [np.nan, np.nan, 'not applicable', np.nan]), ('col3', ['null', 'N/A', 'not collected', 'NA']), ('col4', np.array([np.nan, np.nan, 'restricted access', np.nan], dtype=object))]), index=index) md = Metadata(df, default_missing_scheme='INSDC:missing') self.assertEqual(md.id_count, 4) self.assertEqual(md.column_count, 3) self.assertEqual(md.id_header, 'id') self.assertEqual(md.ids, ('None', 'nan', 'NA', 'foo')) self.assertEqualColumns(md.columns, [('col1', 'numeric'), ('col3', 'categorical'), ('col4', 'categorical')]) pd.testing.assert_frame_equal(md.to_dataframe(), pd.DataFrame( {'col1': [1.0, np.nan, np.nan, np.nan], 'col3': ['null', 'N/A', np.nan, 'NA'], 'col4': np.array([np.nan, np.nan, np.nan, np.nan], dtype=object)}, index=index)) def test_missing_data_insdc_column_missing(self): index = pd.Index(['None', 'nan', 'NA', 'foo'], name='id') df = pd.DataFrame(collections.OrderedDict([ ('col1', [1.0, np.nan, 'missing', np.nan]), # TODO: it is not currently possible to have an ENTIRELY numeric # column from missing terms, as the dtype of the series is object # and there is no way to indicate the dtype beyond that. # ('NA', [np.nan, np.nan, 'not applicable', np.nan]), ('col3', ['null', 'N/A', 'not collected', 'NA']), ('col4', np.array([np.nan, np.nan, 'restricted access', np.nan], dtype=object))]), index=index) md = Metadata(df, column_missing_schemes={ 'col1': 'INSDC:missing', 'col3': 'INSDC:missing', 'col4': 'INSDC:missing' }) self.assertEqual(md.id_count, 4) self.assertEqual(md.column_count, 3) self.assertEqual(md.id_header, 'id') self.assertEqual(md.ids, ('None', 'nan', 'NA', 'foo')) self.assertEqualColumns(md.columns, [('col1', 'numeric'), ('col3', 'categorical'), ('col4', 'categorical')]) pd.testing.assert_frame_equal(md.to_dataframe(), pd.DataFrame( {'col1': [1.0, np.nan, np.nan, np.nan], 'col3': ['null', 'N/A', np.nan, 'NA'], 'col4': np.array([np.nan, np.nan, np.nan, np.nan], dtype=object)}, index=index)) def test_missing_data_default_override(self): index = pd.Index(['None', 'nan', 'NA', 'foo'], name='id') df = pd.DataFrame(collections.OrderedDict([ ('col1', [1.0, np.nan, 'missing', np.nan]), # TODO: it is not currently possible to have an ENTIRELY numeric # column from missing terms, as the dtype of the series is object # and there is no way to indicate the dtype beyond that.
# ('NA', [np.nan, np.nan, 'not applicable', np.nan]), ('col3', ['null', 'N/A', 'not collected', 'NA']), ('col4', np.array([np.nan, np.nan, 'restricted access', np.nan], dtype=object))]), index=index) md = Metadata(df, column_missing_schemes={ 'col1': 'INSDC:missing', 'col3': 'INSDC:missing', 'col4': 'INSDC:missing' }, default_missing_scheme='no-missing') self.assertEqual(md.id_count, 4) self.assertEqual(md.column_count, 3) self.assertEqual(md.id_header, 'id') self.assertEqual(md.ids, ('None', 'nan', 'NA', 'foo')) self.assertEqualColumns(md.columns, [('col1', 'numeric'), ('col3', 'categorical'), ('col4', 'categorical')]) pd.testing.assert_frame_equal(md.to_dataframe(), pd.DataFrame( {'col1': [1.0, np.nan, np.nan, np.nan], 'col3': ['null', 'N/A', np.nan, 'NA'], 'col4': np.array([np.nan, np.nan, np.nan, np.nan], dtype=object)}, index=index)) def test_does_not_cast_ids_or_column_names(self): index = pd.Index(['0.000001', '0.004000', '0.000000'], dtype=object, name='id') columns = ['42.0', '1000', '-4.2'] data = [ [2.0, 'b', 2.5], [1.0, 'b', 4.2], [3.0, 'c', -9.999] ] df = pd.DataFrame(data, index=index, columns=columns) md = Metadata(df) self.assertEqual(md.id_count, 3) self.assertEqual(md.column_count, 3) self.assertEqual(md.id_header, 'id') self.assertEqual(md.ids, ('0.000001', '0.004000', '0.000000')) self.assertEqualColumns(md.columns, [('42.0', 'numeric'), ('1000', 'categorical'), ('-4.2', 'numeric')]) def test_mixed_column_types(self): md = Metadata( pd.DataFrame({'col0': [1.0, 2.0, 3.0], 'col1': ['a', 'b', 'c'], 'col2': ['foo', 'bar', '42'], 'col3': ['1.0', '2.5', '-4.002'], 'col4': [1, 2, 3], 'col5': [1, 2, 3.5], 'col6': [1e-4, -0.0002, np.nan], 'col7': ['cat', np.nan, 'dog'], 'col8': ['a', 'a', 'a'], 'col9': [0, 0, 0]}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) self.assertEqual(md.id_count, 3) self.assertEqual(md.column_count, 10) self.assertEqual(md.id_header, 'id') self.assertEqual(md.ids, ('id1', 'id2', 'id3')) self.assertEqualColumns(md.columns, [('col0', 'numeric'), ('col1', 'categorical'), ('col2', 'categorical'), ('col3', 'categorical'), ('col4', 'numeric'), ('col5', 'numeric'), ('col6', 'numeric'), ('col7', 'categorical'), ('col8', 'categorical'), ('col9', 'numeric')]) def test_case_insensitive_duplicate_ids(self): index = pd.Index(['a', 'b', 'A'], name='id') df = pd.DataFrame({'column': ['1', '2', '3']}, index=index) metadata = Metadata(df) self.assertEqual(metadata.ids, ('a', 'b', 'A')) def test_case_insensitive_duplicate_column_names(self): index = pd.Index(['a', 'b', 'c'], name='id') df = pd.DataFrame({'column': ['1', '2', '3'], 'Column': ['4', '5', '6']}, index=index) metadata = Metadata(df) self.assertEqual(set(metadata.columns), {'column', 'Column'}) def test_categorical_column_leading_trailing_whitespace_value(self): md1 = Metadata(pd.DataFrame( {'col1': [1, 2, 3], 'col2': ['foo', ' bar ', 'baz']}, index=pd.Index(['a', 'b', 'c'], name='id'))) md2 = Metadata(pd.DataFrame( {'col1': [1, 2, 3], 'col2': ['foo', 'bar', 'baz']}, index=pd.Index(['a', 'b', 'c'], name='id'))) self.assertEqual(md1, md2) def test_leading_trailing_whitespace_id(self): md1 = Metadata(pd.DataFrame( {'col1': [1, 2, 3], 'col2': [4, 5, 6]}, index=pd.Index(['a', ' b ', 'c'], name='id'))) md2 = Metadata(pd.DataFrame( {'col1': [1, 2, 3], 'col2': [4, 5, 6]}, index=pd.Index(['a', 'b', 'c'], name='id'))) self.assertEqual(md1, md2) def test_leading_trailing_whitespace_column_name(self): md1 = Metadata(pd.DataFrame( {'col1': [1, 2, 3], ' col2 ': [4, 5, 6]}, index=pd.Index(['a', 'b', 'c'], name='id'))) 
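# Leading/trailing whitespace in the column name (' col2 ') is expected to
# be stripped during construction, so md1 should compare equal to the md2
# built below from the unpadded name.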
md2 = Metadata(pd.DataFrame( {'col1': [1, 2, 3], 'col2': [4, 5, 6]}, index=pd.Index(['a', 'b', 'c'], name='id'))) self.assertEqual(md1, md2) class TestSourceArtifacts(unittest.TestCase): def setUp(self): self.md = Metadata(pd.DataFrame( {'col': [1, 2, 3]}, index=pd.Index(['a', 'b', 'c'], name='id'))) def test_no_source_artifacts(self): self.assertEqual(self.md.artifacts, ()) def test_add_zero_artifacts(self): self.md._add_artifacts([]) self.assertEqual(self.md.artifacts, ()) def test_add_artifacts(self): # First two artifacts have the same data but different UUIDs. artifact1 = Artifact.import_data('Mapping', {'a': '1', 'b': '3'}) self.md._add_artifacts([artifact1]) artifact2 = Artifact.import_data('Mapping', {'a': '1', 'b': '3'}) artifact3 = Artifact.import_data('IntSequence1', [1, 2, 3, 4]) self.md._add_artifacts([artifact2, artifact3]) self.assertEqual(self.md.artifacts, (artifact1, artifact2, artifact3)) def test_add_non_artifact(self): artifact = Artifact.import_data('Mapping', {'a': '1', 'b': '3'}) with self.assertRaisesRegex(TypeError, "Artifact object.*42"): self.md._add_artifacts([artifact, 42]) # Test that the object hasn't been mutated. self.assertEqual(self.md.artifacts, ()) def test_add_duplicate_artifact(self): artifact1 = Artifact.import_data('Mapping', {'a': '1', 'b': '3'}) artifact2 = Artifact.import_data('IntSequence1', [1, 2, 3, 4]) self.md._add_artifacts([artifact1, artifact2]) with self.assertRaisesRegex( ValueError, "Duplicate source artifacts.*artifact: Mapping"): self.md._add_artifacts([artifact1]) # Test that the object hasn't been mutated. self.assertEqual(self.md.artifacts, (artifact1, artifact2)) class TestRepr(unittest.TestCase): def test_singular(self): md = Metadata(pd.DataFrame({'col1': [42]}, index=pd.Index(['a'], name='id'))) obs = repr(md) self.assertIn('Metadata', obs) self.assertIn('1 ID x 1 column', obs) self.assertIn("col1: ColumnProperties(type='numeric'," " missing_scheme='blank')", obs) def test_plural(self): md = Metadata(pd.DataFrame({'col1': [42, 42], 'col2': ['foo', 'bar']}, index=pd.Index(['a', 'b'], name='id'))) obs = repr(md) self.assertIn('Metadata', obs) self.assertIn('2 IDs x 2 columns', obs) self.assertIn("col1: ColumnProperties(type='numeric'," " missing_scheme='blank')", obs) self.assertIn("col2: ColumnProperties(type='categorical'," " missing_scheme='blank')", obs) def test_column_name_padding(self): data = [[0, 42, 'foo']] index = pd.Index(['my-id'], name='id') columns = ['col1', 'longer-column-name', 'c'] md = Metadata(pd.DataFrame(data, index=index, columns=columns)) obs = repr(md) self.assertIn('Metadata', obs) self.assertIn('1 ID x 3 columns', obs) self.assertIn( "col1: ColumnProperties(type='numeric'," " missing_scheme='blank')", obs) self.assertIn( "longer-column-name: ColumnProperties(type='numeric'," " missing_scheme='blank')", obs) self.assertIn( "c: ColumnProperties(type='categorical'," " missing_scheme='blank')", obs) class TestEqualityOperators(unittest.TestCase, ReallyEqualMixin): def setUp(self): get_dummy_plugin() def test_type_mismatch(self): md = Metadata( pd.DataFrame({'col1': [1.0, 2.0, 3.0], 'col2': ['a', 'b', 'c'], 'col3': ['foo', 'bar', '42']}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) mdc = md.get_column('col1') self.assertIsInstance(md, Metadata) self.assertIsInstance(mdc, NumericMetadataColumn) self.assertReallyNotEqual(md, mdc) def test_id_header_mismatch(self): data = {'col1': ['foo', 'bar'], 'col2': [42, 43]} md1 = Metadata(pd.DataFrame( data, index=pd.Index(['id1', 'id2'], name='id'))) md2 = 
Metadata(pd.DataFrame( data, index=pd.Index(['id1', 'id2'], name='ID'))) self.assertReallyNotEqual(md1, md2) def test_source_mismatch(self): # Metadata created from an artifact vs not shouldn't compare equal, # even if the data is the same. artifact = Artifact.import_data('Mapping', {'a': '1', 'b': '2'}) md_from_artifact = artifact.view(Metadata) md_no_artifact = Metadata(md_from_artifact.to_dataframe()) pd.testing.assert_frame_equal(md_from_artifact.to_dataframe(), md_no_artifact.to_dataframe()) self.assertReallyNotEqual(md_from_artifact, md_no_artifact) def test_artifact_mismatch(self): # Metadata created from different artifacts shouldn't compare equal, # even if the data is the same. artifact1 = Artifact.import_data('Mapping', {'a': '1', 'b': '2'}) artifact2 = Artifact.import_data('Mapping', {'a': '1', 'b': '2'}) md1 = artifact1.view(Metadata) md2 = artifact2.view(Metadata) pd.testing.assert_frame_equal(md1.to_dataframe(), md2.to_dataframe()) self.assertReallyNotEqual(md1, md2) def test_id_mismatch(self): md1 = Metadata(pd.DataFrame({'a': '1', 'b': '2'}, index=pd.Index(['0'], name='id'))) md2 = Metadata(pd.DataFrame({'a': '1', 'b': '2'}, index=pd.Index(['1'], name='id'))) self.assertReallyNotEqual(md1, md2) def test_column_name_mismatch(self): md1 = Metadata(pd.DataFrame({'a': '1', 'b': '2'}, index=pd.Index(['0'], name='id'))) md2 = Metadata(pd.DataFrame({'a': '1', 'c': '2'}, index=pd.Index(['0'], name='id'))) self.assertReallyNotEqual(md1, md2) def test_column_type_mismatch(self): md1 = Metadata(pd.DataFrame({'col1': ['42', '43']}, index=pd.Index(['id1', 'id2'], name='id'))) md2 = Metadata(pd.DataFrame({'col1': [42, 43]}, index=pd.Index(['id1', 'id2'], name='id'))) self.assertReallyNotEqual(md1, md2) def test_column_order_mismatch(self): index = pd.Index(['id1', 'id2'], name='id') md1 = Metadata(pd.DataFrame([[42, 'foo'], [43, 'bar']], index=index, columns=['z', 'a'])) md2 = Metadata(pd.DataFrame([['foo', 42], ['bar', 43]], index=index, columns=['a', 'z'])) self.assertReallyNotEqual(md1, md2) def test_data_mismatch(self): md1 = Metadata(pd.DataFrame({'a': '1', 'b': '3'}, index=pd.Index(['0'], name='id'))) md2 = Metadata(pd.DataFrame({'a': '1', 'b': '2'}, index=pd.Index(['0'], name='id'))) self.assertReallyNotEqual(md1, md2) def test_equality_without_artifact(self): md1 = Metadata(pd.DataFrame({'a': '1', 'b': '3'}, index=pd.Index(['0'], name='id'))) md2 = Metadata(pd.DataFrame({'a': '1', 'b': '3'}, index=pd.Index(['0'], name='id'))) self.assertReallyEqual(md1, md2) def test_equality_with_artifact(self): artifact = Artifact.import_data('Mapping', {'a': '1', 'b': '2'}) md1 = artifact.view(Metadata) md2 = artifact.view(Metadata) self.assertReallyEqual(md1, md2) def test_equality_with_missing_data(self): md1 = Metadata(pd.DataFrame( {'col1': [1, np.nan, 4.2], 'col2': [np.nan, 'foo', np.nan]}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) md2 = Metadata(pd.DataFrame( {'col1': [1, np.nan, 4.2], 'col2': [np.nan, 'foo', np.nan]}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) self.assertReallyEqual(md1, md2) class TestToDataframe(unittest.TestCase): def test_minimal(self): df = pd.DataFrame({}, index=pd.Index(['id1'], name='id')) md = Metadata(df) obs = md.to_dataframe() pd.testing.assert_frame_equal(obs, df) def test_id_header_preserved(self): df = pd.DataFrame({'col1': [42, 2.5], 'col2': ['foo', 'bar']}, index=pd.Index(['id1', 'id2'], name='#SampleID')) md = Metadata(df) obs = md.to_dataframe() pd.testing.assert_frame_equal(obs, df) self.assertEqual(obs.index.name, '#SampleID') def 
test_dataframe_copy(self): df = pd.DataFrame({'col1': [42, 2.5], 'col2': ['foo', 'bar']}, index=pd.Index(['id1', 'id2'], name='id')) md = Metadata(df) obs = md.to_dataframe() pd.testing.assert_frame_equal(obs, df) self.assertIsNot(obs, df) def test_retains_column_order(self): index = pd.Index(['id1', 'id2'], name='id') columns = ['z', 'a', 'ch'] data = [ [1.0, 'a', 'foo'], [2.0, 'b', 'bar'] ] df = pd.DataFrame(data, index=index, columns=columns) md = Metadata(df) obs = md.to_dataframe() pd.testing.assert_frame_equal(obs, df) self.assertEqual(obs.columns.tolist(), ['z', 'a', 'ch']) def test_missing_data(self): # Different missing data representations should be normalized to np.nan index = pd.Index(['None', 'nan', 'NA', 'id1'], name='id') df = pd.DataFrame(collections.OrderedDict([ ('col1', [42.5, np.nan, float('nan'), 3]), ('NA', [np.nan, 'foo', float('nan'), None]), ('col3', ['null', 'N/A', np.nan, 'NA']), ('col4', np.array([np.nan, np.nan, np.nan, np.nan], dtype=object))]), index=index) md = Metadata(df) obs = md.to_dataframe() exp = pd.DataFrame(collections.OrderedDict([ ('col1', [42.5, np.nan, np.nan, 3.0]), ('NA', [np.nan, 'foo', np.nan, np.nan]), ('col3', ['null', 'N/A', np.nan, 'NA']), ('col4', np.array([np.nan, np.nan, np.nan, np.nan], dtype=object))]), index=index) pd.testing.assert_frame_equal(obs, exp) self.assertEqual(obs.dtypes.to_dict(), {'col1': np.float64, 'NA': object, 'col3': object, 'col4': object}) self.assertTrue(np.isnan(obs['col1']['NA'])) self.assertTrue(np.isnan(obs['NA']['NA'])) self.assertTrue(np.isnan(obs['NA']['id1'])) def test_dtype_int_normalized_to_dtype_float(self): index = pd.Index(['id1', 'id2', 'id3'], name='id') df = pd.DataFrame({'col1': [42, -43, 0], 'col2': [42.0, -43.0, 0.0], 'col3': [42, np.nan, 0]}, index=index) self.assertEqual(df.dtypes.to_dict(), {'col1': np.int64, 'col2': np.float64, 'col3': np.float64}) md = Metadata(df) obs = md.to_dataframe() exp = pd.DataFrame({'col1': [42.0, -43.0, 0.0], 'col2': [42.0, -43.0, 0.0], 'col3': [42.0, np.nan, 0.0]}, index=index) pd.testing.assert_frame_equal(obs, exp) self.assertEqual(obs.dtypes.to_dict(), {'col1': np.float64, 'col2': np.float64, 'col3': np.float64}) def test_encode_missing_no_missing(self): df = pd.DataFrame({'col1': [42.0, 50.0], 'col2': ['foo', 'bar']}, index=pd.Index(['id1', 'id2'], name='id')) md = Metadata(df, default_missing_scheme='INSDC:missing') obs = md.to_dataframe(encode_missing=True) pd.testing.assert_frame_equal(obs, df) self.assertIsNot(obs, df) def test_insdc_missing_encode_missing_true(self): df = pd.DataFrame({'col1': [42, 'missing'], 'col2': ['foo', 'not applicable']}, index=pd.Index(['id1', 'id2'], name='id')) md = Metadata(df, default_missing_scheme='INSDC:missing') obs = md.to_dataframe(encode_missing=True) pd.testing.assert_frame_equal(obs, df) self.assertIsNot(obs, df) def test_insdc_missing_encode_missing_false(self): df = pd.DataFrame({'col1': [42, 'missing'], 'col2': ['foo', 'not applicable']}, index=pd.Index(['id1', 'id2'], name='id')) md = Metadata(df, default_missing_scheme='INSDC:missing') obs = md.to_dataframe() exp = pd.DataFrame({'col1': [42, np.nan], 'col2': ['foo', np.nan]}, index=pd.Index(['id1', 'id2'], name='id')) pd.testing.assert_frame_equal(obs, exp) self.assertIsNot(obs, df) class TestGetColumn(unittest.TestCase): def setUp(self): get_dummy_plugin() def test_column_name_not_found(self): df = pd.DataFrame({'col1': [42, 2.5], 'col2': ['foo', 'bar']}, index=pd.Index(['id1', 'id2'], name='id')) md = Metadata(df) with self.assertRaisesRegex(ValueError, 
"'col3'.*not a column.*'col1', 'col2'"): md.get_column('col3') def test_artifacts_are_propagated(self): A = Artifact.import_data('Mapping', {'a': '1', 'b': '3'}) md = A.view(Metadata) obs = md.get_column('b') exp = CategoricalMetadataColumn( pd.Series(['3'], name='b', index=pd.Index(['0'], name='id'))) exp._add_artifacts([A]) self.assertEqual(obs, exp) self.assertEqual(obs.artifacts, (A,)) def test_categorical_column(self): df = pd.DataFrame({'col1': [42, 2.5], 'col2': ['foo', 'bar']}, index=pd.Index(['id1', 'id2'], name='id')) md = Metadata(df) obs = md.get_column('col2') exp = CategoricalMetadataColumn( pd.Series(['foo', 'bar'], name='col2', index=pd.Index(['id1', 'id2'], name='id'))) self.assertEqual(obs, exp) def test_numeric_column(self): df = pd.DataFrame({'col1': [42, 2.5], 'col2': ['foo', 'bar']}, index=pd.Index(['id1', 'id2'], name='id')) md = Metadata(df) obs = md.get_column('col1') exp = NumericMetadataColumn( pd.Series([42, 2.5], name='col1', index=pd.Index(['id1', 'id2'], name='id'))) self.assertEqual(obs, exp) def test_id_header_preserved(self): df = pd.DataFrame({'col1': [42, 2.5], 'col2': ['foo', 'bar']}, index=pd.Index(['a', 'b'], name='#OTU ID')) md = Metadata(df) obs = md.get_column('col1') exp = NumericMetadataColumn( pd.Series([42, 2.5], name='col1', index=pd.Index(['a', 'b'], name='#OTU ID'))) self.assertEqual(obs, exp) self.assertEqual(obs.id_header, '#OTU ID') class TestGetIDs(unittest.TestCase): def test_default(self): df = pd.DataFrame({'Subject': ['subject-1', 'subject-1', 'subject-2'], 'SampleType': ['gut', 'tongue', 'gut']}, index=pd.Index(['S1', 'S2', 'S3'], name='id')) metadata = Metadata(df) actual = metadata.get_ids() expected = {'S1', 'S2', 'S3'} self.assertEqual(actual, expected) def test_incomplete_where(self): df = pd.DataFrame({'Subject': ['subject-1', 'subject-1', 'subject-2'], 'SampleType': ['gut', 'tongue', 'gut']}, index=pd.Index(['S1', 'S2', 'S3'], name='sampleid')) metadata = Metadata(df) where = "Subject='subject-1' AND SampleType=" with self.assertRaises(ValueError): metadata.get_ids(where) where = "Subject=" with self.assertRaises(ValueError): metadata.get_ids(where) def test_invalid_where(self): df = pd.DataFrame({'Subject': ['subject-1', 'subject-1', 'subject-2'], 'SampleType': ['gut', 'tongue', 'gut']}, index=pd.Index(['S1', 'S2', 'S3'], name='sampleid')) metadata = Metadata(df) where = "not-a-column-name='subject-1'" with self.assertRaises(ValueError): metadata.get_ids(where) def test_empty_result(self): df = pd.DataFrame({'Subject': ['subject-1', 'subject-1', 'subject-2'], 'SampleType': ['gut', 'tongue', 'gut']}, index=pd.Index(['S1', 'S2', 'S3'], name='id')) metadata = Metadata(df) where = "Subject='subject-3'" actual = metadata.get_ids(where) expected = set() self.assertEqual(actual, expected) def test_simple_expression(self): df = pd.DataFrame({'Subject': ['subject-1', 'subject-1', 'subject-2'], 'SampleType': ['gut', 'tongue', 'gut']}, index=pd.Index(['S1', 'S2', 'S3'], name='id')) metadata = Metadata(df) where = "Subject='subject-1'" actual = metadata.get_ids(where) expected = {'S1', 'S2'} self.assertEqual(actual, expected) where = "Subject='subject-2'" actual = metadata.get_ids(where) expected = {'S3'} self.assertEqual(actual, expected) where = "Subject='subject-3'" actual = metadata.get_ids(where) expected = set() self.assertEqual(actual, expected) where = "SampleType='gut'" actual = metadata.get_ids(where) expected = {'S1', 'S3'} self.assertEqual(actual, expected) where = "SampleType='tongue'" actual = metadata.get_ids(where) 
expected = {'S2'} self.assertEqual(actual, expected) def test_more_complex_expressions(self): df = pd.DataFrame({'Subject': ['subject-1', 'subject-1', 'subject-2'], 'SampleType': ['gut', 'tongue', 'gut']}, index=pd.Index(['S1', 'S2', 'S3'], name='id')) metadata = Metadata(df) where = "Subject='subject-1' OR Subject='subject-2'" actual = metadata.get_ids(where) expected = {'S1', 'S2', 'S3'} self.assertEqual(actual, expected) where = "Subject='subject-1' AND Subject='subject-2'" actual = metadata.get_ids(where) expected = set() self.assertEqual(actual, expected) where = "Subject='subject-1' AND SampleType='gut'" actual = metadata.get_ids(where) expected = {'S1'} self.assertEqual(actual, expected) def test_query_by_id(self): df = pd.DataFrame({'Subject': ['subject-1', 'subject-1', 'subject-2'], 'SampleType': ['gut', 'tongue', 'gut']}, index=pd.Index(['S1', 'S2', 'S3'], name='id')) metadata = Metadata(df) actual = metadata.get_ids(where="id='S2' OR id='S1'") expected = {'S1', 'S2'} self.assertEqual(actual, expected) def test_query_by_alternate_id_header(self): metadata = Metadata(pd.DataFrame( {}, index=pd.Index(['id1', 'id2', 'id3'], name='#OTU ID'))) obs = metadata.get_ids(where="\"#OTU ID\" IN ('id2', 'id3')") exp = {'id2', 'id3'} self.assertEqual(obs, exp) def test_no_columns(self): metadata = Metadata( pd.DataFrame({}, index=pd.Index(['a', 'b', 'my-id'], name='id'))) obs = metadata.get_ids() exp = {'a', 'b', 'my-id'} self.assertEqual(obs, exp) def test_query_mixed_column_types(self): df = pd.DataFrame({'Name': ['Foo', 'Bar', 'Baz', 'Baaz'], # numbers that would sort incorrectly as strings 'Age': [9, 10, 11, 101], 'Age_Str': ['9', '10', '11', '101'], 'Weight': [80.5, 85.3, np.nan, 120.0]}, index=pd.Index(['S1', 'S2', 'S3', 'S4'], name='id')) metadata = Metadata(df) # string pattern matching obs = metadata.get_ids(where="Name LIKE 'Ba_'") exp = {'S2', 'S3'} self.assertEqual(obs, exp) # string comparison obs = metadata.get_ids(where="Age_Str >= 11") exp = {'S1', 'S3'} self.assertEqual(obs, exp) # numeric comparison obs = metadata.get_ids(where="Age >= 11") exp = {'S3', 'S4'} self.assertEqual(obs, exp) # numeric comparison with missing data obs = metadata.get_ids(where="Weight < 100") exp = {'S1', 'S2'} self.assertEqual(obs, exp) def test_column_with_space_in_name(self): df = pd.DataFrame({'Subject': ['subject-1', 'subject-1', 'subject-2'], 'Sample Type': ['gut', 'tongue', 'gut']}, index=pd.Index(['S1', 'S2', 'S3'], name='id')) metadata = Metadata(df) with warnings.catch_warnings(record=True) as w: warnings.simplefilter('always') metadata.get_ids() # The list of captured warnings should be empty self.assertFalse(w) class TestMerge(unittest.TestCase): def setUp(self): get_dummy_plugin() def test_merging_nothing(self): md = Metadata(pd.DataFrame( {'a': [1, 2, 3], 'b': [4, 5, 6]}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) with self.assertRaisesRegex(ValueError, 'At least one Metadata.*nothing to merge'): md.merge() def test_merging_two(self): md1 = Metadata(pd.DataFrame( {'a': [1, 2, 3], 'b': [4, 5, 6]}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) md2 = Metadata(pd.DataFrame( {'c': [7, 8, 9], 'd': [10, 11, 12]}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) obs = md1.merge(md2) exp = Metadata(pd.DataFrame( {'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9], 'd': [10, 11, 12]}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) self.assertEqual(obs, exp) def test_merging_three(self): md1 = Metadata(pd.DataFrame( {'a': [1, 2, 3], 'b': [4, 5, 6]}, index=pd.Index(['id1', 'id2', 
'id3'], name='id'))) md2 = Metadata(pd.DataFrame( {'c': [7, 8, 9], 'd': [10, 11, 12]}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) md3 = Metadata(pd.DataFrame( {'e': [13, 14, 15], 'f': [16, 17, 18]}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) obs = md1.merge(md2, md3) exp = Metadata(pd.DataFrame( {'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9], 'd': [10, 11, 12], 'e': [13, 14, 15], 'f': [16, 17, 18]}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) self.assertEqual(obs, exp) def test_merging_unaligned_indices(self): md1 = Metadata(pd.DataFrame( {'a': [1, 2, 3], 'b': [4, 5, 6]}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) md2 = Metadata(pd.DataFrame( {'c': [9, 8, 7], 'd': [12, 11, 10]}, index=pd.Index(['id3', 'id2', 'id1'], name='id'))) md3 = Metadata(pd.DataFrame( {'e': [13, 15, 14], 'f': [16, 18, 17]}, index=pd.Index(['id1', 'id3', 'id2'], name='id'))) obs = md1.merge(md2, md3) exp = Metadata(pd.DataFrame( {'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9], 'd': [10, 11, 12], 'e': [13, 14, 15], 'f': [16, 17, 18]}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) self.assertEqual(obs, exp) def test_inner_join(self): md1 = Metadata(pd.DataFrame( {'a': [1, 2, 3], 'b': [4, 5, 6]}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) md2 = Metadata(pd.DataFrame( {'c': [7, 8, 9], 'd': [10, 11, 12]}, index=pd.Index(['id2', 'X', 'Y'], name='id'))) md3 = Metadata(pd.DataFrame( {'e': [13, 14, 15], 'f': [16, 17, 18]}, index=pd.Index(['X', 'id3', 'id2'], name='id'))) # Single shared ID. obs = md1.merge(md2, md3) exp = Metadata(pd.DataFrame( {'a': [2], 'b': [5], 'c': [7], 'd': [10], 'e': [15], 'f': [18]}, index=pd.Index(['id2'], name='id'))) self.assertEqual(obs, exp) # Multiple shared IDs. obs = md1.merge(md3) exp = Metadata(pd.DataFrame( {'a': [2, 3], 'b': [5, 6], 'e': [15, 14], 'f': [18, 17]}, index=pd.Index(['id2', 'id3'], name='id'))) self.assertEqual(obs, exp) def test_index_and_column_merge_order(self): md1 = Metadata(pd.DataFrame( [[1], [2], [3], [4]], index=pd.Index(['id1', 'id2', 'id3', 'id4'], name='id'), columns=['a'])) md2 = Metadata(pd.DataFrame( [[5], [6], [7]], index=pd.Index(['id4', 'id3', 'id1'], name='id'), columns=['b'])) md3 = Metadata(pd.DataFrame( [[8], [9], [10]], index=pd.Index(['id1', 'id4', 'id3'], name='id'), columns=['c'])) obs = md1.merge(md2, md3) exp = Metadata(pd.DataFrame( [[1, 7, 8], [3, 6, 10], [4, 5, 9]], index=pd.Index(['id1', 'id3', 'id4'], name='id'), columns=['a', 'b', 'c'])) self.assertEqual(obs, exp) # Merging in different order produces different ID/column order. 
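# (merge() keeps only the shared IDs and is expected to order both the IDs
# and the columns according to the object it was called on, followed by the
# arguments in order -- hence the reversed expectation below.)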
obs = md2.merge(md1, md3) exp = Metadata(pd.DataFrame( [[5, 4, 9], [6, 3, 10], [7, 1, 8]], index=pd.Index(['id4', 'id3', 'id1'], name='id'), columns=['b', 'a', 'c'])) self.assertEqual(obs, exp) def test_id_column_only(self): md1 = Metadata(pd.DataFrame({}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) md2 = Metadata(pd.DataFrame({}, index=pd.Index(['id2', 'X', 'id1'], name='id'))) md3 = Metadata(pd.DataFrame({}, index=pd.Index(['id1', 'id3', 'id2'], name='id'))) obs = md1.merge(md2, md3) exp = Metadata( pd.DataFrame({}, index=pd.Index(['id1', 'id2'], name='id'))) self.assertEqual(obs, exp) def test_merged_id_column_name(self): md1 = Metadata(pd.DataFrame( {'a': [1, 2]}, index=pd.Index(['id1', 'id2'], name='sample ID'))) md2 = Metadata(pd.DataFrame( {'b': [3, 4]}, index=pd.Index(['id1', 'id2'], name='feature ID'))) obs = md1.merge(md2) exp = Metadata(pd.DataFrame( {'a': [1, 2], 'b': [3, 4]}, index=pd.Index(['id1', 'id2'], name='id'))) self.assertEqual(obs, exp) def test_merging_preserves_column_types(self): # Test that column types remain the same even if a categorical column # *could* be reinterpreted as numeric after the merge. md1 = Metadata(pd.DataFrame( {'a': [1, 2, 3], 'b': [np.nan, np.nan, np.nan]}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) md2 = Metadata(pd.DataFrame( {'c': ['1', 'foo', '3'], 'd': np.array([np.nan, np.nan, np.nan], dtype=object)}, index=pd.Index(['id1', 'id4', 'id3'], name='id'))) obs = md1.merge(md2) exp = Metadata(pd.DataFrame( {'a': [1, 3], 'b': [np.nan, np.nan], 'c': ['1', '3'], 'd': np.array([np.nan, np.nan], dtype=object)}, index=pd.Index(['id1', 'id3'], name='id'))) self.assertEqual(obs, exp) self.assertEqual(obs.columns['a'].type, 'numeric') self.assertEqual(obs.columns['b'].type, 'numeric') self.assertEqual(obs.columns['c'].type, 'categorical') self.assertEqual(obs.columns['d'].type, 'categorical') def test_no_artifacts(self): md1 = Metadata(pd.DataFrame( {'a': [1, 2]}, index=pd.Index(['id1', 'id2'], name='id'))) md2 = Metadata(pd.DataFrame( {'b': [3, 4]}, index=pd.Index(['id1', 'id2'], name='id'))) metadata = md1.merge(md2) self.assertEqual(metadata.artifacts, ()) def test_with_artifacts(self): artifact1 = Artifact.import_data('Mapping', {'a': '1', 'b': '2'}) artifact2 = Artifact.import_data('Mapping', {'d': '4'}) md_from_artifact1 = artifact1.view(Metadata) md_from_artifact2 = artifact2.view(Metadata) md_no_artifact = Metadata(pd.DataFrame( {'c': ['3', '42']}, index=pd.Index(['0', '1'], name='id'))) # Merge three metadata objects -- the first has an artifact, the second # does not, and the third has an artifact. 
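# The merged result is also expected to report the source artifacts of all
# inputs, in the order they were merged (see the artifacts assertion below).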
obs_md = md_from_artifact1.merge(md_no_artifact, md_from_artifact2) exp_df = pd.DataFrame( {'a': '1', 'b': '2', 'c': '3', 'd': '4'}, index=pd.Index(['0'], name='id')) exp_md = Metadata(exp_df) exp_md._add_artifacts((artifact1, artifact2)) self.assertEqual(obs_md, exp_md) self.assertEqual(obs_md.artifacts, (artifact1, artifact2)) def test_disjoint_indices(self): md1 = Metadata(pd.DataFrame( {'a': [1, 2, 3], 'b': [4, 5, 6]}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) md2 = Metadata(pd.DataFrame( {'c': [7, 8, 9], 'd': [10, 11, 12]}, index=pd.Index(['X', 'Y', 'Z'], name='id'))) with self.assertRaisesRegex(ValueError, 'no IDs shared'): md1.merge(md2) def test_duplicate_columns(self): md1 = Metadata(pd.DataFrame( {'a': [1, 2], 'b': [3, 4]}, index=pd.Index(['id1', 'id2'], name='id'))) md2 = Metadata(pd.DataFrame( {'c': [5, 6], 'b': [7, 8]}, index=pd.Index(['id1', 'id2'], name='id'))) with self.assertRaisesRegex(ValueError, "columns overlap: 'b'"): md1.merge(md2) def test_duplicate_columns_self_merge(self): md = Metadata(pd.DataFrame( {'a': [1, 2], 'b': [3, 4]}, index=pd.Index(['id1', 'id2'], name='id'))) with self.assertRaisesRegex(ValueError, "columns overlap: 'a', 'b'"): md.merge(md) class TestFilterIDs(unittest.TestCase): def setUp(self): get_dummy_plugin() def test_supports_iterable(self): md = Metadata(pd.DataFrame( {'col1': [1, 2, 3], 'col2': ['foo', 'bar', 'baz']}, index=pd.Index(['a', 'b', 'c'], name='id'))) obs = md.filter_ids(iter({'a', 'c'})) exp = Metadata(pd.DataFrame( {'col1': [1, 3], 'col2': ['foo', 'baz']}, index=pd.Index(['a', 'c'], name='id'))) self.assertEqual(obs, exp) def test_keep_all(self): md = Metadata(pd.DataFrame( {'col1': [1, 2, 3], 'col2': ['foo', 'bar', 'baz']}, index=pd.Index(['a', 'b', 'c'], name='id'))) obs = md.filter_ids({'a', 'b', 'c'}) self.assertEqual(obs, md) self.assertIsNot(obs, md) def test_keep_multiple(self): md = Metadata(pd.DataFrame( {'col1': [1, 2, 3], 'col2': ['foo', 'bar', 'baz']}, index=pd.Index(['a', 'b', 'c'], name='id'))) obs = md.filter_ids({'a', 'c'}) exp = Metadata(pd.DataFrame( {'col1': [1, 3], 'col2': ['foo', 'baz']}, index=pd.Index(['a', 'c'], name='id'))) self.assertEqual(obs, exp) def test_keep_one(self): md = Metadata(pd.DataFrame( {'col1': [1, 2, 3], 'col2': ['foo', 'bar', 'baz']}, index=pd.Index(['a', 'b', 'c'], name='id'))) obs = md.filter_ids({'b'}) exp = Metadata(pd.DataFrame( {'col1': [2], 'col2': ['bar']}, index=pd.Index(['b'], name='id'))) self.assertEqual(obs, exp) def test_filtering_preserves_column_types(self): # Test that column types remain the same even if a categorical column # *could* be reinterpreted as numeric after the filter. 
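# For example, column 'c' below should remain categorical even though the
# only values left after filtering ('1' and '3') look numeric.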
md = Metadata(pd.DataFrame( {'a': [1, 2, 3], 'b': [np.nan, np.nan, np.nan], 'c': ['1', 'foo', '3'], 'd': np.array([np.nan, np.nan, np.nan], dtype=object)}, index=pd.Index(['id1', 'id2', 'id3'], name='id'))) obs = md.filter_ids({'id1', 'id3'}) exp = Metadata(pd.DataFrame( {'a': [1, 3], 'b': [np.nan, np.nan], 'c': ['1', '3'], 'd': np.array([np.nan, np.nan], dtype=object)}, index=pd.Index(['id1', 'id3'], name='id'))) self.assertEqual(obs, exp) self.assertEqual(obs.columns['a'].type, 'numeric') self.assertEqual(obs.columns['b'].type, 'numeric') self.assertEqual(obs.columns['c'].type, 'categorical') self.assertEqual(obs.columns['d'].type, 'categorical') def test_alternate_id_header(self): md = Metadata(pd.DataFrame( {'col1': [1, 2, 3, 4], 'col2': ['foo', 'bar', 'baz', 'bazz']}, index=pd.Index(['a', 'b', 'c', 'd'], name='#Sample ID'))) obs = md.filter_ids({'b', 'd'}) exp = Metadata(pd.DataFrame( {'col1': [2, 4], 'col2': ['bar', 'bazz']}, index=pd.Index(['b', 'd'], name='#Sample ID'))) self.assertEqual(obs, exp) def test_retains_column_order(self): data = [[1, 'foo', 'cat'], [2, 'bar', 'dog'], [3, 'baz', 'bat']] md = Metadata(pd.DataFrame( data, columns=['z', 'a', 'ch'], index=pd.Index(['a', 'b', 'c'], name='id'))) obs = md.filter_ids({'b', 'c'}) exp_data = [[2, 'bar', 'dog'], [3, 'baz', 'bat']] exp = Metadata(pd.DataFrame( exp_data, columns=['z', 'a', 'ch'], index=pd.Index(['b', 'c'], name='id'))) self.assertEqual(obs, exp) def test_no_artifacts(self): md = Metadata(pd.DataFrame( {'col1': [1, 2, 3], 'col2': ['foo', 'bar', 'baz']}, index=pd.Index(['a', 'b', 'c'], name='id'))) self.assertEqual(md.artifacts, ()) filtered = md.filter_ids({'b'}) self.assertEqual(filtered.artifacts, ()) def test_with_artifacts(self): artifact1 = Artifact.import_data('Mapping', {'a': '1', 'b': '2'}) artifact2 = Artifact.import_data('Mapping', {'d': '4'}) md = Metadata(pd.DataFrame( {'col1': [1, 2, 3], 'col2': ['foo', 'bar', 'baz']}, index=pd.Index(['a', 'b', 'c'], name='id'))) md._add_artifacts([artifact1, artifact2]) obs = md.filter_ids({'a', 'c'}) exp = Metadata(pd.DataFrame( {'col1': [1, 3], 'col2': ['foo', 'baz']}, index=pd.Index(['a', 'c'], name='id'))) exp._add_artifacts([artifact1, artifact2]) self.assertEqual(obs, exp) self.assertEqual(obs.artifacts, (artifact1, artifact2)) def test_empty_ids_to_keep(self): md = Metadata(pd.DataFrame( {'col1': [1, 2, 3], 'col2': ['foo', 'bar', 'baz']}, index=pd.Index(['a', 'b', 'c'], name='id'))) with self.assertRaisesRegex(ValueError, 'ids_to_keep.*at least one ID'): md.filter_ids({}) def test_duplicate_ids_to_keep(self): md = Metadata(pd.DataFrame( {'col1': [1, 2, 3], 'col2': ['foo', 'bar', 'baz']}, index=pd.Index(['a', 'b', 'c'], name='id'))) with self.assertRaisesRegex(ValueError, "ids_to_keep.*unique IDs.*'b'"): md.filter_ids(['b', 'c', 'b']) def test_missing_ids_to_keep(self): md = Metadata(pd.DataFrame( {'col1': [1, 2, 3], 'col2': ['foo', 'bar', 'baz']}, index=pd.Index(['a', 'b', 'c'], name='id'))) with self.assertRaisesRegex(ValueError, "IDs.*not present.*'d', 'id1'"): md.filter_ids({'b', 'id1', 'c', 'd'}) class TestFilterColumns(unittest.TestCase): def setUp(self): get_dummy_plugin() # This object can be reused in many of the tests because its columns # match various filtering criteria, allowing test cases to test # individual parameters or combinations of parameters. 
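# Naming convention for the columns built below: 'uniq-*' columns have no
# repeated non-missing values, 'zvar-*' columns have a single distinct
# non-missing value (zero variance), and 'empty-*' columns are entirely
# missing; each comes in a categorical and a numeric flavor.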
self.metadata = Metadata(pd.DataFrame( {'cat': ['foo', 'bar', np.nan, 'foo'], 'num': [42, np.nan, -5.5, 42], 'uniq-cat': ['foo', np.nan, 'bar', np.nan], 'uniq-num': [np.nan, 9.9, np.nan, 42], 'zvar-cat': ['foo', np.nan, 'foo', 'foo'], 'zvar-num': [9.9, 9.9, np.nan, 9.9], 'empty-cat': np.array([np.nan, np.nan, np.nan, np.nan], dtype=object), 'empty-num': [np.nan, np.nan, np.nan, np.nan]}, index=pd.Index(['a', 'b', 'c', 'd'], name='id'))) # Basic sanity check to ensure column types are what we expect them to # be. obs = {n: p.type for n, p in self.metadata.columns.items()} exp = {'cat': 'categorical', 'num': 'numeric', 'uniq-cat': 'categorical', 'uniq-num': 'numeric', 'zvar-cat': 'categorical', 'zvar-num': 'numeric', 'empty-cat': 'categorical', 'empty-num': 'numeric'} self.assertEqual(obs, exp) def test_unknown_column_type(self): with self.assertRaisesRegex( ValueError, "Unknown column type 'foo'.*categorical, numeric"): self.metadata.filter_columns(column_type='foo') def test_no_filters(self): obs = self.metadata.filter_columns() self.assertEqual(obs, self.metadata) self.assertIsNot(obs, self.metadata) def test_all_filters_no_columns(self): md = Metadata(pd.DataFrame( {}, index=pd.Index(['a', 'b', 'c'], name='id'))) obs = md.filter_columns( column_type='categorical', drop_all_unique=True, drop_zero_variance=True, drop_all_missing=True) self.assertEqual(obs, md) self.assertIsNot(obs, md) obs = md.filter_columns( column_type='numeric', drop_all_unique=True, drop_zero_variance=True, drop_all_missing=True) self.assertEqual(obs, md) self.assertIsNot(obs, md) def test_all_filters(self): obs = self.metadata.filter_columns( column_type='categorical', drop_all_unique=True, drop_zero_variance=True, drop_all_missing=True) self.assertEqual(set(obs.columns), {'cat'}) obs = self.metadata.filter_columns( column_type='numeric', drop_all_unique=True, drop_zero_variance=True, drop_all_missing=True) self.assertEqual(set(obs.columns), {'num'}) def test_all_columns_filtered(self): categorical = self.metadata.filter_columns(column_type='categorical') obs = categorical.filter_columns(column_type='numeric') exp = Metadata(pd.DataFrame( {}, index=pd.Index(['a', 'b', 'c', 'd'], name='id'))) self.assertEqual(obs, exp) def test_filter_to_categorical(self): obs = self.metadata.filter_columns(column_type='categorical') self.assertEqual(set(obs.columns), {'cat', 'uniq-cat', 'zvar-cat', 'empty-cat'}) def test_filter_to_numeric(self): obs = self.metadata.filter_columns(column_type='numeric') self.assertEqual(set(obs.columns), {'num', 'uniq-num', 'zvar-num', 'empty-num'}) def test_drop_all_unique(self): obs = self.metadata.filter_columns(drop_all_unique=True) self.assertEqual(set(obs.columns), {'cat', 'num', 'zvar-cat', 'zvar-num'}) def test_drop_zero_variance(self): obs = self.metadata.filter_columns(drop_zero_variance=True) self.assertEqual(set(obs.columns), {'cat', 'num', 'uniq-cat', 'uniq-num'}) def test_drop_all_missing(self): obs = self.metadata.filter_columns(drop_all_missing=True) self.assertEqual( set(obs.columns), {'cat', 'num', 'uniq-cat', 'uniq-num', 'zvar-cat', 'zvar-num'}) def test_drop_all_unique_with_single_id(self): md = Metadata(pd.DataFrame( {'cat': ['foo'], 'num': [-4.2], 'empty-cat': np.array([np.nan], dtype=object), 'empty-num': [np.nan]}, index=pd.Index(['id1'], name='id'))) obs = md.filter_columns(drop_all_unique=True) exp = Metadata(pd.DataFrame({}, index=pd.Index(['id1'], name='id'))) self.assertEqual(obs, exp) def test_drop_zero_variance_with_single_id(self): md = Metadata(pd.DataFrame( {'cat': 
['foo'], 'num': [-4.2], 'empty-cat': np.array([np.nan], dtype=object), 'empty-num': [np.nan]}, index=pd.Index(['id1'], name='id'))) obs = md.filter_columns(drop_zero_variance=True) exp = Metadata(pd.DataFrame({}, index=pd.Index(['id1'], name='id'))) self.assertEqual(obs, exp) def test_retains_column_order(self): data = [[42, 'foo', 2.5], [42, 'bar', 0.5], [11, 'foo', 0.0]] md = Metadata(pd.DataFrame( data, columns=['z', 'a', 'ch'], index=pd.Index(['id1', 'id2', 'id3'], name='id'))) obs = md.filter_columns(column_type='numeric') exp_data = [[42, 2.5], [42, 0.5], [11, 0.0]] exp = Metadata(pd.DataFrame( exp_data, columns=['z', 'ch'], index=pd.Index(['id1', 'id2', 'id3'], name='id'))) self.assertEqual(obs, exp) def test_alternate_id_header(self): md = Metadata(pd.DataFrame( {'col1': ['foo', 'bar'], 'col2': [-4.2, -4.2], 'col3': ['bar', 'baz']}, index=pd.Index(['id1', 'id2'], name='feature-id'))) obs = md.filter_columns(drop_zero_variance=True) exp = Metadata(pd.DataFrame( {'col1': ['foo', 'bar'], 'col3': ['bar', 'baz']}, index=pd.Index(['id1', 'id2'], name='feature-id'))) self.assertEqual(obs, exp) def test_no_artifacts(self): md = Metadata(pd.DataFrame( {'col1': [1, 2, 3], 'col2': ['foo', 'bar', 'baz']}, index=pd.Index(['a', 'b', 'c'], name='id'))) self.assertEqual(md.artifacts, ()) filtered = md.filter_columns(column_type='categorical') self.assertEqual(filtered.artifacts, ()) def test_with_artifacts(self): artifact1 = Artifact.import_data('Mapping', {'a': '1', 'b': '2'}) artifact2 = Artifact.import_data('Mapping', {'d': '4'}) md = Metadata(pd.DataFrame( {'col1': [1, 2, 3], 'col2': ['foo', 'bar', 'baz']}, index=pd.Index(['a', 'b', 'c'], name='id'))) md._add_artifacts([artifact1, artifact2]) obs = md.filter_columns(column_type='categorical') exp = Metadata(pd.DataFrame( {'col2': ['foo', 'bar', 'baz']}, index=pd.Index(['a', 'b', 'c'], name='id'))) exp._add_artifacts([artifact1, artifact2]) self.assertEqual(obs, exp) self.assertEqual(obs.artifacts, (artifact1, artifact2)) if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/metadata/tests/test_metadata_column.py000066400000000000000000001260471462552636000240330ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import os.path import tempfile import unittest import pandas as pd import numpy as np from qiime2 import Artifact from qiime2.metadata import (MetadataColumn, CategoricalMetadataColumn, NumericMetadataColumn) from qiime2.core.testing.util import get_dummy_plugin, ReallyEqualMixin # Dummy class for testing MetadataColumn ABC class DummyMetadataColumn(MetadataColumn): type = 'dummy' @classmethod def _is_supported_dtype(cls, dtype): return dtype == 'float' or dtype == 'int' or dtype == 'int64' @classmethod def _normalize_(cls, series): return series.astype(float, copy=True, errors='raise') class TestInvalidMetadataColumnConstruction(unittest.TestCase): def test_non_series(self): with self.assertRaisesRegex( TypeError, 'DummyMetadataColumn constructor.*Series.*not.*' 'DataFrame'): DummyMetadataColumn(pd.DataFrame( {'col1': [1, 2, 3]}, index=pd.Index(['a', 'b', 'c'], name='id'))) def test_no_ids(self): with self.assertRaisesRegex(ValueError, 'DummyMetadataColumn.*at least one ID'): DummyMetadataColumn(pd.Series([], name='col', index=pd.Index([], name='id'), dtype=object)) def test_invalid_id_header(self): # default index name with self.assertRaisesRegex(ValueError, r'Index\.name.*None'): DummyMetadataColumn(pd.Series([1, 2, 3], name='col', index=pd.Index(['a', 'b', 'c'], dtype=object))) with self.assertRaisesRegex(ValueError, r'Index\.name.*my-id-header'): DummyMetadataColumn(pd.Series( [1, 2, 3], name='col', index=pd.Index(['a', 'b', 'c'], name='my-id-header'))) def test_non_str_id(self): with self.assertRaisesRegex( TypeError, 'non-string metadata ID.*type.*float.*nan'): DummyMetadataColumn(pd.Series( [1, 2, 3], name='col', index=pd.Index(['a', np.nan, 'c'], name='id'))) def test_non_str_column_name(self): # default series name with self.assertRaisesRegex( TypeError, 'non-string metadata column name.*type.*' 'NoneType.*None'): DummyMetadataColumn(pd.Series( [1, 2, 3], index=pd.Index(['a', 'b', 'c'], name='id'))) with self.assertRaisesRegex( TypeError, 'non-string metadata column name.*type.*' 'float.*nan'): DummyMetadataColumn(pd.Series( [1, 2, 3], name=np.nan, index=pd.Index(['a', 'b', 'c'], name='id'))) def test_empty_id(self): with self.assertRaisesRegex( ValueError, 'empty metadata ID.*at least one character'): DummyMetadataColumn(pd.Series( [1, 2, 3], name='col', index=pd.Index(['a', '', 'c'], name='id'))) def test_empty_column_name(self): with self.assertRaisesRegex( ValueError, 'empty metadata column name.*' 'at least one character'): DummyMetadataColumn(pd.Series( [1, 2, 3], name='', index=pd.Index(['a', 'b', 'c'], name='id'))) def test_pound_sign_id(self): with self.assertRaisesRegex( ValueError, "metadata ID.*begins with a pound sign.*'#b'"): DummyMetadataColumn(pd.Series( [1, 2, 3], name='col', index=pd.Index(['a', '#b', 'c'], name='id'))) def test_id_conflicts_with_id_header(self): with self.assertRaisesRegex( ValueError, "metadata ID 'sample-id'.*conflicts.*reserved.*" "ID header"): DummyMetadataColumn(pd.Series( [1, 2, 3], name='col', index=pd.Index(['a', 'sample-id', 'c'], name='id'))) def test_column_name_conflicts_with_id_header(self): with self.assertRaisesRegex( ValueError, "metadata column name 'featureid'.*conflicts.*" "reserved.*ID header"): DummyMetadataColumn(pd.Series( [1, 2, 3], name='featureid', index=pd.Index(['a', 'b', 'c'], name='id'))) def test_duplicate_ids(self): with self.assertRaisesRegex(ValueError, "Metadata IDs.*unique.*'a'"): DummyMetadataColumn(pd.Series( [1, 2, 3], 
name='col', index=pd.Index(['a', 'b', 'a'], name='id'))) def test_unsupported_column_dtype(self): with self.assertRaisesRegex( TypeError, "DummyMetadataColumn 'col1' does not support.*" "Series.*dtype.*bool"): DummyMetadataColumn(pd.Series( [True, False, True], name='col1', index=pd.Index(['a', 'b', 'c'], name='id'))) def test_unknown_missing_scheme(self): with self.assertRaisesRegex(ValueError, "BAD:SCHEME"): DummyMetadataColumn(pd.Series( [1, 2, 3], name='col1', index=pd.Index(['a', 'b', 'c'], name='id')), missing_scheme='BAD:SCHEME') def test_missing_q2_error(self): with self.assertRaisesRegex(ValueError, "col1.*no-missing"): DummyMetadataColumn(pd.Series( [1, np.nan, 3], name='col1', index=pd.Index(['a', 'b', 'c'], name='id')), missing_scheme='no-missing') class TestMetadataColumnConstructionAndProperties(unittest.TestCase): def test_single_id(self): index = pd.Index(['id1'], name='id') series = pd.Series([42], name='col1', index=index) mdc = DummyMetadataColumn(series) self.assertEqual(mdc.id_count, 1) self.assertEqual(mdc.id_header, 'id') self.assertEqual(mdc.ids, ('id1',)) self.assertEqual(mdc.name, 'col1') def test_multiple_ids(self): index = pd.Index(['id1', 'a', 'my-id'], name='id') series = pd.Series([42, 4.2, -4.2], name='column', index=index) mdc = DummyMetadataColumn(series) self.assertEqual(mdc.id_count, 3) self.assertEqual(mdc.id_header, 'id') self.assertEqual(mdc.ids, ('id1', 'a', 'my-id')) self.assertEqual(mdc.name, 'column') def test_supported_id_headers(self): case_insensitive = { 'id', 'sampleid', 'sample id', 'sample-id', 'featureid', 'feature id', 'feature-id' } exact_match = { '#SampleID', '#Sample ID', '#OTUID', '#OTU ID', 'sample_name' } # Build a set of supported headers, including exact matches and headers # with different casing. headers = set() for header in case_insensitive: headers.add(header) headers.add(header.upper()) headers.add(header.title()) for header in exact_match: headers.add(header) count = 0 for header in headers: index = pd.Index(['id1', 'id2'], name=header) series = pd.Series([0, 123], name='column', index=index) mdc = DummyMetadataColumn(series) self.assertEqual(mdc.id_header, header) count += 1 # Since this test case is a little complicated, make sure that the # expected number of comparisons are happening. 
self.assertEqual(count, 26) def test_recommended_ids(self): index = pd.Index(['c6ca034a-223f-40b4-a0e0-45942912a5ea', 'My.ID'], name='id') series = pd.Series([-1, -2], name='col1', index=index) mdc = DummyMetadataColumn(series) self.assertEqual(mdc.id_count, 2) self.assertEqual(mdc.id_header, 'id') self.assertEqual(mdc.ids, ('c6ca034a-223f-40b4-a0e0-45942912a5ea', 'My.ID')) self.assertEqual(mdc.name, 'col1') def test_non_standard_characters(self): index = pd.Index(['©id##1', '((id))2', "'id_3<>'", '"id#4"', 'i d\r\t\n5'], name='id') series = pd.Series([0, 1, 2, 3, 4], name='↩c@l1™', index=index) mdc = DummyMetadataColumn(series) self.assertEqual(mdc.id_count, 5) self.assertEqual(mdc.id_header, 'id') self.assertEqual( mdc.ids, ('©id##1', '((id))2', "'id_3<>'", '"id#4"', 'i d\r\t\n5')) self.assertEqual(mdc.name, '↩c@l1™') def test_missing_data(self): index = pd.Index(['None', 'nan', 'NA'], name='id') series = pd.Series([np.nan, np.nan, np.nan], name='NA', index=index) mdc = DummyMetadataColumn(series) self.assertEqual(mdc.id_count, 3) self.assertEqual(mdc.id_header, 'id') self.assertEqual(mdc.ids, ('None', 'nan', 'NA')) self.assertEqual(mdc.name, 'NA') def test_missing_insdc(self): index = pd.Index(['None', 'nan', 'NA'], name='id') # TODO: note we cannot make a numeric style column of entirely encoded # nans, as there's no way to indicate the true type of the column series = pd.Series(['missing', 'not applicable', 5.0], name='NA', index=index) mdc = DummyMetadataColumn(series, missing_scheme='INSDC:missing') self.assertEqual(mdc.id_count, 3) self.assertEqual(mdc.id_header, 'id') self.assertEqual(mdc.ids, ('None', 'nan', 'NA')) self.assertEqual(mdc.name, 'NA') pd.testing.assert_series_equal( mdc.to_series(), pd.Series( [np.nan, np.nan, 5.0], name='NA', index=index)) def test_does_not_cast_ids_or_column_name(self): index = pd.Index(['0.000001', '0.004000', '0.000000'], dtype=object, name='id') series = pd.Series([2.0, 1.0, 3.0], name='42.0', index=index) mdc = DummyMetadataColumn(series) self.assertEqual(mdc.id_count, 3) self.assertEqual(mdc.id_header, 'id') self.assertEqual(mdc.ids, ('0.000001', '0.004000', '0.000000')) self.assertEqual(mdc.name, '42.0') def test_case_insensitive_duplicate_ids(self): index = pd.Index(['a', 'b', 'A'], name='id') series = pd.Series([1, 2, 3], name='column', index=index) mdc = DummyMetadataColumn(series) self.assertEqual(mdc.ids, ('a', 'b', 'A')) class TestSourceArtifacts(unittest.TestCase): def setUp(self): self.mdc = DummyMetadataColumn(pd.Series( [1, 2, 3], name='col', index=pd.Index(['a', 'b', 'c'], name='id'))) def test_no_source_artifacts(self): self.assertEqual(self.mdc.artifacts, ()) def test_add_zero_artifacts(self): self.mdc._add_artifacts([]) self.assertEqual(self.mdc.artifacts, ()) def test_add_artifacts(self): # First two artifacts have the same data but different UUIDs. artifact1 = Artifact.import_data('Mapping', {'a': '1', 'b': '3'}) self.mdc._add_artifacts([artifact1]) artifact2 = Artifact.import_data('Mapping', {'a': '1', 'b': '3'}) artifact3 = Artifact.import_data('IntSequence1', [1, 2, 3, 4]) self.mdc._add_artifacts([artifact2, artifact3]) self.assertEqual(self.mdc.artifacts, (artifact1, artifact2, artifact3)) def test_add_non_artifact(self): artifact = Artifact.import_data('Mapping', {'a': '1', 'b': '3'}) with self.assertRaisesRegex(TypeError, "Artifact object.*42"): self.mdc._add_artifacts([artifact, 42]) # Test that the object hasn't been mutated. 
self.assertEqual(self.mdc.artifacts, ()) def test_add_duplicate_artifact(self): artifact1 = Artifact.import_data('Mapping', {'a': '1', 'b': '3'}) artifact2 = Artifact.import_data('IntSequence1', [1, 2, 3, 4]) self.mdc._add_artifacts([artifact1, artifact2]) with self.assertRaisesRegex( ValueError, "Duplicate source artifacts.*DummyMetadataColumn.*" "artifact: Mapping"): self.mdc._add_artifacts([artifact1]) # Test that the object hasn't been mutated. self.assertEqual(self.mdc.artifacts, (artifact1, artifact2)) class TestRepr(unittest.TestCase): def test_single_id(self): mdc = DummyMetadataColumn(pd.Series( [42], name='foo', index=pd.Index(['id1'], name='id'))) obs = repr(mdc) self.assertEqual(obs, "") def test_multiple_ids(self): mdc = DummyMetadataColumn(pd.Series( [42, 43, 44], name='my column', index=pd.Index(['id1', 'id2', 'id3'], name='id'))) obs = repr(mdc) self.assertEqual( obs, "") class TestEqualityOperators(unittest.TestCase, ReallyEqualMixin): def setUp(self): get_dummy_plugin() def test_type_mismatch(self): dummy = DummyMetadataColumn(pd.Series( [1.0, 2.0, 3.0], name='col1', index=pd.Index(['id1', 'id2', 'id3'], name='id'))) numeric = NumericMetadataColumn(pd.Series( [1.0, 2.0, 3.0], name='col1', index=pd.Index(['id1', 'id2', 'id3'], name='id'))) categorical = CategoricalMetadataColumn(pd.Series( ['a', 'b', 'c'], name='col1', index=pd.Index(['id1', 'id2', 'id3'], name='id'))) self.assertReallyNotEqual(dummy, numeric) self.assertReallyNotEqual(dummy, categorical) def test_id_header_mismatch(self): mdc1 = DummyMetadataColumn(pd.Series( [42, 43], name='col1', index=pd.Index(['id1', 'id2'], name='id'))) mdc2 = DummyMetadataColumn(pd.Series( [42, 43], name='col1', index=pd.Index(['id1', 'id2'], name='ID'))) self.assertReallyNotEqual(mdc1, mdc2) def test_artifacts_mismatch(self): artifact1 = Artifact.import_data('Mapping', {'a': '1', 'b': '2'}) artifact2 = Artifact.import_data('Mapping', {'a': '1', 'b': '2'}) series = pd.Series([42, 43], name='col1', index=pd.Index(['id1', 'id2'], name='id')) # No artifacts mdc1 = DummyMetadataColumn(series) # Has an artifact mdc2 = DummyMetadataColumn(series) mdc2._add_artifacts([artifact1]) # Has a different artifact mdc3 = DummyMetadataColumn(series) mdc3._add_artifacts([artifact2]) self.assertReallyNotEqual(mdc1, mdc2) self.assertReallyNotEqual(mdc2, mdc3) def test_id_mismatch(self): mdc1 = DummyMetadataColumn(pd.Series( [42, 43], name='col1', index=pd.Index(['id1', 'id2'], name='id'))) mdc2 = DummyMetadataColumn(pd.Series( [42, 43], name='col1', index=pd.Index(['id1', 'id3'], name='id'))) self.assertReallyNotEqual(mdc1, mdc2) def test_column_name_mismatch(self): mdc1 = DummyMetadataColumn(pd.Series( [42, 43], name='col1', index=pd.Index(['id1', 'id2'], name='id'))) mdc2 = DummyMetadataColumn(pd.Series( [42, 43], name='col2', index=pd.Index(['id1', 'id2'], name='id'))) self.assertReallyNotEqual(mdc1, mdc2) def test_data_mismatch(self): mdc1 = DummyMetadataColumn(pd.Series( [42, 43], name='col1', index=pd.Index(['id1', 'id2'], name='id'))) mdc2 = DummyMetadataColumn(pd.Series( [42, 42], name='col1', index=pd.Index(['id1', 'id2'], name='id'))) self.assertReallyNotEqual(mdc1, mdc2) def test_equality_without_artifact(self): mdc1 = DummyMetadataColumn(pd.Series( [42, 43], name='col1', index=pd.Index(['id1', 'id2'], name='id'))) mdc2 = DummyMetadataColumn(pd.Series( [42, 43], name='col1', index=pd.Index(['id1', 'id2'], name='id'))) self.assertReallyEqual(mdc1, mdc2) def test_equality_with_artifact(self): artifact = Artifact.import_data('Mapping', {'a': 
'1', 'b': '2'}) mdc1 = DummyMetadataColumn(pd.Series( [42, 43], name='col1', index=pd.Index(['id1', 'id2'], name='id'))) mdc1._add_artifacts([artifact]) mdc2 = DummyMetadataColumn(pd.Series( [42, 43], name='col1', index=pd.Index(['id1', 'id2'], name='id'))) mdc2._add_artifacts([artifact]) self.assertReallyEqual(mdc1, mdc2) def test_equality_with_missing_data(self): mdc1 = DummyMetadataColumn(pd.Series( [42, np.nan, 43, np.nan], name='col1', index=pd.Index(['id1', 'id2', 'id3', 'id4'], name='id'))) mdc2 = DummyMetadataColumn(pd.Series( [42, np.nan, 43, np.nan], name='col1', index=pd.Index(['id1', 'id2', 'id3', 'id4'], name='id'))) self.assertReallyEqual(mdc1, mdc2) # Extensive tests of the MetadataWriter are performed in test_io.py. This test # is a sanity check that a new MetadataColumn subclass (DummyMetadataColumn) # can be written to disk with its column type preserved. This test would have # caught a bug in the original implementation of MetadataColumn.save(), which # converted itself into a Metadata object, losing the "dummy" column type and # replacing it with "numeric". In order for a MetadataColumn to turn itself # into a Metadata object in a lossless/safe way, the Metadata constructor needs # a `column_types` parameter to preserve column types. class TestSave(unittest.TestCase): def setUp(self): self.temp_dir_obj = tempfile.TemporaryDirectory( prefix='qiime2-metadata-tests-temp-') self.temp_dir = self.temp_dir_obj.name self.filepath = os.path.join(self.temp_dir, 'metadata.tsv') def tearDown(self): self.temp_dir_obj.cleanup() def test_basic(self): mdc = DummyMetadataColumn(pd.Series( [42, 42.5, -999.123], name='dummy-column', index=pd.Index(['id1', 'id2', 'id3'], name='id'))) mdc.save(self.filepath) with open(self.filepath, 'r') as fh: obs = fh.read() exp = ( "id\tdummy-column\n" "#q2:types\tdummy\n" "id1\t42\n" "id2\t42.5\n" "id3\t-999.123\n" ) self.assertEqual(obs, exp) class TestToSeries(unittest.TestCase): def test_single_id(self): series = pd.Series([0.0], name='col', index=pd.Index(['id1'], name='id')) mdc = DummyMetadataColumn(series) obs = mdc.to_series() pd.testing.assert_series_equal(obs, series) def test_multiple_ids(self): series = pd.Series([-1.5, np.nan, 42], name='col', index=pd.Index(['id1', 'id2', 'id3'], name='id')) mdc = DummyMetadataColumn(series) obs = mdc.to_series() pd.testing.assert_series_equal(obs, series) def test_id_header_preserved(self): series = pd.Series( [-1.5, 0.0, 42], name='col', index=pd.Index(['id1', 'id2', 'id3'], name='#OTU ID')) mdc = DummyMetadataColumn(series) obs = mdc.to_series() pd.testing.assert_series_equal(obs, series) self.assertEqual(obs.index.name, '#OTU ID') def test_series_copy(self): series = pd.Series([1, 2.5, 3], name='col', index=pd.Index(['id1', 'id2', 'id3'], name='id')) mdc = DummyMetadataColumn(series) obs = mdc.to_series() pd.testing.assert_series_equal(obs, series) self.assertIsNot(obs, series) def test_encode_missing_no_missing(self): series = pd.Series([1, 2.5, 3], name='col', index=pd.Index(['id1', 'id2', 'id3'], name='id')) mdc = DummyMetadataColumn(series, missing_scheme='INSDC:missing') obs = mdc.to_series(encode_missing=True) pd.testing.assert_series_equal(obs, series) self.assertIsNot(obs, series) def test_encode_missing_true(self): series = pd.Series([1, 2.5, 'missing'], name='col', index=pd.Index(['id1', 'id2', 'id3'], name='id')) mdc = DummyMetadataColumn(series, missing_scheme='INSDC:missing') obs = mdc.to_series(encode_missing=True) pd.testing.assert_series_equal(obs, series) self.assertIsNot(obs, series) 
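    # The tests above and the one below pin down the `encode_missing` contract
    # of MetadataColumn.to_series() when a missing-value scheme such as
    # 'INSDC:missing' is in use: with encode_missing=True the original
    # vocabulary terms (e.g. 'missing') are returned verbatim, while the
    # default encode_missing=False normalizes them to np.nan. A minimal
    # sketch, assuming the DummyMetadataColumn defined at the top of this
    # module (values mirror the assertions in the neighboring tests):
    #
    #   series = pd.Series([1, 2.5, 'missing'], name='col',
    #                      index=pd.Index(['id1', 'id2', 'id3'], name='id'))
    #   mdc = DummyMetadataColumn(series, missing_scheme='INSDC:missing')
    #   mdc.to_series(encode_missing=True)  # -> 1, 2.5, 'missing' (terms kept)
    #   mdc.to_series()                     # -> 1.0, 2.5, NaN (normalized)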
def test_encode_missing_false(self): series = pd.Series([1, 2.5, 'missing'], name='col', index=pd.Index(['id1', 'id2', 'id3'], name='id')) mdc = DummyMetadataColumn(series, missing_scheme='INSDC:missing') obs = mdc.to_series() exp = pd.Series([1, 2.5, np.nan], name='col', index=pd.Index(['id1', 'id2', 'id3'], name='id')) pd.testing.assert_series_equal(obs, exp) self.assertIsNot(obs, series) class TestToDataframe(unittest.TestCase): def test_single_id(self): series = pd.Series([0.0], name='col', index=pd.Index(['id1'], name='id')) mdc = DummyMetadataColumn(series) obs = mdc.to_dataframe() exp = pd.DataFrame({'col': [0.0]}, index=pd.Index(['id1'], name='id')) pd.testing.assert_frame_equal(obs, exp) def test_multiple_ids(self): series = pd.Series([0.0, 4.2, np.nan], name='my column', index=pd.Index(['a', 'b', 'c'], name='id')) mdc = DummyMetadataColumn(series) obs = mdc.to_dataframe() exp = pd.DataFrame({'my column': [0.0, 4.2, np.nan]}, index=pd.Index(['a', 'b', 'c'], name='id')) pd.testing.assert_frame_equal(obs, exp) def test_id_header_preserved(self): series = pd.Series([0.0, 4.2, 123], name='my column', index=pd.Index(['a', 'b', 'c'], name='#Sample ID')) mdc = DummyMetadataColumn(series) obs = mdc.to_dataframe() exp = pd.DataFrame({'my column': [0.0, 4.2, 123]}, index=pd.Index(['a', 'b', 'c'], name='#Sample ID')) pd.testing.assert_frame_equal(obs, exp) self.assertEqual(obs.index.name, '#Sample ID') def test_encode_missing_no_missing(self): series = pd.Series([1, 2.5, 3], name='col', index=pd.Index(['id1', 'id2', 'id3'], name='id')) mdc = DummyMetadataColumn(series, missing_scheme='INSDC:missing') obs = mdc.to_dataframe(encode_missing=True) exp = pd.DataFrame({'col': series}, index=series.index) pd.testing.assert_frame_equal(obs, exp) def test_encode_missing_true(self): series = pd.Series([1, 2.5, 'missing'], name='col', index=pd.Index(['id1', 'id2', 'id3'], name='id')) mdc = DummyMetadataColumn(series, missing_scheme='INSDC:missing') obs = mdc.to_dataframe(encode_missing=True) exp = pd.DataFrame({'col': series}, index=series.index) pd.testing.assert_frame_equal(obs, exp) def test_encode_missing_false(self): series = pd.Series([1, 2.5, 'missing'], name='col', index=pd.Index(['id1', 'id2', 'id3'], name='id')) mdc = DummyMetadataColumn(series, missing_scheme='INSDC:missing') obs = mdc.to_dataframe() exp = pd.DataFrame({'col': [1, 2.5, np.nan]}, index=series.index) pd.testing.assert_frame_equal(obs, exp) class TestGetValue(unittest.TestCase): def test_id_not_found(self): series = pd.Series([1, 2, 3], name='col1', index=pd.Index(['a', 'b', 'c'], name='id')) mdc = DummyMetadataColumn(series) with self.assertRaisesRegex( ValueError, "'d' is not present.*DummyMetadataColumn.*'col1'"): mdc.get_value('d') def test_get_value(self): series = pd.Series([1, 2, np.nan], name='col1', index=pd.Index(['a', 'b', 'c'], name='id')) mdc = DummyMetadataColumn(series) obs = mdc.get_value('a') self.assertEqual(obs, 1.0) obs = mdc.get_value('b') self.assertEqual(obs, 2.0) obs = mdc.get_value('c') self.assertTrue(np.isnan(obs)) class TestHasMissingValues(unittest.TestCase): def test_no_missing_values(self): series = pd.Series([0.0, 2.2, 3.3], name='col1', index=pd.Index(['a', 'b', 'c'], name='id')) mdc = DummyMetadataColumn(series) obs = mdc.has_missing_values() self.assertEqual(obs, False) def test_with_missing_values(self): series = pd.Series([0.0, np.nan, 3.3], name='col1', index=pd.Index(['a', 'b', 'c'], name='id')) mdc = DummyMetadataColumn(series) obs = mdc.has_missing_values() self.assertEqual(obs, True) 
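# A minimal sketch of the missing-value helpers exercised by the surrounding
# test classes. It is illustrative only (no test collects or calls it) and
# relies on the pd/np imports and the DummyMetadataColumn subclass defined at
# the top of this module; the expected values mirror the assertions made in
# the neighboring tests.
def _missing_value_helpers_sketch():
    series = pd.Series([0.0, np.nan, 3.3], name='col1',
                       index=pd.Index(['a', 'b', 'c'], name='id'))
    mdc = DummyMetadataColumn(series)

    assert mdc.has_missing_values()                        # 'b' is NaN
    assert mdc.drop_missing_values().ids == ('a', 'c')     # NaN row dropped
    assert mdc.get_ids(where_values_missing=True) == {'b'}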
class TestDropMissingValues(unittest.TestCase): def test_no_missing_values(self): series = pd.Series([0.0, 2.2, 3.3], name='col1', index=pd.Index(['a', 'b', 'c'], name='id')) mdc = DummyMetadataColumn(series) obs = mdc.drop_missing_values() self.assertEqual(obs, mdc) self.assertIsNot(obs, mdc) def test_with_missing_values(self): series = pd.Series( [0.0, np.nan, 3.3, np.nan, np.nan, 4.4], name='col1', index=pd.Index(['a', 'b', 'c', 'd', 'e', 'f'], name='sampleid')) mdc = DummyMetadataColumn(series) obs = mdc.drop_missing_values() exp = DummyMetadataColumn(pd.Series( [0.0, 3.3, 4.4], name='col1', index=pd.Index(['a', 'c', 'f'], name='sampleid'))) self.assertEqual(obs, exp) def test_with_missing_scheme(self): series = pd.Series( [0.0, np.nan, 3.3, 'missing', 'not applicable', 4.4], name='col1', index=pd.Index(['a', 'b', 'c', 'd', 'e', 'f'], name='sampleid')) mdc = DummyMetadataColumn(series, missing_scheme='INSDC:missing') obs = mdc.drop_missing_values() exp = DummyMetadataColumn(pd.Series( [0.0, 3.3, 4.4], name='col1', index=pd.Index(['a', 'c', 'f'], name='sampleid'))) self.assertEqual(obs, exp) def test_artifacts_are_propagated(self): artifact = Artifact.import_data('Mapping', {'a': '1', 'b': '2'}) series = pd.Series( [0.0, np.nan, 3.3, np.nan, np.nan, 4.4], name='col1', index=pd.Index(['a', 'b', 'c', 'd', 'e', 'f'], name='sampleid')) mdc = DummyMetadataColumn(series) mdc._add_artifacts([artifact]) obs = mdc.drop_missing_values() exp = DummyMetadataColumn(pd.Series( [0.0, 3.3, 4.4], name='col1', index=pd.Index(['a', 'c', 'f'], name='sampleid'))) exp._add_artifacts([artifact]) self.assertEqual(obs, exp) self.assertEqual(obs.artifacts, (artifact,)) class TestGetIDs(unittest.TestCase): def test_single_id(self): series = pd.Series([1.234], name='col1', index=pd.Index(['my id'], name='id')) mdc = DummyMetadataColumn(series) obs = mdc.get_ids() self.assertEqual(obs, {'my id'}) def test_multiple_ids(self): series = pd.Series( [1.234, np.nan, 5.67, np.nan, 8.9], name='col1', index=pd.Index(['id1', 'id2', 'id3', 'id4', 'id5'], name='id')) mdc = DummyMetadataColumn(series) obs = mdc.get_ids() self.assertEqual(obs, {'id1', 'id2', 'id3', 'id4', 'id5'}) def test_where_values_missing(self): series = pd.Series( [1.234, np.nan, 5.67, np.nan, 8.9], name='col1', index=pd.Index(['id1', 'id2', 'id3', 'id4', 'id5'], name='id')) mdc = DummyMetadataColumn(series) obs = mdc.get_ids(where_values_missing=True) self.assertEqual(obs, {'id2', 'id4'}) def test_where_values_missing_all_missing(self): series = pd.Series( [np.nan, np.nan, np.nan], name='col1', index=pd.Index(['id1', 'id2', 'id3'], name='id')) mdc = DummyMetadataColumn(series) obs = mdc.get_ids(where_values_missing=True) self.assertEqual(obs, {'id1', 'id2', 'id3'}) class TestFilterIDs(unittest.TestCase): def setUp(self): get_dummy_plugin() def test_supports_iterable(self): mdc = DummyMetadataColumn(pd.Series( [1, 2, 3], name='col1', index=pd.Index(['a', 'b', 'c'], name='id'))) obs = mdc.filter_ids(iter({'a', 'c'})) exp = DummyMetadataColumn(pd.Series( [1, 3], name='col1', index=pd.Index(['a', 'c'], name='id'))) self.assertEqual(obs, exp) def test_keep_all(self): mdc = DummyMetadataColumn(pd.Series( [1, 2, 3], name='col1', index=pd.Index(['a', 'b', 'c'], name='id'))) obs = mdc.filter_ids({'a', 'b', 'c'}) self.assertEqual(obs, mdc) self.assertIsNot(obs, mdc) def test_keep_multiple(self): mdc = DummyMetadataColumn(pd.Series( [1, 2, 3], name='col1', index=pd.Index(['a', 'b', 'c'], name='id'))) obs = mdc.filter_ids({'a', 'c'}) exp = 
DummyMetadataColumn(pd.Series( [1, 3], name='col1', index=pd.Index(['a', 'c'], name='id'))) self.assertEqual(obs, exp) def test_keep_one(self): mdc = DummyMetadataColumn(pd.Series( [1, 2, 3], name='col1', index=pd.Index(['a', 'b', 'c'], name='id'))) obs = mdc.filter_ids({'b'}) exp = DummyMetadataColumn(pd.Series( [2], name='col1', index=pd.Index(['b'], name='id'))) self.assertEqual(obs, exp) def test_alternate_id_header(self): mdc = DummyMetadataColumn(pd.Series( [1, 2, 3], name='col1', index=pd.Index(['a', 'b', 'c'], name='#OTU ID'))) obs = mdc.filter_ids({'b', 'c'}) exp = DummyMetadataColumn(pd.Series( [2, 3], name='col1', index=pd.Index(['b', 'c'], name='#OTU ID'))) self.assertEqual(obs, exp) def test_no_artifacts(self): mdc = DummyMetadataColumn(pd.Series( [1, 2, 3], name='col1', index=pd.Index(['a', 'b', 'c'], name='id'))) self.assertEqual(mdc.artifacts, ()) filtered = mdc.filter_ids({'b'}) self.assertEqual(filtered.artifacts, ()) def test_with_artifacts(self): artifact1 = Artifact.import_data('Mapping', {'a': '1', 'b': '2'}) artifact2 = Artifact.import_data('Mapping', {'d': '4'}) mdc = DummyMetadataColumn(pd.Series( [1, 2, 3], name='col1', index=pd.Index(['a', 'b', 'c'], name='id'))) mdc._add_artifacts([artifact1, artifact2]) obs = mdc.filter_ids({'a', 'c'}) exp = DummyMetadataColumn(pd.Series( [1, 3], name='col1', index=pd.Index(['a', 'c'], name='id'))) exp._add_artifacts([artifact1, artifact2]) self.assertEqual(obs, exp) self.assertEqual(obs.artifacts, (artifact1, artifact2)) def test_empty_ids_to_keep(self): mdc = DummyMetadataColumn(pd.Series( [1, 2, 3], name='col1', index=pd.Index(['a', 'b', 'c'], name='id'))) with self.assertRaisesRegex(ValueError, 'ids_to_keep.*at least one ID'): mdc.filter_ids({}) def test_duplicate_ids_to_keep(self): mdc = DummyMetadataColumn(pd.Series( [1, 2, 3], name='col1', index=pd.Index(['a', 'b', 'c'], name='id'))) with self.assertRaisesRegex(ValueError, "ids_to_keep.*unique IDs.*'b'"): mdc.filter_ids(['b', 'c', 'b']) def test_missing_ids_to_keep(self): mdc = DummyMetadataColumn(pd.Series( [1, 2, 3], name='col1', index=pd.Index(['a', 'b', 'c'], name='id'))) with self.assertRaisesRegex(ValueError, "IDs.*not present.*'d', 'id1'"): mdc.filter_ids({'b', 'id1', 'c', 'd'}) class TestGetMissing(unittest.TestCase): def test_missing_mixed(self): series = pd.Series( [0.0, np.nan, 3.3, 'missing', 'not applicable', 4.4], name='col1', index=pd.Index(['a', 'b', 'c', 'd', 'e', 'f'], name='sampleid')) mdc = DummyMetadataColumn(series, missing_scheme='INSDC:missing') missing = mdc.get_missing() exp = pd.Series([np.nan, 'missing', 'not applicable'], name='col1', index=pd.Index(['b', 'd', 'e'], name='sampleid')) pd.testing.assert_series_equal(missing, exp) def test_missing_blanks(self): series = pd.Series( [0.0, np.nan, 3.3, np.nan, np.nan, 4.4], name='col1', index=pd.Index(['a', 'b', 'c', 'd', 'e', 'f'], name='sampleid')) mdc = DummyMetadataColumn(series, missing_scheme='INSDC:missing') missing = mdc.get_missing() exp = pd.Series([np.nan, np.nan, np.nan], name='col1', dtype=object, index=pd.Index(['b', 'd', 'e'], name='sampleid')) pd.testing.assert_series_equal(missing, exp) def test_no_missing(self): series = pd.Series( [0.0, 1.1, 3.3, 3.5, 4.0, 4.4], name='col1', index=pd.Index(['a', 'b', 'c', 'd', 'e', 'f'], name='sampleid')) mdc = DummyMetadataColumn(series, missing_scheme='INSDC:missing') missing = mdc.get_missing() exp = pd.Series([], name='col1', dtype=object, index=pd.Index([], name='sampleid')) pd.testing.assert_series_equal(missing, exp) # The tests for 
CategoricalMetadataColumn and NumericMetadataColumn only test # behavior specific to these subclasses. More extensive tests of these objects # are performed above by testing the MetadataColumn ABC in a generic way. class TestCategoricalMetadataColumn(unittest.TestCase): def test_unsupported_dtype(self): with self.assertRaisesRegex( TypeError, "CategoricalMetadataColumn 'col1' does not support" ".*Series.*dtype.*float64"): CategoricalMetadataColumn(pd.Series( [42.5, 42.6, 42.7], name='col1', index=pd.Index(['a', 'b', 'c'], name='id'))) def test_unsupported_type_value(self): with self.assertRaisesRegex( TypeError, "CategoricalMetadataColumn.*strings or missing " r"values.*42\.5.*float.*'col1'"): CategoricalMetadataColumn(pd.Series( ['foo', 'bar', 42.5], name='col1', index=pd.Index(['a', 'b', 'c'], name='id'))) def test_empty_str_value(self): with self.assertRaisesRegex( ValueError, "CategoricalMetadataColumn.*empty strings.*" "column 'col1'"): CategoricalMetadataColumn(pd.Series( ['foo', '', 'bar'], name='col1', index=pd.Index(['a', 'b', 'c'], name='id'))) def test_type_property(self): self.assertEqual(CategoricalMetadataColumn.type, 'categorical') def test_supported_dtype(self): series = pd.Series( ['foo', np.nan, 'bar', 'foo'], name='my column', index=pd.Index(['a', 'b', 'c', 'd'], name='id')) mdc = CategoricalMetadataColumn(series) self.assertEqual(mdc.id_count, 4) self.assertEqual(mdc.id_header, 'id') self.assertEqual(mdc.ids, ('a', 'b', 'c', 'd')) self.assertEqual(mdc.name, 'my column') obs_series = mdc.to_series() pd.testing.assert_series_equal(obs_series, series) self.assertEqual(obs_series.dtype, object) def test_numeric_strings_preserved_as_strings(self): series = pd.Series( ['1', np.nan, '2.5', '3.0'], name='my column', index=pd.Index(['a', 'b', 'c', 'd'], name='id')) mdc = CategoricalMetadataColumn(series) self.assertEqual(mdc.id_count, 4) self.assertEqual(mdc.id_header, 'id') self.assertEqual(mdc.ids, ('a', 'b', 'c', 'd')) self.assertEqual(mdc.name, 'my column') obs_series = mdc.to_series() pd.testing.assert_series_equal(obs_series, series) self.assertEqual(obs_series.dtype, object) def test_missing_data_normalized(self): # Different missing data representations should be normalized to np.nan mdc = CategoricalMetadataColumn(pd.Series( [np.nan, 'foo', float('nan'), None], name='col1', index=pd.Index(['a', 'b', 'c', 'd'], name='id'))) obs = mdc.to_series() exp = pd.Series( [np.nan, 'foo', np.nan, np.nan], name='col1', index=pd.Index(['a', 'b', 'c', 'd'], name='id')) pd.testing.assert_series_equal(obs, exp) self.assertEqual(obs.dtype, object) self.assertTrue(np.isnan(obs['a'])) self.assertTrue(np.isnan(obs['c'])) self.assertTrue(np.isnan(obs['d'])) def test_all_missing_data(self): mdc = CategoricalMetadataColumn(pd.Series( np.array([np.nan, np.nan, np.nan], dtype=object), name='col1', index=pd.Index(['a', 'b', 'c'], name='id'))) obs = mdc.to_series() exp = pd.Series( np.array([np.nan, np.nan, np.nan], dtype=object), name='col1', index=pd.Index(['a', 'b', 'c'], name='id')) pd.testing.assert_series_equal(obs, exp) self.assertEqual(obs.dtype, object) def test_leading_trailing_whitespace_value(self): col1 = CategoricalMetadataColumn(pd.Series( ['foo', ' bar ', 'baz'], name='col1', index=pd.Index(['a', 'b', 'c'], name='id'))) col2 = CategoricalMetadataColumn(pd.Series( ['foo', 'bar', 'baz'], name='col1', index=pd.Index(['a', 'b', 'c'], name='id'))) self.assertEqual(col1, col2) def test_leading_trailing_whitespace_id(self): col1 = CategoricalMetadataColumn(pd.Series( ['foo', ' bar ', 
'baz'], name='col', index=pd.Index(['a', ' b ', 'c'], name='id'))) col2 = CategoricalMetadataColumn(pd.Series( ['foo', ' bar ', 'baz'], name='col', index=pd.Index(['a', 'b', 'c'], name='id'))) self.assertEqual(col1, col2) def test_leading_trailing_whitespace_column_name(self): col1 = CategoricalMetadataColumn(pd.Series( ['foo', ' bar ', 'baz'], name=' col ', index=pd.Index(['a', 'b', 'c'], name='id'))) col2 = CategoricalMetadataColumn(pd.Series( ['foo', ' bar ', 'baz'], name='col', index=pd.Index(['a', 'b', 'c'], name='id'))) self.assertEqual(col1, col2) def test_missing_insdc(self): mdc = CategoricalMetadataColumn(pd.Series( ['missing', 'foo', float('nan'), None], name='col1', index=pd.Index(['a', 'b', 'c', 'd'], name='id')), missing_scheme='INSDC:missing') obs = mdc.to_series() exp = pd.Series( [np.nan, 'foo', np.nan, np.nan], name='col1', index=pd.Index(['a', 'b', 'c', 'd'], name='id')) pd.testing.assert_series_equal(obs, exp) self.assertEqual(obs.dtype, object) self.assertTrue(np.isnan(obs['a'])) self.assertTrue(np.isnan(obs['c'])) self.assertTrue(np.isnan(obs['d'])) class TestNumericMetadataColumn(unittest.TestCase): def test_unsupported_dtype(self): with self.assertRaisesRegex( TypeError, "NumericMetadataColumn 'col1' does not support" ".*Series.*dtype.*bool"): NumericMetadataColumn(pd.Series( [True, False, True], name='col1', index=pd.Index(['a', 'b', 'c'], name='id'))) def test_infinity_value(self): with self.assertRaisesRegex( ValueError, "NumericMetadataColumn.*positive or negative " "infinity.*column 'col1'"): NumericMetadataColumn(pd.Series( [42, float('+inf'), 4.3], name='col1', index=pd.Index(['a', 'b', 'c'], name='id'))) def test_type_property(self): self.assertEqual(NumericMetadataColumn.type, 'numeric') def test_supported_dtype_float(self): series = pd.Series( [1.23, np.nan, 4.56, -7.891], name='my column', index=pd.Index(['a', 'b', 'c', 'd'], name='id')) mdc = NumericMetadataColumn(series) self.assertEqual(mdc.id_count, 4) self.assertEqual(mdc.id_header, 'id') self.assertEqual(mdc.ids, ('a', 'b', 'c', 'd')) self.assertEqual(mdc.name, 'my column') obs_series = mdc.to_series() pd.testing.assert_series_equal(obs_series, series) self.assertEqual(obs_series.dtype, np.float64) def test_supported_dtype_int(self): series = pd.Series( [0, 1, 42, -2], name='my column', index=pd.Index(['a', 'b', 'c', 'd'], name='id')) mdc = NumericMetadataColumn(series) self.assertEqual(mdc.id_count, 4) self.assertEqual(mdc.id_header, 'id') self.assertEqual(mdc.ids, ('a', 'b', 'c', 'd')) self.assertEqual(mdc.name, 'my column') obs_series = mdc.to_series() exp_series = pd.Series( [0.0, 1.0, 42.0, -2.0], name='my column', index=pd.Index(['a', 'b', 'c', 'd'], name='id')) pd.testing.assert_series_equal(obs_series, exp_series) self.assertEqual(obs_series.dtype, np.float64) def test_missing_data_normalized(self): # Different missing data representations should be normalized to np.nan mdc = NumericMetadataColumn(pd.Series( [np.nan, 4.2, float('nan'), -5.678], name='col1', index=pd.Index(['a', 'b', 'c', 'd'], name='id'))) obs = mdc.to_series() exp = pd.Series( [np.nan, 4.2, np.nan, -5.678], name='col1', index=pd.Index(['a', 'b', 'c', 'd'], name='id')) pd.testing.assert_series_equal(obs, exp) self.assertEqual(obs.dtype, np.float64) self.assertTrue(np.isnan(obs['a'])) self.assertTrue(np.isnan(obs['c'])) def test_all_missing_data(self): mdc = NumericMetadataColumn(pd.Series( [np.nan, np.nan, np.nan], name='col1', index=pd.Index(['a', 'b', 'c'], name='id'))) obs = mdc.to_series() exp = pd.Series( [np.nan, 
np.nan, np.nan], name='col1', index=pd.Index(['a', 'b', 'c'], name='id')) pd.testing.assert_series_equal(obs, exp) self.assertEqual(obs.dtype, np.float64) def test_missing_insdc(self): mdc = NumericMetadataColumn(pd.Series( ['missing', 4.2, float('nan'), -5.678], name='col1', index=pd.Index(['a', 'b', 'c', 'd'], name='id')), missing_scheme='INSDC:missing') obs = mdc.to_series() exp = pd.Series( [np.nan, 4.2, np.nan, -5.678], name='col1', index=pd.Index(['a', 'b', 'c', 'd'], name='id')) pd.testing.assert_series_equal(obs, exp) self.assertEqual(obs.dtype, np.float64) self.assertTrue(np.isnan(obs['a'])) self.assertTrue(np.isnan(obs['c'])) if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/plugin/000077500000000000000000000000001462552636000156275ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/plugin/__init__.py000066400000000000000000000025401462552636000177410ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- from .model import (TextFileFormat, BinaryFileFormat, DirectoryFormat, ValidationError) from .plugin import Plugin from .util import get_available_cores from qiime2.core.cite import Citations, CitationRecord from qiime2.core.type import (SemanticType, Int, Str, Float, Metadata, MetadataColumn, Categorical, Numeric, Properties, Range, Start, End, Choices, Bool, Set, List, Collection, Visualization, TypeMap, TypeMatch, Jobs, Threads) __all__ = ['TextFileFormat', 'BinaryFileFormat', 'DirectoryFormat', 'Plugin', 'SemanticType', 'Set', 'List', 'Collection', 'Bool', 'Int', 'Str', 'Float', 'Metadata', 'MetadataColumn', 'Categorical', 'Numeric', 'Properties', 'Range', 'Start', 'End', 'Choices', 'Visualization', 'Jobs', 'Threads', 'TypeMap', 'TypeMatch', 'ValidationError', 'Citations', 'CitationRecord', 'get_available_cores'] qiime2-2024.5.0/qiime2/plugin/model/000077500000000000000000000000001462552636000167275ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/plugin/model/__init__.py000066400000000000000000000014121462552636000210360ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- from .directory_format import ( DirectoryFormat, File, FileCollection, SingleFileDirectoryFormat, SingleFileDirectoryFormatBase) from .file_format import TextFileFormat, BinaryFileFormat from .base import ValidationError __all__ = ['DirectoryFormat', 'File', 'FileCollection', 'TextFileFormat', 'BinaryFileFormat', 'SingleFileDirectoryFormat', 'SingleFileDirectoryFormatBase', 'ValidationError'] qiime2-2024.5.0/qiime2/plugin/model/base.py000066400000000000000000000014421462552636000202140ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- from qiime2.core.format import FormatBase from qiime2.core.exceptions import ValidationError __all__ = ['FormatBase', 'ValidationError', '_check_validation_level'] # TODO: once sniff is dropped, move this up into FormatBase as validate method def _check_validation_level(level): if level not in ('min', 'max'): raise ValueError('Invalid validation level requested (%s), must ' 'be \'min\' or \'max\'.' % level) qiime2-2024.5.0/qiime2/plugin/model/directory_format.py000066400000000000000000000177161462552636000226710ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import re import shutil import sys import pathlib from qiime2.core import transform from .base import FormatBase, ValidationError, _check_validation_level class PathMakerDescriptor: def __init__(self, file): self.file = file def __get__(self, obj, cls=None): if obj is None: raise Exception() return getattr(obj, self.file.name).path_maker class File: def __init__(self, pathspec, *, format=None, optional=False): if format is None: raise TypeError("Must provide a format.") self.pathspec = pathspec self.format = format self.optional = optional def __get__(self, obj, cls=None): if obj is None: return self return BoundFile(self, obj) class FileCollection(File): def __init__(self, pathspec, *, format=None, optional=False): super().__init__(pathspec, format=format, optional=optional) self._path_maker = None def set_path_maker(self, function): self._path_maker = function return PathMakerDescriptor(self) def __get__(self, obj, cls=None): if obj is None: return self if self._path_maker is None: raise NotImplementedError( "FileCollection: {} missing pathmaker" " definition. 
To set one add use `@{}.set_path_maker`" " decorator to assign one".format(self.name, self.name)) return BoundFileCollection(self, obj, path_maker=self._path_maker) class BoundFile: @property def mode(self): return self._directory_format._mode def __init__(self, unbound_file, directory_format): self.name = unbound_file.name self.pathspec = unbound_file.pathspec self.format = unbound_file.format self.optional = unbound_file.optional self._directory_format = directory_format self._path_maker = lambda s: unbound_file.pathspec def view(self, view_type): from_type = transform.ModelType.from_view_type(self.format) to_type = transform.ModelType.from_view_type(view_type) transformation = from_type.make_transformation(to_type) return transformation(self.path_maker()) def write_data(self, view, view_type, **kwargs): # TODO: make `view_type` optional like in `Artifact.import_data` if self.mode != 'w': raise TypeError("Cannot use `set`/`add` when mode=%r" % self.mode) from_type = transform.ModelType.from_view_type(view_type) to_type = transform.ModelType.from_view_type(self.format) transformation = from_type.make_transformation(to_type) result = transformation(view) result.path._move_or_copy(self.path_maker(**kwargs)) def _validate_members(self, collected_paths, level): found_members = False root = pathlib.Path(self._directory_format.path) for path in collected_paths: if re.fullmatch(self.pathspec, str(path.relative_to(root))): if collected_paths[path]: # Not a ValidationError, this just shouldn't happen. raise ValueError("%r was already validated by another" " field, the pathspecs (regexes) must" " overlap." % path) collected_paths[path] = True found_members = True self.format(path, mode='r').validate(level) if not found_members and not self.optional: raise ValidationError( "Missing one or more files for %s: %r" % (self._directory_format.__class__.__name__, self.pathspec)) @property def path_maker(self): def bound_path_maker(**kwargs): # Must wrap in a naive Path, otherwise an OutPath would be summoned # into this world, and would destroy everything in its path. path = (pathlib.Path(self._directory_format.path) / self._path_maker(self._directory_format, **kwargs)) # NOTE: path makers are bound to the directory format, so must be # provided as the first argument which will look like `self` to # the plugin-dev. path.parent.mkdir(parents=True, exist_ok=True) return path return bound_path_maker class BoundFileCollection(BoundFile): def __init__(self, unbound_file_collection, directory_format, path_maker): super().__init__(unbound_file_collection, directory_format) self._path_maker = path_maker def view(self, view_type): raise NotImplementedError("Use `iter_views` instead.") def iter_views(self, view_type): # Don't want an OutPath, just a Path root = pathlib.Path(self._directory_format.path) paths = [fp for fp in sorted(root.glob('**/*')) if re.match(self.pathspec, str(fp.relative_to(root)))] from_type = transform.ModelType.from_view_type(self.format) to_type = transform.ModelType.from_view_type(view_type) transformation = from_type.make_transformation(to_type) for fp in paths: # TODO: include capture? 
yield fp.relative_to(root), transformation(fp) class _DirectoryMeta(type): def __init__(self, name, bases, dct): super().__init__(name, bases, dct) if hasattr(self, '_fields'): fields = self._fields.copy() else: fields = [] for key, value in dct.items(): if isinstance(value, File): # TODO: validate that the paths described by `value` are unique # within a DirectoryFormat value.name = key fields.append(key) self._fields = fields class DirectoryFormat(FormatBase, metaclass=_DirectoryMeta): def validate(self, level='max'): _check_validation_level(level) if not self.path.is_dir(): raise ValidationError("%s is not a directory." % self.path) collected_paths = {p: None for p in self.path.glob('**/*') if not p.name.startswith('.') and p.is_file()} for field in self._fields: getattr(self, field)._validate_members(collected_paths, level) for path, value in collected_paths.items(): if value: continue if value is None: raise ValidationError("Unrecognized file (%s) for %s." % (path, self.__class__.__name__)) if hasattr(self, '_validate_'): try: self._validate_(level) except ValidationError as e: raise ValidationError( "%s is not a(n) %s:\n\n%s" % (self.path, self.__class__.__name__, str(e)) ) from e def save(self, path, ext=None): path = str(path) # in case of pathlib.Path path = path.rstrip('.') # ignore the extension when saving a directory shutil.copytree(self.path, path) return path class SingleFileDirectoryFormatBase(DirectoryFormat): pass def SingleFileDirectoryFormat(name, pathspec, format): # TODO: do the same hack namedtuple does so we don't mangle globals # (arguably the code is going to be broken if defined dynamically anyways, # but better to find that out later than writing in the module namespace # even if it isn't called module-level [which is must be!]) df = type(name, (SingleFileDirectoryFormatBase,), {'file': File(pathspec, format=format)}) df.__module__ = sys._getframe(1).f_globals.get('__name__', '__main__') return df qiime2-2024.5.0/qiime2/plugin/model/file_format.py000066400000000000000000000050061462552636000215710ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import abc import shutil from qiime2.core import transform from .base import FormatBase, ValidationError, _check_validation_level class _FileFormat(FormatBase, metaclass=abc.ABCMeta): def validate(self, level='max'): _check_validation_level(level) if not self.path.is_file(): raise ValidationError("%s is not a file." % self.path) if hasattr(self, '_validate_'): try: self._validate_(level) except ValidationError as e: raise ValidationError( "%s is not a(n) %s file:\n\n%s" % (self.path, self.__class__.__name__, str(e)) ) from e # TODO: remove this branch elif hasattr(self, 'sniff'): if not self.sniff(): raise ValidationError("%s is not a(n) %s file" % (self.path, self.__class__.__name__)) # TODO: define an abc.abstractmethod for `validate` when sniff is # removed instead of this else: raise NotImplementedError("%r does not implement validate." 
% type(self)) def view(self, view_type): from_type = transform.ModelType.from_view_type(self.__class__) to_type = transform.ModelType.from_view_type(view_type) transformation = from_type.make_transformation(to_type) return transformation(self) def save(self, path, ext=None): path = str(path) # in case of pathlib.Path path = path.rstrip('.') if ext is not None: ext = '.' + ext.lstrip('.') if not path.endswith(ext): path += ext shutil.copyfile(self.path, path) return path class TextFileFormat(_FileFormat): def open(self): mode = 'r' if self._mode == 'r' else 'r+' # ignore BOM only when reading, do not emit BOM on write encoding = 'utf-8-sig' if mode == 'r' else 'utf-8' return self.path.open(mode=mode, encoding=encoding) class BinaryFileFormat(_FileFormat): def open(self): mode = 'rb' if self._mode == 'r' else 'r+b' return self.path.open(mode=mode) qiime2-2024.5.0/qiime2/plugin/model/tests/000077500000000000000000000000001462552636000200715ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/plugin/model/tests/__init__.py000066400000000000000000000005351462552636000222050ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- qiime2-2024.5.0/qiime2/plugin/model/tests/data/000077500000000000000000000000001462552636000210025ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/plugin/model/tests/data/test_text_files/000077500000000000000000000000001462552636000242075ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/plugin/model/tests/data/test_text_files/test_text1.txt000066400000000000000000000000001462552636000270420ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/plugin/model/tests/data/test_text_files/test_text2.txt000066400000000000000000000000001462552636000270430ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/plugin/model/tests/data/test_text_files_extra/000077500000000000000000000000001462552636000254125ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/plugin/model/tests/data/test_text_files_extra/bad_4th.txt000066400000000000000000000000001462552636000274460ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/plugin/model/tests/data/test_text_files_extra/test_text1.txt000066400000000000000000000000001462552636000302450ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/plugin/model/tests/data/test_text_files_extra/test_text2.txt000066400000000000000000000000001462552636000302460ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/plugin/model/tests/data/test_text_files_extra/test_text3.txt000066400000000000000000000000001462552636000302470ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/plugin/model/tests/test_directory_format.py000066400000000000000000000076071462552636000250700ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2022-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import unittest import pkg_resources from qiime2.plugin import model from qiime2.core.testing.format import IntSequenceFormat from qiime2.core.exceptions import ValidationError # Define dummy plugin formats to test with class AllRequiredDirFmt(model.DirectoryFormat): file1 = model.File(r'test_text1.txt', format=IntSequenceFormat, optional=False) file2 = model.File(r'test_text2.txt', format=IntSequenceFormat, optional=False) file3 = model.File(r'test_text3.txt', format=IntSequenceFormat, optional=False) class AllRequiredDefaultDirFmt(model.DirectoryFormat): file1 = model.File(r'test_text1.txt', format=IntSequenceFormat) file2 = model.File(r'test_text2.txt', format=IntSequenceFormat) file3 = model.File(r'test_text3.txt', format=IntSequenceFormat) class OptionalDirFmt(model.DirectoryFormat): file1 = model.File(r'test_text1.txt', format=IntSequenceFormat, optional=False) file2 = model.File(r'test_text2.txt', format=IntSequenceFormat, optional=False) file3 = model.File(r'test_text3.txt', format=IntSequenceFormat, optional=True) class TestDirectoryFormat(unittest.TestCase): package = 'qiime2.plugin.model.tests' def get_data_path(self, filename): """Convenience method for getting a data asset while testing. Test data stored in the ``data/`` dir local to the running test can be accessed via this method. Parameters ---------- filename : str The name of the file to look up. Returns ------- filepath : str The materialized filepath to the requested test data. """ return pkg_resources.resource_filename(self.package, 'data/%s' % filename) def test_fails_missing_required(self): files_dir_fp = self.get_data_path('test_text_files/') with self.assertRaisesRegex( ValidationError, "Missing one or more files for" " AllRequiredDirFmt"): format_object = AllRequiredDirFmt( files_dir_fp, mode='r', ) format_object.validate() def test_fails_missing_with_optional_default(self): files_dir_fp = self.get_data_path('test_text_files/') with self.assertRaisesRegex(ValidationError, "Missing one or more files for " "AllRequiredDefaultDirFmt"): format_object = AllRequiredDefaultDirFmt( files_dir_fp, mode='r', ) format_object.validate() def test_passes_with_missing_optional(self): files_dir_fp = self.get_data_path('test_text_files/') format_object = OptionalDirFmt( files_dir_fp, mode='r', ) format_object.validate() def test_fails_on_unknown_file(self): files_dir_fp = self.get_data_path('test_text_files_extra/') with self.assertRaisesRegex(ValidationError, ".*Unrecognized file.*"): format_object = AllRequiredDirFmt( files_dir_fp, mode='r', ) format_object.validate() qiime2-2024.5.0/qiime2/plugin/model/tests/test_file_format.py000066400000000000000000000050301462552636000237670ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import os import unittest import tempfile import qiime2.plugin.model as model from qiime2.core.testing.plugin import SingleIntFormat class TestTextFileFormat(unittest.TestCase): PAYLOAD = "Somewhere over the rainbow." 
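    # The tests below check the encoding contract implemented by
    # TextFileFormat.open() in qiime2/plugin/model/file_format.py: files are
    # decoded with 'utf-8-sig' when opened for reading, so a leading
    # byte-order mark is silently dropped, and encoded with plain 'utf-8'
    # when opened for writing, so no BOM is ever emitted. A minimal sketch of
    # that contract (the path is hypothetical):
    #
    #   ff = model.TextFileFormat('/path/with_bom.txt', mode='r')
    #   with ff.open() as fh:
    #       fh.read()                    # payload only, BOM stripped
    #
    #   out = model.TextFileFormat()     # new file, write mode
    #   with out.open() as fh:
    #       fh.write('payload')          # written as utf-8, no BOM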
def setUp(self): self.test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') def tearDown(self): self.test_dir.cleanup() def test_open_read_good(self): path = os.path.join(self.test_dir.name, 'file') with open(path, 'w', encoding='utf-8') as fh: fh.write(self.PAYLOAD) ff = model.TextFileFormat(path, mode='r') with ff.open() as fh: self.assertEqual(self.PAYLOAD, fh.read()) def test_open_read_ignore_bom(self): path = os.path.join(self.test_dir.name, 'file') with open(path, 'w', encoding='utf-8-sig') as fh: fh.write(self.PAYLOAD) ff = model.TextFileFormat(path, mode='r') with ff.open() as fh: self.assertEqual(self.PAYLOAD, fh.read()) def test_open_write_good(self): ff = model.TextFileFormat() with ff.open() as fh: fh.write(self.PAYLOAD) with open(str(ff), mode='r', encoding='utf-8') as fh: self.assertEqual(self.PAYLOAD, fh.read()) def test_open_write_no_bom(self): ff = model.TextFileFormat() with ff.open() as fh: fh.write(self.PAYLOAD) with open(str(ff), mode='rb') as fh: self.assertEqual(b'S', fh.read(1)) class TestFileFormat(unittest.TestCase): def setUp(self): self.test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') path = os.path.join(self.test_dir.name, 'int') with open(path, 'w') as fh: fh.write('1') self.format = SingleIntFormat(path, mode='r') def tearDown(self): self.test_dir.cleanup() def test_view_expected(self): number = self.format.view(int) self.assertEqual(1, number) def test_view_invalid_type(self): with self.assertRaisesRegex( Exception, "No transformation.*SingleIntFormat.*float"): self.format.view(float) if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/plugin/plugin.py000066400000000000000000000400651462552636000175040ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import collections import inspect import types import qiime2.sdk import qiime2.core.type.grammar as grammar from qiime2.core.validate import ValidationObject from qiime2.plugin.model import DirectoryFormat from qiime2.plugin.model.base import FormatBase from qiime2.core.type import is_semantic_type from qiime2.core.util import get_view_name TransformerRecord = collections.namedtuple( 'TransformerRecord', ['transformer', 'plugin', 'citations']) SemanticTypeRecord = collections.namedtuple( 'SemanticTypeRecord', ['semantic_type', 'plugin']) SemanticTypeFragmentRecord = collections.namedtuple( 'SemanticTypeFragmentRecord', ['fragment', 'plugin']) FormatRecord = collections.namedtuple('FormatRecord', ['format', 'plugin']) ViewRecord = collections.namedtuple( 'ViewRecord', ['name', 'view', 'plugin', 'citations']) # semantic_type and type_expression will point to the same value in # ArtifactClassRecords as type_expression is deprecated in favor of # semantic_type ArtifactClassRecord = collections.namedtuple( 'ArtifactClassRecord', ['semantic_type', 'format', 'plugin', 'description', 'examples', 'type_expression']) ValidatorRecord = collections.namedtuple( 'ValidatorRecord', ['validator', 'view', 'plugin', 'context']) class Plugin: def __init__(self, name, version, website, package=None, project_name=None, citation_text=None, user_support_text=None, short_description=None, description=None, citations=None): self.id = name.replace('-', '_') self.name = name self.version = version self.website = website # Filled in by the PluginManager if not provided. self.package = package self.project_name = project_name if user_support_text is None: self.user_support_text = ('Please post to the QIIME 2 forum for ' 'help with this plugin: https://forum.' 'qiime2.org') else: self.user_support_text = user_support_text if short_description is None: self.short_description = '' else: self.short_description = short_description if description is None: self.description = ('No description available. ' 'See plugin website: %s' % self.website) else: self.description = description if citations is None: self.citations = () else: self.citations = tuple(citations) self.methods = PluginMethods(self) self.visualizers = PluginVisualizers(self) self.pipelines = PluginPipelines(self) self.formats = {} self.views = {} self.type_fragments = {} self.transformers = {} self.artifact_classes = {} self.validators = {} def freeze(self): pass @property def actions(self): # TODO this doesn't handle method/visualizer name collisions. The # auto-generated `qiime2.plugins..actions` API has the # same problem. This should be solved at method/visualizer registration # time, which will solve the problem for both APIs. actions = {} actions.update(self.methods) actions.update(self.visualizers) actions.update(self.pipelines) return types.MappingProxyType(actions) @property def types(self): return self.artifact_classes @property def type_formats(self): # self.type_formats was replaced with self.artifact_classes - this # property provides backward compatibility return list(self.artifact_classes.values()) def register_formats(self, *formats, citations=None): for format in formats: if not issubclass(format, FormatBase): raise TypeError("%r is not a valid format." 
% format) self.register_views(*formats, citations=citations) def register_views(self, *views, citations=None): if citations is None: citations = () else: citations = tuple(citations) for view in views: if not isinstance(view, type): raise TypeError("%r should be a class." % view) is_format = False if issubclass(view, FormatBase): is_format = True name = get_view_name(view) if name in self.views: raise NameError("View %r is already registered by this " "plugin." % name) self.views[name] = ViewRecord( name=name, view=view, plugin=self, citations=citations) if is_format: self.formats[name] = FormatRecord(format=view, plugin=self) def register_validator(self, semantic_expression): if not is_semantic_type(semantic_expression): raise TypeError('%s is not a Semantic Type' % semantic_expression) def decorator(validator): validator_signature = inspect.getfullargspec(validator) if 'data' not in validator_signature.annotations: raise TypeError('No expected view type provided as annotation' ' for `data` variable in %r.' % (validator.__name__)) if not ['data', 'level'] == validator_signature.args: raise TypeError('The function signature: %r does not contain' ' the required arguments and only the required' ' arguments: %r' % ( validator_signature.args, ['data', 'level'])) for semantic_type in semantic_expression: if semantic_type not in self.validators: self.validators[semantic_type] = \ ValidationObject(semantic_type) self.validators[semantic_type].add_validator( ValidatorRecord( validator=validator, view=validator.__annotations__['data'], plugin=self, context=semantic_expression)) return validator return decorator def register_transformer(self, _fn=None, *, citations=None): """ A transformer has the type Callable[[type], type] """ # `_fn` allows us to figure out if we are called with or without # arguments in order to support both: # ``` # @plugin.register_transformer # def _(x: A) -> B: # ... # ``` # and # ``` # @plugin.register_transformer(restrict=True) # def _(x: A) -> B: # ... # ``` if citations is None: citations = () else: citations = tuple(citations) def decorator(transformer): annotations = transformer.__annotations__.copy() if len(annotations) != 2: raise TypeError("A transformer must only have a single input" " and output annotation.") try: output = annotations.pop('return') except KeyError: raise TypeError("A transformer must provide a return type.") if type(output) is tuple: raise TypeError("A transformer can only return a single type," " not %r." % (output,)) input = list(annotations.values())[0] if (input, output) in self.transformers: raise TypeError("Duplicate transformer (%r) from %r to %r." % (transformer, input, output)) if input == output: raise TypeError("Plugins should not register identity" " transformations (%r, %r to %r)." % (transformer, input, output)) self.transformers[input, output] = TransformerRecord( transformer=transformer, plugin=self, citations=citations) return transformer if _fn is None: return decorator else: # Apply the decorator as we were applied with a single function return decorator(_fn) def register_semantic_types(self, *type_fragments): for type_fragment in type_fragments: if not is_semantic_type(type_fragment): raise TypeError("%r is not a semantic type." % type_fragment) if not (isinstance(type_fragment, grammar.IncompleteExp) or (type_fragment.is_concrete() and not type_fragment.fields)): raise ValueError("%r is not a semantic type symbol." 
% type_fragment) if type_fragment.name in self.type_fragments: raise ValueError("Duplicate semantic type symbol %r." % type_fragment) self.type_fragments[type_fragment.name] = \ SemanticTypeFragmentRecord( fragment=type_fragment, plugin=self) def _register_artifact_class(self, semantic_type, directory_format, description, examples): if not issubclass(directory_format, DirectoryFormat): raise TypeError("%r is not a directory format." % directory_format) if not is_semantic_type(semantic_type): raise TypeError("%r is not a semantic type." % semantic_type) for t in semantic_type: if t.predicate is not None: raise ValueError("%r has a predicate, differentiating format" " on predicate is not supported.") if description is None: description = "" if examples is None: examples = {} # register_semantic_type_to_format can accept type expressions such as # Kennel[Dog | Cat]. By iterating, we will register the concrete types # (e.g., Kennel[Dog] and Kennel[Cat], not the type expression) for e in list(semantic_type): semantic_type_str = str(e) if semantic_type_str in self.artifact_classes: raise NameError("Artifact class %s was registered more than " "once. Artifact classes can only be " "registered once." % semantic_type_str) self.artifact_classes[semantic_type_str] =\ ArtifactClassRecord( semantic_type=e, format=directory_format, plugin=self, description=description, examples=types.MappingProxyType(examples), type_expression=e) def register_semantic_type_to_format(self, semantic_type, artifact_format=None, directory_format=None): # Handle the deprecated parameter name, artifact_format. This is being # replaced with directory_format for clarity. if artifact_format is not None and directory_format is not None: raise ValueError('directory_format and artifact_format were both' 'provided when registering artifact class %s.' 'Please provide directory_format only as ' 'artifact_format is deprecated.' % str(semantic_type)) elif artifact_format is None and directory_format is None: raise ValueError('directory_format or artifact_format must be ' 'provided when registering artifact class %s.' 'Please provide directory_format only as ' 'artifact_format is deprecated.' % str(semantic_type)) else: directory_format = directory_format or artifact_format self._register_artifact_class(semantic_type=semantic_type, directory_format=directory_format, description=None, examples=None) def register_artifact_class(self, semantic_type, directory_format, description=None, examples=None): if not semantic_type.is_concrete(): raise TypeError("Only a single type can be registered at a time " "with register_artifact_class. Registration " "attempted for %s." 
% str(semantic_type)) self._register_artifact_class( semantic_type, directory_format, description, examples) class PluginActions(dict): def __init__(self, plugin): self._plugin_id = plugin.id super().__init__() class PluginMethods(PluginActions): def register_function(self, function, inputs, parameters, outputs, name, description, input_descriptions=None, parameter_descriptions=None, output_descriptions=None, citations=None, deprecated=False, examples=None): if citations is None: citations = () else: citations = tuple(citations) if examples is None: examples = {} method = qiime2.sdk.Method._init(function, inputs, parameters, outputs, self._plugin_id, name, description, input_descriptions, parameter_descriptions, output_descriptions, citations, deprecated, examples) self[method.id] = method class PluginVisualizers(PluginActions): def register_function(self, function, inputs, parameters, name, description, input_descriptions=None, parameter_descriptions=None, citations=None, deprecated=False, examples=None): if citations is None: citations = () else: citations = tuple(citations) if examples is None: examples = {} visualizer = qiime2.sdk.Visualizer._init(function, inputs, parameters, self._plugin_id, name, description, input_descriptions, parameter_descriptions, citations, deprecated, examples) self[visualizer.id] = visualizer class PluginPipelines(PluginActions): def register_function(self, function, inputs, parameters, outputs, name, description, input_descriptions=None, parameter_descriptions=None, output_descriptions=None, citations=None, deprecated=False, examples=None): if citations is None: citations = () else: citations = tuple(citations) if examples is None: examples = {} pipeline = qiime2.sdk.Pipeline._init(function, inputs, parameters, outputs, self._plugin_id, name, description, input_descriptions, parameter_descriptions, output_descriptions, citations, deprecated, examples) self[pipeline.id] = pipeline qiime2-2024.5.0/qiime2/plugin/testing.py000066400000000000000000000222561462552636000176650ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import pkg_resources import tempfile import unittest import shutil import pathlib import itertools import qiime2 from qiime2.sdk import usage from qiime2.plugin.util import transform from qiime2.plugin.model.base import FormatBase # TODO Split out into more specific subclasses if necessary. class TestPluginBase(unittest.TestCase): """Test harness for simplifying testing QIIME 2 plugins. ``TestPluginBase`` extends ``unittest.TestCase``, with a few extra helpers and assertions. Attributes ---------- package : str The name of the plugin package to be tested. test_dir_prefix : str The prefix for the temporary testing dir created by the harness. """ package = None test_dir_prefix = 'qiime2-plugin' def setUp(self): """Test runner setup hook. If overriding this hook in a test, call ``__super__`` to invoke this method in the overridden hook, otherwise the harness might not work as expected. 
""" try: package = self.package.split('.')[0] except AttributeError: self.fail('Test class must have a package property.') # plugins are keyed by their names, so a search inside the plugin # object is required to match to the correct plugin plugin = None for name, plugin_ in qiime2.sdk.PluginManager().plugins.items(): if plugin_.package == package: plugin = plugin_ if plugin is not None: self.plugin = plugin else: self.fail('%s is not a registered QIIME 2 plugin.' % package) # TODO use qiime2 temp dir when ported to framework, and when the # configurable temp dir exists self.temp_dir = tempfile.TemporaryDirectory( prefix='%s-test-temp-' % self.test_dir_prefix) def tearDown(self): """Test runner teardown hook. If overriding this hook in a test, call ``__super__`` to invoke this method in the overridden hook, otherwise the harness might not work as expected. """ self.temp_dir.cleanup() def get_data_path(self, filename): """Convenience method for getting a data asset while testing. Test data stored in the ``data/`` dir local to the running test can be accessed via this method. Parameters ---------- filename : str The name of the file to look up. Returns ------- filepath : str The materialized filepath to the requested test data. """ return pkg_resources.resource_filename(self.package, 'data/%s' % filename) def get_transformer(self, from_type, to_type): """Convenience method for getting a registered transformer. This helper deliberately side-steps the framework's validation machinery, so that it is possible for plugin developers to test failing conditions. Parameters ---------- from_type : A View Type The :term:`View` type of the source data. to_type : A View Type The :term:`View` type to transform to. Returns ------- transformer : A Transformer Function The registered tranformer from ``from_type`` to ``to_type``. """ try: transformer_record = self.plugin.transformers[from_type, to_type] except KeyError: self.fail( "Could not find registered transformer from %r to %r." % (from_type, to_type)) return transformer_record.transformer def assertRegisteredSemanticType(self, semantic_type): """Test assertion for ensuring a plugin's semantic type is registered. Fails if the semantic type requested is not found in the Plugin Manager. Parameters ---------- semantic_type : A Semantic Type The :term:`Semantic Type` to test the presence of. """ try: record = self.plugin.type_fragments[semantic_type.name] except KeyError: self.fail( "Semantic type %r is not registered on the plugin." % semantic_type) self.assertEqual(record.fragment, semantic_type) def assertSemanticTypeRegisteredToFormat(self, semantic_type, exp_format): """Test assertion for ensuring a semantic type is registered to a format. Fails if the semantic type requested is not registered to the format specified with ``exp_format``. Also fails if the semantic type isn't registered to **any** format. Parameters ---------- semantic_type : A Semantic Type The :term:`Semantic Type` to check for. exp_format : A Format The :term:`Format` to check that the Semantic Type is registed on. """ # For backward compatibility, support type expressions as input here for t in semantic_type: obs_format = None try: obs_format = self.plugin.artifact_classes[str(t)].format except KeyError: self.assertIsNotNone( obs_format, "Semantic type %r is not registered to a format." % t) self.assertEqual( obs_format, exp_format, "Expected semantic type %r to be registered to format %r, " "not %r." 
% (t, exp_format, obs_format)) def transform_format(self, source_format, target, filename=None, filenames=None): """Helper utility for loading data and transforming it. Combines several other utilities in this class, will load files from ``data/``, as ``source_format``, then transform to the ``target`` view. This helper deliberately side-steps the framework's validation machinery, so that it is possible for plugin developers to test failing conditions. Parameters ---------- source_format : A Format The :term:`Format` to load the data as. target : A View Type The :term:`View Type ` to transform the data to. filename : str The name of the file to load from ``data``. Use this for formats that use a single file in their format definition. Mutually exclusive with the ``filenames`` parameter. filenames : list[str] The names of the files to load from ``data``. Use this for formats that use multiple files in their format definition. Mutually exclusive with the ``filename`` parameter. Returns ------- input : A Format The data loaded from ``data`` as the specified ``source_format``. obs : A View Type The loaded data, transformed to the specified ``target`` view type. """ # Guard any non-QIIME2 Format sources from being tested if not issubclass(source_format, FormatBase): raise ValueError("`source_format` must be a subclass of " "FormatBase.") # Guard against invalid filename(s) usage if filename is not None and filenames is not None: raise ValueError("Cannot use both `filename` and `filenames` at " "the same time.") # Handle format initialization source_path = None if filename: source_path = self.get_data_path(filename) elif filenames: source_path = self.temp_dir.name for filename in filenames: filepath = self.get_data_path(filename) shutil.copy(filepath, source_path) input = source_format(source_path, mode='r') obs = transform(input, from_type=source_format, to_type=target) if issubclass(target, FormatBase): self.assertIsInstance(obs, (type(pathlib.Path()), str, target)) else: self.assertIsInstance(obs, target) return input, obs def execute_examples(self): if self.plugin is None: raise ValueError('Attempted to run `execute_examples` without ' 'configuring test harness.') for _, action in itertools.chain(self.plugin.actions.items(), self.plugin.types.items()): for name, example_f in action.examples.items(): with self.subTest(example=name): use = usage.ExecutionUsage() example_f(use) def assert_no_nans_in_tables(fh): ''' Checks for NaNs present in any of the tables in the indicated file then resets to the head of the file. ''' from pandas import read_html tables = read_html(fh) for df in tables: assert not df.isnull().values.any() fh.seek(0) qiime2-2024.5.0/qiime2/plugin/tests/000077500000000000000000000000001462552636000167715ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/plugin/tests/__init__.py000066400000000000000000000005351462552636000211050ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- qiime2-2024.5.0/qiime2/plugin/tests/data/000077500000000000000000000000001462552636000177025ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/plugin/tests/data/has_nan.html000066400000000000000000000002161462552636000221760ustar00rootroot00000000000000
NaN Not NaN
NaN 42
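The registration APIs defined in qiime2/plugin/plugin.py above are easiest to see end-to-end in a small example. The following is a minimal, hypothetical sketch (not part of this archive): the plugin name, website, and package are placeholders, and it assumes the dummy testing type and format imported by the test modules below are available.

import qiime2.plugin
from qiime2.core.testing.type import IntSequence1
from qiime2.core.testing.format import IntSequenceDirectoryFormat

# Hypothetical plugin definition; name/website/package are placeholders only.
plugin = qiime2.plugin.Plugin(
    name='example-plugin',
    version='0.0.0-dev',
    website='https://example.org',
    package='example_plugin',
)

# Make the directory format and the semantic type known to the plugin.
plugin.register_formats(IntSequenceDirectoryFormat)
plugin.register_semantic_types(IntSequence1)

# Bind the semantic type to its on-disk representation.
plugin.register_artifact_class(
    IntSequence1,
    IntSequenceDirectoryFormat,
    description='A sequence of integers.',
)

A transformer could then be attached with the @plugin.register_transformer decorator, whose calling conventions are shown in the comments of register_transformer above.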
qiime2-2024.5.0/qiime2/plugin/tests/data/no_nan.html
Not NaN Also Not NaN
42 43
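Before the framework's own test modules that follow, here is a minimal, hypothetical sketch (not part of this archive) of how a plugin's test suite might use the TestPluginBase helpers and the assert_no_nans_in_tables utility defined in qiime2/plugin/testing.py above. The package name and data filename are placeholders, and the class assumes the corresponding plugin is registered with the PluginManager.

import unittest

from qiime2.plugin.testing import TestPluginBase, assert_no_nans_in_tables


class ExampleVisualizerTests(TestPluginBase):
    # Placeholder: the importable test package of the plugin under test,
    # which must ship a data/ directory alongside its test modules.
    package = 'example_plugin.tests'

    def test_summary_tables_have_no_nans(self):
        # get_data_path() resolves files under the package's data/ directory.
        with open(self.get_data_path('summary.html')) as fh:
            assert_no_nans_in_tables(fh)

    def test_examples(self):
        # Runs every registered usage example for the plugin's actions.
        self.execute_examples()


if __name__ == '__main__':
    unittest.main()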
qiime2-2024.5.0/qiime2/plugin/tests/test_plugin.py000066400000000000000000000464351462552636000217140ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import types import unittest import pkg_resources import qiime2.plugin import qiime2.sdk from qiime2.core.testing.type import (IntSequence1, IntSequence2, Mapping, FourInts, Kennel, Dog, Cat, SingleInt) from qiime2.core.testing.format import (IntSequenceDirectoryFormat, IntSequenceV2DirectoryFormat) from qiime2.core.testing.util import get_dummy_plugin from qiime2.core.testing.plugin import is1_use, is2_use from qiime2.plugin.testing import assert_no_nans_in_tables class TestPlugin(unittest.TestCase): def setUp(self): self.plugin = get_dummy_plugin() def get_data_path(self, filename): return pkg_resources.resource_filename('qiime2.plugin.tests', 'data/%s' % filename) def test_name(self): self.assertEqual(self.plugin.name, 'dummy-plugin') def test_version(self): self.assertEqual(self.plugin.version, '0.0.0-dev') def test_website(self): self.assertEqual(self.plugin.website, 'https://github.com/qiime2/qiime2') def test_package(self): self.assertEqual(self.plugin.package, 'qiime2.core.testing') def test_citations(self): self.assertEqual(self.plugin.citations[0].type, 'article') def test_user_support_text(self): self.assertEqual(self.plugin.user_support_text, 'For help, see https://qiime2.org') def test_short_description_text(self): self.assertEqual(self.plugin.short_description, 'Dummy plugin for testing.') def test_description_text(self): self.assertEqual(self.plugin.description, 'Description of dummy plugin.') def test_citations_default(self): plugin = qiime2.plugin.Plugin( name='local-dummy-plugin', version='0.0.0-dev', website='https://github.com/qiime2/qiime2', package='qiime2.core.testing') self.assertEqual(plugin.citations, ()) def test_user_support_text_default(self): plugin = qiime2.plugin.Plugin( name='local-dummy-plugin', version='0.0.0-dev', website='https://github.com/qiime2/qiime2', package='qiime2.core.testing') self.assertTrue(plugin.user_support_text.startswith('Please post')) self.assertTrue(plugin.user_support_text.endswith( 'https://forum.qiime2.org')) def test_actions(self): actions = self.plugin.actions self.assertIsInstance(actions, types.MappingProxyType) self.assertEqual(actions.keys(), {'merge_mappings', 'concatenate_ints', 'split_ints', 'most_common_viz', 'mapping_viz', 'identity_with_metadata', 'identity_with_metadata_column', 'identity_with_categorical_metadata_column', 'identity_with_numeric_metadata_column', 'identity_with_optional_metadata', 'identity_with_optional_metadata_column', 'params_only_method', 'no_input_method', 'optional_artifacts_method', 'variadic_input_method', 'params_only_viz', 'no_input_viz', 'long_description_method', 'parameter_only_pipeline', 'typical_pipeline', 'optional_artifact_pipeline', 'pointless_pipeline', 'visualizer_only_pipeline', 'pipelines_in_pipeline', 'resumable_pipeline', 'resumable_varied_pipeline', 'resumable_nested_varied_pipeline', 'internal_fail_pipeline', 'de_facto_list_pipeline', 'de_facto_dict_pipeline', 'de_facto_collection_pipeline', 'list_pipeline', 'collection_pipeline', 'failing_pipeline', 'docstring_order_method', 
'constrained_input_visualization', 'combinatorically_mapped_method', 'double_bound_variable_method', 'bool_flag_swaps_output_method', 'predicates_preserved_method', 'deprecated_method', 'union_inputs', 'unioned_primitives', 'type_match_list_and_set', 'list_of_ints', 'dict_of_ints', 'returns_int', 'collection_inner_union', 'collection_outer_union', 'dict_params', 'list_params', 'varied_method', '_underscore_method', 'return_four_ints', 'return_many_ints' }) for action in actions.values(): self.assertIsInstance(action, qiime2.sdk.Action) # Read-only dict. with self.assertRaises(TypeError): actions["i-shouldn't-do-this"] = "my-action" with self.assertRaises(TypeError): actions["merge_mappings"] = "my-action" def test_methods(self): methods = self.plugin.methods self.assertEqual(methods.keys(), {'merge_mappings', 'concatenate_ints', 'split_ints', 'identity_with_metadata', 'identity_with_metadata_column', 'identity_with_categorical_metadata_column', 'identity_with_numeric_metadata_column', 'identity_with_optional_metadata', 'identity_with_optional_metadata_column', 'params_only_method', 'no_input_method', 'optional_artifacts_method', 'long_description_method', 'variadic_input_method', 'docstring_order_method', 'combinatorically_mapped_method', 'double_bound_variable_method', 'bool_flag_swaps_output_method', 'predicates_preserved_method', 'deprecated_method', 'union_inputs', 'unioned_primitives', 'type_match_list_and_set', 'list_of_ints', 'dict_of_ints', 'returns_int', 'collection_inner_union', 'collection_outer_union', 'dict_params', 'list_params', 'varied_method', '_underscore_method', 'return_four_ints', 'return_many_ints' }) for method in methods.values(): self.assertIsInstance(method, qiime2.sdk.Method) def test_visualizers(self): visualizers = self.plugin.visualizers self.assertEqual(visualizers.keys(), {'most_common_viz', 'mapping_viz', 'params_only_viz', 'no_input_viz', 'constrained_input_visualization'}) for viz in visualizers.values(): self.assertIsInstance(viz, qiime2.sdk.Visualizer) def test_pipelines(self): pipelines = self.plugin.pipelines self.assertEqual(pipelines.keys(), {'parameter_only_pipeline', 'typical_pipeline', 'optional_artifact_pipeline', 'pointless_pipeline', 'visualizer_only_pipeline', 'pipelines_in_pipeline', 'resumable_pipeline', 'resumable_varied_pipeline', 'resumable_nested_varied_pipeline', 'internal_fail_pipeline', 'de_facto_list_pipeline', 'de_facto_dict_pipeline', 'de_facto_collection_pipeline', 'list_pipeline', 'collection_pipeline', 'failing_pipeline'}) for pipeline in pipelines.values(): self.assertIsInstance(pipeline, qiime2.sdk.Pipeline) # TODO test registration of directory formats. def test_deprecated_type_formats(self): # Plugin.type_formats was replaced with Plugin.artifact_classes. 
For # backward compatibility the Plugin.type_formats property returns # the plugin's artifact_classes self.assertEqual(self.plugin.type_formats, list(self.plugin.artifact_classes.values())) def test_type_fragments(self): types = self.plugin.type_fragments.keys() self.assertEqual( set(types), set(['IntSequence1', 'IntSequence2', 'IntSequence3', 'Mapping', 'FourInts', 'Kennel', 'Dog', 'Cat', 'SingleInt', 'C1', 'C2', 'C3', 'Foo', 'Bar', 'Baz', 'AscIntSequence', 'Squid', 'Octopus', 'Cuttlefish'])) def test_types(self): types = self.plugin.types # Get just the ArtifactClassRecords out of the types dictionary, then # get just the types out of the ArtifactClassRecords namedtuples types = {type_.semantic_type for type_ in types.values()} exp = {IntSequence1, IntSequence2, FourInts, Mapping, Kennel[Dog], Kennel[Cat], SingleInt} self.assertLessEqual(exp, types) self.assertNotIn(Cat, types) self.assertNotIn(Dog, types) self.assertNotIn(Kennel, types) def test_register_semantic_type_to_format_deprecated_parameter_name(self): plugin = qiime2.plugin.Plugin( name='local-dummy-plugin', version='0.0.0-dev', website='https://github.com/qiime2/qiime2', package='qiime2.core.testing') # both the new (directory_format) and old (artifact_format) names for # the format work plugin.register_semantic_type_to_format( IntSequence1, directory_format=IntSequenceDirectoryFormat) plugin.register_semantic_type_to_format( IntSequence2, artifact_format=IntSequenceV2DirectoryFormat) ac = plugin.artifact_classes['IntSequence1'] self.assertEqual(ac.semantic_type, IntSequence1) self.assertEqual(ac.format, IntSequenceDirectoryFormat) self.assertEqual(ac.plugin, plugin) self.assertEqual(ac.description, "") self.assertEqual(ac.examples, types.MappingProxyType({})) ac = plugin.artifact_classes['IntSequence2'] self.assertEqual(ac.semantic_type, IntSequence2) self.assertEqual(ac.format, IntSequenceV2DirectoryFormat) self.assertEqual(ac.plugin, plugin) self.assertEqual(ac.description, "") self.assertEqual(ac.examples, types.MappingProxyType({})) # errors are raised when both or neither the new or old names for the # format are provided plugin = qiime2.plugin.Plugin( name='local-dummy-plugin', version='0.0.0-dev', website='https://github.com/qiime2/qiime2', package='qiime2.core.testing') regex = r'ory_format and artifact_for.*IntSequence1' with self.assertRaisesRegex(ValueError, regex): plugin.register_semantic_type_to_format( IntSequence1, directory_format=IntSequenceDirectoryFormat, artifact_format=IntSequenceDirectoryFormat) regex = r'ory_format or artifact_for.*IntSequence1' with self.assertRaisesRegex(ValueError, regex): plugin.register_semantic_type_to_format(IntSequence1) def test_register_artifact_class(self): plugin = qiime2.plugin.Plugin( name='local-dummy-plugin', version='0.0.0-dev', website='https://github.com/qiime2/qiime2', package='qiime2.core.testing') plugin.register_artifact_class(IntSequence1, IntSequenceDirectoryFormat) # the original approach for registering artifact_class still works plugin.register_semantic_type_to_format(IntSequence2, IntSequenceV2DirectoryFormat) plugin.register_artifact_class(Kennel[Dog], IntSequenceDirectoryFormat) plugin.register_artifact_class(Kennel[Cat], IntSequenceV2DirectoryFormat) # all and only the expected artifact classes have been registered self.assertEqual(len(plugin.artifact_classes), 4) ac = plugin.artifact_classes['IntSequence1'] self.assertEqual(ac.semantic_type, IntSequence1) self.assertEqual(ac.type_expression, IntSequence1) self.assertEqual(ac.format, 
IntSequenceDirectoryFormat) self.assertEqual(ac.plugin, plugin) self.assertEqual(ac.description, "") self.assertEqual(ac.examples, types.MappingProxyType({})) ac = plugin.artifact_classes['IntSequence2'] self.assertEqual(ac.semantic_type, IntSequence2) self.assertEqual(ac.type_expression, IntSequence2) self.assertEqual(ac.format, IntSequenceV2DirectoryFormat) self.assertEqual(ac.plugin, plugin) self.assertEqual(ac.description, "") self.assertEqual(ac.examples, types.MappingProxyType({})) ac = plugin.artifact_classes['Kennel[Dog]'] self.assertEqual(ac.semantic_type, Kennel[Dog]) self.assertEqual(ac.type_expression, Kennel[Dog]) self.assertEqual(ac.format, IntSequenceDirectoryFormat) self.assertEqual(ac.plugin, plugin) self.assertEqual(ac.description, "") self.assertEqual(ac.examples, types.MappingProxyType({})) ac = plugin.artifact_classes['Kennel[Cat]'] self.assertEqual(ac.semantic_type, Kennel[Cat]) self.assertEqual(ac.type_expression, Kennel[Cat]) self.assertEqual(ac.format, IntSequenceV2DirectoryFormat) self.assertEqual(ac.plugin, plugin) self.assertEqual(ac.description, "") self.assertEqual(ac.examples, types.MappingProxyType({})) self.assertFalse(plugin.artifact_classes['IntSequence1'] is plugin.artifact_classes['IntSequence2']) def test_duplicate_artifact_class_registration_disallowed(self): plugin = qiime2.plugin.Plugin( name='local-dummy-plugin', version='0.0.0-dev', website='https://github.com/qiime2/qiime2', package='qiime2.core.testing') plugin.register_artifact_class(IntSequence1, IntSequenceDirectoryFormat) # Registration of type to the same format with both registration # methods is disallowed with self.assertRaisesRegex(NameError, "ct class IntSequence1.*once"): plugin.register_semantic_type_to_format( IntSequence1, IntSequenceDirectoryFormat) with self.assertRaisesRegex(NameError, "ct class IntSequence1.*once"): plugin.register_artifact_class( IntSequence1, IntSequenceDirectoryFormat) # Registration of type to the different format with both registration # methods is disallowed with self.assertRaisesRegex(NameError, "ct class IntSequence1.*once"): plugin.register_semantic_type_to_format( IntSequence1, IntSequenceV2DirectoryFormat) with self.assertRaisesRegex(NameError, "ct class IntSequence1.*once"): plugin.register_artifact_class( IntSequence1, IntSequenceV2DirectoryFormat) def test_register_artifact_class_w_annotations(self): plugin = qiime2.plugin.Plugin( name='local-dummy-plugin', version='0.0.0-dev', website='https://github.com/qiime2/qiime2', package='qiime2.core.testing') plugin.register_artifact_class( IntSequence1, IntSequenceDirectoryFormat, description="A sequence of integers.", examples=types.MappingProxyType({'Import ex 1': is1_use})) plugin.register_artifact_class( IntSequence2, IntSequenceV2DirectoryFormat, description="Different seq of ints.", examples=types.MappingProxyType({'Import ex': is2_use})) ac = plugin.artifact_classes['IntSequence1'] self.assertEqual(ac.semantic_type, IntSequence1) self.assertEqual(ac.format, IntSequenceDirectoryFormat) self.assertEqual(ac.plugin, plugin) self.assertEqual(ac.description, "A sequence of integers.") self.assertEqual(ac.examples, types.MappingProxyType({'Import ex 1': is1_use})) ac = plugin.artifact_classes['IntSequence2'] self.assertEqual(ac.semantic_type, IntSequence2) self.assertEqual(ac.format, IntSequenceV2DirectoryFormat) self.assertEqual(ac.plugin, plugin) self.assertEqual(ac.description, "Different seq of ints.") self.assertEqual(ac.examples, types.MappingProxyType({'Import ex': is2_use})) def 
test_register_artifact_class_multiple(self): plugin = qiime2.plugin.Plugin( name='local-dummy-plugin', version='0.0.0-dev', website='https://github.com/qiime2/qiime2', package='qiime2.core.testing') # multiple artifact_classes can be registered using the original # approach, since default descriptions and examples are used plugin.register_semantic_type_to_format(Kennel[Dog | Cat], IntSequenceDirectoryFormat) ac_c = plugin.artifact_classes['Kennel[Cat]'] self.assertEqual(ac_c.semantic_type, Kennel[Cat]) self.assertEqual(ac_c.format, IntSequenceDirectoryFormat) self.assertEqual(ac_c.plugin, plugin) self.assertEqual(ac_c.description, "") self.assertEqual(ac_c.examples, types.MappingProxyType({})) ac_d = plugin.artifact_classes['Kennel[Dog]'] self.assertEqual(ac_d.semantic_type, Kennel[Dog]) self.assertEqual(ac_d.format, IntSequenceDirectoryFormat) self.assertEqual(ac_d.plugin, plugin) self.assertEqual(ac_d.description, "") self.assertEqual(ac_d.examples, types.MappingProxyType({})) # multiple artifact_classes cannot be registered using # register_artifact_class, since default descriptions and examples # should be different from one another with self.assertRaisesRegex(TypeError, r'Only a single.*Kennel\[Dog \| Cat\]'): plugin.register_artifact_class( Kennel[Dog | Cat], IntSequenceDirectoryFormat) def test_table_does_not_have_nans(self): noNaN = self.get_data_path('no_nan.html') with open(noNaN) as fh: assert_no_nans_in_tables(fh) def test_table_has_nans(self): hasNaN = self.get_data_path('has_nan.html') with open(hasNaN) as fh: with self.assertRaises(AssertionError): assert_no_nans_in_tables(fh) if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/plugin/tests/test_tests.py000066400000000000000000000023641462552636000215510ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import unittest import tempfile from qiime2.core.testing.format import SingleIntFormat from qiime2.core.testing.util import get_dummy_plugin from qiime2.plugin.testing import TestPluginBase class TestTesting(TestPluginBase): package = 'qiime2.sdk.tests' def setUp(self): self.plugin = get_dummy_plugin() # TODO standardize temporary directories created by QIIME 2 # create a temporary data_dir for sample Visualizations self.test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') def tearDown(self): self.test_dir.cleanup() def test_transformer_in_other_plugin(self): _, obs = self.transform_format(SingleIntFormat, str, filename='singleint.txt') self.assertEqual('42', obs) def test_examples(self): self.execute_examples() if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/plugin/util.py000066400000000000000000000025721462552636000171640ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import psutil from qiime2.core.transform import ModelType def transform(data, *, from_type=None, to_type): from_type = type(data) if from_type is None else from_type from_model_type = ModelType.from_view_type(from_type) to_model_type = ModelType.from_view_type(to_type) transformation = from_model_type.make_transformation(to_model_type) return transformation(data) def get_available_cores(n_less: int = 0): ''' Finds the number of currently available (logical) cores. Useful for plugins that need to convert a 0 to a concrete number of cores when 0 is not supported by the underlying/called software. Parameters ---------- n_less : int The number of cores less than the total number available to request. For example `get_available_cores(n_less=2) with 10 available cores will return 8. Returns ------- int The number of cores to be requested. ''' cpus = psutil.cpu_count() if cpus is not None: return cpus - n_less return 1 qiime2-2024.5.0/qiime2/plugins.py000066400000000000000000000423121462552636000163660ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- from dataclasses import dataclass import re import sys import importlib.machinery from qiime2.sdk import usage __all__ = ['available_plugins', 'ArtifactAPIUsage'] __path__ = [] def available_plugins(): import qiime2.sdk pm = qiime2.sdk.PluginManager() return set('qiime2.plugins.' + s.replace('-', '_') for s in pm.plugins) class ArtifactAPIUsageVariable(usage.UsageVariable): """A specialized implementation for :class:`ArtifactAPIUsage`.""" # this lets us repr all inputs (including parameters) and have # them template out in a consistent manner. without this we would wind # up with `foo('my_artifact')` rather than `foo(my_artifact)`. 
class repr_raw_variable_name: def __init__(self, value): self.value = value def __repr__(self): return self.value def to_interface_name(self): if self.var_type == 'format': return self.name parts = { 'artifact': [self.name], 'artifact_collection': [self.name, 'artifact_collection'], 'visualization': [self.name, 'viz'], 'visualization_collection': [self.name, 'viz_collection'], 'metadata': [self.name, 'md'], 'column': [self.name, 'mdc'], # No format here - it shouldn't be possible to make it this far }[self.var_type] var_name = '_'.join(parts) # ensure var_name is a valid python identifier var_name = re.sub(r'\W|^(?=\d)', '_', var_name) return self.repr_raw_variable_name(var_name) def assert_has_line_matching(self, path, expression): if not self.use.enable_assertions: return self.use._update_imports(import_='re') name = self.to_interface_name() expr = expression lines = [ 'hits = sorted(%r._archiver.data_dir.glob(%r))' % (name, path), 'if len(hits) != 1:', self.use.INDENT + 'raise ValueError', 'target = hits[0].read_text()', 'match = re.search(%r, target, flags=re.MULTILINE)' % (expr,), 'if match is None:', self.use.INDENT + 'raise AssertionError', ] self.use._add(lines) def assert_output_type(self, semantic_type, key=None): if not self.use.enable_assertions: return name = self.to_interface_name() if key: name = "%s[%s]" % (name, key) lines = [ 'if str(%r.type) != %r:' % (name, str(semantic_type)), self.use.INDENT + 'raise AssertionError', ] self.use._add(lines) class ArtifactAPIUsage(usage.Usage): INDENT = ' ' * 4 @dataclass(frozen=True) class ImporterRecord: import_: str from_: str = None as_: str = None def render(self): tmpl = 'import %s' % (self.import_,) if self.from_ is not None: tmpl = 'from %s %s' % (self.from_, tmpl) if self.as_ is not None: tmpl = '%s as %s' % (tmpl, self.as_) return tmpl def __init__(self, enable_assertions: bool = False, action_collection_size: int = 3): """Constructor for ArtifactAPIUsage Warning ------- For SDK use only. Do not use in a written usage example. Parameters ---------- enable_assertions : bool Whether :class:`qiime2.sdk.usage.UsageVariable` assertions should be rendered. Note that these are not executed, rather code that would assert is templated by :meth:`render`. action_collection_size : int The maximum number of outputs to automatically desctructure before creating seperate lines with output property access. e.g. ``x, y, z = foo()`` vs ``results = foo()`` with ``results.x`` etc. """ super().__init__() self.enable_assertions = enable_assertions self.action_collection_size = action_collection_size self._reset_state(reset_global_imports=True) def _reset_state(self, reset_global_imports=False): self.local_imports = set() self.recorder = [] self.init_data_refs = dict() if reset_global_imports: self.global_imports = set() def _add(self, lines): self.recorder.extend(lines) def usage_variable(self, name, factory, var_type): return ArtifactAPIUsageVariable(name, factory, var_type, self) def render(self, flush: bool = False) -> str: """Return a newline-seperated string of Artifact API python code. Warning ------- For SDK use only. Do not use in a written usage example. Parameters ---------- flush : bool Whether to 'flush' the current code. Importantly, this will clear the top-line imports for future invocations. 
""" sorted_imps = sorted(self.local_imports) if sorted_imps: sorted_imps = sorted_imps + [''] rendered = '\n'.join(sorted_imps + self.recorder) if flush: self._reset_state() return rendered def init_artifact(self, name, factory): variable = super().init_artifact(name, factory) var_name = str(variable.to_interface_name()) self.init_data_refs[var_name] = variable return variable def init_metadata(self, name, factory): variable = super().init_metadata(name, factory) var_name = str(variable.to_interface_name()) self.init_data_refs[var_name] = variable return variable def init_artifact_collection(self, name, factory): variable = super().init_artifact_collection(name, factory) var_name = str(variable.to_interface_name()) self.init_data_refs[var_name] = variable return variable def construct_artifact_collection(self, name, members): variable = super().construct_artifact_collection(name, members) var_name = variable.to_interface_name() lines = [f'{var_name} = ResultCollection({{'] for key, member in members.items(): lines.append(self.INDENT + f"'{key}': {member.name},") lines.append('})') self._update_imports(from_='qiime2', import_='ResultCollection') self._add(lines) return variable def get_artifact_collection_member(self, name, variable, key): accessed_variable = super().get_artifact_collection_member( name, variable, key ) lines = [ f"{name} = {variable.to_interface_name()}['{key}']" ] self._add(lines) return accessed_variable def init_format(self, name, factory, ext=None): if ext is not None: name = '%s.%s' % (name, ext.lstrip('.')) variable = super().init_format(name, factory, ext=ext) var_name = str(variable.to_interface_name()) self.init_data_refs[var_name] = variable return variable def import_from_format(self, name, semantic_type, variable, view_type=None): imported_var = super().import_from_format( name, semantic_type, variable, view_type=view_type) interface_name = imported_var.to_interface_name() import_fp = variable.to_interface_name() lines = [ '%s = Artifact.import_data(' % (interface_name,), self.INDENT + '%r,' % (semantic_type,), self.INDENT + '%r,' % (import_fp,), ] if view_type is not None: if type(view_type) is not str: # Show users where these formats come from when used in the # Python API to make things less "magical". import_path = _canonical_module(view_type) view_type = view_type.__name__ if import_path is not None: self._update_imports(from_=import_path, import_=view_type) else: # May be in scope already, but something is quite wrong at # this point, so assume the plugin_manager is sufficiently # informed. 
view_type = repr(view_type) else: view_type = repr(view_type) lines.append(self.INDENT + '%s,' % (view_type,)) lines.append(')') self._update_imports(from_='qiime2', import_='Artifact') self._add(lines) return imported_var def merge_metadata(self, name, *variables): variable = super().merge_metadata(name, *variables) first_var, remaining_vars = variables[0], variables[1:] first_md = first_var.to_interface_name() names = [str(r.to_interface_name()) for r in remaining_vars] remaining = ', '.join(names) var_name = variable.to_interface_name() lines = ['%r = %r.merge(%s)' % (var_name, first_md, remaining)] self._add(lines) return variable def get_metadata_column(self, name, column_name, variable): col_variable = super().get_metadata_column(name, column_name, variable) to_name = col_variable.to_interface_name() from_name = variable.to_interface_name() lines = ['%s = %s.get_column(%r)' % (to_name, from_name, column_name)] self._add(lines) return col_variable def view_as_metadata(self, name, from_variable): to_variable = super().view_as_metadata(name, from_variable) from_name = from_variable.to_interface_name() to_name = to_variable.to_interface_name() lines = ['%r = %r.view(Metadata)' % (to_name, from_name)] self._update_imports(from_='qiime2', import_='Metadata') self._add(lines) return to_variable def peek(self, variable): var_name = variable.to_interface_name() lines = [] for attr in ('uuid', 'type', 'format'): lines.append('print(%r.%s)' % (var_name, attr)) self._add(lines) return variable def comment(self, text): lines = ['# %s' % (text,)] self._add(lines) def help(self, action): action_name = self._plugin_import_as_name(action) # TODO: this isn't pretty, but it gets the job done lines = ['help(%s.%s.__call__)' % (action_name, action.action_id)] self._add(lines) def action(self, action, input_opts, output_opts): variables = super().action(action, input_opts, output_opts) self._plugin_import_as_name(action) inputs = input_opts.map_variables(lambda v: v.to_interface_name()) self._template_action(action, inputs, variables) return variables def get_example_data(self): return {r: v.execute() for r, v in self.init_data_refs.items()} def _plugin_import_as_name(self, action): action_f = action.get_action() full_import = action_f.get_import_path() base, _, _ = full_import.rsplit('.', 2) as_ = '%s_actions' % (action.plugin_id,) self._update_imports(import_='%s.actions' % (base,), as_=as_) return as_ def _template_action(self, action, input_opts, variables): if len(variables) > self.action_collection_size: output_vars = 'action_results' else: output_vars = self._template_outputs(action, variables) plugin_id = action.plugin_id action_id = action.action_id lines = [ '%s = %s_actions.%s(' % (output_vars, plugin_id, action_id), ] for k, v in input_opts.items(): line = self._template_input(k, v) lines.append(line) lines.append(')') if len(variables) > self.action_collection_size: for k, v in variables._asdict().items(): var_name = v.to_interface_name() lines.append('%s = action_results.%s' % (var_name, k)) self._add(lines) def _template_outputs(self, action, variables): output_vars = [] action_f = action.get_action() # need to coax the outputs into the correct order for unpacking for output in action_f.signature.outputs: variable = getattr(variables, output) output_vars.append(str(variable.to_interface_name())) if len(output_vars) == 1: output_vars.append('') return ', '.join(output_vars).strip() def _template_input(self, input_name, value): if isinstance(value, list): t = ', '.join(repr(el) for el in 
value) return self.INDENT + '%s=[%s],' % (input_name, t) if isinstance(value, set): t = ', '.join(repr(el) for el in sorted(value, key=str)) return self.INDENT + '%s={%s},' % (input_name, t) return self.INDENT + '%s=%r,' % (input_name, value) def _update_imports(self, import_, from_=None, as_=None): import_record = self.ImporterRecord( import_=import_, from_=from_, as_=as_) if as_ is not None: self.namespace.add(as_) else: self.namespace.add(import_) rendered = import_record.render() if rendered not in self.global_imports: self.local_imports.add(rendered) self.global_imports.add(rendered) def _canonical_module(obj): last_module = None module_str = obj.__module__ parts = module_str.split('.') while parts: try: module = importlib.import_module('.'.join(parts)) except ModuleNotFoundError: return last_module if not hasattr(module, obj.__name__): return last_module last_module = '.'.join(parts) parts.pop() return None class QIIMEArtifactAPIImporter: def _plugin_lookup(self, plugin_name): import qiime2.sdk pm = qiime2.sdk.PluginManager() lookup = {s.replace('-', '_'): s for s in pm.plugins} if plugin_name not in lookup: return None return pm.plugins[lookup[plugin_name]] def find_spec(self, name, path=None, target=None): # Don't waste time doing anything if it's not a qiime2 plugin if not name.startswith('qiime2.plugins.'): return None if target is not None: # TODO: experiment with this to see if it is possible raise ImportError("Reloading the QIIME 2 Artifact API is not" " currently supported.") # We couldn't care less about path, it is useless to us # (It is the __path__ of the parent module) fqn = name.split('.') plugin_details = fqn[2:] # fqn[len(['qiime2', 'plugins']):] plugin_name = plugin_details[0] plugin = self._plugin_lookup(plugin_name) if plugin is None or len(plugin_details) > 2: return None if len(plugin_details) == 1: return self._make_spec(name, plugin) elif plugin_details[1] == 'visualizers': return self._make_spec(name, plugin, ('visualizers',)) elif plugin_details[1] == 'methods': return self._make_spec(name, plugin, ('methods',)) elif plugin_details[1] == 'pipelines': return self._make_spec(name, plugin, ('pipelines',)) elif plugin_details[1] == 'actions': return self._make_spec(name, plugin, ('methods', 'visualizers', 'pipelines')) return None def _make_spec(self, name, plugin, action_types=None): # See PEP 451 for explanation of what is happening: # https://www.python.org/dev/peps/pep-0451/#modulespec return importlib.machinery.ModuleSpec( name, loader=self, origin='generated QIIME 2 API', loader_state={'plugin': plugin, 'action_types': action_types}, is_package=action_types is None ) def create_module(self, spec): # Required by Python 3.6, we just need the default behavior return None def exec_module(self, module): spec = module.__spec__ plugin = spec.loader_state['plugin'] action_types = spec.loader_state['action_types'] module.__plugin__ = plugin if action_types is None: module.methods = importlib.import_module('.methods', package=spec.name) module.visualizers = importlib.import_module('.visualizers', package=spec.name) module.pipelines = importlib.import_module('.pipelines', package=spec.name) module.actions = importlib.import_module('.actions', package=spec.name) else: for action_type in action_types: actions = getattr(plugin, action_type) for key, value in actions.items(): setattr(module, key, value) sys.meta_path += [QIIMEArtifactAPIImporter()] 
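A short, hypothetical sketch (not part of this archive) of the two pieces defined above working together: QIIMEArtifactAPIImporter, installed on sys.meta_path, resolves qiime2.plugins.<plugin> imports on the fly, and ArtifactAPIUsage renders a registered usage example as Artifact API code. It assumes the dummy testing plugin and its concatenate_ints action (referenced by the tests earlier in this archive) are registered.

from qiime2.plugins import ArtifactAPIUsage

# This import is synthesized at runtime by QIIMEArtifactAPIImporter; it only
# resolves if a plugin named 'dummy-plugin' is registered with the
# PluginManager.
from qiime2.plugins.dummy_plugin import actions as dummy_actions

use = ArtifactAPIUsage()
# Template every registered usage example for the action against this driver.
for name, example in dummy_actions.concatenate_ints.examples.items():
    example(use)

# render() returns newline-separated Artifact API code; flush=True resets the
# driver's accumulated imports and statements for the next invocation.
print(use.render(flush=True))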
qiime2-2024.5.0/qiime2/sdk/000077500000000000000000000000001462552636000151125ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/sdk/__init__.py000066400000000000000000000021451462552636000172250ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- from .context import Context from .action import Action, Method, Visualizer, Pipeline from .plugin_manager import PluginManager, UninitializedPluginManagerError from .result import Result, Artifact, Visualization, ResultCollection from .results import Results from .util import parse_type, parse_format, type_from_ast from ..core.cite import Citations from ..core.exceptions import ValidationError, ImplementationError __all__ = ['Result', 'Results', 'Artifact', 'Visualization', 'ResultCollection', 'Action', 'Method', 'Visualizer', 'Pipeline', 'PluginManager', 'parse_type', 'parse_format', 'type_from_ast', 'Context', 'Citations', 'PARALLEL_CONFIG', 'ValidationError', 'ImplementationError', 'UninitializedPluginManagerError'] qiime2-2024.5.0/qiime2/sdk/action.py000066400000000000000000000772671462552636000167640ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import abc import concurrent.futures import inspect import tempfile import textwrap import decorator import dill from parsl.app.app import python_app, join_app import qiime2.sdk import qiime2.core.type as qtype import qiime2.core.archive as archive from qiime2.core.util import (LateBindingAttribute, DropFirstParameter, tuplize, create_collection_name) from qiime2.sdk.proxy import Proxy def _subprocess_apply(action, ctx, args, kwargs): # We with in the cache here to make sure archiver.load* puts things in the # right cache with ctx.cache: exe = action._bind( lambda: qiime2.sdk.Context(parent=ctx), {'type': 'asynchronous'}) results = exe(*args, **kwargs) return results def _run_parsl_action(action, ctx, execution_ctx, mapped_args, mapped_kwargs, inputs=[]): """This is what the parsl app itself actually runs. It's basically just a wrapper around our QIIME 2 action. When this is initially called, args and kwargs may contain proxies that reference futures in inputs. By the time this starts executing, those futures will have resolved. We then need to take the resolved inputs and map the correct parts of them to the correct args/kwargs before calling the action with them. This is necessary because a single future in inputs will resolve into a Results object. We need to take singular Result objects off of that Results object and map them to the correct inputs for the action we want to call. 
""" args = [] for arg in mapped_args: unmapped = _unmap_arg(arg, inputs) args.append(unmapped) kwargs = {} for key, value in mapped_kwargs.items(): unmapped = _unmap_arg(value, inputs) kwargs[key] = unmapped # We with in the cache here to make sure archiver.load* puts things in the # right cache with ctx.cache: exe = action._bind( lambda: qiime2.sdk.Context(parent=ctx), execution_ctx) results = exe(*args, **kwargs) # If we are running a pipeline, we need to create a future here because # the parsl join app the pipeline was running in is expected to return # a future, but we will have concrete results by this point if we are a # pipeline if isinstance(action, Pipeline) and ctx.parallel: return _create_future(results) return results def _map_arg(arg, futures): """ Map a proxy artifact for input to a parsl action """ # We add this future to the list and create a new proxy with its index as # its future. if isinstance(arg, Proxy): futures.append(arg._future_) mapped = arg.__class__(len(futures) - 1, arg._selector_) # We do the above but for all elements in the collection elif isinstance(arg, list) and _is_all_proxies(arg): mapped = [] for proxy in arg: futures.append(proxy._future_) mapped.append(proxy.__class__(len(futures) - 1, proxy._selector_)) elif isinstance(arg, dict) and _is_all_proxies(arg): mapped = {} for key, value in arg.items(): futures.append(value._future_) mapped[key] = value.__class__(len(futures) - 1, value._selector_) # We just have a real artifact and don't need to map else: mapped = arg return mapped def _unmap_arg(arg, inputs): """ Unmap a proxy artifact given to a parsl action """ # We were hacky and set _future_ to be the index of this artifact in the # inputs list if isinstance(arg, Proxy): resolved_result = inputs[arg._future_] unmapped = arg._get_element_(resolved_result) # If we got a collection of proxies as the input we were even hackier and # added each proxy to the inputs list individually while having a list of # their indices in the args. elif isinstance(arg, list) and _is_all_proxies(arg): unmapped = [] for proxy in arg: resolved_result = inputs[proxy._future_] unmapped.append(proxy._get_element_(resolved_result)) elif isinstance(arg, dict) and _is_all_proxies(arg): unmapped = {} for key, value in arg.items(): resolved_result = inputs[value._future_] unmapped[key] = value._get_element_(resolved_result) # We didn't have a proxy at all else: unmapped = arg return unmapped def _is_all_proxies(collection): """ Returns whether the collection is all proxies or all artifacts. Raises a ValueError if there is a mix. """ if isinstance(collection, dict): collection = list(collection.values()) if all(isinstance(elem, Proxy) for elem in collection): return True if any(isinstance(elem, Proxy) for elem in collection): raise ValueError("Collection has mixed proxies and artifacts. " "This is not allowed.") return False @python_app def _create_future(results): """ This is a bit of a dumb hack. It's just a way for us to make pipelines return a future which is what Parsl wants a join_app to return even though we will have real results at this point. """ return results class Action(metaclass=abc.ABCMeta): """QIIME 2 Action""" type = 'action' _ProvCaptureCls = archive.ActionProvenanceCapture __call__ = LateBindingAttribute('_dynamic_call') asynchronous = LateBindingAttribute('_dynamic_async') parallel = LateBindingAttribute('_dynamic_parsl') # Converts a callable's signature into its wrapper's signature (i.e. # converts the "view API" signature into the "artifact API" signature). 
# Accepts a callable as input and returns a callable as output with # converted signature. @abc.abstractmethod def _callable_sig_converter_(self, callable): raise NotImplementedError # Executes a callable on the provided `view_args`, wrapping and returning # the callable's outputs. In other words, executes the "view API", wrapping # and returning the outputs as the "artifact API". `view_args` is a dict # mapping parameter name to unwrapped value (i.e. view). `view_args` # contains an entry for each parameter accepted by the wrapper. It is the # executor's responsibility to perform any additional transformations on # these parameters, or provide extra parameters, in order to execute the # callable. `output_types` is an OrderedDict mapping output name to QIIME # type (e.g. semantic type). @abc.abstractmethod def _callable_executor_(self, scope, view_args, output_types): raise NotImplementedError # Private constructor @classmethod def _init(cls, callable, signature, plugin_id, name, description, citations, deprecated, examples): """ Parameters ---------- callable : callable signature : qiime2.core.type.Signature plugin_id : str name : str Human-readable name for this action. description : str Human-readable description for this action. """ self = cls.__new__(cls) self.__init(callable, signature, plugin_id, name, description, citations, deprecated, examples) return self # This "extra private" constructor is necessary because `Action` objects # can be initialized from a static (classmethod) context or on an # existing instance (see `_init` and `__setstate__`, respectively). def __init(self, callable, signature, plugin_id, name, description, citations, deprecated, examples): self._callable = callable self.signature = signature self.plugin_id = plugin_id self.name = name self.description = description self.citations = citations self.deprecated = deprecated self.examples = examples self.id = callable.__name__ self._dynamic_call = self._get_callable_wrapper() self._dynamic_async = self._get_async_wrapper() # This a temp thing to play with parsl before integrating more deeply self._dynamic_parsl = self._get_parsl_wrapper() def __init__(self): raise NotImplementedError( "%s constructor is private." % self.__class__.__name__) @property def source(self): """ The source code for the action's callable. Returns ------- str The source code of this action's callable formatted as Markdown text. """ try: source = inspect.getsource(self._callable) except OSError: raise TypeError( "Cannot retrieve source code for callable %r" % self._callable.__name__) return markdown_source_template % {'source': source} def get_import_path(self, include_self=True): path = f'qiime2.plugins.{self.plugin_id}.{self.type}s' if include_self: path += f'.{self.id}' return path def __repr__(self): return "<%s %s>" % (self.type, self.get_import_path()) def __getstate__(self): return dill.dumps({ 'callable': self._callable, 'signature': self.signature, 'plugin_id': self.plugin_id, 'name': self.name, 'description': self.description, 'citations': self.citations, 'deprecated': self.deprecated, 'examples': self.examples, }) def __setstate__(self, state): self.__init(**dill.loads(state)) def _bind(self, context_factory, execution_ctx={'type': 'synchronous'}): """Bind an action to a Context factory, returning a decorated function. This is a very primitive API and should be used primarily by the framework and very advanced interfaces which need deep control over the calling semantics of pipelines and garbage collection. 
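
        For illustration only (a hedged sketch -- ``an_action``,
        ``an_artifact`` and ``a_parameter`` are hypothetical names), a
        root-level binding such as the one performed by
        ``_get_callable_wrapper`` looks roughly like::

            bound = an_action._bind(qiime2.sdk.Context)
            results = bound(an_artifact, a_parameter=1)
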
The basic idea behind this is outlined as follows: Every action is defined as an *instance* that a plugin constructs. This means that `self` represents the internal details as to what the action is. If you need to associate additional state with the *application* of an action, you cannot mutate `self` without changing all future applications. So there needs to be an additional instance variable that can serve as the state of a given application. We call this a Context object. It is also important that each application of an action has *independent* state, so providing an instance of Context won't work. We need a factory. Parameterizing the context is necessary because it is possible for an action to call other actions. The details need to be coordinated behind the scenes to the user, so we can parameterize the behavior by providing different context factories to `bind` at different points in the "call stack". """ def bound_callable(*args, **kwargs): # This function's signature is rewritten below using # `decorator.decorator`. When the signature is rewritten, # args[0] is the function whose signature was used to rewrite # this function's signature. args = args[1:] ctx = context_factory() # Set up a scope under which we can track destructable references # if something goes wrong, the __exit__ handler of this context # manager will clean up. (It also cleans up when things go right) with ctx as scope: provenance = self._ProvCaptureCls( self.type, self.plugin_id, self.id, execution_ctx) scope.add_reference(provenance) if self.deprecated: with qiime2.core.util.warning() as warn: warn(self._build_deprecation_message(), FutureWarning) # Type management collated_inputs = self.signature.collate_inputs( *args, **kwargs) self.signature.check_types(**collated_inputs) output_types = self.signature.solve_output(**collated_inputs) callable_args = self.signature.coerce_user_input( **collated_inputs) callable_args = \ self.signature.transform_and_add_callable_args_to_prov( provenance, **callable_args) outputs = self._callable_executor_( scope, callable_args, output_types, provenance) if len(outputs) != len(self.signature.outputs): raise ValueError( "Number of callable outputs must match number of " "outputs defined in signature: %d != %d" % (len(outputs), len(self.signature.outputs))) # Wrap in a Results object mapping output name to value so # users have access to outputs by name or position. results = qiime2.sdk.Results( self.signature.outputs.keys(), outputs) return results bound_callable = self._rewrite_wrapper_signature(bound_callable) self._set_wrapper_properties(bound_callable) self._set_wrapper_name(bound_callable, self.id) return bound_callable def _get_callable_wrapper(self): # This is a "root" level invocation (not a nested call within a # pipeline), so no special factory is needed. callable_wrapper = self._bind(qiime2.sdk.Context) self._set_wrapper_name(callable_wrapper, '__call__') return callable_wrapper def _get_async_wrapper(self): def async_wrapper(*args, **kwargs): # TODO handle this better in the future, but stop the massive error # caused by MacOSX asynchronous runs for now. try: import matplotlib as plt if plt.rcParams['backend'].lower() == 'macosx': raise EnvironmentError(backend_error_template % plt.matplotlib_fname()) except ImportError: pass # This function's signature is rewritten below using # `decorator.decorator`. When the signature is rewritten, args[0] # is the function whose signature was used to rewrite this # function's signature. 
args = args[1:] pool = concurrent.futures.ProcessPoolExecutor(max_workers=1) future = pool.submit( _subprocess_apply, self, qiime2.sdk.Context(), args, kwargs) # TODO: pool.shutdown(wait=False) caused the child process to # hang unrecoverably. This seems to be a bug in Python 3.7 # It's probably best to gut concurrent.futures entirely, so we're # ignoring the resource leakage for the moment. return future async_wrapper = self._rewrite_wrapper_signature(async_wrapper) self._set_wrapper_properties(async_wrapper) self._set_wrapper_name(async_wrapper, 'asynchronous') return async_wrapper def _bind_parsl(self, ctx, *args, **kwargs): futures = [] mapped_args = [] mapped_kwargs = {} # If this is the first time we called _bind_parsl on a pipeline, the # first argument will be the callable for the pipeline which we do not # want to pass on in this manner, so we skip it. if len(args) >= 1 and callable(args[0]): args = args[1:] # Parsl will queue up apps with futures as their arguments then not # execute the apps until the futures are resolved. This is an extremely # handy feature, but QIIME 2 does not play nice with it out of the box. # You can look in qiime2/sdk/proxy.py for some more details on how this # is working, but we are basically taking future QIIME 2 results and # mapping them to the correct inputs in the action we are trying to # call. This is necessary if we are running a pipeline in particular # because the inputs to the next action could contain outputs from the # last action that might not be resolved yet because Parsl may be # queueing the next action before the last one has completed. for arg in args: mapped = _map_arg(arg, futures) mapped_args.append(mapped) for key, value in kwargs.items(): mapped = _map_arg(value, futures) mapped_kwargs[key] = mapped # If the user specified a particular executor for a this action # determine that here executor = ctx.action_executor_mapping.get(self.id, 'default') execution_ctx = {'type': 'parsl'} # Pipelines run in join apps and are a sort of synchronization point # right now. Unfortunately it is not currently possible to make say a # pipeline that calls two other pipelines within it and execute both of # those internal pipelines simultaneously. if isinstance(self, qiime2.sdk.action.Pipeline): # If ctx._parent is None then this is the root pipeline and we want # to dispatch it to a join_app execution_ctx['parsl_type'] = 'DFK' if ctx._parent is None: # NOTE: Do not make this a python_app(join=True). We need it to # run in the parsl main thread future = join_app()( _run_parsl_action)(self, ctx, execution_ctx, mapped_args, mapped_kwargs, inputs=futures) # If there is a parent then this is not the root pipeline and we # want to just _bind it with a parallel context. The fact that # parallel is set on the context will cause ctx.get_action calls in # the pipeline to use the action's _bind_parsl method. 
else: return self._bind(lambda: qiime2.sdk.Context(ctx), execution_ctx=execution_ctx)(*args, **kwargs) else: execution_ctx['parsl_type'] = \ ctx.executor_name_type_mapping[executor] future = python_app( executors=[executor])( _run_parsl_action)(self, ctx, execution_ctx, mapped_args, mapped_kwargs, inputs=futures) collated_input = self.signature.collate_inputs(*args, **kwargs) output_types = self.signature.solve_output(**collated_input) # Again, we return a set of futures not a set of real results return qiime2.sdk.proxy.ProxyResults(future, output_types) def _get_parsl_wrapper(self): def parsl_wrapper(*args, **kwargs): # TODO: Maybe make this a warning instead? if not isinstance(self, Pipeline): raise ValueError('Only pipelines may be run in parallel') return self._bind_parsl(qiime2.sdk.Context(parallel=True), *args, **kwargs) parsl_wrapper = self._rewrite_wrapper_signature(parsl_wrapper) self._set_wrapper_properties(parsl_wrapper) self._set_wrapper_name(parsl_wrapper, 'parsl') return parsl_wrapper def _rewrite_wrapper_signature(self, wrapper): # Convert the callable's signature into the wrapper's signature and set # it on the wrapper. return decorator.decorator( wrapper, self._callable_sig_converter_(self._callable)) def _set_wrapper_name(self, wrapper, name): wrapper.__name__ = wrapper.__qualname__ = name def _set_wrapper_properties(self, wrapper): wrapper.__module__ = self.get_import_path(include_self=False) wrapper.__doc__ = self._build_numpydoc() wrapper.__annotations__ = self._build_annotations() # This is necessary so that `inspect` doesn't display the wrapped # function's annotations (the annotations apply to the "view API" and # not the "artifact API"). del wrapper.__wrapped__ def _build_annotations(self): annotations = {} for name, spec in self.signature.signature_order.items(): annotations[name] = spec.qiime_type output = [] for spec in self.signature.outputs.values(): output.append(spec.qiime_type) output = tuple(output) annotations["return"] = output return annotations def _build_numpydoc(self): numpydoc = [] numpydoc.append(textwrap.fill(self.name, width=75)) if self.deprecated: base_msg = textwrap.indent( textwrap.fill(self._build_deprecation_message(), width=72), ' ') numpydoc.append('.. deprecated::\n' + base_msg) numpydoc.append(textwrap.fill(self.description, width=75)) sig = self.signature parameters = self._build_section("Parameters", sig.signature_order) returns = self._build_section("Returns", sig.outputs) # TODO: include Usage-rendered examples here for section in (parameters, returns): if section: numpydoc.append(section) return '\n\n'.join(numpydoc) + '\n' def _build_section(self, header, iterable): section = [] if iterable: section.append(header) section.append('-'*len(header)) for key, value in iterable.items(): variable_line = ( "{item} : {type}".format(item=key, type=value.qiime_type)) if value.has_default(): variable_line += ", optional" section.append(variable_line) if value.has_description(): section.append(textwrap.indent(textwrap.fill( str(value.description), width=71), ' ')) return '\n'.join(section).strip() def _build_deprecation_message(self): return (f'This {self.type.title()} is deprecated and will be removed ' 'in a future version of this plugin.') class Method(Action): """QIIME 2 Method""" type = 'method' # Abstract method implementations: def _callable_sig_converter_(self, callable): # No conversion necessary. 
return callable def _callable_executor_(self, scope, view_args, output_types, provenance): output_views = self._callable(**view_args) output_views = tuplize(output_views) # TODO this won't work if the user has annotated their "view API" to # return a `typing.Tuple` with some number of components. Python will # return a tuple when there are multiple return values, and this length # check will fail because the tuple as a whole should be matched up to # a single output type instead of its components. This is an edgecase # due to how Python handles multiple returns, and can be worked around # by using something like `typing.List` instead. if len(output_views) != len(output_types): raise TypeError( "Number of output views must match number of output " "semantic types: %d != %d" % (len(output_views), len(output_types))) output_artifacts = \ self.signature.coerce_given_outputs(output_views, output_types, scope, provenance) return tuple(output_artifacts) @classmethod def _init(cls, callable, inputs, parameters, outputs, plugin_id, name, description, input_descriptions, parameter_descriptions, output_descriptions, citations, deprecated, examples): signature = qtype.MethodSignature(callable, inputs, parameters, outputs, input_descriptions, parameter_descriptions, output_descriptions) return super()._init(callable, signature, plugin_id, name, description, citations, deprecated, examples) class Visualizer(Action): """QIIME 2 Visualizer""" type = 'visualizer' # Abstract method implementations: def _callable_sig_converter_(self, callable): return DropFirstParameter.from_function(callable) def _callable_executor_(self, scope, view_args, output_types, provenance): # TODO use qiime2.plugin.OutPath when it exists, and update visualizers # to work with OutPath instead of str. Visualization._from_data_dir # will also need to be updated to support OutPath instead of str. with tempfile.TemporaryDirectory(prefix='qiime2-temp-') as temp_dir: ret_val = self._callable(output_dir=temp_dir, **view_args) if ret_val is not None: raise TypeError( "Visualizer %r should not return anything. " "Received %r as a return value." 
% (self, ret_val)) provenance.output_name = 'visualization' viz = qiime2.sdk.Visualization._from_data_dir(temp_dir, provenance) viz = scope.add_parent_reference(viz) return (viz, ) @classmethod def _init(cls, callable, inputs, parameters, plugin_id, name, description, input_descriptions, parameter_descriptions, citations, deprecated, examples): signature = qtype.VisualizerSignature(callable, inputs, parameters, input_descriptions, parameter_descriptions) return super()._init(callable, signature, plugin_id, name, description, citations, deprecated, examples) class Pipeline(Action): """QIIME 2 Pipeline""" type = 'pipeline' _ProvCaptureCls = archive.PipelineProvenanceCapture def _callable_sig_converter_(self, callable): return DropFirstParameter.from_function(callable) def _callable_executor_(self, scope, view_args, output_types, provenance): outputs = self._callable(scope.ctx, **view_args) # Just make sure we have an iterable even if there was only one output outputs = tuplize(outputs) # Make sure any collections returned are in the form of # ResultCollections and that futures are resolved # # TODO: Ideally we would not need to resolve futures here as this # prevents us from properly parallelizing nested pipelines outputs = self._coerce_pipeline_outputs(outputs) for output in outputs: if isinstance(output, qiime2.sdk.ResultCollection): for elem in output.values(): if not isinstance(elem, qiime2.sdk.Result): raise TypeError("Pipelines must return `Result` " "objects, not %s" % (type(elem), )) elif not isinstance(output, qiime2.sdk.Result): raise TypeError("Pipelines must return `Result` objects, " "not %s" % (type(output), )) # This condition *is* tested by the caller of _callable_executor_, but # the kinds of errors a plugin developer see will make more sense if # this check happens before the subtype check. Otherwise forgetting an # output would more likely error as a wrong type, which while correct, # isn't root of the problem. 
if len(outputs) != len(output_types): raise TypeError( "Number of outputs must match number of output " "semantic types: %d != %d" % (len(outputs), len(output_types))) results = [] for output, (name, spec) in zip(outputs, output_types.items()): # If we don't have a Result, we should have a collection, if we # have neither, or our types just don't match up, something bad # happened if isinstance(output, qiime2.sdk.Result) and \ (output.type <= spec.qiime_type): prov = provenance.fork(name, output) scope.add_reference(prov) aliased_result = output._alias(prov) aliased_result = scope.add_parent_reference(aliased_result) results.append(aliased_result) elif spec.qiime_type.name == 'Collection' and \ output.collection in spec.qiime_type: size = len(output) aliased_output = qiime2.sdk.ResultCollection() for idx, (key, value) in enumerate(output.items()): collection_name = create_collection_name( name=name, key=key, idx=idx, size=size) prov = provenance.fork(collection_name, value) scope.add_reference(prov) aliased_result = value._alias(prov) aliased_result = scope.add_parent_reference(aliased_result) aliased_output[str(key)] = aliased_result results.append(aliased_output) else: _type = output.type if isinstance(output, qiime2.sdk.Result) \ else type(output) raise TypeError( "Expected output type %r, received %r" % (spec.qiime_type, _type)) if len(results) != len(self.signature.outputs): raise ValueError( "Number of callable outputs must match number of " "outputs defined in signature: %d != %d" % (len(results), len(self.signature.outputs))) return tuple(results) def _coerce_pipeline_outputs(self, outputs): """Ensure all futures are resolved and all collections are of type ResultCollection """ coerced_outputs = [] for output in outputs: # Handle proxy outputs if isinstance(output, Proxy): output = output.result() # Handle collection outputs if isinstance(output, dict) or \ isinstance(output, list): output = qiime2.sdk.ResultCollection(output) # Handle proxies as elements of collections for key, value in output.items(): if isinstance(value, Proxy): output[key] = value.result() coerced_outputs.append(output) return tuple(coerced_outputs) @classmethod def _init(cls, callable, inputs, parameters, outputs, plugin_id, name, description, input_descriptions, parameter_descriptions, output_descriptions, citations, deprecated, examples): signature = qtype.PipelineSignature(callable, inputs, parameters, outputs, input_descriptions, parameter_descriptions, output_descriptions) return super()._init(callable, signature, plugin_id, name, description, citations, deprecated, examples) markdown_source_template = """ ```python %(source)s ``` """ # TODO add unit test for callables raising this backend_error_template = """ Your current matplotlib backend (MacOSX) does not work with asynchronous calls. A recommended backend is Agg, and can be changed by modifying your matplotlibrc "backend" parameter, which can be found at: \n\n %s """ qiime2-2024.5.0/qiime2/sdk/actiongraph.py000066400000000000000000000156451462552636000177760ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- from itertools import product, chain import networkx as nx import copy import qiime2 def get_next_arguments(action, type="input"): """ Get a tuple of required/nonrequired inputs or outputs for each method Parameters ---------- action : Qiime2.action type : {"input", "param", "output"} Delineates if getting the action input, param, or output types Returns ------- List of tuples containing name and required semantic types List of tuples containing name and optional semantic types """ req = [] non_req = [] if type == "input": for k, v in action.signature.inputs.items(): if not v.has_default(): req.append([k, v.qiime_type]) else: non_req.append(["."+k, v.qiime_type]) elif type == "param": for k, v in action.signature.parameters.items(): if not v.has_default(): req.append([k, v.qiime_type]) else: non_req.append(["."+k, v.qiime_type]) else: for k, v in action.signature.outputs.items(): if not v.has_default(): req.append([k, v.qiime_type]) else: non_req.append(["."+k, v.qiime_type]) return req, non_req def unravel(list_): """ Unravel Union node to get all permutations of types for each action Parameters ---------- list : list of Qiime2.types Returns ------- list of lists - list of permuations of types for each action """ result = [list_] for i, x in enumerate(list_): if len(list(x[1])) > 1: members = list(x[1]) temp = copy.deepcopy(result) # update result with first element of types in member for each_list in result: each_list[i][1] = members[0] # add in other permutations of types in member for n in range(1, len(members)): copy_result = copy.deepcopy(temp) for each_list in copy_result: each_list[i][1] = members[n] result += copy_result return result def generate_nodes_by_action(action, opt=False): """ Given a method, generates all combinations of inputs and outputs for that particular method and and stores the combinations as dictionaries in a resulting list. 
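
    For illustration (a hedged sketch; the argument and type names are
    hypothetical), each element of the returned list has the shape::

        {'inputs': {'table': FeatureTable[Frequency], 'sampling_depth': Int},
         'outputs': {'rarefied_table': FeatureTable[Frequency]}}
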
Parameters ---------- method : Qiime2.action opt : {True, False} Delineates if optional types should be included Returns ------- list of dictionaries - each dictionary is a combination inputs and outputs for particular node """ input, input_nr = get_next_arguments(action, "input") param, param_nr = get_next_arguments(action, "param") output, output_nr = get_next_arguments(action, "output") input = unravel(input) param = unravel(param) opt_in_list = [] if opt: opt_in_list += input_nr opt_in_list += param_nr opt_in_list = unravel(opt_in_list) ins = [dict(x) for x in [list(chain.from_iterable(i)) for i in list(product(input, param, opt_in_list))]] outs = dict(output + output_nr) results = [{'inputs': i, 'outputs': outs} for i in ins] return results ins = [dict(x) for x in [list(chain.from_iterable(i)) for i in list(product(input, param))]] outs = dict(output) results = [{'inputs': i, 'outputs': outs} for i in ins] return results def build_graph(action_list=[], opt=False): """ Constructs a networkx graph with different semantic types and actions as nodes Parameters ---------- action_list : list of Qiime2.action If list is empty, will pull from all methods in the Qiime2 plugin opt : {True, False} Delineates if optional types should be included in the graph Returns ------- nx.DiGraph - networkx graph connected based on all or specified methods """ G = nx.DiGraph() G.edges(data=True) # get all actions or specifc actions if specified in sigs pm = qiime2.sdk.PluginManager() if not action_list: for _, pg in pm.plugins.items(): action_list += list(pg.actions.values()) for action in action_list: results = generate_nodes_by_action(action, opt) for dict_ in results: for k, v in dict_.items(): if not v: continue # renaming dictionary to remove '.' action_node = {} for x, y in v.items(): if x[0] == '.': action_node[x[1:]] = y else: action_node[x] = y dict_[k] = action_node if not G.has_node(str(dict_)): G.add_node(str(dict_), value=action, node='action') if k == 'inputs': for in_k, in_v in v.items(): if not in_v: continue if in_k[0] == '.': name = "opt_"+str(in_v) G.add_edge(name, str(dict_)) G[name][str(dict_)]['name'] = in_k[1:] G.nodes[name]['type'] = in_v G.nodes[name]['optional'] = True G.nodes[name]['node'] = 'type' else: G.add_edge(in_v, str(dict_)) G[in_v][str(dict_)]['name'] = in_k G.nodes[in_v]['type'] = in_v G.nodes[in_v]['optional'] = False G.nodes[in_v]['node'] = 'type' else: for out_k, out_v in v.items(): if not out_v: continue if out_k[0] == '.': name = "opt_"+str(out_v) G.add_edge("opt_"+str(out_v), str(dict_)) G[str(dict_)][name]['name'] = out_k[1:] G.nodes[name]['type'] = in_v G.nodes[name]['optional'] = True G.nodes[name]['node'] = 'type' else: G.add_edge(str(dict_), out_v) G[str(dict_)][out_v]['name'] = out_k G.nodes[out_v]['type'] = out_v G.nodes[out_v]['optional'] = False G.nodes[out_v]['node'] = 'type' return G qiime2-2024.5.0/qiime2/sdk/context.py000066400000000000000000000256421462552636000171610ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- from qiime2.core.type.util import is_collection_type from qiime2.core.type import HashableInvocation from qiime2.core.cache import get_cache import qiime2.sdk from qiime2.sdk.parallel_config import PARALLEL_CONFIG class Context: def __init__(self, parent=None, parallel=False): if parent is not None: self.action_executor_mapping = parent.action_executor_mapping self.executor_name_type_mapping = parent.executor_name_type_mapping self.parallel = parent.parallel self.cache = parent.cache else: self.action_executor_mapping = \ PARALLEL_CONFIG.action_executor_mapping # Cast type to str so yaml doesn't think it needs tpo instantiate # an executor object when we write this to then read this from # provenance self.executor_name_type_mapping = \ None if PARALLEL_CONFIG.parallel_config is None \ else {v.label: v.__class__.__name__ for v in PARALLEL_CONFIG.parallel_config.executors} self.parallel = parallel self.cache = get_cache() # Only ever do this on the root context. We only want to index the # pool once before we start adding our own stuff to it. if self.cache.named_pool is not None: self.cache.named_pool.create_index() self._parent = parent self._scope = None def get_action(self, plugin: str, action: str): """Return a function matching the callable API of an action. This function is aware of the pipeline context and manages its own cleanup as appropriate. """ plugin = plugin.replace('_', '-') plugin_action = plugin + ':' + action pm = qiime2.sdk.PluginManager() try: plugin_obj = pm.plugins[plugin] except KeyError: raise ValueError("A plugin named %r could not be found." % plugin) try: action_obj = plugin_obj.actions[action] except KeyError: raise ValueError( "An action named %r was not found for plugin %r" % (action, plugin)) # We return this callable which determines whether to return cached # results or to run the action requested. def deferred_action(*args, **kwargs): # The function is the first arg, we ditch that args = args[1:] # If we have a named_pool, we need to check for cached results that # we can reuse. # # We can short circuit our index checking if any of our arguments # are proxies because if we got a proxy as an argument, we know it # is a new thing we are computing from a prior step in the pipeline # and thus will not be cached. 
We can only have proxies if we are # executing with parsl if self.cache.named_pool is not None and (not self.parallel or ( self.parallel and not self._contains_proxies( *args, **kwargs))): collated_inputs = action_obj.signature.collate_inputs( *args, **kwargs) callable_args = action_obj.signature.coerce_user_input( **collated_inputs) # Make args and kwargs look how they do when we read them out # of a .yaml file (list of single value dicts of # input_name: value) arguments = [] for k, v in callable_args.items(): arguments.append({k: v}) invocation = HashableInvocation(plugin_action, arguments) if invocation in self.cache.named_pool.index: cached_outputs = self.cache.named_pool.index[invocation] loaded_outputs = {} for name, _type in action_obj.signature.outputs.items(): if is_collection_type(_type.qiime_type): loaded_collection = qiime2.sdk.ResultCollection() cached_collection = cached_outputs[name] # Get the order we should load collection items in collection_order = list(cached_collection.keys()) self._validate_collection(collection_order) collection_order.sort(key=lambda x: x.idx) for elem_info in collection_order: elem = cached_collection[elem_info] loaded_elem = self.cache.named_pool.load(elem) loaded_collection[ elem_info.item_name] = loaded_elem loaded_outputs[name] = loaded_collection else: output = cached_outputs[name] loaded_outputs[name] = \ self.cache.named_pool.load(output) return qiime2.sdk.Results( loaded_outputs.keys(), loaded_outputs.values()) # If we didn't have cached results to reuse, we need to execute the # action. # # These factories will create new Contexts with this context as # their parent. This allows scope cleanup to happen recursively. A # factory is necessary so that independent applications of the # returned callable receive their own Context objects. # # The parsl factory is a bit more complicated because we need to # pass this exact Context along for a while longer until we run a # normal _bind in action/_run_parsl_action. Then we create a new # Context with this one as its parent inside of the parsl app def _bind_parsl_context(ctx): def _bind_parsl_args(*args, **kwargs): return action_obj._bind_parsl(ctx, *args, **kwargs) return _bind_parsl_args if self.parallel: return _bind_parsl_context(self)(*args, **kwargs) return action_obj._bind( lambda: Context(parent=self))(*args, **kwargs) deferred_action = action_obj._rewrite_wrapper_signature( deferred_action) action_obj._set_wrapper_properties(deferred_action) return deferred_action def _contains_proxies(self, *args, **kwargs): """Returns True if any of the args or kwargs are proxies """ return any(isinstance(arg, qiime2.sdk.proxy.Proxy) for arg in args) \ or any(isinstance(value, qiime2.sdk.proxy.Proxy) for value in kwargs.values()) def _validate_collection(self, collection_order): """Validate that all indexed items in the collection agree on how large the collection should be and that we have that many elements. """ assert all([elem.total == collection_order[0].total for elem in collection_order]) assert len(collection_order) == collection_order[0].total def make_artifact(self, type, view, view_type=None): """Return a new artifact from a given view. This artifact is automatically tracked and cleaned by the pipeline context. """ artifact = qiime2.sdk.Artifact.import_data(type, view, view_type) # self._scope WILL be defined at this point, as pipelines always enter # a scope before deferring to plugin code. 
(Otherwise cleanup wouldn't # happen) self._scope.add_reference(artifact) return artifact def __enter__(self): """For internal use only. Creates a scope API that can track references that need to be destroyed. """ if self._scope is not None: # Prevent odd things from happening to lifecycle cleanup raise Exception('Cannot enter a context twice.') self._scope = Scope(self) return self._scope def __exit__(self, exc_type, exc_value, exc_tb): if exc_type is not None: # Something went wrong, teardown everything self._scope.destroy() else: # Everything is fine, just cleanup internal references and pass # ownership off to the parent context. parent_refs = self._scope.destroy(local_references_only=True) if self._parent is not None and self._parent._scope is not None: for ref in parent_refs: self._parent._scope.add_reference(ref) class Scope: def __init__(self, ctx): self.ctx = ctx self._locals = [] self._parent_locals = [] def add_reference(self, ref): """Add a reference to something destructable that is owned by this scope. """ self._locals.append(ref) # NOTE: We end up with both the artifact and the pipeline alias of artifact # in the named cache in the end. We only have the pipeline alias in the # process pool def add_parent_reference(self, ref): """Add a reference to something destructable that will be owned by the parent scope. The reason it needs to be tracked is so that on failure, a context can still identify what will (no longer) be returned. """ new_ref = self.ctx.cache.process_pool.save(ref) if self.ctx.cache.named_pool is not None: self.ctx.cache.named_pool.save(new_ref) self._parent_locals.append(new_ref) self._parent_locals.append(ref) # Return an artifact backed by the data in the cache return new_ref def destroy(self, local_references_only=False): """Destroy all references and clear state. Parameters ---------- local_references_only : bool Whether to destroy references that will belong to the parent scope. Returns ------- list The list of references that were not destroyed. """ ctx = self.ctx local_refs = self._locals parent_refs = self._parent_locals # Unset instance state, handy to prevent cycles in GC, and also causes # catastrophic failure if some invariant is violated. del self.ctx del self._locals del self._parent_locals for ref in local_refs: ref._destructor() if local_references_only: return parent_refs for ref in parent_refs: ref._destructor() ctx.cache.garbage_collection() return [] qiime2-2024.5.0/qiime2/sdk/parallel_config.py000066400000000000000000000265601462552636000206160ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2022, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import os import psutil import appdirs import threading import importlib import parsl import tomlkit PARALLEL_CONFIG = threading.local() PARALLEL_CONFIG.dfk = None PARALLEL_CONFIG.parallel_config = None PARALLEL_CONFIG.action_executor_mapping = {} # We write a default config to a location in the conda env if there is an # active conda env. 
If there is not an active conda env (most likely because we # are using Docker) then the path we want to write the default to will not # exist, so we will not write a default, we will just load it from memory CONDA_PREFIX = os.environ.get('CONDA_PREFIX', '') VENDORED_FP = os.path.join(CONDA_PREFIX, 'etc', 'qiime2_config.toml') VENDORED_CONFIG = { 'parsl': { 'strategy': 'None', 'executors': [ {'class': 'ThreadPoolExecutor', 'label': 'default', 'max_threads': max(psutil.cpu_count() - 1, 1)}, {'class': 'HighThroughputExecutor', 'label': 'htex', 'max_workers': max(psutil.cpu_count() - 1, 1), 'provider': {'class': 'LocalProvider'}} ] } } # As near as I can tell, loading a config with a HighThroughputExecutor leaks # open sockets. This leads to issues (especially on osx) with "too many open # files" errors while running the test, so this test config with no # HighThroughputExecutor was created to mitigate that scenario. This config is # only to be used in tests that do not specifically need to test multiple # different executors _TEST_CONFIG_ = { 'parsl': { 'strategy': 'None', 'executors': [ {'class': 'ThreadPoolExecutor', 'label': 'default', 'max_threads': 1}, {'class': '_TEST_EXECUTOR_', 'label': 'test', 'max_threads': 1} ] } } # Directs keys in the config whose values need to be objects to the module that # contains the class they need to instantiate PARSL_CHANNEL = 'parsl.channels' PARSL_DATA_PROVIDER = 'parsl.data_provider' PARSL_EXECUTOR = 'parsl.executors' PARSL_LAUNCHER = 'parsl.launchers' PARSL_MONITORING = 'parsl.monitoring' PARSL_PROVIDER = 'parsl.providers' module_paths = { 'channel': PARSL_CHANNEL, 'channels': PARSL_CHANNEL, 'data_provider': PARSL_DATA_PROVIDER, 'data_providers': PARSL_DATA_PROVIDER, 'executor': PARSL_EXECUTOR, 'executors': PARSL_EXECUTOR, 'launcher': PARSL_LAUNCHER, 'launchers': PARSL_LAUNCHER, 'monitoring': PARSL_MONITORING, 'provider': PARSL_PROVIDER, 'providers': PARSL_PROVIDER } def _setup_parallel(): """Sets the parsl config and action executor mapping to the values set on the thread local """ parallel_config = PARALLEL_CONFIG.parallel_config mapping = PARALLEL_CONFIG.action_executor_mapping # If we are running tests, get the test config not the normal one if os.environ.get('QIIMETEST') is not None and parallel_config is None: _parallel_config, _mapping = get_config_from_dict(_TEST_CONFIG_) else: # If they did not already supply a config, we get the vendored one if parallel_config is None: config_fp = _get_vendored_config() # If we are not in a conda environment, we may not get an fp back # (because the vendored fp uses the conda prefix), so we load from # the vendored dict. Otherwise we load from the vendored file if config_fp is not None: _parallel_config, _mapping = get_config_from_file(config_fp) else: _parallel_config, _mapping = \ get_config_from_dict(VENDORED_CONFIG) # If they did not supply a parallel_config, set the vendored one if parallel_config is None: PARALLEL_CONFIG.parallel_config = _parallel_config # If they did not supply a mapping, set the vendored one if mapping is {}: PARALLEL_CONFIG.action_executor_mapping = _mapping PARALLEL_CONFIG.dfk = parsl.load(PARALLEL_CONFIG.parallel_config) def _cleanup_parallel(): """Ask parsl to cleanup and then remove the currently active dfk """ PARALLEL_CONFIG.dfk.cleanup() parsl.clear() def get_config_from_file(config_fp): """Takes a config filepath and determines if the file exists and if so if it contains parsl config info. 
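
    For illustration, a minimal ``qiime2_config.toml`` could look like the
    following sketch (values are examples only; see ``VENDORED_CONFIG`` in
    this module for the vendored defaults)::

        [parsl]
        strategy = "None"

        [parsl.executor_mapping]
        some_action = "default"

        [[parsl.executors]]
        class = "ThreadPoolExecutor"
        label = "default"
        max_threads = 7
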
""" with open(config_fp, 'r') as fh: # After parsing the file tomlkit has the data wrapped in its own # proprietary classes. Unwrap recursively turns these classes into # Python built-ins # # ex: tomlkit.items.Int -> int # # issue caused by this wrapping # https://github.com/Parsl/parsl/issues/3027 config_dict = tomlkit.load(fh).unwrap() return get_config_from_dict(config_dict) def get_config_from_dict(config_dict): parallel_config_dict = config_dict.get('parsl') mapping = parallel_config_dict.pop('executor_mapping', {}) processed_parallel_config_dict = _process_config(parallel_config_dict) if processed_parallel_config_dict != {}: parallel_config = parsl.Config(**processed_parallel_config_dict) else: parallel_config = None return parallel_config, mapping def _get_vendored_config(): # 1. Check envvar config_fp = os.environ.get('QIIME2_CONFIG') if config_fp is None: # 2. Check in user writable location # appdirs.user_config_dir(appname='qiime2', author='...') if os.path.exists(fp_ := os.path.join( appdirs.user_config_dir('qiime2'), 'qiime2_config.toml')): config_fp = fp_ # 3. Check in admin writable location # /etc/ # site_config_dir # appdirs.site_config_dir(appname='qiime2, author='...') elif os.path.exists(fp_ := os.path.join( appdirs.site_config_dir('qiime2'), 'qiime2_config.toml')): config_fp = fp_ # NOTE: These next two are dependent on us being in a conda environment # 4. Check in conda env # ~/miniconda3/env/{env_name}/conf elif CONDA_PREFIX != '' and os.path.exists(fp_ := VENDORED_FP): config_fp = fp_ # 5. Write the vendored config to the vendored location and use # that elif CONDA_PREFIX != '': with open(VENDORED_FP, 'w') as fh: tomlkit.dump(VENDORED_CONFIG, fh) config_fp = VENDORED_FP return config_fp def _process_config(config_dict): """Takes a path to a toml file describing a parsl.Config object and parses it into a dictionary of kwargs that can be used to instantiate a parsl.Config object. """ config_kwargs = {} for key, value in config_dict.items(): # Parsl likes the string 'none' as opposed to None or 'None' if isinstance(value, str) and value.lower() == 'none': config_kwargs[key] = value.lower() # We have a list of values elif isinstance(value, list): config_kwargs[key] = [] for item in value: config_kwargs[key].append(_process_key(key, item)) # We have a single value else: config_kwargs[key] = _process_key(key, config_dict[key]) return config_kwargs def _process_key(key, value): """Takes a key given in the parsl config file and turns its value into the correct data type or class instance to be used in instantiating a parsl.Config object. """ # Our key points to a list if isinstance(value, list): processed_value = [] for item in value: processed_value.append(_process_key(key, item)) return processed_value # Our key needs to point to some object. 
elif key in module_paths: # Get the module our class is from module = importlib.import_module(module_paths[key]) _type = value['class'] if _type == '_TEST_EXECUTOR_': # Only used for tests cls = _TEST_EXECUTOR_ else: # Get the class we need to instantiate cls = getattr(module, value['class']) # Get the kwargs we need to pass to the class constructor kwargs = {} for k, v in value.items(): # We already handled this key if k != 'class': kwargs[k] = _process_key(k, v) # Instantiate the class return cls(**kwargs) # Our key points to primitive data else: return value class ParallelConfig(): def __init__(self, parallel_config=None, action_executor_mapping={}): """Tell QIIME 2 how to parsl from the Python API action_executor_mapping: maps actions to executors. All unmapped actions will be run on the default executor. We check which executor a given action is supposed to use when we get ready to run the action, so errors will only occur if an action that is being run in a given QIIME 2 invocation has been mapped to an executor that does not exist parallel_config: Specifies which executors should be created and how they should be created. If this is None, it will use the default config. """ self.parallel_config = parallel_config self.action_executor_mapping = action_executor_mapping def __enter__(self): """Set this to be our Parsl config on the current thread local """ if PARALLEL_CONFIG.parallel_config is not None: raise ValueError('ParallelConfig already loaded, cannot nest ' 'ParallelConfigs') PARALLEL_CONFIG.parallel_config = self.parallel_config PARALLEL_CONFIG.action_executor_mapping = self.action_executor_mapping _setup_parallel() def __exit__(self, *args): """Set our Parsl config back to whatever it was before this one """ _cleanup_parallel() PARALLEL_CONFIG.dfk = None PARALLEL_CONFIG.parallel_config = None PARALLEL_CONFIG.action_executor_mapping = {} def _check_env(cls): if 'QIIMETEST' not in os.environ: raise ValueError( f"Do not instantiate the class '{cls}' when not testing") class _MASK_CONDA_ENV_(): """Used to test config loading behavior when outside of a conda environment """ def __init__(self): _check_env(self.__class__) def __enter__(self): global CONDA_PREFIX, VENDORED_FP self.old_prefix = CONDA_PREFIX self.old_fp = VENDORED_FP CONDA_PREFIX = '' VENDORED_FP = None def __exit__(self, *args): global CONDA_PREFIX, VENDORED_FP CONDA_PREFIX = self.old_prefix VENDORED_FP = self.old_fp class _TEST_EXECUTOR_(parsl.executors.threads.ThreadPoolExecutor): """We needed multiple kinds of executor to ensure we were mapping things correctly, but the HighThroughputExecutor was leaking sockets, so we avoid creating those during the tests because so many sockets were being opened that we were getting "Too many open files" errors, so this gets used as the second executor type.""" def __init__(self, *args, **kwargs): _check_env(self.__class__) super(_TEST_EXECUTOR_, self).__init__(*args, **kwargs) qiime2-2024.5.0/qiime2/sdk/plugin_manager.py000066400000000000000000000360041462552636000204570ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import collections import os import pkg_resources import enum import qiime2.core.type from qiime2.core.format import FormatBase from qiime2.plugin.model import SingleFileDirectoryFormatBase from qiime2.core.validate import ValidationObject from qiime2.sdk.util import parse_type from qiime2.core.type import is_semantic_type class GetFormatFilters(enum.Flag): EXPORTABLE = enum.auto() IMPORTABLE = enum.auto() class UninitializedPluginManagerError(Exception): pass class PluginManager: entry_point_group = 'qiime2.plugins' __instance = None @classmethod def iter_entry_points(cls): """Yield QIIME 2 plugin entry points. If the QIIMETEST environment variable is set, only the framework testing plugin entry point (`dummy-plugin`) will be yielded. Otherwise, all available plugin entry points (excluding `dummy-plugin`) will be yielded. """ for entry_point in pkg_resources.iter_entry_points( group=cls.entry_point_group): if 'QIIMETEST' in os.environ: if entry_point.name in ('dummy-plugin', 'other-plugin'): yield entry_point else: if entry_point.name not in ('dummy-plugin', 'other-plugin'): yield entry_point @classmethod def reuse_existing(cls): if cls.__instance is not None: return cls.__instance raise UninitializedPluginManagerError # This class is a singleton as it is slow to create, represents the # state of a qiime2 installation, and is needed *everywhere* def __new__(cls, add_plugins=True): if cls.__instance is None: self = super().__new__(cls) cls.__instance = self try: self._init(add_plugins=add_plugins) except Exception: cls.__instance = None raise else: if add_plugins is False: raise ValueError( 'PluginManager singleton already exists, cannot change ' 'current value for `add_plugins`.') return cls.__instance def forget_singleton(self): """Allows later instatiation of PluginManager to produce new object This is done by clearing class member which saves the instance. This will NOT invalidate or remove the object this method is called on. """ self.__class__.__instance = None def _init(self, add_plugins): self.plugins = {} self.type_fragments = {} self._plugin_by_id = {} self.semantic_types = {} self.transformers = collections.defaultdict(dict) self._reverse_transformers = collections.defaultdict(dict) self.formats = {} self.views = {} self.artifact_classes = {} self._ff_to_sfdf = {} self.validators = {} if add_plugins: # These are all dependent loops, each requires the loop above it to # be completed. 
for entry_point in self.iter_entry_points(): project_name = entry_point.dist.project_name package = entry_point.module_name.split('.')[0] plugin = entry_point.load() self.add_plugin(plugin, package, project_name, consistency_check=False) self._consistency_check() def _consistency_check(self): for semantic_type, validator_obj in self.validators.items(): validator_obj.assert_transformation_available( self.get_directory_format(semantic_type)) def add_plugin(self, plugin, package=None, project_name=None, consistency_check=True): self.plugins[plugin.name] = plugin self._plugin_by_id[plugin.id] = plugin if plugin.package is None: plugin.package = package if plugin.project_name is None: plugin.project_name = project_name # validate _after_ applying arguments if plugin.package is None: raise ValueError( 'No value specified for package - must provide a value for ' '`package` or set `plugin.package`.') if plugin.project_name is None: raise ValueError( 'No value specified for project_name - must proved a value ' 'for `project_name` or set `plugin.project_name`.') self._integrate_plugin(plugin) plugin.freeze() if consistency_check is True: return self._consistency_check() def get_plugin(self, *, id=None, name=None): if id is None and name is None: raise ValueError("No plugin requested.") elif id is not None: try: return self._plugin_by_id[id] except KeyError: raise KeyError('No plugin currently registered ' 'with id: "%s".' % (id,)) else: try: return self.plugins[name] except KeyError: raise KeyError('No plugin currently registered ' 'with name: "%s".' % (name,)) def _integrate_plugin(self, plugin): for type_name, type_record in plugin.type_fragments.items(): if type_name in self.type_fragments: conflicting_type_record = \ self.type_fragments[type_name] raise ValueError("Duplicate semantic type (%r) defined in" " plugins: %r and %r" % (type_name, type_record.plugin.name, conflicting_type_record.plugin.name)) self.type_fragments[type_name] = type_record for (input, output), transformer_record in plugin.transformers.items(): if output in self.transformers[input]: raise ValueError("Transformer from %r to %r already exists." % (input, output)) self.transformers[input][output] = transformer_record self._reverse_transformers[output][input] = transformer_record for name, record in plugin.views.items(): if name in self.views: raise NameError( "Duplicate view registration (%r) defined in plugins: %r" " and %r" % (name, record.plugin.name, self.formats[name].plugin.name) ) self.views[name] = record for name, record in plugin.formats.items(): fmt = record.format if issubclass( fmt, qiime2.plugin.model.SingleFileDirectoryFormatBase): if fmt.file.format in self._ff_to_sfdf.keys(): self._ff_to_sfdf[fmt.file.format].add(fmt) else: self._ff_to_sfdf[fmt.file.format] = {fmt} # TODO: remove this when `sniff` is removed if hasattr(fmt, 'sniff') and hasattr(fmt, '_validate_'): raise RuntimeError( 'Format %r registered in plugin %r defines sniff and' '_validate_ methods - only one is permitted.' % (name, record.plugin.name) ) self.formats[name] = record for name, record in plugin.artifact_classes.items(): if name in self.artifact_classes: raise NameError( "Duplicate artifact class registration (%r) defined in " "plugins %r and %r." 
% (name, record.plugin.name, self.artifact_classes[name].plugin.name) ) else: self.artifact_classes[name] = record for semantic_type, validation_object in plugin.validators.items(): if semantic_type not in self.validators: self.validators[semantic_type] = \ ValidationObject(semantic_type) self.validators[semantic_type].add_validation_object( validation_object) def get_semantic_types(self): result = {} for plugin in self.plugins.values(): result.update(plugin.artifact_classes) return result # TODO: Should plugin loading be transactional? i.e. if there's # something wrong, the entire plugin fails to load any piece, like a # databases rollback/commit def get_formats(self, *, filter=None, semantic_type=None): """ get_formats(self, *, filter=None, semantic_type=None) filter : enum | "IMPORTABLE" | "EXPORTABLE" filter is an enum integer that will be used to determine user input to output specified formats semantic_type : TypeExpression | String The semantic type is used to filter the formats associated with that specific semantic type This method will filter out the formats using the filter provided by the user and the semantic type. The return is a dictionary of filtered formats keyed on their string names. """ filter_map = {"IMPORTABLE": GetFormatFilters.IMPORTABLE, "EXPORTABLE": GetFormatFilters.EXPORTABLE} if filter is not None and not isinstance(filter, GetFormatFilters) \ and filter not in filter_map: raise ValueError( f"The provided format filter {filter} is not valid.") if isinstance(filter, str): filter = filter_map[filter] if semantic_type is None: formats = set(f.format for f in self.artifact_classes.values()) else: formats = set() if isinstance(semantic_type, str): semantic_type = parse_type(semantic_type, "semantic") if is_semantic_type(semantic_type): for artifact_class in self.artifact_classes.values(): if semantic_type <= artifact_class.semantic_type: formats.add(artifact_class.format) break if not formats: raise ValueError("No formats associated with the type " f"{semantic_type}.") else: raise ValueError(f"{semantic_type} is not a valid semantic " "type.") transformable_formats = set(formats) if filter is None or GetFormatFilters.IMPORTABLE in filter: transformable_formats.update( self._get_formats_helper(formats, self._reverse_transformers)) if filter is None or GetFormatFilters.EXPORTABLE in filter: transformable_formats.update( self._get_formats_helper(formats, self.transformers)) result_formats = {} for format_ in transformable_formats: format_ = format_.__name__ result_formats[format_] = self.formats[format_] return result_formats def _get_formats_helper(self, formats, transformer_dict): """ _get_formats_helper(self, formats, transformer_dict) formats : Set[DirectoryFormat] We are finding all formats that are one transformer away from formats in this set tranformer_dict : Dict[ str, Dict[str, TransformerReord]] The dictionary of transformers allows the method to get formats that are transformable from the given format This method creates a set utilizing the transformers dictionary and the formats set to get related formats for a specific format. 
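
        The return is a set containing the given formats together with every
        format reachable from them by a single registered transformer
        (including any associated single-file directory formats and their
        underlying file formats).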
""" query_set = set(formats) for format_ in formats: if issubclass(format_, SingleFileDirectoryFormatBase): if format_.file.format.__name__ in self.formats: query_set.add(format_.file.format) result_formats = set(query_set) for format_ in query_set: for transformed_format in transformer_dict[format_]: if issubclass(transformed_format, FormatBase): result_formats.add(transformed_format) if issubclass(transformed_format, SingleFileDirectoryFormatBase): result_formats.add(transformed_format.file.format) if transformed_format in self._ff_to_sfdf: result_formats.update( self._ff_to_sfdf[transformed_format]) return result_formats @property def type_formats(self): # self.type_formats was replaced with self.artifact_classes - this # property provides backward compatibility return list(self.artifact_classes.values()) @property def importable_formats(self): """Formats that are importable. A format is importable in a QIIME 2 deployment if it can be transformed into at least one of the canonical semantic type formats. """ return self.get_formats(filter=GetFormatFilters.IMPORTABLE) @property def exportable_formats(self): """Formats that are exportable. A format is exportable in a QIIME 2 deployment if it can be transformed from at least one of the canonical semantic type formats. """ return self.get_formats(filter=GetFormatFilters.EXPORTABLE) @property def importable_types(self): """Return set of concrete semantic types that are importable. A concrete semantic type is importable if it has an associated directory format. """ return self.get_semantic_types() def get_directory_format(self, semantic_type): if not qiime2.core.type.is_semantic_type(semantic_type): raise TypeError( "Must provide a semantic type via `semantic_type`, not %r" % semantic_type) # TODO: ideally we could just lookup semantic_type in # self.artifact_classes but properties get in the way. Is there a way # to strip properties so this could be simplified to return # self.artifact_classes[semantic_type] while catching a KeyError? for artifact_class_record in self.artifact_classes.values(): if semantic_type <= artifact_class_record.semantic_type: return artifact_class_record.format # TODO: We need a good way to tell if a semantic type is registered but # does not have a directory format. The previous error was causing a # lot of confusion. # # Reference: https://github.com/qiime2/qiime2/issues/514 raise TypeError( "Semantic type %r is invalid, either because it doesn't have a " "compatible directory format, or because it's not registered." % semantic_type) qiime2-2024.5.0/qiime2/sdk/proxy.py000066400000000000000000000177311462552636000166560ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import qiime2.core.transform as transform from qiime2.core.type.util import is_visualization_type, is_collection_type class Proxy: """Base class to indicate that a given class that inherits from it is a proxy. 
Also implements some generic functionality """ def __eq__(self, other): return self.result() == other.result() def __ne__(self, other): return not (self == other) class ProxyResult(Proxy): def __init__(self, future, selector, qiime_type=None): """We have a future that represents the results of some QIIME 2 action, and we have a selector indicating specifically which result we want """ self._future_ = future self._selector_ = selector self._qiime_type_ = qiime_type def __repr__(self): if self._qiime_type_ is None: return f'<{self.__class__.__name__.lower()}: Unknown Type ' \ f'{object.__repr__(self)}>' else: return f'<{self.__class__.__name__.lower()}: {self.type}>' def __hash__(self): return hash(self.uuid) @property def _archiver(self): return self.result()._archiver @property def type(self): if self._qiime_type_ is not None: return self._qiime_type_ return self.result().type @property def uuid(self): return self._archiver.uuid @property def format(self): from qiime2.sdk import PluginManager pm = PluginManager() return pm.get_directory_format(self.type) @property def citations(self): return self._archiver.citations def result(self): return self._get_element_(self._future_.result()) def _get_element_(self, results): """Get the result we want off of the future we have """ return getattr(results, self._selector_) def export_data(self, output_dir): return self.result().export_data(output_dir) def save(self, filepath, ext=None): """Blocks then calls save on the result. """ return self.result().save(filepath, ext=ext) def validate(self, level=NotImplemented): return self.result().validate(level=level) class ProxyArtifact(ProxyResult): """This represents a future Artifact that is being returned by a Parsl app """ def view(self, type): """If we want to view the result we need the future to be resolved """ return self._get_element_(self._future_.result()).view(type) def has_metadata(self): from qiime2 import Metadata from_type = transform.ModelType.from_view_type(self.format) to_type = transform.ModelType.from_view_type(Metadata) return from_type.has_transformation(to_type) def validate(self, level='max'): self.result().validate(level=level) class ProxyVisualization(ProxyResult): """This represents a future Visualization that is being returned by a Parsl app """ def get_index_paths(self, relative=True): return self.result().get_index_paths(relative=relative) class ProxyResultCollection(Proxy): def __init__(self, future, selector, qiime_type=None): self._future_ = future self._selector_ = selector self._qiime_type_ = qiime_type def __len__(self): return len(self.collection) def __iter__(self): yield self.collection.__iter__() def __setitem__(self, key, item): self.collection[key] = item def __getitem__(self, key): return self.collection[key] def __repr__(self): return f"<{self.__class__.__name__.lower()}: {self.type}>" @property def type(self): # I'm not a huge fan of the fact that this may or may not need to # block. If this is a return from an action (which it basically # always will be) we don't need to block for type. Otherwise we do. if self._qiime_type_ is not None: return self._qiime_type_ return self.result().type @property def collection(self): return self.result().collection def save(self, directory): """Blocks then calls save on the result. 
""" return self.result().save(directory) def keys(self): return self.collection.keys() def values(self): return self.collection.values() def items(self): return self.collection.items() def _get_element_(self, results): """Get the result we want off of the future we have """ return getattr(results, self._selector_) def result(self): return self._get_element_(self._future_.result()) class ProxyResults(Proxy): """This represents future results that are being returned by a Parsl app """ def __init__(self, future, signature): """We have the future results and the outputs portion of the signature of the action creating the results """ self._future_ = future self._signature_ = signature def __iter__(self): """Give us a ProxyArtifact for each result in the future """ for s in self._signature_: yield self._create_proxy(s) def __getattr__(self, attr): """Get a particular ProxyArtifact out of the future """ return self._create_proxy(attr) def __getitem__(self, index): return self._create_proxy(list(self._signature_.keys())[index]) def __repr__(self): lines = [] lines.append('%s (name = value)' % self.__class__.__name__) lines.append('') max_len = -1 for field in self._signature_: if len(field) > max_len: max_len = len(field) for field, value in zip(self._signature_, self): field_padding = ' ' * (max_len - len(field)) lines.append('%s%s = %r' % (field, field_padding, value)) max_len = -1 for line in lines: if len(line) > max_len: max_len = len(line) lines[1] = '-' * max_len return '\n'.join(lines) def __eq__(self, other): """ Overriding the one on Proxy because we have _result not result """ return self._result() == other._result() def _asdict(self): return self.result()._asdict() def _result(self): """ If you are calling an action in a try-except block in a pipeline, you need to call this method on the Results object returned by the action. This is because if the Pipeline was executed with parsl, we need to block on the action in the try-except to ensure we get the result and raise the potential exception while we are still inside of the try-except. Otherwise we would just get the exception whenever the future resolved which would likely be outside of the try-except, so the exception would be raised and not caught. If you call an action in the Python API using parsl inside of a context manager (a withed in Cache for instance) you also must call this method there to ensure you get you don't start using a different cache/pool/whatever before your future resolves. """ return self._future_.result() def _create_proxy(self, selector): qiime_type = self._signature_[selector].qiime_type if is_collection_type(qiime_type): return ProxyResultCollection( self._future_, selector, qiime_type) elif is_visualization_type(qiime_type): return ProxyVisualization( self._future_, selector, qiime_type) return ProxyArtifact(self._future_, selector, qiime_type) qiime2-2024.5.0/qiime2/sdk/result.py000066400000000000000000000552021462552636000170060ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import os import shutil import warnings import tempfile import collections import distutils.dir_util import pathlib from typing import Union, get_args, get_origin import qiime2.metadata import qiime2.plugin import qiime2.sdk import qiime2.core.type import qiime2.core.transform as transform import qiime2.core.archive as archive import qiime2.plugin.model as model import qiime2.core.util as util import qiime2.core.exceptions as exceptions # Note: Result, Artifact, and Visualization classes are in this file to avoid # circular dependencies between Result and its subclasses. Result is tightly # coupled to Artifact and Visualization because it is a base class and a # factory, so having the classes in the same file helps make this coupling # explicit. ResultMetadata = collections.namedtuple('ResultMetadata', ['uuid', 'type', 'format']) class Result: """Base class for QIIME 2 result classes (Artifact and Visualization). This class is not intended to be instantiated. Instead, it acts as a public factory and namespace for interacting with Artifacts and Visualizations in a generic way. It also acts as a base class for code reuse and provides an API shared by Artifact and Visualization. """ # Subclasses must override to provide a file extension. extension = None @classmethod def _is_valid_type(cls, type_): """Subclasses should override this method.""" return True @classmethod def peek(cls, filepath): return ResultMetadata(*archive.Archiver.peek(filepath)) @classmethod def extract(cls, filepath, output_dir): """Unzip contents of Artifacts and Visualizations.""" return archive.Archiver.extract(filepath, output_dir) @classmethod def load(cls, filepath): """Factory for loading Artifacts and Visualizations.""" from qiime2.core.cache import get_cache # Check if the data is already in the cache (if the uuid is in # cache.data) and load it from the cache if it is. Avoids unzipping the # qza again if we already have it. cache = get_cache() peek = cls.peek(filepath) archiver = cache._load_uuid(peek.uuid) if not archiver: try: archiver = archive.Archiver.load(filepath) except OSError as e: if e.errno == 28: temp = tempfile.tempdir raise ValueError(f'There was not enough space left on ' f'{temp!r} to extract the artifact ' f'{filepath!r}. (Try setting $TMPDIR to ' 'a directory with more space, or ' f'increasing the size of {temp!r})') else: raise e if Artifact._is_valid_type(archiver.type): result = Artifact.__new__(Artifact) elif Visualization._is_valid_type(archiver.type): result = Visualization.__new__(Visualization) else: raise TypeError( "Cannot load filepath %r into an Artifact or Visualization " "because type %r is not supported." % (filepath, archiver.type)) if type(result) is not cls and cls is not Result: raise TypeError( "Attempting to load %s with `%s.load`. Use `%s.load` instead." % (type(result).__name__, cls.__name__, type(result).__name__)) result._archiver = archiver return result @classmethod def _from_archiver(cls, archiver): if Artifact._is_valid_type(archiver.type): result = Artifact.__new__(Artifact) elif Visualization._is_valid_type(archiver.type): result = Visualization.__new__(Visualization) else: raise TypeError( "Cannot load filepath %r into an Artifact or Visualization " "because type %r is not supported." % (archiver.path, archiver.type)) if type(result) is not cls and cls is not Result: raise TypeError( "Attempting to load %s with `%s.load`. Use `%s.load` instead." 
% (type(result).__name__, cls.__name__, type(result).__name__)) result._archiver = archiver return result @property def type(self): return self._archiver.type @property def uuid(self): return self._archiver.uuid @property def format(self): return self._archiver.format @property def citations(self): return self._archiver.citations def __init__(self): raise NotImplementedError( "%(classname)s constructor is private, use `%(classname)s.load`, " "`%(classname)s.peek`, or `%(classname)s.extract`." % {'classname': self.__class__.__name__}) def __new__(cls): result = object.__new__(cls) result._archiver = None return result def __repr__(self): return ("<%s: %r uuid: %s>" % (self.__class__.__name__.lower(), self.type, self.uuid)) def __hash__(self): return hash(self.uuid) def __eq__(self, other): # Checking the UUID is mostly sufficient but requiring an exact type # match makes it safer in case `other` is a subclass or a completely # different type that happens to have a `.uuid` property. We want to # ensure (as best as we can) that the UUIDs we are comparing are linked # to the same type of QIIME 2 object. return ( type(self) is type(other) and self.uuid == other.uuid ) def __ne__(self, other): return not (self == other) def export_data(self, output_dir): distutils.dir_util.copy_tree( str(self._archiver.data_dir), str(output_dir)) # Return None for now, although future implementations that include # format tranformations may return the invoked transformers return None @property def _destructor(self): return self._archiver._destructor def save(self, filepath, ext=None): """Save to a file. Parameters ---------- filepath : str Path to save file at. extension : str Preferred file extension (.qza, .qzv, .txt, etc). If no preferred extension input is included, Artifact extension will default to .qza and Visualization extension will default to .qzv. Including a period in the extension is optional, and any additional periods delimiting the filepath and the extension will be reduced to a single period. Returns ------- str Filepath and extension (if provided) that the file was saved to. See Also -------- load """ if ext is None: ext = self.extension # This accounts for edge cases in the filename extension # and ensures that there is only a single period in the ext. # Caste to str incase we received a pathlib.Path or similar filepath = str(filepath) filepath = filepath.rstrip('.') ext = '.' + ext.lstrip('.') if not filepath.endswith(ext): filepath += ext self._archiver.save(filepath) return filepath def _alias(self, provenance_capture): def clone_original(into): # directory is empty, this function is meant to fix that, so we # can rmdir so that copytree is happy into.rmdir() shutil.copytree(str(self._archiver.data_dir), str(into), copy_function=os.link) # Use hardlinks cls = type(self) alias = cls.__new__(cls) alias._archiver = archive.Archiver.from_data( self.type, self.format, clone_original, provenance_capture) return alias def validate(self, level=NotImplemented): diff = self._archiver.validate_checksums() if diff.changed or diff.added or diff.removed: error = "" if diff.added: error += "Unrecognized files:\n" for key in diff.added: error += " - %r\n" % key if diff.removed: error += "Missing files:\n" for key in diff.removed: error += " - %r\n" % key if diff.changed: error += "Changed files:\n" for (key, (exp, obs)) in diff.changed.items(): error += " - %r: %s -> %s\n" % (key, exp, obs) raise exceptions.ValidationError(error) def result(self): """ Noop to provide standardized interface with ProxyResult. 
""" return self class Artifact(Result): extension = '.qza' @classmethod def _is_valid_type(cls, type_): if qiime2.core.type.is_semantic_type(type_) and type_.is_concrete(): return True else: return False @classmethod def import_data(cls, type, view, view_type=None, validate_level='max'): type_, type = type, __builtins__['type'] if validate_level not in ('min', 'max'): raise ValueError("Expected 'min' or 'max' for `validate_level`.") is_format = False if isinstance(type_, str): type_ = qiime2.sdk.parse_type(type_) if isinstance(view_type, str): view_type = qiime2.sdk.parse_format(view_type) is_format = True if view_type is None: if type(view) is str or isinstance(view, pathlib.PurePath): is_format = True pm = qiime2.sdk.PluginManager() output_dir_fmt = pm.get_directory_format(type_) if pathlib.Path(view).is_file(): if not issubclass(output_dir_fmt, model.SingleFileDirectoryFormatBase): raise qiime2.plugin.ValidationError( "Importing %r requires a directory, not %s" % (output_dir_fmt.__name__, view)) view_type = output_dir_fmt.file.format else: view_type = output_dir_fmt else: view_type = type(view) format_ = None md5sums = None if is_format: path = pathlib.Path(view) if path.is_file(): md5sums = {path.name: util.md5sum(path)} elif path.is_dir(): md5sums = util.md5sum_directory(path) else: raise qiime2.plugin.ValidationError( "Path '%s' does not exist." % path) format_ = view_type provenance_capture = archive.ImportProvenanceCapture(format_, md5sums) return cls._from_view(type_, view, view_type, provenance_capture, validate_level=validate_level) @classmethod def _from_view(cls, type, view, view_type, provenance_capture, validate_level='min'): type_raw = type if isinstance(type, str): type = qiime2.sdk.parse_type(type) if not cls._is_valid_type(type): raise TypeError( "An artifact requires a concrete semantic type, not type %r." % type) pm = qiime2.sdk.PluginManager() output_dir_fmt = pm.get_directory_format(type) if view_type is None: # lookup default format for the type view_type = output_dir_fmt from_type = transform.ModelType.from_view_type(view_type) to_type = transform.ModelType.from_view_type(output_dir_fmt) recorder = provenance_capture.transformation_recorder('return') transformation = from_type.make_transformation(to_type, recorder=recorder) result = transformation(view, validate_level) if type_raw in pm.validators: validation_object = pm.validators[type] validation_object(data=result, level=validate_level) artifact = cls.__new__(cls) artifact._archiver = archive.Archiver.from_data( type, output_dir_fmt, data_initializer=result.path._move_or_copy, provenance_capture=provenance_capture) return artifact def view(self, view_type): return self._view(view_type) def _view(self, view_type, recorder=None): if view_type is qiime2.Metadata and not self.has_metadata(): raise TypeError( "Artifact %r cannot be viewed as QIIME 2 Metadata." 
% self) from_type = transform.ModelType.from_view_type(self.format) if isinstance(get_origin(view_type), type(Union)): transformation = None for arg in get_args(view_type): to_type = transform.ModelType.from_view_type(arg) try: transformation = from_type.make_transformation( to_type, recorder=recorder) if transformation: break except Exception as e: if str(e).startswith("No transformation from"): continue else: raise e if not transformation: raise Exception( "No transformation into either of %s was found" % ", ".join([str(x) for x in view_type.__args__]) ) else: to_type = transform.ModelType.from_view_type(view_type) transformation = from_type.make_transformation(to_type, recorder=recorder) result = transformation(self._archiver.data_dir) if view_type is qiime2.Metadata: result._add_artifacts([self]) to_type.set_user_owned(result, True) return result def has_metadata(self): """ Checks for metadata within an artifact Returns ------- bool True if the artifact has metadata (i.e. can be viewed as ``qiime2.Metadata``), False otherwise. """ from_type = transform.ModelType.from_view_type(self.format) to_type = transform.ModelType.from_view_type(qiime2.Metadata) return from_type.has_transformation(to_type) def validate(self, level='max'): """ Validates the data contents of an artifact Raises ------ ValidationError If the artifact is invalid at the specified level of validation. """ super().validate() self.format.validate(self.view(self.format), level) class Visualization(Result): extension = '.qzv' @classmethod def _is_valid_type(cls, type_): return type_ == qiime2.core.type.Visualization @classmethod def _from_data_dir(cls, data_dir, provenance_capture): # shutil.copytree doesn't allow the destination directory to exist. def data_initializer(destination): return distutils.dir_util.copy_tree( str(data_dir), str(destination)) viz = cls.__new__(cls) viz._archiver = archive.Archiver.from_data( qiime2.core.type.Visualization, None, data_initializer=data_initializer, provenance_capture=provenance_capture) return viz def get_index_paths(self, relative=True): result = {} for abspath in self._archiver.data_dir.iterdir(): data_path = str(abspath.relative_to(self._archiver.data_dir)) if data_path.startswith('index.'): relpath = abspath.relative_to(self._archiver.root_dir) ext = relpath.suffix[1:] if ext in result: raise ValueError( "Multiple index files identified with %s " "extension (%s, %s). This is currently " "unsupported." % (ext, result[ext], relpath)) else: result[ext] = str(relpath) if relative else str(abspath) return result def _repr_html_(self): from qiime2.jupyter import make_html return make_html(str(self._archiver.path)) class ResultCollection: @classmethod def load(cls, directory): """ Determines how to load a Collection of QIIME 2 Artifacts in a directory and dispatches to helpers """ if not os.path.isdir(directory): raise ValueError( f"Given filepath '{directory}' is not a directory") order_fp = os.path.join(directory, '.order') if os.path.isfile(order_fp): collection = cls._load_ordered(directory, order_fp) else: warnings.warn(f"The directory '{directory}' does not contain a " ".order file. 
The files will be read into the " "collection in the order the filesystem provides " "them in.") collection = cls._load_unordered(directory) return collection @classmethod def _load_ordered(cls, directory, order_fp): collection = cls() with open(order_fp, 'r') as order_fh: for result_name in order_fh.read().splitlines(): result_fp = cls._get_result_fp(directory, result_name) collection[result_name] = Result.load(result_fp) return collection @classmethod def _load_unordered(cls, directory): collection = cls() for result in os.listdir(directory): result_fp = os.path.join(directory, result) result_name = result.rstrip('.qza') result_name = result_name.rstrip('.qzv') collection[result_name] = Result.load(result_fp) return collection @classmethod def _get_result_fp(cls, directory, result_name): result_fp = os.path.join(directory, result_name) # Check if thing in .order file exists and if not try it with .qza at # the end and if not try it with .qzv at the end if not os.path.isfile(result_fp): result_fp += '.qza' if not os.path.isfile(result_fp): # Get rid of the trailing .qza before adding .qzv result_fp = result_fp[:-4] result_fp += '.qzv' if not os.path.isfile(result_fp): raise ValueError( f"The Result '{result_name}' is referenced in the " "order file but does not exist in the directory.") return result_fp def __init__(self, collection=None): if collection is None: self.collection = {} elif isinstance(collection, dict): qiime2.sdk.util.validate_result_collection_keys(*collection.keys()) self.collection = collection else: self.collection = {str(k): v for k, v in enumerate(collection)} def __contains__(self, item): return item in self.collection def __eq__(self, other): if isinstance(other, dict): return self.collection == other elif isinstance(other, ResultCollection): return self.collection == other.collection else: raise TypeError(f"Equality between '{type(other)}' and " "ResultCollection is undefined.") def __len__(self): return len(self.collection) def __iter__(self): yield self.collection.__iter__() def __setitem__(self, key, item): qiime2.sdk.util.validate_result_collection_keys(key) self.collection[key] = item def __getitem__(self, key): return self.collection[key] def __repr__(self): return f"<{self.__class__.__name__.lower()}: {self.type}>" @property def type(self): inner_type = qiime2.core.type.grammar.UnionExp( v.type for v in self.collection.values()).normalize() return qiime2.core.type.Collection[inner_type] @property def extension(self): if str(self.type) == 'Collection[Visualization]': return '.qzv' return '.qza' def save(self, directory): """Saves a collection of QIIME 2 Results into a given directory with an order file. NOTE: The directory given must not exist """ if os.path.exists(directory): raise ValueError(f"The given directory '{directory}' already " "exists. A new directory must be given to save " "the collection to.") os.makedirs(directory) order_string = '' for name, result in self.collection.items(): result_fp = os.path.join(directory, name) result.save(result_fp) order_string += f'{name}\n' with open(os.path.join(directory, '.order'), 'w') as fh: fh.write(order_string) # Do this to give us a unified API with Result.save return directory def save_unordered(self, directory): """Saves a collection of QIIME 2 Results into a given directory without an order file. 
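        For example (hypothetical name), ``collection.save_unordered('out')``
        writes only the member .qza/.qzv files into ``out`` and returns the
        directory path; no ``.order`` file is created.
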
This is used by q2galaxy where an order file will be interpreted as another dataset in the collection which is not desirable NOTE: The directory given must not exist """ if os.path.exists(directory): raise ValueError(f"The given directory '{directory}' already " "exists. A new directory must be given to save " "the collection to.") os.makedirs(directory) for name, result in self.collection.items(): result_fp = os.path.join(directory, name) result.save(result_fp) # Do this to give us a unified API with Result.save return directory def keys(self): return self.collection.keys() def values(self): return self.collection.values() def items(self): return self.collection.items() def validate(self, view, level=None): for result in self.values(): result.validate(view, level) def result(self): """ Noop to provide standardized interface with ProxyResultCollection. """ return self qiime2-2024.5.0/qiime2/sdk/results.py000066400000000000000000000111251462552636000171650ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- # This class provides an interface similar to a `namedtuple` type. We can't use # `namedtuple` directly because each `Action` will return a `Results` object # with `Action`-specific fields (`Action.signature` determines the fields). # Dynamically-defined namedtuple types aren't pickleable, which is necessary # for `asynchronous`. They aren't pickleable because the namedtuple type must # be accessible as a module global, but this global type would be redefined # each time an `Action` is instantiated. class Results(tuple): """Tuple class representing the named results of an ``Action``. Provides an interface similar to a ``namedtuple`` type (e.g. fields are accessible as attributes). Users should not need to instantiate this class directly. """ # Subclassing `tuple` requires `__new__` override. def __new__(cls, fields, values): fields = tuple(fields) values = tuple(values) if len(fields) != len(values): raise ValueError( "`fields` and `values` must have matching length: %d != %d" % (len(fields), len(values))) # Create tuple instance, store fields, and create read-only attributes # for each field name. Fields must be stored for pickling/copying (see # `__getnewargs__`). # # Note: setting field names as attributes allows for tab-completion in # interactive contexts! Using `__getattr__` does not support this. self = super().__new__(cls, values) # Must set attributes this way because `__setattr__` prevents # setting directly (necessary for immutability). object.__setattr__(self, '_fields', fields) # Attach field names as instance attributes. for field, value in zip(fields, values): object.__setattr__(self, field, value) return self def __getnewargs__(self): """Arguments to pass to `__new__`. Used by copy and pickle.""" # `tuple(self)` returns `values`. return self._fields, tuple(self) # `__setattr__` and `__delattr__` must be defined to prevent users from # creating or deleting attributes after this class has been instantiated. # `tuple` and `namedtuple` do not have this problem because they are # immutable (`__slots__ = ()`). We cannot make this class immutable because # we cannot define nonempty `__slots__` when subclassing `tuple`, and we # need the `_fields` attribute. 
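    # (Because `_fields` must live as a regular instance attribute, instances
    # carry a `__dict__`, so plain attribute assignment would otherwise
    # succeed silently.)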
We work around this issue by disallowing # setting and deleting attributes. The error messages here match those # raised by `namedtuple` in Python 3.5.1. def __setattr__(self, name, value): raise AttributeError("can't set attribute") def __delattr__(self, name): raise AttributeError("can't delete attribute") def __eq__(self, other): # Results with different field names should not compare equal, even if # their values are equal. return ( isinstance(other, Results) and self._fields == other._fields and tuple(self) == tuple(other) ) def __ne__(self, other): return not (self == other) def __repr__(self): # It is possible to provide an evalable repr but this type of repr does # not make the field/value pairs apparent. If the constructor accepted # **kwargs, the order of field/value pairs would be lost. lines = [] lines.append('%s (name = value)' % self.__class__.__name__) lines.append('') max_len = -1 for field in self._fields: if len(field) > max_len: max_len = len(field) for field, value in zip(self._fields, self): field_padding = ' ' * (max_len - len(field)) lines.append('%s%s = %r' % (field, field_padding, value)) max_len = -1 for line in lines: if len(line) > max_len: max_len = len(line) lines[1] = '-' * max_len return '\n'.join(lines) def _asdict(self): return dict(zip(self._fields, self)) def _result(self): """ This exists to provide a standardized interface with ProxyResults. Check the 'result' method on ProxyResults for a full explanation. """ return self qiime2-2024.5.0/qiime2/sdk/tests/000077500000000000000000000000001462552636000162545ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/sdk/tests/__init__.py000066400000000000000000000005351462552636000203700ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- qiime2-2024.5.0/qiime2/sdk/tests/data/000077500000000000000000000000001462552636000171655ustar00rootroot00000000000000qiime2-2024.5.0/qiime2/sdk/tests/data/complex_config.toml000066400000000000000000000007461462552636000230650ustar00rootroot00000000000000[parsl] strategy = "None" [[parsl.executors]] class = "HighThroughputExecutor" label = "default" max_workers = 10 [parsl.executors.provider] class = "SlurmProvider" [parsl.executors.provider.launcher] class = "SrunLauncher" [parsl.executors.provider.channel] class = "LocalChannel" [[parsl.executors]] class = "HighThroughputExecutor" label = "other" max_workers = 10 [parsl.executors.provider] class = "AdHocProvider" [[parsl.executors.provider.channels]] class = "LocalChannel"qiime2-2024.5.0/qiime2/sdk/tests/data/intsequence-fail-max-validation.txt000066400000000000000000000000141462552636000260700ustar00rootroot000000000000001 2 3 4 5 8 qiime2-2024.5.0/qiime2/sdk/tests/data/mapping_config.toml000066400000000000000000000003541462552636000230440ustar00rootroot00000000000000[parsl] strategy = "None" [[parsl.executors]] class = "ThreadPoolExecutor" label = "default" max_threads = 1 [[parsl.executors]] class = "_TEST_EXECUTOR_" label = "test" max_threads = 1 [parsl.executor_mapping] list_of_ints = "test" qiime2-2024.5.0/qiime2/sdk/tests/data/mapping_only_config.toml000066400000000000000000000000701462552636000241000ustar00rootroot00000000000000[parsl] [parsl.executor_mapping] list_of_ints = "test" qiime2-2024.5.0/qiime2/sdk/tests/data/singleint.txt000066400000000000000000000000021462552636000217120ustar00rootroot0000000000000042qiime2-2024.5.0/qiime2/sdk/tests/data/test_config.toml000066400000000000000000000002741462552636000223710ustar00rootroot00000000000000[parsl] strategy = "None" [[parsl.executors]] class = "ThreadPoolExecutor" label = "default" max_threads = 1 [[parsl.executors]] class = "_TEST_EXECUTOR_" label = "test" max_threads = 1 qiime2-2024.5.0/qiime2/sdk/tests/test_action.py000066400000000000000000000166241462552636000211530ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import os import collections import tempfile import unittest import warnings import qiime2.core.archive as archive from qiime2.core.testing.util import get_dummy_plugin from qiime2.plugin.testing import TestPluginBase from qiime2.sdk import Artifact, Visualization from qiime2.core.testing.type import IntSequence1, IntSequence2, SingleInt from qiime2.core.testing.visualizer import most_common_viz from qiime2 import Metadata from qiime2.metadata.tests.test_io import get_data_path # NOTE: This test suite exists for tests not easily split into # test_method, test_visualizer, test_pipeline # TestBadInputs tests type mismatches between Action signatures and passed args class TestBadInputs(TestPluginBase): def make_provenance_capture(self): # importing visualizations is not supported, but we do that here to # simplify testing machinery return archive.ImportProvenanceCapture() def setUp(self): self.plugin = get_dummy_plugin() # TODO standardize temporary directories created by QIIME 2 # create a temporary data_dir for sample Visualizations self.test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') self.data_dir = os.path.join(self.test_dir.name, 'viz-output') os.mkdir(self.data_dir) most_common_viz(self.data_dir, collections.Counter(range(42))) def tearDown(self): self.test_dir.cleanup() def test_viz_passed_as_input(self): saved_viz = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) method = self.plugin.methods['optional_artifacts_method'] ints1 = Artifact.import_data(IntSequence1, [0, 42, 43]) # tests Viz passed as primitive parameter with self.assertRaisesRegex( TypeError, 'Visualizations may not be used as inputs.'): method(saved_viz, 42) # tests Viz passed as Artifact input with self.assertRaisesRegex( TypeError, 'Visualizations may not be used as inputs.'): method(ints1, 42, optional1=saved_viz) # tests Viz passed as metadata method = self.plugin.methods['identity_with_optional_metadata'] with self.assertRaisesRegex( TypeError, 'Visualizations may not be used as inputs.'): method(ints1, metadata=saved_viz) def test_artifact_passed_incorrectly(self): concatenate_ints = self.plugin.methods['concatenate_ints'] identity_with_metadata = self.plugin.methods['identity_with_metadata'] ints1 = Artifact.import_data(IntSequence1, [0, 42, 43]) ints2 = Artifact.import_data(IntSequence1, [99, -22]) ints3 = Artifact.import_data(IntSequence2, [12, 111]) inappropriate_Artifact = Artifact.import_data(IntSequence1, [-9999999]) int1 = 4 int2 = 5 # tests Artifact passed as integer with self.assertRaisesRegex( TypeError, 'int1.*type Int.*IntSequence1'): concatenate_ints(ints1, ints2, ints3, inappropriate_Artifact, int2) # tests Artifact passed as metadata with self.assertRaisesRegex( TypeError, '\'metadata\'.*type Metadata.*IntSequence1'): identity_with_metadata(ints1, inappropriate_Artifact) # tests wrong type of Artifact passed with self.assertRaisesRegex( TypeError, 'ints3.*IntSequence2.*IntSequence1'): concatenate_ints(ints1, ints2, inappropriate_Artifact, int1, int2) def test_primitive_passed_incorrectly(self): concatenate_ints = self.plugin.methods['concatenate_ints'] identity_with_metadata = self.plugin.methods['identity_with_metadata'] params_only_method = self.plugin.methods['params_only_method'] md_fp = get_data_path('valid/simple.tsv') inappropriate_metadata = Metadata.load(md_fp) ints1 = Artifact.import_data(IntSequence1, [0, 42, 43]) ints3 = Artifact.import_data(IntSequence1, [12, 
111]) int1 = 4 int2 = 5 arbitrary_int = 43 # tests primitive int passed as IntSequence artifact with self.assertRaisesRegex(TypeError, 'ints2.*43.*incompatible.*IntSequence1'): concatenate_ints(ints1, arbitrary_int, ints3, int1, int2) # tests primitive passed as metadata with self.assertRaisesRegex(TypeError, 'metadata.*43.*incompatible.*Metadata'): identity_with_metadata(ints1, arbitrary_int) # tests wrong type of primitive passed with self.assertRaisesRegex(TypeError, 'age.*arbitraryString.*incompatible.*Int'): params_only_method('key string', 'arbitraryString') # tests metadata passed as artifact with self.assertRaisesRegex(TypeError, '\'ints2\'.*Metadata.*IntSequence1'): concatenate_ints(ints1, inappropriate_metadata, ints3, int1, int2) def test_primitive_param_out_of_range(self): range_nested_in_list = self.plugin.methods['variadic_input_method'] range_not_nested_in_list = self.plugin.visualizers['params_only_viz'] ints_list = [Artifact.import_data(IntSequence1, [0, 42, 43]), Artifact.import_data(IntSequence2, [4, 5, 6])] int_set = {Artifact.import_data(SingleInt, 7), Artifact.import_data(SingleInt, 8)} nums = {9, 10} bad_range_val = [11, 12, -9999] invalid_age = -99999 # Tests primitives of correct type but outside of Range... # ... in a list with self.assertRaisesRegex( TypeError, 'opt_nums.*-9999.*incompatible.*List'): range_nested_in_list(ints_list, int_set, nums, bad_range_val) # ... not in a list with self.assertRaisesRegex( TypeError, r'\'age\'.*-99999.*incompatible.*Int % Range\(0, None\)'): range_not_nested_in_list('John Doe', invalid_age) def test_primitive_param_not_valid_choice(self): pipeline = self.plugin.pipelines['failing_pipeline'] int_sequence = Artifact.import_data(IntSequence1, [0, 42, 43]) break_from = "invalid choice" # test String not a valid choice with self.assertRaisesRegex( TypeError, 'break_from.*\'invalid choice\''): pipeline(int_sequence, break_from) class TestDeprecation(unittest.TestCase): def setUp(self): self.plugin = get_dummy_plugin() self.method = self.plugin.methods['deprecated_method'] def test_successful_registration(self): self.assertTrue(self.method.deprecated) def test_deprecation_warning(self): with warnings.catch_warnings(record=True) as w: self.method() self.assertEqual(1, len(w)) warning = w[0] self.assertEqual(warning.category, FutureWarning) self.assertTrue('Method is deprecated' in str(warning.message)) def test_docstring(self): self.assertIn('Method is deprecated', self.method.__call__.__doc__) qiime2-2024.5.0/qiime2/sdk/tests/test_actiongraph.py000066400000000000000000000123651462552636000221730ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import unittest from qiime2.core.testing.type import (Mapping, IntSequence1, IntSequence2) from qiime2.core.type.primitive import (Int, Str, Metadata) from qiime2.core.type.visualization import (Visualization) from qiime2.core.testing.util import get_dummy_plugin from qiime2.sdk.actiongraph import build_graph class TestActiongraph(unittest.TestCase): def setUp(self): self.plugin = get_dummy_plugin() self.g = None def test_simple_graph(self): methods = [self.plugin.actions['no_input_method']] self.g = build_graph(methods) obs = list(self.g.nodes) exp_node = str({ 'inputs': {}, 'outputs': { 'out': Mapping }, }) type_node = Mapping exp = [type_node, exp_node] for item in obs: assert item in exp assert self.g.has_edge(str(exp_node), type_node) def test_cycle_in_graph_no_params(self): methods = [self.plugin.actions['docstring_order_method']] self.g = build_graph(methods) obs = list(self.g.nodes) exp = [Mapping, Str] exp_node = str({ 'inputs': { 'req_input': Mapping, 'req_param': Str, }, 'outputs': { 'out': Mapping }, }) exp += [exp_node] for item in obs: assert item in exp assert self.g.in_degree(exp_node) == 2 assert self.g.out_degree(exp_node) == 1 def test_cycle_in_graph_with_params(self): methods = [self.plugin.actions['docstring_order_method']] self.g = build_graph(methods, True) obs = list(self.g.nodes) exp = [Mapping, Int, Str, 'opt_Mapping', 'opt_Int'] exp_node = str({ 'inputs': { 'req_input': Mapping, 'req_param': Str, 'opt_input': Mapping, 'opt_param': Int }, 'outputs': { 'out': Mapping }, }) exp += [exp_node] for item in obs: assert item in exp assert self.g.in_degree(exp_node) == 4 assert self.g.out_degree(exp_node) == 1 def test_union(self): vis = [self.plugin.actions['most_common_viz']] self.g = build_graph(vis) obs = list(self.g.nodes) exp = [Visualization, IntSequence1, IntSequence2] exp_node_1 = str({ 'inputs': { 'ints': IntSequence1, }, 'outputs': { 'visualization': Visualization }, }) exp_node_2 = str({ 'inputs': { 'ints': IntSequence2, }, 'outputs': { 'visualization': Visualization }, }) exp += [exp_node_1, exp_node_2] for item in obs: assert item in exp assert self.g.in_degree(exp_node_1) == 1 assert self.g.out_degree(exp_node_1) == 1 assert self.g.in_degree(exp_node_2) == 1 assert self.g.out_degree(exp_node_2) == 1 assert self.g.in_degree(Visualization) == 2 assert self.g.out_degree(Visualization) == 0 def test_multiple_outputs(self): actions = [self.plugin.actions['visualizer_only_pipeline']] self.g = build_graph(actions) obs = list(self.g.nodes) exp = [Visualization, Mapping] exp_node = str({ 'inputs': { 'mapping': Mapping }, 'outputs': { 'viz1': Visualization, 'viz2': Visualization }, }) exp += [exp_node] for item in obs: assert item in exp assert self.g.in_degree(exp_node) == 1 assert self.g.out_degree(exp_node) == 1 def test_metadata(self): actions = [self.plugin.actions['identity_with_metadata']] self.g = build_graph(actions) obs = list(self.g.nodes) exp = [Metadata, IntSequence1, IntSequence2] exp_node_1 = str({ 'inputs': { 'ints': IntSequence1, 'metadata': Metadata }, 'outputs': { 'out': IntSequence1 }, }) exp_node_2 = str({ 'inputs': { 'ints': IntSequence2, 'metadata': Metadata }, 'outputs': { 'out': IntSequence1 }, }) exp += [exp_node_1, exp_node_2] for item in obs: assert item in exp assert self.g.in_degree(exp_node_1) == 2 assert self.g.out_degree(exp_node_1) == 1 assert self.g.in_degree(exp_node_1) == 2 assert self.g.out_degree(exp_node_1) == 1 assert self.g.in_degree(IntSequence1) 
== 2 assert self.g.out_degree(IntSequence1) == 1 if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/sdk/tests/test_artifact.py000066400000000000000000000604101462552636000214630ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import collections import os import tempfile import unittest import uuid import pathlib from typing import Union import pkg_resources import pandas as pd import qiime2.plugin import qiime2.core.type from qiime2 import Metadata from qiime2.sdk import Artifact from qiime2.sdk.result import ResultMetadata from qiime2.plugin.model import ValidationError import qiime2.core.archive as archive from qiime2.core.testing.format import IntSequenceFormat from qiime2.core.testing.type import IntSequence1, FourInts, Mapping, SingleInt from qiime2.core.testing.util import get_dummy_plugin, ArchiveTestingMixin def get_data_path(filename): return pkg_resources.resource_filename('qiime2.sdk.tests', 'data/%s' % filename) class TestArtifact(unittest.TestCase, ArchiveTestingMixin): def setUp(self): # Ignore the returned dummy plugin object, just run this to verify the # plugin exists as the tests rely on it being loaded. get_dummy_plugin() # TODO standardize temporary directories created by QIIME 2 self.test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') self.provenance_capture = archive.ImportProvenanceCapture() def tearDown(self): self.test_dir.cleanup() def test_private_constructor(self): with self.assertRaisesRegex( NotImplementedError, 'Artifact constructor.*private.*Artifact.load'): Artifact() # Note on testing strategy below: many of the tests for `_from_view` and # `load` are similar, with the exception that when `load`ing, the # artifact's UUID is known so more specific assertions can be performed. # While these tests appear somewhat redundant, they are important because # they exercise the same operations on Artifact objects constructed from # different sources, whose codepaths have very different internal behavior. # This internal behavior could be tested explicitly but it is safer to test # the public API behavior (e.g. as a user would interact with the object) # in case the internals change. def test_from_view(self): artifact = Artifact._from_view(FourInts, [-1, 42, 0, 43], list, self.provenance_capture) self.assertEqual(artifact.type, FourInts) # We don't know what the UUID is because it's generated within # Artifact._from_view. self.assertIsInstance(artifact.uuid, uuid.UUID) self.assertEqual(artifact.view(list), [-1, 42, 0, 43]) # Can produce same view if called again. self.assertEqual(artifact.view(list), [-1, 42, 0, 43]) def test_from_view_union(self): artifact = Artifact._from_view(FourInts, [-1, 42, 0, 43], list, self.provenance_capture) self.assertEqual(artifact.type, FourInts) # We don't know what the UUID is because it's generated within # Artifact._from_view. self.assertIsInstance(artifact.uuid, uuid.UUID) self.assertEqual(artifact.view(Union[list, str]), [-1, 42, 0, 43]) # Can produce same view if called again. 
self.assertEqual(artifact.view(Union[list, str]), [-1, 42, 0, 43]) def test_from_view_union_reordered(self): artifact = Artifact._from_view(FourInts, [-1, 42, 0, 43], list, self.provenance_capture) self.assertEqual(artifact.type, FourInts) # We don't know what the UUID is because it's generated within # Artifact._from_view. self.assertIsInstance(artifact.uuid, uuid.UUID) self.assertEqual(artifact.view(Union[str, list]), [-1, 42, 0, 43]) # Can produce same view if called again. self.assertEqual(artifact.view(Union[str, list]), [-1, 42, 0, 43]) def test_from_view_union_not_valid(self): artifact = Artifact._from_view(FourInts, [-1, 42, 0, 43], list, self.provenance_capture) self.assertEqual(artifact.type, FourInts) # We don't know what the UUID is because it's generated within # Artifact._from_view. self.assertIsInstance(artifact.uuid, uuid.UUID) with self.assertRaisesRegex( Exception, 'No transformation into either of'): self.assertEqual(artifact.view(Union[str, dict]), [-1, 42, 0, 43]) def test_from_view_different_type_with_multiple_view_types(self): artifact = Artifact._from_view(IntSequence1, [42, 42, 43, -999, 42], list, self.provenance_capture) self.assertEqual(artifact.type, IntSequence1) self.assertIsInstance(artifact.uuid, uuid.UUID) self.assertEqual(artifact.view(list), [42, 42, 43, -999, 42]) self.assertEqual(artifact.view(list), [42, 42, 43, -999, 42]) self.assertEqual(artifact.view(collections.Counter), collections.Counter({42: 3, 43: 1, -999: 1})) self.assertEqual(artifact.view(collections.Counter), collections.Counter({42: 3, 43: 1, -999: 1})) def test_from_view_and_save(self): fp = os.path.join(self.test_dir.name, 'artifact.qza') # Using four-ints data layout because it has multiple files, some of # which are in a nested directory. artifact = Artifact._from_view(FourInts, [-1, 42, 0, 43], list, self.provenance_capture) artifact.save(fp) root_dir = str(artifact.uuid) expected = { 'VERSION', 'checksums.md5', 'metadata.yaml', 'data/file1.txt', 'data/file2.txt', 'data/nested/file3.txt', 'data/nested/file4.txt', 'provenance/metadata.yaml', 'provenance/VERSION', 'provenance/citations.bib', 'provenance/action/action.yaml' } self.assertArchiveMembers(fp, root_dir, expected) def test_load(self): saved_artifact = Artifact.import_data(FourInts, [-1, 42, 0, 43]) fp = os.path.join(self.test_dir.name, 'artifact.qza') saved_artifact.save(fp) artifact = Artifact.load(fp) self.assertEqual(artifact.type, FourInts) self.assertEqual(artifact.uuid, saved_artifact.uuid) self.assertEqual(artifact.view(list), [-1, 42, 0, 43]) self.assertEqual(artifact.view(list), [-1, 42, 0, 43]) def test_load_different_type_with_multiple_view_types(self): saved_artifact = Artifact.import_data(IntSequence1, [42, 42, 43, -999, 42]) fp = os.path.join(self.test_dir.name, 'artifact.qza') saved_artifact.save(fp) artifact = Artifact.load(fp) self.assertEqual(artifact.type, IntSequence1) self.assertEqual(artifact.uuid, saved_artifact.uuid) self.assertEqual(artifact.view(list), [42, 42, 43, -999, 42]) self.assertEqual(artifact.view(list), [42, 42, 43, -999, 42]) self.assertEqual(artifact.view(collections.Counter), collections.Counter({42: 3, 43: 1, -999: 1})) self.assertEqual(artifact.view(collections.Counter), collections.Counter({42: 3, 43: 1, -999: 1})) def test_load_and_save(self): fp1 = os.path.join(self.test_dir.name, 'artifact1.qza') fp2 = os.path.join(self.test_dir.name, 'artifact2.qza') artifact = Artifact.import_data(FourInts, [-1, 42, 0, 43]) artifact.save(fp1) artifact = Artifact.load(fp1) # Overwriting its 
source file works. artifact.save(fp1) # Saving to a new file works. artifact.save(fp2) root_dir = str(artifact.uuid) expected = { 'VERSION', 'checksums.md5', 'metadata.yaml', 'data/file1.txt', 'data/file2.txt', 'data/nested/file3.txt', 'data/nested/file4.txt', 'provenance/metadata.yaml', 'provenance/VERSION', 'provenance/citations.bib', 'provenance/action/action.yaml' } self.assertArchiveMembers(fp1, root_dir, expected) root_dir = str(artifact.uuid) expected = { 'VERSION', 'checksums.md5', 'metadata.yaml', 'data/file1.txt', 'data/file2.txt', 'data/nested/file3.txt', 'data/nested/file4.txt', 'provenance/metadata.yaml', 'provenance/VERSION', 'provenance/citations.bib', 'provenance/action/action.yaml' } self.assertArchiveMembers(fp2, root_dir, expected) def test_roundtrip(self): fp1 = os.path.join(self.test_dir.name, 'artifact1.qza') fp2 = os.path.join(self.test_dir.name, 'artifact2.qza') artifact = Artifact.import_data(FourInts, [-1, 42, 0, 43]) artifact.save(fp1) artifact1 = Artifact.load(fp1) artifact1.save(fp2) artifact2 = Artifact.load(fp2) self.assertEqual(artifact1.type, artifact2.type) self.assertEqual(artifact1.format, artifact2.format) self.assertEqual(artifact1.uuid, artifact2.uuid) self.assertEqual(artifact1.view(list), artifact2.view(list)) # double view to make sure multiple views can be taken self.assertEqual(artifact1.view(list), artifact2.view(list)) def test_roundtrip_pathlib(self): fp1 = pathlib.Path(os.path.join(self.test_dir.name, 'artifact1.qza')) fp2 = pathlib.Path(os.path.join(self.test_dir.name, 'artifact2.qza')) artifact = Artifact.import_data(FourInts, [-1, 42, 0, 43]) artifact.save(fp1) artifact1 = Artifact.load(fp1) artifact1.save(fp2) artifact2 = Artifact.load(fp2) self.assertEqual(artifact1.type, artifact2.type) self.assertEqual(artifact1.format, artifact2.format) self.assertEqual(artifact1.uuid, artifact2.uuid) self.assertEqual(artifact1.view(list), artifact2.view(list)) # double view to make sure multiple views can be taken self.assertEqual(artifact1.view(list), artifact2.view(list)) def test_load_with_archive_filepath_modified(self): # Save an artifact for use in the following test case. fp = os.path.join(self.test_dir.name, 'artifact.qza') Artifact.import_data(FourInts, [-1, 42, 0, 43]).save(fp) # Load the artifact from a filepath then save a different artifact to # the same filepath. Assert that both artifacts produce the correct # views of their data. # # `load` used to be lazy, only extracting data when it needed to (e.g. # when `save` or `view` was called). This was buggy as the filepath # could have been deleted, or worse, modified to contain a different # .qza file. Thus, the wrong archive could be extracted on demand, or # the archive could be missing altogether. There isn't an easy # cross-platform compatible way to solve this problem, so Artifact.load # is no longer lazy and always extracts its data immediately. The real # motivation for lazy loading was for quick inspection of archives # without extracting/copying data, so that API is now provided through # Artifact.peek. 
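        # (Metadata-only inspection is covered by Artifact.peek -- see
        # test_peek below -- which returns a ResultMetadata (uuid, type,
        # format) without extracting the archive.)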
artifact1 = Artifact.load(fp) Artifact.import_data(FourInts, [10, 11, 12, 13]).save(fp) artifact2 = Artifact.load(fp) self.assertEqual(artifact1.view(list), [-1, 42, 0, 43]) self.assertEqual(artifact2.view(list), [10, 11, 12, 13]) def test_extract(self): fp = os.path.join(self.test_dir.name, 'artifact.qza') artifact = Artifact.import_data(FourInts, [-1, 42, 0, 43]) artifact.save(fp) root_dir = str(artifact.uuid) # pathlib normalizes away the `.`, it doesn't matter, but this is the # implementation we're using, so let's test against that assumption. output_dir = pathlib.Path(self.test_dir.name) / 'artifact-extract-test' result_dir = Artifact.extract(fp, output_dir=output_dir) self.assertEqual(result_dir, str(output_dir / root_dir)) expected = { 'VERSION', 'checksums.md5', 'metadata.yaml', 'data/file1.txt', 'data/file2.txt', 'data/nested/file3.txt', 'data/nested/file4.txt', 'provenance/metadata.yaml', 'provenance/VERSION', 'provenance/citations.bib', 'provenance/action/action.yaml' } self.assertExtractedArchiveMembers(output_dir, root_dir, expected) def test_peek(self): artifact = Artifact.import_data(FourInts, [0, 0, 42, 1000]) fp = os.path.join(self.test_dir.name, 'artifact.qza') artifact.save(fp) metadata = Artifact.peek(fp) self.assertIsInstance(metadata, ResultMetadata) self.assertEqual(metadata.type, 'FourInts') self.assertEqual(metadata.uuid, str(artifact.uuid)) self.assertEqual(metadata.format, 'FourIntsDirectoryFormat') def test_import_data_invalid_type(self): with self.assertRaisesRegex(TypeError, 'concrete semantic type.*Visualization'): Artifact.import_data(qiime2.core.type.Visualization, self.test_dir) with self.assertRaisesRegex(TypeError, 'concrete semantic type.*Visualization'): Artifact.import_data('Visualization', self.test_dir) def test_import_data_with_filepath_multi_file_data_layout(self): fp = os.path.join(self.test_dir.name, 'test.txt') with open(fp, 'w') as fh: fh.write('42\n') with self.assertRaisesRegex(qiime2.plugin.ValidationError, "FourIntsDirectoryFormat.*directory"): Artifact.import_data(FourInts, fp) def test_import_data_with_wrong_number_of_files(self): data_dir = os.path.join(self.test_dir.name, 'test') os.mkdir(data_dir) error_regex = ("Missing.*MappingDirectoryFormat.*mapping.tsv") with self.assertRaisesRegex(ValidationError, error_regex): Artifact.import_data(Mapping, data_dir) def test_import_data_with_unrecognized_files(self): data_dir = os.path.join(self.test_dir.name, 'test') os.mkdir(data_dir) with open(os.path.join(data_dir, 'file1.txt'), 'w') as fh: fh.write('42\n') with open(os.path.join(data_dir, 'file2.txt'), 'w') as fh: fh.write('43\n') nested = os.path.join(data_dir, 'nested') os.mkdir(nested) with open(os.path.join(nested, 'file3.txt'), 'w') as fh: fh.write('44\n') with open(os.path.join(nested, 'foo.txt'), 'w') as fh: fh.write('45\n') error_regex = ("Unrecognized.*foo.txt.*FourIntsDirectoryFormat") with self.assertRaisesRegex(ValidationError, error_regex): Artifact.import_data(FourInts, data_dir) def test_import_data_with_unreachable_path(self): with self.assertRaisesRegex(qiime2.plugin.ValidationError, "does not exist"): Artifact.import_data(IntSequence1, os.path.join(self.test_dir.name, 'foo.txt')) with self.assertRaisesRegex(qiime2.plugin.ValidationError, "does not exist"): Artifact.import_data(FourInts, os.path.join(self.test_dir.name, 'bar', '')) def test_import_data_with_invalid_format_single_file(self): fp = os.path.join(self.test_dir.name, 'foo.txt') with open(fp, 'w') as fh: fh.write('42\n') fh.write('43\n') fh.write('abc\n') 
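            # 'abc' on line 3 is the non-integer entry that the
            # ValidationError regex below (".*Line 3") is expected to report.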
fh.write('123\n') error_regex = "foo.txt.*IntSequenceFormat.*\n\n.*Line 3" with self.assertRaisesRegex(ValidationError, error_regex): Artifact.import_data(IntSequence1, fp) def test_import_data_with_invalid_format_multi_file(self): data_dir = os.path.join(self.test_dir.name, 'test') os.mkdir(data_dir) with open(os.path.join(data_dir, 'file1.txt'), 'w') as fh: fh.write('42\n') with open(os.path.join(data_dir, 'file2.txt'), 'w') as fh: fh.write('43\n') nested = os.path.join(data_dir, 'nested') os.mkdir(nested) with open(os.path.join(nested, 'file3.txt'), 'w') as fh: fh.write('44\n') with open(os.path.join(nested, 'file4.txt'), 'w') as fh: fh.write('foo\n') error_regex = "file4.txt.*SingleIntFormat.*\n\n.*integer" with self.assertRaisesRegex(ValidationError, error_regex): Artifact.import_data(FourInts, data_dir) def test_import_data_with_good_validation_multi_files(self): data_dir = os.path.join(self.test_dir.name, 'test') os.mkdir(data_dir) with open(os.path.join(data_dir, 'file1.txt'), 'w') as fh: fh.write('1\n') with open(os.path.join(data_dir, 'file2.txt'), 'w') as fh: fh.write('1\n') a = Artifact.import_data(SingleInt, data_dir) self.assertEqual(1, a.view(int)) def test_import_data_with_bad_validation_multi_files(self): data_dir = os.path.join(self.test_dir.name, 'test') os.mkdir(data_dir) with open(os.path.join(data_dir, 'file1.txt'), 'w') as fh: fh.write('1\n') with open(os.path.join(data_dir, 'file2.txt'), 'w') as fh: fh.write('2\n') error_regex = ("test.*RedundantSingleIntDirectoryFormat.*\n\n" ".*does not match") with self.assertRaisesRegex(ValidationError, error_regex): Artifact.import_data(SingleInt, data_dir) def test_import_data_with_filepath(self): data_dir = os.path.join(self.test_dir.name, 'test') os.mkdir(data_dir) # Filename shouldn't matter for single-file case. 
fp = os.path.join(data_dir, 'foo.txt') with open(fp, 'w') as fh: fh.write('42\n') fh.write('43\n') fh.write('42\n') fh.write('0\n') artifact = Artifact.import_data(IntSequence1, fp) self.assertEqual(artifact.type, IntSequence1) self.assertIsInstance(artifact.uuid, uuid.UUID) self.assertEqual(artifact.view(list), [42, 43, 42, 0]) def test_import_data_with_directory_single_file(self): data_dir = os.path.join(self.test_dir.name, 'test') os.mkdir(data_dir) fp = os.path.join(data_dir, 'ints.txt') with open(fp, 'w') as fh: fh.write('-1\n') fh.write('-2\n') fh.write('10\n') fh.write('100\n') artifact = Artifact.import_data(IntSequence1, data_dir) self.assertEqual(artifact.type, IntSequence1) self.assertIsInstance(artifact.uuid, uuid.UUID) self.assertEqual(artifact.view(list), [-1, -2, 10, 100]) def test_import_data_with_directory_multi_file(self): data_dir = os.path.join(self.test_dir.name, 'test') os.mkdir(data_dir) with open(os.path.join(data_dir, 'file1.txt'), 'w') as fh: fh.write('42\n') with open(os.path.join(data_dir, 'file2.txt'), 'w') as fh: fh.write('41\n') nested = os.path.join(data_dir, 'nested') os.mkdir(nested) with open(os.path.join(nested, 'file3.txt'), 'w') as fh: fh.write('43\n') with open(os.path.join(nested, 'file4.txt'), 'w') as fh: fh.write('40\n') artifact = Artifact.import_data(FourInts, data_dir) self.assertEqual(artifact.type, FourInts) self.assertIsInstance(artifact.uuid, uuid.UUID) self.assertEqual(artifact.view(list), [42, 41, 43, 40]) def test_eq_identity(self): artifact = Artifact.import_data(FourInts, [-1, 42, 0, 43]) self.assertEqual(artifact, artifact) def test_eq_same_uuid(self): fp = os.path.join(self.test_dir.name, 'artifact.qza') artifact1 = Artifact.import_data(FourInts, [-1, 42, 0, 43]) artifact1.save(fp) artifact2 = Artifact.load(fp) self.assertEqual(artifact1, artifact2) def test_ne_same_data_different_uuid(self): artifact1 = Artifact.import_data(FourInts, [-1, 42, 0, 43]) artifact2 = Artifact.import_data(FourInts, [-1, 42, 0, 43]) self.assertNotEqual(artifact1, artifact2) def test_ne_different_data_different_uuid(self): artifact1 = Artifact.import_data(FourInts, [-1, 42, 0, 43]) artifact2 = Artifact.import_data(FourInts, [1, 2, 3, 4]) self.assertNotEqual(artifact1, artifact2) def test_ne_subclass_same_uuid(self): class ArtifactSubclass(Artifact): pass fp = os.path.join(self.test_dir.name, 'artifact.qza') artifact1 = ArtifactSubclass.import_data(FourInts, [-1, 42, 0, 43]) artifact1.save(fp) artifact2 = Artifact.load(fp) self.assertNotEqual(artifact1, artifact2) self.assertNotEqual(artifact2, artifact1) def test_ne_different_type_same_uuid(self): artifact = Artifact.import_data(FourInts, [-1, 42, 0, 43]) class Faker: @property def uuid(self): return artifact.uuid faker = Faker() self.assertNotEqual(artifact, faker) def test_artifact_validate_max(self): A = Artifact.import_data('Mapping', {'a': '1', 'b': '2'}) A.validate() self.assertTrue(True) # Checkpoint assertion A.validate(level='max') self.assertTrue(True) # Checkpoint assertion A = Artifact.import_data('IntSequence1', [1, 2, 3, 4, 5, 6, 7, 10]) with self.assertRaisesRegex(ValidationError, '3 more'): A.validate(level='max') def test_artifact_validate_max_on_import(self): fp = get_data_path('intsequence-fail-max-validation.txt') fmt = IntSequenceFormat(fp, mode='r') fmt.validate(level='min') self.assertTrue(True) # Checkpoint assertion with self.assertRaisesRegex(ValidationError, '3 more'): Artifact.import_data('IntSequence1', fp) def test_artifact_validate_min(self): A = 
Artifact.import_data('IntSequence1', [1, 2, 3, 4]) A.validate(level='min') self.assertTrue(True) # Checkpoint assertion A = Artifact.import_data('Mapping', {'a': '1', 'b': '2'}) A.validate(level='min') self.assertTrue(True) # Checkpoint assertion def test_artifact_validate_invalid_level(self): A = Artifact.import_data('IntSequence1', [1, 2, 3, 4]) with self.assertRaisesRegex(ValueError, 'peanut'): A.validate(level='peanut') def test_view_as_metadata(self): A = Artifact.import_data('Mapping', {'a': '1', 'b': '3'}) obs_md = A.view(Metadata) exp_df = pd.DataFrame({'a': '1', 'b': '3'}, index=pd.Index(['0'], name='id', dtype=object), dtype=object) exp_md = Metadata(exp_df) exp_md._add_artifacts([A]) self.assertEqual(obs_md, exp_md) # This check is redundant because `Metadata.__eq__` being used above # takes source artifacts into account. Doesn't hurt to have an explicit # check though, since this API didn't always track source artifacts # (this check also future-proofs the test in case `Metadata.__eq__` # changes in the future). self.assertEqual(obs_md.artifacts, (A,)) def test_cannot_be_viewed_as_metadata(self): A = Artifact.import_data('IntSequence1', [1, 2, 3, 4]) with self.assertRaisesRegex(TypeError, 'Artifact.*IntSequence1.*cannot be viewed ' 'as QIIME 2 Metadata'): A.view(Metadata) if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/sdk/tests/test_config.py000066400000000000000000000222311462552636000211320ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2022, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import os import tempfile import unittest import pkg_resources import parsl from parsl.executors.threads import ThreadPoolExecutor from parsl.errors import NoDataFlowKernelError from qiime2 import Artifact, Cache from qiime2.core.util import load_action_yaml from qiime2.core.testing.type import SingleInt from qiime2.core.testing.util import get_dummy_plugin from qiime2.sdk.parallel_config import (PARALLEL_CONFIG, _TEST_EXECUTOR_, _MASK_CONDA_ENV_, ParallelConfig, get_config_from_file) class TestConfig(unittest.TestCase): # Get actions plugin = get_dummy_plugin() pipeline = plugin.pipelines['resumable_pipeline'] method = plugin.methods['list_of_ints'] # Expected provenance based on type of executor used tpool_expected = [{ 'type': 'parsl', 'parsl_type': 'ThreadPoolExecutor'}, { 'type': 'parsl', 'parsl_type': 'ThreadPoolExecutor'}] test_expected = [{ 'type': 'parsl', 'parsl_type': '_TEST_EXECUTOR_'}, { 'type': 'parsl', 'parsl_type': '_TEST_EXECUTOR_'}] def setUp(self): # Create config self.test_default = parsl.Config( executors=[ ThreadPoolExecutor( max_threads=1, label='tpool' ), _TEST_EXECUTOR_( max_threads=1, label='default' ) ], # AdHoc Clusters should not be setup with scaling strategy. 
strategy='none', ) # Create temp test dir and cache in dir self.test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') self.cache = Cache(os.path.join(self.test_dir.name, 'new_cache')) # Create artifacts here so we have unique inputs in each test self.art = [Artifact.import_data(SingleInt, 0), Artifact.import_data(SingleInt, 1)] # Get paths to config files self.config_fp = self.get_data_path('test_config.toml') self.mapping_config_fp = self.get_data_path('mapping_config.toml') self.mapping_only_config_fp = \ self.get_data_path('mapping_only_config.toml') self.complex_config_fp = \ self.get_data_path('complex_config.toml') def tearDown(self): self.test_dir.cleanup() # Ensure default state post test PARALLEL_CONFIG.parallel_config = None PARALLEL_CONFIG.action_executor_mapping = {} def get_data_path(self, filename): return pkg_resources.resource_filename('qiime2.sdk.tests', 'data/%s' % filename) def test_default_config(self): with ParallelConfig(): self.assertIsInstance( PARALLEL_CONFIG.parallel_config, parsl.Config) self.assertEqual(PARALLEL_CONFIG.action_executor_mapping, {}) def test_mapping_from_config(self): config, mapping = get_config_from_file(self.mapping_config_fp) with self.cache: with ParallelConfig(config, mapping): future = self.pipeline.parallel(self.art, self.art) list_return, dict_return = future._result() list_execution_contexts = self._load_alias_execution_contexts( list_return) dict_execution_contexts = self._load_alias_execution_contexts( dict_return) self.assertEqual(list_execution_contexts, self.test_expected) self.assertEqual(dict_execution_contexts, self.tpool_expected) def test_mapping_only_config(self): _, mapping = get_config_from_file(self.mapping_only_config_fp) with self.cache: with ParallelConfig(action_executor_mapping=mapping): future = self.pipeline.parallel(self.art, self.art) list_return, dict_return = future._result() list_execution_contexts = self._load_alias_execution_contexts( list_return) dict_execution_contexts = self._load_alias_execution_contexts( dict_return) self.assertEqual(list_execution_contexts, self.test_expected) self.assertEqual(dict_execution_contexts, self.tpool_expected) def test_mapping_from_dict(self): mapping = {'list_of_ints': 'test'} with self.cache: with ParallelConfig(action_executor_mapping=mapping): future = self.pipeline.parallel(self.art, self.art) list_return, dict_return = future._result() list_execution_contexts = self._load_alias_execution_contexts( list_return) dict_execution_contexts = self._load_alias_execution_contexts( dict_return) self.assertEqual(list_execution_contexts, self.test_expected) self.assertEqual(dict_execution_contexts, self.tpool_expected) def test_parallel_configs(self): with self.cache: with ParallelConfig(): future = self.pipeline.parallel(self.art, self.art) list_return, dict_return = future._result() list_execution_contexts = self._load_alias_execution_contexts( list_return) dict_execution_contexts = self._load_alias_execution_contexts( dict_return) self.assertEqual(list_execution_contexts, self.tpool_expected) self.assertEqual(dict_execution_contexts, self.tpool_expected) with ParallelConfig(self.test_default): future = self.pipeline.parallel(self.art, self.art) list_return, dict_return = future._result() list_execution_contexts = self._load_alias_execution_contexts( list_return) dict_execution_contexts = self._load_alias_execution_contexts( dict_return) self.assertEqual(list_execution_contexts, self.test_expected) self.assertEqual(dict_execution_contexts, self.test_expected) # At 
this point we should be using the default config again which # does not have an executor called tpool with ParallelConfig( action_executor_mapping={'list_of_ints': 'tpool'}): with self.assertRaisesRegex(KeyError, 'tpool'): future = self.pipeline.parallel(self.art, self.art) list_return, dict_return = future._result() def test_nested_configs(self): with self.cache: with self.assertRaisesRegex( ValueError, 'cannot nest ParallelConfigs'): with ParallelConfig(): with ParallelConfig(self.test_default): pass def test_parallel_non_pipeline(self): with self.assertRaisesRegex( ValueError, 'Only pipelines may be run in parallel'): self.method.parallel(self.art) def test_no_vendored_fp(self): with _MASK_CONDA_ENV_(): with ParallelConfig(): with self.cache: future = self.pipeline.parallel(self.art, self.art) list_return, dict_return = future._result() list_execution_contexts = self._load_alias_execution_contexts( list_return) dict_execution_contexts = self._load_alias_execution_contexts( dict_return) self.assertEqual(list_execution_contexts, self.tpool_expected) self.assertEqual(dict_execution_contexts, self.tpool_expected) def test_load_complex_config(self): """ Test that all parsl modules we currently map are correct """ config, mapping = get_config_from_file(self.complex_config_fp) # Just assert that we were able to parse the file and get a config out with ParallelConfig(config, mapping): self.assertIsInstance( PARALLEL_CONFIG.parallel_config, parsl.Config) self.assertEqual(PARALLEL_CONFIG.action_executor_mapping, {}) def test_no_config(self): with self.assertRaisesRegex(NoDataFlowKernelError, 'Must first load config'): self.pipeline.parallel(self.art, self.art) def test_config_unset(self): with ParallelConfig(): self.pipeline.parallel(self.art, self.art) with self.assertRaisesRegex(NoDataFlowKernelError, 'Must first load config'): self.pipeline.parallel(self.art, self.art) def _load_alias_execution_contexts(self, collection): execution_contexts = [] for result in collection.values(): alias_uuid = load_action_yaml( result._archiver.path)['action']['alias-of'] execution_contexts.append(load_action_yaml( self.cache.data / alias_uuid) ['execution']['execution_context']) return execution_contexts qiime2-2024.5.0/qiime2/sdk/tests/test_method.py000066400000000000000000000735641462552636000211640ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import collections import concurrent.futures import inspect import unittest import uuid import qiime2.plugin from qiime2.sdk.util import view_collection from qiime2.core.type import MethodSignature, Int from qiime2.sdk import Artifact, Method, Results, ResultCollection from qiime2.core.testing.method import (concatenate_ints, merge_mappings, params_only_method, no_input_method) from qiime2.core.testing.type import ( IntSequence1, IntSequence2, SingleInt, Mapping) from qiime2.core.testing.util import get_dummy_plugin # TODO refactor these tests along with Visualizer tests to remove duplication. 
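# ---------------------------------------------------------------------------
# Illustrative sketch (not a test case): the cases in TestMethod below all
# follow the same pattern, so a compact reference may help when reading them.
# A registered Method is looked up on its plugin, called with Artifact inputs
# plus primitive parameters, and returns a `Results` tuple whose outputs can
# be accessed by name or by index. This assumes the dummy testing plugin is
# available, as it is when this module's tests run; the helper is never
# invoked here and exists only as documentation.
# ---------------------------------------------------------------------------
def _example_method_usage():  # pragma: no cover - illustration only
    plugin = get_dummy_plugin()
    concatenate_ints = plugin.methods['concatenate_ints']
    ints1 = Artifact.import_data(IntSequence1, [0, 42, 43])
    ints2 = Artifact.import_data(IntSequence2, [99, -22])
    # A single-output Method still returns a Results tuple; the output is
    # available as `results.concatenated_ints` or `results[0]`.
    results = concatenate_ints(ints1, ints1, ints2, 55, 1)
    return results.concatenated_ints.view(list)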
class TestMethod(unittest.TestCase): def setUp(self): self.plugin = get_dummy_plugin() def test_private_constructor(self): with self.assertRaisesRegex(NotImplementedError, 'Method constructor.*private'): Method() def test_from_function_with_artifacts_and_parameters(self): concatenate_ints_sig = MethodSignature( concatenate_ints, inputs={ 'ints1': IntSequence1 | IntSequence2, 'ints2': IntSequence1, 'ints3': IntSequence2 }, parameters={ 'int1': qiime2.plugin.Int, 'int2': qiime2.plugin.Int }, outputs=[ ('concatenated_ints', IntSequence1) ] ) method = self.plugin.methods['concatenate_ints'] self.assertEqual(method.id, 'concatenate_ints') self.assertEqual(method.signature, concatenate_ints_sig) self.assertEqual(method.name, 'Concatenate integers') self.assertTrue( method.description.startswith('This method concatenates integers')) self.assertTrue( method.source.startswith('\n```python\ndef concatenate_ints(')) def test_from_function_with_multiple_outputs(self): method = self.plugin.methods['split_ints'] sig_input = method.signature.inputs['ints'].qiime_type self.assertEqual(list(method.signature.inputs.keys()), ['ints']) self.assertLessEqual(IntSequence1, sig_input) self.assertLessEqual(IntSequence2, sig_input) self.assertEqual({}, method.signature.parameters) self.assertEqual(list(method.signature.outputs.keys()), ['left', 'right']) self.assertIs(sig_input, method.signature.outputs['left'].qiime_type) self.assertIs(sig_input, method.signature.outputs['right'].qiime_type) self.assertEqual(method.id, 'split_ints') self.assertEqual(method.name, 'Split sequence of integers in half') self.assertTrue( method.description.startswith('This method splits a sequence')) self.assertTrue( method.source.startswith('\n```python\ndef split_ints(')) def test_from_function_without_parameters(self): method = self.plugin.methods['merge_mappings'] self.assertEqual(method.id, 'merge_mappings') exp_sig = MethodSignature( merge_mappings, inputs={ 'mapping1': Mapping, 'mapping2': Mapping }, input_descriptions={ 'mapping1': 'Mapping object to be merged' }, parameters={}, outputs=[ ('merged_mapping', Mapping) ], output_descriptions={ 'merged_mapping': 'Resulting merged Mapping object' } ) self.assertEqual(method.signature, exp_sig) self.assertEqual(method.name, 'Merge mappings') self.assertTrue( method.description.startswith('This method merges two mappings')) self.assertTrue( method.source.startswith('\n```python\ndef merge_mappings(')) def test_from_function_with_parameters_only(self): method = self.plugin.methods['params_only_method'] self.assertEqual(method.id, 'params_only_method') exp_sig = MethodSignature( params_only_method, inputs={}, parameters={ 'name': qiime2.plugin.Str, 'age': qiime2.plugin.Int }, outputs=[ ('out', Mapping) ] ) self.assertEqual(method.signature, exp_sig) self.assertEqual(method.name, 'Parameters only method') self.assertTrue( method.description.startswith('This method only accepts')) self.assertTrue( method.source.startswith('\n```python\ndef params_only_method(')) def test_from_function_without_inputs_or_parameters(self): method = self.plugin.methods['no_input_method'] self.assertEqual(method.id, 'no_input_method') exp_sig = MethodSignature( no_input_method, inputs={}, parameters={}, outputs=[ ('out', Mapping) ] ) self.assertEqual(method.signature, exp_sig) self.assertEqual(method.name, 'No input method') self.assertTrue( method.description.startswith('This method does not accept any')) self.assertTrue( method.source.startswith('\n```python\ndef no_input_method(')) def 
test_is_callable(self): self.assertTrue(callable(self.plugin.methods['concatenate_ints'])) def test_callable_properties(self): concatenate_ints = self.plugin.methods['concatenate_ints'] merge_mappings = self.plugin.methods['merge_mappings'] concatenate_exp = { 'int2': Int, 'ints2': IntSequence1, 'return': (IntSequence1,), 'int1': Int, 'ints3': IntSequence2, 'ints1': IntSequence1 | IntSequence2} merge_exp = { 'mapping2': Mapping, 'mapping1': Mapping, 'return': (Mapping,)} mapper = { concatenate_ints: concatenate_exp, merge_mappings: merge_exp} for method, exp in mapper.items(): self.assertEqual(method.__call__.__name__, '__call__') self.assertEqual(method.__call__.__annotations__, exp) self.assertFalse(hasattr(method.__call__, '__wrapped__')) def test_async_properties(self): concatenate_ints = self.plugin.methods['concatenate_ints'] merge_mappings = self.plugin.methods['merge_mappings'] concatenate_exp = { 'int2': Int, 'ints2': IntSequence1, 'return': (IntSequence1,), 'int1': Int, 'ints3': IntSequence2, 'ints1': IntSequence1 | IntSequence2} merge_exp = { 'mapping2': Mapping, 'mapping1': Mapping, 'return': (Mapping,)} mapper = { concatenate_ints: concatenate_exp, merge_mappings: merge_exp} for method, exp in mapper.items(): self.assertEqual(method.asynchronous.__name__, 'asynchronous') self.assertEqual(method.asynchronous.__annotations__, exp) self.assertFalse(hasattr(method.asynchronous, '__wrapped__')) def test_callable_and_async_signature_with_artifacts_and_parameters(self): # Signature with input artifacts and parameters (i.e. primitives). concatenate_ints = self.plugin.methods['concatenate_ints'] for callable_attr in '__call__', 'asynchronous': signature = inspect.Signature.from_callable( getattr(concatenate_ints, callable_attr)) parameters = list(signature.parameters.items()) kind = inspect.Parameter.POSITIONAL_OR_KEYWORD exp_parameters = [ ('ints1', inspect.Parameter( 'ints1', kind, annotation=IntSequence1 | IntSequence2)), ('ints2', inspect.Parameter( 'ints2', kind, annotation=IntSequence1)), ('ints3', inspect.Parameter( 'ints3', kind, annotation=IntSequence2)), ('int1', inspect.Parameter( 'int1', kind, annotation=Int)), ('int2', inspect.Parameter( 'int2', kind, annotation=Int)) ] self.assertEqual(parameters, exp_parameters) def test_callable_and_async_signature_with_no_parameters(self): # Signature without parameters (i.e. primitives), only input artifacts. method = self.plugin.methods['merge_mappings'] for callable_attr in '__call__', 'asynchronous': signature = inspect.Signature.from_callable( getattr(method, callable_attr)) parameters = list(signature.parameters.items()) kind = inspect.Parameter.POSITIONAL_OR_KEYWORD exp_parameters = [ ('mapping1', inspect.Parameter( 'mapping1', kind, annotation=Mapping)), ('mapping2', inspect.Parameter( 'mapping2', kind, annotation=Mapping)) ] self.assertEqual(parameters, exp_parameters) def test_call_with_artifacts_and_parameters(self): concatenate_ints = self.plugin.methods['concatenate_ints'] artifact1 = Artifact.import_data(IntSequence1, [0, 42, 43]) artifact2 = Artifact.import_data(IntSequence2, [99, -22]) result = concatenate_ints(artifact1, artifact1, artifact2, 55, 1) # Test properties of the `Results` object. 
self.assertIsInstance(result, tuple) self.assertIsInstance(result, Results) self.assertEqual(len(result), 1) self.assertEqual(result.concatenated_ints.view(list), [0, 42, 43, 0, 42, 43, 99, -22, 55, 1]) result = result[0] self.assertIsInstance(result, Artifact) self.assertEqual(result.type, IntSequence1) self.assertIsInstance(result.uuid, uuid.UUID) # Can retrieve multiple views of different type. exp_list_view = [0, 42, 43, 0, 42, 43, 99, -22, 55, 1] self.assertEqual(result.view(list), exp_list_view) self.assertEqual(result.view(list), exp_list_view) exp_counter_view = collections.Counter( {0: 2, 42: 2, 43: 2, 99: 1, -22: 1, 55: 1, 1: 1}) self.assertEqual(result.view(collections.Counter), exp_counter_view) self.assertEqual(result.view(collections.Counter), exp_counter_view) # Accepts IntSequence1 | IntSequence2 artifact3 = Artifact.import_data(IntSequence2, [10, 20]) result, = concatenate_ints(artifact3, artifact1, artifact2, 55, 1) self.assertEqual(result.type, IntSequence1) self.assertEqual(result.view(list), [10, 20, 0, 42, 43, 99, -22, 55, 1]) def test_call_with_multiple_outputs(self): split_ints = self.plugin.methods['split_ints'] artifact = Artifact.import_data(IntSequence1, [0, 42, -2, 43, 6]) result = split_ints(artifact) self.assertIsInstance(result, tuple) self.assertEqual(len(result), 2) for output_artifact in result: self.assertIsInstance(output_artifact, Artifact) self.assertEqual(output_artifact.type, IntSequence1) self.assertIsInstance(output_artifact.uuid, uuid.UUID) # Output artifacts have different UUIDs. self.assertNotEqual(result[0].uuid, result[1].uuid) # Index lookup. self.assertEqual(result[0].view(list), [0, 42]) self.assertEqual(result[1].view(list), [-2, 43, 6]) # Test properties of the `Results` object. self.assertIsInstance(result, Results) self.assertEqual(result.left.view(list), [0, 42]) self.assertEqual(result.right.view(list), [-2, 43, 6]) def test_call_with_multiple_outputs_matched_types(self): split_ints = self.plugin.methods['split_ints'] artifact = Artifact.import_data(IntSequence2, [0, 42, -2, 43, 6]) result = split_ints(artifact) self.assertIsInstance(result, tuple) self.assertEqual(len(result), 2) for output_artifact in result: self.assertIsInstance(output_artifact, Artifact) self.assertEqual(output_artifact.type, IntSequence2) self.assertIsInstance(output_artifact.uuid, uuid.UUID) # Output artifacts have different UUIDs. self.assertNotEqual(result[0].uuid, result[1].uuid) # Index lookup. self.assertEqual(result[0].view(list), [0, 42]) self.assertEqual(result[1].view(list), [-2, 43, 6]) # Test properties of the `Results` object. self.assertIsInstance(result, Results) self.assertEqual(result.left.view(list), [0, 42]) self.assertEqual(result.right.view(list), [-2, 43, 6]) def test_call_with_no_parameters(self): merge_mappings = self.plugin.methods['merge_mappings'] artifact1 = Artifact.import_data(Mapping, {'foo': 'abc', 'bar': 'def'}) artifact2 = Artifact.import_data(Mapping, {'bazz': 'abc'}) result = merge_mappings(artifact1, artifact2) # Test properties of the `Results` object. 
self.assertIsInstance(result, tuple) self.assertIsInstance(result, Results) self.assertEqual(len(result), 1) self.assertEqual(result.merged_mapping.view(dict), {'foo': 'abc', 'bar': 'def', 'bazz': 'abc'}) result = result[0] self.assertIsInstance(result, Artifact) self.assertEqual(result.type, Mapping) self.assertIsInstance(result.uuid, uuid.UUID) self.assertEqual(result.view(dict), {'foo': 'abc', 'bar': 'def', 'bazz': 'abc'}) def test_call_with_parameters_only(self): params_only_method = self.plugin.methods['params_only_method'] result, = params_only_method("Someone's Name", 999) self.assertIsInstance(result, Artifact) self.assertEqual(result.type, Mapping) self.assertIsInstance(result.uuid, uuid.UUID) self.assertEqual(result.view(dict), {"Someone's Name": '999'}) def test_call_without_inputs_or_parameters(self): no_input_method = self.plugin.methods['no_input_method'] result, = no_input_method() self.assertIsInstance(result, Artifact) self.assertEqual(result.type, Mapping) self.assertIsInstance(result.uuid, uuid.UUID) self.assertEqual(result.view(dict), {'foo': '42'}) def test_call_with_optional_artifacts(self): method = self.plugin.methods['optional_artifacts_method'] ints1 = Artifact.import_data(IntSequence1, [0, 42, 43]) ints2 = Artifact.import_data(IntSequence1, [99, -22]) ints3 = Artifact.import_data(IntSequence2, [43, 43]) # No optional artifacts provided. obs = method(ints1, 42).output self.assertEqual(obs.view(list), [0, 42, 43, 42]) # One optional artifact provided. obs = method(ints1, 42, optional1=ints2).output self.assertEqual(obs.view(list), [0, 42, 43, 42, 99, -22]) # All optional artifacts provided. obs = method( ints1, 42, optional1=ints2, optional2=ints3, num2=111).output self.assertEqual(obs.view(list), [0, 42, 43, 42, 99, -22, 43, 43, 111]) # Invalid type provided as optional artifact. with self.assertRaisesRegex(TypeError, 'type IntSequence1.*type IntSequence2'): method(ints1, 42, optional1=ints3) def test_call_with_variadic_inputs(self): method = self.plugin.methods['variadic_input_method'] ints = [Artifact.import_data(IntSequence1, [1, 2, 3]), Artifact.import_data(IntSequence2, [4, 5, 6])] int_set = {Artifact.import_data(SingleInt, 7), Artifact.import_data(SingleInt, 8)} nums = {9, 10} opt_nums = [11, 12, 13] result, = method(ints, int_set, nums, opt_nums) self.assertEqual(result.view(list), list(range(1, 14))) def test_asynchronous(self): concatenate_ints = self.plugin.methods['concatenate_ints'] artifact1 = Artifact.import_data(IntSequence1, [0, 42, 43]) artifact2 = Artifact.import_data(IntSequence2, [99, -22]) future = concatenate_ints.asynchronous( artifact1, artifact1, artifact2, 55, 1) self.assertIsInstance(future, concurrent.futures.Future) result = future.result() # Test properties of the `Results` object. self.assertIsInstance(result, tuple) self.assertIsInstance(result, Results) self.assertEqual(len(result), 1) self.assertEqual(result.concatenated_ints.view(list), [0, 42, 43, 0, 42, 43, 99, -22, 55, 1]) result = result[0] self.assertIsInstance(result, Artifact) self.assertEqual(result.type, IntSequence1) self.assertIsInstance(result.uuid, uuid.UUID) # Can retrieve multiple views of different type. 
exp_list_view = [0, 42, 43, 0, 42, 43, 99, -22, 55, 1] self.assertEqual(result.view(list), exp_list_view) self.assertEqual(result.view(list), exp_list_view) exp_counter_view = collections.Counter( {0: 2, 42: 2, 43: 2, 99: 1, -22: 1, 55: 1, 1: 1}) self.assertEqual(result.view(collections.Counter), exp_counter_view) self.assertEqual(result.view(collections.Counter), exp_counter_view) # Accepts IntSequence1 | IntSequence2 artifact3 = Artifact.import_data(IntSequence2, [10, 20]) future = concatenate_ints.asynchronous(artifact3, artifact1, artifact2, 55, 1) result, = future.result() self.assertEqual(result.type, IntSequence1) self.assertEqual(result.view(list), [10, 20, 0, 42, 43, 99, -22, 55, 1]) def test_async_with_multiple_outputs(self): split_ints = self.plugin.methods['split_ints'] artifact = Artifact.import_data(IntSequence1, [0, 42, -2, 43, 6]) future = split_ints.asynchronous(artifact) self.assertIsInstance(future, concurrent.futures.Future) result = future.result() self.assertIsInstance(result, tuple) self.assertEqual(len(result), 2) for output_artifact in result: self.assertIsInstance(output_artifact, Artifact) self.assertEqual(output_artifact.type, IntSequence1) self.assertIsInstance(output_artifact.uuid, uuid.UUID) # Output artifacts have different UUIDs. self.assertNotEqual(result[0].uuid, result[1].uuid) # Index lookup. self.assertEqual(result[0].view(list), [0, 42]) self.assertEqual(result[1].view(list), [-2, 43, 6]) # Test properties of the `Results` object. self.assertIsInstance(result, Results) self.assertEqual(result.left.view(list), [0, 42]) self.assertEqual(result.right.view(list), [-2, 43, 6]) def test_async_with_multiple_outputs_matched_types(self): split_ints = self.plugin.methods['split_ints'] artifact = Artifact.import_data(IntSequence2, [0, 42, -2, 43, 6]) future = split_ints.asynchronous(artifact) self.assertIsInstance(future, concurrent.futures.Future) result = future.result() self.assertIsInstance(result, tuple) self.assertEqual(len(result), 2) for output_artifact in result: self.assertIsInstance(output_artifact, Artifact) self.assertEqual(output_artifact.type, IntSequence2) self.assertIsInstance(output_artifact.uuid, uuid.UUID) # Output artifacts have different UUIDs. self.assertNotEqual(result[0].uuid, result[1].uuid) # Index lookup. self.assertEqual(result[0].view(list), [0, 42]) self.assertEqual(result[1].view(list), [-2, 43, 6]) # Test properties of the `Results` object. self.assertIsInstance(result, Results) self.assertEqual(result.left.view(list), [0, 42]) self.assertEqual(result.right.view(list), [-2, 43, 6]) def test_async_with_typing_unions(self): union_inputs = self.plugin.methods['union_inputs'] artifact1 = Artifact.import_data(IntSequence1, [0, 42, 43]) artifact2 = Artifact.import_data(IntSequence2, [99, -22]) future = union_inputs.asynchronous(artifact1, artifact2) self.assertIsInstance(future, concurrent.futures.Future) result = future.result() self.assertIsInstance(result, tuple) self.assertEqual(len(result), 1) # Test the `Results` object. 
self.assertIsInstance(result, Results) self.assertEqual(result[0].view(list), [0]) def test_docstring(self): merge_mappings = self.plugin.methods['merge_mappings'] split_ints = self.plugin.methods['split_ints'] identity_with_optional_metadata = ( self.plugin.methods['identity_with_optional_metadata']) no_input_method = self.plugin.methods['no_input_method'] params_only_method = self.plugin.methods['params_only_method'] long_description_method = self.plugin.methods[ 'long_description_method'] docstring_order_method = self.plugin.methods['docstring_order_method'] self.assertEqual(merge_mappings.__doc__, 'QIIME 2 Method') merge_calldoc = merge_mappings.__call__.__doc__ self.assertEqual(exp_merge_calldoc, merge_calldoc) split_ints_return = split_ints.__call__.__doc__.split('\n\n')[3] self.assertEqual(exp_split_ints_return, split_ints_return) optional_params = ( identity_with_optional_metadata.__call__.__doc__.split('\n\n')[2]) self.assertEqual(exp_optional_params, optional_params) no_input_method = no_input_method.__call__.__doc__ self.assertEqual(exp_no_input_method, no_input_method) params_only = params_only_method.__call__.__doc__ self.assertEqual(exp_params_only, params_only) long_desc = long_description_method.__call__.__doc__ self.assertEqual(exp_long_description, long_desc) docstring_order = docstring_order_method.__call__.__doc__ self.assertEqual(exp_docstring_order, docstring_order) def test_collection_list_input(self): list_method = self.plugin.methods['list_of_ints'] dict_method = self.plugin.methods['dict_of_ints'] int_list = [Artifact.import_data(SingleInt, 1), Artifact.import_data(SingleInt, 2)] expected = {'0': 1, '1': 2} list_out = list_method(int_list) dict_out = dict_method(int_list) self.assertEqual(len(list_out), 1) self.assertEqual(len(dict_out), 1) self.assertIsInstance(list_out.output, ResultCollection) self.assertIsInstance(dict_out.output, ResultCollection) view_list_out = view_collection(list_out.output, int) view_dict_out = view_collection(dict_out.output, int) self.assertEqual(view_list_out, expected) self.assertEqual(view_dict_out, expected) def test_collection_dict_input(self): list_method = self.plugin.methods['list_of_ints'] dict_method = self.plugin.methods['dict_of_ints'] int_dict = {'foo': Artifact.import_data(SingleInt, 1), 'bar': Artifact.import_data(SingleInt, 2)} # The dict method should have preserved the keys; the list method cannot # have, because it only received the values as a list, so it falls back # to list indices as keys expected_list = {'0': 1, '1': 2} expected_dict = {'foo': 1, 'bar': 2} list_out = list_method(int_dict) dict_out = dict_method(int_dict) self.assertEqual(len(list_out), 1) self.assertEqual(len(dict_out), 1) self.assertIsInstance(list_out.output, ResultCollection) self.assertIsInstance(dict_out.output, ResultCollection) view_list_out = view_collection(list_out.output, int) view_dict_out = view_collection(dict_out.output, int) self.assertEqual(view_list_out, expected_list) self.assertEqual(view_dict_out, expected_dict) def test_collection_inner_union(self): inner_union = self.plugin.methods['collection_inner_union'] inner_test = [Artifact.import_data(IntSequence1, [0, 1, 2]), Artifact.import_data(IntSequence2, [3, 4, 5])] out = inner_union(inner_test) self.assertEqual(len(out), 1) self.assertIsInstance(out.output, ResultCollection) def test_collection_outer_union(self): outer_union = self.plugin.methods['collection_outer_union'] int_dict = {'1': Artifact.import_data(IntSequence1, [0, 1, 2]), '2':
Artifact.import_data(IntSequence1, [3, 4, 5])} out = outer_union(int_dict) self.assertEqual(len(out), 1) self.assertIsInstance(out.output, ResultCollection) def test_collection_list_param(self): list_method = self.plugin.methods['list_params'] param_list = [1, 2, 3, 4] param_dict = {'a': 1, 'b': 2, 'c': 3, 'd': 4} expected = {'0': 1, '1': 2, '2': 3, '3': 4} list_out = list_method(param_list) dict_out = list_method(param_dict) self.assertEqual(len(list_out), 1) self.assertEqual(len(dict_out), 1) self.assertIsInstance(list_out.output, ResultCollection) self.assertIsInstance(dict_out.output, ResultCollection) view_list_out = view_collection(list_out.output, int) view_dict_out = view_collection(dict_out.output, int) self.assertEqual(view_list_out, expected) self.assertEqual(view_dict_out, expected) def test_collection_dict_param(self): dict_method = self.plugin.methods['dict_params'] param_list = [1, 2, 3, 4] param_dict = {'a': 1, 'b': 2, 'c': 3, 'd': 4} # The dict method should have preserved the keys; the list method cannot # have, because it only received the values as a list, so it falls back # to list indices as keys expected_list = {'0': 1, '1': 2, '2': 3, '3': 4} expected_dict = {'a': 1, 'b': 2, 'c': 3, 'd': 4} list_out = dict_method(param_list) dict_out = dict_method(param_dict) self.assertEqual(len(list_out), 1) self.assertEqual(len(dict_out), 1) self.assertIsInstance(list_out.output, ResultCollection) self.assertIsInstance(dict_out.output, ResultCollection) view_list_out = view_collection(list_out.output, int) view_dict_out = view_collection(dict_out.output, int) self.assertEqual(view_list_out, expected_list) self.assertEqual(view_dict_out, expected_dict) def test_varied_method(self): varied_method = self.plugin.methods['varied_method'] ints1 = [Artifact.import_data(SingleInt, 1), Artifact.import_data(SingleInt, 2)] ints2 = {'foo': Artifact.import_data(IntSequence1, [0, 1, 2]), 'bar': Artifact.import_data(IntSequence1, [3, 4, 5])} int1 = Artifact.import_data(SingleInt, 1) ints1_expected = {'0': 1, '1': 2} ints2_expected = {'foo': [0, 1, 2], 'bar': [3, 4, 5]} int1_expected = 1 ints1_ret, ints2_ret, int1_ret = varied_method( ints1, ints2, int1, 'Hi') self.assertEqual(len(ints1_ret), 2) self.assertEqual(len(ints2_ret), 2) self.assertIsInstance(ints1_ret, ResultCollection) self.assertIsInstance(ints2_ret, ResultCollection) view_ints1_ret = view_collection(ints1_ret, int) view_ints2_ret = view_collection(ints2_ret, list) view_int1_ret = int1_ret.view(int) self.assertEqual(view_ints1_ret, ints1_expected) self.assertEqual(view_ints2_ret, ints2_expected) self.assertEqual(view_int1_ret, int1_expected) exp_merge_calldoc = """\ Merge mappings This method merges two mappings into a single new mapping. If a key is shared between mappings and the values differ, an error will be raised. Parameters ---------- mapping1 : Mapping Mapping object to be merged mapping2 : Mapping Returns ------- merged_mapping : Mapping Resulting merged Mapping object """ exp_split_ints_return = """\ Returns ------- left : IntSequence1\xb9 | IntSequence2\xb2 right : IntSequence1\xb9 | IntSequence2\xb2 """ exp_optional_params = """\ Parameters ---------- ints : IntSequence1 | IntSequence2 metadata : Metadata, optional\ """ exp_no_input_method = """\ No input method This method does not accept any type of input. Returns ------- out : Mapping """ exp_params_only = """\ Parameters only method This method only accepts parameters.
Parameters ---------- name : Str age : Int Returns ------- out : Mapping """ exp_long_description = """\ Long Description This is a very long description. If asked about its length, I would have to say it is greater than 79 characters. Parameters ---------- mapping1 : Mapping This is a very long description. If asked about its length, I would have to say it is greater than 79 characters. name : Str This is a very long description. If asked about its length, I would have to say it is greater than 79 characters. age : Int Returns ------- out : Mapping This is a very long description. If asked about its length, I would have to say it is greater than 79 characters. """ exp_docstring_order = """\ Docstring Order Tests whether inputs and parameters are rendered in signature order Parameters ---------- req_input : Mapping This should show up first. req_param : Str This should show up second. opt_input : Mapping, optional This should show up third. opt_param : Int, optional This should show up fourth. Returns ------- out : Mapping This should show up last, in it's own section. """ if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/sdk/tests/test_pipeline.py000066400000000000000000000404561462552636000215030ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import unittest import inspect import pandas as pd import qiime2 import qiime2.sdk from qiime2.core.testing.util import get_dummy_plugin from qiime2.core.testing.type import IntSequence1, SingleInt, Mapping from qiime2.plugin import Visualization, Int, Bool import qiime2.sdk.parallel_config class TestPipeline(unittest.TestCase): def setUp(self): self.plugin = get_dummy_plugin() self.single_int = qiime2.Artifact.import_data(SingleInt, -1) self.int_sequence = qiime2.Artifact.import_data(IntSequence1, [1, 2, 3]) self.mapping = qiime2.Artifact.import_data(Mapping, {'foo': '42'}) def test_private_constructor(self): with self.assertRaisesRegex(NotImplementedError, 'Pipeline constructor.*private'): qiime2.sdk.Pipeline() def test_from_function_spot_check(self): typical_pipeline = self.plugin.pipelines['typical_pipeline'] self.assertEqual(typical_pipeline.id, 'typical_pipeline') assert typical_pipeline.signature.inputs for spec in typical_pipeline.signature.inputs.values(): assert spec.has_description() assert spec.has_qiime_type() assert not spec.has_view_type() assert not spec.has_default() spec = typical_pipeline.signature.parameters['add'] assert spec.has_default() def test_from_function_optional(self): optional_artifact_pipeline = self.plugin.pipelines[ 'optional_artifact_pipeline'] spec = optional_artifact_pipeline.signature.inputs['single_int'] assert spec.has_default() def test_is_callable(self): assert callable(self.plugin.pipelines['typical_pipeline']) def test_callable_and_async_signature(self): # Shouldn't include `ctx` typical_pipeline = self.plugin.pipelines['typical_pipeline'] kind = inspect.Parameter.POSITIONAL_OR_KEYWORD exp_parameters = [ ('int_sequence', inspect.Parameter( 'int_sequence', kind, annotation=IntSequence1)), ('mapping', inspect.Parameter( 'mapping', kind, annotation=Mapping)), ('do_extra_thing', inspect.Parameter( 'do_extra_thing', kind, annotation=Bool)), ('add', inspect.Parameter( 
'add', kind, default=1, annotation=Int)) ] for callable_attr in '__call__', 'asynchronous': signature = inspect.Signature.from_callable( getattr(typical_pipeline, callable_attr)) parameters = list(signature.parameters.items()) self.assertEqual(parameters, exp_parameters) def test_signatures_independent(self): typical_pipeline = self.plugin.pipelines['typical_pipeline'] parameter_only_pipeline = self.plugin.pipelines[ 'parameter_only_pipeline'] for callable_attr in '__call__', 'asynchronous': signature_a = inspect.Signature.from_callable( getattr(typical_pipeline, callable_attr)) signature_b = inspect.Signature.from_callable( getattr(parameter_only_pipeline, callable_attr)) self.assertNotEqual(signature_a, signature_b) def test_list_pipeline(self): list_pipeline = self.plugin.pipelines['list_pipeline'] int_list = [qiime2.Artifact.import_data(IntSequence1, [0, 1, 2]), qiime2.Artifact.import_data(IntSequence1, [3, 4, 5])] int_dict = {'1': qiime2.Artifact.import_data(IntSequence1, [0, 1, 2]), '2': qiime2.Artifact.import_data(IntSequence1, [3, 4, 5])} list_out = list_pipeline(int_list) dict_out = list_pipeline(int_dict) self.assertEqual(len(list_out), 1) self.assertEqual(len(dict_out), 1) self.assertIsInstance(list_out.output, qiime2.sdk.ResultCollection) self.assertIsInstance(dict_out.output, qiime2.sdk.ResultCollection) self.assertEqual(list(list_out.output.keys()), ['0', '1']) self.assertEqual(list(dict_out.output.keys()), ['0', '1']) self.assertEqual( [v.view(int) for v in list_out.output.values()], [4, 5]) self.assertEqual( [v.view(int) for v in dict_out.output.values()], [4, 5]) def test_collection_pipeline(self): collection_pipeline = self.plugin.pipelines['collection_pipeline'] int_list = [qiime2.Artifact.import_data(IntSequence1, [0, 1, 2]), qiime2.Artifact.import_data(IntSequence1, [3, 4, 5])] int_dict = {'1': qiime2.Artifact.import_data(IntSequence1, [0, 1, 2]), '2': qiime2.Artifact.import_data(IntSequence1, [3, 4, 5])} list_out = collection_pipeline(int_list) dict_out = collection_pipeline(int_dict) self.assertEqual(len(list_out), 1) self.assertEqual(len(dict_out), 1) self.assertIsInstance(list_out.output, qiime2.sdk.ResultCollection) self.assertIsInstance(dict_out.output, qiime2.sdk.ResultCollection) self.assertEqual(list(list_out.output.keys()), ['key1', 'key2']) self.assertEqual(list(dict_out.output.keys()), ['key1', 'key2']) self.assertEqual( [v.view(int) for v in list_out.output.values()], [4, 5]) self.assertEqual( [v.view(int) for v in dict_out.output.values()], [4, 5]) def test_de_facto_collection_pipeline(self): de_facto_collection_pipeline = \ self.plugin.pipelines['de_facto_collection_pipeline'] result = de_facto_collection_pipeline() self.assertEqual(len(result), 1) output = result.output self.assertIsInstance(output, qiime2.sdk.ResultCollection) expected = {'0': {'foo': '42'}, '1': {'foo': '42'}} observed = {} for k, v in output.items(): observed[k] = v.view(dict) self.assertEqual(observed, expected) def test_de_facto_collection_pipeline_parallel(self): de_facto_collection_pipeline = \ self.plugin.pipelines['de_facto_collection_pipeline'] with qiime2.sdk.parallel_config.ParallelConfig(): result = de_facto_collection_pipeline.parallel()._result() self.assertEqual(len(result), 1) output = result.output self.assertIsInstance(output, qiime2.sdk.ResultCollection) expected = {'0': {'foo': '42'}, '1': {'foo': '42'}} observed = {} for k, v in output.items(): observed[k] = v.view(dict) self.assertEqual(observed, expected) def iter_callables(self, name): pipeline = 
self.plugin.pipelines[name] yield pipeline yield lambda *args, **kwargs: pipeline.asynchronous( *args, **kwargs).result() def test_parameter_only_pipeline(self): index = pd.Index(['a', 'b', 'c'], name='id', dtype=object) df = pd.DataFrame({'col1': ['2', '1', '3']}, index=index, dtype=object) metadata = qiime2.Metadata(df) for call in self.iter_callables('parameter_only_pipeline'): results = call(100) self.assertEqual(results.foo.view(list), [100, 2, 3]) self.assertEqual(results.bar.view(list), [100, 2, 3, 100, 2, 3, 100, 2, 3, 100, 2]) results = call(3, int2=4, metadata=metadata) self.assertEqual(results.foo.view(list), [3, 4, 3]) self.assertEqual(results.bar.view(list), [3, 4, 3, 3, 4, 3, 3, 4, 3, 3, 4]) def test_typical_pipeline(self): for call in self.iter_callables('typical_pipeline'): results = call(self.int_sequence, self.mapping, False) self.assertEqual(results.left_viz.type, Visualization) self.assertEqual(results.left.view(list), [1]) self.assertEqual(results.right.view(list), [2, 3]) self.assertNotEqual(results.out_map.uuid, self.mapping.uuid) self.assertEqual(results.out_map.view(dict), self.mapping.view(dict)) results = call(self.int_sequence, self.mapping, True, add=5) self.assertEqual(results.left.view(list), [6]) self.assertEqual(results.right.view(list), [2, 3]) with self.assertRaisesRegex(ValueError, 'Bad mapping'): m = qiime2.Artifact.import_data(Mapping, {'a': 1}) call(self.int_sequence, m, False) def test_optional_artifact_pipeline(self): for call in self.iter_callables('optional_artifact_pipeline'): ints, = call(self.int_sequence) self.assertEqual(ints.view(list), [1, 2, 3, 4]) ints, = call(self.int_sequence, single_int=self.single_int) self.assertEqual(ints.view(list), [1, 2, 3, -1]) def test_visualizer_only_pipeline(self): for call in self.iter_callables('visualizer_only_pipeline'): viz1, viz2 = call(self.mapping) self.assertEqual(viz1.type, Visualization) self.assertEqual(viz2.type, Visualization) def test_pipeline_in_pipeline(self): for call in self.iter_callables('pipelines_in_pipeline'): results = call(self.int_sequence, self.mapping) self.assertEqual(results.int1.view(int), 4) self.assertEqual(results.right_viz.type, Visualization) self.assertEqual(len(results), 8) with self.assertRaisesRegex(ValueError, 'Bad mapping'): m = qiime2.Artifact.import_data(Mapping, {1: 1}) call(self.int_sequence, m) def test_pointless_pipeline(self): for call in self.iter_callables('pointless_pipeline'): single_int, = call() self.assertEqual(single_int.type, SingleInt) self.assertEqual(single_int.view(int), 4) def test_de_facto_list_arg(self): pipeline = self.plugin.pipelines['de_facto_list_pipeline'] exp = {'0': 0, '1': 1, '2': 2} ret = pipeline() obs = qiime2.sdk.util.view_collection(ret.output, int) self.assertEqual(obs, exp) def test_de_facto_list_arg_parallel(self): pipeline = self.plugin.pipelines['de_facto_list_pipeline'] exp = {'0': 0, '1': 1, '2': 2} with qiime2.sdk.parallel_config.ParallelConfig(): ret = pipeline.parallel()._result() obs = qiime2.sdk.util.view_collection(ret.output, int) self.assertEqual(obs, exp) def test_de_facto_list_kwarg(self): pipeline = self.plugin.pipelines['de_facto_list_pipeline'] exp = {'0': 0, '1': 1, '2': 2} ret = pipeline(kwarg=True) obs = qiime2.sdk.util.view_collection(ret.output, int) self.assertEqual(obs, exp) def test_de_facto_list_kwarg_parallel(self): pipeline = self.plugin.pipelines['de_facto_list_pipeline'] exp = {'0': 0, '1': 1, '2': 2} with qiime2.sdk.parallel_config.ParallelConfig(): ret = pipeline.parallel(kwarg=True)._result() 
obs = qiime2.sdk.util.view_collection(ret.output, int) self.assertEqual(obs, exp) def test_de_facto_dict_arg(self): pipeline = self.plugin.pipelines['de_facto_dict_pipeline'] exp = {'1': 0, '2': 1, '3': 2} ret = pipeline() obs = qiime2.sdk.util.view_collection(ret.output, int) self.assertEqual(obs, exp) def test_de_facto_dict_arg_parallel(self): pipeline = self.plugin.pipelines['de_facto_dict_pipeline'] exp = {'1': 0, '2': 1, '3': 2} with qiime2.sdk.parallel_config.ParallelConfig(): ret = pipeline.parallel()._result() obs = qiime2.sdk.util.view_collection(ret.output, int) self.assertEqual(obs, exp) def test_de_facto_dict_kwarg(self): pipeline = self.plugin.pipelines['de_facto_dict_pipeline'] exp = {'1': 0, '2': 1, '3': 2} ret = pipeline(kwarg=True) obs = qiime2.sdk.util.view_collection(ret.output, int) self.assertEqual(obs, exp) def test_de_facto_dict_kwarg_parallel(self): pipeline = self.plugin.pipelines['de_facto_dict_pipeline'] exp = {'1': 0, '2': 1, '3': 2} with qiime2.sdk.parallel_config.ParallelConfig(): ret = pipeline.parallel(kwarg=True)._result() obs = qiime2.sdk.util.view_collection(ret.output, int) self.assertEqual(obs, exp) def test_nested_pipeline_parallel(self): ''' This test basically just validates that nested pipelines in parallel don't blow anything up. It was added concurrently with nested parallel pipelines being flattened. ''' pipeline = self.plugin.pipelines['pipelines_in_pipeline'] ints = qiime2.Artifact.import_data(IntSequence1, [1, 2, 3]) mapping = qiime2.Artifact.import_data(Mapping, {'foo': '42'}) with qiime2.sdk.parallel_config.ParallelConfig(): pipeline.parallel(ints, mapping)._result() self.assertTrue(True) def test_failing_from_arity(self): for call in self.iter_callables('failing_pipeline'): with self.assertRaisesRegex(TypeError, 'match number.*3.*1'): call(self.int_sequence, break_from='arity') def test_failing_from_return_view(self): for call in self.iter_callables('failing_pipeline'): with self.assertRaisesRegex(TypeError, 'Result.*objects.*None'): call(self.int_sequence, break_from='return-view') def test_failing_from_method(self): for call in self.iter_callables('failing_pipeline'): with self.assertRaisesRegex(ValueError, "Key 'foo' exists"): call(self.int_sequence, break_from='method') def test_failing_from_type(self): for call in self.iter_callables('failing_pipeline'): with self.assertRaisesRegex(TypeError, 'Mapping.*SingleInt'): call(self.int_sequence, break_from='type') def test_failing_from_internal(self): for call in self.iter_callables('failing_pipeline'): with self.assertRaisesRegex(ValueError, 'this never works'): call(self.int_sequence, break_from='internal') def test_failing_from_missing_plugin(self): for call in self.iter_callables('failing_pipeline'): with self.assertRaisesRegex(ValueError, r'plugin.*not\%a\$plugin'): call(self.int_sequence, break_from='no-plugin') def test_failing_from_missing_action(self): for call in self.iter_callables('failing_pipeline'): with self.assertRaisesRegex(ValueError, r'action.*not\%a\$method'): call(self.int_sequence, break_from='no-action') def test_fail_de_facto_list_arg_mixed(self): pipeline = self.plugin.pipelines['de_facto_list_pipeline'] with self.assertRaisesRegex( ValueError, 'Collection has mixed proxies and artifacts.*'): with qiime2.sdk.parallel_config.ParallelConfig(): pipeline.parallel(non_proxies=True)._result() def test_fail_de_facto_list_kwarg_mixed(self): pipeline = self.plugin.pipelines['de_facto_list_pipeline'] with self.assertRaisesRegex( ValueError, 'Collection has mixed proxies and 
artifacts.*'): with qiime2.sdk.parallel_config.ParallelConfig(): pipeline.parallel(kwarg=True, non_proxies=True)._result() def test_fail_de_facto_dict_arg_mixed(self): pipeline = self.plugin.pipelines['de_facto_dict_pipeline'] with self.assertRaisesRegex( ValueError, 'Collection has mixed proxies and artifacts.*'): with qiime2.sdk.parallel_config.ParallelConfig(): pipeline.parallel(non_proxies=True)._result() def test_fail_de_facto_dict_kwarg_mixed(self): pipeline = self.plugin.pipelines['de_facto_dict_pipeline'] with self.assertRaisesRegex( ValueError, 'Collection has mixed proxies and artifacts.*'): with qiime2.sdk.parallel_config.ParallelConfig(): pipeline.parallel(kwarg=True, non_proxies=True)._result() if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/sdk/tests/test_plugin_manager.py000066400000000000000000000456051462552636000226670ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import unittest import qiime2.plugin import qiime2.sdk from qiime2.plugin.plugin import (SemanticTypeRecord, FormatRecord, ArtifactClassRecord) from qiime2.sdk.plugin_manager import GetFormatFilters from qiime2.core.testing.type import (IntSequence1, IntSequence2, IntSequence3, Mapping, FourInts, Kennel, Dog, Cat, SingleInt, C1, C2, C3, Foo, Bar, Baz, AscIntSequence, Squid, Octopus, Cuttlefish) from qiime2.core.testing.format import (Cephalapod, IntSequenceDirectoryFormat, MappingDirectoryFormat, IntSequenceV2DirectoryFormat, IntSequenceFormatV2, IntSequenceMultiFileDirectoryFormat, FourIntsDirectoryFormat, IntSequenceFormat, RedundantSingleIntDirectoryFormat, EchoFormat, EchoDirectoryFormat, CephalapodDirectoryFormat, ImportableOnlyFormat, ExportableOnlyFormat) from qiime2.core.testing.validator import (validator_example_null1, validate_ascending_seq, validator_example_null2) from qiime2.core.testing.util import get_dummy_plugin from qiime2.core.testing.plugin import is1_use, is2_use class TestPluginManager(unittest.TestCase): def setUp(self): # PluginManager is a singleton so there's no issue creating it again. 
self.pm = qiime2.sdk.PluginManager() self.plugin = get_dummy_plugin() self.other_plugin = self.pm.plugins['other-plugin'] def test_plugins(self): plugins = self.pm.plugins exp = { 'dummy-plugin': self.plugin, 'other-plugin': self.other_plugin } self.assertEqual(plugins, exp) def test_validators(self): self.assertEqual({Kennel[Dog], Kennel[Cat], AscIntSequence, Squid, Octopus, Cuttlefish}, set(self.pm.validators)) self.assertEqual( set([r.validator for r in self.pm.validators[Kennel[Dog]]._validators]), {validator_example_null1, validator_example_null2}) self.assertEqual( [r.validator for r in self.pm.validators[Kennel[Cat]]._validators], [validator_example_null1]) self.assertEqual( [r.validator for r in self.pm.validators[AscIntSequence]._validators], [validate_ascending_seq]) def test_type_fragments(self): types = self.pm.type_fragments exp = { 'IntSequence1': SemanticTypeRecord(semantic_type=IntSequence1, plugin=self.plugin), 'IntSequence2': SemanticTypeRecord(semantic_type=IntSequence2, plugin=self.plugin), 'IntSequence3': SemanticTypeRecord(semantic_type=IntSequence3, plugin=self.plugin), 'Mapping': SemanticTypeRecord(semantic_type=Mapping, plugin=self.plugin), 'FourInts': SemanticTypeRecord(semantic_type=FourInts, plugin=self.plugin), 'Kennel': SemanticTypeRecord(semantic_type=Kennel, plugin=self.plugin), 'Dog': SemanticTypeRecord(semantic_type=Dog, plugin=self.plugin), 'Cat': SemanticTypeRecord(semantic_type=Cat, plugin=self.plugin), 'SingleInt': SemanticTypeRecord(semantic_type=SingleInt, plugin=self.plugin), 'C1': SemanticTypeRecord(semantic_type=C1, plugin=self.plugin), 'C2': SemanticTypeRecord(semantic_type=C2, plugin=self.plugin), 'C3': SemanticTypeRecord(semantic_type=C3, plugin=self.plugin), 'Foo': SemanticTypeRecord(semantic_type=Foo, plugin=self.plugin), 'Bar': SemanticTypeRecord(semantic_type=Bar, plugin=self.plugin), 'Baz': SemanticTypeRecord(semantic_type=Baz, plugin=self.plugin), 'AscIntSequence': SemanticTypeRecord(semantic_type=AscIntSequence, plugin=self.plugin), 'Squid': SemanticTypeRecord(semantic_type=Squid, plugin=self.plugin), 'Octopus': SemanticTypeRecord(semantic_type=Octopus, plugin=self.plugin), 'Cuttlefish': SemanticTypeRecord(semantic_type=Cuttlefish, plugin=self.plugin), } self.assertEqual(types, exp) def test_get_semantic_types(self): artifact_classes = self.pm.get_semantic_types() is1 = ArtifactClassRecord( semantic_type=IntSequence1, format=IntSequenceDirectoryFormat, plugin=self.plugin, description="The first IntSequence", examples={'IntSequence1 import example': is1_use}, type_expression=IntSequence1) is2 = ArtifactClassRecord( semantic_type=IntSequence2, format=IntSequenceV2DirectoryFormat, plugin=self.plugin, description="The second IntSequence", examples={'IntSequence2 import example': is2_use}, type_expression=IntSequence2) is3 = ArtifactClassRecord(semantic_type=IntSequence3, format=IntSequenceMultiFileDirectoryFormat, plugin=self.plugin, description="", examples={}, type_expression=IntSequence3) kd = ArtifactClassRecord(semantic_type=Kennel[Dog], format=MappingDirectoryFormat, plugin=self.plugin, description="", examples={}, type_expression=Kennel[Dog]) kc = ArtifactClassRecord(semantic_type=Kennel[Cat], format=MappingDirectoryFormat, plugin=self.plugin, description="", examples={}, type_expression=Kennel[Cat]) self.assertLessEqual( {str(e.semantic_type) for e in [is1, is2, is3, kd, kc]}, artifact_classes.keys()) self.assertEqual(is1, artifact_classes['IntSequence1']) self.assertEqual(is2, artifact_classes['IntSequence2']) self.assertEqual(is3, 
artifact_classes['IntSequence3']) self.assertNotIn('Cat', artifact_classes) self.assertNotIn('Dog', artifact_classes) self.assertNotIn('Kennel', artifact_classes) self.assertIn('Kennel[Dog]', artifact_classes) self.assertIn('Kennel[Cat]', artifact_classes) # TODO: add tests for type/directory/transformer registrations def test_get_formats_no_type_or_filter(self): exp = { 'IntSequenceFormat': FormatRecord(format=IntSequenceFormat, plugin=self.plugin), 'IntSequenceDirectoryFormat': FormatRecord(format=IntSequenceDirectoryFormat, plugin=self.plugin), 'IntSequenceFormatV2': FormatRecord(format=IntSequenceFormatV2, plugin=self.plugin), 'IntSequenceV2DirectoryFormat': FormatRecord(format=IntSequenceV2DirectoryFormat, plugin=self.plugin), 'IntSequenceMultiFileDirectoryFormat': FormatRecord(format=IntSequenceMultiFileDirectoryFormat, plugin=self.plugin), 'RedundantSingleIntDirectoryFormat': FormatRecord(format=RedundantSingleIntDirectoryFormat, plugin=self.plugin), 'FourIntsDirectoryFormat': FormatRecord(format=FourIntsDirectoryFormat, plugin=self.plugin), 'EchoFormat': FormatRecord(format=EchoFormat, plugin=self.plugin), 'EchoDirectoryFormat': FormatRecord(format=EchoDirectoryFormat, plugin=self.plugin), 'MappingDirectoryFormat': FormatRecord(format=MappingDirectoryFormat, plugin=self.plugin), 'Cephalapod': FormatRecord(format=Cephalapod, plugin=self.plugin), 'CephalapodDirectoryFormat': FormatRecord(format=CephalapodDirectoryFormat, plugin=self.plugin), 'ImportableOnlyFormat': FormatRecord(format=ImportableOnlyFormat, plugin=self.plugin), 'ExportableOnlyFormat': FormatRecord(format=ExportableOnlyFormat, plugin=self.plugin), } obs = self.pm.get_formats() self.assertEqual(obs, exp) def test_get_formats_SFDF(self): exp = { 'IntSequenceFormat': FormatRecord(format=IntSequenceFormat, plugin=self.plugin), 'IntSequenceFormatV2': FormatRecord(format=IntSequenceFormatV2, plugin=self.plugin), 'IntSequenceDirectoryFormat': FormatRecord(format=IntSequenceDirectoryFormat, plugin=self.plugin), 'IntSequenceV2DirectoryFormat': FormatRecord(format=IntSequenceV2DirectoryFormat, plugin=self.plugin), 'IntSequenceMultiFileDirectoryFormat': FormatRecord(format=IntSequenceMultiFileDirectoryFormat, plugin=self.plugin), 'ImportableOnlyFormat': FormatRecord(format=ImportableOnlyFormat, plugin=self.plugin), 'ExportableOnlyFormat': FormatRecord(format=ExportableOnlyFormat, plugin=self.plugin), } obs = self.pm.get_formats(semantic_type='IntSequence1') self.assertEqual(exp, obs) def test_get_formats_SFDF_EXPORTABLE(self): exp = { 'IntSequenceFormat': FormatRecord(format=IntSequenceFormat, plugin=self.plugin), 'IntSequenceFormatV2': FormatRecord(format=IntSequenceFormatV2, plugin=self.plugin), 'IntSequenceDirectoryFormat': FormatRecord(format=IntSequenceDirectoryFormat, plugin=self.plugin), 'IntSequenceV2DirectoryFormat': FormatRecord(format=IntSequenceV2DirectoryFormat, plugin=self.plugin), 'ExportableOnlyFormat': FormatRecord(format=ExportableOnlyFormat, plugin=self.plugin) } obs = self.pm.get_formats(filter=GetFormatFilters.EXPORTABLE, semantic_type=IntSequence1) self.assertEqual(exp, obs) def test_get_formats_SFDF_IMPORTABLE(self): exp = { 'IntSequenceFormat': FormatRecord(format=IntSequenceFormat, plugin=self.plugin), 'IntSequenceDirectoryFormat': FormatRecord(format=IntSequenceDirectoryFormat, plugin=self.plugin), 'IntSequenceMultiFileDirectoryFormat': FormatRecord(format=IntSequenceMultiFileDirectoryFormat, plugin=self.plugin), 'ImportableOnlyFormat': FormatRecord(format=ImportableOnlyFormat, plugin=self.plugin) } obs 
= self.pm.get_formats(filter=GetFormatFilters.IMPORTABLE, semantic_type=IntSequence1) self.assertEqual(exp, obs) def test_get_formats_DF(self): exp = { 'IntSequenceFormat': FormatRecord(format=IntSequenceFormat, plugin=self.plugin), 'IntSequenceFormatV2': FormatRecord(format=IntSequenceFormatV2, plugin=self.plugin), 'IntSequenceDirectoryFormat': FormatRecord(format=IntSequenceDirectoryFormat, plugin=self.plugin), 'IntSequenceV2DirectoryFormat': FormatRecord(format=IntSequenceV2DirectoryFormat, plugin=self.plugin), 'IntSequenceMultiFileDirectoryFormat': FormatRecord(format=IntSequenceMultiFileDirectoryFormat, plugin=self.plugin) } obs = self.pm.get_formats(semantic_type='IntSequence3') self.assertEqual(exp, obs) def test_get_formats_DF_EXPORTABLE(self): exp = { 'IntSequenceFormat': FormatRecord(format=IntSequenceFormat, plugin=self.plugin), 'IntSequenceDirectoryFormat': FormatRecord(format=IntSequenceDirectoryFormat, plugin=self.plugin), 'IntSequenceMultiFileDirectoryFormat': FormatRecord(format=IntSequenceMultiFileDirectoryFormat, plugin=self.plugin) } obs = self.pm.get_formats(filter=GetFormatFilters.EXPORTABLE, semantic_type=IntSequence3) self.assertEqual(exp, obs) def test_get_formats_DF_exportable_str(self): exp = { 'IntSequenceFormat': FormatRecord(format=IntSequenceFormat, plugin=self.plugin), 'IntSequenceDirectoryFormat': FormatRecord(format=IntSequenceDirectoryFormat, plugin=self.plugin), 'IntSequenceMultiFileDirectoryFormat': FormatRecord(format=IntSequenceMultiFileDirectoryFormat, plugin=self.plugin) } obs = self.pm.get_formats(filter="EXPORTABLE", semantic_type=IntSequence3) self.assertEqual(exp, obs) def test_get_formats_DF_IMPORTABLE(self): exp = { 'IntSequenceFormatV2': FormatRecord(format=IntSequenceFormatV2, plugin=self.plugin), 'IntSequenceV2DirectoryFormat': FormatRecord(format=IntSequenceV2DirectoryFormat, plugin=self.plugin), 'IntSequenceMultiFileDirectoryFormat': FormatRecord(format=IntSequenceMultiFileDirectoryFormat, plugin=self.plugin) } obs = self.pm.get_formats(filter=GetFormatFilters.IMPORTABLE, semantic_type=IntSequence3) self.assertEqual(exp, obs) def test_get_formats_DF_importable_str(self): exp = { 'IntSequenceFormatV2': FormatRecord(format=IntSequenceFormatV2, plugin=self.plugin), 'IntSequenceV2DirectoryFormat': FormatRecord(format=IntSequenceV2DirectoryFormat, plugin=self.plugin), 'IntSequenceMultiFileDirectoryFormat': FormatRecord(format=IntSequenceMultiFileDirectoryFormat, plugin=self.plugin) } obs = self.pm.get_formats(filter="IMPORTABLE", semantic_type=IntSequence3) self.assertEqual(exp, obs) def test_get_formats_invalid_type(self): with self.assertRaisesRegex(ValueError, "No formats associated"): self.pm.get_formats(semantic_type='Random[Frequency]') def test_get_formats_invalid_filter(self): with self.assertRaisesRegex(ValueError, "filter.*is not valid"): self.pm.get_formats(filter="xyz") def test_importable_formats_property(self): imp_f = self.pm.importable_formats self.assertTrue(isinstance(imp_f, dict)) # spot check for a few formats that should be present self.assertTrue('IntSequenceFormatV2' in imp_f) self.assertTrue('CephalapodDirectoryFormat' in imp_f) self.assertTrue('ImportableOnlyFormat' in imp_f) # spot check one that shouldn't be here self.assertFalse('ExportableOnlyFormat' in imp_f) def test_exportable_formats_property(self): exp_f = self.pm.exportable_formats self.assertTrue(isinstance(exp_f, dict)) # spot check for a few formats that should be present self.assertTrue('IntSequenceDirectoryFormat' in exp_f) 
        self.assertTrue('IntSequenceV2DirectoryFormat' in exp_f)
        self.assertTrue('ExportableOnlyFormat' in exp_f)
        # spot check one that shouldn't be here
        self.assertFalse('ImportableOnlyFormat' in exp_f)

    def test_deprecated_type_formats(self):
        # PluginManager.type_formats was replaced with
        # PluginManager.artifact_classes. For backward compatibility the
        # PluginManager.type_formats property returns the plugin manager's
        # artifact_classes
        self.assertEqual(self.pm.type_formats,
                         list(self.pm.artifact_classes.values()))


if __name__ == '__main__':
    unittest.main()

qiime2-2024.5.0/qiime2/sdk/tests/test_result.py
# ----------------------------------------------------------------------------
# Copyright (c) 2016-2023, QIIME 2 development team.
#
# Distributed under the terms of the Modified BSD License.
#
# The full license is in the file LICENSE, distributed with this software.
# ----------------------------------------------------------------------------
import os
import tempfile
import unittest
import pathlib

import qiime2.core.type
from qiime2.sdk import Result, Artifact, Visualization, ResultCollection
from qiime2.sdk.result import ResultMetadata
import qiime2.core.archive as archive
import qiime2.core.exceptions as exceptions
from qiime2.core.testing.format import IntSequenceDirectoryFormat
from qiime2.core.testing.type import (FourInts, SingleInt, IntSequence1,
                                      IntSequence2)
from qiime2.core.testing.util import get_dummy_plugin, ArchiveTestingMixin
from qiime2.core.testing.visualizer import mapping_viz
from qiime2.core.util import set_permissions, OTHER_NO_WRITE


class TestResult(unittest.TestCase, ArchiveTestingMixin):
    def make_provenance_capture(self):
        # You can't actually import a visualization, but I won't tell
        # visualization if you don't...
        return archive.ImportProvenanceCapture()

    def setUp(self):
        # Ignore the returned dummy plugin object, just run this to verify the
        # plugin exists as the tests rely on it being loaded.
get_dummy_plugin() # TODO standardize temporary directories created by QIIME 2 self.test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') self.data_dir = os.path.join(self.test_dir.name, 'viz-output') os.mkdir(self.data_dir) mapping_viz(self.data_dir, {'abc': 'foo', 'def': 'bar'}, {'ghi': 'baz', 'jkl': 'bazz'}, key_label='Key', value_label='Value') def tearDown(self): self.test_dir.cleanup() def test_private_constructor(self): with self.assertRaisesRegex( NotImplementedError, 'Result constructor.*private.*Result.load'): Result() def test_load_artifact(self): saved_artifact = Artifact.import_data(FourInts, [-1, 42, 0, 43]) fp = os.path.join(self.test_dir.name, 'artifact.qza') saved_artifact.save(fp) artifact = Result.load(fp) self.assertIsInstance(artifact, Artifact) self.assertEqual(artifact.type, FourInts) self.assertEqual(artifact.uuid, saved_artifact.uuid) self.assertEqual(artifact.view(list), [-1, 42, 0, 43]) def test_load_visualization(self): saved_visualization = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) fp = os.path.join(self.test_dir.name, 'visualization.qzv') saved_visualization.save(fp) visualization = Result.load(fp) self.assertIsInstance(visualization, Visualization) self.assertEqual(visualization.type, qiime2.core.type.Visualization) self.assertEqual(visualization.uuid, saved_visualization.uuid) def test_extract_artifact(self): fp = os.path.join(self.test_dir.name, 'artifact.qza') artifact = Artifact.import_data(FourInts, [-1, 42, 0, 43]) artifact.save(fp) root_dir = str(artifact.uuid) # pathlib normalizes away the `.`, it doesn't matter, but this is the # implementation we're using, so let's test against that assumption. output_dir = pathlib.Path(self.test_dir.name) / 'artifact-extract-test' result_dir = Result.extract(fp, output_dir=output_dir) self.assertEqual(result_dir, str(output_dir / root_dir)) expected = { 'VERSION', 'checksums.md5', 'metadata.yaml', 'data/file1.txt', 'data/file2.txt', 'data/nested/file3.txt', 'data/nested/file4.txt', 'provenance/metadata.yaml', 'provenance/VERSION', 'provenance/citations.bib', 'provenance/action/action.yaml' } self.assertExtractedArchiveMembers(output_dir, root_dir, expected) def test_extract_visualization(self): fp = os.path.join(self.test_dir.name, 'visualization.qzv') visualization = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) visualization.save(fp) root_dir = str(visualization.uuid) output_dir = pathlib.Path(self.test_dir.name) / 'viz-extract-test' result_dir = Result.extract(fp, output_dir=output_dir) self.assertEqual(result_dir, str(output_dir / root_dir)) expected = { 'VERSION', 'checksums.md5', 'metadata.yaml', 'data/index.html', 'data/css/style.css', 'provenance/metadata.yaml', 'provenance/VERSION', 'provenance/citations.bib', 'provenance/action/action.yaml' } self.assertExtractedArchiveMembers(output_dir, root_dir, expected) def test_peek_artifact(self): artifact = Artifact.import_data(FourInts, [0, 0, 42, 1000]) fp = os.path.join(self.test_dir.name, 'artifact.qza') artifact.save(fp) metadata = Result.peek(fp) self.assertIsInstance(metadata, ResultMetadata) self.assertEqual(metadata.type, 'FourInts') self.assertEqual(metadata.uuid, str(artifact.uuid)) self.assertEqual(metadata.format, 'FourIntsDirectoryFormat') def test_peek_visualization(self): visualization = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) fp = os.path.join(self.test_dir.name, 'visualization.qzv') visualization.save(fp) metadata = Result.peek(fp) 
self.assertIsInstance(metadata, ResultMetadata) self.assertEqual(metadata.type, 'Visualization') self.assertEqual(metadata.uuid, str(visualization.uuid)) self.assertIsNone(metadata.format) def test_save_artifact_auto_extension(self): artifact = Artifact.import_data(FourInts, [0, 0, 42, 1000]) # Filename & extension endswith is matching (default). fp = os.path.join(self.test_dir.name, 'artifactqza') obs_fp = artifact.save(fp) obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'artifactqza.qza') # Filename & extension endswith is matching (non-default). fp = os.path.join(self.test_dir.name, 'artifacttxt') obs_fp = artifact.save(fp, 'txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'artifacttxt.txt') # No period in filename; no period in extension. fp = os.path.join(self.test_dir.name, 'artifact') obs_fp = artifact.save(fp, 'txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'artifact.txt') # No period in filename; multiple periods in extension. fp = os.path.join(self.test_dir.name, 'artifact') obs_fp = artifact.save(fp, '..txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'artifact.txt') # Single period in filename; no period in extension. fp = os.path.join(self.test_dir.name, 'artifact.') obs_fp = artifact.save(fp, 'txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'artifact.txt') # Single period in filename; single period in extension. fp = os.path.join(self.test_dir.name, 'artifact.') obs_fp = artifact.save(fp, '.txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'artifact.txt') # Single period in filename; multiple periods in extension. fp = os.path.join(self.test_dir.name, 'artifact.') obs_fp = artifact.save(fp, '..txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'artifact.txt') # Multiple periods in filename; single period in extension. fp = os.path.join(self.test_dir.name, 'artifact..') obs_fp = artifact.save(fp, '.txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'artifact.txt') # Multiple periods in filename; multiple periods in extension. fp = os.path.join(self.test_dir.name, 'artifact..') obs_fp = artifact.save(fp, '..txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'artifact.txt') # No extension in filename; no extension input. fp = os.path.join(self.test_dir.name, 'artifact') obs_fp = artifact.save(fp) obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'artifact.qza') # No extension in filename; different extension input. fp = os.path.join(self.test_dir.name, 'artifact') obs_fp = artifact.save(fp, '.txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'artifact.txt') # No extension in filename; default extension input. fp = os.path.join(self.test_dir.name, 'artifact') obs_fp = artifact.save(fp, '.qza') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'artifact.qza') # Different extension in filename; no extension input. fp = os.path.join(self.test_dir.name, 'artifact.zip') obs_fp = artifact.save(fp) obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'artifact.zip.qza') # Different extension in filename; # Different extension input (non-matching). 
fp = os.path.join(self.test_dir.name, 'artifact.zip') obs_fp = artifact.save(fp, '.txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'artifact.zip.txt') # Different extension in filename; # Different extension input (matching). fp = os.path.join(self.test_dir.name, 'artifact.zip') obs_fp = artifact.save(fp, '.zip') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'artifact.zip') # Different extension in filename; default extension input. fp = os.path.join(self.test_dir.name, 'artifact.zip') obs_fp = artifact.save(fp, '.qza') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'artifact.zip.qza') # Default extension in filename; no extension input. fp = os.path.join(self.test_dir.name, 'artifact.qza') obs_fp = artifact.save(fp) obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'artifact.qza') # Default extension in filename; different extension input. fp = os.path.join(self.test_dir.name, 'artifact.qza') obs_fp = artifact.save(fp, '.txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'artifact.qza.txt') # Default extension in filename; default extension input. fp = os.path.join(self.test_dir.name, 'artifact.qza') obs_fp = artifact.save(fp, '.qza') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'artifact.qza') def test_save_visualization_auto_extension(self): visualization = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) # Filename & extension endswith is matching (default). fp = os.path.join(self.test_dir.name, 'visualizationqzv') obs_fp = visualization.save(fp) obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'visualizationqzv.qzv') # Filename & extension endswith is matching (non-default). fp = os.path.join(self.test_dir.name, 'visualizationtxt') obs_fp = visualization.save(fp, 'txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'visualizationtxt.txt') # No period in filename; no period in extension. fp = os.path.join(self.test_dir.name, 'visualization') obs_fp = visualization.save(fp, 'txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'visualization.txt') # No period in filename; multiple periods in extension. fp = os.path.join(self.test_dir.name, 'visualization') obs_fp = visualization.save(fp, '..txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'visualization.txt') # Single period in filename; no period in extension. fp = os.path.join(self.test_dir.name, 'visualization.') obs_fp = visualization.save(fp, 'txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'visualization.txt') # Single period in filename; single period in extension. fp = os.path.join(self.test_dir.name, 'visualization.') obs_fp = visualization.save(fp, '.txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'visualization.txt') # Single period in filename; multiple periods in extension. fp = os.path.join(self.test_dir.name, 'visualization.') obs_fp = visualization.save(fp, '..txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'visualization.txt') # Multiple periods in filename; single period in extension. fp = os.path.join(self.test_dir.name, 'visualization..') obs_fp = visualization.save(fp, '.txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'visualization.txt') # Multiple periods in filename; multiple periods in extension. 
fp = os.path.join(self.test_dir.name, 'visualization..') obs_fp = visualization.save(fp, '..txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'visualization.txt') # No extension in filename; no extension input. fp = os.path.join(self.test_dir.name, 'visualization') obs_fp = visualization.save(fp) obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'visualization.qzv') # No extension in filename; different extension input. fp = os.path.join(self.test_dir.name, 'visualization') obs_fp = visualization.save(fp, '.txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'visualization.txt') # No extension in filename; default extension input. fp = os.path.join(self.test_dir.name, 'visualization') obs_fp = visualization.save(fp, '.qzv') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'visualization.qzv') # Different extension in filename; no extension input. fp = os.path.join(self.test_dir.name, 'visualization.zip') obs_fp = visualization.save(fp) obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'visualization.zip.qzv') # Different extension in filename; # Different extension input (non-matching). fp = os.path.join(self.test_dir.name, 'visualization.zip') obs_fp = visualization.save(fp, '.txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'visualization.zip.txt') # Different extension in filename; # Different extension input (matching). fp = os.path.join(self.test_dir.name, 'visualization.zip') obs_fp = visualization.save(fp, '.zip') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'visualization.zip') # Different extension in filename; default extension input. fp = os.path.join(self.test_dir.name, 'visualization.zip') obs_fp = visualization.save(fp, '.qzv') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'visualization.zip.qzv') # Default extension in filename; no extension input. fp = os.path.join(self.test_dir.name, 'visualization.qzv') obs_fp = visualization.save(fp) obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'visualization.qzv') # Default extension in filename; different extension input. fp = os.path.join(self.test_dir.name, 'visualization.qzv') obs_fp = visualization.save(fp, '.txt') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'visualization.qzv.txt') # Default extension in filename; default extension input. 
fp = os.path.join(self.test_dir.name, 'visualization.qzv') obs_fp = visualization.save(fp, '.qzv') obs_filename = os.path.basename(obs_fp) self.assertEqual(obs_filename, 'visualization.qzv') def test_import_data_single_dirfmt_to_single_dirfmt(self): temp_data_dir = os.path.join(self.test_dir.name, 'import') os.mkdir(temp_data_dir) with open(os.path.join(temp_data_dir, 'ints.txt'), 'w') as fh: fh.write("1\n2\n3\n") qiime2.Artifact.import_data('IntSequence2', temp_data_dir, view_type="IntSequenceDirectoryFormat") def test_artifact_has_metadata_true(self): A = Artifact.import_data('Mapping', {'a': '1', 'b': '2'}) self.assertTrue(A.has_metadata()) def test_artifact_has_metadata_false(self): A = Artifact.import_data('IntSequence1', [1, 2, 3, 4]) self.assertFalse(A.has_metadata()) def test_validate_artifact_good(self): artifact = Artifact.import_data('IntSequence1', [1, 2, 3, 4]) artifact.validate() self.assertTrue(True) # Checkpoint def test_validate_artifact_bad(self): artifact = Artifact.import_data('IntSequence1', [1, 2, 3, 4]) # We set everything in the artifact to be read-only. This test needs to # mimic if the user were to somehow write it anyway, so we set write # for self and group set_permissions(artifact._archiver.root_dir, OTHER_NO_WRITE, OTHER_NO_WRITE) with (artifact._archiver.root_dir / 'extra.file').open('w') as fh: fh.write('uh oh') with self.assertRaisesRegex(exceptions.ValidationError, r'extra\.file'): artifact.validate() def test_validate_vizualization_good(self): visualization = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) visualization.validate() self.assertTrue(True) # Checkpoint def test_validate_vizualization_bad(self): visualization = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) # We set everything in the artifact to be read-only. This test needs to # mimic if the user were to somehow write it anyway, so we set write # for self and group set_permissions(visualization._archiver.root_dir, OTHER_NO_WRITE, OTHER_NO_WRITE) with (visualization._archiver.root_dir / 'extra.file').open('w') as fh: fh.write('uh oh') with self.assertRaisesRegex(exceptions.ValidationError, r'extra\.file'): visualization.validate() def test_import_min_validate(self): with tempfile.TemporaryDirectory() as tempdir: fp = os.path.join(tempdir, 'ints.txt') with open(fp, 'w') as fh: for i in range(5): fh.write(f'{i}\n') fh.write('a\n') intseq_dir = IntSequenceDirectoryFormat(tempdir, 'r') # import with min allows format error outside of min purview _ = Artifact.import_data( 'IntSequence1', intseq_dir, validate_level='min' ) # import with max should catch all format errors, max is default with self.assertRaisesRegex( exceptions.ValidationError, 'Line 6 is not an integer' ): _ = Artifact.import_data('IntSequence1', tempdir) with tempfile.TemporaryDirectory() as tempdir: fp = os.path.join(tempdir, 'ints.txt') with open(fp, 'w') as fh: fh.write('1\n') fh.write('a\n') fh.write('3\n') intseq_dir = IntSequenceDirectoryFormat(tempdir, 'r') # import with min catches format errors within its purview with self.assertRaisesRegex( exceptions.ValidationError, 'Line 2 is not an integer' ): _ = Artifact.import_data( 'IntSequence1', [1, 'a', 3, 4], validate_level='min' ) class TestResultCollection(unittest.TestCase): def setUp(self): # Ignore the returned dummy plugin object, just run this to verify the # plugin exists as the tests rely on it being loaded. 
get_dummy_plugin() self.test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') self.output_fp = os.path.join(self.test_dir.name, 'output') self.collection = ResultCollection( {'foo': Artifact.import_data(SingleInt, 0), 'bar': Artifact.import_data(SingleInt, 1)}) def tearDown(self): self.test_dir.cleanup() def test_roundtrip_ordered_collection(self): self.collection.save(self.output_fp) foo = Artifact.load(os.path.join(self.output_fp, 'foo.qza')) bar = Artifact.load(os.path.join(self.output_fp, 'bar.qza')) self.assertEqual(foo.view(int), 0) self.assertEqual(bar.view(int), 1) with open(os.path.join(self.output_fp, '.order')) as fh: self.assertEqual(fh.read(), 'foo\nbar\n') read_collection = ResultCollection.load(self.output_fp) self.assertEqual(self.collection, read_collection) def test_roundtrip_unordered_collection(self): self.collection.save(self.output_fp) os.remove(os.path.join(self.output_fp, '.order')) foo = Artifact.load(os.path.join(self.output_fp, 'foo.qza')) bar = Artifact.load(os.path.join(self.output_fp, 'bar.qza')) self.assertEqual(foo.view(int), 0) self.assertEqual(bar.view(int), 1) with self.assertWarnsRegex( UserWarning, f"The directory '{self.output_fp}' does not " "contain a .order file"): read_collection = ResultCollection.load(self.output_fp) self.assertEqual( set(self.collection.items()), set(read_collection.items())) def test_type_normal_collection(self): self.assertEqual( self.collection.type, qiime2.core.type.Collection[SingleInt]) def test_type_weird_collection(self): weird_collection = ResultCollection({ 'foo': Artifact.import_data(SingleInt, 0), 'bar': Artifact.import_data(FourInts, [1, 2, 3, 4]), 'baz': Artifact.import_data(IntSequence1, [5, 6, 7]), 'qux': Artifact.import_data(IntSequence2, [8, 9, 10])}) self.assertEqual( weird_collection.type, qiime2.core.type.Collection[SingleInt | FourInts | IntSequence1 | IntSequence2]) def test_collection_order_file_contains_nonexistent_key(self): BAD_KEY = 'NonexistentKey' self.collection.save(self.output_fp) order_fp = os.path.join(self.output_fp, '.order') with open(order_fp, 'a') as order_fh: order_fh.write(BAD_KEY) foo = Artifact.load(os.path.join(self.output_fp, 'foo.qza')) bar = Artifact.load(os.path.join(self.output_fp, 'bar.qza')) self.assertEqual(foo.view(int), 0) self.assertEqual(bar.view(int), 1) with self.assertRaisesRegex( ValueError, f"The Result '{BAD_KEY}' is referenced in the " "order file but does not exist"): ResultCollection.load(self.output_fp) def test_collection_non_str_keys(self): with self.assertRaisesRegex( KeyError, 'ResultCollection keys must be strings and may only ' 'contain the following characters:.*1'): ResultCollection({1: 0}) def test_invalid_key_init(self): with self.assertRaisesRegex( KeyError, 'ResultCollection keys must be strings and may only contain ' 'the following characters:.*valid key'): ResultCollection({'not a valid key': 0}) def test_invalid_key_added(self): collection = ResultCollection() with self.assertRaisesRegex( KeyError, 'ResultCollection keys must be strings and may only contain ' 'the following characters:.*valid key'): collection['not a valid key'] = 0 if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/sdk/tests/test_results.py000066400000000000000000000116331462552636000213720ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. 
# # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- import pickle import unittest from qiime2.sdk import Results class TestResults(unittest.TestCase): def test_tuple_subclass(self): self.assertTrue(issubclass(Results, tuple)) self.assertIsInstance(Results(['a', 'b'], [42, 43]), tuple) def test_tuple_cast(self): r = Results(['a', 'b'], [42, 43]) t = tuple(r) self.assertIs(type(t), tuple) self.assertEqual(t, (42, 43)) def test_callable_return_and_unpacking(self): def f(): return Results(['a', 'b'], [42, 43]) a, b = f() self.assertEqual(a, 42) self.assertEqual(b, 43) def test_constructor_iterable(self): r = Results(iter(['a', 'b']), iter([42, 43])) self.assertEqual(tuple(r), (42, 43)) self.assertEqual(r._fields, ('a', 'b')) def test_constructor_len_mismatch(self): with self.assertRaises(ValueError): Results(['a', 'b'], [42]) def test_pickle(self): r = Results(['a', 'b'], [42, 'abc']) pickled = pickle.dumps(r) unpickled = pickle.loads(pickled) self.assertEqual(unpickled, r) def test_field_attributes(self): r = Results(['foo', 'bar'], [42, 'abc']) self.assertEqual(r.foo, 42) self.assertEqual(r.bar, 'abc') with self.assertRaises(AttributeError): r.baz def test_per_instance_field_attributes(self): # Field attributes are added to a `Results` instance, not the type. r1 = Results(['foo', 'bar'], [42, 'abc']) r2 = Results(['x'], [42.0]) for attr in 'foo', 'bar', 'x': self.assertFalse(hasattr(Results, attr)) self.assertTrue(hasattr(r1, 'foo')) self.assertTrue(hasattr(r1, 'bar')) self.assertFalse(hasattr(r1, 'x')) self.assertFalse(hasattr(r2, 'foo')) self.assertFalse(hasattr(r2, 'bar')) self.assertTrue(hasattr(r2, 'x')) def test_index_access(self): r = Results(['foo', 'bar'], [42, 'abc']) self.assertEqual(r[0], 42) self.assertEqual(r[1], 'abc') with self.assertRaises(IndexError): r[2] def test_immutability(self): r = Results(['foo', 'bar'], [42, 'abc']) # Setter for existing attribute. with self.assertRaises(AttributeError): r.bar = 999 # Setter for new attribute. with self.assertRaises(AttributeError): r.baz = 999 # Deleter for existing attribute. with self.assertRaises(AttributeError): del r.bar # Deleter for new attribute. 
with self.assertRaises(AttributeError): del r.baz with self.assertRaises(TypeError): r[0] = 999 def test_eq_same_obj(self): r = Results(['a', 'b'], [1, 2]) self.assertEqual(r, r) def test_eq_subclass(self): class ResultsSubclass(Results): pass r1 = Results(['foo'], ['abc']) r2 = ResultsSubclass(['foo'], ['abc']) self.assertEqual(r1, r2) def test_eq_different_source_types(self): r1 = Results(iter(['a', 'b']), iter([42, 43])) r2 = Results(['a', 'b'], [42, 43]) self.assertEqual(r1, r2) def test_eq_empty(self): r1 = Results([], []) r2 = Results([], []) self.assertEqual(r1, r2) def test_eq_nonempty(self): r1 = Results(['foo', 'bar'], ['abc', 'def']) r2 = Results(['foo', 'bar'], ['abc', 'def']) self.assertEqual(r1, r2) def test_ne_type(self): r1 = Results(['foo', 'bar'], ['abc', 'def']) r2 = ('abc', 'def') self.assertNotEqual(r1, r2) def test_ne_fields(self): r1 = Results(['foo', 'bar'], ['abc', 'def']) r2 = Results(['foo', 'baz'], ['abc', 'def']) self.assertNotEqual(r1, r2) def test_ne_values(self): r1 = Results(['foo', 'bar'], ['abc', 'def']) r2 = Results(['foo', 'bar'], ['abc', 'xyz']) self.assertNotEqual(r1, r2) def test_repr_empty(self): r = Results([], []) self.assertTrue(repr(r).startswith('Results')) self.assertTrue(repr(r).endswith('---')) def test_repr_single(self): r = Results(['a'], [42]) self.assertTrue(repr(r).startswith('Results')) self.assertTrue(repr(r).endswith('a = 42')) def test_repr_multiple(self): r = Results(['a', 'foo'], [42, 'abc']) self.assertTrue(repr(r).startswith('Results')) self.assertTrue('a = 42' in repr(r)) self.assertTrue(repr(r).endswith("foo = 'abc'")) if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/sdk/tests/test_usage.py000066400000000000000000000452731462552636000210040ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import unittest.mock as mock import unittest import tempfile from qiime2.core.testing.util import get_dummy_plugin import qiime2.core.testing.examples as examples from qiime2.sdk import usage, action, UninitializedPluginManagerError from qiime2 import Metadata, Artifact, ResultCollection class TestCaseUsage(unittest.TestCase): def setUp(self): self.test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') self.plugin = get_dummy_plugin() def tearDown(self): self.test_dir.cleanup() class TestAssertUsageVarType(TestCaseUsage): def test_success(self): var = usage.UsageVariable('a', lambda: None, 'artifact', None) usage.assert_usage_var_type(var, 'artifact') self.assertTrue(True) def test_failure(self): var = usage.UsageVariable('a', lambda: None, 'artifact', None) with self.assertRaisesRegex(AssertionError, 'Incorrect.*a,.*visualization.*artifact'): usage.assert_usage_var_type(var, 'visualization') class TestUsageAction(TestCaseUsage): def test_successful_init(self): obs = usage.UsageAction(plugin_id='foo', action_id='bar') self.assertEqual('foo', obs.plugin_id) self.assertEqual('bar', obs.action_id) def test_invalid_plugin_id(self): with self.assertRaisesRegex(ValueError, 'specify a value for plugin_id'): usage.UsageAction(plugin_id='', action_id='bar') def test_invalid_action_id(self): with self.assertRaisesRegex(ValueError, 'specify a value for action_id'): usage.UsageAction(plugin_id='foo', action_id='') def test_successful_get_action(self): ua = usage.UsageAction( plugin_id='dummy_plugin', action_id='concatenate_ints') obs_action_f = ua.get_action() self.assertTrue(isinstance(obs_action_f, action.Method)) def test_unknown_action_get_action(self): ua = usage.UsageAction( plugin_id='dummy_plugin', action_id='concatenate_spleens') with self.assertRaisesRegex(KeyError, 'No action.*concatenate_spleens'): ua.get_action() @mock.patch('qiime2.sdk.PluginManager.reuse_existing', side_effect=UninitializedPluginManagerError) def test_uninitialized_plugin_manager(self, _): with self.assertRaisesRegex(UninitializedPluginManagerError, 'create an instance of sdk.PluginManager'): usage.UsageAction( plugin_id='dummy_plugin', action_id='concatenate_ints') class TestUsageInputs(TestCaseUsage): def test_successful_init(self): obs = usage.UsageInputs(foo='bar') self.assertEqual(['foo'], list(obs.values.keys())) self.assertEqual(['bar'], list(obs.values.values())) class TestUsageOutputNames(TestCaseUsage): def test_successful_init(self): obs = usage.UsageOutputNames(foo='bar') self.assertEqual(['foo'], list(obs.values.keys())) self.assertEqual(['bar'], list(obs.values.values())) def test_invalid_init(self): with self.assertRaisesRegex(TypeError, 'key.*foo.*string, not.*bool'): usage.UsageOutputNames(foo=True) class TestUsageBaseClass(TestCaseUsage): def setUp(self): super().setUp() def _reset_usage_variables(self, variables): for variable in variables: variable.value = usage.UsageVariable.DEFERRED def test_action_invalid_action_provided(self): use = usage.Usage() with self.assertRaisesRegex(ValueError, 'expected.*UsageAction'): use.action({}, {}, {}) def test_merge_metadata_one_input(self): use = usage.Usage() with self.assertRaisesRegex(ValueError, 'two or more'): use.merge_metadata('foo') def test_action_cache_is_working(self): use = usage.Usage() ints = use.init_artifact('ints', examples.ints1_factory) mapper = use.init_artifact('mapper', examples.mapping1_factory) obs = use.action( 
use.UsageAction(plugin_id='dummy_plugin', action_id='typical_pipeline'), use.UsageInputs(int_sequence=ints, mapping=mapper, do_extra_thing=True), use.UsageOutputNames(out_map='out_map', left='left', right='right', left_viz='left_viz', right_viz='right_viz') ) # nothing has been executed yet... self.assertEqual(obs._cache_info().misses, 0) self.assertEqual(obs._cache_info().hits, 0) obs_uuids = set() for result in obs: obs_result = result.execute() obs_uuids.add(obs_result.uuid) self.assertEqual(len(obs_uuids), 5) self.assertEqual(obs._cache_info().misses, 1) # 5 results, executed once, minus 1 miss self.assertEqual(obs._cache_info().hits, 5 - 1) # keep the lru cache intact, but reset the usage variables self._reset_usage_variables(obs) for result in obs: obs_result = result.execute() obs_uuids.add(obs_result.uuid) # the theory here is that if the memoized action execution wasn't # working, we would wind up with twice as many uuids self.assertEqual(len(obs_uuids), 5) self.assertEqual(obs._cache_info().misses, 1) # 5 results, executed twice, minus 1 miss self.assertEqual(obs._cache_info().hits, 5 * 2 - 1) # this time, reset the lru cache and watch as the results are # recompputed obs._cache_reset() self._reset_usage_variables(obs) for result in obs: obs_result = result.execute() obs_uuids.add(obs_result.uuid) # okay, now we should have duplicates of our 5 results self.assertEqual(len(obs_uuids), 5 * 2) self.assertEqual(obs._cache_info().misses, 1) # 5 results, executed once, minus 1 miss self.assertEqual(obs._cache_info().hits, 5 - 1) class TestUsageVariable(TestCaseUsage): def test_basic(self): # TODO ... class TestDiagnosticUsage(TestCaseUsage): def test_basic(self): action = self.plugin.actions['concatenate_ints'] use = usage.DiagnosticUsage() action.examples['concatenate_ints_simple'](use) self.assertEqual(5, len(use.render())) obs1, obs2, obs3, obs4, obs5 = use.render() self.assertEqual('init_artifact', obs1.source) self.assertEqual('init_artifact', obs2.source) self.assertEqual('init_artifact', obs3.source) self.assertEqual('comment', obs4.source) self.assertEqual('action', obs5.source) self.assertEqual('ints_a', obs1.variable.name) self.assertEqual('ints_b', obs2.variable.name) self.assertEqual('ints_c', obs3.variable.name) self.assertEqual('This example demonstrates basic usage.', obs4.variable) self.assertEqual('ints_d', obs5.variable[0].name) self.assertEqual('artifact', obs1.variable.var_type) self.assertEqual('artifact', obs2.variable.var_type) self.assertEqual('artifact', obs3.variable.var_type) self.assertEqual('artifact', obs5.variable[0].var_type) self.assertTrue(obs1.variable.is_deferred) self.assertTrue(obs2.variable.is_deferred) self.assertTrue(obs3.variable.is_deferred) self.assertTrue(obs5.variable[0].is_deferred) def test_chained(self): action = self.plugin.actions['concatenate_ints'] use = usage.DiagnosticUsage() action.examples['concatenate_ints_complex'](use) self.assertEqual(7, len(use.render())) obs1, obs2, obs3, obs4, obs5, obs6, obs7 = use.render() self.assertEqual('init_artifact', obs1.source) self.assertEqual('init_artifact', obs2.source) self.assertEqual('init_artifact', obs3.source) self.assertEqual('comment', obs4.source) self.assertEqual('action', obs5.source) self.assertEqual('comment', obs6.source) self.assertEqual('action', obs7.source) self.assertEqual('ints_a', obs1.variable.name) self.assertEqual('ints_b', obs2.variable.name) self.assertEqual('ints_c', obs3.variable.name) self.assertEqual('This example demonstrates chained usage (pt 1).', 
obs4.variable) self.assertEqual('ints_d', obs5.variable[0].name) self.assertEqual('This example demonstrates chained usage (pt 2).', obs6.variable) self.assertEqual('concatenated_ints', obs7.variable[0].name) self.assertEqual('artifact', obs1.variable.var_type) self.assertEqual('artifact', obs2.variable.var_type) self.assertEqual('artifact', obs3.variable.var_type) self.assertEqual('artifact', obs5.variable[0].var_type) self.assertEqual('artifact', obs7.variable[0].var_type) self.assertTrue(obs1.variable.is_deferred) self.assertTrue(obs2.variable.is_deferred) self.assertTrue(obs3.variable.is_deferred) self.assertTrue(obs5.variable[0].is_deferred) self.assertTrue(obs7.variable[0].is_deferred) def test_comments_only(self): action = self.plugin.actions['concatenate_ints'] use = usage.DiagnosticUsage() action.examples['comments_only'](use) self.assertEqual(2, len(use.render())) obs1, obs2 = use.render() self.assertEqual('comment', obs1.source) self.assertEqual('comment', obs2.source) self.assertEqual('comment 1', obs1.variable) self.assertEqual('comment 2', obs2.variable) def test_metadata_merging(self): action = self.plugin.actions['identity_with_metadata'] use = usage.DiagnosticUsage() action.examples['identity_with_metadata_merging'](use) self.assertEqual(5, len(use.render())) obs1, obs2, obs3, obs4, obs5 = use.render() self.assertEqual('init_artifact', obs1.source) self.assertEqual('init_metadata', obs2.source) self.assertEqual('init_metadata', obs3.source) self.assertEqual('merge_metadata', obs4.source) self.assertEqual('action', obs5.source) self.assertEqual('ints', obs1.variable.name) self.assertEqual('md1', obs2.variable.name) self.assertEqual('md2', obs3.variable.name) self.assertEqual('md3', obs4.variable.name) self.assertEqual('out', obs5.variable[0].name) self.assertEqual('artifact', obs1.variable.var_type) self.assertEqual('metadata', obs2.variable.var_type) self.assertEqual('metadata', obs3.variable.var_type) self.assertEqual('metadata', obs4.variable.var_type) self.assertEqual('artifact', obs5.variable[0].var_type) self.assertTrue(obs1.variable.is_deferred) self.assertTrue(obs2.variable.is_deferred) self.assertTrue(obs3.variable.is_deferred) self.assertTrue(obs4.variable.is_deferred) self.assertTrue(obs5.variable[0].is_deferred) def test_get_metadata_column(self): action = self.plugin.actions['identity_with_metadata_column'] use = usage.DiagnosticUsage() action.examples['identity_with_metadata_column_get_mdc'](use) self.assertEqual(4, len(use.render())) obs1, obs2, obs3, obs4 = use.render() self.assertEqual('init_artifact', obs1.source) self.assertEqual('init_metadata', obs2.source) self.assertEqual('get_metadata_column', obs3.source) self.assertEqual('action', obs4.source) self.assertEqual('ints', obs1.variable.name) self.assertEqual('md', obs2.variable.name) self.assertEqual('mdc', obs3.variable.name) self.assertEqual('out', obs4.variable[0].name) self.assertEqual('artifact', obs1.variable.var_type) self.assertEqual('metadata', obs2.variable.var_type) self.assertEqual('column', obs3.variable.var_type) self.assertEqual('artifact', obs4.variable[0].var_type) self.assertTrue(obs1.variable.is_deferred) self.assertTrue(obs2.variable.is_deferred) self.assertTrue(obs3.variable.is_deferred) self.assertTrue(obs4.variable[0].is_deferred) def test_optional_inputs(self): action = self.plugin.actions['optional_artifacts_method'] use = usage.DiagnosticUsage() action.examples['optional_inputs'](use) self.assertEqual(5, len(use.render())) obs1, obs2, obs3, obs4, obs5 = use.render() 
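        # obs1 is the init_artifact record for 'ints'; obs2-obs5 are the
        # action records for outputs output1-output4 of the optional_inputs
        # example, as the assertions below verify.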
        self.assertEqual('init_artifact', obs1.source)
        self.assertEqual('action', obs2.source)
        self.assertEqual('action', obs3.source)
        self.assertEqual('action', obs4.source)
        self.assertEqual('action', obs5.source)

        self.assertEqual('ints', obs1.variable.name)
        self.assertEqual('output1', obs2.variable[0].name)
        self.assertEqual('output2', obs3.variable[0].name)
        self.assertEqual('output3', obs4.variable[0].name)
        self.assertEqual('output4', obs5.variable[0].name)

        self.assertEqual('artifact', obs1.variable.var_type)
        self.assertEqual('artifact', obs2.variable[0].var_type)
        self.assertEqual('artifact', obs3.variable[0].var_type)
        self.assertEqual('artifact', obs4.variable[0].var_type)
        self.assertEqual('artifact', obs5.variable[0].var_type)

        self.assertTrue(obs1.variable.is_deferred)
        self.assertTrue(obs2.variable[0].is_deferred)
        self.assertTrue(obs3.variable[0].is_deferred)
        self.assertTrue(obs4.variable[0].is_deferred)
        self.assertTrue(obs5.variable[0].is_deferred)

    def test_artifact_collection_list_of_ints(self):
        action = self.plugin.actions['list_of_ints']
        use = usage.DiagnosticUsage()

        action.examples['collection_list_of_ints'](use)

        self.assertEqual(2, len(use.render()))

        obs1, obs2 = use.render()

        self.assertEqual('init_artifact_collection', obs1.source)
        self.assertEqual('action', obs2.source)

        self.assertEqual('ints', obs1.variable.name)
        self.assertEqual('out', obs2.variable[0].name)

        self.assertEqual('artifact_collection', obs1.variable.var_type)
        self.assertEqual('artifact_collection', obs2.variable[0].var_type)

        self.assertTrue(obs1.variable.is_deferred)
        self.assertTrue(obs2.variable[0].is_deferred)


class TestExecutionUsage(TestCaseUsage):
    def test_basic(self):
        action = self.plugin.actions['concatenate_ints']

        use = usage.ExecutionUsage()
        action.examples['concatenate_ints_simple'](use)

        # TODO ...

    def test_pipeline(self):
        action = self.plugin.actions['typical_pipeline']

        use = usage.ExecutionUsage()
        action.examples['typical_pipeline_simple'](use)

        # TODO ...
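    # Unlike the DiagnosticUsage driver exercised above, which only records
    # deferred usage variables, ExecutionUsage runs each example, so the
    # tests below check that rendered variables hold concrete Artifact,
    # Metadata, and ResultCollection values rather than deferred placeholders.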
def test_merge_metadata(self): use = usage.ExecutionUsage() md1 = use.init_metadata('md1', examples.md1_factory) md2 = use.init_metadata('md2', examples.md2_factory) merged = use.merge_metadata('md3', md1, md2) self.assertIsInstance(merged.execute(), Metadata) def test_variadic_input_simple(self): use = usage.ExecutionUsage() action = self.plugin.actions['variadic_input_method'] action.examples['variadic_input_simple'](use) ints_a, ints_b, single_int1, single_int2, out = use.render().values() self.assertIsInstance(ints_a.value, Artifact) self.assertIsInstance(ints_b.value, Artifact) self.assertIsInstance(single_int1.value, Artifact) self.assertIsInstance(single_int2.value, Artifact) self.assertIsInstance(out.value, Artifact) def test_variadic_input_simple_async(self): use = usage.ExecutionUsage(asynchronous=True) action = self.plugin.actions['variadic_input_method'] action.examples['variadic_input_simple'](use) ints_a, ints_b, single_int1, single_int2, out = use.render().values() self.assertIsInstance(ints_a.value, Artifact) self.assertIsInstance(ints_b.value, Artifact) self.assertIsInstance(single_int1.value, Artifact) self.assertIsInstance(single_int2.value, Artifact) self.assertIsInstance(out.value, Artifact) def test_artifact_collection_list_of_ints(self): use = usage.ExecutionUsage() action = self.plugin.actions['list_of_ints'] action.examples['collection_list_of_ints'](use) ints, out = use.render().values() self.assertIsInstance(ints.value, ResultCollection) self.assertIsInstance(out.value, ResultCollection) def test_init_artifact_from_url_error(self): use = usage.ExecutionUsage() with self.assertRaisesRegex(ValueError, 'Could no.*not-a-url'): use.init_artifact_from_url( 'bad_url_artifact', 'https://not-a-url.qiime2.org/junk.qza',) def test_init_metadata_from_url_error(self): use = usage.ExecutionUsage() with self.assertRaisesRegex(ValueError, 'Could no.*https://not-a-url'): use.init_metadata_from_url( 'bad_url_metadata', 'https://not-a-url.qiime2.org/junk.tsv',) # def _test_init_artifact_from_url(self): # TODO: need a url to an artifact that the test suite plugin manager # knows about. # artifact_url = '' # use = usage.ExecutionUsage() # a = use.init_artifact_from_url('a', artifact_url) # self.assertIsInstance(a, Artifact) def test_init_artifact_from_url_error_on_non_artifact(self): metadata_url = \ 'https://data.qiime2.org/2022.11/tutorials/' \ 'moving-pictures/sample_metadata.tsv' use = usage.ExecutionUsage() with self.assertRaisesRegex(ValueError, "Could not.*\n.*a QIIME arc"): use.init_artifact_from_url('a', metadata_url) def test_init_metadata_from_url_error_on_non_metadata(self): url = 'https://www.qiime2.org/' use = usage.ExecutionUsage() with self.assertRaisesRegex(ValueError, "Could not.*\n.*nized ID"): use.init_metadata_from_url('a', url) def test_init_metadata_from_url(self): metadata_url = \ 'https://data.qiime2.org/2022.11/tutorials/' \ 'moving-pictures/sample_metadata.tsv' use = usage.ExecutionUsage() md = use.init_metadata_from_url('md', metadata_url) self.assertIsInstance(md.value, Metadata) qiime2-2024.5.0/qiime2/sdk/tests/test_util.py000066400000000000000000000070431462552636000206460ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import unittest import qiime2 import qiime2.sdk from qiime2.sdk.util import validate_result_collection_keys class TestUtil(unittest.TestCase): def test_artifact_actions(self): obs = qiime2.sdk.util.actions_by_input_type(None) self.assertEqual(obs, []) # For simplicity, we are gonna test the names of the plugin and # the actions # raise ValueError(qiime2.sdk.util.actions_by_input_type('SingleInt')) obs = [(x.name, set([yy.name for yy in y])) for x, y in qiime2.sdk.util.actions_by_input_type('SingleInt')] exp = [('dummy-plugin', set([ 'To be resumed', 'Do stuff normally, but override this one step sometimes', 'Internal fail pipeline', 'Takes and returns a combination of colletions and non collections' ]))] self.assertEqual(obs, exp) obs = [(x.name, [yy.name for yy in y]) for x, y in qiime2.sdk.util.actions_by_input_type( 'Kennel[Cat]')] self.assertEqual(obs, []) obs = [(x.name, [yy.name for yy in y]) for x, y in qiime2.sdk.util.actions_by_input_type( 'IntSequence1')] exp = [('dummy-plugin', [ 'A typical pipeline with the potential to raise an error', 'Concatenate integers', 'Identity', 'Identity', 'Identity', 'Do a great many things', 'Identity', 'Identity', 'Identity', 'Visualize most common integers', 'Inputs with typing.Union', 'Split sequence of integers in half', 'Test different ways of failing', 'Optional artifacts method', 'Do stuff normally, but override this one step sometimes', 'TypeMatch with list and set params'])] self.assertEqual(len(obs), 2) self.assertEqual(obs[0][0], exp[0][0]) self.assertCountEqual(obs[0][1], exp[0][1]) def test_validate_result_collection_keys_valid(self): self.assertEqual(validate_result_collection_keys('a'), None) good_keys = ['-', '+', '.', '_', 'a', 'x', 'A', 'X', '0', '9', '90XAxa_.+-'] self.assertEqual(validate_result_collection_keys(*good_keys), None) def test_validate_result_collection_keys_invalid(self): with self.assertRaisesRegex(KeyError, "Invalid.*: @"): validate_result_collection_keys('@') with self.assertRaisesRegex(KeyError, "Invalid.*: @, a1@"): validate_result_collection_keys('@', 'a1@') with self.assertRaisesRegex(KeyError, "Invalid.*: @, a1@"): keys = ['@', 'a1@'] validate_result_collection_keys(*keys) with self.assertRaisesRegex(KeyError, "Invalid.*: @, a1@"): validate_result_collection_keys( 'good-key', '@', 'a1@') bad_keys = ['he llo', ' ', '!', '?'] for key in bad_keys: with self.assertRaisesRegex(KeyError, f"Invalid.*: {key}"): validate_result_collection_keys(key) if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/sdk/tests/test_visualization.py000066400000000000000000000356211462552636000225750ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import os import tempfile import unittest import uuid import collections import pathlib import qiime2.core.type from qiime2.sdk import Visualization from qiime2.sdk.result import ResultMetadata import qiime2.core.archive as archive from qiime2.core.testing.visualizer import ( mapping_viz, most_common_viz, multi_html_viz) from qiime2.core.testing.util import ArchiveTestingMixin class TestVisualization(unittest.TestCase, ArchiveTestingMixin): def make_provenance_capture(self): # You can't actually import a visualization, but I won't tell # visualization if you don't... return archive.ImportProvenanceCapture() def setUp(self): # TODO standardize temporary directories created by QIIME 2 self.test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') # Using `mapping_viz` because it produces multiple files, including a # nested directory. self.data_dir = os.path.join(self.test_dir.name, 'viz-output') os.mkdir(self.data_dir) mapping_viz(self.data_dir, {'abc': 'foo', 'def': 'bar'}, {'ghi': 'baz', 'jkl': 'bazz'}, key_label='Key', value_label='Value') def tearDown(self): self.test_dir.cleanup() def test_private_constructor(self): with self.assertRaisesRegex( NotImplementedError, 'Visualization constructor.*private.*Visualization.load'): Visualization() # Note on testing strategy below: many of the tests for `_from_data_dir` # and `load` are similar, with the exception that when `load`ing, the # visualization's UUID is known so more specific assertions can be # performed. While these tests appear somewhat redundant, they are # important because they exercise the same operations on Visualization # objects constructed from different sources, whose codepaths have very # different internal behavior. This internal behavior could be tested # explicitly but it is safer to test the public API behavior (e.g. as a # user would interact with the object) in case the internals change. def test_from_data_dir(self): visualization = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) self.assertEqual(visualization.type, qiime2.core.type.Visualization) self.assertIsInstance(visualization.uuid, uuid.UUID) def test_from_data_dir_and_save(self): fp = os.path.join(self.test_dir.name, 'visualization.qzv') visualization = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) visualization.save(fp) root_dir = str(visualization.uuid) expected = { 'VERSION', 'checksums.md5', 'metadata.yaml', 'data/index.html', 'data/css/style.css', 'provenance/metadata.yaml', 'provenance/VERSION', 'provenance/citations.bib', 'provenance/action/action.yaml' } self.assertArchiveMembers(fp, root_dir, expected) def test_load(self): saved_visualization = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) fp = os.path.join(self.test_dir.name, 'visualization.qzv') saved_visualization.save(fp) visualization = Visualization.load(fp) self.assertEqual(visualization.type, qiime2.core.type.Visualization) self.assertEqual(visualization.uuid, saved_visualization.uuid) def test_load_and_save(self): fp1 = os.path.join(self.test_dir.name, 'visualization1.qzv') fp2 = os.path.join(self.test_dir.name, 'visualization2.qzv') visualization = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) visualization.save(fp1) visualization = Visualization.load(fp1) # Overwriting its source file works. visualization.save(fp1) # Saving to a new file works. 
visualization.save(fp2) root_dir = str(visualization.uuid) expected = { 'VERSION', 'checksums.md5', 'metadata.yaml', 'data/index.html', 'data/css/style.css', 'provenance/metadata.yaml', 'provenance/VERSION', 'provenance/citations.bib', 'provenance/action/action.yaml' } self.assertArchiveMembers(fp1, root_dir, expected) root_dir = str(visualization.uuid) expected = { 'VERSION', 'checksums.md5', 'metadata.yaml', 'data/index.html', 'data/css/style.css', 'provenance/metadata.yaml', 'provenance/VERSION', 'provenance/citations.bib', 'provenance/action/action.yaml' } self.assertArchiveMembers(fp2, root_dir, expected) def test_roundtrip(self): fp1 = os.path.join(self.test_dir.name, 'visualization1.qzv') fp2 = os.path.join(self.test_dir.name, 'visualization2.qzv') visualization = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) visualization.save(fp1) visualization1 = Visualization.load(fp1) visualization1.save(fp2) visualization2 = Visualization.load(fp2) self.assertEqual(visualization1.type, visualization2.type) self.assertEqual(visualization1.uuid, visualization2.uuid) def test_load_with_archive_filepath_modified(self): # Save a visualization for use in the following test case. fp = os.path.join(self.test_dir.name, 'visualization.qzv') Visualization._from_data_dir(self.data_dir, self.make_provenance_capture()).save(fp) # Load the visualization from a filepath then save a different # visualization to the same filepath. Assert that both visualizations # access the correct data. # # `load` used to be lazy, only extracting data when it needed to (e.g. # when `save` or `get_index_paths` was called). This was buggy as the # filepath could have been deleted, or worse, modified to contain a # different .qzv file. Thus, the wrong archive could be extracted on # demand, or the archive could be missing altogether. There isn't an # easy cross-platform compatible way to solve this problem, so # Visualization.load is no longer lazy and always extracts its data # immediately. The real motivation for lazy loading was for quick # inspection of archives without extracting/copying data, so that API # is now provided through Visualization.peek. visualization1 = Visualization.load(fp) new_data_dir = os.path.join(self.test_dir.name, 'viz-output2') os.mkdir(new_data_dir) most_common_viz(new_data_dir, collections.Counter(range(42))) Visualization._from_data_dir(new_data_dir, self.make_provenance_capture()).save(fp) visualization2 = Visualization.load(fp) self.assertEqual(visualization1.get_index_paths(), {'html': 'data/index.html'}) self.assertEqual(visualization2.get_index_paths(), {'html': 'data/index.html', 'tsv': 'data/index.tsv'}) def test_extract(self): fp = os.path.join(self.test_dir.name, 'visualization.qzv') visualization = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) visualization.save(fp) root_dir = str(visualization.uuid) # pathlib normalizes away the `.`, it doesn't matter, but this is the # implementation we're using, so let's test against that assumption. 
output_dir = pathlib.Path(self.test_dir.name) / 'viz-extract-test' result_dir = Visualization.extract(fp, output_dir=output_dir) self.assertEqual(result_dir, str(output_dir / root_dir)) expected = { 'VERSION', 'checksums.md5', 'metadata.yaml', 'data/index.html', 'data/css/style.css', 'provenance/metadata.yaml', 'provenance/VERSION', 'provenance/citations.bib', 'provenance/action/action.yaml' } self.assertExtractedArchiveMembers(output_dir, root_dir, expected) def test_get_index_paths_single_load(self): fp = os.path.join(self.test_dir.name, 'visualization.qzv') visualization = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) visualization.save(fp) visualization = Visualization.load(fp) actual = visualization.get_index_paths() expected = {'html': 'data/index.html'} self.assertEqual(actual, expected) def test_get_index_paths_single_from_data_dir(self): visualization = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) actual = visualization.get_index_paths() expected = {'html': 'data/index.html'} self.assertEqual(actual, expected) def test_get_index_paths_multiple_load(self): data_dir = os.path.join(self.test_dir.name, 'mc-viz-output1') os.mkdir(data_dir) most_common_viz(data_dir, collections.Counter(range(42))) fp = os.path.join(self.test_dir.name, 'visualization.qzv') visualization = Visualization._from_data_dir( data_dir, self.make_provenance_capture()) visualization.save(fp) visualization = Visualization.load(fp) actual = visualization.get_index_paths() expected = {'html': 'data/index.html', 'tsv': 'data/index.tsv'} self.assertEqual(actual, expected) def test_get_index_paths_multiple_from_data_dir(self): data_dir = os.path.join(self.test_dir.name, 'mc-viz-output2') os.mkdir(data_dir) most_common_viz(data_dir, collections.Counter(range(42))) visualization = Visualization._from_data_dir( data_dir, self.make_provenance_capture()) actual = visualization.get_index_paths() expected = {'html': 'data/index.html', 'tsv': 'data/index.tsv'} self.assertEqual(actual, expected) def test_get_index_paths_multiple_html_load(self): data_dir = os.path.join(self.test_dir.name, 'multi-html-viz1') os.mkdir(data_dir) multi_html_viz(data_dir, [1, 42]) fp = os.path.join(self.test_dir.name, 'visualization.qzv') visualization = Visualization._from_data_dir( data_dir, self.make_provenance_capture()) visualization.save(fp) visualization = Visualization.load(fp) with self.assertRaises(ValueError): visualization.get_index_paths() def test_get_index_paths_multiple_html_from_data_dir(self): data_dir = os.path.join(self.test_dir.name, 'multi-html-viz2') os.mkdir(data_dir) multi_html_viz(data_dir, [1, 42]) visualization = Visualization._from_data_dir( data_dir, self.make_provenance_capture()) with self.assertRaises(ValueError): visualization.get_index_paths() def test_get_index_paths_relative_false(self): data_dir = os.path.join(self.test_dir.name, 'mc-viz-output2') os.mkdir(data_dir) most_common_viz(data_dir, collections.Counter(range(42))) visualization = Visualization._from_data_dir( data_dir, self.make_provenance_capture()) def get_abs_path(rel): return str(visualization._archiver.root_dir / rel) actual = visualization.get_index_paths(relative=False) expected = {'html': get_abs_path('data/index.html'), 'tsv': get_abs_path('data/index.tsv')} self.assertEqual(actual, expected) def test_peek(self): visualization = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) fp = os.path.join(self.test_dir.name, 'visualization.qzv') 
visualization.save(fp) metadata = Visualization.peek(fp) self.assertIsInstance(metadata, ResultMetadata) self.assertEqual(metadata.type, 'Visualization') self.assertEqual(metadata.uuid, str(visualization.uuid)) self.assertIsNone(metadata.format) def test_eq_identity(self): visualization = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) self.assertEqual(visualization, visualization) def test_eq_same_uuid(self): fp = os.path.join(self.test_dir.name, 'visualization.qzv') visualization1 = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) visualization1.save(fp) visualization2 = Visualization.load(fp) self.assertEqual(visualization1, visualization2) def test_ne_same_data_different_uuid(self): visualization1 = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) visualization2 = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) self.assertNotEqual(visualization1, visualization2) def test_ne_different_data_different_uuid(self): visualization1 = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) data_dir = os.path.join(self.test_dir.name, 'mc-viz-output1') os.mkdir(data_dir) most_common_viz(data_dir, collections.Counter(range(42))) visualization2 = Visualization._from_data_dir( data_dir, self.make_provenance_capture()) self.assertNotEqual(visualization1, visualization2) def test_ne_subclass_same_uuid(self): class VisualizationSubclass(Visualization): pass fp = os.path.join(self.test_dir.name, 'visualization.qzv') visualization1 = VisualizationSubclass._from_data_dir( self.data_dir, self.make_provenance_capture()) visualization1.save(fp) visualization2 = Visualization.load(fp) self.assertNotEqual(visualization1, visualization2) self.assertNotEqual(visualization2, visualization1) def test_ne_different_type_same_uuid(self): visualization = Visualization._from_data_dir( self.data_dir, self.make_provenance_capture()) class Faker: @property def uuid(self): return visualization.uuid faker = Faker() self.assertNotEqual(visualization, faker) if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/sdk/tests/test_visualizer.py000066400000000000000000000444021462552636000220660ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. 
# ---------------------------------------------------------------------------- import concurrent.futures import inspect import os.path import tempfile import unittest import uuid import qiime2.plugin import qiime2.core.type from qiime2.core.type import VisualizerSignature, Str, Range from qiime2.core.type.visualization import Visualization as VisualizationType from qiime2.sdk import Artifact, Visualization, Visualizer, Results from qiime2.core.testing.visualizer import (most_common_viz, mapping_viz, params_only_viz, no_input_viz) from qiime2.core.testing.type import IntSequence1, IntSequence2, Mapping from qiime2.core.testing.util import get_dummy_plugin, ArchiveTestingMixin class TestVisualizer(unittest.TestCase, ArchiveTestingMixin): def setUp(self): # TODO standardize temporary directories created by QIIME 2 self.test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-') self.plugin = get_dummy_plugin() def tearDown(self): self.test_dir.cleanup() def test_private_constructor(self): with self.assertRaisesRegex(NotImplementedError, 'Visualizer constructor.*private'): Visualizer() def test_from_function_with_artifacts_and_parameters(self): visualizer = self.plugin.visualizers['mapping_viz'] self.assertEqual(visualizer.id, 'mapping_viz') exp_sig = VisualizerSignature( mapping_viz, inputs={ 'mapping1': Mapping, 'mapping2': Mapping }, parameters={ 'key_label': qiime2.plugin.Str, 'value_label': qiime2.plugin.Str }, ) self.assertEqual(visualizer.signature, exp_sig) self.assertEqual(visualizer.name, 'Visualize two mappings') self.assertTrue( visualizer.description.startswith('This visualizer produces an ' 'HTML visualization')) self.assertTrue( visualizer.source.startswith('\n```python\ndef mapping_viz(')) def test_from_function_without_parameters(self): visualizer = self.plugin.visualizers['most_common_viz'] self.assertEqual(visualizer.id, 'most_common_viz') exp_sig = VisualizerSignature( most_common_viz, inputs={ 'ints': IntSequence1 | IntSequence2 }, parameters={} ) self.assertEqual(visualizer.signature, exp_sig) self.assertEqual(visualizer.name, 'Visualize most common integers') self.assertTrue( visualizer.description.startswith('This visualizer produces HTML ' 'and TSV')) self.assertTrue( visualizer.source.startswith('\n```python\ndef most_common_viz(')) def test_from_function_with_parameters_only(self): visualizer = self.plugin.visualizers['params_only_viz'] self.assertEqual(visualizer.id, 'params_only_viz') exp_sig = VisualizerSignature( params_only_viz, inputs={}, parameters={ 'name': qiime2.plugin.Str, 'age': qiime2.plugin.Int % Range(0, None) } ) self.assertEqual(visualizer.signature, exp_sig) self.assertEqual(visualizer.name, 'Parameters only viz') self.assertTrue( visualizer.description.startswith('This visualizer only accepts ' 'parameters.')) self.assertTrue( visualizer.source.startswith('\n```python\ndef params_only_viz(')) def test_from_function_without_inputs_or_parameters(self): visualizer = self.plugin.visualizers['no_input_viz'] self.assertEqual(visualizer.id, 'no_input_viz') exp_sig = VisualizerSignature( no_input_viz, inputs={}, parameters={} ) self.assertEqual(visualizer.signature, exp_sig) self.assertEqual(visualizer.name, 'No input viz') self.assertTrue( visualizer.description.startswith('This visualizer does not ' 'accept any')) self.assertTrue( visualizer.source.startswith('\n```python\ndef no_input_viz(')) def test_is_callable(self): self.assertTrue(callable(self.plugin.visualizers['mapping_viz'])) 
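        # Illustrative extension (not part of the original test): the same
        # property should hold for the other visualizers registered by the
        # dummy plugin, e.g. the parameter-only and no-input ones.
        self.assertTrue(
            callable(self.plugin.visualizers['params_only_viz']))
        self.assertTrue(callable(self.plugin.visualizers['no_input_viz']))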
self.assertTrue(callable(self.plugin.visualizers['most_common_viz'])) def test_callable_properties(self): mapping_viz = self.plugin.visualizers['mapping_viz'] most_common_viz = self.plugin.visualizers['most_common_viz'] mapping_exp = { 'mapping1': Mapping, 'return': (VisualizationType,), 'key_label': Str, 'mapping2': Mapping, 'value_label': Str} most_common_exp = { 'ints': IntSequence1 | IntSequence2, 'return': (VisualizationType,)} mapper = { mapping_viz: mapping_exp, most_common_viz: most_common_exp} for visualizer, exp in mapper.items(): self.assertEqual(visualizer.__call__.__name__, '__call__') self.assertEqual(visualizer.__call__.__annotations__, exp) self.assertFalse(hasattr(visualizer.__call__, '__wrapped__')) def test_async_properties(self): mapping_viz = self.plugin.visualizers['mapping_viz'] most_common_viz = self.plugin.visualizers['most_common_viz'] mapping_exp = { 'mapping1': Mapping, 'return': (VisualizationType,), 'key_label': Str, 'mapping2': Mapping, 'value_label': Str} most_common_exp = { 'ints': IntSequence1 | IntSequence2, 'return': (VisualizationType,)} mapper = { mapping_viz: mapping_exp, most_common_viz: most_common_exp} for visualizer, exp in mapper.items(): self.assertEqual(visualizer.asynchronous.__name__, 'asynchronous') self.assertEqual(visualizer.asynchronous.__annotations__, exp) self.assertFalse(hasattr(visualizer.asynchronous, '__wrapped__')) def test_callable_and_async_signature(self): mapping_viz = self.plugin.visualizers['mapping_viz'] for callable_attr in '__call__', 'asynchronous': signature = inspect.Signature.from_callable( getattr(mapping_viz, callable_attr)) parameters = list(signature.parameters.items()) kind = inspect.Parameter.POSITIONAL_OR_KEYWORD exp_parameters = [ ('mapping1', inspect.Parameter( 'mapping1', kind, annotation=Mapping)), ('mapping2', inspect.Parameter( 'mapping2', kind, annotation=Mapping)), ('key_label', inspect.Parameter( 'key_label', kind, annotation=Str)), ('value_label', inspect.Parameter( 'value_label', kind, annotation=Str)) ] self.assertEqual(parameters, exp_parameters) def test_callable_and_async_different_signature(self): # Test that a different Visualizer object has a different dynamic # signature. most_common_viz = self.plugin.visualizers['most_common_viz'] for callable_attr in '__call__', 'asynchronous': signature = inspect.Signature.from_callable( getattr(most_common_viz, callable_attr)) parameters = list(signature.parameters.items()) kind = inspect.Parameter.POSITIONAL_OR_KEYWORD exp_parameters = [ ('ints', inspect.Parameter( 'ints', kind, annotation=IntSequence1 | IntSequence2)) ] self.assertEqual(parameters, exp_parameters) def test_call_with_artifacts_and_parameters(self): mapping_viz = self.plugin.visualizers['mapping_viz'] artifact1 = Artifact.import_data(Mapping, {'foo': 'abc', 'bar': 'def'}) artifact2 = Artifact.import_data( Mapping, {'baz': 'abc', 'bazz': 'ghi'}) result = mapping_viz(artifact1, artifact2, 'Key', 'Value') # Test properties of the `Results` object. self.assertIsInstance(result, tuple) self.assertIsInstance(result, Results) self.assertEqual(len(result), 1) self.assertEqual(result.visualization, result[0]) result = result[0] self.assertIsInstance(result, Visualization) self.assertEqual(result.type, qiime2.core.type.Visualization) self.assertIsInstance(result.uuid, uuid.UUID) # TODO qiime2.sdk.Visualization doesn't have an API to access its # contents yet. For now, save and assert the correct files are present. 
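        # Descriptive note (added for clarity): the expected member set below
        # includes one provenance/artifacts/<uuid>/ subtree per input
        # artifact, keyed by each input artifact's UUID.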
filepath = os.path.join(self.test_dir.name, 'visualization.qzv') result.save(filepath) root_dir = str(result.uuid) expected = { 'VERSION', 'checksums.md5', 'metadata.yaml', 'data/index.html', 'data/css/style.css', 'provenance/metadata.yaml', 'provenance/VERSION', 'provenance/citations.bib', 'provenance/action/action.yaml', 'provenance/artifacts/%s/metadata.yaml' % artifact1.uuid, 'provenance/artifacts/%s/VERSION' % artifact1.uuid, 'provenance/artifacts/%s/citations.bib' % artifact1.uuid, 'provenance/artifacts/%s/action/action.yaml' % artifact1.uuid, 'provenance/artifacts/%s/metadata.yaml' % artifact2.uuid, 'provenance/artifacts/%s/VERSION' % artifact2.uuid, 'provenance/artifacts/%s/citations.bib' % artifact2.uuid, 'provenance/artifacts/%s/action/action.yaml' % artifact2.uuid } self.assertArchiveMembers(filepath, root_dir, expected) def test_call_with_no_parameters(self): most_common_viz = self.plugin.visualizers['most_common_viz'] artifact = Artifact.import_data( IntSequence1, [42, 42, 10, 0, 42, 5, 0]) result = most_common_viz(artifact) # Test properties of the `Results` object. self.assertIsInstance(result, tuple) self.assertIsInstance(result, Results) self.assertEqual(len(result), 1) self.assertEqual(result.visualization, result[0]) result = result[0] self.assertIsInstance(result, Visualization) self.assertEqual(result.type, qiime2.core.type.Visualization) self.assertIsInstance(result.uuid, uuid.UUID) # TODO qiime2.sdk.Visualization doesn't have an API to access its # contents yet. For now, save and assert the correct files are present. filepath = os.path.join(self.test_dir.name, 'visualization.qzv') result.save(filepath) root_dir = str(result.uuid) expected = { 'VERSION', 'checksums.md5', 'metadata.yaml', 'data/index.html', 'data/index.tsv', 'provenance/metadata.yaml', 'provenance/VERSION', 'provenance/citations.bib', 'provenance/action/action.yaml', 'provenance/artifacts/%s/metadata.yaml' % artifact.uuid, 'provenance/artifacts/%s/VERSION' % artifact.uuid, 'provenance/artifacts/%s/citations.bib' % artifact.uuid, 'provenance/artifacts/%s/action/action.yaml' % artifact.uuid } self.assertArchiveMembers(filepath, root_dir, expected) def test_call_with_parameters_only(self): params_only_viz = self.plugin.visualizers['params_only_viz'] # Parameters all have default values. 
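        # Illustrative aside (not part of the original test): explicit values
        # could be passed instead of relying on the defaults, e.g.
        #     result, = params_only_viz(name='foo', age=42)
        # since the signature is name: Str, age: Int % Range(0, None).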
result, = params_only_viz() self.assertIsInstance(result, Visualization) self.assertEqual(result.type, qiime2.core.type.Visualization) self.assertIsInstance(result.uuid, uuid.UUID) filepath = os.path.join(self.test_dir.name, 'visualization.qzv') result.save(filepath) root_dir = str(result.uuid) expected = { 'VERSION', 'checksums.md5', 'metadata.yaml', 'data/index.html', 'provenance/metadata.yaml', 'provenance/VERSION', 'provenance/citations.bib', 'provenance/action/action.yaml' } self.assertArchiveMembers(filepath, root_dir, expected) def test_call_without_inputs_or_parameters(self): no_input_viz = self.plugin.visualizers['no_input_viz'] result, = no_input_viz() self.assertIsInstance(result, Visualization) self.assertEqual(result.type, qiime2.core.type.Visualization) self.assertIsInstance(result.uuid, uuid.UUID) filepath = os.path.join(self.test_dir.name, 'visualization.qzv') result.save(filepath) root_dir = str(result.uuid) expected = { 'VERSION', 'checksums.md5', 'metadata.yaml', 'data/index.html', 'provenance/metadata.yaml', 'provenance/VERSION', 'provenance/citations.bib', 'provenance/action/action.yaml' } self.assertArchiveMembers(filepath, root_dir, expected) def test_asynchronous(self): mapping_viz = self.plugin.visualizers['mapping_viz'] artifact1 = Artifact.import_data(Mapping, {'foo': 'abc', 'bar': 'def'}) artifact2 = Artifact.import_data( Mapping, {'baz': 'abc', 'bazz': 'ghi'}) future = mapping_viz.asynchronous(artifact1, artifact2, 'Key', 'Value') self.assertIsInstance(future, concurrent.futures.Future) result = future.result() # Test properties of the `Results` object. self.assertIsInstance(result, tuple) self.assertIsInstance(result, Results) self.assertEqual(len(result), 1) self.assertEqual(result.visualization, result[0]) result = result[0] self.assertIsInstance(result, Visualization) self.assertEqual(result.type, qiime2.core.type.Visualization) self.assertIsInstance(result.uuid, uuid.UUID) # TODO qiime2.sdk.Visualization doesn't have an API to access its # contents yet. For now, save and assert the correct files are present. 
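        # Descriptive note (added for clarity): the expected member set below
        # is identical to the synchronous call's expectations above;
        # asynchronous execution should produce the same archive layout.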
filepath = os.path.join(self.test_dir.name, 'visualization.qzv') result.save(filepath) root_dir = str(result.uuid) expected = { 'VERSION', 'checksums.md5', 'metadata.yaml', 'data/index.html', 'data/css/style.css', 'provenance/metadata.yaml', 'provenance/VERSION', 'provenance/citations.bib', 'provenance/action/action.yaml', 'provenance/artifacts/%s/metadata.yaml' % artifact1.uuid, 'provenance/artifacts/%s/VERSION' % artifact1.uuid, 'provenance/artifacts/%s/citations.bib' % artifact1.uuid, 'provenance/artifacts/%s/action/action.yaml' % artifact1.uuid, 'provenance/artifacts/%s/metadata.yaml' % artifact2.uuid, 'provenance/artifacts/%s/VERSION' % artifact2.uuid, 'provenance/artifacts/%s/citations.bib' % artifact2.uuid, 'provenance/artifacts/%s/action/action.yaml' % artifact2.uuid } self.assertArchiveMembers(filepath, root_dir, expected) def test_visualizer_callable_output(self): artifact = Artifact.import_data(Mapping, {'foo': 'abc', 'bar': 'def'}) # Callable returns a value from `return_vals` return_vals = (True, False, [], {}, '', 0, 0.0) for return_val in return_vals: def func(output_dir: str, foo: dict) -> None: return return_val self.plugin.visualizers.register_function( func, {'foo': Mapping}, {}, '', '' ) visualizer = self.plugin.visualizers['func'] with self.assertRaisesRegex(TypeError, "should not return"): visualizer(foo=artifact) # Callable returns None (default function return) def func(output_dir: str, foo: dict) -> None: return None self.plugin.visualizers.register_function( func, {'foo': Mapping}, {}, '', '' ) visualizer = self.plugin.visualizers['func'] # Should not raise an exception output = visualizer(foo=artifact) self.assertIsInstance(output, Results) self.assertIsInstance(output.visualization, Visualization) def test_docstring(self): mapping_viz = self.plugin.visualizers['mapping_viz'] common_viz = self.plugin.visualizers['most_common_viz'] params_only_viz = self.plugin.visualizers['params_only_viz'] no_input_viz = self.plugin.visualizers['no_input_viz'] obs = mapping_viz.__call__.__doc__ self.assertEqual(obs, exp_mapping_viz) obs = common_viz.__call__.__doc__ self.assertEqual(obs, exp_common_viz) obs = params_only_viz.__call__.__doc__ self.assertEqual(obs, exp_params_only_viz) obs = no_input_viz.__call__.__doc__ self.assertEqual(obs, exp_no_input_viz) exp_mapping_viz = """\ Visualize two mappings This visualizer produces an HTML visualization of two key-value mappings, each sorted in alphabetical order by key. Parameters ---------- mapping1 : Mapping mapping2 : Mapping key_label : Str value_label : Str Returns ------- visualization : Visualization """ exp_common_viz = """\ Visualize most common integers This visualizer produces HTML and TSV outputs containing the input sequence of integers ordered from most- to least-frequently occurring, along with their respective frequencies. Parameters ---------- ints : IntSequence1 | IntSequence2 Returns ------- visualization : Visualization """ exp_params_only_viz = """\ Parameters only viz This visualizer only accepts parameters. Parameters ---------- name : Str, optional age : Int % Range(0, None), optional Returns ------- visualization : Visualization """ exp_no_input_viz = """\ No input viz This visualizer does not accept any type of input. 
Returns ------- visualization : Visualization """ if __name__ == '__main__': unittest.main() qiime2-2024.5.0/qiime2/sdk/usage.py000066400000000000000000001771421462552636000166040ustar00rootroot00000000000000# ---------------------------------------------------------------------------- # Copyright (c) 2016-2023, QIIME 2 development team. # # Distributed under the terms of the Modified BSD License. # # The full license is in the file LICENSE, distributed with this software. # ---------------------------------------------------------------------------- """ The Usage API provides an interface-agnostic way for QIIME 2 plugin developers to define examples of how to use their plugin’s actions. This enables the programmatic generation of examples for all QIIME 2 interfaces, eliminating the need to maintain specific examples for multiple interfaces. **Importantly there are two sides to the API**, the usage example side, and the interface driver side. A usage example must never call a method which is intended for a usage driver. These methods will be denoted with the following admonition: Warning ------- For use by interface drivers only. Do not use in a written usage example. Interface developers may want to pay special attention to these methods, as they will likely simplify their code. If the above warning is not present, then the method is likely intended to be used to describe some example and may be used by an example writer, or overriden by a usage driver. For the docs below, we set the following artificial global, as if we were always in a usage example with a ``use`` variable defined. This is only to fool the doctest module. This should never be done in the real world. >>> import builtins >>> builtins.use = ExecutionUsage() """ from typing import Set, List, Literal, Any, Callable, Type, Union import dataclasses import functools import re import qiime2 from qiime2 import sdk from qiime2.core.type import ( is_semantic_type, is_visualization_type, is_collection_type ) def assert_usage_var_type(usage_variable, *valid_types): """Testing utility to assert a usage variable has the right type. Parameters ---------- usage_variable : `qiime2.sdk.usage.UsageVariable` The usage variable to test. *valid_types : 'artifact', 'artifact_collection', 'visualization', 'visualization_collection', 'metadata', 'column', 'format' The valid variable types to expect. Raises ------ AssertionError If the variable is not the correct type. """ if usage_variable.var_type not in valid_types: tmpl = ( usage_variable.name, valid_types, usage_variable.var_type, ) raise AssertionError('Incorrect var_type for %s, need %s got %s' % tmpl) class UsageAction: """An object which represents a deferred lookup for a QIIME 2 action. One of three "argument objects" used by :meth:`Usage.action`. The other two are :class:`UsageInputs` and :class:`UsageOutputNames`. """ def __init__(self, plugin_id: str, action_id: str): """Constructor for UsageAction. The parameters should identify an existing plugin and action of that plugin. Important --------- There should be an existing plugin manager by the time this object is created, or an error will be raised. Typically instantiation happens by executing an example, so this will generally be true. Parameters ---------- plugin_id : str The (typically under-scored) name of a plugin, e.g. "my_plugin". action_id : str The (typically under-scored) name of an action, e.g. "my_action". 
Raises ------ qiime2.sdk.UninitializedPluginManagerError If there is not an existing plugin manager to define the available plugins. Examples -------- >>> results = use.action( ... use.UsageAction(plugin_id='dummy_plugin', ... action_id='params_only_method'), ... use.UsageInputs(name='foo', age=42), ... use.UsageOutputNames(out='bar1') ... ) >>> results.out See Also -------- UsageInputs UsageOutputNames Usage.action qiime2.sdk.PluginManager """ if plugin_id == '': raise ValueError('Must specify a value for plugin_id.') if action_id == '': raise ValueError('Must specify a value for action_id.') self.plugin_id: str = plugin_id """The (typically under-scored) name of a plugin, e.g. "my_plugin". Warning ------- For use by interface drivers only. Do not use in a written usage example. """ self.action_id: str = action_id """The (typically under-scored) name of an action, e.g. "my_action". Warning ------- For use by interface drivers only. Do not use in a written usage example. """ try: self._plugin_manager = sdk.PluginManager.reuse_existing() except sdk.UninitializedPluginManagerError: raise sdk.UninitializedPluginManagerError( 'Please create an instance of sdk.PluginManager' ) def __repr__(self): return 'UsageAction(plugin_id=%r, action_id=%r)' %\ (self.plugin_id, self.action_id) def get_action(self) -> sdk.Action: """Retrieve the actual SDK object (qiime2.sdk.Action) Warning ------- For use by interface drivers only. Do not use in a written usage example. Returns ------- action : instance of qiime2.sdk.Action subclass Raises ------ KeyError If the action parameterized by this object does not exist in the pre-initialized plugin manager. """ plugin = self._plugin_manager.get_plugin(id=self.plugin_id) try: action_f = plugin.actions[self.action_id] except KeyError: raise KeyError('No action currently registered with ' 'id: "%s".' % (self.action_id,)) return action_f class UsageInputs: """A dict-like mapping of parameters to arguments for invoking an action. One of three "argument objects" used by :meth:`Usage.action`. The other two are :class:`UsageAction` and :class:`UsageOutputNames`. Parameters should match the signature of the associated action, and arguments may be `UsageVariable` s or primitive values. """ def __init__(self, **kwargs): """Constructor for UsageInputs. Parameters ---------- **kwargs : primitive or UsageVariable The keys used should match the signature of the action. The values should be valid arguments of the action or variables of such arguments. Examples -------- >>> results = use.action( ... use.UsageAction(plugin_id='dummy_plugin', ... action_id='params_only_method'), ... use.UsageInputs(name='foo', age=42), ... use.UsageOutputNames(out='bar2') ... ) >>> results.out See Also -------- UsageAction UsageOutputNames Usage.action """ self.values = kwargs def __repr__(self): return 'UsageInputs(**%r)' % (self.values,) def __getitem__(self, key): """Same as a dictionary. Warning ------- For use by interface drivers only. Do not use in a written usage example. """ return self.values[key] def __contains__(self, key): """Same as a dictionary. Warning ------- For use by interface drivers only. Do not use in a written usage example. """ return key in self.values def items(self): """Same as a dictionary. Warning ------- For use by interface drivers only. Do not use in a written usage example. """ return self.values.items() def keys(self): """Same as a dictionary. Warning ------- For use by interface drivers only. Do not use in a written usage example. 
""" return self.values.keys() def values(self): """Same as a dictionary. Warning ------- For use by interface drivers only. Do not use in a written usage example. """ return self.values.values() def map_variables(self, function): """Convert variables into something else, leaving primitives alone. Warning ------- For use by interface drivers only. Do not use in a written usage example. Parameters ---------- function : Callable[[UsageVariable], Any] The function to map over all variables. This function will not be called on any primitive values. Returns ------- dict A new dictionary of key-value pairs where all variables have been converted by `function`. Examples -------- >>> # Example situation >>> var = use.usage_variable('foo', lambda: ..., 'artifact') >>> inputs = UsageInputs(foo=var, bar='bar') >>> inputs.map_variables(lambda v: v.to_interface_name()) {'foo': 'foo', 'bar': 'bar'} >>> inputs.map_variables(lambda v: v.execute()) {'foo': ..., 'bar': 'bar'} See Also -------- UsageVariable.to_interface_name UsageVariable.execute """ result = {} def mapped(v): if isinstance(v, UsageVariable): assert_usage_var_type(v, 'artifact', 'artifact_collection', 'visualization_collection', 'metadata', 'column') v = function(v) return v for name, value in self.items(): if isinstance(value, (list, set)): collection_type = type(value) value = [mapped(v) for v in value] value = collection_type(value) else: value = mapped(value) result[name] = value return result class UsageOutputNames: """A dict-like mapping of action outputs to desired names. One of three "argument objects" used by :meth:`Usage.action`. The other two are :class:`UsageAction` and :class:`UsageInputs`. All names must be strings. Note ---- The order defined by this object will dictate the order of the variables returned by :meth:`Usage.action`. """ def __init__(self, **kwargs): """Constructor for UsageOutputNames. Parameters ---------- **kwargs : str The name of the resulting variables to be returned by :meth:`Usage.action`. Raises ------ TypeError If the values provided are not strings. Examples -------- >>> results = use.action( ... use.UsageAction(plugin_id='dummy_plugin', ... action_id='params_only_method'), ... use.UsageInputs(name='foo', age=42), ... use.UsageOutputNames(out='bar3') ... ) >>> results.out See Also -------- UsageAction UsageInputs Usage.action """ for key, val in kwargs.items(): if not isinstance(val, str): raise TypeError( 'Name provided for key %r must be a string, not a %r.' % (key, type(val))) self.values = kwargs def __repr__(self): return 'UsageOutputNames(**%r)' % (self.values, ) def __getitem__(self, key): """Same as a dictionary. Warning ------- For use by interface drivers only. Do not use in a written usage example. """ return self.values[key] def __contains__(self, key): """Same as a dictionary. Warning ------- For use by interface drivers only. Do not use in a written usage example. """ return key in self.values def items(self): """Same as a dictionary. Warning ------- For use by interface drivers only. Do not use in a written usage example. """ return self.values.items() def keys(self): """Same as a dictionary. Warning ------- For use by interface drivers only. Do not use in a written usage example. """ return self.values.keys() def values(self): """Same as a dictionary. Warning ------- For use by interface drivers only. Do not use in a written usage example. """ return self.values.values() class UsageOutputs(sdk.Results): """A vanity class over :class:`qiime2.sdk.Results`. 
Returned by :meth:`Usage.action` with order defined by :class:`UsageOutputNames`. """ pass VAR_TYPES = ('artifact', 'artifact_collection', 'visualization', 'visualization_collection', 'metadata', 'column', 'format') T_VAR_TYPES = Literal['artifact', 'artifact_collection', 'visualization', 'visualization_collection', 'metadata', 'column', 'format'] COLLECTION_VAR_TYPES = ('artifact_collection', 'visualization_collection') class UsageVariable: """A variable which represents some QIIME 2 generate-able value. These should not be used to represent primitive values such as strings, numbers, booleans, or lists/sets thereof. """ DEFERRED = object() VAR_TYPES = VAR_TYPES COLLECTION_VAR_TYPES = COLLECTION_VAR_TYPES def __init__(self, name: str, factory: Callable[[], Any], var_type: T_VAR_TYPES, usage: 'Usage'): """Constructor for UsageVariable. Generally initialized for you. Warning ------- For use by interface drivers only (and rarely at that). Do not use in a written usage example. Parameters ---------- name : str The name of this variable (interfaces will use this as a starting point). factory : Callable[[], Any] A function which will return a realized value of `var_type`. var_type : 'artifact', 'artifact_collection', 'visualization', 'visualization_collection', 'metadata', 'column', 'format' The type of value which will be returned by the factory. Most are self-explanatory, but "format" indicates that the factory produces a QIIME 2 file format or directory format, which is used for importing data. use : Usage The currently executing usage driver. Provided for convenience. """ if not callable(factory): raise TypeError('value for `factory` should be a `callable`, ' 'recieved %s' % (type(factory),)) if var_type not in self.VAR_TYPES: raise ValueError('value for `var_type` should be one of %r, ' 'received %s' % (self.VAR_TYPES, var_type)) self.name: str = name """The name of the variable, may differ from :meth:`to_interface_name`. Warning ------- For use by interface drivers only. Do not use in a written usage example. """ self.factory: Callable[[], Any] = factory """The factory which produces the value. Generally :meth:`execute` should be used as it will calculate the results once, instead of generating a new object each time. Warning ------- For use by interface drivers only (and rarely at that). Do not use in a written usage example. """ self.var_type: Literal['artifact', 'artifact_collection', 'visualization', 'visualization_collection', 'metadata', 'column', 'format'] = var_type """The general type of this variable. Warning ------- For use by interface drivers only. Do not use in a written usage example. """ self.value: Any = self.DEFERRED """The value of this variable, or DEFERRED. See :attr:`is_deferred`. Warning ------- For use by interface drivers only. Do not use in a written usage example. """ self.use: Usage = usage """The current :class:`Usage` instance being used. Typically this is an instance of a subclass. Warning ------- For use by interface drivers only. It won't break anything, but it would be super-super-super weird to use in a written usage example. """ def __repr__(self): return '<%s name=%r, var_type=%r>' % (self.__class__.__name__, self.name, self.var_type) @property def is_deferred(self) -> bool: """Check if the value of this variable is available. Warning ------- For use by interface drivers only. Do not use in a written usage example. """ return self.value is self.DEFERRED def execute(self) -> Any: """Execute the factory to produce a value, this is stored and returned. 
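        The result is cached on the variable, so repeated calls return the
        previously stored value rather than invoking the factory again.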
Warning ------- For use by interface drivers only. Do not use in a written usage example. Examples -------- >>> var = UsageVariable('foo', lambda: '', ... 'artifact', use) >>> var.value >>> var.execute() '' >>> var.value '' See Also -------- factory value """ if self.is_deferred: self.value = self.factory() return self.value def save(self, filepath: str, ext: str = None) -> str: """Save the value of this variable to a filepath. The final path is returned. Warning ------- For use by interface drivers only. Do not use in a written usage example. Parameters ---------- filepath : path The filepath to save to. ext : str The extension to append. May be 'ext' or '.ext'. If the extension is already present on filepath, it is not added. Returns ------- path Path saved to, including the extension if added. """ value = self.execute() return value.save(filepath, ext=ext) def to_interface_name(self) -> str: """Convert this variable to an interface-specific name. Warning ------- For use by interface drivers only. Do not use in a written usage example. This method should generally be overriden by a driver to be interface-specific. Examples -------- >>> class MyUsageVariable(UsageVariable): ... def to_interface_name(self): ... return '