RNeXML/0000755000176200001440000000000012734566742011337 5ustar liggesusersRNeXML/inst/0000755000176200001440000000000012734262162012301 5ustar liggesusersRNeXML/inst/examples/0000755000176200001440000000000012641021656014115 5ustar liggesusersRNeXML/inst/examples/ontotrace-result.xml0000644000176200001440000001636712641021656020166 0ustar liggesusers Generated from the Phenoscape Knowledgebase on 2015-10-21 by Ontotrace query: * taxa: <http://purl.obolibrary.org/obo/VTO_0036217> * entities: <http://purl.obolibrary.org/obo/BFO_0000050> some <http://purl.obolibrary.org/obo/UBERON_0008897> RNeXML/inst/examples/meta_taxa.xml0000644000176200001440000000406712641021656016611 0ustar liggesusers RNeXML/inst/examples/mbank_X962_11-22-2013_1534.nex0000644000176200001440000005134512641021656020242 0ustar liggesusers#NEXUS BEGIN TAXA; DIMENSIONS NTAX=27; TAXLABELS 'Sargocentron vexillarium' 'Beryx decadactylus' 'Polymixia berndti' 'Lateolabrax japonicus' 'Mioplosus labricoides' 'Priscacara serrata' 'Priscacara liops' 'Roccus saxatilis' 'Roccus chrysops' 'Morone americana' 'Morone mississippiensis' 'Dicentrarchus labrax' 'Lates mariae' 'Lates calcarifer' 'Siniperica' 'Centropomus parallelus' 'Centropomus undecimalis' 'Mycteroperca tigris' 'Serranus subligarius' 'Macquaria australasica' 'Percichthys trucha' 'Lepomis macrochirus' 'Micropterus dolomieu' 'Perca flavescens' 'Stizostedion vitreum' 'Pseudaphritis urvillii' 'Niphon' ; ENDBLOCK; BEGIN CHARACTERS; DIMENSIONS NCHAR=77; FORMAT DATATYPE=STANDARD MISSING=? 
GAP=- SYMBOLS="0123"; CHARLABELS [1] 'Vomer, shape of tooth patch' [2] 'Orbitosphenoid' [3] 'Pterotic, enclosure of lateral line canal' [4] 'Frontals, midline suture' [5] 'Frontoparietal crests' [6] 'Frontoparietal crests, sensory pore on dorsal margin' [7] 'Supraoccipital crest, shape' [8] 'Supraoccipital crest, horizontal shelf projecting laterally at mid-height' [9] 'Supraoccipital crest, shape of dorsal margin' [10] 'Sphenotic, horizontal shelf' [11] 'Mesethmoid, anterolaterally facing projection' [12] 'Lateral ethmoid-lacrimal articulation, orientation' [13] 'Lacrimal, shape' [14] 'Lacrimal, serration on ventral margin' [15] 'Non-lacrimal suborbital bones, ventral margins' [16] 'Subocular shelf present on third infraorbital' [17] 'Subocular shelf, shape' [18] 'Subocular shelf, anterior extension' [19] 'Post-temporal, ornamentation of posterior margin' [20] 'Post-temporal, depression on medial face' [21] 'Epiotic, strong posterior projection' [22] 'Symplectic, shape' [23] 'Metapterygoid lamina' [24] 'Exoccipital, articular facet shape' [25] 'Ethmovomerine region, number of facets for articulation with palatine' [26] 'Premaxilla, articular and ascending processes' [27] 'Premaxilla, lateral ridge running dorsoventrally on articular process' [28] 'Premaxilla, shape of dorsal crest' [29] 'Premaxilla, maxillary shelf or groove on external surface' [30] 'Supramaxillary bones' [31] 'Dentary teeth morphology' [32] 'Basihyal dentition' [33] 'Ceratohyal foramen' [34] 'Ceratohyal foramen development' [35] 'Opercular spine count' [36] 'Urohyal, posterior margin' [37] 'Shape of articular process of urohyal' [38] 'Morphology of ventral surface of urohyal' [39] 'Spine of urohyal shape' [40] 'Preoperculum, noteworthy spine on posterior margin of corner' [41] 'Preopercule, ornamentation of the ventral border of the horizontal limb' [42] 'Interopercle, ornamentation of margin' [43] 'Subopercle, ornamentation of margin' [44] 'Preopercle, enclosure of sensory canal on ascending 
limb' [45] 'Preopercle, enclosure of sensory canal on horizontal limb' [46] 'Precaudal vertebrae, number' [47] 'Second neural spine, expansion' [48] 'Fourth neural spine, articulation with dorsal pterygiophores' [49] 'Posterior abdominal haemal arches' [50] 'First haemal spine, configuration' [51] 'First haemal spine, transverse expansion' [52] 'First haemal spine, anterior face' [53] 'Uroneural pair, number' [54] 'Epural count' [55] 'Hypurals, fusion' [56] 'Hypurapophysis' [57] 'Length of spinous dorsal fin' [58] 'Supraneural bones, number' [59] 'Supernumerary dorsal fin spines' [60] 'Caudal fin ray count' [61] 'Caudal fin shape' [62] 'Caudal fin, spur on posteriormost procurrent ray' [63] 'Ray preceding spur shortened' [64] 'Anal pterygiophores, number anterior to first haemal spine' [65] 'Anal fin spine count' [66] 'First anal pterygiophore, associated anal spines' [67] 'Proximal radial of the first anal fin pterygiophore, anteromedial ridge' [68] 'Radial attachment to scapulocoracoid' [69] 'Cleithrum, ornamentation on posterior margin' [70] 'Cleithrum, ventral expansion on posterior plate' [71] 'Pelvic bones, post-pelvic process' [72] 'Pelvic bones, shape of post-pelvic process' [73] 'Pelvic bones, accessory sub-pelvic keel' [74] 'Pattern of posterior scalelet distribution' [75] 'Scales, resorption of old ctenii' [76] 'Lateral line, auxiliary row of lateral line scales above/below the main row' [77] 'Lateral line, expansion onto posterior margin of caudal fin' ; STATELABELS 1 'Trapezoidal to ovate' 'Narrow, v-shaped tooth patch' 'Vomerine teeth reduced to a few large teeth' , 2 'Present' 'Absent' , 3 'absent or incomplete' 'complete' , 4 'joined along entire midline' 'separated by supraoccipital crest' , 5 'absent' 'present' , 6 'absent' 'present' , 7 'long and low' 'height and length roughly equal' , 8 'present' 'absent' , 9 'blade-like' 'significantly expanded laterally' , 10 'absent' 'present' , 11 'absent' 'present' , 12 'entirely or primarily in the 
horizontal plane' 'primarily in the vertical plane' , 13 'rectangular' 'square' , 14 'present' 'absent' , 15 'serrate' 'smooth' , 16 'present' 'absent' , 17 'quadrangular' 'ovate' , 18 'present' 'absent' , 19 'Denticulate' 'Smooth' , 20 'Absent' 'Present, connection site of post-temporal with tunica externa' , 21 'Absent' 'Present' , 22 'straight or slightly tapered' 'with sharp anterior bend' , 23 'Broad and well-developed' 'reduced to a notch and a ridge' , 24 'Ovate' 'Bean-shaped' , 25 'one' 'two' , 26 'indistinct' 'separate along entire margin' , 27 'absent' 'present' , 28 'flat to convex' 'triangular' , 29 'present' 'absent' , 30 'present' 'absent' , 31 'villiform' 'caniniform' , 32 'absent' 'present' , 33 'present' 'absent' , 34 'complete' 'incomplete, partial loss of upper strut' , 35 '0/1' '2' '3' , 36 'concave or flat' 'convex' , 37 'strut-like, cylindrical' 'flattened laterally into a blade-like structure' , 38 'v-shaped channel or incomplete tube' 'flat and perpendicular to the dorsoventral axis, or convex' , 39 'fused to form single spine' 'spine bifurcate' , 40 'absent' 'present' , 41 'serrate' 'large, triangular spines' 'smooth' , 42 'serrate' 'smooth' , 43 'serrate' 'smooth' , 44 'contained in open "gutter"' 'partially enclosed' , 45 'contained in open channel' 'partially enclosed' , 46 '10-12' '14 or more' , 47 'does not contact first neural spine' 'partial articulation between first and second neural spines' 'second spine contacts first along entire border' , 48 'without specialized groove' 'with groove for insertion of third dorsal pterygiophore' , 49 'bridged ventrally' 'open' , 50 'first haemal spine fused to parapophyses' 'partial fusion only' , 51 'absent, transverse processes directed ventrally or ventromedially' 'present, transverse processes directed ventrolaterally, forming a wing-shaped projection' , 52 'flat or slightly folded anteriorly' 'forming a sharp, anteromedially-directed groove' , 53 'two' 'one' , 54 'three' 'two' , 55 
'absent' 'present between hypurals three and four' , 56 'absent' 'present' , 57 'shorter than soft-rayed dorsal fin' 'longer than or subequal in length to soft-rayed dorsal fin' , 58 'three' 'two' 'one' , 59 'present, two or more spines attach to first pterygiophore' 'absent, one spine in serial correspondence only' , 60 '17 or more' '15' 'State 2' , 61 'deeply forked' 'shallowly forked' 'straight to slightly convex' , 62 'absent' 'present' , 63 'ray not shortened' 'ray shortened' , 64 'one' 'two' 'three or more' , 65 'four' 'three' 'two' , 66 'two' 'one' , 67 'wide' 'narrow' , 68 'radials i-iii insert on scapula, iv in the interspace between scapula and coracoid' 'i-ii insert on scapula, iii inserts in interspace, iv inserts on coracoid' , 69 'denticulate' 'smooth' , 70 'absent' 'present' , 71 'absent' 'present' , 72 'spearhead or knob-shaped' 'short, massive expansion' , 73 'present' 'absent' , 74 'imbricate, cycloid, or pseudoctenoid' 'ordered, columnar distribution' , 75 'absent to slight' 'strong' 'complete' , 76 'absent' 'present' , 77 'absent or nearly so' 'significant (>1/3 of caudal fin depth)' , ; MATRIX 'Sargocentron vexillarium' 00000?00000000000000000000010000000000010000001000000001101000000000010?00000 'Beryx decadactylus' 00000??00000000001??000?00001000000000?000000010?00?000?001000010??0?00??0000 'Polymixia berndti' 00000?00000000000000000?00000001000000?00000000000000001000000000000100?00000 'Lateolabrax japonicus' 01011000111101100100100111100000011110111110011011000001101111101000011001201 'Mioplosus labricoides' 1?111?0011?????0??0?1?0?1110100?0011?0?01??00010?1??00011011111020?0?1???0001 'Priscacara serrata' 1?011?1?100?00?00?0?10??0111010?0010101011100000??0?00011001111010100110112?? 'Priscacara liops' 1?011?1?100????0????10??0111010?001?101001100000??0?00011001111010100110112?? 
'Roccus saxatilis' 11011011100100100100100001110101011110100110000000000001100111101010011011211 'Roccus chrysops' 11011001100100100100100001110101011110100110000000000001100111101010011011211 'Morone americana' 11011011100100100100100001110101011100100000000000000001100111101010011011211 'Morone mississippiensis' 11011011100100100100100001110101011100100000000000000001100111101010011011211 'Dicentrarchus labrax' 01011001100100100100100001100101011000101110000000000001100101101010010?11211 'Lates mariae' 01111100111100101101111111100000000110111111101101000101101111101000011100011 'Lates calcarifer' 011111001111001011??111111100000001?1111111110110100010110112110100001??000?1 'Siniperica' 01100?00110001101101110111100000000????11111101100010011100011101010111010000 'Centropomus parallelus' 11001000100000000100000000000000000001100110002000000001100101101000101100001 'Centropomus undecimalis' 11001000100000000100000000000000000001100110002000000001100101101000101100001 'Mycteroperca tigris' 110010000000001010100000111110101?2001002111002000001011100210001001100?00100 'Serranus subligarius' 110110000000011010000000011111101?2011000110001000001011100110001000101000100 'Macquaria australasica' 11001010000000101000000010011000010010000001001011110001100111111001010?00000 'Percichthys trucha' 01000?10000000101000001?10001000010010000000001001110001101111111001100?00000 'Lepomis macrochirus' 0101101100001111??10000000011100000010002111100011110001101110011010100?00000 'Micropterus dolomieu' 0100100000000111??10000001011000000010002111100011110001100110011010100?00000 'Perca flavescens' 11000?1010000111??00000011011100010010000000111011001001111210002111011001200 'Stizostedion vitreum' 21000?0000000111??00100111011110010011001000111011001001111210012111111001201 'Pseudaphritis urvillii' 11010?000???0111??10000??10111001?0000000000?10110001?010212?001211111???1200 'Niphon' ??1?0?0?010?0011???00?00??0?????00?01001?????1???00?1001110?100?10??1010?1200 ; ENDBLOCK; BEGIN NOTES; 
[Character comments] TEXT CHARACTER=12 TEXT='Waldman, 1986'; TEXT CHARACTER=17 TEXT='Carpenter and Johnson, 2002'; TEXT CHARACTER=18 TEXT='Carpenter and Johnson, 2002'; TEXT CHARACTER=20 TEXT='Otero, 2004'; TEXT CHARACTER=21 TEXT='Otero, 2004'; TEXT CHARACTER=23 TEXT='Otero, 2004'; TEXT CHARACTER=24 TEXT='Otero, 2004'; TEXT CHARACTER=26 TEXT='Modified from Day, 2002'; TEXT CHARACTER=27 TEXT='Otero, 2004'; TEXT CHARACTER=28 TEXT='Day, 2002'; TEXT CHARACTER=40 TEXT='Otero, 2004'; TEXT CHARACTER=48 TEXT='Waldman, 1986'; TEXT CHARACTER=49 TEXT='Chang, 1988'; TEXT CHARACTER=50 TEXT='Otero, 2004'; TEXT CHARACTER=51 TEXT='Chang, 1988'; TEXT CHARACTER=52 TEXT='Chang, 1988'; TEXT CHARACTER=53 TEXT='Otero, 2004'; TEXT CHARACTER=54 TEXT='Otero, 2004'; TEXT CHARACTER=55 TEXT='Otero, 2004'; TEXT CHARACTER=62 TEXT='Johnson, 1975'; TEXT CHARACTER=63 TEXT='Johnson, 1975'; TEXT CHARACTER=64 TEXT='Chang, 1988'; TEXT CHARACTER=67 TEXT='Chang, 1988'; TEXT CHARACTER=68 TEXT='Johnson, 1975'; TEXT CHARACTER=69 TEXT='Otero, 2004'; TEXT CHARACTER=70 TEXT='Otero, 2004'; TEXT CHARACTER=72 TEXT='modified from Otero, 2004'; TEXT CHARACTER=74 TEXT='McCully, 1963'; TEXT CHARACTER=75 TEXT='McCully, 1962'; [Attribute comments] ENDBLOCK; BEGIN MACCLADE; Version 4.0 84 ; LastModified -973521234 ; FileSettings editor 0 0 1 1 ; Singles 000 ; Editor 0001100111111110010001001 1 24 Geneva 9 100 1 all ; EditorPosition 46 48 691 963 ; TreeWindowPosition 46 6 699 974 ; ListWindow Characters closed Geneva 9 46 48 689 974 000 ; ListWindow Taxa closed Geneva 9 50 10 145 490 100000 ; ListWindow Trees closed Geneva 9 50 10 276 490 ; ListWindow TypeSets closed Geneva 9 50 10 276 490 ; ListWindow WtSets closed Geneva 9 50 10 276 490 ; ListWindow ExSets closed Geneva 9 50 10 276 490 ; ListWindow CharSets closed Geneva 9 50 10 276 490 ; ListWindow TaxSets closed Geneva 9 50 10 276 490 ; ListWindow CharPartitions closed Geneva 9 50 10 276 490 ; ListWindow CharPartNames closed Geneva 9 50 10 276 490 ; ListWindow 
WtSets closed Geneva 9 50 10 276 490 ; ChartWindowPosition 52 30 686 964 ; StateNamesSymbols closed Geneva 9 10 50 30 148 220 ; WindowOrder Data ; OtherSymbols & / 00 ? - ; Correlation 0 0 1000 0 0 10011010 ; Salmo 00000001 ; EditorFile 2 ; ExportHTML _ MOSS 100 110000 ; PrettyPrint 10 ; EditorToolsPosition 579 88 115 165 ; TreeWindowProgram 10 ; TreeWindow 0000 ; Continuous 0 3 1 ; Calculations 0000001 ; SummaryMode 0 0 0 ; Charts Geneva 9 ( normal ) 0010 ; NexusOptions 0 0 50 001011011 ; TipLabel 1 ; TreeFont Geneva 9 ( normal ) ; TreeShape 1.0 1.0 0100 ; TraceLabels 0101 ; ChartColors 0 0 65535 9 0 1 ; ChartBiggestSpot 1 ; ChartPercent 10 ; ChartBarWidth 10 1 ; ChartVerticalAxis 10101 ; ChartMinMax 0 ; TraceAllChangesDisplay 1 1 ; BarsOnBranchesDisplay 0 0 60000 10000 10000 10000 10000 60000 65000 65000 65000 6 1 0000101 ; ContinuousBranchLabels 0 ; AllStatesBranchLabels 1 ; IndexNotation 2 1 ; PrintTree 10.00 2 2 2 2 2 2 2 2 2 2 2 Geneva 9 ( normal ) Geneva 10 ( normal ) Geneva 9 ( normal ) Geneva 9 ( normal ) Geneva 9 ( bold ) Geneva 9 ( normal ) Geneva 9 ( normal ) 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 256 -39 4 -40 0 1 2 1 8 0 0 0 2 1000111000000000000100000111000 ; MatchChar 00 . 
; EntryInterpretation 01 ; ColorOptions 00 ; TreeTools 0 5 4 0 10 4 0 00100111111101110 ; EditorTools 0 0 0 1000 0 0 6 3 0 100000101110001 ; PairAlign 2 2 3 2 1 1 2 1 3 1010 ; BothTools 1 ; ENDBLOCK; BEGIN MESQUITE; MESQUITESCRIPTVERSION 2 ; TITLE AUTO; tell ProjectCoordinator ; getEmployee #mesquite.minimal.ManageTaxa.ManageTaxa ; tell It ; setID 0 8120504986075687201 ; tell It ; setDefaultOrder 0 1 2 3 6 4 5 7 8 9 10 11 12 27 13 14 15 20 22 16 17 23 24 18 19 25 26 ; attachments ; endTell ; endTell ; getEmployee #mesquite.charMatrices.ManageCharacters.ManageCharacters ; tell It ; setID 0 5510595068374379090 ; tell It ; setDefaultOrder 0 1 2 3 4 5 6 7 9 8 10 11 12 13 14 15 16 17 18 19 20 22 30 47 28 23 24 25 26 29 21 27 31 32 34 35 36 37 38 41 42 43 44 45 46 48 49 50 76 54 77 78 55 56 57 82 52 51 53 58 59 60 61 75 62 81 79 65 66 67 68 69 70 71 72 73 74 ; attachments ; endTell ; checksumv 0 2 1698488223 null ; endTell ; getWindow ; tell It ; suppress ; setResourcesState false false 100 ; setPopoutState 400 ; setExplanationSize 0 ; setAnnotationSize 0 ; setFontIncAnnot 0 ; setFontIncExp 0 ; setSize 1489 858 ; setLocation 96 36 ; setFont SanSerif ; setFontSize 10 ; getToolPalette ; tell It ; endTell ; desuppress ; endTell ; getEmployee #mesquite.minimal.ManageTaxa.ManageTaxa ; tell It ; showTaxa #8120504986075687201 #mesquite.lists.TaxonList.TaxonList ; tell It ; setTaxa #8120504986075687201 ; getWindow ; tell It ; newAssistant #mesquite.lists.DefaultTaxaOrder.DefaultTaxaOrder ; newAssistant #mesquite.lists.TaxonListCurrPartition.TaxonListCurrPartition ; setExplanationSize 30 ; setAnnotationSize 20 ; setFontIncAnnot 0 ; setFontIncExp 0 ; setSize 1389 791 ; setLocation 96 36 ; setFont SanSerif ; setFontSize 10 ; getToolPalette ; tell It ; endTell ; endTell ; showWindow ; getEmployee #mesquite.lists.ColorTaxon.ColorTaxon ; tell It ; setColor Red ; removeColor off ; endTell ; getEmployee #mesquite.lists.TaxonListAnnotPanel.TaxonListAnnotPanel ; tell It ; togglePanel off 
; endTell ; endTell ; endTell ; getEmployee #mesquite.charMatrices.BasicDataWindowCoord.BasicDataWindowCoord ; tell It ; showDataWindow #5510595068374379090 #mesquite.charMatrices.BasicDataWindowMaker.BasicDataWindowMaker ; tell It ; getWindow ; tell It ; setExplanationSize 30 ; setAnnotationSize 20 ; setFontIncAnnot 0 ; setFontIncExp 0 ; setSize 1389 791 ; setLocation 96 36 ; setFont SanSerif ; setFontSize 10 ; getToolPalette ; tell It ; setTool mesquite.charMatrices.BasicDataWindowMaker.BasicDataWindow.ibeam ; endTell ; setActive ; setTool mesquite.charMatrices.BasicDataWindowMaker.BasicDataWindow.ibeam ; colorCells #mesquite.charMatrices.ColorByState.ColorByState ; colorRowNames #mesquite.charMatrices.TaxonGroupColor.TaxonGroupColor ; colorColumnNames #mesquite.charMatrices.CharGroupColor.CharGroupColor ; colorText #mesquite.charMatrices.NoColor.NoColor ; setBackground White ; toggleShowNames on ; toggleShowTaxonNames on ; toggleTight off ; toggleThinRows off ; toggleShowChanges on ; toggleSeparateLines off ; toggleShowStates on ; toggleAutoWCharNames on ; toggleShowDefaultCharNames off ; toggleConstrainCW on ; setColumnWidth 16 ; toggleBirdsEye off ; toggleAllowAutosize on ; toggleColorsPanel off ; toggleDiagonal on ; setDiagonalHeight 80 ; toggleLinkedScrolling on ; toggleScrollLinkedTables off ; endTell ; showWindow ; getEmployee #mesquite.charMatrices.ColorCells.ColorCells ; tell It ; setColor Red ; removeColor off ; endTell ; getEmployee #mesquite.categ.StateNamesStrip.StateNamesStrip ; tell It ; showStrip off ; endTell ; getEmployee #mesquite.charMatrices.AnnotPanel.AnnotPanel ; tell It ; togglePanel off ; endTell ; getEmployee #mesquite.charMatrices.CharReferenceStrip.CharReferenceStrip ; tell It ; showStrip off ; endTell ; getEmployee #mesquite.charMatrices.QuickKeySelector.QuickKeySelector ; tell It ; autotabOff ; endTell ; getEmployee #mesquite.categ.SmallStateNamesEditor.SmallStateNamesEditor ; tell It ; panelOpen true ; endTell ; endTell ; endTell ; 
getEmployee #mesquite.charMatrices.ManageCharacters.ManageCharacters ; tell It ; showCharacters #5510595068374379090 #mesquite.lists.CharacterList.CharacterList ; tell It ; setData 0 ; getWindow ; tell It ; newAssistant #mesquite.lists.DefaultCharOrder.DefaultCharOrder ; newAssistant #mesquite.lists.CharListInclusion.CharListInclusion ; newAssistant #mesquite.lists.CharListPartition.CharListPartition ; newAssistant #mesquite.stochchar.CharListProbModels.CharListProbModels ; setExplanationSize 30 ; setAnnotationSize 20 ; setFontIncAnnot 0 ; setFontIncExp 0 ; setSize 1389 791 ; setLocation 96 36 ; setFont SanSerif ; setFontSize 10 ; getToolPalette ; tell It ; setTool mesquite.lists.CharacterList.CharacterListWindow.ibeam ; endTell ; endTell ; showWindow ; getEmployee #mesquite.lists.CharListAnnotPanel.CharListAnnotPanel ; tell It ; togglePanel off ; endTell ; endTell ; endTell ; endTell ; ENDBLOCK; BEGIN MESQUITECHARMODELS; ProbModelSet * UNTITLED = Mk1 (est.) : 1 - 77 ; ENDBLOCK; BEGIN PAUP; outgroup Sargocentron_vexillarium ; outgroup Polymixia_berndti ; outgroup Beryx_decadactylus ; ENDBLOCK; BEGIN TREES; Title Trees from "WhitlockMatrix" ; LINK Taxa = Taxa ; TRANSLATE 1 Sargocentron_vexillarium , 2 Beryx_decadactylus , 3 Polymixia_berndti , 4 Lateolabrax_japonicus , 5 Mioplosus_labricoides , 6 Priscacara_serrata , 7 Priscacara_liops , 8 Roccus_saxatilis , 9 Roccus_chrysops , 10 Morone_americana , 11 Morone_mississippiensis , 12 Dicentrarchus_labrax , 13 Lates_mariae , 14 Lates_calcarifer , 15 Siniperca , 16 Centropomus_parallelus , 17 Centropomus_undecimalis , 18 Mycteroperca_tigris , 19 Serranus_subligarius , 20 Macquaria_australasica , 21 Percichthys_trucha , 22 Lepomis_macrochirus , 23 Micropterus_dolomieu , 24 Perca_flavescens , 25 Stizostedion_vitreum , 26 Pseudaphritis_urvillii , 27 Niphon ; TREE UNTITLED+ = (1 , (2 , 3) , ((((((6 , ((8 , 9 , (11 , 10)) , 7)) , 12) , ((5 , 4) , (15 , 13))) , (17 , 16)) , (18 , 19) , (22 , 23)) , (21 , ((25 , 24) , 20)))) ; 
ENDBLOCK; RNeXML/inst/examples/taxa.xml0000644000176200001440000000613212641021656015576 0ustar liggesusers RNeXML/inst/examples/trees.xml0000644000176200001440000001234012641021656015761 0ustar liggesusers RNeXML/inst/examples/gardiner_1984.xml0000644000176200001440000000546112641021656017125 0ustar liggesusers PSPUB:0000103 Gardiner, B. G. (1984). The relationships of the palaeoniscid fishes, a review based on new specimens of Mimia and Moythomasia from the Upper Devonian of Western Australia. Bulletin of the British Museum of Natural History, Geology, 37, 173–428. RNeXML/inst/examples/missing_some_branchlengths.xml0000644000176200001440000016563312641021656022253 0ustar liggesusers RNeXML/inst/examples/characters.xml0000644000176200001440000005654112641021656016771 0ustar liggesusers 0101 0101 0101 0101 0101 A C G C T C G C A T C G C A T C A C G C T C G C A T C G C A T C A C G C T C G C A T C G C A T C ACGCUCGCAUCGCAUC ACGCUCGCAUCGCAUC ACGCUCGCAUCGCAUC -1.545414144070023 -2.3905621575431044 -2.9610221833467265 0.7868662069161243 0.22968509237534918 -1.6259836379710066 3.649352410850134 1.778885099660406 -1.2580877968480846 0.22335354995610862 -1.5798979984134964 2.9548251411133157 1.522005675256233 -0.8642016921755289 -0.938129801832388 2.7436692306788086 -0.7151148143399818 4.592207937774776 -0.6898841440534845 0.5769509574453064 3.1060827493657683 -1.0453787389160105 2.67416332763427 -1.4045634106692808 0.019890469925520196 1 2 2 2 3 4 2 3 4 1 RNeXML/inst/examples/treebase-record.xml0000644000176200001440000055017512641021656017722 0ustar liggesusers 
AGTTCTGAAACGGGTTGTAGCTGGCCTTA-----CGAGGCATGTGCACGCCCTGCTCATCCACTCT-ACACCTGTGCACCATCTGTAGGTCGGTTTGGGTTCGGATGCTTCGCGGCGTTCGGGCTCGGGCCTTCCTATGTACT-TCACACACGCTTTAGTAT-CAGAATGTAATTGCGA----TAAAACGCACCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATCATCAACCCATATGTCCTTGTGTCG--GATGGGCTTGGA-TTTGGAGGCTTATGCCGGCCCTC-GTC--GGTCGGCTCCTCTTGAATGCATTAGCTCGATTCCTTGCGGATCGGCTCCCGGTGTGATAATTGTCTACGCCGTGACCGT-GAAGC----GTTTGGCGAGCTTCGAACCGTCCTATGGACAAACTTATATCTTGACATCTGACCTCCGAGGTGCGTGTCAAAATCAAGACGACGTTGCCCTCCATTAACAGCAGTCGTGACTTGTTAGCCCTACAACGCGACACTCTCTGTGCATCAACTCGTCGAGAACTCTGACCAGACGTTCTGCATTGATAACGAGGCATTATATGATATATGCTTCAGAACCCTCAAGCTCACTACACCAACTTATGGTGACCTTAACCACCTTGTATCGATTGTCATGTCCGGTATCACGACTTGCTTGCGTTTCCCTGGTCAGCTGAATTCTGACTTGCGGAAGTTGGCTGTCAACATGGGTAATGCTTTCCTTCAGACTGGCCTAGATGCGTT-TTTCTCATTCGATGTTTTCTTTTGCAGTTCCCTTCCCTCGTCTTCATTTCTTCATGACCGGCTTCGCGCCTTTGACCGCTCGGGGTAGCCAGCAATACCGCGCGGTCACCGTCCCTGAGCTGACGCAGCAAATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCAGGCATGGCCGCTACCTCACTGTAGGTGTTAATGTTTCTTCT----GTGTTCCG-------TCATCTGAAACCTGTTCCATAGGTTGCTGC 
AGTTCAGAAAAGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACATCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTTACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCATCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCACTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTAA-GTTCTCACTT-GATTCCTTGTGATATAACACTTATGATTGACTGTTGAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCCCC?TATTACG-GAGCGTACCAATCTGATCTATTTGTTACATTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACATCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCGCTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTAA-GTTCTCACTT-GATTCCTTGTGATATAACACTTATGATTGACTGTTAAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGATCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCTCCTTATTACG-GAGCGTACCAATCTGATCTATTTGTTACATTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACATCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCATCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCGCTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTTGAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGATCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCTCCTTATTACG-GAGCGTACCAATCTGATCTATTTGTTACATTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACATCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCGCTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTAA-GTTCTCACTT-GATTCCTTGTGATATAACACTTATGATTGACTGTTAAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGATCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCTCCTTATTACG-GAGCGTACCAATCTGATCTATTTGTTACGTTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGATATTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGC?AGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGTG-CATTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGA?AACGAGGCCTTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGACCTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GACGCCTTGTGACATGACACTTATCATTGACTGTTAAAAAAT-TAGTTCCCTTCCCCCGTCTCCACTTCTTCATGACCGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACA-GAGCGTACCAATCTGATGTATCTGTTGCCTATTTTATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACATCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATATGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTTACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCATCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGC?CTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTTGAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCTCCTTATTACG-GAGCGTACCAATCTGATCTATTTGTTACGTTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCA?CAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGC?CTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCAT?TCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTAA-GTTCTCACTT-GATTCCTTGTGATATAACACTTATGATTGACTGTTAAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCCCCTTATTACG-GAGCGTACCAATCTGATCTATTTGTTAC?TTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCATCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCGCTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTTGAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGATCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCTCCTTATTACG-GAGCGTACCAATCTGATCTATTTGTTACATTTTTCATAGGTTGCCGC 
AGTTCAGAAATGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACATCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTTACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCGCTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTCAAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCCCCTTATTACG-GAGTGTACCAATCTGATCTATTTGTTACGTTTTTCATAGGTTGCCGC 
AGTTCAGAAATGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCGCTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATAACACTTATGATTGACTGTTAAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCTCCTTATTACG-GAGCGTACCAATCTGATCTATTTGTTACGTTTTTCATAGGTTGCCGC 
AGTACAGAAAAGGGTTGTCGCTGGCCTCAAAATTCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCGGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACATCCTTGCGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCCACGGGACAATTCAATATGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTTACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCGCTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTCAAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCCCCTTATTACG-GAGCGTACCAATCTGATCTATTTGTTACGTTTTTCATAGGTTGCCGC 
AGTTCAGA?AAGGGTTGTCGCTGGCCTCAAAATTCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACATCCTTGCGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGCGGTCGTTGAAGCCTCAGTTGGGAGAGCT-ATAATCGTCCCCCACGGGACAAT-GAATTTGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTTACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCGCTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTCAAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCCCCTTATTACG-GAGCGTACCAATCTGATCTATTTGTTACGTTTTTCATAGGTTGCCGC 
AGTTCAGAAATGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCGGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACATCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATATGACATCTGACCTCCGAGGTCTGTG-CATTTCACATGTTACGTTGGAGAAAATC--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCTGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCACTGTATGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGTTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTTGAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCCCCTTATTACG-GAGCGTACCAATCTGATCTATTTGTTACATTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACATCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGCTCATGCCGGTCCCC-ATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGTG-CATTTCAAATGTTACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCGCTGTACGATATTTGCTTCCGGACACTGAAGCTGACGACACCCACATATGGCGATCTGAACCACCTCATTTCCATTGTCATGTCCGGCATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGACCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTTTCACTT-GATTCCTTGTGATATGGCACTTATGATTGACTGTTGAAATTC-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTATCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCGAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCCCCCTATCACG-GAGCGTATCAATCTGATCTATTTGTT?C?TTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATA-GACATCTGACCTCCGAGGTCTGTG-CATTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCCTTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGACCTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATGCCTTGTGACATGACACTTATCATTGACTGTTAAAAATT-TAGTTCCCTTCCCCCGTCTCCACTTCTTCATGACCGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAACAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACA-GAGCGTACCAATCTGATGTATCTGTTGCT--TTTTATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGACATTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGTG-CATTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCCTTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGACCTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATGCCTTGTGACATGACACTTATCATTGACTGTTAAAAATT-TAGTTCCCTTCCCCCGTCTCCACTT?TTCATGACCGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAACAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGG?TGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACA-GAGCGTACCAATCTGATGTATCTGTTGCT--TTTTATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGTG-CATTTCAAATGTCACGTTGGAGAAAATT--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCCTTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGACCTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATGCCTTGTGACATGACACTTATCATTGACTGTTAAAAATT-TAGTTCCCTTCCCCCGTCTCCACTTCTTCATGACCGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACA-GAGCGTACCAATCTGATGTATCTGTTGCCTATTATATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGTG-CATTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCACTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGTTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGACATGACACTTATCATTGACTGTTAAAAATT-TAGTTCCCTTCCCCCGTCTCCACTTCTTCATGACCGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACA-GAGCGTACCAATCTGATGTATCTGTTGCCTATTTTATAGGTTGCTGC 
AGTTCAGAAATGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATC?GGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACATCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCA?CAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGC?CTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTTGAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTC?CCTTATTACG-GAGCGTACCAATCTGATCTATTTGTTAC?TTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCACTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTTAAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCTCCTTATTACG-GAGCGTACCAATCTGATCTATTTGTTACGTTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCATCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCACTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTTGAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCTCCTTATTACG-GAGCGTACCAATCTGATCTATTTGTTACGTTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCGCTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATAACACTTATGATTGACTGTTAAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCCCCTTATTACG-GAGCGTACCAATCTGATCTATTTGTTACATTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACATCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGC?CTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATAACACTTATGATTGACTGTTAAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCCCCTTATTACG-GAGCGTACCAATCTGATCTATTTGTTACATTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGATATTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGCG-CATTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCTTTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGACCTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATGCCTTGTGACATGACACTTATCATTGACTGTTAAAAATT-TAGTTCCCTTCCCCCGTCTCCACTTCTTCATGACCGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACA-GAGCGTACCAATCTGATGTATCTGTTGCCTATTTTATAGGTTGCCGC 
AGTTCAGAAATGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCGGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCATCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCGCTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTTGAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGATCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCTCCTTATTACG-GAGCGTACCAATCTGATCTATTTGTTACGTTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACATCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCATCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCGCTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATCTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTAA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTTAAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCCCCTTATTACG-GAGCGTACCAATCTGATCTATTTGTTACGTTTTTCATAGGTTGCCGC 
AGTTCAGAAATGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCGGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCATCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCGCTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATCTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTTGAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCTCCTTATTACG-GAGCGTACCAATCTGATCTATTTGTTACGTTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACATCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCACATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCATCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCACTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTTGAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCCCCCTATTACG-GAGCGTACCAATCTGATCTATTTGTTACATTTTTCATAGGTTGCCGC 
AGTACAGAAATGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCGGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTTACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCATCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCACTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTTGAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCTCCTTATTACG-GAGCGTACCAATCTGATCTATTTGTTACATTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACATCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCATCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCGCTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTTGAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCCCCTTATTACG-GAGCGTACCAATCTGATCTATTTGTTACGTTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTCGCTGGCCTCAAAATTCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACATCCTTGTGACGTGCACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGCGGTCGTTGAAGCCTCAGTCGGGAGAGCTCATAATCGTCCCTTC-GGGACAATCGAATATTACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTTACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCATCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCACTTTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATCTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTTGAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCCCCCTATTACG-GAGCGTACCAATCTGATCTATTTGTTACATTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACATCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATCGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTTACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCATCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCACTTTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATCTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTTGAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCCCCCTATTACG-GAGCGTACCAATCTGATCTATTTGTTACATTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTCGCTGGCCTCAAAATTCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCGGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATCGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGACGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGCGGTCGTTGAAGCCTCGGTCGGGAGAGCTTATAATCGTCCCTTC-GGGACAATCGAATATGACATCTGACCTCCGAGGTCTGTG-CGTTTCAAATGTTACGTTGGAGAAAATC--TGACGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCATCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCACTTTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATCTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTTGAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCCCCCTATTACG-GAGCGTACCAATCTGATCTATTTGTTACATTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATC?GGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACATCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATAGTCCCCCACGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CGTTTCACATGTCACGTTGGAGAAAATC--TGA?GACCGTTGATC?T-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCACTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTTAAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCCCCCTATTACG-GAGCGTACCAATCTGATCTATTTGTTAC?TTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTCGCTGGCCTCAAAATCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACATCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATAGTCCCCTCCGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CATTTCACATGTCACGTTGGATAAAATC--TGATGACCGTTGATCAT-AGCCCTACAACGCAACCCTCTCCGTGCATCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAA?GAGGCACTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACATTTATGATTGACTGTTGAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCCCCCTATTACG-GAGCGTACCAATCTGATCTATTTGTTACATTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTCGCTGGCCTCAAAATTCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCGGGTCCCTCGCGGGGTCGGGTTCTGCGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACATCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTTGGGCGAGCTCACAATAGTCCCCTACGGGACAATTCAATCTGACATCTGACCTCCGAGGTCTGTG-CATTTCACATGTCACGTTGGATAAAATC--TGATGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCATCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCACTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAACCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGCTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGAAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGATATGACACTTATGATTGACTGTTGAAATTT-TAGTCCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTC?GACCCCCGGCATGGGCGATACCTGACCGTATGCGGCTCCCCCTATTACG-GAGCGTACCAATCTGATCTATTTGTTACGTTTTTCATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGTG-CATTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCCTTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGACCTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GA?GCCTTGTGACATGACACTTATCATTGACTGTTAAAAATT-TAGTTCCCTTCCCCCGTCTCCACTTCTTCATGACCGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACA-GAGCGTACCAATCTGATGTATCTGTTGCCTATTATATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGATATTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGCG-CATTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCCTTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGACCTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATGCCTTGTGACATGACACTTATCATTGACTGTTAAAAATT-TAGTTCCCTTCCCCCGTCTCCACTTCTTCATGACCGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACA-GAGCGTACCAATCTGATGTATCTGTTGCCTATTTTATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGATATTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGTG-CATTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGACAACGAGGCCTTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGACCTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATGCCTTGTGACATGACACTTATCATTGACTGTTAAAAATT-TAGTTCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACAAGAGCGTACCAATCTGAAGTATCTGTTGCCTATTATAAAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGATATTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGTG-CATTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCCTTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGACCTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GACGCCTTGTGACATGACACTTATCATTGACTGTTAAAAAAT-TAGTTCCCTTCCCCCGTCTCCACTTCTTCATGACCGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACA-GAGCGTACCAATCTGATGTATCTGTTGCCTATTTTATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGATATTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGTG-CATTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCCTTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGACCTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GACGCCTTGTGACATGACACTTATCATTGACTGTTAAAAAAT-TAGTTCCCTTCCCCCGTCTCCACTTCTTCATGACCGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACA-GAGCGTACCAATCTGATGTATCTGTTGCCTATTTTATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGTG-CATTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCACTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGTTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGACATGACACTTATCATTGACTGTTAAAAATT-TAGTTCCCTTCCCCCGTCTCCACTTCTTCATGACCGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACA-GAGCGTACCAATCTGATGTATCTGTTGCCTATTTTATAGGTTGCTGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGATATTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGTG-CATTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCCTTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGACCTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATGCCTTGTGACATGACACTTATCATTGACTGTTAAAAATT-TAGTTCCCTTCCCCCGTCTCCACTTCTTCATGACTGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACA-GAGCGTACCAATCTGATGTATCTGTTGCCTATTATATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGATATTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGTG-CATTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCCTTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGACCTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GACGCCTTGTGACATGACACTTATCATTGACTGTTAAAAATT-TAGTTCCCTTCCCCCGTCTCCACTTCTTCATGACCGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACA-GAGCGTACCAATCTGATGTATCTGTTGCCTATTTTATAGGTTGCCGC 
AGTACAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGATATTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGTG-CATTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGA?AACGAGGCACTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGATCTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGTTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATTCCTTGTGACATGACACTTATCATTGACTGTTAAAAATT-TAGTTCCCTTCCCCCGTCTCCACTTCTTCATGACCGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACA-GAGCGTACCAATCTGATGTATCTGTTGCCTATTTTATAGGTTGCTGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGTG-CATTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCCTTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGACCTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GACGCCTTGTGACATGACACTTATCATTGACTGTTAAAAAAT-TAGTTCCCTTCCCCCGTCTCCACTTCTTCATGACCGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACA-GAGCGTACCAATCTGATGTATCTGTTGCCTATTTTATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGTG-CATTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCCTTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGACCTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATGCCTTGTGACATGACACTTATCATTGACTGTTAAAAATT-TAGTTCCCTTCCCCCGTCTCCACTTCTTCATGACCGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACA-GAGCGTACCAATCTGATGTATCTGTTGCT--TTTTATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGTG-CATTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCCTTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGACCTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GACGCCTTGTGACATGACACTTATCATTGACTGTTAAAAAAT-TAGTTCCCTTCCCCCGTCTCCACTTCTTCATGACCGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACA-GAGCGTACCAATCTGATGTATCTGTTGCCTATTTTATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGTG-CATTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCCTTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGACCTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GACGCCTTGTGACATGACACTTATCATTGACTGTTAAAAAAT-TAGTTCCCTTCCCCCGTCTCCACTTCTTCATGACCGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACA-GAGCGTACCAATCTGATGTATCTGTTGCCTATTTTATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGACTTTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGTG-CATTTCAAATGTCACGTTGGAGAAAATT--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGACAACGAGGCCTTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGACCTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGCTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GATGCCTTGTGACATGACACTTATCATTGACTGTTAAAAATT-TAGTTCCCTTCCCCCGTCTCCACTTCTTCATGACCGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACA-GAGCGTACCAATCTGATGTATCTGTTGCCTATTTTATAGGTTGCCGC 
AGTTCAGAAAAGGGTTGTAGCTGGCCTCAAA-TCCGGGGCATGTGCACACCCTGCTCATCCACTCTCACACCTGTGCACTTTCTGTAGGTCGGTTCGGGATCTGGTCCCTCGCGGGGTCGGGTTCTGTGCCTTCCTATGTACAATCACAAACGCTTCAGTATTCAGAATGTCATTGCGATAATTAAAACGCATCTTATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATGGAATTCTCAACCCACACGTCCTTGTGATGTGGACGGGCTTGGATATTGGAGGTTTCTGCCGGCCCCCCATTCGGGTCGGCTCCTCTGGAATGCATTAGCTCCATCCCTTGCGGATCGGCTCTCGGTGTGATAATTGTCTACGCCGTGGTCGTTGAAGCCTCAGTCGGGCGAGCTTATAATCGTCCCCTCCGGGACAATCGAATATGACATCTGACCTCCGAGGTCTGTG-CATTTCAAATGTCACGTTGGAGAAAATC--TGACGACCGTTGATCGT-AGCCCTACAACGCAACCCTCTCCGTGCACCAACTGGTCGAGAACTCTGATGAGACTTTCTGCATTGATAACGAGGCATTGTACGACATTTGCTTCCGGACACTGAAGCTGACGACACCGACATACGGCGA?CTGAATCACCTCATTTCCATCGTCATGTCCGGTATTACAACTTGTTTGCGTTTCCCTGGTCAGTTGAACTCCGATCTCCGGAAGTTGGCTGTCAACATGGGTGA-GTTCTCACTT-GA??CCTTGTGACATGACACTTATCATTGACTGTTAAAAAAT-TAGTTCCCTTCCCCCGTCTCCACTTCTTCATGACCGGTTTCGCGCCCTTGACTGCGCGCGGCAGCCAGCAGTACCGTGCTGTCACTGTACCCGAGCTGACTCAACAGATGTTCGATGCCAAGAACATGATGGCTGCGTCCGACCCCCGGCATGGCCGATACCTCACTGTATGCAACATCC--TAGCACA-GAGCGTACCAATCTGATGTATCTGTTGCCTATTTTATAGGTTGCCGC RNeXML/inst/examples/comp_analysis.xml0000644000176200001440000001371112641021656017503 0ustar liggesusers RNeXML/inst/examples/simmap.xml0000644000176200001440000002475512641021656016142 0ustar liggesusers t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13 t14 t15 t16 t17 t18 t19 t20 t21 t22 t23 t24 t25 t26 t27 t28 t29 t30 t31 t32 t33 t34 t35 t36 t37 t38 t39 t40 t41 t42 t43 t44 t45 t46 t47 t48 t49 t50 t51 t52 t53 t54 t55 t56 t57 t58 t59 t60 t61 t62 t63 t64 t65 t66 t67 t68 t69 t70 t71 t72 t73 t74 t75 t76 t77 t78 t79 t80 t81 t82 t83 t84 t85 t86 t87 t88 t89 t90 t91 t92 t93 t94 t95 t96 t97 t98 t99 t100 t101 t102 t103 t104 t105 t106 t107 t108 t109 t110 t111 t112 t113 t114 t115 t116 t117 t118 t119 t120 t121 t122 t123 t124 t125 t126 t127 t128 t129 
(((((((((1:0.029712,2:0.029712):0.081255,3:0.110967):0.082038,4:0.193005):0.0653,(5:0.227507,(6:0.114204,((7:0.022164,8:0.022164):0.078416,9:0.10058):0.013624):0.113303):0.030798):0.022745,(10:0.036822,11:0.036822):0.244228):0.301951,((12:0.13643,(13:0.066417,14:0.066417):0.070013):0.313489,(((15:0.181312,(16:0.040903,17:0.040903):0.140409):0.047569,18:0.228881):0.024976,19:0.253857):0.196062):0.133082):0.142307,(((((20:0.061614,((21:0.012292,22:0.012292):0.004394,23:0.016686):0.044928):0.333716,(24:0.283505,((25:0.047565,26:0.047565):0.163808,(27:0.054112,28:0.054112):0.157261):0.072132):0.111825):0.003716,((((29:0.018011,30:0.018011):0.161776,31:0.179787):0.031075,32:0.210862):0.008284,((33:0.04511,(34:0.025024,35:0.025024):0.020086):0.022372,36:0.067482):0.151664):0.1799):0.066447,(((37:0.134245,(38:0.058193,39:0.058193):0.076052):0.11923,((40:0.029424,41:0.029424):0.064453,(42:0.088798,(43:0.006835,44:0.006835):0.081963):0.005079):0.159598):0.201775,(45:0.283876,(46:0.054755,47:0.054755):0.229121):0.171374):0.010243):0.082374,((48:0.014428,49:0.014428):0.0732,(50:0.079839,(51:0.063663,52:0.063663):0.016176):0.007789):0.460239):0.177441):0.093657,(53:0.190817,54:0.190817):0.628148):0.181035,(((55:0.390139,56:0.390139):0.141789,((57:0.287908,58:0.287908):0.192207,((59:0.001281,60:0.001281):0.402151,(((61:0.170246,((62:0.048774,63:0.048774):0.063415,64:0.112189):0.058057):0.100141,65:0.270387):0.10385,66:0.374237):0.029195):0.076683):0.051813):0.302289,(((((67:0.198747,68:0.198747):0.084922,((69:0.050633,70:0.050633):0.204302,(71:0.234648,(72:0.193941,(73:0.057538,(74:0.039504,75:0.039504):0.018034):0.136403):0.040707):0.020287):0.028734):0.096779,((((((76:0.067239,77:0.067239):0.040487,(78:0.080094,79:0.080094):0.027632):0.050747,(80:0.022775,81:0.022775):0.135698):0.103693,(82:0.185482,83:0.185482):0.076684):0.04376,(84:0.022435,85:0.022435):0.283491):0.018402,86:0.324328):0.05612):0.066885,(87:0.09773,88:0.09773):0.349603):0.374617,((((((((89:0.011431,90:0.01143
1):0.141046,91:0.152477):0.034806,92:0.187283):0.105663,(93:0.002354,94:0.002354):0.290592):0.111032,((((95:0.163727,(96:0.011349,97:0.011349):0.152378):0.18617,98:0.349897):0.014207,(99:0.042468,100:0.042468):0.321636):0.019155,(101:0.105583,(102:0.064286,103:0.064286):0.041297):0.277676):0.020719):0.162541,104:0.566519):0.069693,((105:0.004823,106:0.004823):0.553279,(((107:0.058968,108:0.058968):0.012749,109:0.071717):0.136388,110:0.208105):0.349997):0.07811):0.032159,(((111:0.310714,112:0.310714):0.2969,(((113:0.321349,114:0.321349):0.12059,((115:0.122887,116:0.122887):0.193406,(117:0.253592,(118:0.171163,119:0.171163):0.082429):0.062701):0.125646):0.066612,(120:0.056834,121:0.056834):0.451717):0.099063):0.027002,((((122:0.024476,123:0.024476):0.170791,124:0.195267):0.014201,125:0.209468):0.221987,((126:0.042169,127:0.042169):0.050774,(128:0.03018,129:0.03018):0.062763):0.338512):0.203161):0.033755):0.153579):0.012267):0.165783) RNeXML/inst/examples/primates_meta_xslt.xml0000644000176200001440000056432612641021656020563 0ustar liggesusers rvosa 2014-07-03T23:43:34 rutger.vos 2014-07-04T12:39:57 RNeXML/inst/examples/meta_example.xml0000644000176200001440000001365312641021656017310 0ustar liggesusers RNeXML/inst/examples/RDFa2RDFXML.xsl0000644000176200001440000007647612641021656016444 0ustar liggesusers RNeXML/inst/examples/primates_meta.xml0000644000176200001440000025055612641021656017506 0ustar liggesusers RNeXML/inst/examples/geospiza.xml0000644000176200001440000001536412641021656016471 0ustar liggesusers RNeXML/inst/examples/biophylo.xml0000644000176200001440000000211312641021656016461 0ustar liggesusers RNeXML/inst/examples/ncbii.xml0000644000176200001440000002030012641021656015716 0ustar liggesusers RNeXML/inst/examples/merge_data.md0000644000176200001440000001641012641021656016531 0ustar liggesusers``` r library("RNeXML") ``` ## Loading required package: ape ``` r library("dplyr") ``` ## ## Attaching package: 'dplyr' ## ## The following objects are masked 
from 'package:stats':
    ## 
    ##     filter, lag
    ## 
    ## The following objects are masked from 'package:base':
    ## 
    ##     intersect, setdiff, setequal, union

``` r
library("geiger")
knitr::opts_chunk$set(message = FALSE, comment = NA)
```

Let's generate a `NeXML` file using the tree and trait data from the `geiger` package's "primates" data:

``` r
data("primates")
add_trees(primates$phy) %>% 
  add_characters(primates$dat, ., append=TRUE) %>% 
  taxize_nexml() -> nex
```

    Warning in taxize_nexml(.): ID for otu Alouatta_coibensis not found.
    Consider checking the spelling or alternate classification
    Warning in taxize_nexml(.): ID for otu Aotus_hershkovitzi not found.
    Consider checking the spelling or alternate classification
    Warning in taxize_nexml(.): ID for otu Aotus_miconax not found.
    Consider checking the spelling or alternate classification
    Warning in taxize_nexml(.): ID for otu Callicebus_cinerascens not found.
    Consider checking the spelling or alternate classification
    Warning in taxize_nexml(.): ID for otu Callicebus_dubius not found.
    Consider checking the spelling or alternate classification
    Warning in taxize_nexml(.): ID for otu Callicebus_modestus not found.
    Consider checking the spelling or alternate classification
    Warning in taxize_nexml(.): ID for otu Callicebus_oenanthe not found.
    Consider checking the spelling or alternate classification
    Warning in taxize_nexml(.): ID for otu Callicebus_olallae not found.
    Consider checking the spelling or alternate classification
    Warning in taxize_nexml(.): ID for otu Euoticus_pallidus not found.
    Consider checking the spelling or alternate classification
    Warning in taxize_nexml(.): ID for otu Lagothrix_flavicauda not found.
    Consider checking the spelling or alternate classification
    Warning in taxize_nexml(.): ID for otu Leontopithecus_caissara not found.
    Consider checking the spelling or alternate classification
    Warning in taxize_nexml(.): ID for otu Leontopithecus_chrysomela not found.
    Consider checking the spelling or alternate classification
    Warning in taxize_nexml(.): ID for otu Pithecia_aequatorialis not found.
    Consider checking the spelling or alternate classification
    Warning in taxize_nexml(.): ID for otu Pithecia_albicans not found.
    Consider checking the spelling or alternate classification
    Warning in taxize_nexml(.): ID for otu Procolobus_pennantii not found.
    Consider checking the spelling or alternate classification
    Warning in taxize_nexml(.): ID for otu Procolobus_preussi not found.
    Consider checking the spelling or alternate classification
    Warning in taxize_nexml(.): ID for otu Procolobus_rufomitratus not found.
    Consider checking the spelling or alternate classification
    Warning in taxize_nexml(.): ID for otu Tarsius_pumilus not found.
    Consider checking the spelling or alternate classification

(Note that we've used `dplyr`'s cute pipe syntax, but unfortunately our `add_` methods take the `nexml` object as the *second* argument instead of the first, so this isn't as elegant since we need the stupid `.` to show where the piped output should go...)

We now read in the three tables of interest. Note that we tell `get_characters` to give us species labels as their own column, rather than as rownames. The latter is the default only because it plays more nicely with the default format for character matrices that is expected by `geiger` and other phylogenetics packages, but it is in general a silly choice for data manipulation.

``` r
otu_meta <- get_metadata(nex, "otus/otu")
taxa <- get_taxa(nex)
char <- get_characters(nex, rownames_as_col = TRUE)
```

We can take a peek at what the tables look like, just to orient ourselves:

``` r
otu_meta
```

    Source: local data frame [215 x 9]

          id property datatype content     xsi.type        rel
       (chr)    (lgl)    (lgl)   (lgl)        (chr)      (chr)
    1     m1       NA       NA      NA ResourceMeta tc:toTaxon
    2     m2       NA       NA      NA ResourceMeta tc:toTaxon
    3     m3       NA       NA      NA ResourceMeta tc:toTaxon
    4     m4       NA       NA      NA ResourceMeta tc:toTaxon
    5     m5       NA       NA      NA ResourceMeta tc:toTaxon
    6     m6       NA       NA      NA ResourceMeta tc:toTaxon
    7     m7       NA       NA      NA ResourceMeta tc:toTaxon
    8     m8       NA       NA      NA ResourceMeta tc:toTaxon
    9     m9       NA       NA      NA ResourceMeta tc:toTaxon
    10   m10       NA       NA      NA ResourceMeta tc:toTaxon
    ..   ...      ...      ...     ...          ...        ...
    Variables not shown: href (chr), otu (chr), otus (chr)

``` r
taxa
```

    Source: local data frame [233 x 5]

          id                       label about xsi.type  otus
       (chr)                       (chr) (chr)    (lgl) (chr)
    1    ou1 Allenopithecus_nigroviridis  #ou1       NA   os1
    2    ou2         Allocebus_trichotis  #ou2       NA   os1
    3    ou3           Alouatta_belzebul  #ou3       NA   os1
    4    ou4             Alouatta_caraya  #ou4       NA   os1
    5    ou5          Alouatta_coibensis  #ou5       NA   os1
    6    ou6              Alouatta_fusca  #ou6       NA   os1
    7    ou7           Alouatta_palliata  #ou7       NA   os1
    8    ou8              Alouatta_pigra  #ou8       NA   os1
    9    ou9               Alouatta_sara  #ou9       NA   os1
    10  ou10          Alouatta_seniculus #ou10       NA   os1
    ..   ...                         ...   ...      ...   ...

``` r
head(char)
```

                             taxa        x
    1 Allenopithecus_nigroviridis 8.465900
    2          Alouatta_seniculus 8.767173
    3               Galago_alleni 5.521461
    4             Galago_gallarum 5.365976
    5            Galago_matschiei 5.267858
    6               Galago_moholi 5.375278

Now that we have nice `data.frame` objects for all our data, it's easy to join them into the desired table with a few obvious `dplyr` commands:

``` r
taxa %>% 
  left_join(char, by = c("label" = "taxa")) %>% 
  left_join(otu_meta, by = c("id" = "otu")) %>%
  select(id, label, x, href)
```

    Warning in left_join_impl(x, y, by$x, by$y): joining factor and character
    vector, coercing into character vector

    Source: local data frame [233 x 4]

          id                       label        x
       (chr)                       (chr)    (dbl)
    1    ou1 Allenopithecus_nigroviridis 8.465900
    2    ou2         Allocebus_trichotis 4.368181
    3    ou3           Alouatta_belzebul 8.729074
    4    ou4             Alouatta_caraya 8.628735
    5    ou5          Alouatta_coibensis 8.764053
    6    ou6              Alouatta_fusca 8.554489
    7    ou7           Alouatta_palliata 8.791790
    8    ou8              Alouatta_pigra 8.881836
    9    ou9               Alouatta_sara 8.796339
    10  ou10          Alouatta_seniculus 8.767173
    ..   ...                         ...      ...
    Variables not shown: href (chr)

Because these are all from the same otus block anyway, we haven't selected that column, but were it of interest it is also available in the taxa table.
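
To experiment with the same join pattern without building a `nexml` object first, here is a minimal sketch on toy data. The three small data frames and their values are made up for illustration (they only mimic the column names of `taxa`, `char` and `otu_meta`), and the `example.org` URLs are placeholders, not real identifiers.

``` r
library("dplyr")

# Toy stand-ins for the three tables (hypothetical values, not from the primates data)
taxa <- data.frame(id    = c("ou1", "ou2", "ou3"),
                   label = c("A_a", "B_b", "C_c"),
                   stringsAsFactors = FALSE)
char <- data.frame(taxa = c("A_a", "C_c"),
                   x    = c(1.5, 2.5),
                   stringsAsFactors = FALSE)
otu_meta <- data.frame(otu  = c("ou1", "ou2"),
                       href = c("http://example.org/1", "http://example.org/2"),
                       stringsAsFactors = FALSE)

# left_join keeps every row of `taxa`; a named `by` vector matches
# columns whose names differ between the two tables
joined <- taxa %>%
  left_join(char, by = c("label" = "taxa")) %>%
  left_join(otu_meta, by = c("id" = "otu")) %>%
  select(id, label, x, href)

joined
```

Taxa without a matching trait value or metadata row are kept, with `NA` in the corresponding column, which is why the real table above still has all 233 taxa. Using character columns rather than factors on both sides also avoids the coercion warning seen in the real join.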
RNeXML/inst/examples/multitrees.xml0000644000176200001440000001466312641021656017046 0ustar liggesusers
RNeXML/inst/examples/merge_data.Rmd0000644000176200001440000000334612641021656016657 0ustar liggesusers
---
output: 
  md_document:
    variant: markdown_github
---

```{r}
library("RNeXML")
library("dplyr")
library("geiger")
knitr::opts_chunk$set(message = FALSE, warning=FALSE, comment = NA)
```

Let's generate a `NeXML` file using the tree and trait data from the `geiger` package's "primates" data:

```{r}
data("primates")
add_trees(primates$phy) %>% 
  add_characters(primates$dat, ., append=TRUE) %>% 
  taxize_nexml() -> nex
```

(Note that we've used `dplyr`'s cute pipe syntax, but unfortunately our `add_` methods take the `nexml` object as the _second_ argument instead of the first, so this isn't as elegant since we need the stupid `.` to show where the piped output should go...)

We now read in the three tables of interest. Note that we tell `get_characters` to give us species labels as their own column, rather than as rownames. The latter is the default only because it plays more nicely with the default format for character matrices that is expected by `geiger` and other phylogenetics packages, but it is in general a silly choice for data manipulation.

```{r}
otu_meta <- get_metadata(nex, "otus/otu")
taxa <- get_taxa(nex)
char <- get_characters(nex, rownames_as_col = TRUE)
```

We can take a peek at what the tables look like, just to orient ourselves:

```{r}
otu_meta
taxa
head(char)
```

Now that we have nice `data.frame` objects for all our data, it's easy to join them into the desired table with a few obvious `dplyr` commands:

```{r}
taxa %>% 
  left_join(char, by = c("label" = "taxa")) %>% 
  left_join(otu_meta, by = c("id" = "otu")) %>%
  select(id, label, x, href)
```

Because these are all from the same otus block anyway, we haven't selected that column, but were it of interest it is also available in the taxa table.
RNeXML/inst/examples/primates_from_R.xml0000644000176200001440000055234712641021656020007 0ustar liggesusers RNeXML/inst/examples/some_missing_branchlengths.xml0000644000176200001440000001226012641021656022236 0ustar liggesusers RNeXML/inst/examples/phenoscape.xml0000644000176200001440000000370512641021656016771 0ustar liggesusers RNeXML/inst/examples/sparql.newick0000644000176200001440000001177212641021656016631 0ustar liggesusers(((((Avahi_laniger)Avahi,(Propithecus_verreauxi,Propithecus_tattersalli,Propithecus_diadema)Propithecus,(Indri_indri)Indri)Indriidae,((Varecia_variegata)Varecia,(Prolemur_simus)Prolemur,(Hapalemur_griseus,Hapalemur_aureus)Hapalemur,(Eulemur_rubriventer,Eulemur_mongoz,Eulemur_macaco,Eulemur_fulvus,Eulemur_coronatus)Eulemur,(Lemur_catta)Lemur)Lemuridae,((Lepilemur_septentrionalis,Lepilemur_ruficaudatus,Lepilemur_mustelinus,Lepilemur_leucopus,Lepilemur_edwardsi,Lepilemur_dorsalis)Lepilemur)Lepilemuridae,((Cheirogaleus_medius,Cheirogaleus_major)Cheirogaleus,(Allocebus_trichotis)Allocebus,(Phaner_furcifer)Phaner,(Mirza_coquereli)Mirza,(Microcebus_rufus,Microcebus_murinus)Microcebus)Cheirogaleidae)Lemuriformes,(((Nycticebus_pygmaeus,Nycticebus_coucang)Nycticebus,(Loris_tardigradus)Loris,(Perodicticus_potto)Perodicticus,(Arctocebus_calabarensis)Arctocebus)Lorisidae,((Daubentonia_madagascariensis)Daubentonia)Daubentoniidae)Chiromyiformes,(((Otolemur_garnettii,Otolemur_crassicaudatus)Otolemur,(Euoticus_elegantulus)Euoticus,(Galagoides_zanzibaricus,Galagoides_demidovii)Galagoides,(Galago_senegalensis,Galago_moholi,Galago_matschiei,Galago_gallarum,Galago_alleni)Galago)Galagidae)Lorisiformes)Strepsirrhini,((((Tarsius_syrichta,Tarsius_tarsier,Tarsius_bancanus)Tarsius)Tarsiidae)Tarsiiformes,((((((Pongo_pygmaeus)Pongo)Ponginae,((Homo_sapiens)Homo,(Pan_troglodytes,Pan_paniscus)Pan,(Gorilla_gorilla)Gorilla)Homininae)Hominidae,((Hylobates_pileatus,Hylobates_muelleri,Hylobates_moloch,Hylobates_lar,Hylobates_klossii,Hylobates_agilis)Hylobates,(Nomascus_leucog
enys,Nomascus_gabriellae,Nomascus_concolor)Nomascus,(Hoolock_hoolock)Hoolock,(Hylobates_syndactylus)Symphalangus)Hylobatidae)Hominoidea,((((Rhinopithecus_roxellana,Rhinopithecus_brelichi,Rhinopithecus_bieti,Rhinopithecus_avunculus)Rhinopithecus,(Simias_concolor)Simias,(Presbytis_rubicunda,Presbytis_potenziani,Presbytis_melalophos,Presbytis_frontata,Presbytis_comata)Presbytis,(Colobus_satanas,Colobus_polykomos,Colobus_guereza,Colobus_angolensis)Colobus,(Nasalis_larvatus)Nasalis,(Procolobus_verus)Procolobus,(Pygathrix_nemaeus)Pygathrix,(Semnopithecus_entellus)Semnopithecus,(Trachypithecus_vetulus,Trachypithecus_johnii,Trachypithecus_francoisi,Trachypithecus_auratus,Trachypithecus_pileatus,Trachypithecus_obscurus,Trachypithecus_phayrei,Trachypithecus_cristatus,Trachypithecus_geei)Trachypithecus,(Piliocolobus_badius)Piliocolobus)Colobinae,((Lophocebus_albigena)Lophocebus,(Theropithecus_gelada)Theropithecus,(Allenopithecus_nigroviridis)Allenopithecus,(Mandrillus_sphinx,Mandrillus_leucophaeus)Mandrillus,(Papio_hamadryas)Papio,(Erythrocebus_patas)Erythrocebus,(Macaca_radiata,Macaca_sylvanus,Macaca_ochreata,Macaca_thibetana,Macaca_fuscata,Macaca_mulatta,Macaca_silenus,Macaca_arctoides,Macaca_nemestrina,Macaca_fascicularis,Macaca_tonkeana,Macaca_assamensis,Macaca_sinica,Macaca_cyclopis,Macaca_maura,Macaca_nigra)Macaca,(Chlorocebus_aethiops)Chlorocebus,(Cercopithecus_petaurista,Cercopithecus_cephus,Cercopithecus_diana,Cercopithecus_solatus,Cercopithecus_dryas,Cercopithecus_mona,Cercopithecus_preussi,Cercopithecus_hamlyni,Cercopithecus_lhoesti,Cercopithecus_wolfi,Cercopithecus_nictitans,Cercopithecus_erythrotis,Cercopithecus_erythrogaster,Cercopithecus_ascanius,Cercopithecus_campbelli,Cercopithecus_pogonias,Cercopithecus_mitis,Cercopithecus_neglectus)Cercopithecus,(Miopithecus_talapoin)Miopithecus,(Cercocebus_torquatus,Cercocebus_galeritus)Cercocebus)Cercopithecinae)Cercopithecidae)Cercopithecoidea)Catarrhini,((((Alouatta_seniculus,Alouatta_pigra,Alouatta_palliata,Alouatta_gua
riba,Alouatta_caraya,Alouatta_belzebul)Alouatta)Alouattinae,((Ateles_paniscus,Ateles_geoffroyi,Ateles_fusciceps,Ateles_chamek,Ateles_belzebuth)Ateles,(Lagothrix_lagotricha)Lagothrix,(Brachyteles_arachnoides)Brachyteles)Atelinae)Atelidae,(((Callimico_goeldii)Callimico,(Callithrix_flaviceps,Callithrix_jacchus,Callithrix_argentata,Callithrix_penicillata,Callithrix_aurita,Callithrix_geoffroyi,Callithrix_pygmaea,Callithrix_humeralifera,Callithrix_kuhlii)Callithrix,(Saguinus_fuscicollis,Saguinus_oedipus,Saguinus_labiatus,Saguinus_imperator,Saguinus_leucopus,Saguinus_nigricollis,Saguinus_tripartitus,Saguinus_geoffroyi,Saguinus_mystax,Saguinus_bicolor,Saguinus_midas,Saguinus_inustus)Saguinus,(Leontopithecus_rosalia,Leontopithecus_chrysopygus,Leontopithecus_chrysomelas)Leontopithecus)Callitrichinae,((Saimiri_ustus,Saimiri_sciureus,Saimiri_oerstedii,Saimiri_boliviensis)Saimiri)Saimiriinae,((Cebus_olivaceus,Cebus_capucinus,Cebus_apella,Cebus_albifrons)Cebus)Cebinae)Cebidae,((Aotus_lemurinus,Aotus_azarae,Aotus_trivirgatus,Aotus_nigriceps,Aotus_infulatus,Aotus_nancymaae,Aotus_brumbacki,Aotus_vociferans)Aotus)Aotidae,(((Cacajao_melanocephalus,Cacajao_calvus)Cacajao,(Pithecia_pithecia,Pithecia_monachus,Pithecia_irrorata)Pithecia,(Chiropotes_satanas,Chiropotes_albinasus)Chiropotes)Pitheciinae,((Callicebus_donacophilus,Callicebus_personatus,Callicebus_cupreus,Callicebus_hoffmannsi,Callicebus_caligatus,Callicebus_torquatus,Callicebus_moloch,Callicebus_brunneus)Callicebus)Callicebinae)Pitheciidae)Platyrrhini)Simiiformes)Haplorrhini)Primates; RNeXML/inst/examples/simmap.nex0000644000176200001440000001200012641021656016110 0ustar 
liggesusers(((((((((t1:{C,0.029712},t2:{C,0.029712}):{C,0.081255},t3:{C,0.110967}):{C,0.082038},t4:{C,0.193005}):{C,0.0653},(t5:{C,0.227507},(t6:{B,0.114204},((t7:{B,0.022164},t8:{B,0.022164}):{B,0.078416},t9:{B,0.10058}):{B,0.013624}):{B,0.0492202:A,0.05548486:C,0.00859794}):{C,0.030798}):{C,0.022745},(t10:{C,0.036822},t11:{C,0.036822}):{C,0.244228}):{C,0.301951},((t12:{B,0.13643},(t13:{B,0.066417},t14:{B,0.066417}):{B,0.070013}):{B,0.313489},(((t15:{A,0.01105711:B,0.08633207:C,0.08392283},(t16:{C,0.040903},t17:{C,0.040903}):{C,0.140409}):{C,0.047569},t18:{C,0.228881}):{C,0.024976},t19:{C,0.253857}):{C,0.15615481:B,0.03990719}):{B,0.06598734:C,0.06709466}):{C,0.142307},(((((t20:{C,0.061614},((t21:{C,0.012292},t22:{C,0.012292}):{C,0.004394},t23:{C,0.016686}):{C,0.044928}):{C,0.18261292:B,0.15110308},(t24:{C,0.14691033:B,0.13659467},((t25:{C,0.047565},t26:{C,0.047565}):{C,0.03623597:B,0.12757203},(t27:{B,0.054112},t28:{B,0.054112}):{B,0.09377572:C,0.04499636:B,0.01848892}):{B,0.072132}):{B,0.111825}):{B,0.003716},((((t29:{B,0.018011},t30:{B,0.018011}):{B,0.161776},t31:{B,0.179787}):{B,0.031075},t32:{C,0.09395762:B,0.11690438}):{B,0.008284},((t33:{A,0.04511},(t34:{A,0.025024},t35:{A,0.025024}):{A,0.020086}):{A,0.022372},t36:{A,0.067482}):{A,0.0220329:B,0.1296311}):{B,0.1799}):{B,0.066447},(((t37:{A,0.01881787:B,0.11542713},(t38:{C,0.058193},t39:{C,0.058193}):{C,0.0452913:B,0.0307607}):{B,0.11923},((t40:{B,0.029424},t41:{B,0.029424}):{B,0.064453},(t42:{B,0.088798},(t43:{B,0.006835},t44:{B,0.006835}):{B,0.081963}):{B,0.005079}):{B,0.159598}):{B,0.201775},(t45:{A,0.27169364:B,0.01218236},(t46:{B,0.054755},t47:{B,0.054755}):{B,0.01137782:C,0.00527465:B,0.21246852}):{B,0.171374}):{B,0.010243}):{B,0.082374},((t48:{A,0.014428},t49:{A,0.014428}):{A,0.06602405:B,0.00717595},(t50:{B,0.079839},(t51:{B,0.063663},t52:{B,0.063663}):{B,0.016176}):{B,0.007789}):{B,0.460239}):{B,0.13318855:C,0.04425245}):{C,0.093657},(t53:{B,0.190817},t54:{B,0.190817}):{B,0.12304625:A,0.15363605:B,0.3
0567886:C,0.04578684}):{C,0.181035},(((t55:{A,0.25796546:C,0.13217354},t56:{C,0.390139}):{C,0.141789},((t57:{B,0.287908},t58:{B,0.287908}):{B,0.13331858:C,0.05888842},((t59:{A,0.001281},t60:{A,0.001281}):{A,0.1619706:B,0.00648471:A,0.00343325:B,0.1447788:C,0.08548364},(((t61:{A,0.170246},((t62:{A,0.048774},t63:{A,0.048774}):{A,0.063415},t64:{C,0.08672847:A,0.02546053}):{A,0.058057}):{A,0.0120117:C,0.0881293},t65:{C,0.270387}):{C,0.03781934:B,0.06603066},t66:{A,0.36636122:B,0.00787578}):{B,0.0168208:C,0.0123742}):{C,0.076683}):{C,0.051813}):{C,0.302289},(((((t67:{C,0.198747},t68:{C,0.198747}):{C,0.084922},((t69:{C,0.050633},t70:{C,0.050633}):{C,0.204302},(t71:{C,0.234648},(t72:{B,0.03930386:A,0.13807972:C,0.01655742},(t73:{B,0.01176706:C,0.04577094},(t74:{C,0.039504},t75:{C,0.039504}):{C,0.018034}):{C,0.136403}):{C,0.040707}):{C,0.020287}):{C,0.028734}):{C,0.096779},((((((t76:{C,0.067239},t77:{C,0.067239}):{C,0.040487},(t78:{C,0.080094},t79:{C,0.080094}):{C,0.027632}):{C,0.050747},(t80:{B,0.022775},t81:{B,0.022775}):{B,0.08754141:C,0.04815659}):{C,0.103693},(t82:{A,0.07025162:C,0.11523038},t83:{B,0.10137623:C,0.08410577}):{C,0.076684}):{C,0.04376},(t84:{C,0.022435},t85:{C,0.022435}):{C,0.283491}):{C,0.018402},t86:{C,0.324328}):{C,0.05612}):{C,0.066885},(t87:{C,0.09773},t88:{A,0.07109598:B,0.01773404:C,0.00889998}):{C,0.349603}):{C,0.374617},((((((((t89:{B,0.011431},t90:{B,0.011431}):{B,0.03600887:C,0.10503713},t91:{C,0.152477}):{C,0.034806},t92:{C,0.187283}):{C,0.105663},(t93:{B,0.002354},t94:{B,0.002354}):{B,0.05707722:C,0.23351478}):{C,0.111032},((((t95:{B,0.163727},(t96:{B,0.011349},t97:{B,0.011349}):{B,0.152378}):{B,0.18617},t98:{C,0.0114988:B,0.02800592:A,0.06698869:B,0.24340359}):{B,0.014207},(t99:{A,0.042468},t100:{A,0.042468}):{A,0.0244259:B,0.2972101}):{B,0.01022913:C,0.00892587},(t101:{C,0.105583},(t102:{C,0.064286},t103:{C,0.064286}):{C,0.041297}):{C,0.277676}):{C,0.020719}):{C,0.162541},t104:{B,0.06026646:C,0.24619528:A,0.16340273:C,0.09665452}):{C,0.0696
93},((t105:{A,0.004823},t106:{A,0.004823}):{A,0.28578406:B,0.26749494},(((t107:{A,0.04327952:C,0.01568848},t108:{C,0.058968}):{C,0.012749},t109:{C,0.071717}):{C,0.12764508:B,0.00874292},t110:{B,0.16660987:A,0.01515606:B,0.02633907}):{B,0.349997}):{B,0.07722562:C,0.00088438}):{C,0.032159},(((t111:{B,0.08549674:A,0.06290487:C,0.16231239},t112:{C,0.310714}):{C,0.2969},(((t113:{B,0.14396234:C,0.17738666},t114:{B,0.22907891:C,0.09227009}):{C,0.12059},((t115:{B,0.122887},t116:{B,0.122887}):{B,0.18466489:C,0.00874111},(t117:{C,0.10759561:B,0.14599639},(t118:{A,0.1120285:B,0.0591345},t119:{B,0.171163}):{B,0.082429}):{B,0.00630766:C,0.05639334}):{C,0.125646}):{C,0.066612},(t120:{A,0.056834},t121:{A,0.056834}):{A,0.22197082:C,0.22974618}):{C,0.099063}):{C,0.027002},((((t122:{C,0.00932159:A,0.01515441},t123:{A,0.024476}):{A,0.16301338:B,0.00777762},t124:{C,0.05317007:B,0.14209693}):{B,0.014201},t125:{B,0.209468}):{B,0.09836283:C,0.12362417},((t126:{C,0.042169},t127:{C,0.042169}):{C,0.050774},(t128:{C,0.03018},t129:{C,0.03018}):{C,0.062763}):{C,0.338512}):{C,0.203161}):{C,0.033755}):{C,0.153579}):{C,0.012267}):{C,0.165783}); RNeXML/inst/examples/primates.xml0000644000176200001440000057727612641021656016513 0ustar liggesusers RNeXML/inst/CITATION0000644000176200001440000000110512734262162013433 0ustar liggesusersbibentry(bibtype = "Article", header = "To cite RNeXML in publications, please use:", title = "{RNeXML}: {A} Package for Reading and Writing Richly Annotated Phylogenetic, Character, and Trait Data in {R}", journal = "Methods in Ecology and Evolution", author = c( person("Carl", "Boettiger"), person("Scott", "Chamberlain"), person("Rutger", "Vos"), person("Hilmar", "Lapp")), year = 2016, volume = 7, pages = "352--357", doi = "10.1111/2041-210X.12469") RNeXML/inst/simmap.md0000644000176200001440000000333412641021656014112 0ustar liggesusers## simmap NeXML definitions - Author: Carl Boettiger - Initial version: 2014-03-21 Definitions of the `simmap` namespace, as defined 
for use in `RNeXML`. The prefix `nex:` refers to the [NeXML schema](http://www.nexml.org/2009).

term | definition
------------------- | -------------
`simmap:reconstructions` | A container of one or more stochastic character map reconstructions, as a `meta` child of the `nex:edge` element to which the contained stochastic character map reconstructions are being assigned.
`simmap:reconstruction` | A single stochastic character map reconstruction for a given `nex:edge`. Normally nested within a `simmap:reconstructions` element.
`simmap:char` | The id of a character trait, as defined by the `nex:char` element with this value as its `id`. This is a property of a `simmap:reconstruction`.
`simmap:stateChange` | A character state assignment to the given `nex:edge` during a specified interval, as a property of a `simmap:reconstruction`. Must have children `simmap:order`, `simmap:length`, and `simmap:state`.
`simmap:order` | The chronological order (from the root) in which the state is assigned to the edge. An edge that does not change states still has `simmap:order` 1. This is a property of a `simmap:stateChange`.
`simmap:length` | The duration for which the edge occupies the assigned state, in the same units as the `nex:length` attribute defined on the `nex:edge` being annotated. This is a property of a `simmap:stateChange`.
`simmap:state` | The id of a `nex:state` of the `nex:char` identified by the `simmap:char` property of the `simmap:reconstruction`. This is a property of a `simmap:stateChange`.
RNeXML/inst/doc/0000755000176200001440000000000012641021656013044 5ustar liggesusers
RNeXML/inst/doc/metadata.Rmd0000644000176200001440000001420212641021656015267 0ustar liggesusers
---
title: "Handling Metadata in RNeXML"
author:
  - Carl Boettiger
  - Scott Chamberlain
  - Rutger Vos
  - Hilmar Lapp
output: html_vignette
---

```{r compile-settings, include=FALSE}
library("methods")
library("knitr")
opts_chunk$set(tidy = FALSE, warning = FALSE, message = FALSE,
               cache = FALSE, comment = NA, verbose = TRUE)
basename <- gsub(".Rmd", "", knitr:::knit_concord$get('infile'))
```

## Writing NeXML metadata

The `add_basic_meta()` function takes as input an existing `nexml` object (like the other `add_` functions, if none is provided it will create one), and at the time of this writing any of the following parameters: `title`, `description`, `creator`, `pubdate`, `rights`, `publisher`, `citation`. Other metadata elements and corresponding parameters may be added in the future.

Load the packages and data:

```{r}
library('RNeXML')
data(bird.orders)
```

Create a `nexml` object for the phylogeny `bird.orders` and add appropriate metadata:

```{r}
birds <- add_trees(bird.orders)
birds <- add_basic_meta(
  title = "Phylogeny of the Orders of Birds From Sibley and Ahlquist",
  description = "This data set describes the phylogenetic relationships of the orders of birds as reported by Sibley and Ahlquist (1990). Sibley and Ahlquist inferred this phylogeny from an extensive number of DNA/DNA hybridization experiments. The ``tapestry'' reported by these two authors (more than 1000 species out of the ca. 9000 extant bird species) generated a lot of debates. The present tree is based on the relationships among orders. The branch lengths were calculated from the values of Delta T50H as found in Sibley and Ahlquist (1990, fig. 353).",
  citation = "Sibley, C. G. and Ahlquist, J. E. (1990) Phylogeny and classification of birds: a study in molecular evolution. New Haven: Yale University Press.",
  creator = "Sibley, C. G. and Ahlquist, J. E.",
  nexml = birds)
```

Instead of a literal string, citations can also be provided as R's `bibentry` type, which is the type in which R package citations are returned:

```{r}
birds <- add_basic_meta(citation = citation("ape"), nexml = birds)
```

## Taxonomic identifiers

The `taxize_nexml()` function uses the R package `taxize` [@Chamberlain_2013] to check each taxon label against the NCBI database. If a unique match is found, a metadata annotation is added to the taxon, providing the NCBI identification number for the taxonomic unit.

```{r message=FALSE, results='hide'}
birds <- taxize_nexml(birds, "NCBI")
```

If no match is found, the user is warned to check for possible typographic errors in the taxonomic labels provided. If multiple matches are found, the user will be prompted to choose between them.

## Custom metadata extensions

We can get a list of namespaces along with their prefixes from the `nexml` object:

```{r}
prefixes <- get_namespaces(birds)
prefixes["dc"]
```

We create a `meta` element containing an annotation, here a modification date, using the `meta()` function:

```{r}
modified <- meta(property = "prism:modificationDate", content = "2013-10-04")
```

We can add this annotation to our existing `birds` NeXML file using the `add_meta()` function. Because we do not specify a level, it is added to the root node, referring to the NeXML file as a whole.

```{r}
birds <- add_meta(modified, birds)
```

The built-in vocabularies are just the tip of the iceberg of established vocabularies. Here we add an annotation from the `skos` namespace which describes the history of where the data comes from:

```{r}
history <- meta(property = "skos:historyNote",
  content = "Mapped from the bird.orders data in the ape package using RNeXML")
```

Because `skos` is not in the current namespace list, we add it with a URL when adding this meta element. We also specify that this annotation be placed at the level of the `trees` sub-node in the NeXML file.

```{r}
birds <- add_meta(history,
                  birds,
                  level = "trees",
                  namespaces = c(skos = "http://www.w3.org/2004/02/skos/core#"))
```

For finer control of the level at which a `meta` element is added, we will manipulate the `nexml` R object directly using S4 sub-setting, as shown in the supplement.

Much richer metadata annotation is possible. Later we illustrate how metadata annotation can be used to extend the base NeXML format to represent new forms of data while maintaining compatibility with any NeXML parser. The `RNeXML` package can be easily extended to support helper functions such as `taxize_nexml` to add additional metadata without imposing a large burden on the user.

## Reading NeXML metadata

A call to the `nexml` object prints some metadata summarizing the data structure:

```{r }
birds
```

We can extract all metadata pertaining to the NeXML document as a whole (annotations of the XML root node, `<nexml>`) with the command

```{r}
meta <- get_metadata(birds)
```

This returns a named list of available metadata. We can see the kinds of metadata recorded from the names (showing the first 4):

```{r}
names(meta)[1:4]
```

and can ask for a particular element using the standard list sub-setting mechanism (i.e. either the name of an element or its numeric position),

```{r}
meta[["dc:title"]]
```

All metadata terms must belong to an explicit *namespace* or vocabulary that allows a computer to interpret the term precisely. The prefix (before the `:`) indicates to which vocabulary the term belongs, e.g. `dc` in this case. The `get_namespaces` function tells us the definition of the vocabulary using a link:

```{r}
prefixes <- get_namespaces(birds)
prefixes["dc"]
```

Common metadata can be accessed with a few dedicated functions:

```{r get_citation}
get_citation(birds)
```

```{r get_taxa}
get_taxa(birds)
```

The `get_taxa()` function returns text from the otu element labels, typically used to define taxonomic names, rather than text from explicit meta elements.

We can also access metadata at a specific level (or use `level=all` to extract all meta elements in a list). Here we show only the first few results:

```{r}
otu_meta <- get_metadata(birds, level="otu")
otu_meta[1:4]
```

RNeXML/inst/doc/simmap.R0000644000176200001440000000333712641021656014463 0ustar liggesusers
## ----compile-settings, include=FALSE-------------------------------------
library("methods")
library("knitr")
opts_chunk$set(tidy = FALSE, warning = FALSE, message = FALSE,
               cache = FALSE, comment = NA, verbose = TRUE)
basename <- gsub(".Rmd", "", knitr:::knit_concord$get('infile'))
library("RNeXML")

## ------------------------------------------------------------------------
m <- meta("simmap:reconstructions", children = c(
  meta("simmap:reconstruction", children = c(
    meta("simmap:char", "cr1"),
    meta("simmap:stateChange", children = c(
      meta("simmap:order", 1),
      meta("simmap:length", "0.2030"),
      meta("simmap:state", "s2"))),
    meta("simmap:char", "cr1"),
    meta("simmap:stateChange", children = c(
      meta("simmap:order", 2),
      meta("simmap:length", "0.0022"),
      meta("simmap:state", "s1")))
  ))))

## ------------------------------------------------------------------------
nex <- add_namespaces(c(simmap = "https://github.com/ropensci/RNeXML/tree/master/inst/simmap.md"))

## ------------------------------------------------------------------------
data(simmap_ex)

## ------------------------------------------------------------------------
phy <- nexml_to_simmap(simmap_ex)

## ----Figure1, fig.cap="Stochastic character mapping on a phylogeny, as generated by the
phytools package after parsing the simmap-extended NeXML."----
library("phytools")
plotSimmap(phy)

## ------------------------------------------------------------------------
nex <- simmap_to_nexml(phy)
nexml_write(nex, "simmap.xml")

## ----cleanup, include=FALSE----------------------------------------------
unlink("simmap.xml")

RNeXML/inst/doc/simmap.Rmd0000644000176200001440000001545512641021656015010 0ustar liggesusers
---
title: "Extending NeXML: an example based on simmap"
author:
  - Carl Boettiger
  - Scott Chamberlain
  - Rutger Vos
  - Hilmar Lapp
output: html_vignette
bibliography: references.bib
---

```{r compile-settings, include=FALSE}
library("methods")
library("knitr")
opts_chunk$set(tidy = FALSE, warning = FALSE, message = FALSE,
               cache = FALSE, comment = NA, verbose = TRUE)
basename <- gsub(".Rmd", "", knitr:::knit_concord$get('infile'))
library("RNeXML")
```

## Extending the NeXML standard through metadata annotation.

Here we illustrate this process using the example of stochastic character mapping [@Huelsenbeck_2003]. A stochastic character map is simply an annotation of the branches on a phylogeny, assigning each section of each branch to a particular "state" (typically of a morphological characteristic).

@Bollback_2006 provides a widely used stand-alone software implementation of this method in the software `simmap`, which modified the standard Newick tree format to express this additional information. This can break compatibility with other software, and creates a format that cannot be interpreted without additional information describing this convention. By contrast, the NeXML extension is not only backwards compatible but contains a precise and machine-readable description of what it is encoding.

In this example, we illustrate how to represent the additional information required to define a stochastic character mapping (a `simmap` mapping) in NeXML.

@Revell_2012 describes the `phytools` package for R, which includes utilities for reading, manipulating, and writing `simmap` files in R. In this example, we also show how to define `RNeXML` functions that map the R representation used by Revell (an extension of the `ape` class) into the NeXML extension we have defined by using `RNeXML` functions.

Since a stochastic character map simply assigns different states to parts of a branch (or edge) on the phylogenetic tree, we can create a NeXML representation by annotating the `edge` elements with appropriate `meta` elements. These elements need to describe the character state being assigned and the duration (in terms of branch-length) that the edge spends in that state (stochastic character maps are specific to time-calibrated or ultrametric trees).

NeXML already defines the `characters` element to handle discrete character traits (`nex:char`) and the states they can assume (`nex:state`). We will thus reuse the `characters` element for this purpose, referring to both the character trait and the states by the ids assigned to them in that element. (NeXML's convention of referring to everything by id permits a single canonical definition of each term, making it clear where additional annotation belongs.) For each edge, we need to indicate:

- That our annotation contains a stochastic character mapping reconstruction
- Since many reconstructions are possible for a single edge, we give each reconstruction an id
- We indicate for which character trait we are defining the reconstruction
- We then indicate which states the character assumes on that edge. For each state realized on the edge, that involves stating:
    + the state assignment
    + the duration (length of time) that the edge spends in the given state
    + the order in which the state changes happen (though we could just assume state transitions are listed chronologically, NeXML suggests making all data explicit, rather than relying on the structure of the data file to convey information).

Thus the annotation for an edge that switches from state `s2` to state `s1` of character `cr1` would be constructed like this:

```{r}
m <- meta("simmap:reconstructions", children = c(
  meta("simmap:reconstruction", children = c(
    meta("simmap:char", "cr1"),
    meta("simmap:stateChange", children = c(
      meta("simmap:order", 1),
      meta("simmap:length", "0.2030"),
      meta("simmap:state", "s2"))),
    meta("simmap:char", "cr1"),
    meta("simmap:stateChange", children = c(
      meta("simmap:order", 2),
      meta("simmap:length", "0.0022"),
      meta("simmap:state", "s1")))
  ))))
```

Of course writing out such a definition manually becomes tedious quickly. Because these are just R commands, we can easily define a function that can loop over an assignment like this for each edge, extracting the appropriate order, length and state from an existing R object such as that provided in the `phytools` package. Likewise, it is straightforward to define a function that reads this data using the `RNeXML` utilities and converts it back to the `phytools` format. The full implementation of this mapping can be seen in the `simmap_to_nexml()` and the `nexml_to_simmap()` functions provided in the `RNeXML` package.

As the code indicates, the key step is simply to define the data in meta elements. In so doing, we have defined a custom namespace, `simmap`, to hold our variables. This allows us to provide a URL with more detailed descriptions of what each of these elements means:

```{r}
nex <- add_namespaces(c(simmap = "https://github.com/ropensci/RNeXML/tree/master/inst/simmap.md"))
```

At that URL we have posted a simple description of each term.

Using this convention we can generate NeXML files containing `simmap` data, read those files into R, and convert them back into the `phytools` package format. These simple functions serve as further illustration of how `RNeXML` can be used to extend the NeXML standard. We illustrate their use briefly here, starting with loading a `nexml` object containing a `simmap` reconstruction into R:

```{r}
data(simmap_ex)
```

The `get_trees()` function can be used to return an `ape::phylo` tree as usual. `RNeXML` automatically detects the `simmap` reconstruction data and includes it in a `maps` element of the `ape::phylo` object, for use with other `phytools` functions.

```{r}
phy <- nexml_to_simmap(simmap_ex)
```

We can then use various functions from `phytools` designed for `simmap` objects [@Revell_2012], such as the plotting function:

```{r Figure1, fig.cap="Stochastic character mapping on a phylogeny, as generated by the phytools package after parsing the simmap-extended NeXML."}
library("phytools")
plotSimmap(phy)
```

Likewise, we can convert the object back into the NeXML format and write it out to file to be read by other users.

```{r}
nex <- simmap_to_nexml(phy)
nexml_write(nex, "simmap.xml")
```

Though other NeXML parsers (for instance, for Perl or Python) have not been written explicitly to express `simmap` data, those parsers will nonetheless be able to successfully parse this file and expose the `simmap` data to the user.

```{r cleanup, include=FALSE}
unlink("simmap.xml")
```

RNeXML/inst/doc/sparql.R0000644000176200001440000000653512641021656014502 0ustar liggesusers
## ----supplement-compile-settings, include=FALSE--------------------------
library("methods")
library("knitr")
opts_chunk$set(tidy = FALSE, warning = FALSE, message = FALSE,
               cache = FALSE, comment = NA, verbose = TRUE, eval=require("rrdf"))
basename <- 'sparql'

## ----include=FALSE-------------------------------------------------------
# library("RNeXML")

## ------------------------------------------------------------------------
# library("rrdf")
# library("XML")
# library("phytools")
# library("RNeXML")

## ------------------------------------------------------------------------
# nexml <- nexml_read(system.file("examples/primates.xml", package="RNeXML"))

## ------------------------------------------------------------------------
# rdf <- get_rdf(system.file("examples/primates.xml", package="RNeXML"))
# tmp <- tempfile()  # so we must write the XML out first
# saveXML(rdf, tmp)
# graph <- load.rdf(tmp)

## ------------------------------------------------------------------------
# root <- sparql.rdf(graph,
# "SELECT ?uri WHERE {
#     ?id <http://rs.tdwg.org/ontology/voc/TaxonConcept#rank> <http://rs.tdwg.org/ontology/voc/TaxonRank#Order> .
#     ?id <http://rs.tdwg.org/ontology/voc/TaxonConcept#toTaxon> ?uri
# }")

## ------------------------------------------------------------------------
# get_name <- function(id) {
#   max <- length(nexml@otus[[1]]@otu)
#   for(i in 1:max) {
#     if ( nexml@otus[[1]]@otu[[i]]@id == id ) {
#       label <- nexml@otus[[1]]@otu[[i]]@label
#       label <- gsub(" ","_",label)
#       return(label)
#     }
#   }
# }

## ------------------------------------------------------------------------
# recurse <- function(node){
#
#     # fetch the taxonomic rank and id string
#     rank_query <- paste0(
#         "SELECT ?rank ?id WHERE {
#             ?id <http://rs.tdwg.org/ontology/voc/TaxonConcept#toTaxon> <",node,"> .
#             ?id <http://rs.tdwg.org/ontology/voc/TaxonConcept#rank> ?rank
#           }")
#     result <- sparql.rdf(graph, rank_query)
#
#     # get the local ID, strip URI part
#     id <- result[2]
#     id <- gsub("^.+#", "", id, perl = TRUE)
#
#     # if rank is terminal, return the name
#     if (result[1] == "http://rs.tdwg.org/ontology/voc/TaxonRank#Species") {
#         return(get_name(id))
#     }
#
#     # recurse deeper
#     else {
#         child_query <- paste0(
#             "SELECT ?uri WHERE {
#                 ?id <http://www.w3.org/2000/01/rdf-schema#subClassOf> <",node,"> .
#                 ?id <http://rs.tdwg.org/ontology/voc/TaxonConcept#toTaxon> ?uri
#             }")
#         children <- sparql.rdf(graph, child_query)
#
#         return(paste("(",
#                      paste(sapply(children, recurse),
#                            sep = ",", collapse = "," ),
#                      ")",
#                      get_name(id), # label interior nodes
#                      sep = "", collapse = ""))
#     }
# }
#

## ------------------------------------------------------------------------
# newick <- paste(recurse(root), ";", sep = "", collapse = "")
# tree <- read.newick(text = newick)
# collapsed <- collapse.singles(tree)
# plot(collapsed,
#      type='cladogram',
#      show.tip.label=FALSE,
#      show.node.label=TRUE,
#      cex=0.75,
#      edge.color='grey60',
#      label.offset=-9)

RNeXML/inst/doc/sparql.html0000644000176200001440000004610312641021656015240 0ustar liggesusers
SPARQL with RNeXML

SPARQL Queries

Rich, semantically meaningful metadata lies at the heart of the NeXML standard. R provides a rich environment to unlock this information. While our previous examples have relied on the user knowing exactly what metadata they intend to extract (title, publication date, citation information, and so forth), semantic metadata has meaning that a computer can make use of, allowing us to make much more conceptually rich queries than those simple examples. The SPARQL query language is a powerful way to make use of such semantic information in making complex queries.

While users should consult a formal introduction to SPARQL for further background, here we illustrate how SPARQL can be used in combination with R functions in ways that would be much more tedious to assemble with only traditional/non-semantic queries. The SPARQL query language is provided for the R environment through the rrdf package [@Willighagen_2014], so we start by loading that package. We will also make use of functions from phytools and RNeXML.

library("rrdf")
library("XML")
library("phytools")
library("RNeXML")

We read in an example file that contains semantic metadata annotations describing the taxonomic units (OTUs) used in the tree.

nexml <- nexml_read(system.file("examples/primates.xml", package="RNeXML"))

In particular, this example declares the taxon rank, NCBI identifier and parent taxon for each OTU, such as:

<otu about="#ou541" id="ou541" label="Alouatta guariba">
      <meta href="http://ncbi.nlm.nih.gov/taxonomy/182256" 
            id="ma20" 
            rel="concept:toTaxon" 
            xsi:type="nex:ResourceMeta"/>
      <meta href="http://rs.tdwg.org/ontology/voc/TaxonRank#Species" 
            id="ma21" 
            rel="concept:rank" 
            xsi:type="nex:ResourceMeta"/>
      <meta href="http://ncbi.nlm.nih.gov/taxonomy/9499" 
            id="ma22" 
            rel="rdfs:subClassOf" 
            xsi:type="nex:ResourceMeta"/>
    </otu>

In this example, we will construct a cladogram by using this information to identify the taxonomic rank of each OTU, and its shared parent taxonomic rank. (If this example looks complex, try writing down the steps to do this without the aid of the SPARQL queries.) These examples show the manipulation of semantic triples, Uniform Resource Identifiers (URIs) and use of the SPARQL “Join” operator.

Note that this example can be run using demo("sparql", "RNeXML") to see the code displayed in the R terminal and to avoid character errors that can occur in having to copy and paste from PDF files.

We begin by extracting the RDF graph from the NeXML:

rdf <- get_rdf(system.file("examples/primates.xml", package="RNeXML"))
tmp <- tempfile()  # load.rdf reads from a file, so we must write the XML out first
saveXML(rdf, tmp) 
graph <- load.rdf(tmp)

We then fetch the NCBI URI for the taxon that has rank ‘Order’, i.e. the root of the primates phylogeny. The dot operator . between clauses implies a join, in this case on the shared variable ?id:

root <- sparql.rdf(graph, 
"SELECT ?uri WHERE { 
    ?id <http://rs.tdwg.org/ontology/voc/TaxonConcept#rank> <http://rs.tdwg.org/ontology/voc/TaxonRank#Order> . 
    ?id <http://rs.tdwg.org/ontology/voc/TaxonConcept#toTaxon> ?uri    
}")
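The effect of the join can be mimicked in base R on a toy table of triples. This is only an illustrative sketch and does not use rrdf: the triples data frame, the subject #ou1 and the Order-level taxon URI are made-up values for the example, while the predicate URIs and the #ou541 entry are taken from the text above.

```r
# Toy triples (subject, predicate, object); the rows are illustrative only.
rank_p  <- "http://rs.tdwg.org/ontology/voc/TaxonConcept#rank"
taxon_p <- "http://rs.tdwg.org/ontology/voc/TaxonConcept#toTaxon"
order_o <- "http://rs.tdwg.org/ontology/voc/TaxonRank#Order"

triples <- data.frame(
  subject   = c("#ou1", "#ou1", "#ou541", "#ou541"),
  predicate = c(rank_p, taxon_p, rank_p, taxon_p),
  object    = c(order_o,
                "http://ncbi.nlm.nih.gov/taxonomy/9443",   # hypothetical value
                "http://rs.tdwg.org/ontology/voc/TaxonRank#Species",
                "http://ncbi.nlm.nih.gov/taxonomy/182256"),
  stringsAsFactors = FALSE
)

# First clause: subjects (?id) whose rank is Order.
has_rank <- triples[triples$predicate == rank_p & triples$object == order_o,
                    "subject", drop = FALSE]

# Second clause: the taxon URI (?uri) of every subject (?id).
has_taxon <- triples[triples$predicate == taxon_p, c("subject", "object")]

# The "." join: keep only rows where both clauses bind the same ?id.
joined <- merge(has_rank, has_taxon, by = "subject")
root_uri <- joined$object
```

Only the subject that satisfies both clauses survives the merge, which is what the two-clause SPARQL pattern expresses declaratively.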

This makes use of the SPARQL query language provided by the rrdf package. We will also define some helper functions, one of which uses SPARQL queries. First, we define a function that looks up the label of an OTU by its id in the nexml object, replacing spaces with underscores:

get_name <- function(id) {
  # scan the OTUs of the first otus block for a matching id
  otus <- nexml@otus[[1]]@otu
  for (i in seq_along(otus)) {
    if (otus[[i]]@id == id) {
      # replace spaces with underscores so the label is safe in Newick
      return(gsub(" ", "_", otus[[i]]@label))
    }
  }
}

Next, we define a recursive function to build a Newick tree from the taxonomic rank information.

recurse <- function(node){
  
    # fetch the taxonomic rank and id string
    rank_query <- paste0(
        "SELECT ?rank ?id WHERE {
            ?id <http://rs.tdwg.org/ontology/voc/TaxonConcept#toTaxon> <",node,"> .
            ?id <http://rs.tdwg.org/ontology/voc/TaxonConcept#rank> ?rank
          }")
    result <- sparql.rdf(graph, rank_query)
    
    # get the local ID, strip URI part
    id <- result[2]
    id <- gsub("^.+#", "", id, perl = TRUE)
    
    # if rank is terminal, return the name
    if (result[1] == "http://rs.tdwg.org/ontology/voc/TaxonRank#Species") {
        return(get_name(id))
    }
    
    # recurse deeper
    else {
        child_query <- paste0(
            "SELECT ?uri WHERE {
                ?id <http://www.w3.org/2000/01/rdf-schema#subClassOf> <",node,"> .
                ?id <http://rs.tdwg.org/ontology/voc/TaxonConcept#toTaxon> ?uri
            }")
        children <- sparql.rdf(graph, child_query)
        
        return(paste("(", 
                     paste(sapply(children, recurse), 
                           sep = ",", collapse = "," ), 
                     ")",  
                     get_name(id), # label interior nodes
                     sep = "", collapse = ""))
    }
}
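The recursion is easier to follow on a toy input. The sketch below is an assumption-laden stand-in rather than the vignette's method: it replaces the SPARQL child query with a lookup in a hypothetical named vector, parent_of, mapping each taxon to its parent, and the helper name to_newick is made up. The shape is the same as recurse(): a node with no children returns its own name, while any other node wraps its children's results in parentheses and appends its own label.

```r
# Hypothetical parent lookup: names are taxa, values are their parents.
parent_of <- c(Homo = "Hominidae", Pan = "Hominidae",
               Hominidae = "Primates", Lemur = "Primates")

to_newick <- function(node) {
  # stands in for the SPARQL query that finds the children of `node`
  children <- names(parent_of)[parent_of == node]

  # terminal taxon: return the name itself
  if (length(children) == 0) {
    return(node)
  }

  # interior taxon: recurse into each child, then label the clade
  subtrees <- vapply(children, to_newick, character(1))
  paste0("(", paste(subtrees, collapse = ","), ")", node)
}

newick <- paste0(to_newick("Primates"), ";")
newick
```

The resulting string nests Homo and Pan inside Hominidae, which sits beside Lemur under Primates.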

With get_name() and recurse() in place, it is straightforward to build the tree from the semantic RDFa data and then visualize it:

newick <- paste(recurse(root), ";", sep = "", collapse = "")
tree <- read.newick(text = newick)
collapsed <- collapse.singles(tree)
plot(collapsed, 
     type='cladogram', 
     show.tip.label=FALSE, 
     show.node.label=TRUE, 
     cex=0.75, 
     edge.color='grey60', 
     label.offset=-9)
RNeXML/inst/doc/S4.Rmd0000644000176200001440000001006212641021656013775 0ustar liggesusers--- title: "The nexml S4 Object" author: - Carl Boettiger - Scott Chamberlain - Rutger Vos - Hilmar Lapp output: html_vignette --- ```{r supplement-compile-settings, include=FALSE} library("methods") library("knitr") opts_chunk$set(tidy = FALSE, warning = FALSE, message = FALSE, cache = FALSE, comment = NA, verbose = TRUE) basename <- 'S4' ``` ```{r include=FALSE} library("RNeXML") ``` ## Understanding the `nexml` S4 object The `RNeXML` package provides many convenient functions to add and extract information from `nexml` objects in the R environment without requiring the reader to understand the details of the NeXML data structure and making it less likely that a user will generate invalid NeXML syntax that could not be read by other parsers. The `nexml` object we have been using in all of the examples is built on R's S4 mechanism. Advanced users may sometimes prefer to interact with the data structure more directly using R's S4 class mechanism and subsetting methods. Many R users are more familiar with the S3 class mechanism (such as in the `ape` package phylo objects) rather than the S4 class mechanism used in phylogenetics packages such as `ouch` and `phylobase`. The `phylobase` vignette provides an excellent introduction to these data structures. Users already familiar with subsetting lists and other S3 objects in R are likely familar with the use of the `$` operator, such as `phy$edge`. S4 objects simply use an `@` operator instead (but cannot be subset using numeric arguments such as `phy[[1]]` or named arguments such as phy[["edge"]]). The `nexml` object is an S4 object, as are all of its components (slots). Its hierarchical structure corresponds exactly with the XML tree of a NeXML file, with the single exception that both XML attributes and children are represented as slots. S4 objects have constructor functions to initialize them. 
We create a new `nexml` object with the command: ```{r} nex <- new("nexml") ``` We can see a list of slots contained in this object with ```{r} slotNames(nex) ``` Some of these slots have already been populated for us, for instance, the schema version and default namespaces: ```{r} nex@version nex@namespaces ``` Recognize that `nex@namespaces` serves the same role as `get_namespaces` function, but provides direct access to the slot data. For instance, with this syntax we could also overwrite the existing namespaces with `nex@namespaces <- NULL`. Changing the namespace in this way is not advised. Some slots can contain multiple elements of the same type, such as `trees`, `characters`, and `otus`. For instance, we see that ```{r} class(nex@characters) ``` is an object of class `ListOfcharacters`, and is currently empty, ```{r} length(nex@characters) ``` In order to assign an object to a slot, it must match the class definition of the slot. We can create a new element of any given class with the `new` function, ```{r} nex@characters <- new("ListOfcharacters", list(new("characters"))) ``` and now we have a length-1 list of character matrices, ```{r} length(nex@characters) ``` and we access the first character matrix using the list notation, `[[1]]`. Here we check the class is a `characters` object. ```{r} class(nex@characters[[1]]) ``` Direct subsetting has two primary use cases: (a) useful in looking up (and possibly editing) a specific value of an element, or (b) when adding metadata annotations to specific elements. 
Consider the example file ```{r} f <- system.file("examples", "trees.xml", package="RNeXML") nex <- nexml_read(f) ``` We can look up the species label of the first `otu` in the first `otus` block: ```{r} nex@otus[[1]]@otu[[1]]@label ``` We can add metadata to this particular OTU using this subsetting format ```{r} nex@otus[[1]]@otu[[1]]@meta <- c(meta("skos:note", "This species was incorrectly identified"), nex@otus[[1]]@otu[[1]]@meta) ``` Here we use the `c` operator to append this element to any existing meta annotations to this otu. RNeXML/inst/doc/sparql.Rmd0000644000176200001440000001365012641021656015017 0ustar liggesusers--- title: "SPARQL with RNeXML" author: - Carl Boettiger - Scott Chamberlain - Rutger Vos - Hilmar Lapp output: html_vignette --- ```{r supplement-compile-settings, include=FALSE} library("methods") library("knitr") opts_chunk$set(tidy = FALSE, warning = FALSE, message = FALSE, cache = FALSE, comment = NA, verbose = TRUE, eval=require("rrdf")) basename <- 'sparql' ``` ```{r include=FALSE} library("RNeXML") ``` ## SPARQL Queries Rich, semantically meaningful metadata lies at the heart of the NeXML standard. R provides a rich environment to unlock this information. While our previous examples have relied on the user knowing exactly what metadata they intend to extract (title, publication date, citation information, and so forth), _semantic_ metadata has meaning that a computer can make use of, allowing us to make much more conceptually rich queries than those simple examples. The SPARQL query language is a powerful way to make use of such semantic information in making complex queries. While users should consult a formal introduction to SPARQL for further background, here we illustrate how SPARQL can be used in combination with R functions in ways that would be much more tedious to assemble with only traditional/non-semantic queries. 
The SPARQL query language is provided for the R environment through the `rrdf` package [@Willighagen_2014], so we start by loading that package. We will also make use of functions from `phytools` and `RNeXML`. ```{r} library("rrdf") library("XML") library("phytools") library("RNeXML") ``` We read in an example file that contains semantic metadata annotations describing the taxonomic units (OTUs) used in the tree. ```{r} nexml <- nexml_read(system.file("examples/primates.xml", package="RNeXML")) ``` In particular, this example declares the taxon rank, NCBI identifier and parent taxon for each OTU, such as: ```xml ``` In this example, we will construct a cladogram by using this information to identify the taxonomic rank of each OTU, and its shared parent taxonomic rank. (If this example looks complex, try writing down the steps to do this without the aid of the SPARQL queries). These examples show the manipulation of semantic triples, Unique Resource Identifiers (URIs) and use of the SPARQL "Join" operator. Note that this example can be run using `demo("sparql", "RNeXML")` to see the code displayed in the R terminal and to avoid character errors that can occur in having to copy and paste from PDF files. We begin by extracting the RDF graph from the NeXML, ```{r} rdf <- get_rdf(system.file("examples/primates.xml", package="RNeXML")) tmp <- tempfile() # so we must write the XML out first saveXML(rdf, tmp) graph <- load.rdf(tmp) ``` We then fetch the NCBI URI for the taxon that has rank 'Order', i.e. the root of the primates phylogeny. The dot operator `.` between clauses implies a join, in this case ```{r} root <- sparql.rdf(graph, "SELECT ?uri WHERE { ?id . ?id ?uri }") ``` This makes use of the SPARQL query language provided by the `rrdf` package. We will also define some helper functions that use SPARQL queries. 
Here we define a function to get the name ```{r} get_name <- function(id) { max <- length(nexml@otus[[1]]@otu) for(i in 1:max) { if ( nexml@otus[[1]]@otu[[i]]@id == id ) { label <- nexml@otus[[1]]@otu[[i]]@label label <- gsub(" ","_",label) return(label) } } } ``` Next, we define a recursive function to build a newick tree from the taxonomic rank information. ```{r} recurse <- function(node){ # fetch the taxonomic rank and id string rank_query <- paste0( "SELECT ?rank ?id WHERE { ?id <",node,"> . ?id ?rank }") result <- sparql.rdf(graph, rank_query) # get the local ID, strip URI part id <- result[2] id <- gsub("^.+#", "", id, perl = TRUE) # if rank is terminal, return the name if (result[1] == "http://rs.tdwg.org/ontology/voc/TaxonRank#Species") { return(get_name(id)) } # recurse deeper else { child_query <- paste0( "SELECT ?uri WHERE { ?id <",node,"> . ?id ?uri }") children <- sparql.rdf(graph, child_query) return(paste("(", paste(sapply(children, recurse), sep = ",", collapse = "," ), ")", get_name(id), # label interior nodes sep = "", collapse = "")) } } ``` With these functions in place, it is straight forward to build the tree from the semantic RDFa data and then visualize it ```{r} newick <- paste(recurse(root), ";", sep = "", collapse = "") tree <- read.newick(text = newick) collapsed <- collapse.singles(tree) plot(collapsed, type='cladogram', show.tip.label=FALSE, show.node.label=TRUE, cex=0.75, edge.color='grey60', label.offset=-9) ``` RNeXML/inst/doc/S4.html0000644000176200001440000004040312641021656014221 0ustar liggesusers The nexml S4 Object

Understanding the nexml S4 object

The RNeXML package provides many convenient functions to add and extract information from nexml objects in the R environment. These functions do not require the user to understand the details of the NeXML data structure, and they make it less likely that a user will generate invalid NeXML syntax that could not be read by other parsers. The nexml object we have been using in all of the examples is built on R’s S4 mechanism. Advanced users may sometimes prefer to interact with the data structure more directly using R’s S4 class mechanism and subsetting methods. Many R users are more familiar with the S3 class mechanism (such as in the ape package phylo objects) than with the S4 class mechanism used in phylogenetics packages such as ouch and phylobase. The phylobase vignette provides an excellent introduction to these data structures. Users already familiar with subsetting lists and other S3 objects in R are likely familiar with the use of the $ operator, such as phy$edge. S4 objects simply use an @ operator instead (but cannot be subset using numeric arguments such as phy[[1]] or named arguments such as phy[["edge"]]).
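The difference between the two access styles can be tried out without RNeXML. This is a small sketch using only the methods package; the class name ToyTree and its slots are hypothetical and exist purely for illustration.

```r
library(methods)

# S3-style object: a plain list, accessed with `$` or `[[`
phy_s3 <- list(edge = matrix(1:4, ncol = 2))
phy_s3$edge
phy_s3[["edge"]]

# S4 object: a formally defined class, accessed with `@`
# ("ToyTree" is a made-up class, not one defined by RNeXML or ape)
setClass("ToyTree", representation(edge = "matrix", label = "character"))
phy_s4 <- new("ToyTree", edge = matrix(1:4, ncol = 2), label = "toy")
phy_s4@edge
slotNames(phy_s4)

# List-style subsetting is not available on this S4 object
subset_failed <- tryCatch({ phy_s4[[1]]; FALSE }, error = function(e) TRUE)
subset_failed
```

Both objects hold the same matrix, but only the list accepts `[[`; the S4 object has to be reached through its named slots.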

The nexml object is an S4 object, as are all of its components (slots). Its hierarchical structure corresponds exactly with the XML tree of a NeXML file, with the single exception that both XML attributes and children are represented as slots.
S4 objects have constructor functions to initialize them. We create a new nexml object with the command:

nex <- new("nexml")

We can see a list of slots contained in this object with

slotNames(nex)
 [1] "version"            "generator"          "xsi:schemaLocation"
 [4] "namespaces"         "otus"               "trees"             
 [7] "characters"         "meta"               "about"             
[10] "xsi:type"          

Some of these slots have already been populated for us, for instance, the schema version and default namespaces:

nex@version
[1] "0.9"
nex@namespaces
                                             nex 
                     "http://www.nexml.org/2009" 
                                             xsi 
     "http://www.w3.org/2001/XMLSchema-instance" 
                                             xml 
          "http://www.w3.org/XML/1998/namespace" 
                                            cdao 
       "http://purl.obolibrary.org/obo/cdao.owl" 
                                             xsd 
             "http://www.w3.org/2001/XMLSchema#" 
                                              dc 
              "http://purl.org/dc/elements/1.1/" 
                                         dcterms 
                     "http://purl.org/dc/terms/" 
                                           prism 
"http://prismstandard.org/namespaces/1.2/basic/" 
                                              cc 
                "http://creativecommons.org/ns#" 
                                            ncbi 
         "http://www.ncbi.nlm.nih.gov/taxonomy#" 
                                              tc 
 "http://rs.tdwg.org/ontology/voc/TaxonConcept#" 
                                                 
                     "http://www.nexml.org/2009" 

Note that nex@namespaces serves the same role as the get_namespaces() function, but provides direct access to the slot data. For instance, with this syntax we could also overwrite the existing namespaces with nex@namespaces <- NULL. Changing the namespaces in this way is not advised.

Some slots can contain multiple elements of the same type, such as trees, characters, and otus. For instance, we see that

class(nex@characters)
[1] "ListOfcharacters"
attr(,"package")
[1] "RNeXML"

is an object of class ListOfcharacters, and is currently empty,

length(nex@characters)
[1] 0

In order to assign an object to a slot, it must match the class definition of the slot. We can create a new element of any given class with the new function,

nex@characters <- new("ListOfcharacters", list(new("characters")))

and now we have a length-1 list of character matrices,

length(nex@characters)
[1] 1

and we access the first character matrix using the list notation, [[1]]. Here we check that it is an object of class characters.

class(nex@characters[[1]])
[1] "characters"
attr(,"package")
[1] "RNeXML"
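The rule that an assigned object must match the slot's class can be sketched with toy classes. The names ToyItem, ListOfToyItem and ToyDoc below are hypothetical stand-ins that follow the same pattern as characters, ListOfcharacters and nexml; they are not the RNeXML definitions.

```r
library(methods)

# Hypothetical classes mirroring the element / list-of-elements pattern
setClass("ToyItem", representation(label = "character"))
setClass("ListOfToyItem", contains = "list")
setClass("ToyDoc", representation(items = "ListOfToyItem"))

doc <- new("ToyDoc")
length(doc@items)   # the slot starts out as an empty list

# A value of the declared class is accepted
doc@items <- new("ListOfToyItem", list(new("ToyItem", label = "a")))
length(doc@items)
is(doc@items[[1]], "ToyItem")

# A plain list is rejected, because it is not a ListOfToyItem
rejected <- tryCatch({ doc@items <- list("a"); FALSE },
                     error = function(e) TRUE)
rejected
```

Wrapping the elements in the list-of class before assignment is what keeps the slot's contents consistent with the class definition.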

Direct subsetting has two primary use cases: (a) looking up (and possibly editing) a specific value of an element, and (b) adding metadata annotations to specific elements. Consider the example file

f <- system.file("examples", "trees.xml", package="RNeXML")
nex <- nexml_read(f)

We can look up the species label of the first otu in the first otus block:

nex@otus[[1]]@otu[[1]]@label
      label 
"species 1" 

We can add metadata to this particular OTU using this subsetting format

nex@otus[[1]]@otu[[1]]@meta <- 
  c(meta("skos:note", 
          "This species was incorrectly identified"),
         nex@otus[[1]]@otu[[1]]@meta)

Here we use the c function to combine this element with any existing meta annotations on this otu.

RNeXML/inst/doc/S4.R0000644000176200001440000000321712641021656013460 0ustar liggesusers## ----supplement-compile-settings, include=FALSE-------------------------- library("methods") library("knitr") opts_chunk$set(tidy = FALSE, warning = FALSE, message = FALSE, cache = FALSE, comment = NA, verbose = TRUE) basename <- 'S4' ## ----include=FALSE------------------------------------------------------- library("RNeXML") ## ------------------------------------------------------------------------ nex <- new("nexml") ## ------------------------------------------------------------------------ slotNames(nex) ## ------------------------------------------------------------------------ nex@version nex@namespaces ## ------------------------------------------------------------------------ class(nex@characters) ## ------------------------------------------------------------------------ length(nex@characters) ## ------------------------------------------------------------------------ nex@characters <- new("ListOfcharacters", list(new("characters"))) ## ------------------------------------------------------------------------ length(nex@characters) ## ------------------------------------------------------------------------ class(nex@characters[[1]]) ## ------------------------------------------------------------------------ f <- system.file("examples", "trees.xml", package="RNeXML") nex <- nexml_read(f) ## ------------------------------------------------------------------------ nex@otus[[1]]@otu[[1]]@label ## ------------------------------------------------------------------------ nex@otus[[1]]@otu[[1]]@meta <- c(meta("skos:note", "This species was incorrectly identified"), nex@otus[[1]]@otu[[1]]@meta) RNeXML/inst/doc/metadata.R0000644000176200001440000000650012641021656014750 0ustar liggesusers## ----compile-settings, include=FALSE------------------------------------- library("methods") library("knitr") opts_chunk$set(tidy = FALSE, warning = FALSE, message = FALSE, cache = FALSE, 
comment = NA, verbose = TRUE) basename <- gsub(".Rmd", "", knitr:::knit_concord$get('infile')) ## ------------------------------------------------------------------------ library('RNeXML') data(bird.orders) ## ------------------------------------------------------------------------ birds <- add_trees(bird.orders) birds <- add_basic_meta( title = "Phylogeny of the Orders of Birds From Sibley and Ahlquist", description = "This data set describes the phylogenetic relationships of the orders of birds as reported by Sibley and Ahlquist (1990). Sibley and Ahlquist inferred this phylogeny from an extensive number of DNA/DNA hybridization experiments. The ``tapestry'' reported by these two authors (more than 1000 species out of the ca. 9000 extant bird species) generated a lot of debates. The present tree is based on the relationships among orders. The branch lengths were calculated from the values of Delta T50H as found in Sibley and Ahlquist (1990, fig. 353).", citation = "Sibley, C. G. and Ahlquist, J. E. (1990) Phylogeny and classification of birds: a study in molecular evolution. New Haven: Yale University Press.", creator = "Sibley, C. G. and Ahlquist, J. 
E.", nexml=birds) ## ------------------------------------------------------------------------ birds <- add_basic_meta(citation = citation("ape"), nexml = birds) ## ----message=FALSE, results='hide'--------------------------------------- birds <- taxize_nexml(birds, "NCBI") ## ------------------------------------------------------------------------ prefixes <- get_namespaces(birds) prefixes["dc"] ## ------------------------------------------------------------------------ modified <- meta(property = "prism:modificationDate", content = "2013-10-04") ## ------------------------------------------------------------------------ birds <- add_meta(modified, birds) ## ------------------------------------------------------------------------ history <- meta(property = "skos:historyNote", content = "Mapped from the bird.orders data in the ape package using RNeXML") ## ------------------------------------------------------------------------ birds <- add_meta(history, birds, level = "trees", namespaces = c(skos = "http://www.w3.org/2004/02/skos/core#")) ## ------------------------------------------------------------------------ birds ## ------------------------------------------------------------------------ meta <- get_metadata(birds) ## ------------------------------------------------------------------------ names(meta)[1:4] ## ------------------------------------------------------------------------ meta[["dc:title"]] ## ------------------------------------------------------------------------ prefixes <- get_namespaces(birds) prefixes["dc"] ## ----get_citation-------------------------------------------------------- get_citation(birds) ## ----get_taxa------------------------------------------------------------ get_taxa(birds) ## ------------------------------------------------------------------------ otu_meta <- get_metadata(birds, level="otu") otu_meta[1:4] RNeXML/inst/doc/metadata.html0000644000176200001440000005023212641021656015514 0ustar liggesusers Handling Metadata in 
RNeXML

Writing NeXML metadata

The add_basic_meta() function takes as input an existing nexml object (like the other add_ functions, if none is provided it will create one), and at the time of this writing any of the following parameters: title, description, creator, pubdate, rights, publisher, citation. Other metadata elements and corresponding parameters may be added in the future.

Load the packages and data:

library('RNeXML')
data(bird.orders)

Create an nexml object for the phylogeny bird.orders and add appropriate metadata:

birds <- add_trees(bird.orders)
birds <- add_basic_meta(
  title = "Phylogeny of the Orders of Birds From Sibley and Ahlquist",

  description = "This data set describes the phylogenetic relationships of the
     orders of birds as reported by Sibley and Ahlquist (1990). Sibley
     and Ahlquist inferred this phylogeny from an extensive number of
     DNA/DNA hybridization experiments. The ``tapestry'' reported by
     these two authors (more than 1000 species out of the ca. 9000
     extant bird species) generated a lot of debates.

     The present tree is based on the relationships among orders. The
     branch lengths were calculated from the values of Delta T50H as
     found in Sibley and Ahlquist (1990, fig. 353).",

  citation = "Sibley, C. G. and Ahlquist, J. E. (1990) Phylogeny and
     classification of birds: a study in molecular evolution. New
     Haven: Yale University Press.",

  creator = "Sibley, C. G. and Ahlquist, J. E.",
    nexml=birds)

Instead of a literal string, citations can also be provided as objects of R’s bibentry class, which is the form in which R package citations are returned by citation():

birds <- add_basic_meta(citation = citation("ape"), nexml = birds)

Taxonomic identifiers

The taxize_nexml() function uses the R package taxize [@Chamberlain_2013] to check each taxon label against the NCBI database. If a unique match is found, a metadata annotation is added to the taxon providing the NCBI identification number to the taxonomic unit.

birds <- taxize_nexml(birds, "NCBI")

If no match is found, the user is warned to check for possible typographic errors in the taxonomic labels provided. If multiple matches are found, the user will be prompted to choose between them.

Custom metadata extensions

We can get a list of namespaces along with their prefixes from the nexml object:

prefixes <- get_namespaces(birds)
prefixes["dc"]
                                dc 
"http://purl.org/dc/elements/1.1/" 

We create a meta element containing a new annotation, here a modification date from the prism vocabulary, using the meta function:

modified <- meta(property = "prism:modificationDate", content = "2013-10-04")

We can add this annotation to our existing birds NeXML file using the add_meta() function. Because we do not specify a level, it is added to the root node, referring to the NeXML file as a whole.

birds <- add_meta(modified, birds) 

The built-in vocabularies are just the tip of the iceberg of established vocabularies. Here we add an annotation from the skos namespace which describes the history of where the data comes from:

history <- meta(property = "skos:historyNote",
  content = "Mapped from the bird.orders data in the ape package using RNeXML")

Because skos is not in the current namespace list, we add it with a URL when adding this meta element. We also specify that this annotation be placed at the level of the trees sub-node in the NeXML file.

birds <- add_meta(history, 
                birds, 
                level = "trees",
                namespaces = c(skos = "http://www.w3.org/2004/02/skos/core#"))

For finer control of the level at which a meta element is added, we will manipulate the nexml R object directly using S4 sub-setting, as shown in the supplement.

Much richer metadata annotation is possible. Later we illustrate how metadata annotation can be used to extend the base NeXML format to represent new forms of data while maintaining compatibility with any NeXML parser. The RNeXML package can be easily extended to support helper functions such as taxize_nexml to add additional metadata without imposing a large burden on the user.

Reading NeXML metadata

A call to the nexml object prints some metadata summarizing the data structure:

birds
A nexml object representing:
     1 phylogenetic tree blocks, where: 
     block 1 contains 1 phylogenetic trees 
     46 meta elements 
     0 character matrices 
     23 taxonomic units 
 Taxa:   Struthioniformes, Tinamiformes, Craciformes, Galliformes, Anseriformes, Turniciformes ... 

 NeXML generated by RNeXML using schema version: 0.9 
 size: 372.6 Kb 

We can extract all metadata pertaining to the NeXML document as a whole (annotations of the XML root node, <nexml>) with the command

meta <- get_metadata(birds) 

This returns a named list of available metadata. We can see the kinds of metadata recorded from the names (showing the first 4):

names(meta)[1:4]
[1] "dc:title"                      "dc:creator"                   
[3] "dc:description"                "dcterms:bibliographicCitation"

and can ask for a particular element using the standard list sub-setting mechanism (i.e. either the name of an element or its numeric position),

meta[["dc:title"]]
[1] "Phylogeny of the Orders of Birds From Sibley and Ahlquist"

All metadata terms must belong to an explicit namespace or vocabulary that allows a computer to interpret the term precisely. The prefix (before the :) indicates to which vocabulary the term belongs, e.g. dc in this case. The get_namespaces function tells us the definition of the vocabulary using a link:

prefixes <- get_namespaces(birds)
prefixes["dc"]
                                dc 
"http://purl.org/dc/elements/1.1/" 
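How a prefix and a term combine into one unambiguous identifier can be sketched in base R. The helper expand_term below is hypothetical, not an RNeXML function; it assumes the usual convention that the full identifier is the namespace URL followed by the part after the colon, and uses the dc and skos URLs shown in this document.

```r
# Named vector of prefixes, in the same shape as get_namespaces() output
prefixes <- c(dc   = "http://purl.org/dc/elements/1.1/",
              skos = "http://www.w3.org/2004/02/skos/core#")

# Hypothetical helper: expand "prefix:term" into a full URI
expand_term <- function(term, prefixes) {
  prefix <- sub(":.*$", "", term)       # text before the first colon
  local  <- sub("^[^:]*:", "", term)    # text after the first colon
  if (!prefix %in% names(prefixes)) {
    stop("unknown prefix: ", prefix)
  }
  paste0(prefixes[[prefix]], local)
}

expand_term("dc:title", prefixes)
expand_term("skos:historyNote", prefixes)
```

A term whose prefix is missing from the namespace list cannot be expanded, which is why a new vocabulary has to be registered before its terms are used.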

Common metadata can be accessed with a few dedicated functions:

get_citation(birds)
Sibley, C. G. and Ahlquist, J. E. (1990) Phylogeny and
     classification of birds: a study in molecular evolution. New
     Haven: Yale University Press. Paradis E, Claude J and Strimmer K (2004). "APE: analyses of
phylogenetics and evolution in R language." _Bioinformatics_,
*20*, pp. 289-290.
get_taxa(birds)
 [1] "Struthioniformes" "Tinamiformes"     "Craciformes"     
 [4] "Galliformes"      "Anseriformes"     "Turniciformes"   
 [7] "Piciformes"       "Galbuliformes"    "Bucerotiformes"  
[10] "Upupiformes"      "Trogoniformes"    "Coraciiformes"   
[13] "Coliiformes"      "Cuculiformes"     "Psittaciformes"  
[16] "Apodiformes"      "Trochiliformes"   "Musophagiformes" 
[19] "Strigiformes"     "Columbiformes"    "Gruiformes"      
[22] "Ciconiiformes"    "Passeriformes"   

The get_taxa() function returns text from the otu element labels, which are typically used to define taxonomic names, rather than text from explicit meta elements.

We can also access metadata at a specific level (or use level=all to extract all meta elements in a list). Here we show only the first few results:

otu_meta <- get_metadata(birds, level="otu")
otu_meta[1:4]
$`tc:toTaxon`
[1] "http://ncbi.nlm.nih.gov/taxonomy/8798"

$`tc:toTaxon`
[1] "http://ncbi.nlm.nih.gov/taxonomy/8802"

$`tc:toTaxon`
[1] "http://ncbi.nlm.nih.gov/taxonomy/8976"

$`tc:toTaxon`
[1] "http://ncbi.nlm.nih.gov/taxonomy/8976"
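Because every element of this list carries the same name, name-based subsetting with [[ returns only the first match. The sketch below rebuilds the four values printed above as a plain list (an assumption about the shape of the result, made so the example runs without RNeXML) and shows one way to collect all of them.

```r
# Stand-in for the first four elements of otu_meta, copied from the output above
otu_meta <- list(
  "tc:toTaxon" = "http://ncbi.nlm.nih.gov/taxonomy/8798",
  "tc:toTaxon" = "http://ncbi.nlm.nih.gov/taxonomy/8802",
  "tc:toTaxon" = "http://ncbi.nlm.nih.gov/taxonomy/8976",
  "tc:toTaxon" = "http://ncbi.nlm.nih.gov/taxonomy/8976"
)

# `[[` with a name stops at the first match
first_uri <- otu_meta[["tc:toTaxon"]]

# Filter on names() to keep every match, then flatten to a character vector
all_uris <- unlist(otu_meta[names(otu_meta) == "tc:toTaxon"], use.names = FALSE)

# Strip everything up to the last "/" to leave the numeric identifiers
ncbi_ids <- sub("^.*/", "", all_uris)
ncbi_ids
```

Filtering on names() rather than indexing by name avoids silently dropping the repeated entries.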
RNeXML/inst/doc/simmap.html
Extending NeXML: an example based on simmap

Extending the NeXML standard through metadata annotation.

Here we illustrate this process using the example of stochastic character mapping (Huelsenbeck, Nielsen, and Bollback 2003). A stochastic character map is simply an annotation of the branches on a phylogeny, assigning each section of each branch to a particular “state” (typically of a morphological characteristic).

Bollback (2006) provides a widely used stand-alone implementation of this method in the software simmap, which modified the standard Newick tree format to express this additional information. This can break compatibility with other software, and it creates a format that cannot be interpreted without additional information describing this convention. By contrast, the NeXML extension is not only backwards compatible but also contains a precise and machine-readable description of what it is encoding.

In this example, we illustrate how the additional information required to define a stochastic character mapping (a simmap mapping) can be represented in NeXML.

Revell (2012) describes the phytools package for R, which includes utilities for reading, manipulating, and writing simmap files in R. In this example, we also show how to define RNeXML functions that map the R representation used by Revell (an extension of the ape class) into the NeXML extension we have defined by using RNeXML functions.

Since a stochastic character map simply assigns different states to parts of a branch (or edge) on the phylogenetic tree, we can create a NeXML representation by annotating the edge elements with appropriate meta elements. These elements need to describe the character state being assigned and the duration (in terms of branch length) that the edge spends in that state. (Stochastic character maps are specific to time-calibrated or ultrametric trees.)

NeXML already defines the characters element to handle discrete character traits (nex:char) and the states they can assume (nex:state). We will thus reuse the characters element for this purpose, referring to both the character trait and the states by the ids assigned to them in that element. (NeXML’s convention of referring to everything by id permits a single canonical definition of each term, making it clear where additional annotation belongs). For each edge, we need to indicate:

  • That our annotation contains a stochastic character mapping reconstruction
  • Since many reconstructions are possible for a single edge, we give each reconstruction an id
  • We indicate for which character trait we are defining the reconstruction
  • We then indicate which states the character assumes on that edge. For each state realized on the edge, that involves stating:
    • the state assignment
    • the duration (length of time) that the edge spends in the given state
    • the order in which the state changes happen (though we could just assume state transitions are listed chronologically, NeXML suggests making all data explicit rather than relying on the structure of the data file to convey information).

Thus the annotation for an edge that switches from state s2 to state s1 of character cr1 would be constructed like this:

 m <- meta("simmap:reconstructions", children = c(
        meta("simmap:reconstruction", children = c(

          meta("simmap:char", "cr1"),
          meta("simmap:stateChange", children = c(
            meta("simmap:order", 1),
            meta("simmap:length", "0.2030"),
            meta("simmap:state", "s2"))),
          
          meta("simmap:char", "cr1"),
          meta("simmap:stateChange", children = c(
            meta("simmap:order", 2),
            meta("simmap:length", "0.0022"),
            meta("simmap:state", "s1")))
          ))))

Of course, writing out such a definition manually quickly becomes tedious. Because these are just R commands, we can easily define a function that loops over an assignment like this for each edge, extracting the appropriate order, length and state from an existing R object such as the one provided by the phytools package.
Likewise, it is straightforward to define a function that reads this data using the RNeXML utilities and converts it back to the phytools format. The full implementation of this mapping can be seen in the simmap_to_nexml() and nexml_to_simmap() functions provided in the RNeXML package.
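
As a rough illustration of the looping step described above, the sketch below turns the mapping for a single edge into the order, length and state values that the meta elements need. It assumes the edge is stored the way phytools stores an entry of maps, as a numeric vector of durations named by state. The helper name edge_to_state_changes() and the state_ids lookup are hypothetical; they are not part of RNeXML or phytools.

```r
# Hypothetical helper: convert one edge's mapping into order/length/state rows.
# `edge_map` is assumed to be a numeric vector of durations named by state,
# listed in chronological order along the edge.
# `state_ids` (optional) maps state symbols to the ids used in the NeXML file.
edge_to_state_changes <- function(edge_map, state_ids = NULL) {
  states <- names(edge_map)
  if (!is.null(state_ids)) {
    states <- unname(state_ids[states])
  }
  data.frame(
    order  = seq_along(edge_map),   # position of the state along the edge
    length = unname(edge_map),      # time spent in that state
    state  = states,                # state id, e.g. "s2"
    stringsAsFactors = FALSE
  )
}

# Example edge: starts in state "B" for 0.2030, then switches to "A" for 0.0022.
edge_map  <- c(B = 0.2030, A = 0.0022)
state_ids <- c(A = "s1", B = "s2")   # illustrative symbol-to-id lookup
changes   <- edge_to_state_changes(edge_map, state_ids)
print(changes)
```

Each row of the result corresponds to one simmap:stateChange element in the annotation shown earlier.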

As the code indicates, the key step is simply to define the data in meta elements. In so doing, we have defined a custom namespace, simmap, to hold our variables. This allows us to provide a URL with more detailed descriptions of what each of these elements means:

nex <- add_namespaces(c(simmap = "https://github.com/ropensci/RNeXML/tree/master/inst/simmap.md"))

At that URL we have posted a simple description of each term.
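
To make the role of the prefix concrete, here is a minimal sketch in base R of how a prefixed property such as simmap:state can be traced back to the URL registered for its namespace. The function namespace_for() is a hypothetical helper written for this illustration; it is not how RNeXML resolves namespaces internally.

```r
# Namespace declaration, using the same prefix and URL as above.
namespaces <- c(simmap = "https://github.com/ropensci/RNeXML/tree/master/inst/simmap.md")

# Hypothetical helper: return the URL registered for a property's prefix.
namespace_for <- function(property, namespaces) {
  # Everything before the first colon is treated as the prefix.
  prefix <- sub(":.*$", "", property)
  if (!prefix %in% names(namespaces)) {
    stop("no namespace registered for prefix '", prefix, "'")
  }
  unname(namespaces[prefix])
}

url <- namespace_for("simmap:state", namespaces)
print(url)
```

A reader who meets an unfamiliar property in a file can follow the same lookup by hand: take the prefix, find its declaration, and visit the URL for the term descriptions.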

Using this convention we can generate NeXML files containing simmap data, read those files into R, and convert them back into the phytools package format. These simple functions serve as further illustration of how RNeXML can be used to extend the NeXML standard. We illustrate their use briefly here, starting with loading a nexml object containing a simmap reconstruction into R:

data(simmap_ex)

The get_trees() function can be used to return an ape::phylo tree as usual. RNeXML automatically detects the simmap reconstruction data and includes it in a maps element of the ape::phylo object, for use with other phytools functions.

phy <- nexml_to_simmap(simmap_ex)
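
The maps element is what most phytools functions read. As a sketch of the kind of summary it supports, the code below totals the time spent in each state across all edges. To stay self-contained it uses a small hand-made list in place of phy$maps, assuming the phytools layout of one named numeric vector per edge. The values are invented for illustration and the helper time_in_state() is hypothetical.

```r
# Stand-in for phy$maps: one named numeric vector per edge (invented values).
maps <- list(
  c(A = 0.12, B = 0.05),
  c(B = 0.30),
  c(C = 0.08, A = 0.02)
)

# Hypothetical helper: total time spent in each state over all edges.
time_in_state <- function(maps) {
  durations <- unlist(lapply(maps, unname))  # all segment lengths
  states    <- unlist(lapply(maps, names))   # matching state symbols
  tapply(durations, states, sum)             # sum durations by state
}

totals <- time_in_state(maps)
print(totals)
```

With the object loaded above, time_in_state(phy$maps) would give the corresponding totals, provided phy is a single tree rather than a list of trees.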

We can then use various functions from phytools designed for simmap objects (Revell 2012), such as the plotting function:

library("phytools")
plotSimmap(phy)
no colors provided. using the following legend:
       A        B        C 
 "black"    "red" "green3" 

Stochastic character mapping on a phylogeny, as generated by the phytools package after parsing the simmap-extended NeXML.

Likewise, we can convert the object back into the NeXML format and write it out to a file to be read by other users.

nex <- simmap_to_nexml(phy) 
nexml_write(nex, "simmap.xml")
[1] "simmap.xml"

Though other NeXML parsers (for instance, for Perl or Python) have not been written explicitly to express simmap data, those parsers will nonetheless be able to successfully parse this file and expose the simmap data to the user.

Bollback, Jonathan P. 2006. “SIMMAP: Stochastic Character Mapping of Discrete Traits on Phylogenies.” BMC Bioinformatics 7 (1). Springer Science + Business Media: 88. doi:10.1186/1471-2105-7-88.

Huelsenbeck, John P., Rasmus Nielsen, and Jonathan P. Bollback. 2003. “Stochastic Mapping of Morphological Characters.” Systematic Biology 52 (2). Oxford University Press (OUP): 131–58. doi:10.1080/10635150390192780.

Revell, Liam J. 2012. “Phytools: An R Package for Phylogenetic Comparative Biology (and Other Things).” Methods in Ecology and Evolution 3: 217–23.

RNeXML/tests/0000755000176200001440000000000012641021656012464 5ustar liggesusers
RNeXML/tests/testthat/0000755000176200001440000000000012734357450014333 5ustar liggesusers
RNeXML/tests/testthat/test_parsing.R0000644000176200001440000000133512731606043017152 0ustar liggesusers
context("parsing")

# More lower-level parsing tests in inheritance

test_that("We can parse a NeXML file to an S4 RNeXML::tree object", {
  f <- system.file("examples", "trees.xml", package="RNeXML")
  doc <- xmlParse(f)
  root <- xmlRoot(doc)
  nexml <- as(root, "nexml") ## parse the XML into S4
  expect_is(nexml,"nexml")
})

test_that("We preserve existing namespace", {
  f <- system.file("examples/biophylo.xml", package="RNeXML")
  nex <- nexml_read(f)
  g <- tempfile()
  nexml_write(nex, g)
  expect_true_or_null(nexml_validate(g))
  nex2 <- nexml_read(g)

  ## check the namespaces are added
  expect_gt(length(get_namespaces(nex2)), length(get_metadata(nex)))

  ## Check that the new abbreviations are added
})
RNeXML/tests/testthat/test_taxonomy.R0000644000176200001440000000372312731606043017370 0ustar liggesusers
context("taxonomy")

data(bird.orders)
birdorders_small <- drop.tip(bird.orders, tip = 1:10)
birds <- add_trees(birdorders_small)
birds <- taxize_nexml(birds, "NCBI")

data(chiroptera)
chiroptera_small <- drop.tip(chiroptera, tip = 1:906)
chir <- add_trees(chiroptera_small)
chir <- taxize_nexml(chir, "NCBI")

chiroptera_super_small <- drop.tip(chiroptera, tip = 1:911)
chir_super_small <- add_trees(chiroptera_super_small)
chir_super_small <- taxize_nexml(chir_super_small, "NCBI")

test_that("taxize_nexml correctly collects ncbi identifiers", {
  expect_is(birds@otus, "ListOfotus")
  expect_is(birds@otus@.Data[[1]]@otu, "ListOfotu")
  expect_is(birds@otus@.Data[[1]]@otu[[1]], "otu")
  expect_is(birds@otus@.Data[[1]]@otu[[1]]@meta, "ListOfmeta")
  expect_is(birds@otus@.Data[[1]]@otu[[1]]@meta[[1]], "meta")
  expect_equal(slot(birds@otus@.Data[[1]]@otu[[1]]@meta[[1]], "href"), "http://ncbi.nlm.nih.gov/taxonomy/56308")
  expect_equal(slot(birds@otus@.Data[[1]]@otu[[1]]@meta[[1]], "rel"), "tc:toTaxon")

  expect_is(chir@otus, "ListOfotus")
  expect_is(chir@otus@.Data[[1]]@otu, "ListOfotu")
  expect_is(chir@otus@.Data[[1]]@otu[[1]], "otu")
})

test_that("we can extract taxonomy data from the object", {
  expect_is(get_metadata(birds, "otus/otu"), "data.frame")
  expect_is(get_metadata(chir_super_small, 'otus/otu'), "data.frame")
})

### TODO: how to deal with missing meta slot elements???
test_that("taxize_nexml throws appropriate warnings", {
  chir1 <- drop.tip(chiroptera, tip = 1:910)
  chir1 <- add_trees(chir1)
  expect_warning(taxize_nexml(chir1, "NCBI"))

  chiroptera_super_small <- drop.tip(chiroptera, tip = 1:912)
  chir_super_small <- add_trees(chiroptera_super_small)
  expect_is(taxize_nexml(chir_super_small, "NCBI"), "nexml")
  # note from Scott: above test used to test that a warning
  # was not thrown, but not() is no longer exported from testthat
  # so instead testing that the object returned is of
  # class nexml
})
RNeXML/tests/testthat/treebase_test.R0000644000176200001440000000250012641021656017275 0ustar liggesusers
# filename does not begin with `test` so not run by `testthat::test_dir()`

# This test assumes the working directory contains all the XML files provided
# here: https://github.com/rvosa/supertreebase/tree/master/data/treebase

files <- system("ls *.xml", intern=TRUE)

print("testing parsing only")
parses <- sapply(files, function(x){
  out <- try(xmlParse(x))
  if(is(out, "try-error"))
    out <- x
  else {
    free(out)
    out = "success"
  }
  out
})

fails <- parses[parses!="success"]
works <- files[parses == "success"]
writeLines(fails, "unparseable.txt")

print("testing parsing only")
treebase <- sapply(works, function(x){
  print(x)
  tree <- try(nexml_read(x, "nexml"))
  if(is(tree, "try-error"))
    out = "read failed:"
  else {
    tree <- try(as(tree, "phylo"))
    if(is(tree, "try-error"))
      out = "conversion failed:"
    else
      out = "success"
  }
  rm(tree)
  out
})

save(list=ls(), file = "RNeXML_test_results.rda")
table(treebase)
RNeXML/tests/testthat/test_serializing.R0000644000176200001440000000170512641021656020031 0ustar liggesusers
context("serializing")

## More tests at lower-level serializing from S4 to XML in inheritance.R

test_that("We can serialize ape to S4 RNeXML into valid NeXML",{
  data(bird.orders)
  nexml <- as(bird.orders, "nexml")
  as(nexml, "XMLInternalNode")

  ### Higher level API tests
  nexml_write(bird.orders, file="test.xml")
  expect_true_or_null(nexml_validate("test.xml"))

  ## Clean up
  unlink("test.xml")
})

test_that("We can serialize parsed NeXML to S4 RNeXML into valid NeXML",{
  root <- xmlRoot(xmlParse(system.file("examples", "trees.xml", package="RNeXML")))
  tree <- as(root, "nexml")
  nexml_write(tree, file="test.xml")

  ## validate
  expect_true_or_null(nexml_validate("test.xml"))

  ## Clean up
  unlink("test.xml")
})

#root <- xmlRoot(xmlParse(system.file("examples", "trees.xml", package="RNeXML")))
#tree <- as(root, "nexml")
#tree@trees[[1]]@tree[[1]]@node[[4]]@meta
#as(root[["trees"]][["tree"]][[4]][["meta"]], "meta")
RNeXML/tests/testthat/test_inheritance.R0000644000176200001440000000627112641021656020005 0ustar liggesusers
context("inheritance")

## FIXME
## Should include expect_that tests, rather than just running without errors.
## ADD test to show that toggling xml->s4->xml returns IDENTICAL objects,
## Add tests to check values on some nodes/attributes...
test_that("we can perform simple conversions between NeXML XML and S4", {
  # basic example
  node <- newXMLNode("meta",
                     attrs = c('xsi:type'="nex:LiteralMeta", id="dict1", property="cdao:has_tag", datatype="xsd:boolean", content="true"),
                     suppressNamespaceWarning=TRUE)
  n2 <- newXMLNode("node",
                   attrs = c(about="#n4", label="n4", id = "n4"),
                   .children = node)

  # check conversions to/from NeXML
  s4 <- as(n2, "node")
  xmlfroms4 <- as(s4, "XMLInternalNode")
  ## expect_identical(n2, xmlfroms4) #cannot compare two external pointers
  expect_identical(saveXML(n2), saveXML(xmlfroms4))
})

# test_that("We can parse a complete NeXML file and toggle back and forth between XML and S4", {
test_that("Parse a complete NeXML file to a single otu", {
  doc <- xmlParse(system.file("examples", "trees.xml", package="RNeXML"))
  root <- xmlRoot(doc)
  otu <- as(root[["otus"]][[1]], "otu")
  expect_that(otu, is_a("otu"))
  as(otu, "XMLInternalNode")
})

doc <- xmlParse(system.file("examples", "trees.xml", package="RNeXML"))
root <- xmlRoot(doc)

test_that("Parse a complete NeXML file to trees", {
  trees <- as(root[["trees"]], "trees")
  expect_that(trees, is_a("trees"))
  as(trees, "XMLInternalNode")
})

test_that("Parse a complete NeXML file to many otus", {
  otus <- as(root[["otus"]], "otus")
  expect_that(otus, is_a("otus"))
  tt <- as(otus, "XMLInternalNode")
  expect_that(tt, is_a("XMLInternalNode"))
})

test_that("Parse a complete NeXML file to xmlinternalnode", {
  parsed <- as(root, "nexml")
  expect_that(parsed, is_a("nexml"))
  serialized <- as(parsed, "XMLInternalNode")
  expect_that(serialized, is_a("XMLInternalNode"))
})

test_that("Check that values are correct in the otu class element", {
  otu <- as(root[["otus"]][[1]], "otu")
  expect_that(otu@id[[1]], equals("t1"))
  expect_that(otu@label[[1]], equals("species 1"))
  expect_that(otu@meta, is_a("list"))
  expect_that(otu@about, is_identical_to(character(0)))
})

test_that("Check that values are correct in the trees class element", {
  trees <- as(root[["trees"]], "trees")
  expect_that(trees@tree, is_a("ListOftree"))
  expect_that(trees@otus[[1]], equals("tax1"))
  expect_that(trees@id[[1]], equals("Trees"))
  expect_that(trees@label[[1]], equals("TreesBlockFromXML"))
  expect_that(trees@meta, is_a("list"))
  expect_that(trees@about, is_identical_to(character(0)))
})

test_that("Check that values are correct in the otus class element", {
  otus <- as(root[["otus"]], "otus")
  expect_that(otus@otu, is_a("ListOfotu"))
  expect_that(otus@id[[1]], equals("tax1"))
  expect_that(otus@label[[1]], equals("RootTaxaBlock"))
  expect_that(otus@meta, is_a(class=c("list","ListOfmeta")))
  expect_that(otus@about, is_identical_to(character(0)))
})
RNeXML/tests/testthat/test_rdf.R0000644000176200001440000000135512641021656016265 0ustar liggesusers
context("rdf")

test_that("we can extract rdf-xml", {
  if(require("Sxslt")){
    f <- system.file("examples", "meta_example.xml", package="RNeXML")
    rdf <- get_rdf(f)
    expect_is(rdf, "XMLInternalXSLTDocument")
  }
})

test_that("we can perform sparql queries with rrdf", {
  skip_on_travis()
  if(require("Sxslt")){
    f <- system.file("examples", "meta_example.xml", package="RNeXML")
    rdf <- get_rdf(f)

    ## Write to a file and read in with rrdf
    saveXML(rdf, "rdf_meta.xml")

    success <- require(rrdf)
    if(success){
      lib <- load.rdf("rdf_meta.xml")

      ## Perform a SPARQL query:
      out <- sparql.rdf(lib, "SELECT ?title WHERE { ?x ?title}")
    }
    unlink("rdf_meta.xml")
  }
})
RNeXML/tests/testthat/test_ape.R0000644000176200001440000000527712641021656016266 0ustar liggesusers
context("ape")

test_that("From ape::phylo to RNeXML::nexml object", {
  data(bird.orders)
  expect_is(as(bird.orders, "nexml"), class="nexml")
})

test_that("We can go from various orderings of ape::phylo to RNeXML::nexml", {
  data(bird.orders)
  nexml <- as(bird.orders, "nexml")
  phy <- as(nexml, "phylo")

  ## Demonstrate that we now have a phylo object
  p <- plot(phy)
  expect_that(plot(phy), is_a("list"))
  expect_that(phy, is_a("phylo"))
})

test_that("From nexml to multiPhylo", {
  # part of base testing, could be
  # replaced with higher level, but why
  f <- system.file("examples", "trees.xml", package="RNeXML")
  doc <- xmlParse(f)
  root <- xmlRoot(doc)
  nexml <- as(root, "nexml") ## parse the XML into S4

  ## APE TEST: Coerce the S4 into phylo S3 object
  expect_warning(phy <- as(nexml, "phylo"), "Multiple trees found, Returning multiPhylo object")
  expect_is(phy, "multiPhylo")
})

## This unit test is really not testing ape functions but just the higher-level nexml_write function...
test_that("We can serialize the various versions of the ape format", {
  data(bird.orders)
  nexml <- as(bird.orders, "nexml")
  nexml_write(nexml, file = "test.xml")
  unlink("test.xml")
})

test_that("We can read and write NeXML to phylo and back without edge.lengths", {
  s <- "owls(((Strix_aluco,Asio_otus),Athene_noctua),Tyto_alba);"
  cat(s, file = "ex.tre", sep = "\n")
  owls <- read.tree("ex.tre")
  nexml_write(owls, file = "ex.xml")
  owls2 <- as(nexml_read("ex.xml"), "phylo")
  expect_equal(owls, owls2)
  ## FIXME what?
  unlink("ex.tre")
  unlink("ex.xml")
})

test_that("Rooted trees remain rooted on conversions", {
  expect_true(is.rooted(bird.orders))
  expect_true(is.rooted(as(as(bird.orders, "nexml"), "phylo")))
  write.nexml(bird.orders, file = "tmp.xml")
  expect_true(is.rooted(as(read.nexml("tmp.xml"), "phylo")))
  unlink("tmp.xml")
})

phy <- unroot(bird.orders)
test_that("Unrooted trees remain unrooted on conversions", {
  expect_false(is.rooted(phy))
  expect_false(is.rooted(as(as(phy, "nexml"), "phylo")))
  write.nexml(phy, file = "tmp.xml")
  expect_false(is.rooted(as(read.nexml("tmp.xml"), "phylo")))
  unlink("tmp.xml")
})

test_that("We can convert trees with only some edge lengths into ape::phylo", {
  f <- system.file("examples", "some_missing_branchlengths.xml", package="RNeXML")
  expect_warning(a <- as(read.nexml(f), "phylo"), "Multiple trees found, Returning multiPhylo object")
  # We can parse it, goodness knows what anyone will do with it. Better to hack off the branch lengths or convert to 0, but that's for the user.
})
RNeXML/tests/testthat/test_get_characters.R0000644000176200001440000000064712731606043020472 0ustar liggesusers
context("get_characters")

test_that("Getting characters", {
  f <- system.file("examples", "comp_analysis.xml", package="RNeXML")
  nex <- read.nexml(f)
  out <- get_characters(nex)
  expect_is(out, "data.frame")
})

test_that("get_characters throws appropriate warnings", {
  f <- system.file("examples", "comp_analysis.xml", package="RNeXML")
  nex <- read.nexml(f)
  expect_is(get_characters(nex), "data.frame")
})
RNeXML/tests/testthat/test_toplevel_api.R0000644000176200001440000000410312641021656020167 0ustar liggesusers
context("top level API")

test_that("read.nexml works", {
  ## The short version using an RNeXML API
  f <- system.file("examples", "trees.xml", package="RNeXML")
  nex <- read.nexml(f) # check alias
  expect_is(nex, "nexml")
})

test_that("write.nexml works (from ape::phylo)", {
  ## The short version using an RNeXML API
  data(bird.orders)
  nexml_write(bird.orders, file="example.xml")
  write.nexml(bird.orders, file="example.xml") # check alias too

  ## Check that that example is valid NeXML
  expect_true_or_null(nexml_validate("example.xml"))
  expect_is(nexml_read("example.xml", "nexml"), "nexml")

  unlink("example.xml") # cleanup
})

test_that("write.nexml can write multiple trees at once ", {
  f <- system.file("examples", "trees.xml", package="RNeXML")
  nex <- nexml_read(f)
  trees <- get_trees(nex)

  ## We can write a listOfmultiPhylo if the argument is named
  nexml_write(trees = trees, file="example.xml")
  expect_true_or_null(nexml_validate("example.xml"))

  # we can write a multiPhylo (or phylo) by attempting coercion on the first argument instead:
  nexml_write(trees[[1]], file="example.xml")
  expect_true_or_null(nexml_validate("example.xml"))

  unlink("example.xml") # cleanup
})

test_that("We can get the right level of lists of trees ", {
  f <- system.file("examples", "trees.xml", package="RNeXML")
  nex <- nexml_read(f)

  ## identical methods, Collapses length-1 lists
  phy <- as(nex,
            "phylo") ##
  phy2 <- get_trees(nex)
  phy3 <- nexml_get(nex, "trees")
  expect_identical(phy, phy2)
  expect_identical(phy3, phy2)

  ## Doesn't collapse the length-1 lists, returns list of multiPhylo always:
  phy <- as(nex, "multiPhyloList") ##
  phy2 <- get_trees_list(nex)
  phy3 <- nexml_get(nex, "trees_list")
  expect_identical(phy, phy2)
  expect_identical(phy3, phy2)

  ## Collapse to multiPhylo
  phy <- as(nex, "multiPhylo") ##
  phy2 <- get_trees(nex) # same because there are two trees in the same `trees` node.
  expect_identical(phy, phy2)

  phy3 <- nexml_get(nex, "flat_trees") ## FIXME SOMETHING WRONG!
  expect_identical(phy3, phy2)
})
RNeXML/tests/testthat/test_get_level.R0000644000176200001440000000022012731606037017450 0ustar liggesusers
testthat::context("get_level")

f <- system.file("examples", "comp_analysis.xml", package="RNeXML")
nex <- read.nexml(f)
get_level(nex, "meta")
RNeXML/tests/testthat/test_comp_analysis.R0000644000176200001440000000153412641021656020352 0ustar liggesusers
context("Comparative analysis")

library(geiger)

test_that("We can extract tree and trait data to run fitContinuous and fitDiscrete", {
  nexml <- read.nexml(system.file("examples", "comp_analysis.xml", package="RNeXML"))
  traits <- get_characters(nexml)
  tree <- get_trees(nexml)
  expect_is(tree, "phylo")
  cts <- fitContinuous(tree, traits[1], ncores=1)
  ## Incredibly, fitDiscrete cannot take discrete characters
  # dte <- fitDiscrete(tree, traits[2], ncores=1)
  traits[[2]] <- as.numeric(traits[[2]])
  dte <- fitDiscrete(tree, traits[2], ncores=1)
})

test_that("We can serialize tree and trait data for a comparative analysis", {
  data(geospiza)
  add_trees(geospiza$phy)
  nexml <- add_characters(geospiza$dat)
  write.nexml(nexml, file = "geospiza.xml")
  expect_true_or_null(nexml_validate("geospiza.xml"))
  unlink("geospiza.xml")
})
RNeXML/tests/testthat/test_global_ids.R0000644000176200001440000000060512641021656017606 0ustar liggesusers
context("Set global (uuid) identifiers")

test_that("We can generate valid EML with uuid ids
on all elements", {
  if(require("uuid")){
    options(uuid = TRUE)
    data(geospiza)
    add_trees(geospiza$phy)
    nexml <- add_characters(geospiza$dat)
    write.nexml(nexml, file = "geospiza.xml")
    expect_true_or_null(nexml_validate("geospiza.xml"))
    unlink("geospiza.xml")
  }
})
RNeXML/tests/testthat/test_nexml_read.R0000644000176200001440000000240512641021656017625 0ustar liggesusers
context("read nexml")

f <- system.file("examples", "trees.xml", package = "RNeXML")
url <- "https://raw.githubusercontent.com/ropensci/RNeXML/master/inst/examples/trees.xml"

test_that("we can read nexml from a file path", {
  nex <- nexml_read(f)
  expect_is(nex, "nexml")
  expect_equal(nex@trees@names, "trees")
})

test_that("we can read nexml from a url", {
  nex <- read.nexml(url)
  expect_is(nex, "nexml")
  expect_equal(nex@trees@names, "trees")
})

test_that("we can read nexml from a character string of xml", {
  str <- paste0(readLines(f), collapse = "")
  nex <- nexml_read(str)
  expect_is(nex, "nexml")
  expect_equal(nex@trees@names, "trees")
})

test_that("we can read nexml from a XMLInternalDocument object", {
  library("httr")
  library("XML")
  x <- xmlParse(content(GET(url)))
  nex <- nexml_read(x)
  expect_is(nex, "nexml")
  expect_equal(nex@trees@names, "trees")
})

test_that("we can read nexml from a XMLInternalNode object", {
  library("httr")
  library("XML")
  x <- xmlParse(content(GET(url)))
  nex <- nexml_read(xmlRoot(x))
  expect_is(nex, "nexml")
  expect_equal(nex@trees@names, "trees")
})

test_that("alias for nexml_read works", {
  nex <- read.nexml(f)
  expect_is(nex, "nexml")
  expect_equal(nex@trees@names, "trees")
})
RNeXML/tests/testthat/test_meta_extract.R0000644000176200001440000000117712641021656020174 0ustar liggesusers
context("extract_metadata")

nex <- add_basic_meta(
  title = "My test title",
  description = "A description of my test",
  creator = "Carl Boettiger ",
  publisher = "unpublished data",
  pubdate = "2012-04-01",
  citation = citation("ape"))

test_that("we can extract metadata using the dedicated functions", {
  get_citation(nex)
  get_license(nex)
  get_metadata(nex)
  summary(nex)

  unlink("example.xml")
})

test_that("we can extract all available metadata at a specified level of the DOM", {
  get_metadata(nex)
  get_metadata(nex, "trees")
})
RNeXML/tests/testthat/geiger_test.R0000644000176200001440000000334612641021656016756 0ustar liggesusers
context("Geiger tests (may take 15+ minutes)")

library(geiger)

test_that("We can write caudata data to nexml", {
  data(caudata)
  nexml_write(trees = caudata$phy, characters = caudata$dat, file="tmp.xml")
  expect_true_or_null(nexml_validate("tmp.xml"))
  unlink("tmp.xml") # cleanup
})

test_that("We can write geospiza data to nexml", {
  data(geospiza)
  nexml_write(trees = geospiza$phy, characters = geospiza$dat, file="tmp.xml")
  expect_true_or_null(nexml_validate("tmp.xml"))
  unlink("tmp.xml") # cleanup
})

test_that("We can write chelonia data to nexml", {
  data(chelonia)
  nexml_write(trees = chelonia$phy, characters = chelonia$dat, file="tmp.xml")
  expect_true_or_null(nexml_validate("tmp.xml"))
  unlink("tmp.xml") # cleanup
})

test_that("We can write primates data to nexml", {
  data(primates)
  nexml_write(trees = primates$phy, characters = primates$dat, file="tmp.xml")
  expect_true_or_null(nexml_validate("tmp.xml"))
  unlink("tmp.xml") # cleanup
})

test_that("We can write whales data to nexml", {
  data(whales)
  # taxa need to be rownames not separate column
  whales$dat <- whales$richness[[2]]
  names(whales$dat) <- whales$richness[[1]]
  nexml_write(trees = whales$phy, characters = whales$dat, file="tmp.xml")
  expect_true_or_null(nexml_validate("tmp.xml"))
  unlink("tmp.xml") # cleanup
})

test_that("We can write amphibia multiphylo to nexml. Two of these phylogenies each have nearly 3K taxa, so this may take around 12 minutes", {
  # multiphylo, where two phylogenies have each nearly 3K taxa
  data(amphibia)
  class(amphibia) <- "multiPhylo"
  runtime <- system.time(nexml_write(amphibia, file="tmp.xml")) # Slow!
  # about 12 minutes
  expect_true_or_null(nexml_validate("tmp.xml"))
  unlink("tmp.xml") # cleanup
})
RNeXML/tests/testthat/test_publish.R0000644000176200001440000000421312641021656017154 0ustar liggesusers
context("publish")

if(0){ # skip publishing tests. These were all still passing at last check, but rfigshare configuration for testing is not ideal.

# This loads the rOpenSci figshare sandbox credentials, so that the example
# can run automatically during check and install. Unlike normal figshare accounts,
# data loaded to this testing sandbox is periodically purged.
library(rfigshare)
status <- try(fs_auth(token = "xdBjcKOiunwjiovwkfTF2QjGhROeLMw0y0nSCSgvg3YQxdBjcKOiunwjiovwkfTF2Q", token_secret = "4mdM3pfekNGO16X4hsvZdg"))
if(is(status, "try-error") || (is(status, "response") && status$status_code != 200)){
  warning("Could not authenticate figshare, skipping figshare tests")
} else {

  ## Create example file
  library(geiger)
  data(geospiza)
  geiger_nex <- add_trees(geospiza$phy)
  geiger_nex <- add_characters(geospiza$dat, geiger_nex)
  geiger_nex <- add_basic_meta(
    title = "Geospiza phylogeny with character data rendered as NeXML",
    creator = "Carl Boettiger",
    description = "This example NeXML file was created using the data originally provided in the geiger package for R to illustrate how this data can be stored, shared and distributed as NeXML.",
    citation = citation("geiger"),
    nexml = geiger_nex)

  test_that("We can publish to figshare", {
    ## Publish
    id <- nexml_publish(geiger_nex, visibility="public", repo="figshare")

    ## Download and parse publication
    ## Note that at present, only public files can be automatically downloaded from figshare
    library(rfigshare)
    test_nex <- nexml_read(fs_download(id))

    ## Extract and compare metadata from upload and download
    m <- get_metadata(geiger_nex)
    test_m <- get_metadata(test_nex)
    expect_equal(m["dc:title"], test_m["dc:title"])
    expect_equal(m["dc:description"], test_m["dc:description"])

    ## Check that DOI resolves -- doesn't for the test
    ## account
    #library(httr)
    #page <- GET(test_m[["dc:identifier"]])
    #expect_equal(page$status_code, 200)

    # Check that we avoid repeated metadata entries
    expect_equal(sum(match(names(test_m), "dc:pubdate"), na.rm=TRUE), 1)
    expect_equal(sum(match(names(test_m), "cc:license"), na.rm=TRUE), 1)
  })
}
}
RNeXML/tests/testthat/test_validate.R0000644000176200001440000000125212641021656017277 0ustar liggesusers
context("Online validator tool")

test_that("example file validates", {
  f <- system.file("examples", "trees.xml", package="RNeXML")
  expect_true_or_null(nexml_validate(f)) # null if we cannot perform validation, don't fail
})

test_that("RNeXML-generated file validates", {
  data(bird.orders)
  f <- nexml_write(bird.orders, file="test.xml")
  o <- nexml_validate(f)
  if(!is.null(o)){
    expect_true(o)
  } else {
    expect_null(o)
  }
  unlink("test.xml")
})

test_that("Validation can fail gracefully", {
  f <- system.file("examples/sparql.newick", package="RNeXML")
  o <- nexml_validate(f)
  if(!is.null(o)) {
    expect_false(o)
  } else {
    expect_null(o)
  }
})
RNeXML/tests/testthat/conversions.R0000644000176200001440000001375512641021656017032 0ustar liggesusers
## Note: these tests do not run in the typical test suite process as the file name
## doesn't start with "test-"

# nms <- c("S100","S1000","S10000","S10001","S10005","S10006","S10007","S10009","S10014","S10018","S10301","S1044","S10500","S10636","S1064","S1073","S10774","S1207","S13135","S1452","S938","S9981","S999")
# nexml_files <- lapply(nms, function(x) content(GET(sprintf("https://raw.github.com/rvosa/supertreebase/master/data/treebase/%s.xml",x)), as="text"))
# save(nexml_files, file="~/github/ropensci/RNeXML_testfiles/nexml_files.rda")
load("~/github/ropensci/RNeXML_testfiles/nexml_files.rda")

context("nexml files parse correctly")

test_that("nexml files parse correctly", {
  expect_is(nexml_read(nexml_files[[1]]), "multiPhylo")
  expect_is(nexml_read(nexml_files[[2]]), "phylo")
  expect_is(nexml_read(nexml_files[[3]]), "list")
  expect_is(nexml_read(nexml_files[[4]]), "phylo")
  expect_is(nexml_read(nexml_files[[5]]), "list")
  expect_is(nexml_read(nexml_files[[6]]), "list")
  expect_is(nexml_read(nexml_files[[7]]), "phylo")
  expect_is(nexml_read(nexml_files[[8]]), "list")
  expect_is(nexml_read(nexml_files[[9]]), "list")
  expect_is(nexml_read(nexml_files[[10]]), "phylo")
  expect_is(nexml_read(nexml_files[[11]]), "list")
  expect_is(nexml_read(nexml_files[[12]]), "list")
  expect_is(nexml_read(nexml_files[[13]]), "phylo")
  expect_is(nexml_read(nexml_files[[14]]), "phylo")
  expect_is(nexml_read(nexml_files[[15]]), "phylo")
  expect_is(nexml_read(nexml_files[[16]]), "list")
  expect_is(nexml_read(nexml_files[[17]]), "list")
  expect_is(nexml_read(nexml_files[[18]]), "phylo")
  expect_is(nexml_read(nexml_files[[19]]), "phylo")
  expect_is(nexml_read(nexml_files[[20]]), "list")
  expect_is(nexml_read(nexml_files[[21]]), "phylo")
  expect_is(nexml_read(nexml_files[[22]]), "list")
  expect_is(nexml_read(nexml_files[[23]]), "multiPhylo")
})

# puma <- search_treebase('"Puma"', by="taxon")
# save(puma, file="~/github/ropensci/RNeXML_testfiles/puma.rda")
# ursus <- search_treebase('"Ursus"', by="taxon")
# save(ursus, file="~/github/ropensci/RNeXML_testfiles/ursus.rda")
# quercus <- search_treebase('"Quercus"', by="taxon")
# save(quercus, file="~/github/ropensci/RNeXML_testfiles/quercus.rda")
load(file="~/github/ropensci/RNeXML_testfiles/puma.rda")
load(file="~/github/ropensci/RNeXML_testfiles/ursus.rda")
load(file="~/github/ropensci/RNeXML_testfiles/quercus.rda")

context("ape files convert to class nexml correctly")

test_that("ape files convert to class nexml correctly", {
  expect_that(as(puma[[1]], "nexml"), is_a("nexml"))
  expect_that(as(puma[[2]], "nexml"), is_a("nexml"))
  expect_that(as(puma[[3]], "nexml"), is_a("nexml"))
  expect_that(as(puma[[5]], "nexml"), is_a("nexml"))
  expect_that(as(puma[[7]], "nexml"), is_a("nexml"))
  expect_that(as(puma[[9]], "nexml"), is_a("nexml"))
  expect_that(as(puma[[11]], "nexml"), is_a("nexml"))
  expect_that(as(ursus[[1]], "nexml"), is_a("nexml"))
  expect_that(as(ursus[[2]], "nexml"), is_a("nexml"))
  expect_that(as(ursus[[3]], "nexml"), is_a("nexml"))
  expect_that(as(ursus[[5]], "nexml"), is_a("nexml"))
  expect_that(as(ursus[[7]], "nexml"), is_a("nexml"))
  expect_that(as(ursus[[9]], "nexml"), is_a("nexml"))
  expect_that(as(ursus[[11]], "nexml"), is_a("nexml"))
  expect_that(as(ursus[[13]], "nexml"), is_a("nexml"))
  expect_that(as(ursus[[15]], "nexml"), is_a("nexml"))
  expect_that(as(ursus[[18]], "nexml"), is_a("nexml"))
  expect_that(as(ursus[[22]], "nexml"), is_a("nexml"))
  expect_that(as(ursus[[27]], "nexml"), is_a("nexml"))
  expect_that(as(ursus[[30]], "nexml"), is_a("nexml"))
  expect_that(as(ursus[[33]], "nexml"), is_a("nexml"))
  expect_that(as(quercus[[1]], "nexml"), is_a("nexml"))
  expect_that(as(quercus[[2]], "nexml"), is_a("nexml"))
  expect_that(as(quercus[[3]], "nexml"), is_a("nexml"))
  expect_that(as(quercus[[5]], "nexml"), is_a("nexml"))
  expect_that(as(quercus[[7]], "nexml"), is_a("nexml"))
  expect_that(as(quercus[[9]], "nexml"), is_a("nexml"))
  expect_that(as(quercus[[11]], "nexml"), is_a("nexml"))
  expect_that(as(quercus[[20]], "nexml"), is_a("nexml"))
  expect_that(as(quercus[[30]], "nexml"), is_a("nexml"))
  expect_that(as(quercus[[40]], "nexml"), is_a("nexml"))
  expect_that(as(quercus[[50]], "nexml"), is_a("nexml"))
  expect_that(as(quercus[[70]], "nexml"), is_a("nexml"))
  expect_that(as(quercus[[90]], "nexml"), is_a("nexml"))
  expect_that(as(quercus[[92]], "nexml"), is_a("nexml"))
})

context("ape files write to nexml files correctly")

test_that("ape files write to nexml files correctly, set 1", {
  nexml_write(puma[[1]], "one.xml")
  nexml_write(puma[[5]], "two.xml")
  nexml_write(puma[[9]], "three.xml")
  expect_is(nexml_read("~/one.xml"), "phylo")
  expect_is(nexml_read("~/two.xml"), "phylo")
  expect_is(nexml_read("~/three.xml"), "phylo")
})

test_that("ape files write to nexml files correctly, set 2", {
  nexml_write(ursus[[1]], "one_u.xml")
  nexml_write(ursus[[5]], "two_u.xml")
  nexml_write(ursus[[9]], "three_u.xml")
  nexml_write(ursus[[1]], "four_u.xml")
  nexml_write(ursus[[5]], "five_u.xml")
  nexml_write(ursus[[9]], "six_u.xml")
  expect_is(nexml_read("~/one_u.xml"), "phylo")
  expect_is(nexml_read("~/two_u.xml"), "phylo")
  expect_is(nexml_read("~/three_u.xml"), "phylo")
  expect_is(nexml_read("~/four_u.xml"), "phylo")
  expect_is(nexml_read("~/five_u.xml"), "phylo")
  expect_is(nexml_read("~/six_u.xml"), "phylo")
})

test_that("ape files write to nexml files correctly, set 3", {
  nexml_write(quercus[[1]], "one_q.xml")
  nexml_write(quercus[[5]], "two_q.xml")
  nexml_write(quercus[[9]], "three_q.xml")
  nexml_write(quercus[[1]], "four_q.xml")
  nexml_write(quercus[[5]], "five_q.xml")
  nexml_write(quercus[[9]], "six_q.xml")
  expect_is(as(nexml_read("~/one_q.xml"), "phylo"), "phylo")
  expect_is(as(nexml_read("~/two_q.xml"), "phylo"), "phylo")
  expect_is(as(nexml_read("~/three_q.xml"), "phylo"), "phylo")
  expect_is(as(nexml_read("~/four_q.xml"), "phylo"), "phylo")
  expect_is(as(nexml_read("~/five_q.xml"), "phylo"), "phylo")
  expect_is(as(nexml_read("~/six_q.xml"), "phylo"), "phylo")
})
RNeXML/tests/testthat/test_simmap.R0000644000176200001440000000202412641021656016772 0ustar liggesusers
context("simmap")

## Make a simmap tree

test_that("we can coerce an ape::phylo tree with a phytools:simmap extension into nexml", {
  skip_if_not_installed("phytools")
  library("phytools")
  set.seed(10)
  tree <- rbdtree(b = log(50), d = 0, Tmax = .5)
  Q <- matrix(c(-2, 1, 1, 1, -2 ,1 ,1, 1, -2), 3, 3)
  rownames(Q) <- colnames(Q) <- c("A", "B", "C")
  ## Note that state symbols must be integers!
  ## factors will be converted
  mtree <- sim.history(tree, Q)
  cols <- c("red", "blue", "green")
  names(cols) <- rownames(Q)

  nex <- simmap_to_nexml(mtree)
  expect_is(nex, "nexml")
  phy <- nexml_to_simmap(nex)

  orig <- plotSimmap(mtree,cols,ftype="off")
  roundtrip <- plotSimmap(phy,cols,ftype="off")
  # checks that the edge mappings are correct
  expect_equal(mtree$maps, phy$maps)

  orig <- as.integer(as.factor(mtree$states[sort(names(mtree$states))]))
  converted <- as.integer(phy$states[sort(names(phy$states))])
  # checks that we got the states slot correct
  expect_equal(converted, orig)
})
RNeXML/tests/testthat/test_meta.R0000644000176200001440000001032712641021656016437 0ustar liggesusers
context("meta")

data(bird.orders)

test_that("We can add additional metadata", {
  ## The short version using an RNeXML API
  nex <- add_basic_meta(
    title = "My test title",
    description = "A description of my test",
    creator = "Carl Boettiger ",
    publisher = "unpublished data",
    pubdate = "2012-04-01")

  write.nexml(nex, file = "meta_example.xml")
  expect_true_or_null(nexml_validate("meta_example.xml"))
  expect_is(nexml_read("meta_example.xml"), "nexml")
  unlink("meta_example.xml") # cleanup
})

test_that("We can add R bibentry type metadata", {
  ## The short version using an RNeXML API
  nex <- add_trees(bird.orders)
  nex <- add_basic_meta(nexml=nex, citation=citation("ape"))

  write.nexml(nex, file = "meta_example.xml")
  expect_true_or_null(nexml_validate("meta_example.xml"))
  expect_is(nexml_read("meta_example.xml"), "nexml")
  unlink("meta_example.xml") # cleanup
})

test_that("We can add additional metadata", {
  ## The short version using an RNeXML API
  nex <- add_trees(bird.orders)
  nex <- add_basic_meta(nexml = nex, citation=citation("ape"))

  history <- meta(property = "skos:historyNote",
                  content = "Mapped from the bird.orders data in the ape package using RNeXML",
                  id = "meta5144")
  modified <- meta(property = "prism:modificationDate",
                   content = "2013-10-04")
  website <- meta(href = "http://carlboettiger.info", rel =
"foaf:homepage") nex <- add_meta(list(history, modified, website), nex, namespaces = c(skos = "http://www.w3.org/2004/02/skos/core#", prism = "http://prismstandard.org/namespaces/1.2/basic/", # check and remove duplicates foaf = "http://xmlns.com/foaf/0.1/")) nexml_write(nex, file = "meta_example.xml") expect_true_or_null(nexml_validate("meta_example.xml")) expect_is(nexml_read("meta_example.xml"), "nexml") unlink("meta_example.xml") # cleanup }) test_that("We can directly add additional metadata at arbitrary level", { nex <- add_trees(bird.orders) modified <- meta(property = "prism:modificationDate", content = "2013-10-04") nex@trees[[1]]@meta <- new("ListOfmeta", list(modified)) get_metadata(nex, "trees") %>% dplyr::filter(property == "prism:modificationDate") %>% dplyr::select(content) -> tmp expect_identical(tmp[[1]], modified@content) }) test_that("We can directly add additional metadata using concatenation notation", { nex <- add_trees(bird.orders) modified <- meta(property = "prism:modificationDate", content = "2013-10-04") website <- meta(href = "http://carlboettiger.info", rel = "foaf:homepage") nex@trees[[1]]@meta <- c(modified) # we can add just one element nex@trees[[1]]@meta <- c(modified,website) # or more than one element get_metadata(nex, "trees") %>% dplyr::filter(property == "prism:modificationDate") %>% dplyr::select(content) -> tmp expect_identical(tmp[[1]], modified@content) }) test_that("We can add arbitrary metadata", { rdfa <- ' twitter github ' parsed <- xmlRoot(xmlParse(rdfa)) arbitrary_rdfa <- meta(property="eml:additionalMetadata", content="additional metadata", children = parsed) nex <- add_meta(arbitrary_rdfa, namespaces = c(foaf = "http://xmlns.com/foaf/0.1/", eml = "eml://ecoinformatics.org/eml-2.1.1", xhtml = "http://www.w3.org/1999/xhtml")) nexml_write(nex, file = "example.xml") expect_is(nexml_read("example.xml", "nexml"), "nexml") unlink("example.xml") # cleanup }) test_that("we can write numeric types of meta elements and get 
correct datatype", { m <- meta(property="numericTest", content = 3.141) expect_is(m@content, "character") expect_match(m@datatype, ".*:decimal") }) RNeXML/tests/testthat/helper-RNeXML.R0000644000176200001440000000017612641021656016775 0ustar liggesusersexpect_true_or_null <- function(o){ if(!is.null(o)){ expect_true(o) } else { expect_null(o) } } library("XML")RNeXML/tests/testthat/test_characters.R0000644000176200001440000000634712641021656017637 0ustar liggesuserscontext("character matrices") ## All tests will use this data file f <- system.file("examples", "comp_analysis.xml", package="RNeXML") test_that("we can parse XML to S4 and serialize S4 to XML for the basic character classes", { doc <- xmlParse(f) root <- xmlRoot(doc) char <- as(root[["characters"]][["format"]][["char"]], "char") out <- as(char, "XMLInternalElementNode") expect_is(char, "char") # not as dumb as it looks, at least we're checking our own method here expect_is(out, "XMLInternalElementNode") # dumb check, but provides a dot to show the code above executed successfully format <- as(root[["characters"]][["format"]], "format") out <- as(format, "XMLInternalElementNode") expect_is(format, "format") expect_is(out, "XMLInternalElementNode") matrix <- as(root[["characters"]][["matrix"]], "obsmatrix") out <- as(matrix, "XMLInternalElementNode") expect_is(matrix, "obsmatrix") expect_is(out, "XMLInternalElementNode") characters <- as(root[["characters"]], "characters") out <- as(characters, "XMLInternalElementNode") expect_is(characters, "characters") expect_is(out, "XMLInternalElementNode") }) test_that("we can actually parse NeXML files containing character data", { nex <- read.nexml(f) expect_is(nex, "nexml") }) ## Now that we tested this, store the result so we can use it in later tests nex <- read.nexml(f) test_that("we can extract character matrix with get_characters", { x <- get_characters(nex) expect_is(x, "data.frame") ## FIXME add additional and more precise expect_ checks }) test_that("we 
can extract a list of character matrices with get_characters_list", { x <- get_characters_list(nex) expect_is(x, "list") expect_is(x[[1]], "data.frame") ## FIXME add additional and more precise expect_ checks }) ## test_that("add_otu can append only unmatched taxa to an existing otus block", { orig <- get_taxa(nex) x <- get_characters_list(nex) nex@otus[[1]]@otu <- new("ListOfotu", nex@otus[[1]]@otu[1:5]) # chop off some of the otu values new_taxa <- rownames(x[[1]]) nex2 <- RNeXML:::add_otu(nex, new_taxa, append=TRUE) # add them back ## should have same contents as orig... get_taxa(nex2) expect_identical(sort(orig$label), sort(get_taxa(nex2)$label)) ## Note that otu ids are not unique when we chop them off ... }) ## FIXME add_characters needs a method to add character names of states ## and then we need a test for that method test_that("we can add characters to a nexml file using a data.frame", { f <- system.file("examples", "comp_analysis.xml", package="RNeXML") nex <- read.nexml(f) x <- get_characters(nex) nexml <- add_characters(x) ## Can we write it out and read it back? 
  nexml_write(nexml, file = "chartest.xml")
  tmp <- nexml_read("chartest.xml")
  tmp_x <- get_characters(tmp)
  expect_is(tmp_x, "data.frame")
  expect_is(tmp, "nexml")
  unlink("chartest.xml")
})

## based on bug on 2014-03-12 65ae459523c529452adb699c3d5d118c0a207402
test_that("we can add multiple character matrices to a nexml file", {
  library("geiger")
  data(geospiza)
  data(primates)
  nex <- add_characters(geospiza$dat)
  nex <- add_characters(primates$dat, nex)
  expect_is(nex, "nexml")
})
RNeXML/tests/testthat/test_concatenate.R0000644000176200001440000000544212731606043017776 0ustar liggesusers
context("concatenate method")

test_that("we can concatenate two files with unique ids", {
  f1 <- system.file("examples", "trees.xml", package="RNeXML")
  f2 <- system.file("examples", "comp_analysis.xml", package="RNeXML")
  nex1 <- read.nexml(f1)
  nex2 <- read.nexml(f2)
  expect_is(c(nex1, nex2), "nexml")
})

test_that("we get an error if the files to be concatenated have non-unique ids", {
  f1 <- system.file("examples", "trees.xml", package="RNeXML")
  nex1 <- read.nexml(f1)
  nex2 <- read.nexml(f1)
  expect_error(c(nex1, nex2),"ids are not unique")
})

test_that("we can concatenate meta elements", {
  out <- c(meta(content="example", property="dc:title"),
           meta(content="Carl", property="dc:creator"))
  expect_is(out, "ListOfmeta")
  sapply(out, expect_is, "meta")
})

test_that("we can concatenate meta elements with empty ListOfmeta elements", {
  ## Doesn't trigger our method if x is not class `meta`
  out <- c(new("ListOfmeta"),
           meta(content="example", property="dc:title"),
           meta(content="Carl", property="dc:creator"))
  expect_is(out, "ListOfmeta")
  sapply(out, expect_is, "meta")

  ## in any order
  out <- c(meta(content="example", property="dc:title"),
           meta(content="Carl", property="dc:creator"),
           new("ListOfmeta"))
  expect_is(out, "ListOfmeta")
  sapply(out, expect_is, "meta")
})

test_that("we can concatenate meta elements with ListOfmeta elements", {
  out <- c(meta(content="example", property="dc:title"),
meta(content="Carl", property="dc:creator")) out <- c(out, meta("skos:note", "an editorial note")) expect_is(out, "ListOfmeta") sapply(out, expect_is, "meta") out <- c(meta("skos:note", "another editorial note"), out) expect_is(out, "ListOfmeta") sapply(out, expect_is, "meta") }) test_that("we can concatenate two ListOfmeta elements", { metalist <- c(meta(content="example", property="dc:title"), meta(content="Carl", property="dc:creator")) out <- c(metalist, metalist) expect_is(out, "ListOfmeta") expect_is(out[[1]], "meta") expect_equal(length(out), 4) }) test_that("we can concatenate a ListOfmeta and a meta", { metalist <- c(meta(content="example", property="dc:title"), meta(content="Carl", property="dc:creator")) out <- c(metalist, meta(content="a", property="b")) expect_is(out, "ListOfmeta") expect_is(out[[1]], "meta") expect_equal(length(out), 3) }) test_that("we can read in a file with existing meta and append without overwriting", { f <- system.file("examples/biophylo.xml", package="RNeXML") nex <- nexml_read(f) g <- tempfile() nexml_write(nex, g) nex2 <- nexml_read(g) expect_gt(length(get_metadata(nex2)), length(get_metadata(nex))) }) RNeXML/tests/test-all.R0000644000176200001440000000005312641021656014332 0ustar liggesuserslibrary("testthat") test_check("RNeXML") RNeXML/NAMESPACE0000644000176200001440000000257412734324011012543 0ustar liggesusers# Generated by roxygen2: do not edit by hand S3method(nexml_read,XMLInternalDocument) S3method(nexml_read,XMLInternalNode) S3method(nexml_read,character) export(add_basic_meta) export(add_characters) export(add_meta) export(add_namespaces) export(add_trees) export(flatten_multiphylo) export(get_characters) export(get_characters_list) export(get_citation) export(get_flat_trees) export(get_level) export(get_license) export(get_metadata) export(get_namespaces) export(get_otu) export(get_otus_list) export(get_rdf) export(get_taxa) export(get_taxa_list) export(get_trees) export(get_trees_list) export(meta) 
export(nexml_add) export(nexml_figshare) export(nexml_get) export(nexml_publish) export(nexml_read) export(nexml_to_simmap) export(nexml_validate) export(nexml_write) export(read.nexml) export(reset_id_counter) export(simmap_to_nexml) export(taxize_nexml) export(write.nexml) import(XML) import(ape) import(httr) import(methods) import(plyr) import(reshape2) import(taxize) import(uuid) importFrom(dplyr,"%>%") importFrom(dplyr,bind_rows) importFrom(dplyr,left_join) importFrom(dplyr,matches) importFrom(dplyr,mutate_) importFrom(dplyr,select) importFrom(dplyr,select_) importFrom(lazyeval,interp) importFrom(stats,na.omit) importFrom(stats,setNames) importFrom(stringr,str_replace) importFrom(tidyr,spread) importFrom(utils,capture.output) importFrom(utils,head) importFrom(utils,object.size) RNeXML/demo/0000755000176200001440000000000012641021656012246 5ustar liggesusersRNeXML/demo/sparql.R0000644000176200001440000000511212641021656013672 0ustar liggesuserslibrary(rrdf) library(phytools) library(RNeXML) nexml <- nexml_read(system.file("examples/primates.xml", package="RNeXML")) # Extract the RDF graph from the nexml rdf <- get_rdf(system.file("examples/primates.xml", package="RNeXML")) saveXML(rdf, "rdf_meta.xml") # rrdf requires a file name, so we must write the XML out first graph <- load.rdf("rdf_meta.xml") # fetch the NCBI URI for the taxon that has rank 'Order', i.e. the root of the primates. The dot operator # '.' between clauses implies a join, in this case root <- sparql.rdf(graph, "SELECT ?uri WHERE { ?id . ?id ?uri }" ) # Define a function to get the name get_name <- function(id) { max <- length(nexml@otus[[1]]@otu) for(i in 1:max) { if ( nexml@otus[[1]]@otu[[i]]@id == id ) { label <- nexml@otus[[1]]@otu[[i]]@label label <- gsub(" ","_",label) return(label) } } } # define a recursive function to build newick recurse <- function(node){ # fetch the taxonomic rank and id string rank_query <- paste0( "SELECT ?rank ?id WHERE { ?id <",node,"> . 
?id ?rank }")
    result <- sparql.rdf(graph, rank_query)

    # get the local ID, strip URI part
    id <- result[2]
    id <- gsub("^.+#", "", id, perl = TRUE)

    # if rank is terminal, return the name
    if (result[1] == "http://rs.tdwg.org/ontology/voc/TaxonRank#Species") {
        return(get_name(id))
    }

    # recurse deeper
    else {
        child_query <- paste0( "SELECT ?uri WHERE { ?id <",node,"> . ?id ?uri }")
        children <- sparql.rdf(graph, child_query)

        # the newick can be made to contain interior node labels by inserting get_name(id), before the sep="" argument
        return(paste("(", paste(sapply(children, recurse), sep = ",", collapse = "," ), ")", sep = "", collapse = ""))
    }
}

# build the tree and visualize it
newick <- paste(recurse(root), ";", sep = "", collapse = "")
tree <- read.newick(text = newick)
collapsed <- collapse.singles(tree)
plot(collapsed, type = "cladogram")
RNeXML/demo/00Index0000644000176200001440000000011012641021656013370 0ustar liggesusers
sparql Example of using sparql queries to explore the metadata
RNeXML/NEWS0000644000176200001440000000671312734257730012037 0ustar liggesusers
NEWS
====

For a more fine-grained list of changes or to report a bug, consult

* [The issues log](https://github.com/ropensci/RNeXML/issues)
* [The commit log](https://github.com/ropensci/RNeXML/commits/master)

Versioning
----------

Releases will be numbered with the following semantic versioning format:

`<major>.<minor>.<patch>`

And constructed with the following guidelines:

* Breaking backward compatibility bumps the major (and resets the minor and patch)
* New additions without breaking backward compatibility bump the minor (and reset the patch)
* Bug fixes and misc changes bump the patch
* Following the RStudio convention, a .99 is appended after the patch number to indicate the development version on Github. Any version coming from Github will now use the .99 extension, which will never appear in a version number for the package on CRAN.

For more information on SemVer, please visit http://semver.org/.
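
As a rough illustration of the versioning rules above (a sketch only, not part of the package): the `bump()` helper below is hypothetical and exists just for this example, while `package_version()` is base R and is used here to show how a `.99` development version sorts relative to releases.

```r
# Hypothetical helper (not an RNeXML function): bump one component of a
# major.minor.patch version string and reset the lower components to zero.
bump <- function(version, level = c("major", "minor", "patch")) {
  level <- match.arg(level)
  parts <- as.integer(strsplit(version, ".", fixed = TRUE)[[1]])[1:3]
  i <- match(level, c("major", "minor", "patch"))
  parts[i] <- parts[i] + 1L
  if (i < 3L) parts[(i + 1L):3L] <- 0L
  paste(parts, collapse = ".")
}

bump("2.0.7", "patch")  # bug fix:                      "2.0.8"
bump("2.0.7", "minor")  # backward-compatible addition: "2.1.0"
bump("2.0.7", "major")  # breaking change:              "3.0.0"

# A development build carries the extra .99 component; it sorts after the
# release it is based on and before the next patch release.
dev <- package_version("2.0.7.99")
stopifnot(dev > package_version("2.0.7"),
          dev < package_version("2.0.8"))
```
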

v2.0.7
------

- Bugfixes following release of new dplyr and new tidyr dependencies

v2.0.6
------

- Migrate Additional_repositories to new address for OmegaHat project.

v2.0.5
-------

- `get_metadata()`, `get_taxa()` now return much richer `data.frames` instead of named vectors.
  This is potentially a non-backwards compatible change if scripts use the output of these functions as lists (#129).
  See updated metadata vignette. This introduces new dependencies `dplyr` and `lazyeval`.
- more robust `nexml_read()` method for URLs, (#123)
- Avoid assuming the namespace prefix `nex` for nexml elements (#51, #124, #126).
  Includes a fix server-side on the NeXML validator as well.
- `nexml_validate()` points to the new validator. (#126)

v2.0.4
-------

- Fix compatibility issue with recent phytools release.

v2.0.3
------

- Upgrade tests to be compatible with newest testthat (0.10.0), bumps testthat dependency version up (#119) thanks @hadley

v2.0.2
------

- Add four new vignettes describing the use of various advanced features in the package:
  the use of SPARQL queries, advanced use of metadata features, an example of how to extend
  NeXML with simmap data as the use case, and documentation on the central S4 data structure
  used in the package.
- Implements the use of Title Case in the package title, as requested (on several occasions) by the CRAN maintainers.

v2.0.1
-------

- Update DESCRIPTION to provide a standard `install.packages()` compatible repository for `rrdf`, as per request from the CRAN team.

v2.0.0
---------

* add URL and BugReports to Description. [#103](https://github.com/ropensci/RNeXML/issues/103)
* for consistency with other `add_` methods, the `nexml` object is now the _last_, not the _first_, argument to `add_basic_meta`.
  As this changes the function API, it could break code that does not explicitly name the arguments, so we release this as 2.0.0

v1.1.3
------

Minor bugfix

* Fixes typo that caused validator to fail when nexml.org couldn't be reached

v1.1.2
-------

Less aggressive unit-tests

* nexml_validate now returns NULL if the validation cannot be performed. Unit tests now consider either TRUE or NULL as acceptable.
* Just skips the uuid unit test if uuid package is not available
* Documented versioning practice in NEWS

v1.1.1
------

Documentation and less aggressive unit tests

* Unit tests relying on the Figshare API are not run (without failing) if authentication to figshare server fails
* Documentation updated to include examples for all functions

v1.1-0
------

Initial Release
RNeXML/data/0000755000176200001440000000000012641021656012233 5ustar liggesusers
RNeXML/data/simmap_ex.rda0000644000176200001440000002705012641021656014711 0ustar liggesusers
RNeXML/R/0000755000176200001440000000000012734324011011515 5ustar liggesusers
RNeXML/R/get_level.R0000644000176200001440000001017012641021656013613 0ustar liggesusers
## Should be all element names (unless they are also attribute names!), since we only want attribute names
## (alternately we should define to only grab possible attribute names..)
SKIP = c("meta", "children", "member", "row", "cell", "seq", "matrix", "format", "names")

#' get_level
#'
#' get a data.frame of attribute values of a given node
#'
#' @param nex a nexml object
#' @param level a character vector indicating the class of node, see details
#'
#' @return Returns the attributes of specified class of nodes as a data.frame
#'
#' @details level should be a character vector giving the path to the specified node
#' group. For instance, `otus`, `characters`, and `trees` are top-level blocks (i.e.
#' child nodes of the root nexml block), and can be specified directly. To get metadata
#' for all "char" elements from all characters blocks, you must specify that `char` nodes
#' are child nodes to `characters` nodes: e.g. `get_level(nex, "characters/char")`,
#' or similarly for states: `get_level(nex, "characters/states")`.
#'
#' The return object is a data frame whose columns are the attribute names of the elements
#' specified. The column names match the attribute names except for the "id" attribute, for which the column
#' is renamed using the node itself. (Thus the `id` attribute of an `otus` element would be rendered
#' in a data.frame with a column called "otus" instead of "id"). Additional columns are
#' added for each parent element in the path; e.g. `get_level(nex, "otus/otu")` would include a column
#' named "otus" with the id of each otus block. Even though the method always returns the data frame
#' for all matching nodes in all blocks, these ids let you see which otu values came from which
#' otus block. This is identical to the function call `get_taxa()`.
#' Similarly, `get_level(nex, "otus/otu/meta")` would return additional columns 'otus' and
#' also a column, 'otu', with the otu parent ids of each metadata block. (This is identical to a
#' function call to `get_metadata`).
#' This makes it easier to join data.frames as well, see examples
#'
#' @export
#' @importFrom dplyr select
get_level <- function(nex, level){
  lvl <- strsplit(level, "/")[[1]]
  out <- recursion(1, lvl)(nex) %>%
    dplyr::select_(quote(-nexml))
  ## drop columns that are all-na?
  # all_na <- sapply(out, function(x) all(is.na(x)))
  # out <- out[!all_na]
  out
}

## Trick to apply nodelist_to_df iteratively
closure <- function(level, fun)
  function(node) nodelist_to_df(node, level, fun)

recursion <- function(i, level){
  if(i < length(level))
    closure(level[i], recursion(i+1, level))
  else
    closure(level[i], attributes_to_row)
}

## Assumes slot(node, element) is a list
#' @importFrom lazyeval interp
#' @importFrom dplyr bind_rows mutate_ %>%
nodelist_to_df <- function(node, element, fn){
  dots <- setNames(list(lazyeval::interp(~x, x = node_id(node))), class(node))
  nodelist <- slot(node, element)
  if(is.list(nodelist)){ ## node has a list of elements
    nodelist %>%
      lapply(fn) %>%
      dplyr::bind_rows() %>%
      dplyr::mutate_(.dots = dots) -> out
  } else { ## handle case when node has only one element
    fn(nodelist) %>%
      dplyr::mutate_(.dots = dots)
  }
}

node_id <- function(node){
  if("id" %in% slotNames(node))
    slot(node, "id")
  else
    "root"
}

attributes_to_row <- function(node){
  who <- slotNames(node)
  ## Avoid things that are not attributes:
  types <- sapply(who, function(x) class(slot(node,x)))
  who <- who[ types %in% c("character", "integer", "numeric", "logical") ]
  if("names" %in% who)
    who <- who[!(who %in% "names")]
  ## Extract attributes, use NAs for numeric(0) / character(0) values
  tmp <- sapply(who, function(x) slot(node, x))
  tmp[sapply(tmp,length) < 1] <- NA
  ## Coerce into a row of a data.frame & rename id column to match class
  out <- data.frame(as.list(tmp), stringsAsFactors=FALSE)
  if("id" %in% who)
    out <- dplyr::rename_(out, .dots = setNames("id", class(node)))
  out
}

## Deprecated method, still in use in some other functions
setxpath <- function(object){
  tmp <- tempfile()
suppressWarnings(saveXML(object, tmp)) doc <- xmlParse(tmp) unlink(tmp) doc } RNeXML/R/classes.R0000644000176200001440000005636212641021656013317 0ustar liggesusers setGeneric("fromNeXML", function(obj, from) standardGeneric("fromNeXML")) setGeneric("toNeXML", valueClass="XMLInternalElementNode", function(object, parent) standardGeneric("toNeXML")) # Rather verbose methods definitions manually reading and writing each class... # Rather than just slot matching names, we explicitly map each attribute... ############################## setClass("Base", slots = c('xsi:type' = "character")) setMethod("toNeXML", signature("Base", "XMLInternalElementNode"), function(object, parent){ type <- slot(object, "xsi:type") if(length(type) > 0){ #if(is.na(pmatch("nex:", type))) # nex or relevant namespace should come from default anyway # type <- paste0("nex:", type) addAttributes(parent, "xsi:type" = type, suppressNamespaceWarning=TRUE) # We always define xsi namespace in the header... } parent }) setMethod("fromNeXML", signature("Base", "XMLInternalElementNode"), function(obj, from){ if(!is.null(xmlAttrs(from))){ if(!is.na(xmlAttrs(from)["type"])) ## FIXME use [["type"]] or ["type"] slot(obj, "xsi:type") <- as.character(xmlAttrs(from)["type"]) if(!is.na(xmlAttrs(from)["xsi:type"])) ## Shouldn't be necessary but seems to be for first test in test_inheritance.R... 
slot(obj, "xsi:type") <- as.character(xmlAttrs(from)["xsi:type"]) } obj } ) ######################### setClass("Meta", slots = c(children = "list"), contains = "Base") setMethod("fromNeXML", signature("Meta", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() } ) setMethod("toNeXML", signature("Meta", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() }) ######################### setClass("LiteralMeta", slots = c(id = "character", property = "character", datatype = "character", content = "character"), contains="Meta") setMethod("fromNeXML", signature("LiteralMeta", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() attrs <- xmlAttrs(from) obj@property <- attrs[["property"]] if(!is.na(attrs["datatype"])) obj@datatype <- attrs[["datatype"]] if(!is.na(attrs["content"])) obj@content <- attrs[["content"]] if(!is.na(attrs["id"])) obj@id <- attrs[["id"]] obj } ) setMethod("toNeXML", signature("LiteralMeta", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() attrs <- c(id = unname(object@id), property = unname(object@property), # required datatype = unname(object@datatype), # optional content = unname(object@content)) # required attrs <- plyr::compact(attrs) addAttributes(parent, .attrs = attrs) }) setAs("XMLInternalElementNode", "LiteralMeta", function(from) fromNeXML(new("LiteralMeta"), from)) setAs("LiteralMeta", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("meta"))) setAs("LiteralMeta", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("meta"))) ############################################## setClass("ResourceMeta", slots = c(id = "character", rel = "character", href = "character"), contains="Meta") setMethod("fromNeXML", signature("ResourceMeta", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() attrs <- xmlAttrs(from) obj@href <- attrs[["href"]] if(!is.na(attrs["id"])) obj@id <- attrs[["id"]] 
               if(!is.na(attrs["rel"]))
                   obj@rel <- attrs[["rel"]]
               obj
            }
)
setMethod("toNeXML",
          signature("ResourceMeta", "XMLInternalElementNode"),
          function(object, parent){
            parent <- callNextMethod()
            attrs <- c(id = unname(object@id),
                       href = unname(object@href),
                       rel = unname(object@rel))
            attrs <- plyr::compact(attrs)
            addAttributes(parent, .attrs = attrs)
          })
setAs("XMLInternalElementNode", "ResourceMeta", function(from) fromNeXML(new("ResourceMeta"), from))
setAs("ResourceMeta", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("meta")))
setAs("ResourceMeta", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("meta")))

##############################################

setClass("meta", contains=c("LiteralMeta", "ResourceMeta"))
setAs("XMLInternalElementNode", "meta", function(from){
         type <- xmlAttrs(from)["type"]
         if(is.na(type)) ## FIXME This is CRUDE
           type <- xmlAttrs(from)["xsi:type"]
         if(is.na(type)) # if still not defined...
           fromNeXML(new("meta"), from)
         else {
           type <- gsub(".*:", "", type) ## FIXME This is CRUDE
           fromNeXML(new(type[1]), from)
         }
})
setAs("meta", "XMLInternalElementNode", function(from){
      if(length( slot(from, "xsi:type") ) > 0 ){
        if(grepl("LiteralMeta|ResourceMeta", slot(from, "xsi:type")))
          m <- as(from, slot(from, "xsi:type"))
      } else
        m <- from
      toNeXML(m, newXMLNode("meta", .children = from@children))
})
setAs("meta", "XMLInternalNode", function(from) as(from, "XMLInternalElementNode"))

# Methods inherited automatically?
############################################### setClass("ListOfmeta", slots = c(names="character"), contains = "list") ############################################### setClass("Annotated", slots = c(meta = "ListOfmeta", about = "character"), contains = "Base") setMethod("fromNeXML", signature("Annotated", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() kids <- xmlChildren(from) if(length(kids) > 0) obj@meta <- new("ListOfmeta", lapply(kids[names(kids) == "meta"], as, "meta")) if(!is.null(xmlAttrs(from))) if(!is.na(xmlAttrs(from)["about"])) obj@about <- xmlAttrs(from)["about"] obj }) setMethod("toNeXML", signature("Annotated", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() addChildren(parent, kids = object@meta) if(length(object@about) > 0) addAttributes(parent, "about" = object@about) parent }) ###################################################### setClass("Labelled", slots = c(label = "character"), contains = "Annotated") setMethod("fromNeXML", signature("Labelled", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() if(!is.na(xmlAttrs(from)["label"])) obj@label <- xmlAttrs(from)["label"] obj } ) setMethod("toNeXML", signature("Labelled", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() if(length(object@label) > 0) addAttributes(parent, "label" = object@label) parent }) ############################## setClass("IDTagged", slots = c(id = "character"), contains = "Labelled") setMethod("fromNeXML", signature("IDTagged", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() if(!is.na(xmlAttrs(from)["id"])) obj@id <- as.character(xmlAttrs(from)["id"]) obj } ) setMethod("toNeXML", signature("IDTagged", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() if(length(object@id) > 0) addAttributes(parent, "id" = object@id) parent }) ############################## setClass("OptionalTaxonLinked", slots = c(otu = 
"character"), contains = "IDTagged") setMethod("fromNeXML", signature("OptionalTaxonLinked", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() if(!is.na(xmlAttrs(from)["otu"])) obj@otu <- as.character(xmlAttrs(from)["otu"]) obj } ) setMethod("toNeXML", signature("OptionalTaxonLinked", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() if(length(object@otu) > 0) addAttributes(parent, "otu" = object@otu) parent }) ############################## setClass("TaxaLinked", slots = c(otus = "character"), contains = "IDTagged") setMethod("fromNeXML", signature("TaxaLinked", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() if(!is.na(xmlAttrs(from)["otus"])) obj@otus <- as.character(xmlAttrs(from)["otus"]) obj } ) setMethod("toNeXML", signature("TaxaLinked", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() if(length(object@otus) > 0) addAttributes(parent, "otus" = object@otus) parent }) ############################## Really AbstractNode setClass("node", slots = c(root = "logical"), contains = "OptionalTaxonLinked") setMethod("fromNeXML", signature("node", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() if(!is.na(xmlAttrs(from)["root"])) obj@root <- as.logical(xmlAttrs(from)["root"]) obj } ) setMethod("toNeXML", signature("node", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() if(length(object@root) > 0) addAttributes(parent, "root" = tolower(object@root)) parent }) setAs("node", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("node"))) setAs("node", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("node"))) setAs("XMLInternalElementNode", "node", function(from) fromNeXML(new("node"), from)) ################################ Really AbstractEdge setClass("edge", slots = c(source = "character", target = "character", length = "numeric"), contains="IDTagged") setMethod("fromNeXML", 
signature("edge", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() attrs <- xmlAttrs(from) obj@source <- attrs["source"] obj@target <- attrs["target"] if(!is.na(attrs["length"])) obj@length <- as.numeric(attrs["length"]) obj } ) setMethod("toNeXML", signature("edge", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() addAttributes(parent, "source" = object@source) addAttributes(parent, "target" = object@target) if(length(object@length) > 0) addAttributes(parent, "length" = object@length) parent }) setAs("edge", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("edge"))) setAs("edge", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("edge"))) setAs("XMLInternalElementNode", "edge", function(from) fromNeXML(new("edge"), from)) ################################################## setClass("rootEdge", slots = c(source = "character", target = "character", length = "numeric"), contains="IDTagged") setMethod("fromNeXML", signature("rootEdge", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() attrs <- xmlAttrs(from) obj@target <- attrs["target"] if(!is.na(attrs["length"])) obj@length <- as.numeric(attrs["length"]) obj } ) setMethod("toNeXML", signature("rootEdge", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() addAttributes(parent, "target" = object@target) if(length(object@length) > 0) addAttributes(parent, "length" = object@length) parent }) setAs("rootEdge", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("rootedge"))) setAs("rootEdge", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("rootedge"))) setAs("XMLInternalElementNode", "rootEdge", function(from) fromNeXML(new("rootEdge"), from)) ################################ alternatively called "Taxon" by the schema setClass("otu", contains = "IDTagged") setMethod("fromNeXML", signature("otu", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() 
obj }) setMethod("toNeXML", signature("otu", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() parent }) setAs("otu", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("otu"))) setAs("otu", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("otu"))) setAs("XMLInternalElementNode", "otu", function(from) fromNeXML(new("otu"), from)) ################################ alternatively called Taxa by the schema setClass("ListOfotu", slots = c(names="character"), contains = "list", validity = function(object) if(!all(sapply(object, is, "otu"))) "not all elements are otu objects" else TRUE) ############################### setClass("otus", slots = c(otu = "ListOfotu", names="character"), contains = "IDTagged") setMethod("fromNeXML", signature("otus", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() kids <- xmlChildren(from) if(length(kids) > 0) obj@otu <- new("ListOfotu", lapply(kids[names(kids) == "otu"], as, "otu")) obj }) setMethod("toNeXML", signature("otus", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() addChildren(parent, kids = object@otu) parent }) setAs("otus", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("otus"))) setAs("otus", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("otus"))) setAs("XMLInternalElementNode", "otus", function(from) fromNeXML(new("otus"), from)) ################################ setClass("ListOfedge", slots = c(names="character"), contains = "list", validity = function(object) if(!all(sapply(object, is, "edge"))) "not all elements are edge objects" else TRUE) setClass("ListOfnode", slots = c(names="character"), contains = "list", validity = function(object) if(!all(sapply(object, is, "node"))) "not all elements are node objects" else TRUE) ################################## actually AbstractTree setClass("tree", slots = c(node = "ListOfnode", edge = "ListOfedge", rootedge = "rootEdge"), # Actually 
AbstractRootEdge contains = "IDTagged") setMethod("fromNeXML", signature("tree", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() kids <- xmlChildren(from) obj@node <- new("ListOfnode", lapply(kids[names(kids) == "node"], as, "node")) obj@edge <- new("ListOfedge", lapply(kids[names(kids) == "edge"], as, "edge")) obj }) setMethod("toNeXML", signature("tree", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() addChildren(parent, kids = object@node) addChildren(parent, kids = object@edge) parent }) setAs("tree", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("tree"))) setAs("tree", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("tree"))) setAs("XMLInternalElementNode", "tree", function(from) fromNeXML(new("tree"), from)) ################################################ setClass("ListOftree", slots = c(names="character"), contains = "list") # validity can contain tree or network nodes? setClass("trees", slots = c(tree = "ListOftree"), # Can contain networks... 
contains = "TaxaLinked") setMethod("fromNeXML", signature("trees", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() kids <- xmlChildren(from) obj@tree <- new("ListOftree", lapply(kids[names(kids) == "tree"], as, "tree")) obj }) setMethod("toNeXML", signature("trees", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() addChildren(parent, kids = object@tree) # addChildren(parent, kids = object@network) parent }) setAs("trees", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("trees"))) setAs("trees", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("trees"))) setAs("XMLInternalElementNode", "trees", function(from) fromNeXML(new("trees"), from)) #################################################### setClass("ListOfotus", slots = c(names="character"), contains = "list") setClass("ListOftrees", slots = c(names="character"), contains = "list") setClass("ListOfcharacters", slots = c(names="character"), contains = "list") #################################################### nexml_namespaces <- c("nex" = "http://www.nexml.org/2009", "xsi" = "http://www.w3.org/2001/XMLSchema-instance", "xml" = "http://www.w3.org/XML/1998/namespace", "cdao" = "http://purl.obolibrary.org/obo/cdao.owl", "xsd" = "http://www.w3.org/2001/XMLSchema#", "dc" = "http://purl.org/dc/elements/1.1/", "dcterms" = "http://purl.org/dc/terms/", "ter" = "http://purl.org/dc/terms/", "prism" = "http://prismstandard.org/namespaces/1.2/basic/", "cc" = "http://creativecommons.org/ns#", "ncbi" = "http://www.ncbi.nlm.nih.gov/taxonomy#", "tc" = "http://rs.tdwg.org/ontology/voc/TaxonConcept#") setClass("nexml", slots = c(version = "character", generator = "character", "xsi:schemaLocation" = "character", # part of base? namespaces = "character", # part of base? 
otus = "ListOfotus", trees = "ListOftrees", characters="ListOfcharacters"), prototype = prototype(version = "0.9", generator = "RNeXML", "xsi:schemaLocation" = "http://www.nexml.org/2009/nexml.xsd", namespaces = c(nexml_namespaces, "http://www.nexml.org/2009")), contains = "Annotated") setMethod("fromNeXML", signature("nexml", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() # handle attributes attrs <- xmlAttrs(from) obj@version <- attrs["version"] # required attribute if(!is.na(attrs["generator"])) # optional attribute obj@generator <- attrs["generator"] if(!is.na(attrs["xsi:schemaLocation"])) slot(obj, "xsi:schemaLocation") <- attrs["xsi:schemaLocation"] if(!is.na(attrs["schemaLocation"])) slot(obj, "xsi:schemaLocation") <- attrs["schemaLocation"] if(!is.na(attrs["xsi:type"])) slot(obj, "xsi:type") <- attrs["xsi:type"] if(!is.na(attrs["type"])) slot(obj, "xsi:type") <- attrs["type"] if(!is.na(attrs["about"])) obj@about <- attrs["about"] ns_defs <- xmlNamespaceDefinitions(from) ns <- sapply(ns_defs, `[[`, "uri") obj <- add_namespaces(ns, obj) # Handle children kids <- xmlChildren(from) # at least 1 OTU block is required obj@otus <- new("ListOfotus", lapply(kids[names(kids) == "otus"], as, "otus")) if("characters" %in% names(kids)) obj@characters <- new("ListOfcharacters", lapply(kids[names(kids) == "characters"], as, "characters")) if("trees" %in% names(kids)) obj@trees <- new("ListOftrees", lapply(kids[names(kids) == "trees"], as, "trees")) obj }) setMethod("toNeXML", signature("nexml", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() addAttributes(parent, "version" = object@version) if(length(object@generator)>0) addAttributes(parent, "generator" = object@generator) # Coercion of object to XML happens automatically addChildren(parent, kids = object@otus) # a list of "otus" objects addChildren(parent, kids = object@trees) # a list of "trees" objects addChildren(parent, kids = object@characters) # a list of 
"characters" objects parent }) ## NOTE: The root nexml element must have it's namespace setAs("nexml", "XMLInternalNode", function(from) suppressWarnings(toNeXML(from, newXMLNode("nex:nexml", namespaceDefinitions = from@namespaces)))) setAs("nexml", "XMLInternalElementNode", function(from) suppressWarnings(toNeXML(from, newXMLNode("nex:nexml", namespaceDefinitions = from@namespaces)))) setAs("XMLInternalElementNode", "nexml", function(from) fromNeXML(new("nexml"), from)) ####################################################### RNeXML/R/add_characters.R0000644000176200001440000002170312641021656014600 0ustar liggesusers#################### Write character matices into S4 ##################### #' Add character data to a nexml object #' #' @param x character data, in which character traits labels are column names #' and taxon labels are row names. x can be in matrix or data.frame #' format. #' @param nexml a nexml object, if appending character table to an existing #' nexml object. If ommitted will initiate a new nexml object. #' @param append_to_existing_otus logical. If TRUE, will add any new taxa #' (taxa not matching any existing otus block) to the existing (first) #' otus block. Otherwise (default), a new otus block is created, even #' though it may contain duplicate taxa to those already present. While #' FALSE is the safe option, TRUE may be appropriate when building nexml #' files from scratch with both characters and trees. #' @include classes.R #' @examples #' library("geiger") #' data(geospiza) #' geiger_nex <- add_characters(geospiza$dat) #' @export add_characters <- function(x, nexml = new("nexml"), append_to_existing_otus=FALSE){ # FIXME does it make sense to take a phylo object here as an option? # If so, perhaps don't call the argument 'nexml'. # If not, then we don't really need this conversion. (maybe a type-check instead). 
nexml <- as(nexml, "nexml") ## Check types & row names ## x <- format_characters(x) j <- length(nexml@characters) ## add after any existing character matrices nexml <- add_character_nodes(nexml, x) for(i in 1:length(x)){ new_taxa <- rownames(x[[i]]) nexml <- add_otu(nexml, new_taxa, append=append_to_existing_otus) ## Add the otus id to the characters node otus_id <- nexml@otus[[length(nexml@otus)]]@id nexml@characters[[i+j]]@otus <- get_by_id(nexml@otus, otus_id)@id nexml <- add_char(nexml, x, i, j) nexml <- add_states(nexml, x, i, j) } for(i in 1:length(x)){ nexml <- add_rows(nexml, x, i, j) } nexml } add_character_nodes <- function(nexml, x){ n <- length(x) cs_list <- lapply(1:n, function(i){ uid <- nexml_id("cs") characters <- new("characters", id = uid, about = paste0("#", uid)) if(class(x[[i]][[1]]) == "numeric") ## Should be numeric but not integer! type <- "ContinuousCells" else ## Should be integer! type <- "StandardCells" slot(characters, "xsi:type") <- type characters }) cs_list <- c(nexml@characters, cs_list) nexml@characters <- new("ListOfcharacters", cs_list) nexml } otu_list <- function(to_add, prefix="ou"){ lapply(to_add, function(label){ uid <- nexml_id(prefix) new("otu", label=label, id =uid, about = paste0("#", uid))}) } add_otu <- function(nexml, new_taxa, append=FALSE){ current_taxa <- get_taxa_list(nexml) if(length(current_taxa) == 0) { # No otus exist, create a new node otus <- new_otus_block(nexml, new_taxa) nexml@otus <- new("ListOfotus", c(nexml@otus, otus)) } else { otu_pos <- lapply(current_taxa, function(current) match(new_taxa, current)) if(any(is.na(unlist(otu_pos)))){ # We have missing taxa if(append){ ## append to otus block `otus_id` ## otus_id <- 1 # position that matches the id string to_add <- new_taxa[sapply(otu_pos, is.na)] nexml@otus[[otus_id]]@otu <- new("ListOfotu", c(nexml@otus[[otus_id]]@otu, otu_list(to_add, "ou_char"))) ## FIXME hack to make sure new ids are 'unique', } else { ## Alternatively, do not append ## otus <- 
new_otus_block(nexml, new_taxa) nexml@otus <- new("ListOfotus", c(nexml@otus, otus)) } } # else # all taxa matched, so we're all set } nexml } new_otus_block <- function(nexml, to_add){ id <- nexml_id("os") new("ListOfotus", list(new("otus", id = id, about = paste0("#", id), otu = new("ListOfotu", otu_list(to_add)) ))) } # Turns char names into char nodes add_char <- function(nexml, x, i = 1, j = 0){ char_labels <- colnames(x[[i]]) char_list <- lapply(char_labels, function(lab){ id <- nexml_id("cr") char <- new("char", id = id, about = paste0("#", id), label = lab) }) nexml@characters[[i+j]]@format@char <- new("ListOfchar", char_list) nexml } add_states <- function(nexml, x, i = 1, J = 0){ # don't ctreate a states node if data is numeric if(all(sapply(x[[i]], is.numeric))) nexml else { nchars <- length(x[[i]]) char <- nexml@characters[[i+J]]@format@char states_list <- lapply(1:nchars, function(j){ lab <- char[[j]]@label lvls <- levels(x[[i]][[lab]]) id <- nexml_id("ss") states <- new("states", id = id, about = paste0("#", id), state = new("ListOfstate", lapply(lvls, function(lvl){ new("state", id=nexml_id("s"), symbol = as.integer(as.factor(lvl))) })) ) }) nexml@characters[[i+J]]@format@states <- new("ListOfstates", states_list) # Add the states's id to char for(j in 1:nchars) nexml@characters[[i+J]]@format@char[[j]]@states <- states_list[[j]]@id nexml } nexml } ## Assumes that otu ids have already been added to the nexml add_rows <- function(nexml, x, i = 1, j = 0){ X <- x[[i]] taxa <- rownames(X) char_labels <- colnames(X) ## get the relevant characters block and otus block cs <- nexml@characters[[i+j]]@id os <- nexml@characters[[i+j]]@otus otu_map <- get_otu_maps(nexml)[[os]] char_map <- get_char_maps(nexml)[[cs]] state_map <- get_state_maps(nexml)[[cs]] reverse_otu_map <- reverse_map(otu_map) reverse_char_map <- reverse_map(char_map) reverse_state_map <- reverse_map(state_map) mat <- new("obsmatrix", row = new("ListOfrow", lapply(taxa, function(taxon){ id = 
nexml_id("rw") new("row", id = id, about = paste0("#", id), label = taxon, otu = reverse_otu_map[taxon], cell = new("ListOfcell", lapply(char_labels, function(char){ state <- X[taxon,char] # unmapped char_id <- reverse_char_map[[char]] if(!is.null(state_map)) state <- reverse_state_map[[char_id]][state] new("cell", char = char_id, state = as.character(state)) })) ) })) ) nexml@characters[[i+j]]@matrix <- mat nexml } ## divide matrix into discrete and continuous trait matrices, if necessary ## then write each as separate nodes: ## x should now be a list of data.frames of common type format_characters <- function(x){ if(is(x, "numeric")) x <- as.data.frame(x) ## Actually useful conversions ## ## Matrices are either all-numeric or all-character class, so no risk of mixed discrete and continuous states. if(is(x, "matrix")){ x <- list(as.data.frame(x)) ## Data.frames can mix discrete and continuous states, so we need to separate them } else if(is(x, "data.frame") && dim(x)[2] > 1) { x <- split_by_class(x) } else if(is(x, "data.frame") && dim(x)[2] == 1) { x <- list(x) #### Ugh, this next bit isn't pretty. Maybe we should just hope lists are formatted correctly, e.g. come from get_character_list. ## If we're getting a list with matrices, coerce them into data.frames and hope for the best. ## If the list has data.frames, check that each one has consistent class type. ## Otherwise, panic. } else if(is(x, "list")) { for(i in 1:length(x)){ if(is(x[[i]], "matrix")) x[[i]] <- as.data.frame(x[[i]]) ## A list of matrices we can make into a list of data.frames... } ## Someone didn't even try to read the documentation... } else { stop("x must be a named numeric, matrix, data.frame, or list thereof") } ## Let's just hope folks read the documentation and have ## row names as taxa and column names as character traits. ## Kinda hard to check that for sure? 
## return the updated object: a list of data.frames x } ## Helper function for the above, contains the primary functionality # divide a data.frame into a list of data.frames, in which each has only a unique column class split_by_class <- function(x){ col.classes <- sapply(x, class) if(all(sapply(col.classes, identical, col.classes[1]))) x <- list(x) else { ## split into numerics and non-numerics cts <- unname(which(col.classes=="numeric")) discrete <- unname(which(col.classes!="numeric")) x <- list(x[cts], x[discrete]) } x } RNeXML/R/simmap.R0000644000176200001440000001560712641021656013145 0ustar liggesusers# simmap.R # if(!is.null(phy$maps)) ## FIXME write the characters/states block (but not matrix block) as well. ## FIXME support writing multiphylos, list of multiphylos to nexml #' simmap_to_nexml #' #' simmap_to_nexml #' @param phy a phy object containing simmap phy$maps element, #' from the phytools package #' @param state_ids a named character vector giving the state #' names corresponding to the ids used to refer to each state #' in nexml. If NULL, ids will be generated and states taken from #' the phy$states names. 
#' @return a nexml representation of the simmap #' @export #' @import XML #' @examples #' data(simmap_ex) #' phy <- nexml_to_simmap(simmap_ex) #' nex <- simmap_to_nexml(phy) simmap_to_nexml <- function(phy, state_ids = NULL){ ## Hack to deal with S3 class issues when coercing to S4 ## (identical() yields a single TRUE/FALSE; `==` on a length-2 class vector does not) if(identical(class(phy), c("simmap", "phylo"))) class(phy) <- "phylo" ## Create the NeXML object nexml <- as(phy, "nexml") if(!is.null(phy$states)){ nexml <- add_characters(data.frame(states = as.integer(as.factor(phy$states))), nexml) ## FIXME doesn't have states chars_ids <- get_state_maps(nexml)[[1]] char_id <- names(chars_ids) state_ids <- reverse_map(chars_ids[[1]]) # can assume no other states added yet since works on a phy, not nexml } if(!is.null(phy$maps)) nexml <- simmap_edge_annotations(maps = phy$maps, nexml = nexml, state_ids = state_ids, char_id = char_id) nexml } simmap_edge_annotations <- function(maps, nexml, state_ids = NULL, char_id = "simmapped_trait"){ ## if state ids are not given if(is.null(state_ids)){ state_ids <- levels(as.factor(names(unlist(maps)))) names(state_ids) <- state_ids } # Loop over all edges, adding the simmap annotation to each: for(i in 1:length(nexml@trees[[1]]@tree[[1]]@edge)){ ## FIXME check to assure this is always the correct order?? 
## Read the mapping of the current edge edge_map <- maps[[i]] ## Generate the list of XML "stateChange" nodes mapping <- lapply(1:length(edge_map), function(j){ ## A node has an order, a length and a state meta(property = "simmap:stateChange", children = list(meta(property = "simmap:order", content = j), meta(property = "simmap:length", content = edge_map[[j]]), meta(property = "simmap:state", content = state_ids[[names(edge_map[j])]] ) ) ) }) reconstruction <-meta(property = "simmap:reconstruction", children =c(list(meta(property="simmap:char", content = char_id)), mapping)) ## Insert the reconstructions into a meta element in each nexml edge nexml@trees[[1]]@tree[[1]]@edge[[i]]@meta <- c(meta(type = "LiteralMeta", property = "simmap:reconstructions", children = list(reconstruction))) } ## Return the entire nexml object nexml <- add_namespaces(c(simmap = "https://github.com/ropensci/RNeXML/tree/master/inst/simmap.md"), nexml) nexml } ## Returns list of multiPhylo ... #' nexml_to_simmap #' #' nexml_to_simmap #' @param nexml a nexml object #' @return a simmap object (phylo object with a $maps element #' for use in phytools functions). 
#' @export #' @examples #' data(simmap_ex) #' phy <- nexml_to_simmap(simmap_ex) #' nex <- simmap_to_nexml(phy) nexml_to_simmap <- function(nexml){ ## Get the statemap, if available characters <- get_characters(nexml) ## loop over trees blocks out <- lapply(nexml@trees, function(trees){ phys <- lapply(trees@tree, tree_to_simmap, get_otu_maps(nexml)[[trees@otus]], get_state_maps(nexml)[[1]][[1]] ) phys <- lapply(phys, characters_to_simmap, characters) names(phys) <- NULL class(phys) <- "multiPhylo" phys }) if(length(out) == 1){ if(length(out[[1]]) > 1){ flatten_multiphylo(out) } else { out[[1]][[1]] } } } characters_to_simmap <- function(phy, characters){ out <- as.character(characters[[1]]) # coerce factor to string names(out) <- rownames(characters) phy$states <- out phy } # given the nexml tree: tree_to_simmap <- function(tree, otus, state_maps = NULL){ maps <- lapply(tree@edge, function(edge){ reconstructions <- sapply(edge@meta, function(x) x@property == "simmap:reconstructions") if(any(reconstructions)) reconstruction <- edge@meta[[which(reconstructions)]]@children else { # handle exceptions warning("no simmap data found") return(toPhylo(tree, otus)) ## no simmap found } # lapply(reconstruction, function(reconstruction){ # for each reconstruction stateChange <- sapply(reconstruction[[1]]@children, function(x) x@property == "simmap:stateChange") values <- sapply(reconstruction[[1]]@children[which(stateChange)], function(stateChange){ # phytools only supports one reconstruction of one character per phy object property <- sapply(stateChange@children, function(x) x@property) names(stateChange@children) <- property # clean labels sapply(stateChange@children, function(x) x@content) }) out <- as.numeric(values["simmap:length", ]) ordering <- as.numeric(values["simmap:order",]) if(!is.null(state_maps)) states <- state_maps[values["simmap:state", ]] else states <- values["simmap:state", ] names(out) <- states out <- out[ordering] # sort according to explicit order out # 
}) }) names(maps) <- NULL phy <- toPhylo(tree, otus) phy$maps <- maps ## create the rest of the maps elements ## Return phylo object phy } #' @name simmap_ex #' @title A nexml class R object that includes simmap annotations #' @description A nexml object with simmap stochastic character mapping #' annotations added to the edges, for use with the RNeXML package #' parsing and serializing NeXML into formats that work with the ape and #' phytools packages. #' @docType data #' @usage simmap_ex #' @format a \code{nexml} instance #' @source Simulated tree and stochastic character mapping based on #' Revell 2011 (doi:10.1111/j.2041-210X.2011.00169.x) #' @author Carl Boettiger NULL ## Extend directly with XML representation instead of S4 #setClass("simmap:reconstructions", # slots = c(reconstruction = "ListOfreconstruction")) #setClass("ListOfreconstruction", contains = "list") #setClass("simmap:stateChange", contains = "IDTagged") # RNeXML/R/get_characters.R0000644000176200001440000001167012734324011014623 0ustar liggesusers#' Get character data.frame from nexml #' #' @param nex a nexml object #' @param rownames_as_col option to return character matrix rownames (with taxon ids) as its own column in the #' data.frame. Default is FALSE for compatibility with geiger and similar packages. #' @param otu_id logical, default FALSE. return a column with the #' otu id (for joining with otu metadata, etc) #' @param otus_id logical, default FALSE. return a column with the #' otus block id (for joining with otu metadata, etc) #' @return the character matrix as a data.frame #' @details RNeXML will attempt to return the matrix using the NeXML taxon (otu) labels to name the rows #' and the NeXML char labels to name the traits (columns). If these are unavailable or not unique, the NeXML #' id values for the otus or traits will be used instead. 
#' @importFrom tidyr spread #' @importFrom dplyr left_join select_ matches #' @importFrom stringr str_replace #' @importFrom stats setNames #' @export #' @examples #' \dontrun{ #' # A simple example with a discrete and a continous trait #' f <- system.file("examples", "comp_analysis.xml", package="RNeXML") #' nex <- read.nexml(f) #' get_characters(nex) #' #' # A more complex example -- currently ignores sequence-type characters #' f <- system.file("examples", "characters.xml", package="RNeXML") #' nex <- read.nexml(f) #' get_characters(nex) #' } get_characters <- function(nex, rownames_as_col=FALSE, otu_id = FALSE, otus_id = FALSE){ drop = lazyeval::interp(~-dplyr::matches(x), x = "about|xsi.type|format") otus <- get_level(nex, "otus/otu") %>% dplyr::select_(drop) %>% optional_labels(id_col = "otu") char <- get_level(nex, "characters/format/char") %>% dplyr::select_(drop) %>% optional_labels(id_col = "char") ## Rows have otu information rows <- get_level(nex, "characters/matrix/row") %>% dplyr::select_(.dots = c("otu", "row")) cells <- get_level(nex, "characters/matrix/row/cell") %>% dplyr::select_(.dots = c("char", "state", "row")) %>% dplyr::left_join(rows, by = "row") characters <- get_level(nex, "characters") ## States, including polymorphic states (or uncertain states) states <- get_level(nex, "characters/format/states/state") ## Include polymorphic and uncertain states polymorph <- get_level(nex, "characters/format/states/polymorphic_state_set") uncertain <- get_level(nex, "characters/format/states/uncertain_state_set") if(dim(polymorph)[1] > 0) states <- dplyr::bind_rows(states, polymorph) if(dim(uncertain)[1] > 0) states <- dplyr::bind_rows(states, uncertain) states <- dplyr::select_(states, drop) if(dim(states)[1] > 0) cells <- cells %>% dplyr::left_join(states, by = c("state")) %>% dplyr::select_(.dots = c("char", "symbol", "otu", "state")) ## Join the matrices. 
Note that we select unique column names after each join to avoid collisions cells %>% dplyr::left_join(char, by = c("char")) %>% dplyr::rename_(.dots = c("trait" = "label")) %>% dplyr::left_join(otus, by = c("otu")) %>% dplyr::rename_(.dots = c("taxa" = "label")) %>% na_symbol_to_state() %>% dplyr::select_(.dots = c("taxa", "symbol", "trait", "otu", "otus")) %>% tidyr::spread("trait", "symbol") -> out ## Identify the class of each column and reset it appropriately cellclass <- function(x){ x %>% stringr::str_replace(".*ContinuousCells", "numeric") %>% stringr::str_replace(".*StandardCells", "integer") } type <- get_level(nex, "characters/matrix/row/cell") %>% dplyr::select_(drop) %>% dplyr::left_join(characters, by = "characters") %>% dplyr::select_(.dots = c( "xsi.type", "char", "characters")) %>% dplyr::left_join(char, by = c("char")) %>% dplyr::select_(.dots = c("label", "xsi.type")) %>% dplyr::distinct() %>% dplyr::mutate_(.dots = setNames(list(~cellclass(xsi.type)), "class")) ## iterate over every row of `type`, not just the last one for(i in seq_len(dim(type)[1])) class(out[[type$label[i]]]) <- type$class[i] ## drop unwanted columns if requested (default) if(!otu_id){ out <- dplyr::select_(out, quote(-otu)) } if(!otus_id){ out <- dplyr::select_(out, quote(-otus)) } if(!rownames_as_col){ taxa <- out$taxa out <- dplyr::select_(out, quote(-taxa)) out <- as.data.frame(out) rownames(out) <- taxa out } out } ## If 'label' column is missing, create it from 'id' column ## if label exists but has missing or non-unique values, also use ids instead optional_labels <- function(df, id_col = "id"){ who <- names(df) if(! 
"label" %in% who) df$label <- df[[id_col]] if(length(unique(df$label)) < length(df$label)) df$label <- df[[id_col]] df } ## Continuous traits have the values in "state" column, whereas ## for discrete states we return the value of the "symbol" column na_symbol_to_state <- function(df){ if(is.null(df$symbol)) df$symbol <- NA df$symbol[is.na(df$symbol)] <- suppressWarnings(as.numeric(df$state[is.na(df$symbol)])) df } RNeXML/R/nexml_methods.R0000644000176200001440000000402412641021656014514 0ustar liggesusers #setMethod("head", signature("nexml"), function(x, n=6L, ...){ # write.nexml(x, "tmp123.xml") # txt <- readLines("tmp123.xml", n=n) # unlink("tmp123.xml") # cat(txt, "\n") #}) #setMethod("tail", signature("nexml"), function(x, n=6L, ...){ # write.nexml(x, "tmp123.xml") # txt <- readLines("tmp123.xml", n=-n) # unlink("tmp123.xml") # cat(txt, "\n") #}) setMethod("show", signature("nexml"), function(object){ summary(object) }) # FIXME: consider showing author/title/citation information if available? 
setMethod("summary", signature("nexml"), function(object){ doc <- xmlParse(write.nexml(object)) nmeta <- length(getNodeSet(doc, "//x:meta", namespaces="x")) ntree_blocks <- length(object@trees) n_per_block <- sapply(unname(object@trees), function(x) length(x@tree)) ntrees <- length(getNodeSet(doc, "//x:tree", namespaces="x")) ncharacters <- length(getNodeSet(doc, "//x:characters", namespaces="x")) notu <- length(getNodeSet(doc, "//x:otu", namespaces="x")) block_counts <- paste(sapply(1:length(n_per_block), function(i) paste("\t block", i, "contains", n_per_block[i], "phylogenetic trees")), sep= "", collapse = "\n") cat(paste("A nexml object representing:\n", "\t", ntree_blocks, "phylogenetic tree blocks, where:", "\n", block_counts, "\n", "\t", nmeta, "meta elements", "\n", "\t", ncharacters, "character matrices", "\n", "\t", notu, "taxonomic units", "\n", "Taxa: \t", paste(head(get_taxa(object)$label), collapse = ", "), "...", "\n\n", "NeXML generated by", object@generator, "using", "schema version:", object@version, "\n", "size:", capture.output(print(object.size(object), units="auto")), "\n")) }) RNeXML/R/nexml_publish.R0000644000176200001440000001040512641021656014517 0ustar liggesusers #' publish nexml files to the web and receive a DOI #' #' publish nexml files to the web and receive a DOI #' @param nexml a nexml object (or file path) #' @param ... additional arguments, depending on repository. See examples. 
#' @param repository destination repository #' @return a digital object identifier to the published data #' @export #' @examples \dontrun{ #' data(bird.orders) #' birds <- add_trees(bird.orders) #' doi <- nexml_publish(birds, visibility = "public", repository="figshare") #' } nexml_publish <- function(nexml, ..., repository="figshare"){ repository = match.arg(repository) switch(repository, figshare = nexml_figshare(nexml, ...)) } #' publish nexml to figshare #' #' publish nexml to figshare #' @param nexml a nexml object (or file path to a nexml file) #' @param file The filename desired for the object, if nexml is not already a file. #' if the first argument is already a path, this value is ignored. #' @param categories The figshare categories, must match available set. see \code{fs_add_categories} #' @param tags Any keyword tags you want to add to the data. #' @param visibility whether the results should be published (public), or kept private, #' or kept as a draft for further editing before publication. (New versions can be uploaded, #' but any former version that was once made public will always be archived and cannot be removed). #' @param id an existing figshare id (e.g. from fs_create), to which this file can be appended. #' @param ... additional arguments #' @return the figshare id of the object #' @export #' @examples \dontrun{ #' data(bird.orders) #' birds <- add_trees(bird.orders) #' doi <- nexml_figshare(birds, visibility = "public", repository="figshare") #' } nexml_figshare <- function(nexml, file = "nexml.xml", categories = "Evolutionary Biology", tags = list("phylogeny", "NeXML"), visibility = c("public", "private", "draft"), id = NULL, ...){ visibility = match.arg(visibility) # success <- require(rfigshare) # if(!success){ # message("rfigshare package not found. 
Attempting to install") # install.packages("rfigshare") # success <- require(rfigshare) # if(!success) # stop("The rfigshare package must be installed to publish data to figshare") # } # handle nexml as a file path or as an object if(!is(nexml, "nexml")){ if(file.exists(nexml)){ file <- nexml nexml <- nexml_read(nexml) } # else warning? } m <- get_metadata(nexml) if(is.null(id)){ id <- rfigshare::fs_create(title = m[["dc:title"]], description = m[["dc:description"]], type = "dataset") } doi <- paste("http://doi.org/10.6084/m9.figshare", id, sep=".") rfigshare::fs_add_authors(id, authors = m[["dc:creator"]]) rfigshare::fs_add_categories(id, categories) rfigshare::fs_add_tags(id, tags) # Use object DOI instead of figshare id when available? # Construct DOI from figshare id? nexml <- add_meta(meta("dc:identifier", doi), nexml) nexml_write(nexml, file) rfigshare::fs_upload(id, file) if (visibility == "private"){ rfigshare::fs_make_private(id) message(paste("Your data has been uploaded to figshare privately. You may make further edits and publish the data from the online control panel at figshare.com or by using the rfigshare package and the article_id:", id, ". Your doi has been reserved but will not resolve until the article is made public.")) } else if (visibility == "public"){ rfigshare::fs_make_public(id) message(paste("Your data is published and now accessible at", doi)) } else { message(paste("Your data has been uploaded to figshare as a draft. You may make further edits and publish the data from the online control panel at figshare.com or by using the rfigshare package and the article_id:", id, " Your doi has been reserved but will not resolve until the article is made public.")) } # FIXME consider not returning the DOI as a link. 
# Consider returning just the Figshare id # id } RNeXML/R/get_trees.R0000644000176200001440000001677712641021656013651 0ustar liggesusers#' extract all phylogenetic trees in ape format #' #' extract all phylogenetic trees in ape format #' @param nexml a representation of the nexml object from which the data is to be retrieved #' @return returns a list of lists of multiphylo trees, even if all trees are in the same `trees` node (and hence the outer list will be of length 1) or if there is only a single tree (and hence the inner list will also be of length 1). This guarantees a consistent return type regardless of the number of trees present in the nexml file, and also preserves any hierarchy/grouping of trees. #' @export #' @import plyr #' @examples #' comp_analysis <- system.file("examples", "comp_analysis.xml", package="RNeXML") #' nex <- nexml_read(comp_analysis) #' get_trees_list(nex) #' @seealso \code{\link{get_trees}} \code{\link{get_flat_trees}} \code{\link{get_item}} get_trees_list <- function(nexml) as(nexml, "multiPhyloList") #' extract a phylogenetic tree from the nexml #' #' extract a phylogenetic tree from the nexml #' @param nexml a representation of the nexml object from which the data is to be retrieved #' @return an ape::phylo tree, if only one tree is represented. Otherwise returns a multiPhylo object containing all trees, with a warning. To consistently receive the list of lists format (preserving the hierarchical nature of the nexml), use \code{\link{get_trees_list}} instead. 
#' @export #' @seealso \code{\link{get_trees_list}} \code{\link{get_flat_trees}} \code{\link{get_item}} #' @examples #' comp_analysis <- system.file("examples", "comp_analysis.xml", package="RNeXML") #' nex <- nexml_read(comp_analysis) #' get_trees(nex) get_trees <- function(nexml) as(nexml, "phylo") #' get_flat_trees #' #' extract a single multiPhylo object containing all trees in the nexml #' @details Note that this method collapses any hierarchical structure that may have been present as multiple `trees` nodes in the original nexml (though such a feature is rarely used). To preserve that structure, use \code{\link{get_trees_list}} instead. #' @return a multiPhylo object (list of ape::phylo objects). See details. #' @param nexml a representation of the nexml object from which the data is to be retrieved #' @export #' @seealso \code{\link{get_trees}} \code{\link{get_trees_list}} \code{\link{get_item}} #' @examples #' comp_analysis <- system.file("examples", "comp_analysis.xml", package="RNeXML") #' nex <- nexml_read(comp_analysis) #' get_flat_trees(nex) get_flat_trees <- function(nexml) flatten_multiphylo(get_trees_list(nexml)) ####### Coercion methods ######### setAs("nexml", "multiPhyloList", function(from){ map <- get_otu_maps(from) unname(lapply(from@trees, function(X){ out <- unname(lapply(X@tree, toPhylo, map[[X@otus]])) class(out) <- "multiPhylo" out })) }) # Always collapses all trees nodes into a multiphylo setAs("nexml", "multiPhylo", function(from){ map <- get_otu_maps(from) out <- unname(lapply(from@trees, function(X){ out <- unname(lapply(X@tree, toPhylo, map[[X@otus]])) class(out) <- "multiPhylo" out })) flatten_multiphylo(out) }) #' Flatten a multiphylo object #' #' @details NeXML has the concept of multiple `trees` nodes, each with multiple child `tree` nodes. #' This maps naturally to a list of multiphylo objects. Sometimes #' this hierarchy conveys important structural information, so it is not discarded by default. 
#' Occasionally it is useful to flatten the structure though, hence this function. Note that this #' discards the original structure, and the nexml file must be parsed again to recover it. #' @param object a list of multiphylo objects #' @export flatten_multiphylo <- function(object){ out <- unlist(object, FALSE, FALSE) class(out) <- "multiPhylo" out } setAs("nexml", "phylo", function(from){ if(length(from@trees[[1]]@tree) == 1){ maps <- get_otu_maps(from) otus_id <- from@trees[[1]]@otus out <- toPhylo(from@trees[[1]]@tree[[1]], maps[[otus_id]]) } else { warning("Multiple trees found, Returning multiPhylo object") out <- as(from, "multiPhylo") } if(length(out) == 1) out <- flatten_multiphylo(out) out }) ########### Main internal function for converting nexml to phylo ######## #' nexml to phylo #' #' nexml to phylo coercion #' @param tree an nexml tree element #' @param otus a character string of taxonomic labels, named by the otu ids, #' e.g. from get_otu_maps for the otus set matching the relevant trees node. #' @return phylo object. If a "reconstructions" annotation is found on the #' edges, return simmap maps slot as well. 
toPhylo <- function(tree, otus){ otu <- NULL # Avoid CRAN NOTE as per http://stackoverflow.com/questions/8096313/no-visible-binding-for-global-variable-note-in-r-cmd-check ## Extract the nodes list nodes <- sapply(unname(tree@node), function(x) c(node = unname(x@id), otu = missing_as_na(x@otu))) # If any edges have lengths, use this routine if(any(sapply(tree@edge, function(x) length(x@length) > 0))) edges <- sapply(unname(tree@edge), function(x) c(source = unname(x@source), target = unname(x@target), length = if(identical(x@length, numeric(0))) NA else unname(x@length), id = unname(x@id))) else # no edge lengths, use this routine edges <- sapply(unname(tree@edge), function(x) c(source = unname(x@source), target = unname(x@target), id = unname(x@id))) nodes <- data.frame(t(nodes), stringsAsFactors=FALSE) names(nodes) <- c("node", "otu") ## Identifies tip.label based on being named with OTUs while others are NULL ## FIXME Should instead decide that these are tips based on the edge labels? nodes <- cbind(plyr::arrange(nodes, otu), id = seq_len(nrow(nodes))) # Also warns because arrange isn't quoting the column name. ## NB: these ids are the ape:id numbers by which nodes are identified in ape::phylo ## Arbitrary ids are not supported - ape expects the numbers 1:n, starting with tips. ## nodes$node lists tip taxa first (see arrange fn above), since ## APE expects nodes numbered 1:n_tips to correspond to tips. 
source_nodes <- match(edges["source",], nodes$node) target_nodes <- match(edges["target",], nodes$node) ## Define elements of a phylo class object ## #--------------------------------------------# ## define edge matrix edge <- unname(cbind(source_nodes, target_nodes)) if("length" %in% rownames(edges)) edge.length <- as.numeric(edges["length",]) else edge.length <- NULL ## define tip labels tip_otus <- as.character(na.omit(nodes$otu)) tip.label <- otus[tip_otus] # Count internal nodes (assumes bifurcating tree. Does ape always assume this?) 
# Internal nodes are the nodes that are not tips, which also holds for non-bifurcating trees. Nnode <- nrow(nodes) - length(tip.label) # assemble the phylo object, assign class and return. phy = list(edge=edge, tip.label = unname(tip.label), Nnode = Nnode) if(!is.null(edge.length)) phy$edge.length = edge.length # optional fields class(phy) = "phylo" ## Check for simmap phy } ## Helper function missing_as_na <- function(x){ if(length(x) == 0) NA else unname(x) } RNeXML/R/add_meta.R0000644000176200001440000000536712641021656013417 0ustar liggesusers#' Add metadata to a nexml file #' #' @param meta a meta S4 object, e.g. output of the function \code{\link{meta}}, or a list of these meta objects #' @param nexml (S4) object #' @param level the level at which the metadata annotation should be added. #' @param namespaces named character vector for any additional namespaces that should be defined. #' @param i for otus, trees, characters: if there are multiple such blocks, which one should be annotated? Default is first/only block. #' @param at_id the id of the element to be annotated. Optional, advanced use only. #' @seealso \code{\link{meta}} \code{\link{add_trees}} \code{\link{add_characters}} \code{\link{add_basic_meta}} #' @return the updated nexml object #' @examples #' ## Create a new nexml object with a single metadata element: #' modified <- meta(property = "prism:modificationDate", content = "2013-10-04") #' nex <- add_meta(modified) # Note: 'prism' is defined in nexml_namespaces by default. 
#' #' ## Write multiple metadata elements, including a new namespace: #' website <- meta(href = "http://carlboettiger.info", #' rel = "foaf:homepage") # meta can be link-style metadata #' nex <- add_meta(list(modified, website), #' namespaces = c(foaf = "http://xmlns.com/foaf/0.1/")) #' #' ## Append more metadata, and specify a level: #' history <- meta(property = "skos:historyNote", #' content = "Mapped from the bird.orders data in the ape package using RNeXML") #' nex <- add_meta(history, #' nexml = nex, #' level = "nexml", #' namespaces = c(skos = "http://www.w3.org/2004/02/skos/core#")) #' #' @export add_meta #' @include classes.R #' add_meta <- function(meta, nexml=new("nexml"), level=c("nexml", "otus", "trees", "characters"), namespaces = NULL, i = 1, at_id = NULL){ level <- match.arg(level) if(is(meta, "meta")) meta <- list(meta) if(!all(sapply(meta, is, "meta"))) stop("All elements in list must be of class 'meta'") if(!is.null(at_id)){ stop("function does not yet handle at_id assignments") # case not written yet } else if(level =="nexml"){ nexml@meta <- new("ListOfmeta", c(unlist(nexml@meta), unlist(meta))) } else if(level =="otus"){ nexml@otus[[i]]@meta <- new("ListOfmeta", c(nexml@otus[[i]]@meta, meta)) } else if(level =="trees"){ nexml@trees[[i]]@meta <- new("ListOfmeta", c(nexml@trees[[i]]@meta, meta)) } else if(level =="characters"){ nexml@characters[[i]]@meta <- new("ListOfmeta", c(nexml@characters[[i]]@meta, meta)) } ## append additional namespaces nexml <- add_namespaces(namespaces, nexml) nexml } RNeXML/R/get_rdf.R0000644000176200001440000000206312641021656013261 0ustar liggesusers#' Extract rdf-xml from a NeXML file #' #' Extract rdf-xml from a NeXML file #' @param file the name of a nexml file, or otherwise a nexml object. #' @return an RDF-XML object (XMLInternalDocument). This can be manipulated with #' tools from the XML R package, or converted into a triplestore for use with #' SPARQL queries from the rrdf R package. 
#' @export #' @import httr XML # @import Sxslt # not yet #' @examples \dontrun{ #' f <- system.file("examples", "meta_example.xml", package="RNeXML") #' rdf <- get_rdf(f) #' #' ## Write to a file and read in with rrdf #' tmp <- tempfile() #' saveXML(rdf, tmp) #' library(rrdf) #' lib <- load.rdf(tmp) #' #' ## Perform a SPARQL query: #' sparql.rdf(lib, "SELECT ?title WHERE { ?x <http://purl.org/dc/elements/1.1/title> ?title}") #' } get_rdf <- function(file){ if(is(file, "nexml")){ who <- tempfile() nexml_write(x=file, file=who) file <- who } to_rdf <- system.file("examples", "RDFa2RDFXML.xsl", package="RNeXML") rdf <- Sxslt::xsltApplyStyleSheet(file, to_rdf) rdf } RNeXML/R/add_basic_meta.R0000644000176200001440000001050112731606043014541 0ustar liggesusers#' Add basic metadata #' #' adds Dublin Core metadata elements to (top-level) nexml #' @param title A title for the dataset #' @param description a description of the dataset #' @param creator name of the data creator. Can be a string or R person object #' @param pubdate publication date. Default is current date. #' @param rights the intellectual property rights associated with the data. #' The default is Creative Commons Zero (CC0) public domain declaration, #' compatible with all other licenses and appropriate for deposition #' into the Dryad or figshare repositories. CC0 is also recommended by the Panton Principles. #' Alternatively, any other plain text string can be added and will be provided as the content #' attribute to the dc:rights property. #' @param publisher the publisher of the dataset. Usually where a user may go to find the canonical #' copy of the dataset: could be a repository, journal, or academic institution. #' @param citation a citation associated with the data. Usually an accompanying academic journal #' article that indicates how the data should be cited in an academic context. Multiple citations #' can be included here. 
#' citation can be a plain text object, but is preferably an R `citation` or `bibentry` object (which #' can include multiple citations). See examples. #' @param nexml a nexml object to which metadata should be added. A new #' nexml object will be created if none exists. #' @return an updated nexml object #' @details \code{add_basic_meta()} is just a wrapper for \code{\link{add_meta}} to make it easy to #' provide generic metadata without explicitly providing the namespace. For instance, #' \code{add_basic_meta(title="My title", description="a description")} is identical to: #' \code{add_meta(list(meta("dc:title", "My title"), meta("dc:description", "a description")))} #' Most function arguments are mapped directly to the Dublin Core terms #' of the same name, with the exception of `rights`, which by default maps #' to the Creative Commons namespace when using CC0 license. #' #' @seealso \code{\link{add_trees}} \code{\link{add_characters}} \code{\link{add_meta}} #' @export #' @importFrom stats na.omit #' @importFrom utils capture.output head object.size #' @examples #' nex <- add_basic_meta(title = "My test title", #' description = "A description of my test", #' creator = "Carl Boettiger ", #' publisher = "unpublished data", #' pubdate = "2012-04-01") #' #' ## Adding citation to an R package: #' nexml <- add_basic_meta(citation=citation("ape")) #' \dontrun{ #' ## Use knitcitations package to add a citation by DOI: #' library(knitcitations) #' nexml <- add_basic_meta(citation = bib_metadata("10.2307/2408428")) #' } #' @include classes.R add_basic_meta <- function(title = NULL, description = NULL, creator = Sys.getenv("USER"), pubdate = Sys.Date(), rights = "CC0", publisher = NULL, citation = NULL, nexml = new("nexml") ){ mymeta <- get_metadata(nexml) m <- mymeta$content names(m) <- mymeta$property if(!is.null(title)) nexml <- add_meta(meta("dc:title", title), nexml) ## Skip the creator when it is missing or an empty string if(!is.null(creator) && !identical(creator, "")) nexml <- add_meta(meta("dc:creator", format(creator)), nexml) 
if(!is.null(pubdate)) if(!is.null(m)) if(is.null(m["dc:pubdate"]) | is.na(m["dc:pubdate"])) nexml <- add_meta(meta("dc:pubdate", format(pubdate)), nexml) if(!is.null(description)) nexml <- add_meta(meta("dc:description", description), nexml) ## Braces make the else branch bind to the CC0 test rather than to the license check if(!is.null(rights)){ if(rights == "CC0"){ if(is.null(get_license(nexml))) nexml <- add_meta(meta(rel="cc:license", href="http://creativecommons.org/publicdomain/zero/1.0/"), nexml) } else { nexml <- add_meta(meta("dc:rights", rights), nexml) } } ## Only add citation metadata when a citation was actually supplied if(!is.null(citation)){ if(is(citation, "BibEntry")) class(citation) = "bibentry" if(is(citation, "bibentry")) nexml <- add_meta(nexml_citation(citation), nexml) else nexml <- add_meta(meta("dcterms:bibliographicCitation", citation), nexml) } nexml } RNeXML/R/meta.R0000644000176200001440000001457112641021656012604 0ustar liggesusers## Utilities for adding additional metadata #' Constructor function for metadata nodes #' #' @param property specify the ontological definition together with its namespace, e.g. dc:title #' @param content content of the metadata field #' @param rel Ontological definition of the reference provided in href #' @param href A link to some reference #' @param datatype optional RDFa field #' @param id optional id element (otherwise id will be automatically generated). #' @param type optional xsi:type. If not given, will use either "LiteralMeta" or "ResourceMeta" as #' determined by the presence of either a property or a href value. #' @param children Optional element containing any valid XML block (XMLInternalElementNode class, see the XML package for details). #' @details User must either provide property+content or rel+href. Mixing these will result in potential garbage. #' The datatype attribute will be detected automatically from the class of the content argument. 
Maps from R class #' to schema datatypes are as follows: #' character - xsd:string, #' Date - xsd:date, #' integer - xsd:integer, #' numeric - xsd:decimal, #' logical - xsd:boolean #' #' @examples #' meta(content="example", property="dc:title") #' @export #' @seealso \code{\link{nexml_write}} #' @include classes.R meta <- function(property = character(0), content = character(0), rel = character(0), href = character(0), datatype = character(0), id = character(0), type = character(0), children = list()){ ## is.integer must be tested before is.numeric, since integers are also numeric if(is.logical(content)) datatype <- "xsd:boolean" else if(is(content, "Date")) datatype <- "xsd:date" else if(is.integer(content)) datatype <- "xsd:integer" else if(is.numeric(content)) datatype <- "xsd:decimal" else if(is.character(content)) datatype <- "xsd:string" else datatype <- "xsd:string" # Having assigned the datatype, # the content text must be written as a string content <- as.character(content) if(length(id) == 0) id <- nexml_id("m") if(is(children, "XMLAbstractNode") || is(children, "XMLInternalNode")) children <- list(children) if(length(property) > 0){ ## content is never NULL after as.character(), so test its length instead if(length(content) == 0 && length(children) == 0) ## Avoid writing when content is missing, e.g. 
prism:endingpage is blank NULL else new("meta", content = content, datatype = datatype, property = property, id = id, 'xsi:type' = "LiteralMeta", children = children) } else if(length(rel) > 0){ if(is.null(href)) NULL else new("meta", rel = rel, href = href, id = id, 'xsi:type' = "ResourceMeta", children = children) } else { new("meta", content = content, datatype = datatype, rel = rel, href = href, id = id, 'xsi:type' = type, children = children) } } ## Common helper functions nexml_citation <- function(obj){ if(is(obj, "BibEntry")) class(obj) <- "bibentry" if(is(obj, "bibentry")){ out <- lapply(obj, function(obj){ if(length(grep("--", obj$pages)) > 0){ pgs <- strsplit(obj$pages, "--")[[1]] start_page <- pgs[[1]] end_page <- if(length(pgs)>1) pgs[[2]] else " " } else if(length(grep("-", obj$pages)) > 0){ pgs <- strsplit(obj$pages, "-")[[1]] start_page <- pgs[[1]] end_page <- if(length(pgs)>1) pgs[[2]] else " " } else { start_page <- NULL end_page <- NULL } list_of_metadata_nodes <- plyr::compact(c(list( meta(content=obj$volume, property="prism:volume"), meta(content=obj$journal, property="dc:publisher"), meta(content=obj$journal, property="prism:publicationName"), meta(content = end_page, property="prism:endingPage"), meta(content=start_page, property="prism:startingPage"), meta(content=obj$year, property="prism:publicationDate"), meta(content=obj$title, property="dc:title")), lapply(obj$author, function(x){ meta(content = format(x, c("given", "family")), property="dc:contributor") }))) citation_elements = new("ListOfmeta", list_of_metadata_nodes) meta(content=format(obj, "text"), property="dcterms:bibliographicCitation", children = lapply(citation_elements, as, "XMLInternalElementNode")) }) out } } #' Concatenate meta elements into a ListOfmeta #' #' Concatenate meta elements into a ListOfmeta #' @param x,... meta elements to be concatenated, e.g. 
see \code{\link{meta}} #' @param recursive logical, if 'recursive=TRUE', the function #' descends through lists and combines their elements into a vector. #' @return a listOfmeta object containing multiple meta elements. #' @examples #' c(meta(content="example", property="dc:title"), #' meta(content="Carl", property="dc:creator")) #' setMethod("c", signature("meta"), function(x, ..., recursive = FALSE){ elements <- list(x, ...) # if(recursive) elements <- meta_recursion(elements) new("ListOfmeta", elements) }) #' Concatenate ListOfmeta elements into a ListOfmeta #' #' Concatenate ListOfmeta elements into a ListOfmeta #' @param x,... meta or ListOfmeta elements to be concatenated, e.g. see \code{\link{meta}} #' @param recursive logical, if 'recursive=TRUE', the function #' descends through lists and combines their elements into a vector. #' @return a listOfmeta object containing multiple meta elements. #' @include classes.R #' @examples #' metalist <- c(meta(content="example", property="dc:title"), #' meta(content="Carl", property="dc:creator")) #' out <- c(metalist, metalist) #' out <- c(metalist, meta(content="a", property="b")) setMethod("c", signature("ListOfmeta"), function(x, ..., recursive = FALSE){ elements <- list(x, unlist(...)) elements <- meta_recursion(elements) new("ListOfmeta", elements) }) meta_recursion <- function(elements){ i <- 1 out <- vector("list") for(e in elements){ if(length(e) > 0){ if(is(e, "meta")){ out[[i]] <- e i <- i + 1 } else if(is.list(e)){ out <- c(out, meta_recursion(e)) i <- length(out) + 1 } } } out } RNeXML/R/get_metadata.R0000644000176200001440000000250012641022470014255 0ustar liggesusers ## FIXME might want to define this for sub-nodes. e.g. so we can get all metadata on "nodes" in tree2... #' get_metadata #' #' get_metadata #' @param nexml a nexml object #' @param level the name of the level of element desired, see details #' @return the requested metadata as a data.frame. 
Additional columns #' indicate the parent element of the return value. #' @details 'level' should be either the name of a child element of a NeXML document #' (e.g. "otu", "characters"), or a path to the desired element, e.g. 'trees/tree' #' will return the metadata for all phylogenies in all trees blocks. #' @import XML #' @examples \dontrun{ #' comp_analysis <- system.file("examples", "primates.xml", package="RNeXML") #' nex <- nexml_read(comp_analysis) #' get_metadata(nex) #' get_metadata(nex, "otus/otu") #' } #' @export get_metadata <- function(nexml, level = "nexml"){ # level = c("nexml", "otus", "trees", "characters", # "otus/otu", "trees/tree", "characters/format", "characters/matrix", # "characters/format/states") # level <- match.arg(level) ## Handle deprecated formats if(level =="otu") level <- "otus/otu" if(level =="tree") level <- "trees/tree" if(level == "nexml") level <- "meta" else level <- paste(level, "meta", sep="/") get_level(nexml, level) } RNeXML/R/nexml_write.R0000644000176200001440000000706312665743025014220 0ustar liggesusers#' Write nexml files #' #' @param x a nexml object, or any phylogeny object (e.g. phylo, phylo4) #' that can be coerced into one. Can also be omitted, in which case a new #' nexml object will be constructed with the additional parameters specified. #' @param file the name of the file to write out #' @param trees phylogenetic trees to add to the nexml file (if not already given in x) #' see \code{\link{add_trees}} for details. #' @param characters additional characters #' @param meta A meta element or list of meta elements, see \code{\link{add_meta}} #' @param ... additional arguments to add_basic_meta, such as the title. See \code{\link{add_basic_meta}}. 
#' @return Writes out a nexml file #' @import ape #' @import XML #' @import methods #' @aliases nexml_write write.nexml #' @export nexml_write write.nexml #' @seealso \code{\link{add_trees}} \code{\link{add_characters}} \code{\link{add_meta}} \code{\link{nexml_read}} #' @examples #' ## Write an ape tree to nexml, analogous to write.nexus: #' library(ape); data(bird.orders) #' write.nexml(bird.orders, file="example.xml") #' #' \dontrun{ # takes > 5s #' ## Assemble a nexml section by section and then write to file: #' library(geiger) #' data(geospiza) #' nexml <- add_trees(geospiza$phy) # creates new nexml #' nexml <- add_characters(geospiza$dat, nexml = nexml) # pass the nexml obj to append character data #' nexml <- add_basic_meta(title="my title", creator = "Carl Boettiger", nexml = nexml) #' nexml <- add_meta(meta("prism:modificationDate", format(Sys.Date())), nexml = nexml) #' #' write.nexml(nexml, file="example.xml") #' #' ## As above, but in one call (except for add_meta() call). #' write.nexml(trees = geospiza$phy, #' characters = geospiza$dat, #' title = "My title", #' creator = "Carl Boettiger", #' file = "example.xml") #' #' ## Mix and match: identical to the section by section: #' nexml <- add_meta(meta("prism:modificationDate", format(Sys.Date()))) #' write.nexml(x = nexml, #' trees = geospiza$phy, #' characters = geospiza$dat, #' title = "My title", #' creator = "Carl Boettiger", #' file = "example.xml") #' #' } nexml_write <- function(x = new("nexml"), file = NULL, trees = NULL, characters = NULL, meta = NULL, ...){ nexml <- as(x, "nexml") if(!is.null(trees)) nexml <- add_trees(trees, nexml = nexml) if(!is.null(characters)) nexml <- add_characters(characters, nexml = nexml) if(!is.null(meta)) nexml <- add_meta(meta, nexml = nexml) nexml <- do.call(add_basic_meta, c(list(...), list(nexml=nexml))) out <- as(nexml, "XMLInternalNode") saveXML(out, file = file) } write.nexml <- nexml_write ############## Promotion methods ######## ## FIXME -- Coercion is not 
the way to go about any of this ## want generator methods that can handle id creation better # consider: # setMethod("promote", # signature("tree", "character"), # function(object, target_type) setAs("tree", "nexml", function(from){ trees = as(from, "trees") otus = as(from, "otus") otus@id = "tax1" #UUIDgenerate() trees@id = "Trees" #UUIDgenerate() trees@otus = otus@id new("nexml", trees = new("ListOftrees", list(trees)), otus = otus) }) setAs("ListOfnode", "otus", function(from) new("otus", otu = from)) setAs("tree", "trees", function(from) new("trees", tree = new("ListOftree", list(from)))) RNeXML/R/deprecated.R0000644000176200001440000001155512641021656013755 0ustar liggesusers## Some of these methods are still in use by simmap and various add_ functions. ## Ideally all would be replaced by the equivalent get_level() version ## Conversions between matrix and characters node #' Extract the character matrix #' #' @param nexml nexml object (e.g. from read.nexml) #' @param rownames_as_col option to return character matrix rownames #' (with taxon ids) as its own column in the data.frame. Default is FALSE #' for compatibility with geiger and similar packages. #' @return a list of character matrices (data.frames), one per characters block #' @examples #' comp_analysis <- system.file("examples", "comp_analysis.xml", package="RNeXML") #' nex <- nexml_read(comp_analysis) #' get_characters_list(nex) #' @export get_characters_list <- function(nexml, rownames_as_col=FALSE){ # extract mapping between otus and taxon ids maps <- get_otu_maps(nexml) # loop over all character matrices out <- lapply(nexml@characters, function(characters){ # Make numeric data class numeric, discrete data class discrete type <- slot(characters, 'xsi:type') # check without namespace? 
if (grepl("ContinuousCells", type)) { dat <- extract_character_matrix(characters@matrix) dat <- otu_to_label(dat, maps[[characters@otus]]) dat <- character_to_label(dat, characters@format) for(i in seq_along(dat)) ## FIXME something more elegant, no? 
dat[[i]] <- as.numeric(dat[[i]]) } else if (grepl("StandardCells", type)) { dat <- extract_character_matrix(characters@matrix) dat <- state_to_symbol(dat, characters@format) dat <- otu_to_label(dat, maps[[characters@otus]]) dat <- character_to_label(dat, characters@format) for(i in seq_along(dat)) dat[[i]] <- factor(dat[[i]]) } else { dat <- NULL } if(rownames_as_col){ dat <- cbind(taxa = rownames(dat), dat) rownames(dat) <- NULL } dat }) # name the character matrices by their labels, # if available, otherwise, by id. names(out) <- name_by_id_or_label(nexml@characters) out } # for lists only # identical_rownames <- function(x) all(sapply(lapply(x, rownames), identical, rownames(x[[1]]))) #### Subroutines (not exported) ########### ### The subroutines of the "get_characters_list" function ## Fixme these could be adapted to use the get_*_maps functions otu_to_label <- function(dat, otu_map){ rownames(dat) <- otu_map[rownames(dat)] dat } character_to_label <- function(dat, format){ ## Compute the mapping map <- map_chars_to_label(format) ## replace colnames with matching labels colnames(dat) <- map[colnames(dat)] dat } state_to_symbol <- function(dat, format){ if(!isEmpty(format@states)){ map_by_char <- map_state_to_symbol(format) for(n in names(dat)){ symbol <- map_by_char[[n]] dat[[n]] <- symbol[dat[[n]]] } dat } else { dat ## Nothing to do if we don't have a states list } } ### Subroutine for characters_to_label, ### Also subroutine for get_char_map . map_chars_to_label <- function(format){ map <- sapply(format@char, function(char){ if(length(char@label) > 0) label <- char@label else label <- char@id c(char@id, label) }) out <- map[2,] names(out) <- map[1,] out } # Subroutine of the get_state_maps and state_to_symbol functions # # For each character, find the matching `states` set. 
# For that set, map each state id to the state symbol map_state_to_symbol <- function(format){ # loop over characters map <- lapply(format@char, function(char){ # name the list with elements as `states` sets by their ids states <- format@states ids <- sapply(states, function(states) states@id) names(states) <- ids # Get the relevant states set matching the current character map_states_to_symbols( states[[char@states]] ) }) names(map) <- name_by_id(format@char) map } ## Subroutine of the map_state_to_symbol function above map_states_to_symbols <- function(states){ map <- sapply(states@state, function(state){ if(length(state@symbol) > 0) symbol <- state@symbol else symbol <- state@id c(state@id, symbol) }) out <- map[2, ] names(out) <- map[1, ] out } #' @import reshape2 extract_character_matrix <- function(matrix){ otu <- sapply(matrix@row, function(row) row@otu) # charnames <- unname(sapply(matrix@row[[1]]@cell, function(cell) cell@char)) names(matrix@row) <- otu mat <- lapply(matrix@row, function(row){ names(row@cell) <- unname(sapply(row@cell, function(b) b@char)) # names(row@cell) <- charnames lapply(row@cell, function(cell) cell@state) }) mat <- melt(mat) colnames(mat) <- c("state", "character", "otu") mat <- dcast(mat, otu ~ character, value.var = "state") # Move otus into rownames and drop the column rownames(mat) <- mat[["otu"]] # mat <- mat[-1] mat } RNeXML/R/add_namespaces.R0000644000176200001440000000355112641021656014601 0ustar liggesusers #' add namespaces #' #' add namespaces, avoiding duplication if prefix is already defined #' @param namespaces a named character vector of namespaces #' @param nexml a nexml object. will create a new one if none is given. #' @return a nexml object with updated namespaces #' @examples #' ## Create a new nexml object with a single metadata element: #' modified <- meta(property = "prism:modificationDate", content = "2013-10-04") #' nex <- add_meta(modified) # Note: 'prism' is defined in nexml_namespaces by default. 
#' #' ## Write multiple metadata elements, including a new namespace: #' website <- meta(href = "http://carlboettiger.info", #' rel = "foaf:homepage") # meta can be link-style metadata #' nex <- add_meta(list(modified, website), #' namespaces = c(foaf = "http://xmlns.com/foaf/0.1/")) #' #' ## Append more metadata, and specify a level: #' history <- meta(property = "skos:historyNote", #' content = "Mapped from the bird.orders data in the ape package using RNeXML") #' nex <- add_meta(history, #' nexml = nex, #' level = "nexml", #' namespaces = c(skos = "http://www.w3.org/2004/02/skos/core#")) #' #' @seealso \code{\link{meta}} \code{\link{add_meta}} #' @export add_namespaces <- function(namespaces, nexml = new("nexml")){ if(!is.null(namespaces)){ ## check for duplicated abbreviation, not for duplicated URI. OKAY to have multiple abbrs for same URI... ## FIXME Make sure that cases where abbreviation match actually match the URI as well notdups <- match(names(namespaces), names(nexml@namespaces)) notdups <- sapply(notdups, is.na) if(all(notdups)) # all are unique nexml@namespaces <- c(nexml@namespaces, namespaces) else { nexml@namespaces <- c(nexml@namespaces, namespaces[notdups]) } } nexml } RNeXML/R/internal_nexml_id.R0000644000176200001440000000155012641021656015342 0ustar liggesusers nexml_env = new.env(hash=TRUE) # If no prefix is given, will use a UUID # Generates an id number by appending a counter to the prefix # Will keep track of the counter for each prefix for that session. 
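# Illustrative sketch only (not part of the package API): a per-prefix counter
# kept in an environment, which is the idea that nexml_id() below relies on.
# The names `counters` and `next_id` are hypothetical and exist only inside
# this local() block, so nothing here touches nexml_env.
local({
  counters <- new.env(hash = TRUE)
  next_id <- function(prefix) {
    # Start at 1 the first time a prefix is seen, otherwise reuse the stored value
    n <- if (exists(prefix, envir = counters, inherits = FALSE)) get(prefix, envir = counters) else 1
    # Store the value to hand out on the next call for this prefix
    assign(prefix, n + 1, envir = counters)
    paste0(prefix, n)
  }
  # Each prefix keeps its own independent sequence
  stopifnot(identical(next_id("m"), "m1"),
            identical(next_id("m"), "m2"),
            identical(next_id("ou"), "ou1"))
})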
#' @import uuid nexml_id <- function(prefix = "", use_uuid = getOption("uuid", FALSE)){ if(use_uuid){ uid <- paste0("uuid-", UUIDgenerate()) } else { if((prefix %in% ls(envir=nexml_env))) id_counter <- get(prefix, envir=nexml_env) else { assign(prefix, 1, envir=nexml_env) id_counter <- 1 } uid <- paste0(prefix, id_counter) id_counter <- id_counter + 1 assign(prefix, id_counter, envir=nexml_env) } uid } #' reset id counter #' #' reset the id counter #' @export reset_id_counter <- function(){ rm(list=ls(envir=nexml_env), envir=nexml_env) } # use an environment to store counter RNeXML/R/internal_get_node_maps.R0000644000176200001440000000270412641021656016351 0ustar liggesusers# get otus map # # @param nexml nexml object # @return a list showing the mapping between (internal) otu identifiers and labels (taxonomic names). List is named by the id of the otus block. # @details largely for internal use get_otu_maps <- function(nexml){ otus <- as.list(nexml@otus) names(otus) <- name_by_id(otus) otu_maps <- lapply(otus, function(otus){ # loop over all otus nodes taxon <- sapply(otus@otu, function(otu){ # loop over each otu in the otus set if(length(otu@label) > 0) label <- otu@label else label <- otu@id c(otu@id, label) }) out <- taxon[2, ] #label names(out) <- taxon[1, ] #id out }) otu_maps } get_char_maps <- function(nexml){ map <- lapply(nexml@characters, function(characters) map_chars_to_label(characters@format)) names(map) <- name_by_id(nexml@characters) map } get_state_maps <- function(nexml){ map <- lapply(nexml@characters, function(characters){ if(!isEmpty(characters@format@states)) map_state_to_symbol(characters@format) else NULL }) names(map) <- name_by_id(nexml@characters) map } reverse_map <- function(map){ out <- NULL if(is.list(map)){ out <- lapply(map, function(x){ out <- names(x) names(out) <- x out}) } else if(is.character(map)) { out <- names(map) names(out) <- map out } out } RNeXML/R/nexmlTree.R0000644000176200001440000000442712641021656013620 0ustar 
liggesuserssetClass("phyloS4", slots = c(edge = "matrix", Nnode = "integer", tip.label = "character", edge.length = "numeric")) setOldClass("phylo", S4Class="phyloS4") # FIXME repeat selectMethod for all ape, geiger, etc methods(??) selectMethod("show", "phylo") removeClass("phyloS4") setClass("nexmlTree", slots = c(nexml = "nexml"), contains="phylo") setMethod("show", "nexmlTree", function(object) print.phylo(object)) # callNextMethod(object) ## callNextMethod might have been an option, but it looks for 'show' method, not print method?? ## constructor function nexmlTree <- function(object){ if(is(object, "nexml")){ phylo <- as(object, "phylo") } new("nexmlTree", nexml = object, phylo) } ## Coercions between classes setAs("XMLInternalElementNode", "nexmlTree", function(from) nexmlTree(as(from, "nexml"))) setAs("nexmlTree", "XMLInternalElementNode", function(from) as(from@nexml, "XMLInternalElementNode")) setAs("XMLInternalNode", "nexmlTree", function(from) nexmlTree(as(from, "nexml"))) setAs("nexmlTree", "XMLInternalNode", function(from) as(from@nexml, "XMLInternalNode")) setAs("phylo", "nexmlTree", function(from) nexmlTree(as(from, "nexml"))) setAs("nexmlTree", "phylo", function(from) as(from@nexml, "phylo")) setAs("nexml", "nexmlTree", function(from) nexmlTree(from)) setAs("nexmlTree", "nexml", function(from) from@nexml) ### Testing # a <- new("phylo", bird.orders) # expect_is(a, "phylo") # a # plot(a) # # b <- new("nexmlTree", bird.orders, nexml = as(bird.orders, "nexml")) # expect_is(b, "phylo") # b # plot(b) # # Some ape functions don't check class properly. i.e. class(b) == "phylo" is FALSE, but is(b, "phylo") is TRUE. 
# Don't really need these, but here they are mapping between S3 and S4 setAs("phyloS4", "phylo", function(from){ out <- list(edge = from@edge, Nnode = from@Nnode, tip.label = from@tip.label, edge.length = from@edge.length) class(out) <- "phylo" out }) setAs("phylo", "phyloS4", function(from) new("phyloS4", edge = from$edge, Nnode = from$Nnode, tip.label = from$tip.label, edge.length = from$edge.length)) RNeXML/R/internal_name_by_id.R0000644000176200001440000000050412641021656015627 0ustar liggesusers ## Helper functions name_by_id <- function(x) unname(sapply(x, function(i) if(length(i@id)>0) i@id else NULL)) name_by_id_or_label <- function(x) unname(sapply(x, function(i) if(length(i@label)>0) i@label else i@id)) get_by_id <- function(x, id){ ids <- sapply(x, function(i) i@id) m <- match(id, ids) x[[m]] } RNeXML/R/internal_isEmpty.R0000644000176200001440000000117412641021656015177 0ustar liggesusers isEmpty <- function (obj) { if (!isS4(obj)) { if (length(obj) > 0) FALSE else TRUE } else { if (identical(obj, new(class(obj)[1]))) out <- TRUE else { empty <- sapply(slotNames(obj), function(s) { if (isS4(slot(obj, s))) isEmpty(slot(obj, s)) else { if (length(slot(obj, s)) == 0) TRUE else if (length(slot(obj, s)) > 0) FALSE } }) out <- !any(!empty) } out } } RNeXML/R/nexml_get.R0000644000176200001440000000622412641021656013634 0ustar liggesusers#' Get the desired element from the nexml object #' #' Get the desired element from the nexml object #' @aliases nexml_get get_item #' @param nexml a nexml object (from read_nexml) #' @param element the kind of object desired, see details. #' @param ... additional arguments, if applicable to certain elements #' @details #' #' \itemize{ #' \item{"tree"}{ an ape::phylo tree, if only one tree is represented. Otherwise returns a list of lists of multiphylo trees. 
#' To consistently receive the list of lists format (preserving the hierarchical nature of the nexml), use \code{trees} instead.}
#' \item{"trees"}{ returns a list of lists of multiPhylo trees, even if all trees are in the same `trees` node (and hence the outer list will be of length 1) or if there is only a single tree (and hence the inner list will also be of length 1). This guarantees a consistent return type regardless of the number of trees present in the nexml file, and also preserves any hierarchy/grouping of trees. }
#' \item{"flat_trees"}{ a multiPhylo object (list of ape::phylo objects). Note that this method collapses any hierarchical structure that may have been present as multiple `trees` nodes in the original nexml (though such a feature is rarely used). To preserve that structure, use `trees` instead.}
#' \item{"metadata"}{Get metadata from the specified level (default is top/nexml level) }
#' \item{"otu"}{ returns a named character vector containing all available metadata. names indicate \code{property} (or \code{rel} in the case of links/resourceMeta), while values indicate the \code{content} (or \code{href} for links). }
#' \item{"taxa"}{ alias for otu }
#' }
#' For a slightly cleaner interface, each of these elements is also defined as an S4 method
#' for a nexml object. So in place of `get_item(nexml, "tree")`, one could use `get_tree(nexml)`,
#' and so forth for each element type.
#' @return return type depends on the element requested. See details.
#' @export #' @seealso \code{\link{get_trees}} #' @include classes.R #' @examples #' comp_analysis <- system.file("examples", "comp_analysis.xml", package="RNeXML") #' nex <- nexml_read(comp_analysis) #' nexml_get(nex, "trees") #' nexml_get(nex, "characters_list") nexml_get <- function(nexml, element = c("trees", "trees_list", "flat_trees", "metadata", "otu", "taxa", "characters", "characters_list", "namespaces"), ...){ element <- match.arg(element) switch(element, trees = get_trees(nexml), # will warn if more than one tree is available trees_list = get_trees_list(nexml), flat_trees = get_flat_trees(nexml), metadata = get_metadata(nexml, ...), otu = get_taxa(nexml), taxa = get_taxa(nexml), characters = get_characters(nexml), characters_list = get_characters_list(nexml), namespaces = get_namespaces(nexml)) } get_item <- nexml_get RNeXML/R/get_namespaces.R0000644000176200001440000000073712641021656014633 0ustar liggesusers #' get namespaces #' #' get namespaces #' @param nexml a nexml object #' @return a named character vector providing the URLs defining each #' of the namespaces used in the nexml file. Names correspond to #' the prefix abbreviations of the namespaces. 
#' @examples
#' comp_analysis <- system.file("examples", "comp_analysis.xml", package="RNeXML")
#' nex <- nexml_read(comp_analysis)
#' get_namespaces(nex)
#' @export
get_namespaces <- function(nexml){
  nexml@namespaces
}
RNeXML/R/get_taxa.R0000644000176200001440000000216612641021656013447 0ustar liggesusers

#' get_taxa
#'
#' Retrieve names of all species/OTUs (operational taxonomic units) included in the nexml
#' @aliases get_taxa get_otu
#' @param nexml a nexml object
#' @return the list of taxa
#' @export get_taxa get_otu
#' @examples
#' comp_analysis <- system.file("examples", "comp_analysis.xml", package="RNeXML")
#' nex <- nexml_read(comp_analysis)
#' get_taxa(nex)
#' @seealso \code{\link{get_item}}
get_taxa <- function(nexml){
  get_level(nexml, "otus/otu")
}
get_otu <- get_taxa

#' get_taxa_list
#'
#' Retrieve names of all species/OTUs (operational taxonomic units) included in the nexml,
#' grouped by the otus block they belong to
#' @aliases get_taxa_list get_otus_list
#' @param nexml a nexml object
#' @return a list with one element per otus block (named by the block id), each a
#' vector of taxon labels named by otu id
#' @export get_taxa_list get_otus_list
#' @seealso \code{\link{get_item}}
get_taxa_list <- function(nexml){
  out <- lapply(nexml@otus, function(otus){
    out <- sapply(otus@otu, function(otu) otu@label)
    names(out) <- name_by_id(otus@otu)
    out
  })
  names(out) <- name_by_id(nexml@otus)
  out
}
get_otus_list <- get_taxa_list
RNeXML/R/add_trees.R0000644000176200001440000001003112641021656013573 0ustar liggesusers
#' add_trees
#'
#' add_trees
#' @param phy a phylo object, multiPhylo object, or list of
#' multiPhylo to be added to the nexml
#' @param nexml a nexml object to which we should append this phylo.
#' By default, a new nexml object will be created.
#' @param append_to_existing_otus logical, indicating if we should
#' make a new OTU block (default) or append to the existing one.
#' @return a nexml object containing the phy in nexml format.
#' @export #' @examples #' library("geiger") #' data(geospiza) #' geiger_nex <- add_trees(geospiza$phy) add_trees <- function(phy, nexml=new("nexml"), append_to_existing_otus=FALSE){ nexml <- as(nexml, "nexml") phy <- standardize_phylo_list(phy) ## handle multiPhlyo cases new_taxa <- unlist(sapply(phy, function(y) sapply(y, function(z) z$tip.label))) nexml <- add_otu(nexml, new_taxa, append=append_to_existing_otus) otus_id <- nexml@otus[[length(nexml@otus)]]@id nexml <- add_trees_block(nexml, phy, otus_id) nexml } ##################### phylo -> nexml ############### setAs("phylo", "nexml", function(from){ add_trees(from) }) setAs("multiPhylo", "nexml", function(from){ add_trees(from) }) standardize_phylo_list <- function(phy){ if(is(phy, "list") && (is(phy[[1]], "list") || is(phy[[1]], "multiPhylo")) && is(phy[[1]][[1]], "phylo")){ phy } else if(is(phy, "multiPhylo") || (is(phy, "list") && is(phy[[1]], "phylo"))) { list(phy) } else if(is(phy, "phylo")) { phy <- list(phy) class(phy) <- "multiPhylo" list(phy) } else { # desperate phy <- list(as(phy, "phylo")) class(phy) <- "multiPhylo" list(phy) } } add_trees_block <- function(nexml, phy, otus_id){ phy <- standardize_phylo_list(phy) ## all trees will use the same otu_map <- reverse_map(get_otu_maps(nexml))[[otus_id]] trees <- lapply(phy, function(trs){ tree_id <- nexml_id("ts") new("trees", id = tree_id, about = paste0("#", tree_id), otus = otus_id, tree = new("ListOftree", lapply(trs, function(tr) fromPhylo(tr, otu_map))) ) }) ## Append to any existing trees nodes nexml@trees <- new("ListOftrees", c(nexml@trees, trees)) nexml } # Main routine to generate NeXML from ape:phylo fromPhylo <- function(phy, otu_map){ node_ids <- sapply(unique(as.numeric(phy$edge)), function(i) nexml_id("n")) names(node_ids) <- as.character(unique(as.numeric(phy$edge))) ## Generate the "ListOfedge" made of "edge" objects edges <- lapply(1:dim(phy$edge)[1], function(i){ edge_id <- nexml_id("e") source <- 
node_ids[as.character(phy$edge[i,1])] target <- node_ids[as.character(phy$edge[i,2])] e <- new("edge", source = source, target = target, id = edge_id, about = paste0("#", edge_id)) if(!is.null(phy$edge.length)) e@length <- as.numeric(phy$edge.length[i]) e } ) edges <- new("ListOfedge", edges) ## Generate the ListOfnode made of "node" objects ## In doing so, generate otu_id numbers for tip nodes nodes <- lapply(unique(as.numeric(phy$edge)), function(i){ node_id <- node_ids[as.character(i)] if(is.na(phy$tip.label[i])) new("node", id = node_id, about = paste0("#", node_id)) else if(is.character(phy$tip.label[i])){ otu_id <- otu_map[phy$tip.label[i]] new("node", id = node_id, about = paste0("#", node_id), otu = otu_id) } }) ## FIXME how about naming non-tip labels? nodes <- new("ListOfnode", nodes) ## Create the "tree" S4 object tree_id <- nexml_id("tree") tree <- new("tree", node = nodes, edge = edges, 'xsi:type' = 'FloatTree', id = tree_id, about = paste0("#", tree_id)) } RNeXML/R/character_classes.R0000644000176200001440000003447212641021656015331 0ustar liggesusers#' @include classes.R #################################################### setClass("char", slots = c(states = "character"), contains = "IDTagged") setMethod("fromNeXML", signature("char", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() if(!is.na(xmlAttrs(from)["states"])) obj@states <- xmlAttrs(from)["states"] obj }) setMethod("toNeXML", signature("char", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() if(length(object@states) > 0) addAttributes(parent, "states" = object@states) parent }) setAs("char", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("char"))) setAs("char", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("char"))) setAs("XMLInternalElementNode", "char", function(from) fromNeXML(new("char"), from)) ############################################### setClass("ListOfrow", slots = c(names="character"), 
contains="list") setClass("obsmatrix", slots = c(row="ListOfrow"), contains = "Annotated") setMethod("fromNeXML", signature("obsmatrix", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() kids <- xmlChildren(from) if(length(kids) > 0) obj@row <- new("ListOfrow", lapply(kids[names(kids) == "row"], as, "row")) obj }) setMethod("toNeXML", signature("obsmatrix", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() addChildren(parent, kids = object@row) parent }) setAs("obsmatrix", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("matrix"))) setAs("obsmatrix", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("matrix"))) setAs("XMLInternalElementNode", "obsmatrix", function(from) fromNeXML(new("obsmatrix"), from)) ###################################################### setClass("ListOfcell", slots = c(names="character"), contains="list") setClass("ListOfseq", slots = c(names="character"), contains="list") setClass("row", slots = c(cell = "ListOfcell", seq = "ListOfseq"), contains = "OptionalTaxonLinked") setMethod("fromNeXML", signature("row", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() kids <- xmlChildren(from) if(length(kids) > 0){ if("cell" %in% names(kids)) obj@cell <- new("ListOfcell", lapply(kids[names(kids) == "cell"], as, "cell")) if("seq" %in% names(kids)) obj@seq <- new("ListOfseq", lapply(kids[names(kids) == "seq"], as, "seq")) } obj }) setMethod("toNeXML", signature("row", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() addChildren(parent, kids = object@cell) addChildren(parent, kids = object@seq) parent }) setAs("row", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("row"))) setAs("row", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("row"))) setAs("XMLInternalElementNode", "row", function(from) fromNeXML(new("row"), from)) ####################################################### 
setClass("ListOfstate", slots = c(names="character"), contains="list") setClass("ListOfpolymorphic_state_set", slots = c(names="character"), contains="list") setClass("ListOfuncertain_state_set", slots = c(names="character"), contains="list") setClass("states", slots = c(state="ListOfstate", polymorphic_state_set="ListOfpolymorphic_state_set", uncertain_state_set="ListOfuncertain_state_set"), contains = "IDTagged") setMethod("fromNeXML", signature("states", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() kids <- xmlChildren(from) if(length(kids) > 0){ obj@state <- new("ListOfstate", lapply(kids[names(kids) == "state"], as, "state")) obj@polymorphic_state_set <- new("ListOfpolymorphic_state_set", lapply(kids[names(kids) == "polymorphic_state_set"], as, "polymorphic_state_set")) obj@uncertain_state_set <- new("ListOfuncertain_state_set", lapply(kids[names(kids) == "uncertain_state_set"], as, "uncertain_state_set")) } obj }) setMethod("toNeXML", signature("states", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() addChildren(parent, kids = object@state) parent }) setAs("states", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("states"))) setAs("states", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("states"))) setAs("XMLInternalElementNode", "states", function(from) fromNeXML(new("states"), from)) ####################################################### ## technically symbol is positive integer http://nexml.org/doc/schema-1/characters/standard/#StandardToken setClass("state", slots = c(symbol = "integer"), contains = "IDTagged") setMethod("fromNeXML", signature("state", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() obj@symbol <- as.integer(xmlAttrs(from)["symbol"]) obj }) setMethod("toNeXML", signature("state", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() addAttributes(parent, "symbol" = object@symbol) parent }) 
setAs("state", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("state"))) setAs("state", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("state"))) setAs("XMLInternalElementNode", "state", function(from) suppressWarnings(fromNeXML(new("state"), from))) ################################################ setClass("ListOfmember", slots = c(names="character"), contains="list") setClass("uncertain_state_set", slots = c(member = "ListOfmember"), contains="state") setMethod("fromNeXML", signature("uncertain_state_set", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() kids <- xmlChildren(from) if(length(kids) > 0) obj@member <- new("ListOfmember", lapply(kids[names(kids) == "member"], as, "member")) obj }) setMethod("toNeXML", signature("uncertain_state_set", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() addChildren(parent, kids = object@member) parent }) setAs("uncertain_state_set", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("uncertain_state_set"))) setAs("uncertain_state_set", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("uncertain_state_set"))) setAs("XMLInternalElementNode", "uncertain_state_set", function(from) fromNeXML(new("uncertain_state_set"), from)) ################################################ setClass("polymorphic_state_set", contains="uncertain_state_set") setMethod("fromNeXML", signature("polymorphic_state_set", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() kids <- xmlChildren(from) if(length(kids) > 0) obj@member <- new("ListOfmember", lapply(kids[names(kids) == "member"], as, "member")) obj }) setMethod("toNeXML", signature("polymorphic_state_set", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() addChildren(parent, kids = object@member) parent }) setAs("polymorphic_state_set", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("polymorphic_state_set"))) 
setAs("polymorphic_state_set", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("polymorphic_state_set"))) setAs("XMLInternalElementNode", "polymorphic_state_set", function(from) fromNeXML(new("polymorphic_state_set"), from)) ##################### setClass("cell", slots = c(char="character", state= "character"), contains="Base") setMethod("fromNeXML", signature("cell", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() obj@char <- xmlAttrs(from)["char"] obj@state <- xmlAttrs(from)["state"] obj }) setMethod("toNeXML", signature("cell", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() addAttributes(parent, "char" = object@char) addAttributes(parent, "state" = object@state) parent }) setAs("cell", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("cell"))) setAs("cell", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("cell"))) setAs("XMLInternalElementNode", "cell", function(from) fromNeXML(new("cell"), from)) ######################### setClass("member", slots = c(state="character"), contains="Base") setMethod("fromNeXML", signature("member", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() obj@state <- xmlAttrs(from)["state"] obj }) setMethod("toNeXML", signature("member", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() addAttributes(parent, "state" = object@state) parent }) setAs("member", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("member"))) setAs("member", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("member"))) setAs("XMLInternalElementNode", "member", function(from) fromNeXML(new("member"), from)) ######################## setClass("seq", slots = c(seq = "character"), contains="Base") setMethod("fromNeXML", signature("seq", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() obj@seq <- xmlValue(from) obj } ) setMethod("toNeXML", signature("seq", 
"XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() addChildren(parent, object@seq) parent }) setAs("seq", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("seq"))) setAs("seq", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("seq"))) setAs("XMLInternalElementNode", "seq", function(from) fromNeXML(new("seq"), from)) ######################################### setClass("ListOfchar", slots = c(names="character"), contains="list") setClass("ListOfstates", slots = c(names="character"), contains="list") setClass("format", slots = c(states = "ListOfstates", ## FIXME Should be ListOfstates char = "ListOfchar"), contains = "Annotated") setMethod("fromNeXML", signature("format", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() kids <- xmlChildren(from) if(length(kids) > 0){ if("char" %in% names(kids)) obj@char <- new("ListOfchar", lapply(kids[names(kids) == "char"], as, "char")) if("states" %in% names(kids)) obj@states <- new("ListOfstates", lapply(kids[names(kids) == "states"], as, "states")) } obj }) setMethod("toNeXML", signature("format", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() if(!isEmpty(object@char)) addChildren(parent, kids = object@char) if(length(object@states) > 0) addChildren(parent, kids = object@states) parent }) setAs("format", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("format"))) setAs("format", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("format"))) setAs("XMLInternalElementNode", "format", function(from) fromNeXML(new("format"), from)) #################################################### setClass("characters", slots = c(format = "format", matrix = "obsmatrix"), contains = "TaxaLinked") setMethod("fromNeXML", signature("characters", "XMLInternalElementNode"), function(obj, from){ obj <- callNextMethod() obj@format <- as(from[["format"]], "format") obj@matrix <- as(from[["matrix"]], "obsmatrix") obj }) 
setMethod("toNeXML", signature("characters", "XMLInternalElementNode"), function(object, parent){ parent <- callNextMethod() parent <- addChildren(parent, format = object@format) parent <- addChildren(parent, matrix = object@matrix) parent }) setAs("characters", "XMLInternalNode", function(from) toNeXML(from, newXMLNode("characters"))) setAs("characters", "XMLInternalElementNode", function(from) toNeXML(from, newXMLNode("characters"))) setAs("XMLInternalElementNode", "characters", function(from) fromNeXML(new("characters"), from)) RNeXML/R/tbl_df.R0000644000176200001440000000005412731606043013076 0ustar liggesusers`$.tbl_df` <- function(x, i) .subset2(x, i) RNeXML/R/nexml_read.R0000644000176200001440000000347312731606043013772 0ustar liggesusers#' Read NeXML files into various R formats #' #' @param x Path to the file to be read in. Or an \code{\link[XML]{XMLInternalDocument-class}} #' or \code{\link[XML]{XMLInternalNode-class}} #' @param ... Further arguments passed on to \code{\link[XML]{xmlParse}} #' @import XML #' @import httr #' @aliases nexml_read read.nexml #' @export nexml_read read.nexml #' @examples #' # file #' f <- system.file("examples", "trees.xml", package="RNeXML") #' nexml_read(f) #' \dontrun{ # may take > 5 s #' # url #' url <- "https://raw.githubusercontent.com/ropensci/RNeXML/master/inst/examples/trees.xml" #' nexml_read(url) #' # character string of XML #' str <- paste0(readLines(f), collapse = "") #' nexml_read(str) #' # XMLInternalDocument #' library("httr") #' library("XML") #' x <- xmlParse(content(GET(url))) #' nexml_read(x) #' # XMLInternalNode #' nexml_read(xmlRoot(x)) #' } nexml_read <- function(x, ...) { UseMethod("nexml_read") } #' @export #' @rdname nexml_read nexml_read.character <- function(x, ...) { if (!any(grepl("^https?://", x), XML::isXMLString(x), file.exists(x))) { stop("character input must be a URL, xml string or file path", call. 
= FALSE) } # handle remote paths using httr::GET if (grepl("^https?://", x)) { tmp <- GET(x) stop_for_status(tmp) x <- content(tmp, as = "text") } doc <- xmlParse(x, ...) output <- as(xmlRoot(doc), "nexml") free(doc) # explicitly free the pointers after conversion into S4 return(output) } #' @export #' @rdname nexml_read nexml_read.XMLInternalDocument <- function(x, ...) { as(xmlRoot(x), "nexml") } #' @export #' @rdname nexml_read nexml_read.XMLInternalNode <- function(x, ...) { as(x, "nexml") } setAs("XMLInternalNode", "phylo", function(from) as(as(from, "nexml"), "phylo") ) read.nexml <- nexml_read RNeXML/R/concatenate_nexml.R0000644000176200001440000000352412641021656015341 0ustar liggesusers #' Concatenate nexml files #' #' Concatenate nexml files #' @param x,... nexml objects to be concatenated, e.g. from #' \code{\link{write.nexml}} or \code{\link{read.nexml}}. #' Must have unique ids on all elements #' @param recursive logical. If 'recursive = TRUE', the function recursively #' descends through lists (and pairlists) combining all their #' elements into a vector. (Not implemented). #' @return a concatenated nexml file #' @examples #' \dontrun{ #' f1 <- system.file("examples", "trees.xml", package="RNeXML") #' f2 <- system.file("examples", "comp_analysis.xml", package="RNeXML") #' nex1 <- read.nexml(f1) #' nex2 <- read.nexml(f2) #' nex <- c(nex1, nex2) #' } setMethod("c", signature("nexml"), function(x, ..., recursive = FALSE){ elements = list(x, ...) nexml <- new("nexml") ## Check that ids are unique if(!do.call(unique_ids,elements)) stop("ids are not unique across nexml files. 
Consider regenerating ids")
  else {
    nexml@otus <- new("ListOfotus",
                      unlist(lapply(elements, function(n) n@otus),
                             recursive=FALSE))
    nexml@characters <- new("ListOfcharacters",
                            unlist(lapply(elements, function(n) n@characters),
                                   recursive=FALSE))
    nexml@trees <- new("ListOftrees",
                       unlist(lapply(elements, function(n) n@trees),
                              recursive=FALSE))
  }
  nexml
})

## Collect every id attribute in the serialized nexml document
get_ids <- function(nexml){
  doc <- xmlDoc(as(nexml, "XMLInternalNode"))
  out <- unname(xpathSApply(doc, "//@id"))
  free(doc)
  out
}

## TRUE only if no id occurs more than once across all supplied nexml objects
unique_ids <- function(...){
  set <- list(...)
  counts <- table(unlist(lapply(set, get_ids)))
  !any(counts > 1)
}
RNeXML/R/nexml_add.R0000644000176200001440000000227012641021656013602 0ustar liggesusers

#' add elements to a new or existing nexml object
#'
#' add elements to a new or existing nexml object
#' @param x the object to be added
#' @param nexml an existing nexml object onto which the object should be appended
#' @param type the type of object being provided: one of "trees", "characters",
#' "meta", or "namespaces".
#' @param ... additional optional arguments to the add functions
#'
#' @return a nexml object with the additional data
#' @seealso \code{\link{add_trees}} \code{\link{add_characters}} \code{\link{add_meta}} \code{\link{add_namespaces}}
#' @export
#' @examples
#' library("geiger")
#' data(geospiza)
#' geiger_nex <- nexml_add(geospiza$phy, type="trees")
#' geiger_nex <- nexml_add(geospiza$dat, nexml = geiger_nex, type="characters")
nexml_add <- function(x,
                      nexml = new("nexml"),
                      type = c("trees", "characters", "meta", "namespaces"),
                      ...){
  ## Resolve the default vector to a single value; switch() needs a length-1 string
  type <- match.arg(type)
  switch(type,
         "trees" = add_trees(x, nexml, ...),
         "characters" = add_characters(x, nexml, ...),
         ## branch name must match the "meta" choice offered in `type`
         "meta" = add_meta(x, nexml, ...),
         "namespaces" = add_namespaces(x, nexml, ...))
}
RNeXML/R/get_basic_metadata.R0000644000176200001440000000237712641021656015437 0ustar liggesusers

## Goodness, but XPATH is so much more expressive for this purpose...
## get all top-level metadata More extensible than hardwired functions
## The following methods are somewhat too rigid.
## Might make more sense to do get_metadata(nexml, "nexml")["dc:creator"], etc.
## Note that we define our namespace prefixes explicitly, so that should the NeXML use a different abbreviation, this should still work.

#' get_citation
#'
#' get_citation
#' @param nexml a nexml object
#' @return called for its side effect: prints the content of the top-level
#' \code{dcterms:bibliographicCitation} metadata element
#' @export
get_citation <- function(nexml){
  b <- setxpath(as(nexml, "XMLInternalElementNode"))
  ## FIXME should return a citation class nexml!
  cat(unname(xpathSApply(b, "/nexml/meta[@property='dcterms:bibliographicCitation']/@content", namespaces = nexml_namespaces)))
}

#' get_license
#'
#' get_license
#' @param nexml a nexml object
#' @return the content of the top-level \code{dc:rights} metadata if present,
#' otherwise the link given by \code{cc:license}
#' @export
get_license <- function(nexml){
  b <- setxpath(as(nexml, "XMLInternalElementNode"))
  dc_rights <- unname(xpathSApply(b, "/nexml/meta[@property='dc:rights']/@content", namespaces = nexml_namespaces))
  cc_license <- unname(xpathSApply(b, "/nexml/meta[@rel='cc:license']/@href", namespaces = nexml_namespaces))
  if(length(dc_rights) > 0)
    dc_rights
  else
    cc_license
}
RNeXML/R/taxize_nexml.R0000644000176200001440000000233312641021656014356 0ustar liggesusers

#' taxize nexml
#'
#' Check taxonomic names against the specified service and
#' add appropriate semantic metadata to the nexml OTU unit
#' containing the corresponding identifier.
#' @param nexml a nexml object
#' @param type the name of the identifier to use
#' @param ... additional arguments (not implemented yet)
#' @import taxize
#' @export
#' @examples \dontrun{
#' data(bird.orders)
#' birds <- add_trees(bird.orders)
#' birds <- taxize_nexml(birds, "NCBI")
#' }
taxize_nexml <- function(nexml, type = c("NCBI"), ...){
  type <- match.arg(type)
  if(type == "NCBI"){
    for(j in 1:length(nexml@otus)){
      for(i in 1:length(nexml@otus[[j]]@otu)){
        id <- get_uid(nexml@otus[[j]]@otu[[i]]@label)
        if(is.na(id))
          warning(paste("ID for otu", nexml@otus[[j]]@otu[[i]]@label, "not found.
Consider checking the spelling or alternate classification"))
        else
          nexml@otus[[j]]@otu[[i]]@meta <- new("ListOfmeta", list(
            meta(href = paste0("http://ncbi.nlm.nih.gov/taxonomy/", id),
                 rel = "tc:toTaxon")))
      }
    }
  }
  nexml
}
RNeXML/R/get_taxa_meta.R0000644000176200001440000000363412641021656014456 0ustar liggesusers
### TODO: how to deal with missing meta slot elements???
### Right now the functions break when an otu doesn't have meta slot

#' get_taxa_meta
#'
#' Retrieve metadata of all species/OTUs (operational taxonomic units) included in the nexml
#' @param nexml a nexml object
#' @param what One of href, rel, id, or xsi:type
#' @return the list of metadata for each taxon
#' @seealso \code{\link{get_item}}
#' @keywords internal
#' @examples
#' \dontrun{
#' data(bird.orders)
#' birds <- add_trees(bird.orders)
#' birds <- taxize_nexml(birds, "NCBI")
#' RNeXML:::get_taxa_meta(birds)
#' RNeXML:::get_taxa_meta(birds, 'rel')
#' RNeXML:::get_taxa_meta(birds, 'id')
#' RNeXML:::get_taxa_meta(birds, 'xsi:type')
#' }
get_taxa_meta <- function(nexml, what='href'){
  out <- lapply(nexml@otus, function(otus)
    # sapply(otus@otu, function(otu) slot(otu@meta, what)))
    sapply(otus@otu, function(otu) slot(otu@meta[[1]], what)))
  unname(unlist(out, recursive = FALSE))
}
get_otu_meta <- get_taxa_meta

#' get_taxa_meta_list
#'
#' Retrieve metadata of all species/OTUs (operational taxonomic units) included in the nexml
#' @param nexml a nexml object
#' @param what One of href, rel, id, or xsi:type
#' @return the list of metadata for each taxon
#' @seealso \code{\link{get_item}}
#' @keywords internal
#' @examples \dontrun{
#' data(bird.orders)
#' birds <- add_trees(bird.orders)
#' birds <- taxize_nexml(birds, "NCBI")
#' RNeXML:::get_taxa_meta_list(birds)
#' RNeXML:::get_taxa_meta_list(birds, 'rel')
#' RNeXML:::get_taxa_meta_list(birds, 'id')
#' RNeXML:::get_taxa_meta_list(birds, 'xsi:type')
#' }
get_taxa_meta_list <- function(nexml, what='href'){
  out <- lapply(nexml@otus,
function(otus){ out <- sapply(otus@otu, function(otu) slot(otu@meta[[1]], what)) names(out) <- name_by_id(otus@otu) out }) names(out) <- name_by_id(nexml@otus) out } get_otu_meta_list <- get_taxa_meta_list RNeXML/R/nexml_validate.R0000644000176200001440000000430212731606043014640 0ustar liggesusersONLINE_VALIDATOR <- "http://162.13.187.155/nexml/phylows/validator" CANONICAL_SCHEMA <- "http://162.13.187.155/nexml/xsd/nexml.xsd" #ONLINE_VALIDATOR <- "http://www.nexml.org/nexml/phylows/validator" #CANONICAL_SCHEMA <- "http://www.nexml.org/2009/nexml.xsd" #' validate nexml using the online validator tool #' @param file path to the nexml file to validate #' @param schema URL of schema (for fallback method only, set by default). #' @details Requires an internet connection. see http://www.nexml.org/nexml/phylows/validator for more information in debugging invalid files #' @return TRUE if the file is valid, FALSE or error message otherwise #' @export #' @import httr XML #' @examples \dontrun{ #' data(bird.orders) #' birds <- nexml_write(bird.orders, "birds_orders.xml") #' nexml_validate("birds_orders.xml") #' unlink("birds_orders.xml") # delete file to clean up #' } nexml_validate <- function(file, schema=CANONICAL_SCHEMA){ a = POST(ONLINE_VALIDATOR, body=list(file = upload_file(file))) if(a$status_code %in% c(200,201)){ TRUE } else if(a$status_code == 504){ warning("Online validator timed out, trying schema-only validation.") nexml_schema_validate(file, schema=schema) } else if(a$status_code == 400){ warning(paste("Validation failed, error messages:", xpathSApply(htmlParse(content(a, "text")), "//li[contains(@class, 'error') or contains(@class, 'fatal')]", xmlValue) )) FALSE } else { warning(paste("Unable to reach validator. status code:", a$status_code, ". 
Message:\n\n", content(a, "text"))) NULL } } nexml_schema_validate <- function(file, schema=CANONICAL_SCHEMA){ a = GET(schema) if(a$status_code == 200){ if(is.null(xmlSchemaParse(schema))){ warning(paste("Schema not accessible at", schema)) NULL } else { result <- xmlSchemaValidate(schema, file) if(length(result$errors) == 0){ TRUE } else { warning(paste(result$errors)) FALSE } } } else { warning("Unable to obtain schema, couldn't validate") NULL } } #xmlSchemaValidate(xmlSchemaParse(content(a, "text"), asText=TRUE), file) # fails to get other remote resources RNeXML/vignettes/0000755000176200001440000000000012734542424013336 5ustar liggesusersRNeXML/vignettes/metadata.Rmd0000644000176200001440000001675012641021656015567 0ustar liggesusers--- title: "Handling Metadata in RNeXML" author: - Carl Boettiger - Scott Chamberlain - Rutger Vos - Hilmar Lapp output: html_vignette --- ```{r compile-settings, include=FALSE} library("methods") library("knitr") opts_chunk$set(tidy = FALSE, warning = FALSE, message = FALSE, cache = FALSE, comment = NA, verbose = TRUE) basename <- gsub(".Rmd", "", knitr:::knit_concord$get('infile')) ``` ## Writing NeXML metadata The `add_basic_meta()` function takes as input an existing `nexml` object (like the other `add_` functions, if none is provided it will create one), and at the time of this writing any of the following parameters: `title`, `description`, `creator`, `pubdate`, `rights`, `publisher`, `citation`. Other metadata elements and corresponding parameters may be added in the future. Load the packages and data: ```{r} library('RNeXML') data(bird.orders) ``` Create an `nexml` object for the phylogeny `bird.orders` and add appropriate metadata: ```{r} birds <- add_trees(bird.orders) birds <- add_basic_meta( title = "Phylogeny of the Orders of Birds From Sibley and Ahlquist", description = "This data set describes the phylogenetic relationships of the orders of birds as reported by Sibley and Ahlquist (1990). 
Sibley and Ahlquist inferred this phylogeny from an extensive number of DNA/DNA hybridization experiments. The ``tapestry'' reported by these two authors (more than 1000 species out of the ca. 9000 extant bird species) generated a lot of debates. The present tree is based on the relationships among orders. The branch lengths were calculated from the values of Delta T50H as found in Sibley and Ahlquist (1990, fig. 353).", citation = "Sibley, C. G. and Ahlquist, J. E. (1990) Phylogeny and classification of birds: a study in molecular evolution. New Haven: Yale University Press.", creator = "Sibley, C. G. and Ahlquist, J. E.", nexml=birds) ``` Instead of a literal string, citations can also be provided in R's `bibentry` type, which is the one in which R package citations are obtained: ```{r} birds <- add_basic_meta(citation = citation("ape"), nexml = birds) ``` ## Taxonomic identifiers The `taxize_nexml()` function uses the R package `taxize` [@Chamberlain_2013] to check each taxon label against the NCBI database. If a unique match is found, a metadata annotation is added to the taxon providing the NCBI identification number to the taxonomic unit. ```{r message=FALSE, results='hide'} birds <- taxize_nexml(birds, "NCBI") ``` If no match is found, the user is warned to check for possible typographic errors in the taxonomic labels provided. If multiple matches are found, the user will be prompted to choose between them. ## Custom metadata extensions We can get a list of namespaces along with their prefixes from the `nexml` object: ```{r} prefixes <- get_namespaces(birds) prefixes["dc"] ``` We create a `meta` element containing this annotation using the `meta` function: ```{r} modified <- meta(property = "prism:modificationDate", content = "2013-10-04") ``` We can add this annotation to our existing `birds` NeXML file using the `add_meta()` function. Because we do not specify a level, it is added to the root node, referring to the NeXML file as a whole. 
```{r} birds <- add_meta(modified, birds) ``` The built-in vocabularies are just the tip of the iceberg of established vocabularies. Here we add an annotation from the `skos` namespace which describes the history of where the data comes from: ```{r} history <- meta(property = "skos:historyNote", content = "Mapped from the bird.orders data in the ape package using RNeXML") ``` Because `skos` is not in the current namespace list, we add it with a url when adding this meta element. We also specify that this annotation be placed at the level of the `trees` sub-node in the NeXML file. ```{r} birds <- add_meta(history, birds, level = "trees", namespaces = c(skos = "http://www.w3.org/2004/02/skos/core#")) ``` For finer control of the level at which a `meta` element is added, we will manipulate the `nexml` R object directly using S4 sub-setting, as shown in the supplement. Much richer metadata annotation is possible. Later we illustrate how metadata annotation can be used to extend the base NeXML format to represent new forms of data while maintaining compatibility with any NeXML parser. The `RNeXML` package can be easily extended to support helper functions such as `taxize_nexml` to add additional metadata without imposing a large burden on the user. ## Reading NeXML metadata A call to the `nexml` object prints some metadata summarizing the data structure: ```{r } birds ``` We can extract all metadata pertaining to the NeXML document as a whole (annotations of the XML root node, ``) with the command ```{r} meta <- get_metadata(birds) ``` This returns a data.frame of available metadata. We can see the kinds of metadata recorded from the names: ```{r} meta ``` We can also access a table of taxonomic metadata: ```{r get_taxa} get_taxa(birds) ``` Which returns text from the otu element labels, typically used to define taxonomic names, rather than text from explicit meta elements. 
We can also access metadata at a specific level (or use `level=all` to extract all meta elements in a list). Here we show only the first few results:

```{r}
otu_meta <- get_metadata(birds, level="otus/otu")
otu_meta
```

## Merging metadata tables

We often want to combine metadata from multiple tables. For instance, in this exercise we want to include the taxonomic identifier and id value for each species returned in the character table. This helps us more precisely identify the species whose traits are described by the table.

```{r}
library("RNeXML")
library("dplyr")
library("geiger")
knitr::opts_chunk$set(message = FALSE, warning=FALSE, comment = NA)
```

To begin, let's generate a `NeXML` file using the tree and trait data from the `geiger` package's "primates" data:

```{r}
data("primates")
add_trees(primates$phy) %>%
  add_characters(primates$dat, ., append=TRUE) %>%
  taxize_nexml() -> nex
```

(Note that we've used `dplyr`'s cute pipe syntax, but unfortunately our `add_` methods take the `nexml` object as the _second_ argument instead of the first, so this isn't as elegant since we need the stupid `.` to show where the piped output should go...)

We now read in the three tables of interest. Note that we tell `get_characters` to give us species labels as their own column, rather than as rownames. The latter is the default only because this plays more nicely with the default format for character matrices that is expected by `geiger` and other phylogenetics packages, but is in general a silly choice for data manipulation.
```{r} otu_meta <- get_metadata(nex, "otus/otu") taxa <- get_taxa(nex) char <- get_characters(nex, rownames_as_col = TRUE) ``` We can take a peek at what the tables look like, just to orient ourselves: ```{r} otu_meta taxa head(char) ``` Now that we have nice `data.frame` objects for all our data, it's easy to join them into the desired table with a few obvious `dplyr` commands: ```{r} taxa %>% left_join(char, by = c("label" = "taxa")) %>% left_join(otu_meta, by = "otu") %>% select(otu, label, x, href) ``` Because these are all from the same otus block anyway, we haven't selected that column, but were it of interest it is also available in the taxa table. RNeXML/vignettes/simmap.Rmd0000644000176200001440000001545512641021656015276 0ustar liggesusers--- title: "Extending NeXML: an example based on simmap" author: - Carl Boettiger - Scott Chamberlain - Rutger Vos - Hilmar Lapp output: html_vignette bibliography: references.bib --- ```{r compile-settings, include=FALSE} library("methods") library("knitr") opts_chunk$set(tidy = FALSE, warning = FALSE, message = FALSE, cache = FALSE, comment = NA, verbose = TRUE) basename <- gsub(".Rmd", "", knitr:::knit_concord$get('infile')) library("RNeXML") ``` ## Extending the NeXML standard through metadata annotation. Here we illustrate this process using the example of stochastic character mapping [@Huelsenbeck_2003]. A stochastic character map is simply an annotation of the branches on a phylogeny, assigning each section of each branch to a particular "state" (typically of a morphological characteristic). @Bollback_2006 provides a widely used stand-alone software implementation of this method in the software `simmap`, which modified the standard Newick tree format to express this additional information. This can break compatibility with other software, and creates a format that cannot be interpreted without additional information describing this convention. 
By contrast, the NeXML extension is not only backwards compatible but contains a precise and machine-readable description of what it is encoding.

In this example, we illustrate how to encode the additional information required to define a stochastic character mapping (a `simmap` mapping) in NeXML. @Revell_2012 describes the `phytools` package for R, which includes utilities for reading, manipulating, and writing `simmap` files in R. In this example, we also show how to define functions that map the R representation used by Revell (an extension of the `ape` class) into the NeXML extension we have defined, using `RNeXML` functions.

Since a stochastic character map simply assigns different states to parts of a branch (or edge) on the phylogenetic tree, we can create a NeXML representation by annotating the `edge` elements with appropriate `meta` elements. These elements need to describe the character state being assigned and the duration (in terms of branch length) that the edge spends in that state (stochastic character maps are specific to time-calibrated or ultrametric trees).

NeXML already defines the `characters` element to handle discrete character traits (`nex:char`) and the states they can assume (`nex:state`). We will thus reuse the `characters` element for this purpose, referring to both the character trait and the states by the ids assigned to them in that element. (NeXML's convention of referring to everything by id permits a single canonical definition of each term, making it clear where additional annotation belongs.)

For each edge, we need to indicate:

- That our annotation contains a stochastic character mapping reconstruction
- Since many reconstructions are possible for a single edge, we give each reconstruction an id
- We indicate for which character trait we are defining the reconstruction
- We then indicate which states the character assumes on that edge.
For each state realized on the edge, that involves stating: + the state assignment + the duration (length of time) for which the edge spends in the given state + the order in which the state changes happen (Though we could just assume state transitions are listed chronologically, NeXML suggests making all data explicit, rather than relying on the structure of the data file to convey information). Thus the annotation for an edge that switches from state `s2` to state `s1` of character `cr1` would be constructed like this: ```{r} m <- meta("simmap:reconstructions", children = c( meta("simmap:reconstruction", children = c( meta("simmap:char", "cr1"), meta("simmap:stateChange", children = c( meta("simmap:order", 1), meta("simmap:length", "0.2030"), meta("simmap:state", "s2"))), meta("simmap:char", "cr1"), meta("simmap:stateChange", children = c( meta("simmap:order", 2), meta("simmap:length", "0.0022"), meta("simmap:state", "s1"))) )))) ``` Of course writing out such a definition manually becomes tedious quickly. Because these are just R commands, we can easily define a function that can loop over an assignment like this for each edge, extracting the appropriate order, length and state from an existing R object such as that provided in the `phytools` package. Likewise, it is straightforward to define a function that reads this data using the `RNeXML` utilities and converts it back to the `phytools` package. The full implementation of this mapping can be seen in the `simmap_to_nexml()` and the `nexml_to_simmap()` functions provided in the `RNeXML` package. As the code indicates, the key step is simply to define the data in meta elements. In so doing, we have defined a custom namespace, `simmap`, to hold our variables. 
This allows us to provide a URL with more detailed descriptions of what each of these elements means:

```{r}
nex <- add_namespaces(c(simmap = "https://github.com/ropensci/RNeXML/tree/master/inst/simmap.md"))
```

At that URL we have posted a simple description of each term.

Using this convention we can generate NeXML files containing `simmap` data, read those files into R, and convert them back into the `phytools` package format. These simple functions serve as further illustration of how `RNeXML` can be used to extend the NeXML standard. We illustrate their use briefly here, starting with loading a `nexml` object containing a `simmap` reconstruction into R:

```{r}
data(simmap_ex)
```

The `get_trees()` function can be used to return an `ape::phylo` tree as usual. `RNeXML` automatically detects the `simmap` reconstruction data and includes it in a `maps` element of the `ape::phylo` object, for use with other `phytools` functions.

```{r}
phy <- nexml_to_simmap(simmap_ex)
```

We can then use various functions from `phytools` designed for `simmap` objects [@Revell_2012], such as the plotting function:

```{r Figure1, fig.cap="Stochastic character mapping on a phylogeny, as generated by the phytools package after parsing the simmap-extended NeXML."}
library("phytools")
plotSimmap(phy)
```

Likewise, we can convert the object back into the NeXML format and write it out to file to be read by other users.

```{r}
nex <- simmap_to_nexml(phy)
nexml_write(nex, "simmap.xml")
```

Though other NeXML parsers (for instance, for Perl or Python) have not been written explicitly to express `simmap` data, those parsers will nonetheless be able to successfully parse this file and expose the `simmap` data to the user.
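Returning to the per-edge annotation described earlier: the loop that extracts order, length, and state for each edge can be pictured with a rough base-R sketch. It assumes the reconstruction is held as a list with one named numeric vector per edge (names are states, values are durations, in the order the states occur), which is how `phytools` is understood to store its `maps` element. The `maps` object and the `edge_changes()` helper below are made up for illustration; this is not the code used inside `simmap_to_nexml()`.

```r
# Stand-in for a phytools-style `maps` list (hypothetical data):
# one named numeric vector per edge, names = states, values = durations.
maps <- list(
  c(s2 = 0.2030, s1 = 0.0022),  # edge 1 switches from s2 to s1
  c(s1 = 0.5000)                # edge 2 stays in s1 throughout
)

# Flatten one edge's map into rows of (edge, order, length, state).
# `edge_changes` is a hypothetical helper, not an RNeXML function.
edge_changes <- function(map, edge) {
  data.frame(
    edge   = edge,
    order  = seq_along(map),   # position of the state along the edge
    length = unname(map),      # time spent in that state
    state  = names(map),       # id of the state
    stringsAsFactors = FALSE
  )
}

# Apply the helper to every edge and stack the results.
changes <- do.call(rbind, Map(edge_changes, maps, seq_along(maps)))
changes
```

Each row of `changes` corresponds to one `simmap:stateChange` annotation; a real converter would wrap these values in `meta()` calls rather than a data frame.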
```{r cleanup, include=FALSE}
unlink("simmap.xml")
```

RNeXML/vignettes/S4.Rmd0000644000176200001440000001006212641021656014263 0ustar liggesusers
---
title: "The nexml S4 Object"
author:
  - Carl Boettiger
  - Scott Chamberlain
  - Rutger Vos
  - Hilmar Lapp
output: html_vignette
---

```{r supplement-compile-settings, include=FALSE}
library("methods")
library("knitr")
opts_chunk$set(tidy = FALSE, warning = FALSE, message = FALSE,
               cache = FALSE, comment = NA, verbose = TRUE)
basename <- 'S4'
```

```{r include=FALSE}
library("RNeXML")
```

## Understanding the `nexml` S4 object

The `RNeXML` package provides many convenient functions to add and extract information from `nexml` objects in the R environment without requiring the reader to understand the details of the NeXML data structure, which also makes it less likely that a user will generate invalid NeXML syntax that could not be read by other parsers. The `nexml` object we have been using in all of the examples is built on R's S4 mechanism. Advanced users may sometimes prefer to interact with the data structure more directly using R's S4 class mechanism and subsetting methods. Many R users are more familiar with the S3 class mechanism (such as in the `ape` package phylo objects) than with the S4 class mechanism used in phylogenetics packages such as `ouch` and `phylobase`. The `phylobase` vignette provides an excellent introduction to these data structures. Users already familiar with subsetting lists and other S3 objects in R are likely familiar with the use of the `$` operator, such as `phy$edge`. S4 objects simply use an `@` operator instead (but cannot be subset using numeric arguments such as `phy[[1]]` or named arguments such as `phy[["edge"]]`).

The `nexml` object is an S4 object, as are all of its components (slots). Its hierarchical structure corresponds exactly with the XML tree of a NeXML file, with the single exception that both XML attributes and children are represented as slots.
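As a minimal sketch of the `$` versus `@` distinction, the following uses only base R's `methods` package. The classes `toy_otu` and `toy_list` and their slots are invented for illustration and are not part of `RNeXML`.

```r
library("methods")

# A toy S4 class with two slots (hypothetical, for illustration only).
setClass("toy_otu", slots = c(id = "character", label = "character"))
x <- new("toy_otu", id = "ou1", label = "Homo sapiens")

x@label              # slot access with `@`
slot(x, "label")     # functional equivalent of `@`
slotNames(x)         # names of the slots defined by the class

# `$` and `[[` are not defined for a plain S4 object, so both fail.
dollar_fails  <- inherits(try(x$label, silent = TRUE), "try-error")
bracket_fails <- inherits(try(x[[1]], silent = TRUE), "try-error")

# An S4 class that extends "list" does support `[[`, which is why
# list-like containers of S4 objects can be indexed by position.
setClass("toy_list", contains = "list")
l <- new("toy_list", list(x))
l[[1]]@label
```

The last three lines show the pattern that makes expressions of the form `container[[1]]@slot` work: the container inherits list behaviour, while each element is reached through `@`.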
S4 objects have constructor functions to initialize them. We create a new `nexml` object with the command:

```{r}
nex <- new("nexml")
```

We can see a list of slots contained in this object with

```{r}
slotNames(nex)
```

Some of these slots have already been populated for us, for instance, the schema version and default namespaces:

```{r}
nex@version
nex@namespaces
```

Note that `nex@namespaces` serves the same role as the `get_namespaces()` function, but provides direct access to the slot data. For instance, with this syntax we could also overwrite the existing namespaces with `nex@namespaces <- NULL`. Changing the namespace in this way is not advised.

Some slots can contain multiple elements of the same type, such as `trees`, `characters`, and `otus`. For instance, we see that

```{r}
class(nex@characters)
```

is an object of class `ListOfcharacters`, and is currently empty,

```{r}
length(nex@characters)
```

In order to assign an object to a slot, it must match the class definition of the slot. We can create a new element of any given class with the `new` function,

```{r}
nex@characters <- new("ListOfcharacters", list(new("characters")))
```

and now we have a length-1 list of character matrices,

```{r}
length(nex@characters)
```

and we access the first character matrix using the list notation, `[[1]]`. Here we check that its class is `characters`.

```{r}
class(nex@characters[[1]])
```

Direct subsetting has two primary use cases: (a) looking up (and possibly editing) a specific value of an element, or (b) adding metadata annotations to specific elements.
Consider the example file ```{r} f <- system.file("examples", "trees.xml", package="RNeXML") nex <- nexml_read(f) ``` We can look up the species label of the first `otu` in the first `otus` block: ```{r} nex@otus[[1]]@otu[[1]]@label ``` We can add metadata to this particular OTU using this subsetting format ```{r} nex@otus[[1]]@otu[[1]]@meta <- c(meta("skos:note", "This species was incorrectly identified"), nex@otus[[1]]@otu[[1]]@meta) ``` Here we use the `c` operator to append this element to any existing meta annotations to this otu. RNeXML/vignettes/sparql.Rmd0000644000176200001440000001372412641021656015307 0ustar liggesusers--- title: "SPARQL with RNeXML" author: - Carl Boettiger - Scott Chamberlain - Rutger Vos - Hilmar Lapp output: html_vignette --- ```{r supplement-compile-settings, include=FALSE} library("methods") library("knitr") opts_chunk$set(tidy = FALSE, warning = FALSE, message = FALSE, cache = FALSE, comment = NA, verbose = TRUE, eval = all(c(require("Sxslt"), require("rrdf")))) basename <- 'sparql' ``` ```{r include=FALSE} library("RNeXML") ``` ## SPARQL Queries Rich, semantically meaningful metadata lies at the heart of the NeXML standard. R provides a rich environment to unlock this information. While our previous examples have relied on the user knowing exactly what metadata they intend to extract (title, publication date, citation information, and so forth), _semantic_ metadata has meaning that a computer can make use of, allowing us to make much more conceptually rich queries than those simple examples. The SPARQL query language is a powerful way to make use of such semantic information in making complex queries. While users should consult a formal introduction to SPARQL for further background, here we illustrate how SPARQL can be used in combination with R functions in ways that would be much more tedious to assemble with only traditional/non-semantic queries. 
The SPARQL query language is provided for the R environment through the `rrdf` package [@Willighagen_2014], so we start by loading that package. We will also make use of functions from `phytools` and `RNeXML`.

```{r}
library("rrdf")
library("XML")
library("phytools")
library("RNeXML")
```

We read in an example file that contains semantic metadata annotations describing the taxonomic units (OTUs) used in the tree.

```{r}
nexml <- nexml_read(system.file("examples/primates.xml", package="RNeXML"))
```

In particular, this example declares the taxon rank, NCBI identifier and parent taxon for each OTU, such as:

```xml
```

In this example, we will construct a cladogram by using this information to identify the taxonomic rank of each OTU, and its shared parent taxonomic rank. (If this example looks complex, try writing down the steps to do this without the aid of the SPARQL queries.) These examples show the manipulation of semantic triples, Uniform Resource Identifiers (URIs) and use of the SPARQL "Join" operator.

Note that this example can be run using `demo("sparql", "RNeXML")` to see the code displayed in the R terminal and to avoid character errors that can occur in having to copy and paste from PDF files.

We begin by extracting the RDF graph from the NeXML,

```{r}
rdf <- get_rdf(system.file("examples/primates.xml", package="RNeXML"))
tmp <- tempfile()  # so we must write the XML out first
saveXML(rdf, tmp)
graph <- load.rdf(tmp)
```

We then fetch the NCBI URI for the taxon that has rank 'Order', i.e. the root of the primates phylogeny. The dot operator `.` between clauses implies a join, in this case

```{r}
root <- sparql.rdf(graph, "SELECT ?uri WHERE { ?id . ?id ?uri }")
```

This makes use of the SPARQL query language provided by the `rrdf` package. We will also define some helper functions that use SPARQL queries.
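As an aside, the join implied by `.` can be pictured with ordinary data frames. This is only an analogy in base R under stated assumptions: the triples, the predicate names `ex:rank` and `ex:taxon`, and the URIs below are invented placeholders, not the vocabulary used in the example file, and `rrdf` does not work this way internally.

```r
# Toy triple store: one row per (subject, predicate, object) statement.
# All identifiers are hypothetical.
triples <- data.frame(
  subject   = c("t1", "t1", "t2", "t2"),
  predicate = c("ex:rank", "ex:taxon", "ex:rank", "ex:taxon"),
  object    = c("Order", "uri:primates", "Species", "uri:homo_sapiens"),
  stringsAsFactors = FALSE
)

# Clause 1: ?id ex:rank "Order"  -> bindings for ?id
is_order <- triples$predicate == "ex:rank" & triples$object == "Order"
clause1 <- data.frame(id = triples$subject[is_order],
                      stringsAsFactors = FALSE)

# Clause 2: ?id ex:taxon ?uri    -> bindings for ?id and ?uri
is_taxon <- triples$predicate == "ex:taxon"
clause2 <- data.frame(id  = triples$subject[is_taxon],
                      uri = triples$object[is_taxon],
                      stringsAsFactors = FALSE)

# The `.` between clauses: keep only rows where ?id agrees in both.
root_uri <- merge(clause1, clause2, by = "id")$uri
root_uri
```

Only the subject that satisfies both clauses survives the merge, which is the behaviour the two-clause query relies on to pick out the root taxon.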
Here we define a function to get the name ```{r} get_name <- function(id) { max <- length(nexml@otus[[1]]@otu) for(i in 1:max) { if ( nexml@otus[[1]]@otu[[i]]@id == id ) { label <- nexml@otus[[1]]@otu[[i]]@label label <- gsub(" ","_",label) return(label) } } } ``` Next, we define a recursive function to build a newick tree from the taxonomic rank information. ```{r} recurse <- function(node){ # fetch the taxonomic rank and id string rank_query <- paste0( "SELECT ?rank ?id WHERE { ?id <",node,"> . ?id ?rank }") result <- sparql.rdf(graph, rank_query) # get the local ID, strip URI part id <- result[2] id <- gsub("^.+#", "", id, perl = TRUE) # if rank is terminal, return the name if (result[1] == "http://rs.tdwg.org/ontology/voc/TaxonRank#Species") { return(get_name(id)) } # recurse deeper else { child_query <- paste0( "SELECT ?uri WHERE { ?id <",node,"> . ?id ?uri }") children <- sparql.rdf(graph, child_query) return(paste("(", paste(sapply(children, recurse), sep = ",", collapse = "," ), ")", get_name(id), # label interior nodes sep = "", collapse = "")) } } ``` With these functions in place, it is straight forward to build the tree from the semantic RDFa data and then visualize it ```{r} newick <- paste(recurse(root), ";", sep = "", collapse = "") tree <- read.newick(text = newick) collapsed <- collapse.singles(tree) plot(collapsed, type='cladogram', show.tip.label=FALSE, show.node.label=TRUE, cex=0.75, edge.color='grey60', label.offset=-9) ``` RNeXML/vignettes/references.bib0000644000176200001440000002755312641021656016145 0ustar liggesusers@Misc{W3C_2014, url = {http://www.w3.org/TR/rdf-sparql-query/}, title = {SPARQL Query Language for RDF}, journal = {W3C}, howpublished = {\url{http://www.w3.org/TR/rdf-sparql-query/}}, author = {Eric Prud'hommeaux}, archived = {http://greycite.knowledgeblog.org/?uri=http%3A%2F%2Fwww.w3.org%2FTR%2Frdf-sparql-query%2F}, year = {2014}, } @article{Parr2011, abstract = {The accelerating growth of data and knowledge in evolutionary 
biology is indisputable. Despite this rapid progress, information remains scattered, poorly documented and in formats that impede discovery and integration. A grand challenge is the creation of a linked system of all evolutionary data, information and knowledge organized around Darwin's ever-growing Tree of Life. Such a system, accommodating topological disagreement where necessary, would consolidate taxon names, phenotypic and geographical distributional data across clades, and serve as an integrated community resource. The field of evolutionary informatics, reviewed here for the first time, has matured into a robust discipline that is developing the conceptual, infrastructure and community frameworks for meeting this grand challenge.}, author = {Parr, Cynthia S and Guralnick, Robert and Cellinese, Nico and Page, Roderic D M}, doi = {10.1016/j.tree.2011.11.001}, file = {:home/cboettig/Documents/Mendeley/Trends in ecology \& evolution/2011/Parr et al. - 2011 - Trends in ecology \& evolution.pdf:pdf}, issn = {0169-5347}, journal = {Trends in ecology \& evolution}, mendeley-groups = {Evolutionary Theory/Phylogenetic Methods}, month = dec, number = {2}, pages = {94--103}, pmid = {22154516}, publisher = {Elsevier Ltd}, title = {{Evolutionary informatics: unifying knowledge about the diversity of life.}}, url = {http://www.ncbi.nlm.nih.gov/pubmed/22154516}, volume = {27}, year = {2011} } @Article{Revell_2012, title = {phytools: An R package for phylogenetic comparative biology (and other things).}, author = {Liam J. 
Revell}, journal = {Methods in Ecology and Evolution}, year = {2012}, volume = {3}, pages = {217-223}, } @Book{Xie_2013, title = {Dynamic Documents with {R} and knitr}, author = {Yihui Xie}, publisher = {Chapman and Hall/CRC}, address = {Boca Raton, Florida}, year = {2013}, note = {ISBN 978-1482203530}, url = {http://yihui.name/knitr/}, } @Article{Vos_2012, doi = {10.1093/sysbio/sys025}, url = {http://dx.doi.org/10.1093/sysbio/sys025}, year = {2012}, month = {Feb}, publisher = {Oxford University Press (OUP)}, volume = {61}, number = {4}, pages = {675-689}, author = {R. A. Vos and J. P. Balhoff and J. A. Caravas and M. T. Holder and H. Lapp and W. P. Maddison and P. E. Midford and A. Priyam and J. Sukumaran and X. Xia and A. Stoltzfus}, title = {NeXML: Rich, Extensible, and Verifiable Representation of Comparative Data and Metadata}, journal = {Systematic Biology}, } @Article{Harmon_2008, title = {GEIGER: investigating evolutionary radiations}, author = {LJ Harmon and JT Weir and CD Brock and RE Glor and W Challenger}, journal = {Bioinformatics}, year = {2008}, volume = {24}, pages = {129-131}, } @Article{Rausher_2010, doi = {10.1111/j.1558-5646.2009.00940.x}, url = {http://dx.doi.org/10.1111/j.1558-5646.2009.00940.x}, year = {2010}, month = {Mar}, publisher = {Wiley-Blackwell}, volume = {64}, number = {3}, pages = {603-604}, author = {Mark D. Rausher and Mark A. McPeek and Allen J. Moore and Loren Rieseberg and Michael C. 
Whitlock}, title = {Data Archiving}, journal = {Evolution}, } @Article{Boettiger_2012, doi = {10.1111/j.2041-210x.2012.00247.x}, url = {http://dx.doi.org/10.1111/j.2041-210x.2012.00247.x}, year = {2012}, month = {Oct}, publisher = {Wiley-Blackwell}, volume = {3}, number = {6}, pages = {1060-1066}, author = {Carl Boettiger and Duncan {Temple Lang}}, editor = {Luke Harmon}, title = {Treebase: an R package for discovery, access and manipulation of online phylogenies}, journal = {Methods Ecol Evol}, } @Manual{R, title = {R: A Language and Environment for Statistical Computing}, author = {{R Core Team}}, organization = {R Foundation for Statistical Computing}, address = {Vienna, Austria}, year = {2014}, url = {http://www.R-project.org/}, } @Manual{phylobase, title = {phylobase: Base package for phylogenetic structures and comparative data}, author = {{NESCENT R Hackathon Team}}, year = {2014}, note = {R package version 0.6.8}, url = {http://CRAN.R-project.org/package=phylobase}, } @Article{Tenopir_2011, doi = {10.1371/journal.pone.0021101}, url = {http://dx.doi.org/10.1371/journal.pone.0021101}, year = {2011}, month = {Jun}, publisher = {Public Library of Science (PLoS)}, volume = {6}, number = {6}, pages = {e21101}, author = {Carol Tenopir and Suzie Allard and Kimberly Douglass and Arsev Umur Aydinoglu and Lei Wu and Eleanor Read and Maribeth Manoff and Mike Frame}, editor = {Cameron Neylon}, title = {Data Sharing by Scientists: Practices and Perceptions}, journal = {PLoS ONE}, } @Manual{Boettiger_2014, title = {knitcitations: Citations for knitr markdown files}, author = {Carl Boettiger}, note = {R package version 1.0-1}, url = {https://github.com/cboettig/knitcitations}, year = {2014}, } @InCollection{Hartig_2012, doi = {10.1007/978-3-642-31753-8_56}, url = {http://dx.doi.org/10.1007/978-3-642-31753-8_56}, year = {2012}, publisher = {Springer Science + Business Media}, pages = {506-507}, author = {Olaf Hartig}, title = {An Introduction to SPARQL and Queries over 
Linked Data}, booktitle = {Web Engineering}, } @Article{Chamberlain_2013, doi = {10.12688/f1000research.2-191.v2}, url = {http://dx.doi.org/10.12688/f1000research.2-191.v2}, year = {2013}, month = {Oct}, publisher = {F1000 Research, Ltd.}, author = {Scott A. Chamberlain and Eduard Sz{\"o}cs}, title = {taxize: taxonomic search and retrieval in R}, journal = {F1000Research}, } @Article{Drew_2013, doi = {10.1371/journal.pbio.1001636}, url = {http://dx.doi.org/10.1371/journal.pbio.1001636}, year = {2013}, month = {Sep}, publisher = {Public Library of Science (PLoS)}, volume = {11}, number = {9}, pages = {e1001636}, author = {Bryan T. Drew and Romina Gazis and Patricia Cabezas and Kristen S. Swithers and Jiabin Deng and Roseana Rodriguez and Laura A. Katz and Keith A. Crandall and David S. Hibbett and Douglas E. Soltis}, title = {Lost Branches on the Tree of Life}, journal = {PLoS Biol}, } @Article{Bollback_2006, doi = {10.1186/1471-2105-7-88}, url = {http://dx.doi.org/10.1186/1471-2105-7-88}, year = {2006}, publisher = {Springer Science + Business Media}, volume = {7}, number = {1}, pages = {88}, author = {JonathanP Bollback}, journal = {BMC Bioinformatics}, } @Article{Cranston_2014, doi = {10.1371/currents.tol.bf01eff4a6b60ca4825c69293dc59645}, url = {http://dx.doi.org/10.1371/currents.tol.bf01eff4a6b60ca4825c69293dc59645}, year = {2014}, publisher = {Public Library of Science (PLoS)}, author = {Karen Cranston and Luke J. Harmon and Maureen A. O'Leary and Curtis Lisle}, title = {Best Practices for Data Sharing in Phylogenetic Research}, journal = {PLoS Curr}, } @Article{Huelsenbeck_2003, doi = {10.1080/10635150390192780}, url = {http://dx.doi.org/10.1080/10635150390192780}, year = {2003}, month = {Apr}, publisher = {Oxford University Press (OUP)}, volume = {52}, number = {2}, pages = {131-158}, author = {John P. Huelsenbeck and Rasmus Nielsen and Jonathan P. 
Bollback}, title = {Stochastic Mapping of Morphological Characters}, journal = {Systematic Biology}, } @Article{Stoltzfus_2012, doi = {10.1186/1756-0500-5-574}, url = {http://dx.doi.org/10.1186/1756-0500-5-574}, year = {2012}, publisher = {Springer Science + Business Media}, volume = {5}, number = {1}, pages = {574}, author = {Arlin Stoltzfus and Brian O'Meara and Jamie Whitacre and Ross Mounce and Emily L Gillespie and Sudhir Kumar and Dan F Rosauer and Rutger A Vos}, title = {Sharing and re-use of phylogenetic trees (and associated data) to facilitate synthesis}, journal = {BMC Research Notes}, } @Article{Stodden_2014, doi = {10.2139/ssrn.1550193}, url = {http://dx.doi.org/10.2139/ssrn.1550193}, publisher = {Social Science Electronic Publishing}, author = {Victoria Stodden}, title = {The Scientific Method in Practice: Reproducibility in the Computational Sciences}, journal = {SSRN Journal}, year = {2014}, } @Article{Paradis_2004, title = {{APE}: analyses of phylogenetics and evolution in {R} language}, author = {E. Paradis and J. Claude and K. Strimmer}, journal = {Bioinformatics}, year = {2004}, volume = {20}, pages = {289-290}, } @Misc{taskview, url = {http://cran.r-project.org/web/views/Phylogenetics.html}, author = {Brian O'Meara}, title = {CRAN Task View: Phylogenetics, Especially Comparative Methods}, howpublished = {\url{http://cran.r-project.org/web/views/Phylogenetics.html}}, archived = {http://greycite.knowledgeblog.org/?uri=http%3A%2F%2Fcran.r-project.org%2Fweb%2Fviews%2FPhylogenetics.html}, year = {2014}, } @ARTICLE{Maddison_1997, title = "{NEXUS}: an extensible file format for systematic information", author = "Maddison, D and Swofford, D and Maddison, W", journal = "Syst. 
Biol.", volume = 46, number = 4, pages = "590--621", month = dec, year = 1997, url = "http://www.ncbi.nlm.nih.gov/pubmed/11975335", keywords = "data standard", issn = "1063-5157", pmid = "11975335" } @INCOLLECTION{Piel_2002, title = "{TreeBASE}: a database of phylogenetic information", booktitle = "The Interoperable ``Catalog of Life''", author = "Piel, W H and Donoghue, M J and Sanderson, Michael J", editor = "Shimura, J and Wilson, K L and Gordon, D", abstract = "Phylogenetic systematics brings added value to the species lists and taxonomic inventories that form the groundwork of our understanding of biodiversity. But this added value is lost without a central database to store what we know about historical patterns and evolutionary relationships. TreeBASE was developed to harness this information and to provide a tool to study the evolution of biodiversity. Access to phylogenetic trees, and to the data underlying them, is needed for a wide variety of purposes, including comparative studies of morphological and molecular evolution, biogeography, coevolution, and studies of congruence of results based on different sources of evidence. Such data are also needed to monitor progress in phylogenetic research, to test new methods of analysis, and to address immediate practical problems in conservation of biodiversity. TreeBASE stores published phylogenetic trees, character and molecular data matrices, bibliographic information, and some details on taxa, characters, algorithms used, and analyses performed. The database is designed to be explored interactively and to allow retrieval and recombination of trees and data from different studies. TreeBASE therefore provides a means of assessing and synthesizing knowledge of phylogenetic information and biodiversity. 
The URL is: http://phylogeny.harvard.edu/treebase.", publisher = "National Institute for Environmental Studies", number = 171, pages = "41--47", series = "Research Report", year = 2002, url = "http://donoghuelab.yale.edu/sites/default/files/124_piel_shimura02.pdf", address = "Tsukuba, Japan" } @Manual{XML, title = {XML: Tools for parsing and generating XML within R and S-Plus.}, author = {Duncan Temple Lang}, year = {2013}, note = {R package version 3.98-1.1}, url = {http://CRAN.R-project.org/package=XML}, }
RNeXML/README.md0000644000176200001440000002751112734530403012605 0ustar liggesusers
[![DOI](https://zenodo.org/badge/doi/10.5281/zenodo.13131.svg)](http://dx.doi.org/10.5281/zenodo.13131) [![Build Status](https://api.travis-ci.org/ropensci/RNeXML.png)](https://travis-ci.org/ropensci/RNeXML)

RNeXML: The next-generation phylogenetics format comes to R
===========================================================

- Maintainer: Carl Boettiger
- Authors: Carl Boettiger, Scott Chamberlain, Hilmar Lapp, Kseniia Shumelchyk, Rutger Vos
- License: BSD-3
- [Issues](https://github.com/ropensci/RNeXML/issues): Bug reports, feature requests, and development discussion.

An extensive and rapidly growing collection of richly annotated phylogenetics data is now available in the NeXML format. NeXML relies on state-of-the-art data exchange technology to provide a format that can be both validated and extended, providing data quality assurance and adaptability to the future that are lacking in other formats. See [Vos et al 2012](http://doi.org/10.1093/sysbio/sys025 "NeXML: Rich, Extensible, and Verifiable Representation of Comparative Data and Metadata.") for further details on the NeXML format.

How to cite
-----------

RNeXML has been published in the following article:

> Boettiger C, Chamberlain S, Vos R and Lapp H (2016). “RNeXML: A Package for Reading and Writing Richly Annotated Phylogenetic, Character, and Trait Data in R.” *Methods in Ecology and Evolution*, **7**, pp.
352-357. [doi:10.1111/2041-210X.12469](http://doi.org/10.1111/2041-210X.12469)

Although the published version of the article is paywalled, the source of the manuscript, and a much better rendered PDF, are included in this package (in the `manuscripts` folder). You can also find it [freely available on arXiv](http://arxiv.org/abs/1506.02722).

Getting Started
---------------

The latest stable release of RNeXML is on CRAN, and can be installed with the usual `install.packages("RNeXML")` command. Some of the more specialized functionality described in the Vignettes (such as RDF manipulation) requires additional packages which can be installed using:

``` r
install.packages("RNeXML", dependencies=TRUE, repos=c("https://cran.rstudio.com", "http://packages.ropensci.org"))
```

which will also install the development version of the RNeXML package. For most common tasks such as those shown here, the additional packages are not required.

The development version of RNeXML is also [available on Github](https://github.com/ropensci/RNeXML). With the `devtools` package installed on your system, RNeXML can be installed using:

``` r
library(devtools)
install_github("ropensci/RNeXML")
library(RNeXML)
```

Read a `nexml` file into the `ape::phylo` format:

``` r
f <- system.file("examples", "comp_analysis.xml", package="RNeXML")
nexml <- nexml_read(f)
tr <- get_trees(nexml) # or: as(nexml, "phylo")
plot(tr)
```

![](README-unnamed-chunk-5-1.png)

Write an `ape::phylo` tree into the `nexml` format:

``` r
data(bird.orders)
nexml_write(bird.orders, "test.xml")
#> [1] "test.xml"
```

A key feature of NeXML is the ability to formally validate the construction of the data file against the standard (the lack of such a feature in nexus files has led to inconsistencies across different software platforms, and to some files that cannot be read at all).
While it is difficult to make an invalid NeXML file from `RNeXML`, it never hurts to validate just to be sure:

``` r
nexml_validate("test.xml")
#> [1] TRUE
```

Extract metadata from the NeXML file:

``` r
birds <- nexml_read("test.xml")
get_taxa(birds)
#>     otu            label about xsi.type otus
#> 1  ou70 Struthioniformes #ou70       NA  os4
#> 2  ou71     Tinamiformes #ou71       NA  os4
#> 3  ou72      Craciformes #ou72       NA  os4
#> 4  ou73      Galliformes #ou73       NA  os4
#> 5  ou74     Anseriformes #ou74       NA  os4
#> 6  ou75    Turniciformes #ou75       NA  os4
#> 7  ou76       Piciformes #ou76       NA  os4
#> 8  ou77    Galbuliformes #ou77       NA  os4
#> 9  ou78   Bucerotiformes #ou78       NA  os4
#> 10 ou79      Upupiformes #ou79       NA  os4
#> 11 ou80    Trogoniformes #ou80       NA  os4
#> 12 ou81    Coraciiformes #ou81       NA  os4
#> 13 ou82      Coliiformes #ou82       NA  os4
#> 14 ou83     Cuculiformes #ou83       NA  os4
#> 15 ou84   Psittaciformes #ou84       NA  os4
#> 16 ou85      Apodiformes #ou85       NA  os4
#> 17 ou86   Trochiliformes #ou86       NA  os4
#> 18 ou87  Musophagiformes #ou87       NA  os4
#> 19 ou88     Strigiformes #ou88       NA  os4
#> 20 ou89    Columbiformes #ou89       NA  os4
#> 21 ou90       Gruiformes #ou90       NA  os4
#> 22 ou91    Ciconiiformes #ou91       NA  os4
#> 23 ou92    Passeriformes #ou92       NA  os4
get_metadata(birds)
#>   LiteralMeta                      property   datatype content
#> 1         m15                    dc:creator xsd:string
#> 2
#> 3         m17 dcterms:bibliographicCitation xsd:string
#>       xsi.type ResourceMeta        rel
#> 1  LiteralMeta
#> 2 ResourceMeta          m16 cc:license
#> 3  LiteralMeta
#>                                                href
#> 1
#> 2 http://creativecommons.org/publicdomain/zero/1.0/
#> 3
```

------------------------------------------------------------------------

Add basic additional metadata:

``` r
nexml_write(bird.orders, file="meta_example.xml",
            title = "My test title",
            description = "A description of my test",
            creator = "Carl Boettiger ",
            publisher = "unpublished data",
            pubdate = "2012-04-01")
#> [1] "meta_example.xml"
```

By default, `RNeXML` adds certain metadata, including the NCBI taxon id numbers for all named taxa.
This acts as a check on the spelling and definitions of the taxa, and also provides a link to additional metadata about each taxonomic unit described in the dataset.

### Advanced annotation

We can also add arbitrary metadata to a NeXML tree by defining `meta` objects:

``` r
modified <- meta(property = "prism:modificationDate", content = "2013-10-04")
```

Advanced use requires specifying the namespace used. Metadata follows the RDFa conventions. Here we indicate the modification date using the prism vocabulary. This namespace is included by default, as it is used for some of the basic metadata shown in the previous example. We can confirm this from the list of default namespaces:

``` r
RNeXML:::nexml_namespaces
#> nex
#> "http://www.nexml.org/2009"
#> xsi
#> "http://www.w3.org/2001/XMLSchema-instance"
#> xml
#> "http://www.w3.org/XML/1998/namespace"
#> cdao
#> "http://purl.obolibrary.org/obo/cdao.owl"
#> xsd
#> "http://www.w3.org/2001/XMLSchema#"
#> dc
#> "http://purl.org/dc/elements/1.1/"
#> dcterms
#> "http://purl.org/dc/terms/"
#> ter
#> "http://purl.org/dc/terms/"
#> prism
#> "http://prismstandard.org/namespaces/1.2/basic/"
#> cc
#> "http://creativecommons.org/ns#"
#> ncbi
#> "http://www.ncbi.nlm.nih.gov/taxonomy#"
#> tc
#> "http://rs.tdwg.org/ontology/voc/TaxonConcept#"
```

This next block defines a resource (link), described by the `rel` attribute as a homepage, a term in the `foaf` vocabulary. Because `foaf` is not a default namespace, we will have to provide its URL in the full definition below.

``` r
website <- meta(href = "http://carlboettiger.info", rel = "foaf:homepage")
```

Here we create a history node using the `skos` namespace. We can also add id values to any metadata element to make the element easier to reference externally:

``` r
history <- meta(property = "skos:historyNote",
                content = "Mapped from the bird.orders data in the ape package using RNeXML",
                id = "meta123")
```

For this kind of richer annotation, it is best to build up our NeXML object sequentially.
First we will add the `bird.orders` phylogeny to a new phylogenetic object, and then we will add the metadata elements created above to this object. Finally, we will write the object out as an XML file:

``` r
birds <- add_trees(bird.orders)
birds <- add_meta(meta = list(history, modified, website),
                  namespaces = c(skos = "http://www.w3.org/2004/02/skos/core#",
                                 foaf = "http://xmlns.com/foaf/0.1/"),
                  nexml=birds)
nexml_write(birds, file = "example.xml")
#> [1] "example.xml"
```

### Taxonomic identifiers

Add taxonomic identifier metadata to the OTU elements:

``` r
nex <- add_trees(bird.orders)
nex <- taxize_nexml(nex)
```

Working with character data
---------------------------

NeXML also provides a standard exchange format for handling character data. The R platform is particularly popular in the context of phylogenetic comparative methods, which consider both a given phylogeny and a set of traits. NeXML provides an ideal tool for handling these data.

### Extracting character data

We can load the library, parse the NeXML file and extract both the characters and the phylogeny.

``` r
library(RNeXML)
nexml <- read.nexml(system.file("examples", "comp_analysis.xml", package="RNeXML"))
traits <- get_characters(nexml)
tree <- get_trees(nexml)
```

(Note that `get_characters` returns both discrete and continuous characters together in the same data.frame; use `get_characters_list` instead to get separate data.frames for the continuous `characters` block and the discrete `characters` block.)
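
The difference between the two accessors mentioned in the note can be checked directly. The following is a minimal sketch, assuming the `comp_analysis.xml` example file shipped with the package and assuming that `get_characters_list` returns a list with one data.frame per `characters` block; the variable names `all_traits` and `trait_blocks` are illustrative only.

``` r
library(RNeXML)

# Parse the example file bundled with the package
f <- system.file("examples", "comp_analysis.xml", package = "RNeXML")
nex <- read.nexml(f)

# All characters combined into a single data.frame
all_traits <- get_characters(nex)

# One data.frame per characters block, returned as a list
trait_blocks <- get_characters_list(nex)

# Inspect the results without assuming a particular block order
str(all_traits)
lapply(trait_blocks, names)
```

The combined data.frame is convenient when a downstream function takes one trait column at a time, while the list form keeps continuous and discrete blocks apart.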
We can then fire up `geiger` and fit, say, a Brownian motion model to the continuous data and a Markov transition matrix to the discrete states:

``` r
library(geiger)
fitContinuous(tree, traits[1], ncores=1)
#> GEIGER-fitted comparative model of continuous data
#> fitted 'BM' model parameters:
#> sigsq = 1.166011
#> z0 = 0.255591
#>
#> model summary:
#> log-likelihood = -20.501183
#> AIC = 45.002367
#> AICc = 46.716652
#> free parameters = 2
#>
#> Convergence diagnostics:
#> optimization iterations = 100
#> failed iterations = 0
#> frequency of best fit = 1.00
#>
#> object summary:
#> 'lik' -- likelihood function
#> 'bnd' -- bounds for likelihood search
#> 'res' -- optimization iteration summary
#> 'opt' -- maximum likelihood parameter estimates
fitDiscrete(tree, traits[2], ncores=1)
#> GEIGER-fitted comparative model of discrete data
#> fitted Q matrix:
#>             0           1
#> 0 -0.07308302  0.07308302
#> 1  0.07308302 -0.07308302
#>
#> model summary:
#> log-likelihood = -4.574133
#> AIC = 11.148266
#> AICc = 11.648266
#> free parameters = 1
#>
#> Convergence diagnostics:
#> optimization iterations = 100
#> failed iterations = 0
#> frequency of best fit = 1.00
#>
#> object summary:
#> 'lik' -- likelihood function
#> 'bnd' -- bounds for likelihood search
#> 'res' -- optimization iteration summary
#> 'opt' -- maximum likelihood parameter estimates
```

------------------------------------------------------------------------

[![ropensci footer](http://ropensci.org/public_images/github_footer.png)](http://ropensci.org)
RNeXML/MD50000644000176200001440000002052512734566742011653 0ustar liggesusersebe99292b89cc34140a39ad712681bf7 *DESCRIPTION 15d87f8d4ecfd954c8dd3fc2b69a0cbb *LICENSE 1d93149a8c506b01083905ff4c1ee3ea *NAMESPACE 04866c936719c90d023ccaa71d9246fe *NEWS 3f9769a15da159d125da7a8c314401b3 *R/add_basic_meta.R 40bece1f389a5a124ef19d214762e316 *R/add_characters.R 8a3742fe5e603f306cef2a96ae1e9b8e *R/add_meta.R 5cd943276adbf9f17f5230085652e9e9 *R/add_namespaces.R
12af339498a96925e3dcd0e74e80a046 *R/add_trees.R 4b8556f774eb9ee1d38df6cd6c742532 *R/character_classes.R 4888005a99dbdfe559fe99568a938c1f *R/classes.R eeb4a688d49a71b887d4720a1730174d *R/concatenate_nexml.R 7bd3c9c98ffe79c342f9c367137050f5 *R/deprecated.R bc53e2e4026dbbdeba595dcba9feaefd *R/get_basic_metadata.R 28300d92ffe9dad2934a153dbb3fcbd7 *R/get_characters.R e51cbb9c28bd6bf98ecb76dd8cf2a5b8 *R/get_level.R ae3df68ccfb01f0ceaae7c92e76a8052 *R/get_metadata.R fe892d1d2be5fff56f37b24e89515f87 *R/get_namespaces.R f561b1ded0a06a5d225d77a0aa39241b *R/get_rdf.R c4aa799bc0ba7b7a23bd7a0a32409209 *R/get_taxa.R 3a890d1b39f18a08b0f4c7d94fe0d57b *R/get_taxa_meta.R 323a810bfdadcb830fd013011d4c40d4 *R/get_trees.R f8c0a9c2cec4bcbde80170ee767eba13 *R/internal_get_node_maps.R 2c2f1c91251d019f03d99e61c62c42cd *R/internal_isEmpty.R 3372a2273979f5336464963d70fb710c *R/internal_name_by_id.R 88264b69fef8bccbdcb1619f7d8c80cf *R/internal_nexml_id.R 71a4b1b132719c58e5c5d7b4d8bc919e *R/meta.R a0eb07febda79bb9019c22cc1de272bf *R/nexmlTree.R 8e1b70f3620578164ee69c83abaae05d *R/nexml_add.R eaf388a4f6baa30b25470711aae80073 *R/nexml_get.R 543c0593bb8c0bf854f7637b0a578111 *R/nexml_methods.R a3ed5cf4bcd243452e87c95e78a84284 *R/nexml_publish.R e55ca62202b4a3ea46b584fc3e8f2a6b *R/nexml_read.R de91697697c1d05dcac0b3df54344ad8 *R/nexml_validate.R f7f02df61a764d23efcabc61fbc2044c *R/nexml_write.R f09917e2c28d50018d6e9b94c920fed7 *R/simmap.R 7b7d400236ab6cfa6fe179224f9fa20c *R/taxize_nexml.R 78ad6d9378c8d09f85d70b4f62b176b6 *R/tbl_df.R 10f560b71f8e0371f94c5fe01be697e4 *README.md 2b0b8bd66c24f83cfee581819fbc1c0f *build/vignette.rds b6698b64d799f3e60f545c4998233741 *data/simmap_ex.rda b2e0de0418b8c0fa9efed1fa5475a534 *demo/00Index 5bb8b588882878ecd86680b26e79a5e9 *demo/sparql.R 84b45265f4efcfb698335957ec44d7ba *inst/CITATION a7ba86ba5a8e6ed654b2faf068d665dc *inst/doc/S4.R cf3f47f24e945e8258e5df3d447f509c *inst/doc/S4.Rmd 65d1e01a870615fe2aaee99f462a9a65 *inst/doc/S4.html 12a0995b3f1129d5ba6685beed5d164c 
*inst/doc/metadata.R 93e452d7bc5e213191d7e22781edda33 *inst/doc/metadata.Rmd 9b47e98360d4b50a6a3ade83a4a3506a *inst/doc/metadata.html 90daf8f0bf53cfe7cbfd4f2351319e2c *inst/doc/simmap.R 87f05f59fb504698ea3fd69c09cd2b1d *inst/doc/simmap.Rmd fda10aa33741510b2d70ed2004f4b57d *inst/doc/simmap.html a08f6e54ee3f739aef13b300fc33bf37 *inst/doc/sparql.R 64bbe5a0174518618dc6dd1eef9f44f9 *inst/doc/sparql.Rmd 645b3f8577e78a2d5265d37d8d7bcd7a *inst/doc/sparql.html 2de44d2898c22479b8a449767f1ab627 *inst/examples/RDFa2RDFXML.xsl 64a60bc7b825e88ba931e1da5fd08e28 *inst/examples/biophylo.xml 41cf6116441773694f8f32b88900d346 *inst/examples/characters.xml 7cfba1e69724dd93fe4a07525e419ed4 *inst/examples/comp_analysis.xml 835c4ec3a29a2a08be861f44d445fab5 *inst/examples/gardiner_1984.xml d3a3d2519dcc6fe7b5399061db69f1d9 *inst/examples/geospiza.xml 84cf7d2d949b6a36df18bec31d4c8917 *inst/examples/mbank_X962_11-22-2013_1534.nex dcd3fe12a4ec3d330d89f34ed10dbaf5 *inst/examples/merge_data.Rmd 3b7704edc024518cb054fd73f1f2f9b6 *inst/examples/merge_data.md 727f60b2ff7bdae76ffc09452937107f *inst/examples/meta_example.xml 12d5ea02edb324a2efc824b4cdec8ed8 *inst/examples/meta_taxa.xml 34b804a9a1187689fe223be6498dcd26 *inst/examples/missing_some_branchlengths.xml 2fc3d9aca072a9964a95de8a8d7e6d53 *inst/examples/multitrees.xml e34f32b1c48ae0acd8bfab2f09971945 *inst/examples/ncbii.xml fb1edb923697675d5282d9d181fcfcd3 *inst/examples/ontotrace-result.xml 200927eaf3dc24fd97bd607277693992 *inst/examples/phenoscape.xml cb142aadc75b48b696ccf6d42637ff72 *inst/examples/primates.xml 1f52dfb1306c9ed696d14f805e95a498 *inst/examples/primates_from_R.xml 065eae0c3189850bcff9d838b5041f90 *inst/examples/primates_meta.xml a64c177eae6be033dd8712e6d693b2ea *inst/examples/primates_meta_xslt.xml 8a8e3e961583a94cec32852e9b4d36e8 *inst/examples/simmap.nex 327200ae849ba639f267e42d04932aa0 *inst/examples/simmap.xml d1740a7345534477c40477f0fc776ad7 *inst/examples/some_missing_branchlengths.xml e49283485e7dec7c98f77c54cf6406bd 
*inst/examples/sparql.newick f4744af232f0e5087b78f0002cc38e1d *inst/examples/taxa.xml 9e0f346fbf1ef7dcd87a3406b98e7aeb *inst/examples/treebase-record.xml 74361320b5d4bc935ea9d09a35463708 *inst/examples/trees.xml 797825bdfed94f7c25aff9b96cbe7c09 *inst/simmap.md 6616a1400f25bd536f5958135f5331a7 *man/add_basic_meta.Rd 9213fa78891cc24777b0d6d49fed0f87 *man/add_characters.Rd a9185633f5d18322b55326130885bb8f *man/add_meta.Rd f371974ff7018c18e597538799f1944b *man/add_namespaces.Rd 0d2a95d73d96b36d6050ad4d14ab2239 *man/add_trees.Rd 61286f843046e80e9e70c24584bb4ca3 *man/c-ListOfmeta-method.Rd b751f355bc634e3311dc4338a8580038 *man/c-meta-method.Rd db4566c1aba031ac575679ceebf492c8 *man/c-nexml-method.Rd e47a01a239ab2714836d887ee74bc72f *man/flatten_multiphylo.Rd 6fa1f00c6f4aabb836afd36b5d8a7c36 *man/get_characters.Rd fbb70fa0eaf8597465c8dcfc4720d970 *man/get_characters_list.Rd 659b91259dd9a40316097e6817938495 *man/get_citation.Rd 69bb6034d4b6df8665ec3a30e78a9212 *man/get_flat_trees.Rd bb807456a44e2513696ae558276ad447 *man/get_level.Rd 6c2eee2b7165012bc454c227c0fa4d79 *man/get_license.Rd 203abcfb15fe3a50e885c20dde4d4839 *man/get_metadata.Rd c46a4a683d6456763b66f598483fe4f9 *man/get_namespaces.Rd 888b720a376d8574a609b976f05cb96f *man/get_rdf.Rd c4551423633e161dd7eb52febba2e9bb *man/get_taxa.Rd 017da4e56e201af8cd0ed7f4bddad95b *man/get_taxa_list.Rd 85eb6f6931f265dbaa3d70acf30669b3 *man/get_taxa_meta.Rd 49d76200b87f52a8702f07d591b4fc46 *man/get_taxa_meta_list.Rd 63db8648ca7565cf3a95f4ca03bd6c69 *man/get_trees.Rd f8631c38b0a17cefd68d6dc837333d5d *man/get_trees_list.Rd 14bade08261118997384f8b19ef9cfe1 *man/meta.Rd 16213b88dcaeaead93b4ca14e0f1155d *man/nexml_add.Rd 86e03649975e3e304a4549d1b52ab0da *man/nexml_figshare.Rd effecba3da7f455df1e03e91650ab30c *man/nexml_get.Rd 959776702ff1e5b6e68332679e146f2e *man/nexml_publish.Rd 621317a0b5a9f361cd3456ae4531b1e8 *man/nexml_read.Rd 9dcccbaae20ba7d6bae820f92a55bdd6 *man/nexml_to_simmap.Rd e3f70ff0fcf24c0fa9e039d013f14f3c 
*man/nexml_validate.Rd b572d80b8214325580b882615f2c69f5 *man/nexml_write.Rd 9013924bda3ef2fa50a2ab2eaefcee5c *man/reset_id_counter.Rd 3e0069cf908c01f5bf54456727bc9872 *man/simmap_ex.Rd f3d53e87bdf4d6492be6ce6a817acd34 *man/simmap_to_nexml.Rd ec58c682829b41bd6f719806aaf5fe2b *man/taxize_nexml.Rd 5b8bcfa789a527efd486e95d195af6e4 *man/toPhylo.Rd f9774e3679709c9036691527556fe97e *tests/test-all.R c64d6a3a0af9b631471c129a0957e626 *tests/testthat/conversions.R 2a83a0474b87536e4e1f9f4441c8fa60 *tests/testthat/geiger_test.R 570b83e755bcffaa47c2b716b7356a12 *tests/testthat/helper-RNeXML.R 4a961531db2de3c792e8c0a2999e8d22 *tests/testthat/test_ape.R 2e4db69073aed7c11e38e62668adc1bc *tests/testthat/test_characters.R 187111ca4a1a30f4ce667b29af1f545c *tests/testthat/test_comp_analysis.R 18f77acc3657c0c15a30b56439b10506 *tests/testthat/test_concatenate.R 82c84691dccf5ec3ee12825e167ee53f *tests/testthat/test_get_characters.R 74cd1774ba8b63c324af7e1d2f1fb7cb *tests/testthat/test_get_level.R 84b7067fbe52a42ea40cfd500990d22e *tests/testthat/test_global_ids.R 15bdc14e960842894d04719caab12223 *tests/testthat/test_inheritance.R b3bcc9642dd9449837ec050d64ecf2dc *tests/testthat/test_meta.R 7536014afe9c3d74b23b7e77c49ecc81 *tests/testthat/test_meta_extract.R 85ffdcb97727d6337b897b1f7459f9b2 *tests/testthat/test_nexml_read.R 7aec4558c1d4b92f86e8bfda16d23a2a *tests/testthat/test_parsing.R 16a017f64abb2539015e3dd6d37a57b6 *tests/testthat/test_publish.R e8830f5726f1450ff8c43f435e57857e *tests/testthat/test_rdf.R cebc49fccf74e658b171651b18f06f9a *tests/testthat/test_serializing.R c9ed51458274d31a7e799c73e3eabde7 *tests/testthat/test_simmap.R e8700d675fcb9599051369e1c74d824c *tests/testthat/test_taxonomy.R f04b0dc0ea1ddf2637106f69183e3db9 *tests/testthat/test_toplevel_api.R 4c6ba000905bdd7fca8a212b67159817 *tests/testthat/test_validate.R 24a4d4a46c84bf205bdd7407791f15f9 *tests/testthat/treebase_test.R cf3f47f24e945e8258e5df3d447f509c *vignettes/S4.Rmd 6e669e1070f9fed34aa9645a2768a2a4 
*vignettes/metadata.Rmd 70e3f1824013e209654f96e819ceba0a *vignettes/references.bib 87f05f59fb504698ea3fd69c09cd2b1d *vignettes/simmap.Rmd 5f8e0dfbf93b58d0f35d1005abc4518c *vignettes/sparql.Rmd RNeXML/build/0000755000176200001440000000000012734542424012425 5ustar liggesusersRNeXML/build/vignette.rds0000644000176200001440000000046712734542424014773 0ustar liggesusers RNeXML/DESCRIPTION0000644000176200001440000000437312734566742013054 0ustar liggesusersPackage: RNeXML Type: Package Title: Semantically Rich I/O for the 'NeXML' Format Version: 2.0.7 Authors@R: c(person("Carl", "Boettiger", role = c("cre", "aut"), email="cboettig@gmail.com"), person("Scott", "Chamberlain", role = "aut"), person("Hilmar", "Lapp", role = "aut"), person("Kseniia", "Shumelchyk", role = "aut"), person("Rutger", "Vos", role = "aut")) Description: Provides access to phyloinformatic data in 'NeXML' format. The package should add new functionality to R such as the possibility to manipulate 'NeXML' objects in more various and refined way and compatibility with 'ape' objects.
URL: https://github.com/ropensci/RNeXML BugReports: https://github.com/ropensci/RNeXML/issues License: BSD_3_clause + file LICENSE Additional_repositories: http://packages.ropensci.org VignetteBuilder: knitr Suggests: rrdf (>= 2.0.2), geiger (>= 2.0), phytools (>= 0.3.93), knitr (>= 1.5), rfigshare (>= 0.3.0), knitcitations (>= 1.0.1), testthat (>= 0.10.0), phylobase (>= 0.6.8), rmarkdown (>= 0.3.3), Sxslt (>= 0.91) Depends: R (>= 3.0.0), ape (>= 3.1), methods (>= 3.0.0) Imports: XML (>= 3.95), plyr (>= 1.8), taxize (>= 0.2.2), reshape2 (>= 1.2.2), httr (>= 0.3), uuid (>= 0.1-1), dplyr (>= 0.4.0), lazyeval (>= 0.1.0), tidyr (>= 0.3.1), stringr (>= 1.0) Collate: 'classes.R' 'add_basic_meta.R' 'add_characters.R' 'add_meta.R' 'add_namespaces.R' 'add_trees.R' 'character_classes.R' 'concatenate_nexml.R' 'deprecated.R' 'get_basic_metadata.R' 'get_characters.R' 'get_level.R' 'get_metadata.R' 'get_namespaces.R' 'get_rdf.R' 'get_taxa.R' 'get_taxa_meta.R' 'get_trees.R' 'internal_get_node_maps.R' 'internal_isEmpty.R' 'internal_name_by_id.R' 'internal_nexml_id.R' 'meta.R' 'nexmlTree.R' 'nexml_add.R' 'nexml_get.R' 'nexml_methods.R' 'nexml_publish.R' 'nexml_read.R' 'nexml_validate.R' 'nexml_write.R' 'simmap.R' 'taxize_nexml.R' 'tbl_df.R' RoxygenNote: 5.0.1 NeedsCompilation: no Packaged: 2016-06-28 18:42:28 UTC; root Author: Carl Boettiger [cre, aut], Scott Chamberlain [aut], Hilmar Lapp [aut], Kseniia Shumelchyk [aut], Rutger Vos [aut] Maintainer: Carl Boettiger Repository: CRAN Date/Publication: 2016-06-28 23:36:34 RNeXML/man/0000755000176200001440000000000012731606043012074 5ustar liggesusersRNeXML/man/flatten_multiphylo.Rd0000644000176200001440000000137212641021656016312 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/get_trees.R \name{flatten_multiphylo} \alias{flatten_multiphylo} \title{Flatten a multiphylo object} \usage{ flatten_multiphylo(object) } \arguments{ \item{object}{a list of multiphylo objects} } \description{ 
Flatten a multiphylo object
}
\details{
NeXML has the concept of multiple trees nodes, each with multiple child tree nodes.
This maps naturally to a list of multiphylo objects. Sometimes
this hierarchy conveys important structural information, so it is not discarded
by default. Occasionally it is useful to flatten the structure though, hence this
function. Note that this discards the original structure, and the nexml file must
be parsed again to recover it.
}
RNeXML/man/get_taxa_meta_list.Rd0000644000176200001440000000144512641021656016225 0ustar liggesusers
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/get_taxa_meta.R
\name{get_taxa_meta_list}
\alias{get_taxa_meta_list}
\title{get_taxa_meta_list}
\usage{
get_taxa_meta_list(nexml, what = "href")
}
\arguments{
\item{nexml}{a nexml object}

\item{what}{One of href, rel, id, or xsi:type}
}
\value{
the list of metadata for each taxon
}
\description{
Retrieve metadata of all species/otus (operational taxonomic units) included in the nexml
}
\examples{
\dontrun{
data(bird.orders)
birds <- add_trees(bird.orders)
birds <- taxize_nexml(birds, "NCBI")
RNeXML:::get_taxa_meta_list(birds)
RNeXML:::get_taxa_meta_list(birds, 'rel')
RNeXML:::get_taxa_meta_list(birds, 'id')
RNeXML:::get_taxa_meta_list(birds, 'xsi:type')
}
}
\seealso{
\code{\link{get_item}}
}
\keyword{internal}
RNeXML/man/meta.Rd0000644000176200001440000000302312641021656013310 0ustar liggesusers
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/meta.R
\name{meta}
\alias{meta}
\title{Constructor function for metadata nodes}
\usage{
meta(property = character(0), content = character(0), rel = character(0),
  href = character(0), datatype = character(0), id = character(0),
  type = character(0), children = list())
}
\arguments{
\item{property}{specify the ontological definition together with its namespace, e.g.
dc:title}

\item{content}{content of the metadata field}

\item{rel}{Ontological definition of the reference provided in href}

\item{href}{A link to some reference}

\item{datatype}{optional RDFa field}

\item{id}{optional id element (otherwise id will be automatically generated).}

\item{type}{optional xsi:type. If not given, will use either "LiteralMeta" or "ResourceMeta" as
determined by the presence of either a property or a href value.}

\item{children}{Optional element containing any valid XML block (XMLInternalElementNode class, see the XML package for details).}
}
\description{
Constructor function for metadata nodes
}
\details{
User must either provide property+content or rel+href. Mixing these will result in potential garbage.
The datatype attribute will be detected automatically from the class of the content argument. Maps from R class
to schema datatypes are as follows:
character - xs:string,
Date - xs:date,
integer - xs:integer,
numeric - xs:decimal,
logical - xs:boolean
}
\examples{
meta(content="example", property="dc:title")
}
\seealso{
\code{\link{nexml_write}}
}
RNeXML/man/get_taxa_meta.Rd0000644000176200001440000000137612641021656015175 0ustar liggesusers
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/get_taxa_meta.R
\name{get_taxa_meta}
\alias{get_taxa_meta}
\title{get_taxa_meta}
\usage{
get_taxa_meta(nexml, what = "href")
}
\arguments{
\item{nexml}{a nexml object}

\item{what}{One of href, rel, id, or xsi:type}
}
\value{
the list of metadata for each taxon
}
\description{
Retrieve metadata of all species/otus (operational taxonomic units) included in the nexml
}
\examples{
\dontrun{
data(bird.orders)
birds <- add_trees(bird.orders)
birds <- taxize_nexml(birds, "NCBI")
RNeXML:::get_taxa_meta(birds)
RNeXML:::get_taxa_meta(birds, 'rel')
RNeXML:::get_taxa_meta(birds, 'id')
RNeXML:::get_taxa_meta(birds, 'xsi:type')
}
}
\seealso{
\code{\link{get_item}}
}
\keyword{internal}
RNeXML/man/add_characters.Rd0000644000176200001440000000212712641021656015315 0ustar liggesusers
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/add_characters.R
\name{add_characters}
\alias{add_characters}
\title{Add character data to a nexml object}
\usage{
add_characters(x, nexml = new("nexml"), append_to_existing_otus = FALSE)
}
\arguments{
\item{x}{character data, in which character trait labels are column names
and taxon labels are row names. x can be in matrix or data.frame
format.}

\item{nexml}{a nexml object, if appending character table to an existing
nexml object. If omitted, a new nexml object will be initiated.}

\item{append_to_existing_otus}{logical. If TRUE, will add any new taxa
(taxa not matching any existing otus block) to the existing (first)
otus block. Otherwise (default), a new otus block is created, even
though it may contain duplicate taxa to those already present. While
FALSE is the safe option, TRUE may be appropriate when building nexml
files from scratch with both characters and trees.}
}
\description{
Add character data to a nexml object
}
\examples{
library("geiger")
data(geospiza)
geiger_nex <- add_characters(geospiza$dat)
}
RNeXML/man/c-nexml-method.Rd0000644000176200001440000000157712641021656015217 0ustar liggesusers
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/concatenate_nexml.R
\docType{methods}
\name{c,nexml-method}
\alias{c,nexml-method}
\title{Concatenate nexml files}
\usage{
\S4method{c}{nexml}(x, ..., recursive = FALSE)
}
\arguments{
\item{x, ...}{nexml objects to be concatenated, e.g. from
\code{\link{write.nexml}} or \code{\link{read.nexml}}.
Must have unique ids on all elements}

\item{recursive}{logical. If 'recursive = TRUE', the function
recursively descends through lists (and pairlists) combining all their
elements into a vector.
(Not implemented).}
}
\value{
a concatenated nexml file
}
\description{
Concatenate nexml files
}
\examples{
\dontrun{
f1 <- system.file("examples", "trees.xml", package="RNeXML")
f2 <- system.file("examples", "comp_analysis.xml", package="RNeXML")
nex1 <- read.nexml(f1)
nex2 <- read.nexml(f2)
nex <- c(nex1, nex2)
}
}
RNeXML/man/reset_id_counter.Rd0000644000176200001440000000035712641021656015726 0ustar liggesusers
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/internal_nexml_id.R
\name{reset_id_counter}
\alias{reset_id_counter}
\title{reset id counter}
\usage{
reset_id_counter()
}
\description{
reset the id counter
}
RNeXML/man/simmap_ex.Rd0000644000176200001440000000116512641021656014351 0ustar liggesusers
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/simmap.R
\docType{data}
\name{simmap_ex}
\alias{simmap_ex}
\title{A nexml class R object that includes simmap annotations}
\format{a \code{nexml} instance}
\source{
Simulated tree and stochastic character mapping based on
Revell 2011 (doi:10.1111/j.2041-210X.2011.00169.x)
}
\usage{
simmap_ex
}
\description{
A nexml object with simmap stochastic character mapping
annotations added to the edges, for use with the RNeXML package
parsing and serializing NeXML into formats that work
with the ape and phytools packages.
}
\author{
Carl Boettiger
}
RNeXML/man/nexml_read.Rd0000644000176200001440000000226512665743025014516 0ustar liggesusers
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/nexml_read.R
\name{nexml_read}
\alias{nexml_read}
\alias{nexml_read.XMLInternalDocument}
\alias{nexml_read.XMLInternalNode}
\alias{nexml_read.character}
\alias{read.nexml}
\title{Read NeXML files into various R formats}
\usage{
nexml_read(x, ...)

\method{nexml_read}{character}(x, ...)

\method{nexml_read}{XMLInternalDocument}(x, ...)

\method{nexml_read}{XMLInternalNode}(x, ...)
}
\arguments{
\item{x}{Path to the file to be read in.
Or an \code{\link[XML]{XMLInternalDocument-class}}
or \code{\link[XML]{XMLInternalNode-class}}}

\item{...}{Further arguments passed on to \code{\link[XML]{xmlParse}}}
}
\description{
Read NeXML files into various R formats
}
\examples{
# file
f <- system.file("examples", "trees.xml", package="RNeXML")
nexml_read(f)
\dontrun{ # may take > 5 s
# url
url <- "https://raw.githubusercontent.com/ropensci/RNeXML/master/inst/examples/trees.xml"
nexml_read(url)
# character string of XML
str <- paste0(readLines(f), collapse = "")
nexml_read(str)
# XMLInternalDocument
library("httr")
library("XML")
x <- xmlParse(content(GET(url)))
nexml_read(x)
# XMLInternalNode
nexml_read(xmlRoot(x))
}
}
RNeXML/man/get_characters.Rd0000644000176200001440000000270212641021656015343 0ustar liggesusers
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/get_characters.R
\name{get_characters}
\alias{get_characters}
\title{Get character data.frame from nexml}
\usage{
get_characters(nex, rownames_as_col = FALSE, otu_id = FALSE,
  otus_id = FALSE)
}
\arguments{
\item{nex}{a nexml object}

\item{rownames_as_col}{option to return character matrix rownames
(with taxon ids) as their own column in the data.frame. Default is FALSE
for compatibility with geiger and similar packages.}

\item{otu_id}{logical, default FALSE. return a column with the
otu id (for joining with otu metadata, etc)}

\item{otus_id}{logical, default FALSE. return a column with the
otus block id (for joining with otu metadata, etc)}
}
\value{
the character matrix as a data.frame
}
\description{
Get character data.frame from nexml
}
\details{
RNeXML will attempt to return the matrix using the NeXML taxon (otu) labels to name the rows
and the NeXML char labels to name the traits (columns). If these are unavailable or not unique, the NeXML
id values for the otus or traits will be used instead.
} \examples{ \dontrun{ # A simple example with a discrete and a continuous trait f <- system.file("examples", "comp_analysis.xml", package="RNeXML") nex <- read.nexml(f) get_characters(nex) # A more complex example -- currently ignores sequence-type characters f <- system.file("examples", "characters.xml", package="RNeXML") nex <- read.nexml(f) get_characters(nex) } } RNeXML/man/nexml_validate.Rd0000644000176200001440000000152312731606043015360 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/nexml_validate.R \name{nexml_validate} \alias{nexml_validate} \title{validate nexml using the online validator tool} \usage{ nexml_validate(file, schema = CANONICAL_SCHEMA) } \arguments{ \item{file}{path to the nexml file to validate} \item{schema}{URL of schema (for fallback method only, set by default).} } \value{ TRUE if the file is valid, FALSE or error message otherwise } \description{ validate nexml using the online validator tool } \details{ Requires an internet connection. See http://www.nexml.org/nexml/phylows/validator for more information on debugging invalid files } \examples{ \dontrun{ data(bird.orders) birds <- nexml_write(bird.orders, "birds_orders.xml") nexml_validate("birds_orders.xml") unlink("birds_orders.xml") # delete file to clean up } } RNeXML/man/taxize_nexml.Rd0000644000176200001440000000117712641021656015101 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/taxize_nexml.R \name{taxize_nexml} \alias{taxize_nexml} \title{taxize nexml} \usage{ taxize_nexml(nexml, type = c("NCBI"), ...) } \arguments{ \item{nexml}{a nexml object} \item{type}{the name of the identifier to use} \item{...}{additional arguments (not implemented yet)} } \description{ Check taxonomic names against the specified service and add appropriate semantic metadata to the nexml OTU unit containing the corresponding identifier.
} \examples{ \dontrun{ data(bird.orders) birds <- add_trees(bird.orders) birds <- taxize_nexml(birds, "NCBI") } } RNeXML/man/add_meta.Rd0000644000176200001440000000364012641021656014125 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/add_meta.R \name{add_meta} \alias{add_meta} \title{Add metadata to a nexml file} \usage{ add_meta(meta, nexml = new("nexml"), level = c("nexml", "otus", "trees", "characters"), namespaces = NULL, i = 1, at_id = NULL) } \arguments{ \item{meta}{a meta S4 object, e.g. output of the function \code{\link{meta}}, or a list of these meta objects} \item{nexml}{(S4) object} \item{level}{the level at which the metadata annotation should be added.} \item{namespaces}{named character string for any additional namespaces that should be defined.} \item{i}{for otus, trees, characters: if there are multiple such blocks, which one should be annotated? Default is first/only block.} \item{at_id}{the id of the element to be annotated. Optional, advanced use only.} } \value{ the updated nexml object } \description{ Add metadata to a nexml file } \examples{ ## Create a new nexml object with a single metadata element: modified <- meta(property = "prism:modificationDate", content = "2013-10-04") nex <- add_meta(modified) # Note: 'prism' is defined in nexml_namespaces by default.
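## A quick check, sketched with get_metadata() and get_namespaces() as documented
## elsewhere in this package (output not shown, since it depends on the package version):
## inspect the top-level metadata just added and look up the 'prism' prefix.
get_metadata(nex)
get_namespaces(nex)["prism"]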
## Write multiple metadata elements, including a new namespace: website <- meta(href = "http://carlboettiger.info", rel = "foaf:homepage") # meta can be link-style metadata nex <- add_meta(list(modified, website), namespaces = c(foaf = "http://xmlns.com/foaf/0.1/")) ## Append more metadata, and specify a level: history <- meta(property = "skos:historyNote", content = "Mapped from the bird.orders data in the ape package using RNeXML") nex <- add_meta(history, nexml = nex, level = "trees", namespaces = c(skos = "http://www.w3.org/2004/02/skos/core#")) } \seealso{ \code{\link{meta}} \code{\link{add_trees}} \code{\link{add_characters}} \code{\link{add_basic_meta}} } RNeXML/man/get_taxa.Rd0000644000176200001440000000102712641021656014160 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/get_taxa.R \name{get_taxa} \alias{get_otu} \alias{get_taxa} \title{get_taxa} \usage{ get_taxa(nexml) } \arguments{ \item{nexml}{a nexml object} } \value{ the list of taxa } \description{ Retrieve names of all species/otus (operational taxonomic units) included in the nexml } \examples{ comp_analysis <- system.file("examples", "comp_analysis.xml", package="RNeXML") nex <- nexml_read(comp_analysis) get_taxa(nex) } \seealso{ \code{\link{get_item}} } RNeXML/man/get_citation.Rd0000644000176200001440000000044312641021656015036 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/get_basic_metadata.R \name{get_citation} \alias{get_citation} \title{get_citation} \usage{ get_citation(nexml) } \arguments{ \item{nexml}{a nexml object} } \value{ the list of taxa } \description{ get_citation } RNeXML/man/get_trees_list.Rd0000644000176200001440000000176312641021656015407 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/get_trees.R \name{get_trees_list} \alias{get_trees_list} \title{extract all phylogenetic trees in ape format} \usage{ get_trees_list(nexml) }
\arguments{ \item{nexml}{a representation of the nexml object from which the data is to be retrieved} } \value{ returns a list of lists of multiphylo trees, even if all trees are in the same `trees` node (and hence the outer list will be of length 1) or if there is only a single tree (and hence the inner list will also be of length 1). This guarantees a consistent return type regardless of the number of trees present in the nexml file, and also preserves any hierarchy/grouping of trees. } \description{ extract all phylogenetic trees in ape format } \examples{ comp_analysis <- system.file("examples", "comp_analysis.xml", package="RNeXML") nex <- nexml_read(comp_analysis) get_trees_list(nex) } \seealso{ \code{\link{get_trees}} \code{\link{get_flat_trees}} \code{\link{get_item}} } RNeXML/man/c-meta-method.Rd0000644000176200001440000000130512641021656015007 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/meta.R \docType{methods} \name{c,meta-method} \alias{c,meta-method} \title{Concatenate meta elements into a ListOfmeta} \usage{ \S4method{c}{meta}(x, ..., recursive = FALSE) } \arguments{ \item{x, ...}{meta elements to be concatenated, e.g. see \code{\link{meta}}} \item{recursive}{logical, if 'recursive=TRUE', the function descends through lists and combines their elements into a vector.} } \value{ a ListOfmeta object containing multiple meta elements.
} \description{ Concatenate meta elements into a ListOfmeta } \examples{ c(meta(content="example", property="dc:title"), meta(content="Carl", property="dc:creator")) } RNeXML/man/get_level.Rd0000644000176200001440000000357312641021656014342 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/get_level.R \name{get_level} \alias{get_level} \title{get_level} \usage{ get_level(nex, level) } \arguments{ \item{nex}{a nexml object} \item{level}{a character vector indicating the class of node, see details} } \value{ Returns the attributes of specified class of nodes as a data.frame } \description{ get a data.frame of attribute values of a given node } \details{ level should be a character vector giving the path to the specified node group. For instance, `otus`, `characters`, and `trees` are top-level blocks (i.e. child nodes of the root nexml block), and can be specified directly. To get metadata for all "char" elements from all characters blocks, you must specify that `char` nodes are child nodes of `characters` nodes: e.g. `get_level(nex, "characters/char")`, or similarly for states: `get_level(nex, "characters/states")`. The return object is a data frame whose columns are the attribute names of the elements specified. The column names match the attribute names except for the "id" attribute, for which the column is renamed using the name of the node itself. (Thus the id of an `otus` node would be rendered in a data.frame column called "otus" instead of "id".) Additional columns are added for each parent element in the path; e.g. get_level(nex, "otus/otu") would include a column named "otus" with the id of each otus block. Even though the method always returns the data frame for all matching nodes in all blocks, these ids let you see which otu values came from which otus block. This is identical to the function call `get_taxa()`.
Similarly, `get_level(nex, "otus/otu/meta")` would return additional columns 'otus' and also a column, 'otu', with the otu parent ids of each metadata block. (This is identical to a function call to `get_metadata`). This makes it easier to join data.frames as well, see examples } RNeXML/man/nexml_write.Rd0000644000176200001440000000444112665743025014733 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/nexml_write.R \name{nexml_write} \alias{nexml_write} \alias{write.nexml} \title{Write nexml files} \usage{ nexml_write(x = new("nexml"), file = NULL, trees = NULL, characters = NULL, meta = NULL, ...) } \arguments{ \item{x}{a nexml object, or any phylogeny object (e.g. phylo, phylo4) that can be coerced into one. Can also be omitted, in which case a new nexml object will be constructed with the additional parameters specified.} \item{file}{the name of the file to write out} \item{trees}{phylogenetic trees to add to the nexml file (if not already given in x) see \code{\link{add_trees}} for details.} \item{characters}{additional characters} \item{meta}{A meta element or list of meta elements, see \code{\link{add_meta}}} \item{...}{additional arguments to add_basic_meta, such as the title.
See \code{\link{add_basic_meta}}.} } \value{ Writes out a nexml file } \description{ Write nexml files } \examples{ ## Write an ape tree to nexml, analogous to write.nexus: library(ape); data(bird.orders) write.nexml(bird.orders, file="example.xml") \dontrun{ # takes > 5s ## Assemble a nexml section by section and then write to file: library(geiger) data(geospiza) nexml <- add_trees(geospiza$phy) # creates new nexml nexml <- add_characters(geospiza$dat, nexml = nexml) # pass the nexml obj to append character data nexml <- add_basic_meta(title="my title", creator = "Carl Boettiger", nexml = nexml) nexml <- add_meta(meta("prism:modificationDate", format(Sys.Date())), nexml = nexml) write.nexml(nexml, file="example.xml") ## As above, but in one call (except for add_meta() call). write.nexml(trees = geospiza$phy, characters = geospiza$dat, title = "My title", creator = "Carl Boettiger", file = "example.xml") ## Mix and match: identical to the section by section: nexml <- add_meta(meta("prism:modificationDate", format(Sys.Date()))) write.nexml(x = nexml, trees = geospiza$phy, characters = geospiza$dat, title = "My title", creator = "Carl Boettiger", file = "example.xml") } } \seealso{ \code{\link{add_trees}} \code{\link{add_characters}} \code{\link{add_meta}} \code{\link{nexml_read}} } RNeXML/man/nexml_figshare.Rd0000644000176200001440000000254512641021656015365 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/nexml_publish.R \name{nexml_figshare} \alias{nexml_figshare} \title{publish nexml to figshare} \usage{ nexml_figshare(nexml, file = "nexml.xml", categories = "Evolutionary Biology", tags = list("phylogeny", "NeXML"), visibility = c("public", "private", "draft"), id = NULL, ...) } \arguments{ \item{nexml}{a nexml object (or file path to a nexml file)} \item{file}{The filename desired for the object, if nexml is not already a file.
If the first argument is already a path, this value is ignored.} \item{categories}{The figshare categories, must match available set. see \code{fs_add_categories}} \item{tags}{Any keyword tags you want to add to the data.} \item{visibility}{whether the results should be published (public), or kept private, or kept as a draft for further editing before publication. (New versions can be updated, but any former version that was once made public will always be archived and cannot be removed).} \item{id}{an existing figshare id (e.g. from fs_create), to which this file can be appended.} \item{...}{additional arguments} } \value{ the figshare id of the object } \description{ publish nexml to figshare } \examples{ \dontrun{ data(bird.orders) birds <- add_trees(bird.orders) doi <- nexml_figshare(birds, visibility = "public", repository="figshare") } } RNeXML/man/add_trees.Rd0000644000176200001440000000135512641021656014322 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/add_trees.R \name{add_trees} \alias{add_trees} \title{add_trees} \usage{ add_trees(phy, nexml = new("nexml"), append_to_existing_otus = FALSE) } \arguments{ \item{phy}{a phylo object, multiPhylo object, or list of multiPhylo to be added to the nexml} \item{nexml}{a nexml object to which we should append this phylo. By default, a new nexml object will be created.} \item{append_to_existing_otus}{logical, indicating if we should make a new OTU block (default) or append to the existing one.} } \value{ a nexml object containing the phy in nexml format.
} \description{ add_trees } \examples{ library("geiger") data(geospiza) geiger_nex <- add_trees(geospiza$phy) } RNeXML/man/c-ListOfmeta-method.Rd0000644000176200001440000000153312641021656016133 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/meta.R \docType{methods} \name{c,ListOfmeta-method} \alias{c,ListOfmeta-method} \title{Concatenate ListOfmeta elements into a ListOfmeta} \usage{ \S4method{c}{ListOfmeta}(x, ..., recursive = FALSE) } \arguments{ \item{x, ...}{meta or ListOfmeta elements to be concatenated, e.g. see \code{\link{meta}}} \item{recursive}{logical, if 'recursive=TRUE', the function descends through lists and combines their elements into a vector.} } \value{ a ListOfmeta object containing multiple meta elements. } \description{ Concatenate ListOfmeta elements into a ListOfmeta } \examples{ metalist <- c(meta(content="example", property="dc:title"), meta(content="Carl", property="dc:creator")) out <- c(metalist, metalist) out <- c(metalist, meta(content="a", property="b")) } RNeXML/man/get_characters_list.Rd0000644000176200001440000000132512641021656016376 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/deprecated.R \name{get_characters_list} \alias{get_characters_list} \title{Extract the character matrix} \usage{ get_characters_list(nexml, rownames_as_col = FALSE) } \arguments{ \item{nexml}{nexml object (e.g. from read.nexml)} \item{rownames_as_col}{option to return character matrix rownames (with taxon ids) as its own column in the data.frame.
Default is FALSE for compatibility with geiger and similar packages.} } \value{ the list of taxa } \description{ Extract the character matrix } \examples{ comp_analysis <- system.file("examples", "comp_analysis.xml", package="RNeXML") nex <- nexml_read(comp_analysis) get_characters_list(nex) } RNeXML/man/add_namespaces.Rd0000644000176200001440000000267312641021656015323 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/add_namespaces.R \name{add_namespaces} \alias{add_namespaces} \title{add namespaces} \usage{ add_namespaces(namespaces, nexml = new("nexml")) } \arguments{ \item{namespaces}{a named character vector of namespaces} \item{nexml}{a nexml object. will create a new one if none is given.} } \value{ a nexml object with updated namespaces } \description{ add namespaces, avoiding duplication if prefix is already defined } \examples{ ## Create a new nexml object with a single metadata element: modified <- meta(property = "prism:modificationDate", content = "2013-10-04") nex <- add_meta(modified) # Note: 'prism' is defined in nexml_namespaces by default. 
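## A minimal sketch of calling add_namespaces() directly, following the signature in the
## usage section above. The 'ex' prefix and its URI are hypothetical placeholders chosen
## for illustration only; get_namespaces() is used to inspect the result.
nex <- add_namespaces(c(ex = "http://example.org/ns#"), nexml = nex)
get_namespaces(nex)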
## Write multiple metadata elements, including a new namespace: website <- meta(href = "http://carlboettiger.info", rel = "foaf:homepage") # meta can be link-style metadata nex <- add_meta(list(modified, website), namespaces = c(foaf = "http://xmlns.com/foaf/0.1/")) ## Append more metadata, and specify a level: history <- meta(property = "skos:historyNote", content = "Mapped from the bird.orders data in the ape package using RNeXML") nex <- add_meta(history, nexml = nex, level = "trees", namespaces = c(skos = "http://www.w3.org/2004/02/skos/core#")) } \seealso{ \code{\link{meta}} \code{\link{add_meta}} } RNeXML/man/get_trees.Rd0000644000176200001440000000153312641021656014347 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/get_trees.R \name{get_trees} \alias{get_trees} \title{extract a phylogenetic tree from the nexml} \usage{ get_trees(nexml) } \arguments{ \item{nexml}{a representation of the nexml object from which the data is to be retrieved} } \value{ an ape::phylo tree, if only one tree is represented. Otherwise returns a list of lists of multiphylo trees. To consistently receive the list of lists format (preserving the hierarchical nature of the nexml), use \code{\link{get_trees_list}} instead.
} \description{ extract a phylogenetic tree from the nexml } \examples{ comp_analysis <- system.file("examples", "comp_analysis.xml", package="RNeXML") nex <- nexml_read(comp_analysis) get_trees(nex) } \seealso{ \code{\link{get_trees_list}} \code{\link{get_flat_trees}} \code{\link{get_item}} } RNeXML/man/get_flat_trees.Rd0000644000176200001440000000164712641021656015363 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/get_trees.R \name{get_flat_trees} \alias{get_flat_trees} \title{get_flat_trees} \usage{ get_flat_trees(nexml) } \arguments{ \item{nexml}{a representation of the nexml object from which the data is to be retrieved} } \value{ a multiPhylo object (list of ape::phylo objects). See details. } \description{ extract a single multiPhylo object containing all trees in the nexml } \details{ Note that this method collapses any hierarchical structure that may have been present as multiple `trees` nodes in the original nexml (though such a feature is rarely used). To preserve that structure, use \code{\link{get_trees}} instead. } \examples{ comp_analysis <- system.file("examples", "comp_analysis.xml", package="RNeXML") nex <- nexml_read(comp_analysis) get_flat_trees(nex) } \seealso{ \code{\link{get_trees}} \code{\link{get_trees_list}} \code{\link{get_item}} } RNeXML/man/nexml_publish.Rd0000644000176200001440000000130312641021656015232 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/nexml_publish.R \name{nexml_publish} \alias{nexml_publish} \title{publish nexml files to the web and receive a DOI} \usage{ nexml_publish(nexml, ..., repository = "figshare") } \arguments{ \item{nexml}{a nexml object (or file path)} \item{...}{additional arguments, depending on repository.
See examples.} \item{repository}{destination repository} } \value{ a digital object identifier to the published data } \description{ publish nexml files to the web and receive a DOI } \examples{ \dontrun{ data(bird.orders) birds <- add_trees(bird.orders) doi <- nexml_publish(birds, visibility = "public", repository="figshare") } } RNeXML/man/get_metadata.Rd0000644000176200001440000000155312641025014014777 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/get_metadata.R \name{get_metadata} \alias{get_metadata} \title{get_metadata} \usage{ get_metadata(nexml, level = "nexml") } \arguments{ \item{nexml}{a nexml object} \item{level}{the name of the level of element desired, see details} } \value{ the requested metadata as a data.frame. Additional columns indicate the parent element of the return value. } \description{ get_metadata } \details{ 'level' should be either the name of a child element of a NeXML document (e.g. "otu", "characters"), or a path to the desired element, e.g. 'trees/tree' will return the metadata for all phylogenies in all trees blocks. } \examples{ \dontrun{ comp_analysis <- system.file("examples", "primates.xml", package="RNeXML") nex <- nexml_read(comp_analysis) get_metadata(nex) get_metadata(nex, "otus/otu") } } RNeXML/man/get_namespaces.Rd0000644000176200001440000000112012641021656015334 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/get_namespaces.R \name{get_namespaces} \alias{get_namespaces} \title{get namespaces} \usage{ get_namespaces(nexml) } \arguments{ \item{nexml}{a nexml object} } \value{ a named character vector providing the URLs defining each of the namespaces used in the nexml file. Names correspond to the prefix abbreviations of the namespaces.
} \description{ get namespaces } \examples{ comp_analysis <- system.file("examples", "comp_analysis.xml", package="RNeXML") nex <- nexml_read(comp_analysis) get_namespaces(nex) } RNeXML/man/get_rdf.Rd0000644000176200001440000000153612641021656014003 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/get_rdf.R \name{get_rdf} \alias{get_rdf} \title{Extract rdf-xml from a NeXML file} \usage{ get_rdf(file) } \arguments{ \item{file}{the name of a nexml file, or otherwise a nexml object.} } \value{ an RDF-XML object (XMLInternalDocument). This can be manipulated with tools from the XML R package, or converted into a triplestore for use with SPARQL queries from the rrdf R package. } \description{ Extract rdf-xml from a NeXML file } \examples{ \dontrun{ f <- system.file("examples", "meta_example.xml", package="RNeXML") rdf <- get_rdf(f) ## Write to a file and read in with rrdf tmp <- tempfile() saveXML(rdf, tmp) library(rrdf) lib <- load.rdf(tmp) ## Perform a SPARQL query: sparql.rdf(lib, "SELECT ?title WHERE { ?x ?title}") } } RNeXML/man/nexml_get.Rd0000644000176200001440000000461512641021656014354 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/nexml_get.R \name{nexml_get} \alias{get_item} \alias{nexml_get} \title{Get the desired element from the nexml object} \usage{ nexml_get(nexml, element = c("trees", "trees_list", "flat_trees", "metadata", "otu", "taxa", "characters", "characters_list", "namespaces"), ...) } \arguments{ \item{nexml}{a nexml object (from read_nexml)} \item{element}{the kind of object desired, see details.} \item{...}{additional arguments, if applicable to certain elements} } \value{ return type depends on the element requested. See details. } \description{ Get the desired element from the nexml object } \details{ \itemize{ \item{"tree"}{ an ape::phylo tree, if only one tree is represented. Otherwise returns a list of lists of multiphylo trees. 
To consistently receive the list of lists format (preserving the hierarchical nature of the nexml), use \code{trees} instead.} \item{"trees"}{ returns a list of lists of multiphylo trees, even if all trees are in the same `trees` node (and hence the outer list will be of length 1) or if there is only a single tree (and hence the inner list will also be of length 1). This guarantees a consistent return type regardless of the number of trees present in the nexml file, and also preserves any hierarchy/grouping of trees. } \item{"flat_trees"}{ a multiPhylo object (list of ape::phylo objects). Note that this method collapses any hierarchical structure that may have been present as multiple `trees` nodes in the original nexml (though such a feature is rarely used). To preserve that structure, use `trees` instead.} \item{"metadata"}{Get metadata from the specified level (default is top/nexml level) } \item{"otu"}{ returns a named character vector containing all available metadata. names indicate \code{property} (or \code{rel} in the case of links/resourceMeta), while values indicate the \code{content} (or \code{href} for links). } \item{"taxa"}{ alias for otu } } For a slightly cleaner interface, each of these elements is also defined as an S4 method for a nexml object. So in place of `get_item(nexml, "tree")`, one could use `get_tree(nexml)`, and so forth for each element type. } \examples{ comp_analysis <- system.file("examples", "comp_analysis.xml", package="RNeXML") nex <- nexml_read(comp_analysis) nexml_get(nex, "trees") nexml_get(nex, "characters_list") } \seealso{ \code{\link{get_trees}} } RNeXML/man/nexml_add.Rd0000644000176200001440000000164612641021656014326 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/nexml_add.R \name{nexml_add} \alias{nexml_add} \title{add elements to a new or existing nexml object} \usage{ nexml_add(x, nexml = new("nexml"), type = c("trees", "characters", "meta", "namespaces"), ...)
} \arguments{ \item{x}{the object to be added} \item{nexml}{an existing nexml object onto which the object should be appended} \item{type}{the type of object being provided.} \item{...}{additional optional arguments to the add functions} } \value{ a nexml object with the additional data } \description{ add elements to a new or existing nexml object } \examples{ library("geiger") data(geospiza) geiger_nex <- nexml_add(geospiza$phy, type="trees") geiger_nex <- nexml_add(geospiza$dat, nexml = geiger_nex, type="characters") } \seealso{ \code{\link{add_trees}} \code{\link{add_characters}} \code{\link{add_meta}} \code{\link{add_namespaces}} } RNeXML/man/simmap_to_nexml.Rd0000644000176200001440000000125712641021656015564 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/simmap.R \name{simmap_to_nexml} \alias{simmap_to_nexml} \title{simmap_to_nexml} \usage{ simmap_to_nexml(phy, state_ids = NULL) } \arguments{ \item{phy}{a phy object containing simmap phy$maps element, from the phytools package} \item{state_ids}{a named character vector giving the state names corresponding to the ids used to refer to each state in nexml.
If NULL, ids will be generated and states taken from the phy$states names.} } \value{ a nexml representation of the simmap } \description{ simmap_to_nexml } \examples{ data(simmap_ex) phy <- nexml_to_simmap(simmap_ex) nex <- simmap_to_nexml(phy) } RNeXML/man/get_taxa_list.Rd0000644000176200001440000000064512641021656015220 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/get_taxa.R \name{get_taxa_list} \alias{get_otus_list} \alias{get_taxa_list} \title{get_taxa_list} \usage{ get_taxa_list(nexml) } \arguments{ \item{nexml}{a nexml object} } \value{ the list of taxa } \description{ Retrieve names of all species/otus (operational taxonomic units) included in the nexml } \seealso{ \code{\link{get_item}} } RNeXML/man/toPhylo.Rd0000644000176200001440000000102312641021656014016 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/get_trees.R \name{toPhylo} \alias{toPhylo} \title{nexml to phylo} \usage{ toPhylo(tree, otus) } \arguments{ \item{tree}{a nexml tree element} \item{otus}{a character string of taxonomic labels, named by the otu ids, e.g. from get_otu_maps for the otus set matching the relevant trees node.} } \value{ phylo object. If a "reconstructions" annotation is found on the edges, return simmap maps slot as well.
} \description{ nexml to phylo coercion } RNeXML/man/get_license.Rd0000644000176200001440000000043612641021656014650 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/get_basic_metadata.R \name{get_license} \alias{get_license} \title{get_license} \usage{ get_license(nexml) } \arguments{ \item{nexml}{a nexml object} } \value{ the list of taxa } \description{ get_license } RNeXML/man/add_basic_meta.Rd0000644000176200001440000000564012731606043015267 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/add_basic_meta.R \name{add_basic_meta} \alias{add_basic_meta} \title{Add basic metadata} \usage{ add_basic_meta(title = NULL, description = NULL, creator = Sys.getenv("USER"), pubdate = Sys.Date(), rights = "CC0", publisher = NULL, citation = NULL, nexml = new("nexml")) } \arguments{ \item{title}{A title for the dataset} \item{description}{a description of the dataset} \item{creator}{name of the data creator. Can be a string or R person object} \item{pubdate}{publication date. Default is current date.} \item{rights}{the intellectual property rights associated with the data. The default is Creative Commons Zero (CC0) public domain declaration, compatible with all other licenses and appropriate for deposition into the Dryad or figshare repositories. CC0 is also recommended by the Panton Principles. Alternatively, any other plain text string can be added and will be provided as the content attribute to the dc:rights property.} \item{publisher}{the publisher of the dataset. Usually where a user may go to find the canonical copy of the dataset: could be a repository, journal, or academic institution.} \item{citation}{a citation associated with the data. Usually an accompanying academic journal article that indicates how the data should be cited in an academic context. Multiple citations can be included here.
citation can be a plain text object, but is preferably an R `citation` or `bibentry` object (which can include multiple citations). See examples.} \item{nexml}{a nexml object to which metadata should be added. A new nexml object will be created if none exists.} } \value{ an updated nexml object } \description{ adds Dublin Core metadata elements to (top-level) nexml } \details{ \code{add_basic_meta()} is just a wrapper for \code{\link{add_meta}} to make it easy to provide generic metadata without explicitly providing the namespace. For instance, \code{add_basic_meta(title="My title", description="a description")} is identical to: \code{add_meta(list(meta("dc:title", "My title"), meta("dc:description", "a description")))} Most function arguments are mapped directly to the Dublin Core terms of the same name, with the exception of `rights`, which by default maps to the Creative Commons namespace when using CC0 license. } \examples{ nex <- add_basic_meta(title = "My test title", description = "A description of my test", creator = "Carl Boettiger ", publisher = "unpublished data", pubdate = "2012-04-01") ## Adding citation to an R package: nexml <- add_basic_meta(citation=citation("ape")) \dontrun{ ## Use knitcitations package to add a citation by DOI: library(knitcitations) nexml <- add_basic_meta(citation = bib_metadata("10.2307/2408428")) } } \seealso{ \code{\link{add_trees}} \code{\link{add_characters}} \code{\link{add_meta}} } RNeXML/man/nexml_to_simmap.Rd0000644000176200001440000000070612641021656015562 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/simmap.R \name{nexml_to_simmap} \alias{nexml_to_simmap} \title{nexml_to_simmap} \usage{ nexml_to_simmap(nexml) } \arguments{ \item{nexml}{a nexml object} } \value{ a simmap object (phylo object with a $maps element for use in phytools functions).
} \description{ nexml_to_simmap } \examples{ data(simmap_ex) phy <- nexml_to_simmap(simmap_ex) nex <- simmap_to_nexml(phy) } RNeXML/LICENSE0000644000176200001440000000012012641021656012320 0ustar liggesusersYEAR: 2013 - 2014 COPYRIGHT HOLDER: Carl Boettiger ORGANIZATION: rOpenSci