IRanges/DESCRIPTION0000644000175400017540000000427013175724757014663 0ustar00biocbuildbiocbuildPackage: IRanges Title: Infrastructure for manipulating intervals on sequences Description: Provides efficient low-level and highly reusable S4 classes for storing, manipulating and aggregating over annotated ranges of integers. Implements an algebra of range operations, including efficient algorithms for finding overlaps and nearest neighbors. Defines efficient list-like classes for storing, transforming and aggregating large grouped data, i.e., collections of atomic vectors and DataFrames. Version: 2.12.0 Encoding: UTF-8 Author: H. Pagès, P. Aboyoun and M. Lawrence Maintainer: Bioconductor Package Maintainer biocViews: Infrastructure, DataRepresentation Depends: R (>= 3.1.0), methods, utils, stats, BiocGenerics (>= 0.23.3), S4Vectors (>= 0.15.5) Imports: stats4 LinkingTo: S4Vectors Suggests: XVector, GenomicRanges, Rsamtools, GenomicAlignments, GenomicFeatures, BSgenome.Celegans.UCSC.ce2, pasillaBamSubset, RUnit License: Artistic-2.0 Collate: range-squeezers.R Vector-class-leftovers.R List-class-leftovers.R AtomicList-class.R Ranges-class.R subsetting-utils.R Ranges-comparison.R IRanges-class.R IRanges-constructor.R IRanges-utils.R IPos-class.R Grouping-class.R CompressedList-class.R CompressedList-comparison.R Views-class.R Rle-class-leftovers.R RleViews-class.R RleViews-utils.R extractList.R seqapply.R multisplit.R AtomicList-impl.R AtomicList-utils.R ListGrouping-class.R Hits-class-leftovers.R DataFrame-utils.R DataFrameList-class.R DataFrameList-utils.R RangesList-class.R ViewsList-class.R RleViewsList-class.R RleViewsList-utils.R MaskCollection-class.R RangedData-class.R RangedData-utils.R CompressedHitsList-class.R NCList-class.R RangedSelection-class.R read.Mask.R findOverlaps-methods.R intra-range-methods.R inter-range-methods.R reverse-methods.R coverage-methods.R slice-methods.R setops-methods.R nearest-methods.R cbind-Rle-methods.R tile-methods.R zzz.R NeedsCompilation: yes Packaged: 2017-10-30 22:38:39 UTC; biocbuild IRanges/NAMESPACE0000644000175400017540000001723213175713360014362 0ustar00biocbuildbiocbuilduseDynLib(IRanges) import(methods) importFrom(utils, stack, read.table) importFrom(stats, cov, cor, median, quantile, smoothEnds, runmed, "window<-", aggregate, setNames) import(BiocGenerics) import(S4Vectors) importFrom(stats4, summary, update) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Export S4 classes ### exportClasses( Ranges, RangesORmissing, IRanges, NormalIRanges, IPos, NCList, NCLists, Grouping, ManyToOneGrouping, ManyToManyGrouping, H2LGrouping, Dups, GroupingRanges, GroupingIRanges, Partitioning, PartitioningByEnd, PartitioningByWidth, PartitioningMap, CompressedList, Views, RleViews, MaskCollection, AtomicList, CompressedAtomicList, SimpleAtomicList, LogicalList, CompressedLogicalList, SimpleLogicalList, IntegerList, CompressedIntegerList, SimpleIntegerList, NumericList, CompressedNumericList, SimpleNumericList, ComplexList, CompressedComplexList, SimpleComplexList, CharacterList, CompressedCharacterList, SimpleCharacterList, RawList, CompressedRawList, SimpleRawList, RleList, CompressedRleList, SimpleRleList, FactorList, CompressedFactorList, SimpleFactorList, RangesList, SimpleRangesList, IRangesList, CompressedIRangesList, SimpleIRangesList, NormalIRangesList, CompressedNormalIRangesList, SimpleNormalIRangesList, ViewsList, SimpleViewsList, RleViewsList, SimpleRleViewsList, DataFrameList, SimpleDataFrameList, SplitDataFrameList, CompressedSplitDataFrameList, SimpleSplitDataFrameList, RangedData, RangedSelection, SimpleGrouping, CompressedGrouping, SimpleManyToOneGrouping, CompressedManyToOneGrouping, SimpleManyToManyGrouping, CompressedManyToManyGrouping ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Export S3 methods ### S3method(diff, AtomicList) S3method(`window<-`, Vector) S3method(`window<-`, vector) S3method(`window<-`, factor) ### We also export them thru the export() directive so that (a) they can be ### called directly, (b) tab-completion on the name of the generic shows them, ### and (c) methods() doesn't asterisk them. export( diff.AtomicList, "window<-.Vector", "window<-.vector", "window<-.factor" ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Export S4 methods for generics not defined in IRanges ### exportMethods( length, "length<-", names, "names<-", nrow, NROW, ncol, NCOL, dim, rownames, "rownames<-", colnames, "colnames<-", dimnames, "dimnames<-", "[", "[<-", "[[", "[[<-", "$<-", as.vector, as.integer, as.character, as.factor, as.matrix, as.data.frame, as.list, coerce, c, show, match, duplicated, unique, anyDuplicated, "%in%", is.unsorted, order, Ops, Math, Math2, Summary, Complex, summary, rev, rep, drop, start, "start<-", end, "end<-", width, "width<-", min, max, range, which.max, which.min, diff, mean, var, cov, cor, sd, median, quantile, mad, IQR, smoothEnds, runmed, subset, "window<-", transform, nchar, chartr, tolower, toupper, sub, gsub, unlist, stack, "split<-", unsplit, relist, update, append, "!", which, merge, with, within, is.na, by, cbind, rbind, lapply, pmax, pmin, pmax.int, pmin.int, paste, table, tapply, union, intersect, setdiff, values, "values<-", classNameForDisplay, from, to, nLnode, nRnode, pcompare, pcompareRecursively, selfmatch, runLength, "runValue<-", runsum, runmean, runwtsum, runq, elementNROWS, isEmpty, relistToClass, endoapply, mendoapply, revElements, as.env, active, "active<-" ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Export non-generic functions ### export( splitAsList, multisplit, solveUserSEW0, IRanges, solveUserSEW, successiveIRanges, slidingIRanges, breakInChunks, whichAsIRanges, asNormalIRanges, rangeComparisonCodeToLetter, IPos, NCList, NCLists, H2LGrouping, Dups, PartitioningByEnd, PartitioningByWidth, PartitioningMap, RangedData, RangedSelection, RangesList, IRangesList, RleViewsList, "%over%", "%within%", "%outside%", "%pover%", "%pwithin%", "%poutside%", mergeByOverlaps, findOverlapPairs, MaskCollection.show_frame, Mask, read.gapMask, read.agpMask, read.liftMask, read.rmMask, read.trfMask, ##read.chain, successiveViews, slidingViews, LogicalList, IntegerList, NumericList, ComplexList, CharacterList, RawList, RleList, FactorList, DataFrameList, SplitDataFrameList, ManyToOneGrouping, ManyToManyGrouping, regroup, selectNearest ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Export S4 generics defined in IRanges + export corresponding methods ### export( ## range-squeezers.R: ranges, rglist, ## Vector-class-leftovers.R: showAsCell, mstack, ## Ranges-class.R: mid, isNormal, whichFirstNotNormal, ## IPos-class.R: pos, ## Views-class.R: subject, "ranges<-", Views, trim, subviews, viewApply, viewMins, viewMaxs, viewSums, viewMeans, viewWhichMins, viewWhichMaxs, viewRangeMins, viewRangeMaxs, ## Grouping-class.R: nobj, grouplengths, members, vmembers, togroup, togrouplength, high2low, low2high, grouprank, togrouprank, mapOrder, ## Rle-class-leftovers.R: findRange, splitRanges, ## extractList.R: extractList, ## DataFrameList-class.R: columnMetadata, "columnMetadata<-", ## RangesList-class.R: universe, "universe<-", space, ## MaskCollection-class.R: nir_list, desc, "desc<-", maskedwidth, maskedratio, collapse, ## RangedData-class.R: score, "score<-", ## findOverlaps-methods.R: findOverlaps, countOverlaps, overlapsAny, subsetByOverlaps, overlapsRanges, poverlaps, ## intra-range-methods.R: shift, narrow, resize, flank, reflect, promoters, restrict, threebands, ## inter-range-methods.R: reduce, gaps, disjoin, isDisjoint, disjointBins, ## reverse-methods.R: reverse, ## coverage-methods.R: coverage, ## slice-methods.R: slice, ## setops-methods.R: punion, pintersect, psetdiff, pgap, ## nearest-methods.R: precede, follow, nearest, distance, distanceToNearest, ## tile-methods.R: tile, slidingWindows, ## AtomicList-utils.R: ifelse2 ) ### Exactly the same list as above. exportMethods( ranges, rglist, runsum, runmean, runwtsum, runq, showAsCell, mstack, mid, isNormal, whichFirstNotNormal, pos, subject, "ranges<-", Views, trim, subviews, viewApply, viewMins, viewMaxs, viewSums, viewMeans, viewWhichMins, viewWhichMaxs, viewRangeMins, viewRangeMaxs, nobj, grouplengths, members, vmembers, togroup, togrouplength, high2low, low2high, grouprank, togrouprank, mapOrder, findRange, splitRanges, extractList, columnMetadata, "columnMetadata<-", universe, "universe<-", space, nir_list, desc, "desc<-", maskedwidth, maskedratio, collapse, score, "score<-", findOverlaps, countOverlaps, overlapsAny, subsetByOverlaps, overlapsRanges, poverlaps, shift, narrow, resize, flank, reflect, promoters, restrict, threebands, reduce, gaps, disjoin, isDisjoint, disjointBins, reverse, coverage, slice, punion, pintersect, psetdiff, pgap, precede, follow, nearest, distance, distanceToNearest, tile, slidingWindows, ifelse2 ) IRanges/NEWS0000644000175400017540000012725513175713360013651 0ustar00biocbuildbiocbuildCHANGES IN VERSION 2.12.0 ------------------------- NEW FEATURES o Add IPos objects for storing a set of integer positions where most of the positions are typically (but not necessarily) adjacent. o Add coercion of a character vector or factor representing ranges (e.g. "22-155") to an IRanges object, as well as "as.character" and "as.factor" methods for Ranges objects. o Introduce overlapsRanges() as a replacement for "ranges" methods for Hits and HitsList objects, and deprecate the latter. o Add "is.unsorted" method for Ranges objects. o Add "ranges" method for Ranges objects (downgrade the object to an IRanges instance and drop its metadata columns). o Add 'use.names' and 'use.mcols' args to ranges() generic. SIGNIFICANT USER-VISIBLE CHANGES o Change 'maxgap' and 'minoverlap' defaults for findOverlaps() and family (i.e. countOverlaps(), overlapsAny(), and subsetByOverlaps()). This change addresses 2 long-standing issues: (1) by default zero-width ranges are not excluded anymore, and (2) control of zero-width ranges and adjacent ranges is finally decoupled (only partially though). New default for 'minoverlap' is 0 instead of 1. New default for 'maxgap' is -1 instead of 0. See ?findOverlaps for more information about 'maxgap' and the meaning of -1. For example, if 'type' is "any", you need to set 'maxgap' to 0 if you want adjacent ranges to be considered as overlapping. Note that poverlaps() still uses the old 'maxgap' and 'minoverlap' defaults. o subsetByOverlaps() first 2 arguments are now named 'x' and 'ranges' (instead of 'query' and 'subject') for consistency with the transcriptsByOverlaps(), exonsByOverlaps(), and cdsByOverlaps() functions from the GenomicFeatures package and with the snpsByOverlaps() function from the BSgenome package. o Replace ifelse() generic and methods with ifelse2() (eager semantics). o Coercion from Ranges to IRanges now propagates the metadata columns. o Move rglist() generic from GenomicRanges to IRanges package. o The "union", "intersect", and "setdiff" methods for Ranges objects don't act like endomorphisms anymore: now they always return an IRanges *instance* whatever Ranges derivatives are passed to them (e.g. NCList or NormalIRanges). DEPRECATED AND DEFUNCT o Deprecate "ranges" methods for Hits and HitsList objects (replaced with overlapsRanges()). o Deprecate the "overlapsAny", "subsetByOverlaps", "coverage" and "range" methods for RangedData objects. o Deprecate the universe() getter and setter as well as the 'universe' argument of the RangesList(), IRangesList(), RleViewsList(), and RangedData() constructor functions. o Default "togroup" method is now defunct (was deprecated in BioC 3.3). o Remove grouplength() (was deprecated in BioC 3.3 and replaced with grouplengths, then defunct in BioC 3.4). BUG FIXES o nearest() and distanceToNearest() now call findOverlaps() internally with maxgap=0 and minoverlap=0. This fixes incorrect results obtained in some situations e.g. in the situation reported here: https://support.bioconductor.org/p/99369/ (zero-width ranges) but also in this situation: nearest(IRanges(5, 10), IRanges(1, 4:5), select="all") where the 2 ranges in the subject are *both* nearest to the 5-10 range. o Fix restrict() and reverse() on IRanges objects with metadata columns. o Fix table() on Ranges objects. o Various other minor fixes. CHANGES IN VERSION 2.10.0 ------------------------- NEW FEATURES o "range" methods now have a 'with.revmap' argument (like "reduce" and "disjoin" methods). o Add coercion from list-like objects to IRangesList objects. o Add "table" method for SimpleAtomicList objects. o The "gaps" method for CompressedIRangesList objects now uses a chunk processing strategy if the input object has more than 10 million list elements. The hope is to reduce memory usage on very big input objects. DEPRECATED AND DEFUNCT o Remove the RangedDataList and RDApplyParams classes, rdapply(), and the "split" and "reduce" methods for RangedData objects. All these things were defunct in BioC 3.4. o Remove 'ignoreSelf' and 'ignoreRedundant' arguments (replaced by 'drop.self' and 'drop.redundant') from findOverlaps,Vector,missing method (were defunct in BioC 3.4). o Remove GappedRanges class (was defunct in BioC 3.4). BUG FIXES o Fix "setdiff" method for CompressedIRangesList for when all ranges are empty. o Fix long standing bug in coercion from Ranges to PartitioningByEnd when the object to coerce has names. CHANGES IN VERSION 2.8.0 ------------------------ NEW FEATURES o "disjoin" methods now support 'with.revmap' argument. o Add 'invert' argument to subsetByOverlaps(), like grep()'s invert. o Add "unstrsplit" method for RleList objects. o findOverlapPairs() allows 'subject' to be missing for self pairing. o Add "union", "intersect" and "setdiff" methods for Pairs. o Add distance,Pairs,missing method. o Add ManyToManyGrouping, with coercion targets from FactorList and DataFrame. o Add Hits->List and Hits->(ManyToMany)Grouping coercions. o Add "as.matrix" method for AtomicList objects. o Add "selfmatch", "duplicated", "order", "rank", and "median" methods for CompressedAtomicList objects. o Add "anyNA" method for CompressedAtomicList objects that ensures recursive=FALSE. o Add "mean" method for CompressedRleList objects. o Support 'global' argument on "which.min" and "which.max" methods for CompressedAtomicList objects. SIGNIFICANT USER-VISIBLE CHANGES o Make mstack,Vector method more consistent with stack,List method. o Optimize and document coercion from AtomicList to RleViews objects. DEPRECATED AND DEFUNCT o Are now defunct (were deprecated in BioC 3.3): - RangedDataList objects. - RDApplyParams objects and rdapply(). - The "split" and "reduce" methods for RangedData objects. - The 'ignoreSelf' and/or 'ignoreRedundant' arguments of the findOverlaps,Vector,missing method (a.k.a. "self findOverlaps" method). - grouplength() - GappedRanges objects. BUG FIXES o Fix special meaning of findOverlaps's maxgap argument when type="within". o isDisjoint(IRangesList()) now returns logical(0) instead of NULL. o Fixes to regroup() and Grouping construction. o Fix rank,CompressedAtomicList method. o Fix fromLast=TRUE for duplicated,CompressedAtomicList method. CHANGES IN VERSION 2.6.0 ------------------------ NEW FEATURES o Add regroup() function. SIGNIFICANT USER-VISIBLE CHANGES o Remove 'algorithm' argument from findOverlaps(), countOverlaps(), overlapsAny(), subsetByOverlaps(), nearest(), distanceToNearest(), findCompatibleOverlaps(), countCompatibleOverlaps(), findSpliceOverlaps(), summarizeOverlaps(), Union(), IntersectionStrict(), and IntersectionNotEmpty(). The argument was added in BioC 3.1 to facilitate the transition from an Interval Tree to a Nested Containment Lists implementation of findOverlaps() and family. The transition is over. o Restore 'maxgap' special meaning (from BioC < 3.1) when calling findOverlaps() (or other member of the family) with 'type' set to "within". o No more limit on the max depth of *on-the-fly* NCList objects. Note that the limit remains and is still 100000 when the user explicitely calls the NCList() or GNCList() constructor. o Rename 'ignoreSelf' and 'ignoreRedundant' argument of the findOverlaps,Vector,missing method -> 'drop.self' and 'drop.redundant'. The old names are still working but deprecated. o Rename grouplength() -> grouplengths() (old name still available but deprecated). o Modify "replaceROWS" method for IRanges objects so that the replaced elements in 'x' get their metadata columns from 'value'. See this thread on bioc-devel: https://stat.ethz.ch/pipermail/bioc-devel/2015-November/008319.html o Optimized which.min() and which.max() for atomic lists. o Remove the ellipsis (...) from all the setops methods, except the methods for Pairs objects. o Add "togroup" method for ManyToOneGrouping objects and deprecate default method. o Modernize "show" method for Ranges objects: now they're displayed more like GRanges objects. o Coercion from IRanges to NormalIRanges now propagates the metadata columns when the object to coerce is already normal. o Don't export CompressedHitsList anymore from the IRanges package. This doesn't seem to be used at all and it's not clear that we need it. DEPRECATED AND DEFUNCT o Deprecate RDApplyParams objects and rdapply(). o Deprecate RangedDataList objects. o Deprecate the "reduce" method for RangedData objects. o Deprecate GappedRanges objects. o Deprecate the 'ignoreSelf' and 'ignoreRedundant' arguments of the findOverlaps,Vector,missing method in favor of the new 'drop.self' and 'drop.redundant' arguments. o Deprecate grouplength() in favor of grouplengths(). o Default "togroup" method is deprecated. o Remove IntervalTree and IntervalForest classes and methods (were defunct in BioC 3.2). o Remove mapCoords() and pmapCoords() generics (were defunct in BioC 3.2). o Remove all "updateObject" methods (they were all obsolete). BUG FIXES o Fix segfault when calling window() on an Rle object of length 0. o Fix "which.min" and "which.max" methods for IntegerList, NumericList, and RleList objects when 'x' is empty or contains empty list elements. o Fix mishandling of zero-width ranges when calling findOverlaps() (or other member of the family) with 'type' set to "within". o Various fixes to "countOverlaps" method for Vector#missing. See svn commit message for commit 116112 for the details. o Fix validity method for NormalIRanges objects (was not checking anything). CHANGES IN VERSION 2.4.0 ------------------------ NEW FEATURES o Add "cbind" methods for binding Rle or RleList objects together. o Add coercion from Ranges to RangesList. o Add "paste" method for CompressedAtomicList objects. o Add "expand" method for Vector objects for expanding a Vector object 'x' based on a column in mcols(x). o Add overlapsAny,integer,Ranges method. o coverage" methods now accept 'shift' and 'weight' supplied as an Rle. SIGNIFICANT USER-VISIBLE CHANGES o The following was moved to S4Vectors: - The FilterRules stuff. - The "aggregate" methods. - The "split" methods. o The "sum", "min", "max", "mean", "any", and "all" methods on CompressedAtomicList objects are 100X faster on lists with 500k elements, 80X faster for 50k elements. o Tweak "c" method for CompressedList objects to make sure it always returns an object of the same class as its 1st argument. o NCList() constructor now propagates the metadata columns. DEPRECATED AND DEFUNCT o RangedData/RangedDataList are not formally deprecated yet but the documentation now officially declares them as superseded by GRanges/GRangesList and discourages their use. o After being deprecated in BioC 3.1, IntervalTree and IntervalForest objects and the "intervaltree" algorithm in findOverlaps() are now defunct. o After being deprecated in BioC 3.1, mapCoords() and pmapCoords() are now defunct. o Remove seqapply(), mseqapply(), tseqapply(), seqsplit(), and seqby() (were defunct in BioC 3.1). BUG FIXES o Fix FactorList() constructor when 'compress=TRUE' (note that the levels are combined during compression). o Fix c() on CompressedFactorList objects (was returning a CompressedIntegerList object). CHANGES IN VERSION 2.2.0 ------------------------ NEW FEATURES o Add NCList() and NCLists() for preprocessing a Ranges or RangesList object into an NCList or NCLists object that can be used for fast overlap search with findOverlaps(). NCList() and NCLists() are replacements for IntervalTree() and IntervalForest() that use Nested Containment Lists instead of interval trees. For a one time use, it's not advised to explicitely preprocess the input. This is because findOverlaps() or countOverlaps() will take care of it and do a better job at it (that is, they preprocess only what's needed when it's needed and release memory as they go). o Add coercion methods from Hits to CompressedIntegerList, to PartitioningByEnd, and to Partitioning. SIGNIFICANT USER-VISIBLE CHANGES o The code behind overlap-based operations like findOverlaps(), countOverlaps(), subsetByOverlaps(), summarizeOverlaps(), nearest(), etc... was refactored and improved. Some highlights on what has changed: - The underlying code used for finding/counting overlaps is now based on the Nested Containment List algorithm by Alexander V. Alekseyenko and Christopher J. Lee. - The old algorithm based on interval trees is still available (but deprecated). The 'algorithm' argument was added to most overlap-based operations to let the user choose between the new (algorithm="nclist", the default) and the old (algorithm="intervaltree") algorithm. - With the new algorithm, the hits returned by findOverlaps() are not fully ordered (i.e. ordered by queryHits and subject Hits) anymore, but only partially ordered (i.e. ordered by queryHits only). Other than that, and except for the 3 particular situations mentioned below, choosing one or the other doesn't affect the output, only performance. - Either the query or subject can be preprocessed with NCList() for a Ranges object (replacement for IntervalTree()), NCLists() for a RangesList object (replacement for IntervalForest()), and GNCList() for a GenomicRanges object (replacement for GIntervalTree()). However, for a one time use, it's not advised to explicitely preprocess the input. This is because findOverlaps() or countOverlaps() will take care of it and do a better job at it (that is, they preprocess only what's needed when it's needed and release memory as they go). - With the new algorithm, countOverlaps() on Ranges or GenomicRanges objects doesn't call findOverlaps() to collect all the hits in a growing Hits object and count them only at the end. Instead the counting happens at the C level and the hits are not kept. This reduces memory usage considerably when there is a lot of hits. - When 'minoverlap=0', zero-width ranges are interpreted as insertion points and are considered to overlap with ranges that contain them. This is the 1st situation where using 'algorithm="nclist"' or 'algorithm="intervaltree"' produces different output. - When using 'select="arbitrary"', the new algorithm will generally not select the same hits as the old algorithm. This is the 2nd situation where using 'algorithm="nclist"' or 'algorithm="intervaltree"' produces different output. - When using the old interval tree algorithm, 'maxgap' has a special meaning if 'type' is "start", "end", or "within". This is not yet the case with the new algorithm. That feature seems somewhat useful though so maybe the new algorithm should also support it? Anyway, this is the 3rd situation where using 'algorithm="nclist"' or 'algorithm="intervaltree"' produces different output. - Objects preprocessed with NCList(), NCLists(), and GNCList() are serializable. o The RleViewsList() constructor function now reorders its 'rleList' argument so that its names match the names on the 'rangesList' argument. o Minor changes to breakInChunks(): - Add 'nchunk' arg. - Now returns a PartitioningByEnd instead of a PartitioningByWidth object. - Now accepts 'chunksize' of 0 if 'totalsize' is 0. o 300x speedup or more when doing unique() on a CompressedRleList object. o 20x speedup or more when doing unlist() on a SimpleRleList object. o Moved the RleTricks.Rnw vignette to the S4Vectors package. DEPRECATED AND DEFUNCT o Deprecated mapCoords() and pmapCoords(). They're replaced by mapToTranscripts() and pmapToTranscripts() from the GenomicFeatures package and mapToAlignments() and pmapToAlignments() from the GenomicAlignments package. o Deprecated IntervalTree and IntervalForest objects. o seqapply(), seqby(), seqsplit(), etc are now defunct (were deprecated in IRanges 2.0.0). o Removed map(), pmap(), and splitAsListReturnedClass() (were defunct in IRanges 2.0.0). o Removed 'with.mapping' argunment from reduce() methods (was defunct in IRanges 2.0.0). BUG FIXES o findOverlaps,Vector,missing method now accepts extra arguments via ... so for example one can specify 'ignore.strand=TRUE' when calling it on a GRanges object (before that, 'findOverlaps(gr, ignore.strand=TRUE)' would fail). o PartitioningByEnd() and PartitioningByWidth() constructors now check that, when 'x' is an integer vector, it cannot contain NAs or negative values. CHANGES IN VERSION 2.0.0 ------------------------ NEW FEATURES o Add mapCoords() and pmapCoords() as replacements for map() and pmap(). o Add coercion from list to RangesList. o Add slice,ANY method as a convenience for slice(as(x, "Rle"), ...). o Add mergeByOverlaps(); acts like base::merge as far as it makes sense. o Add overlapsAny,Vector,missing method. SIGNIFICANT USER-VISIBLE CHANGES o Move Annotated, DataTable, Vector, Hits, Rle, List, SimpleList, and DataFrame classes to new S4Vectors package. o Move isConstant(), classNameForDisplay(), and low-level argument checking helpers isSingleNumber(), isSingleString(), etc... to new S4Vectors package. o Rename Grouping class -> ManyToOneGrouping. Redefine Grouping class as the parent of all groupings (it formalizes the most general kind of grouping). o Change splitAsList() to a generic. o In rbind,DataFrame method, no longer coerce the combined column to the class of the column in the first argument. o Do not carry over row.names attribute from data.frame to DataFrame. o No longer make names valid in [[<-,DataFrame method. o Make the set operations dispatch on Ranges instead of IRanges; they usually return an IRanges, but the input could be any implementation. o Add '...' to splitAsList() generic. o Speed up trim() on a Views object when trimming is actually not needed (no-op). o Speed up validation of IRanges objects by 2x. o Speed up "flank" method for Ranges objects by 4x. DEPRECATED AND DEFUNCT o Defunct map() and pmap(). o reduce() argument 'with.mapping' is now defunct. o splitAsListReturnedClass() is now defunct. o Deprecate seqapply(), mseqapply(), tseqapply(), seqsplit(), and seqby(). BUG FIXES o Fix rbind,DataFrame method when first column is a matrix. o Fix a memory leak in the interval tree code. o Fix handling of minoverlap > 1 in findOverlaps(), so that it behaves more consistently and respects 'maxgap', as documented. o Fix findOverlaps,IRanges method for select="last". o Fix subset,Vector-method to handle objects with NULL mcols(x) (e.g. Rle object). o Fix internal helper rbind.mcols() for DataFrame (and potentially other tables). o ranges,SimpleRleList method now returns a SimpleRangesList (instead of CompressedRangesList). o Make flank() work on Ranges object of length 0. CHANGES IN VERSION 1.20.0 ------------------------- NEW FEATURES o Add IntervalForest class from Hector Corrada Bravo. o Add a FilterMatrix class, for holding the results of multiple filters. o Add selfmatch() as a faster equivalent of 'match(x, x)'. o Add "c" method for Views objects (only combine objects with same subject). o Add coercion from SimpleRangesList to SimpleIRangesList. o Add an `%outside%` that is the opposite of `%over%`. o Add validation of length() and names() of Vector objects. o Add "duplicated" and "table" methods for Vector objects. o Add some split methods that dispatch to splitAsList() even when only 'f' is a Vector. o Add set methods (setdiff, intersect, union) for Rle. o Add anyNA methods for Rle and Vector. o Add support for subset(), with(), etc on Vector objects, where the expressions are evaluated in the scope of the mcols and fixed columns. For symbols that should resolve in the calling frame, it is supported and encouraged to escape them with bquote-style ".(x)". o Add "tile" generic and methods for partitioning a ranges object into tiles; useful for iterating over subregions. SIGNIFICANT USER-VISIBLE CHANGES o All functionalities related to XVector objects have been moved to the new XVector package. o Refine how isDisjoint() handles empty ranges. o Remove 'keepLength' argument from "window<-" methods. o unlist( , use.names=FALSE) on a CompressedSplitDataFrameList object now preserves the rownames of the list elements, which is more consistent with what unlist() does on other CompressedList objects. o Splitting a list by a Vector just yields a list, not a List. o The rbind,DataFrame method now handles the case where Rle and vector columns need to be combined (assuming an equivalence between Rle and vector). Also the way the result DataFrame is constructed was changed (avoids undesirable coercions and should be faster). o as.data.frame.DataFrame now passes 'stringsAsFactors=FALSE' and 'check.names=!optional' to the underlying data.frame() call. as(x,"DataFrame") sets 'optional=TRUE' when delegating. Most places where we called as.data.frame(), we now call 'as(x,"data.frame")'. o The [<-,DataFrame method now coerces column sub-replacement value to class of column when the column already exists. o DataFrame() now automatically derives rownames (from the first argument that has some). This is a fairly significant change in behavior, but it probably does better match user behavior. o Make sure that SimpleList objects are coerced to a DataFrame with a single column. The automatic coecion methods created by the methods package were trying to create a DataFrame with one column per element, because DataFrame extends SimpleList. o Change default to 'compress=TRUE' for RleList() constructor. o tapply() now handles the case where only INDEX is a Vector (e.g. an Rle object). o Speedup coverage() in the "tiling case" (i.e. when 'x' is a tiling of the [1, width] interval). This makes it much faster to turn into an Rle a coverage loaded from a BigWig, WIG or BED as a GRanges object. o Allow logical Rle return values from filter rules. o FilterRules no longer requires its elements to be named. o The select,Vector method now returns a DataFrame even when a single column is selected. o Move is.unsorted() generic to BiocGenerics. DEPRECATED AND DEFUNCT o Deprecate seqselect() and subsetByRanges(). o Deprecate 'match.if.overlap' arg of "match" method for Ranges objects. o "match" and "%in%" methods that operate on Views, ViewsList, RangesList, or RangedData objects (20 methods in total) are now defunct. o Remove previously defunct tofactor(). BUG FIXES o The subsetting code for Vector derivatives was substancially refactored. As a consequence, it's now cleaner, simpler, and [ and [[ behave more consistently across Vector derivatives. Some obscure long-standing bugs have been eliminated and the code can be slightly faster in some circumstances. o Fix bug in findOverlaps(); zero-width ranges in the query no longer produce hits ever (regardless of 'maxgap' and 'minoverlap' values). o Correctly free memory allocated for linked list of results compiled for findOverlaps(select="all"). o Various fixes for AsIs and DataFrames. o Allow zero-row replacement values in [<-,DataFrame. o Fix long standing segfault in "[" method for Rle objects (when doing Rle()[0]). o "show" methods now display its most specific class when a column or slot is an S3 object for which class() returns more than one class. o "show" methods now display properly cells that are arrays. o Fix the [<-,DataFrame method for when a value DataFrame has matrix columns. o Fix ifelse() for when one or more of the arguments are Rle objects. o Fix coercion from SimpleList to CompressedList via AtomicList constructors. o Make "show" methods robust to "showHeadLines" and "showTailLines" global options set to NA, Inf or non-integer values. o Fix error condition in eval,FilterRules method. o Corrected an error formatting in eval,FilterRules,ANY method. CHANGES IN VERSION 1.18.0 ------------------------- NEW FEATURES o Add global options 'showHeadLines' and 'showTailLines' to control the number of head/tails lines displayed by "show" methods for Ranges, DataTable, and Hits objects. o "subset" method for Vector objects now considers metadata columns. o Add classNameForDisplay() generic and use it in all "show" methods defined in IRanges and GenomicRanges. o as(x, "DataFrame") now works on *any* R object. o Add findMatches(), an enhanced version of match() that returns all the matches between 'x' and 'table'. The hits are returned in a Hits object. Also add countMatches() for counting the number of matches in 'table' for each element in 'x'. o Add overlapsAny() as a replacement for %in% (now deprecated on range-based objects), and %over% and %within% as convenience wrappers for overlapsAny(). %over% is the replacement for %in%. o Add 'with.mapping' arg to "reduce" methods for IRanges, Ranges, Views, RangesList, and CompressedIRangesList objects. o Add "order" method for Rle objects. o Add subsetByRanges() generic with methods for ANY, NULL, vector, and IRanges for now. This is work-in-progress and more methods will be added soon. The long term plan is to make this a replacement for seqselect(), but with a faster and cleaner implementation. o Add promoters() generic with methods for Ranges, RangesList, Views, and CompressedIRangesList objects. o elementLengths() now works on XVectorList objects (and thus works on DNAStringSet objects and family defined in the Biostrings package). Note that this is the first step towards having relist() work on XVector objects (e.g. DNAString objects) eventhough this is not ready yet. o Add "mstack" method for DataFrame objects. o Add 'name.var' argument to "stack" method for List objects for naming the optional column formed when the elements themselves have named elements. SIGNIFICANT USER-VISIBLE CHANGES o "distanceToNearest" methods now return a Hits instead of a DataFrame object. o The behavior of distance() has changed. Adjacent and overlapping ranges now return a distance of 0L. See ?distance man page for details. A temporary warning will be emitted by distance() until the release of Bioconductor 2.13. o Change arg list of expand() generic: function(x, ...) instead of function(x, colnames, keepEmptyRows). o Dramatic duplicated() and unique() speedups on CompressedAtomicList objects. o Significant endoapply() speedup on XVectorList objects (this benefits DNAStringSet objects and family defined in the Biostrings package). o 2x speedup to "c" method for CompressedList objects. o classNameForDisplay() strips 'Simple' or 'Compressed', which affects all the "show" methods based on it. So now: > IntegerList(1:4, 2:-3) IntegerList of length 2 [[1]] 1 2 3 4 [[2]] 2 1 0 -1 -2 -3 instead of: > IntegerList(1:4, 2:-3) CompressedIntegerList of length 2 [[1]] 1 2 3 4 [[2]] 2 1 0 -1 -2 -3 o Optimization of "[<-" method for Rle objects when no indices are selected (just return self). o "stack" method for List objects now creates a factor for the optional name variable. o Evaluating FilterRules now subsets by each filter individually, rather than subsetting by all at the end. o Optimized which() on CompressedLogicalList objects. o All the binary comparison operations (==, <=, etc...) on Ranges objects are now using compare() behind the scene. This makes them slightly faster and also slightly more memory efficient. DEPRECATED AND DEFUNCT o %in% is now deprecated on range-based objects. Please use %over% instead. More precisely: - "match" and "%in%" methods that operate on Views, ViewsList, RangesList, or RangedData objects (20 methods in total) are now deprecated. - Behavior of match() and %in% on Ranges objects was changed (and will issue a warning) to use equality instead of overlap for comparing elements between Ranges objects 'x' and 'table'. The old behavior is still available for match() via new 'match.if.overlap' arg that is FALSE by default (the arg will be deprecated in BioC 2.13 and removed in BioC 2.14). o tofactor() is now defunct. o '.ignoreElementMetadata' argument of "c" method for IRanges objects is now defunct. BUG FIXES o Small fix to "unlist" method for CompressedList objects when 'use.names' is TRUE and 'x' is a zero-length named List (the zero-length vector returned in that case was not named, now it is). o "resize" method for Ranges objects now allows zero-length 'fix' when 'x' is zero-length. o Subsetting a Views object now subsets its metadata columns. o Names on the vector-like columns of a DataFrame object are now preserved when calling DataFrame(), or when coercing to DataFrame, or when combining DataFrame objects with rbind(). o relist() now propagates the names on 'skeleton' when returning a SimpleList. o Better argument checking in breakInChunks(). o Fix broken "showAsCell" method for ANY. Now tries to coerce uni-dimensional objects to vector instead of data.frame (which never worked anyway, due to a bug). o Fix long standing bug in "flank" method for Ranges objects: it no longer returns an invalid object when NAs are passed thru the 'width' arg. Now it's an error to try to do so. o Fix issue with some of the "as.env" methods not being able to find the environment of the caller. o Fix bug in "showAsCell" method for AtomicList objects: now returns character(0) instead of NULL on an object of length 0. o sort() now drops NA's when 'na.last=NA' on an Rle object (consistent with base::sort). o table() now handles NA's appropriately on an Rle object. o table() now returns all the levels on a factor-Rle object. o Fix sub-replacement of Rles when using Ranges as the index. o Fix bug in [<- method for DataFrame objects. The fix corrects the way a new column created by a subset assignment is filled. Previously, if the first row was set, say, to '1', all values in the column were set to '1' when they needed to be set to NA (for consistency with data.frame). o Fix bug in compare() (was not returning 0 when comparing a 0-width range to itself). o Fix naming of column when passing an AsIs matrix to DataFrame() -- no more .X suffix. o Fix "rbind" method for DataFrame objects when some columns are matrix objects. CHANGES IN VERSION 1.16.0 ------------------------- NEW FEATURES o as( , "SimpleList"), as( , "CompressedList"), and as( , "List") now work on atomic vectors, and each element of the vector corresponds to an element of the returned List (this is consistent with as.list). o Add as.list,Rle method. o Add as.matrix,Views method. Each view corresponds to a row in the returned matrix. Rows corresponding to views shorter than the longest view are right-padded with NAs. o Add FilterClosure closure class for functions placed into a FilterRules. Has methods for getting parameters and showing. o Support 'na.rm' argument in "runsum", "runwtsum", "runq", and "runmean" methods for Rle and RleList objects. o Add splitAsList() and splitAsListReturnedClass(). o Improve summary,FilterRules to support serial evaluation, discarded counts (instead of passed) and percentages. o Make rename work on ordinary vector (in addition to Vector). o Add coercion from RangedData to CompressedIRangesList, IRangesList, or RangesList. It propagates the data columns (aka values) of the RangedData object to the inner metadata columns of the RangesList object. o Add 'NG' arg to PartitioningByEnd() and PartitioningByWidth() constructors. o Make PartitioningByEnd() work on list-like objects (like PartitioningByWidth()). o Fast disjoin() for moderate-sized CompressedIRangesList. o Add countQueryHits() and countSubjectHits(). o coverage() now supports method="auto" and this is the new default. o Add the flippedQuery(), levels(), ngap(), Lngap(), Rngap(), Lencoding(), and Rencoding() getters for OverlapEncodings objects. o Add "encodeOverlaps" method for GRangesList objects. o Enhance "[" methods for IRanges, XVector, XVectorList, and MaskCollection objects, as well as "[<-" method for IRanges objects, by supporting the following subscript types: NULL, Rle, numeric, logical, character, and factor. (All the methods listed above already supported some of those types but no method supported them all). o Add remapHits() for remapping the query and subject hits of a Hits object. o Add match,Hits method. o Add %in%,Vector method. o Add "compare", "==", "!=", "<=", ">=", "<", ">", "is.unsorted", "order", "rank", "match", and "duplicated" methods for XRawList objects. unique() and sort() also work on these objects via the "unique" and "sort" methods for Vector objects. o Add expand() for expanding a DataFrame based on the contents of one or more designated columms. o After being deprecated (in BioC 2.9) and defunct (in BioC 2.10), the "as.vector" method for AtomicList objects is back, but now it mimics what as.vector() does on an ordinary list i.e. it's equivalent to 'as.vector(as.list(x), mode=mode)'. Also coercions from AtomicList to logical/integer/numeric/double/complex/character/raw are back and based on the "as.vector" method for AtomicList objects i.e. they work only on objects with top-level elements of length <= 1. o DataFrame constructor now supports 'check.names' argument. o Add revElements() generic with methods for List and CompressedList objects. SIGNIFICANT USER-VISIBLE CHANGES o Splitting / relisting a Hits object now returns a HitsList instead of an ordinary list. o Operations in the Ops group between a List and an atomic vector operand now coerce the atomic vector to List (SimpleList or CompressedList) before performing the operation. Also, operands are recycled and a better job is done returning zero length results of the correct type. o Change the warning for 'Integer overflow ...' thrown by sum() on integer-Rle's o DataFrame now coerces List/list value to DataFrame in [<-. o Fix as.matrix,DataFrame for zero column DataFrames. Returns an nrow()x0 logical matrix. o union,Hits method now sorts the returned hits first by query hit, then by subject hit. o Add mcols() accessor as the preferred way (over elementMetadata() and values()) to access the metadata columns of a Vector object. o By default, mcols(x) and elementMetadata(x) do NOT propagate the names of x as the row names of the returned DataTable anymore. However the user can still get the old behavior by doing mcols(x, use.names=TRUE). o [<-,XVectorList now preserves the original names instead of propagating the names of the replacement value, which is consistent with how [<- operates on an ordinary vector/list. o coverage() now returns a numeric-Rle when passed numeric weights. o When called on a List object with use.names=TRUE, unlist() no longer tries to mimic the kind of non-sense name mangling that base::unlist() does (e.g. on list(a=1:3)) in a pointless effort to return a vector with unique names. o Remove 'hits' argument from signature of encodeOverlaps() generic function. o unique,Vector now drops the names for consistency with base::unique(). o Remove make.names() coercion in colnames<-,DataFrame for consistency with data.frame. DEPRECATED AND DEFUNCT o Deprecated tofactor(). o Remove RangesMatching, RangesMatchingList, and Binning classes. o Change from deprecated to defunct: matchMatrix(), "dim" method for Hits objects, and RangesMatchingList(). BUG FIXES o Fix bug in pintersect,IRanges,IRanges when input had empty ranges (broken since 2010-03-04). o Avoid integer overflow in mean,Rle method by coercing integer-Rle to numeric-Rle internally. o Change evaluation frame of with,List to parent.frame(), and get the enclosure correct in eval,List. o Many fixes and improvements to coercion from RangesList to RangedData (see commit 68195 for the details). o Fix "runValue" and "ranges" methods for CompressedRleList objects (broken for a very long time). o shift,Ranges method now fails in case of integer overflow instead of returning an invalid Ranges object. o mstack() now works on Vector objects with NULL metadata columns. o In case of integer overflow, coverage() now puts NAs in the returned Rle and issues a warning. o Fix bug in xvcopy,XRawList objects that prevented sequences from being removed from the cache of a BSgenome object. See commit 67171 for the details. o Fix issues related to duplicate column names in DataFrame (see commit 67163 for the details). o Fix a bunch of subsetting methods that were not subsetting the metadata columns: "[", "subseq", and "seqselect" methods for XVector objects, "seqselect" and "window" methods for XVectorList objects, and "[" method for MaskCollection objects. o Fix empty replacement with [<-,Vector o Make %in% robust on an empty 'table' argument when operating on Hits objects. CHANGES IN VERSION 1.14.0 ------------------------- NEW FEATURES o The map generic and RangesMapping class for mapping ranges between sequences according to some alignment. Some useful methods are implemented in GenomicRanges. o The Hits class has experimental support for basic set operations, including setdiff, union and intersect. o Added a number of data manipulation functions and methods, including mstack, multisplit, rename, unsplit for Vector. o Added compare() generic for generalized range-wise comparison of 2 range-based objects. o Added OverlapEncodings class and encodeOverlaps() generic for dealing with "overlap encodings". o subsetByOverlaps() should now work again on an RleViews object. o DataFrame now supports storing an array (like a matrix) in a column. o Added as.matrix,DataFrame method. o Added merge,DataTable,DataTable method. o Added disjointBins,RangesList method. o Added ranges,Rle and ranges,RleList methods. o Added which.max,Rle method. o Added drop,AtomicList method. o Added tofactor() wrapper around togroup(). o Added coercions from vector to any AtomicList subtype (compressed and uncompressed). o Added AtomicList to Character/Numeric/Logical/Integer/Raw/ComplexList coercions. o Added revElements() for reversing individual elements of a List object. SIGNIFICANT USER-VISIBLE CHANGES o RangesMatching has been renamed to Hits and extends Vector, so that it supports metadata columns and other features. o RangesMatchingList has been renamed to HitsList. o The 2 columns of the matrix returned by the "as.matrix" method for Hits objects are now named queryHits/subjectHits instead of query/subject, for consistency with the queryHits() and subjectHits() getters. o queryLength()/subjectLength() are recommended alternatives to dim,Hits. o breakInChunks() returns a PartitioningByWidth object. o The 'weight' arg in "coverage" methods for IRanges, Views and MaskCollection objects now can also be a single string naming a column in elementMetadata(x). o "countOverlaps" methods now propagate the names of the query. DEPRECATED AND DEFUNCT o matchMatrix,Hits is deprecated. o Moved the following deprecated features to defunct status: - use of as.data.frame() or as( , "data.frame") on an AtomicList object; - all coercion methods from AtomicList to atomic vectors; - subsetting an IRanges by Ranges; - subsetting a RangesList or RangedData by RangesList. BUG FIXES o within,RangedData/List now support replacing columns o aggregate() override no longer breaks on . ~ x formulas o "[", "c", "rep.int" and "seqselect" methods for Rle objects are now safer and will raise an error if the object to be returned has a length > .Machine$integer.max o Avoid blowing up memory by not expanding 'logical' Rle's into logical vectors internally in "slice" method for RleList objects. CHANGES IN VERSION 1.12.0 ------------------------- NEW FEATURES o Add "relist" method that works on a List skeleton. o Add XDoubleViews class with support of most of the functionalities available for XIntegerViews. o c() now works on XInteger and XDouble objects (in addition to XRaw objects). o Add min, max, mean, sum, which.min, which.max methods as synonyms for the view* functions. SIGNIFICANT USER-VISIBLE CHANGES o Views and RleViewsList classes don't derive from IRanges and IRangesList classes anymore. o When used on a List or a list, togroup() now returns an integer vector (instead of a factor) for consistency with what it does on other objects (e.g. on a Partitioning object). o Move compact() generic from Biostrings to IRanges. o Drop deprecated 'multiple' argument from "findOverlaps" methods. o Drop deprecated 'start' and 'symmetric' arguments from "resize" method for Ranges objects. DEPRECATED AND DEFUNCT o Using as.data.frame() and or as( , "data.frame") on an AtomicList object is deprecated. o Deprecate all coercion methods from AtomicList to atomic vectors. Those methods were unlisting the object, which can still be done with unlist(). o Deprecate the Binning class. o Remove defunct overlap() and countOverlap(). BUG FIXES o togroup() on a List or a list does not look at the names anymore to infer the grouping, only at the shape of the list-like object. o Fix 'relist(IRanges(), IRangesList())'. o Fix 'rep.int(Rle(), integer(0))'. o Fix some long-standing issues with the XIntegerViews code (better handling of "out of limits" or empty views, overflows, NAs). IRanges/R/0000755000175400017540000000000013175713360013337 5ustar00biocbuildbiocbuildIRanges/R/AtomicList-class.R0000644000175400017540000000271113175713360016636 0ustar00biocbuildbiocbuild### ========================================================================= ### AtomicList objects ### ------------------------------------------------------------------------- ## A list that holds atomic objects setClass("AtomicList", representation("VIRTUAL"), prototype = prototype(elementType = "logical"), contains = "List") setClass("LogicalList", representation("VIRTUAL"), prototype = prototype(elementType = "logical"), contains = "AtomicList") setClass("IntegerList", representation("VIRTUAL"), prototype = prototype(elementType = "integer"), contains = "AtomicList") setClass("NumericList", representation("VIRTUAL"), prototype = prototype(elementType = "numeric"), contains = "AtomicList") setClass("ComplexList", representation("VIRTUAL"), prototype = prototype(elementType = "complex"), contains = "AtomicList") setClass("CharacterList", representation("VIRTUAL"), prototype = prototype(elementType = "character"), contains = "AtomicList") setClass("RawList", representation("VIRTUAL"), prototype = prototype(elementType = "raw"), contains = "AtomicList") setClass("RleList", representation("VIRTUAL"), prototype = prototype(elementType = "Rle"), contains = "AtomicList") setClass("FactorList", representation("VIRTUAL"), prototype = prototype(elementType = "factor"), contains = "IntegerList") IRanges/R/AtomicList-impl.R0000644000175400017540000004345113175713360016500 0ustar00biocbuildbiocbuild### ========================================================================= ### AtomicList object implementations ### ------------------------------------------------------------------------- ## Possible optimizations for compressed lists: ## - order/sort: unlist, order by split factor first ## - cumsum: unlist, cumsum and subtract offsets .ATOMIC_TYPES <- c("logical", "integer", "numeric", "complex", "character", "raw") setClassUnion("atomic", .ATOMIC_TYPES) ## A list that holds atomic objects setClass("CompressedAtomicList", contains = c("AtomicList", "CompressedList"), representation("VIRTUAL")) setClass("SimpleAtomicList", contains = c("AtomicList", "SimpleList"), representation("VIRTUAL")) setClass("CompressedLogicalList", prototype = prototype(elementType = "logical", unlistData = logical()), contains = c("LogicalList", "CompressedAtomicList")) setClass("SimpleLogicalList", prototype = prototype(elementType = "logical"), contains = c("LogicalList", "SimpleAtomicList")) setClass("CompressedIntegerList", prototype = prototype(elementType = "integer", unlistData = integer()), contains = c("IntegerList", "CompressedAtomicList")) setClass("SimpleIntegerList", prototype = prototype(elementType = "integer"), contains = c("IntegerList", "SimpleAtomicList")) setClass("CompressedNumericList", prototype = prototype(elementType = "numeric", unlistData = numeric()), contains = c("NumericList", "CompressedAtomicList")) setClass("SimpleNumericList", prototype = prototype(elementType = "numeric"), contains = c("NumericList", "SimpleAtomicList")) setClass("CompressedComplexList", prototype = prototype(elementType = "complex", unlistData = complex()), contains = c("ComplexList", "CompressedAtomicList")) setClass("SimpleComplexList", prototype = prototype(elementType = "complex"), contains = c("ComplexList", "SimpleAtomicList")) setClass("CompressedCharacterList", prototype = prototype(elementType = "character", unlistData = character()), contains = c("CharacterList", "CompressedAtomicList")) setClass("SimpleCharacterList", prototype = prototype(elementType = "character"), contains = c("CharacterList", "SimpleAtomicList")) setClass("CompressedRawList", prototype = prototype(elementType = "raw", unlistData = raw()), contains = c("RawList", "CompressedAtomicList")) setClass("SimpleRawList", prototype = prototype(elementType = "raw"), contains = c("RawList", "SimpleAtomicList")) setClass("CompressedRleList", prototype = prototype(elementType = "Rle", unlistData = new("Rle")), contains = c("RleList", "CompressedAtomicList")) setClass("SimpleRleList", prototype = prototype(elementType = "Rle"), contains = c("RleList", "SimpleAtomicList")) setClass("CompressedFactorList", prototype = prototype(elementType = "factor", unlistData = factor()), contains = c("FactorList", "CompressedAtomicList")) setClass("SimpleFactorList", prototype = prototype(elementType = "factor"), contains = c("FactorList", "SimpleAtomicList")) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructors ### .dotargsAsList <- function(type, ...) { listData <- list(...) if (length(listData) == 1) { arg1 <- listData[[1]] if (is.list(arg1) || is(arg1, "List")) listData <- arg1 else if (type == "integer" && class(arg1) == "character") listData <- strsplitAsListOfIntegerVectors(arg1) # weird special case } listData } AtomicListConstructor <- function(type, compress.default = TRUE) { constructor <- eval(substitute(function(..., compress = compress.default) { if (!isTRUEorFALSE(compress)) stop("'compress' must be TRUE or FALSE") listData <- .dotargsAsList(type, ...) CompressedOrSimple <- if (compress) "Compressed" else "Simple" if (is(listData, S4Vectors:::listClassName(CompressedOrSimple, type))) listData else CoercerToList(type, compress)(listData) }, list(type = type))) formals(constructor)$compress <- compress.default constructor } LogicalList <- AtomicListConstructor("logical") IntegerList <- AtomicListConstructor("integer") NumericList <- AtomicListConstructor("numeric") ComplexList <- AtomicListConstructor("complex") CharacterList <- AtomicListConstructor("character") RawList <- AtomicListConstructor("raw") RleList <- AtomicListConstructor("Rle") FactorList <- AtomicListConstructor("factor") ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion ### setMethod("as.list", "CompressedAtomicList", function(x, use.names = TRUE) { if (is(x, "CompressedRleList")) { callNextMethod(x, use.names = use.names) } else { codes <- seq_len(length(x)) ans <- split(x@unlistData, structure(rep.int(codes, elementNROWS(x)), levels = as.character(codes), class = "factor")) if (use.names) { names(ans) <- names(x) } else { names(ans) <- NULL } ans } }) setAs("CompressedAtomicList", "list", function(from) as.list(from)) ### Equivalent to 'as.vector(as.list(x), mode=mode)' but faster on ### CompressedAtomicList objects (10x, 75x, or more, depending on 'length(x)'). setMethod("as.vector", "AtomicList", function(x, mode="any") { valid_modes <- c("any", .ATOMIC_TYPES, "double", "list") mode <- match.arg(mode, valid_modes) if (mode %in% c("any", "list")) return(as.list(x)) x_eltNROWS <- elementNROWS(x) if (any(x_eltNROWS > 1L)) stop("coercing an AtomicList object to an atomic vector ", "is supported only for\n", " objects with top-level elements of length <= 1") ans <- base::rep.int(as.vector(NA, mode=mode), length(x)) ans[x_eltNROWS == 1L] <- as.vector(unlist(x, use.names=FALSE), mode=mode) ans } ) as.matrix.AtomicList <- function(x, col.names=NULL, ...) { p <- PartitioningByEnd(x) vx <- decode(unlist(x, use.names=FALSE)) if (is.null(col.names)) { col.names <- names(vx) } if (is.null(col.names) || is.character(col.names)) { col.ind <- as.integer(IRanges(1, width(p))) } else if (is.list(col.names) || is(col.names, "List")) { col.names <- unlist(col.names, use.names=FALSE) if (is.factor(col.names)) { col.ind <- as.integer(col.names) col.names <- levels(col.names) } else { col.ind <- selfmatch(col.names) col.names <- col.names[col.ind == seq_along(col.ind)] } } else { stop("'col.names' should be NULL, a character vector or list") } row.ind <- togroup(p) nc <- if (!is.null(col.names)) length(col.names) else max(width(p)) m <- matrix(nrow=length(x), ncol=nc) m[cbind(row.ind, col.ind)] <- vx if (!is.null(col.names)) colnames(m) <- col.names m } setMethod("as.matrix", "AtomicList", function(x, col.names=NULL) as.matrix.AtomicList(x, col.names)) setMethod("drop", "AtomicList", function(x) { x_eltNROWS <- elementNROWS(x) if (any(x_eltNROWS > 1)) stop("All element lengths must be <= 1") x_dropped <- rep.int(NA, sum(x_eltNROWS)) x_dropped[x_eltNROWS > 0] <- unlist(x, use.names = FALSE) names(x_dropped) <- names(x) x_dropped }) CoercerToList <- function(type, compress) { .coerceToList <- if (compress) coerceToCompressedList else S4Vectors:::coerceToSimpleList function(from) { .coerceToList(from, type) } } setListCoercions <- function(type) { CompressedClass <- S4Vectors:::listClassName("Compressed", type) SimpleClass <- S4Vectors:::listClassName("Simple", type) Class <- S4Vectors:::listClassName("", type) hasCompressedList <- CompressedClass != "CompressedList" if (hasCompressedList) { setAs("ANY", CompressedClass, CoercerToList(type, compress = TRUE)) } setAs("ANY", SimpleClass, CoercerToList(type, compress = FALSE)) setAs("ANY", Class, CoercerToList(type, compress = hasCompressedList)) setAs("SimpleList", Class, CoercerToList(type, compress = FALSE)) setAs("list", Class, CoercerToList(type, compress = FALSE)) } setListCoercions("logical") setListCoercions("integer") setListCoercions("numeric") setListCoercions("complex") setListCoercions("character") setListCoercions("raw") setListCoercions("Rle") setListCoercions("factor") ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### General methods ### ### Could actually be made the "table" method for List objects. Will work on ### any List object 'x' for which 'as.factor(unlist(x))' works. setMethod("table", "AtomicList", function(...) { args <- list(...) if (length(args) != 1L) stop("\"table\" method for AtomicList objects ", "can only take one input object") x <- args[[1L]] if (!pcompareRecursively(x)) { ## Not sure why callNextMethod() doesn't work. Is it because of ## dispatch on the ellipsis? #return(callNextMethod()) return(selectMethod("table", "Vector")(...)) } y1 <- togroup(PartitioningByWidth(x)) attributes(y1) <- list(levels=as.character(seq_along(x)), class="factor") y2 <- as.factor(unlist(x, use.names=FALSE)) ans <- table(y1, y2) names(dimnames(ans)) <- NULL x_names <- names(x) if (!is.null(x_names)) rownames(ans) <- x_names ans } ) setMethod("table", "SimpleAtomicList", function(...) { args <- list(...) if (length(args) != 1L) stop("\"table\" method for SimpleAtomicList objects ", "can only take one input object") x <- args[[1L]] levs <- sort(unique(unlist(lapply(x, function(xi) { if (!is.null(levels(xi))) levels(xi) else unique(xi) }), use.names=FALSE))) as.table(do.call(rbind, lapply(x, function(xi) { if (is(xi, "Rle")) runValue(xi) <- factor(runValue(xi), levs) else xi <- factor(xi, levs) table(xi) }))) }) setCompressedNumericalListMethod <- function(fun, def, where=topenv(parent.frame())) { types <- c("Logical", "Integer", "Numeric") classNames <- paste0("Compressed", types, "List") lapply(classNames, function(className) { C_fun <- paste0(className, "_", sub(".", "_", fun, fixed=TRUE)) body(def) <- eval(call("substitute", body(def))) setMethod(fun, className, def, where=where) }) } setCompressedNumericalListMethod("is.unsorted", function(x, na.rm = FALSE, strictly=FALSE) { stopifnot(isTRUEorFALSE(na.rm)) stopifnot(isTRUEorFALSE(strictly)) .Call(C_fun, x, na.rm, strictly) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Rle methods ### ### 'use.names' is ignored. setMethod("unlist", "SimpleRleList", function (x, recursive=TRUE, use.names=TRUE) { if (!identical(recursive, TRUE)) stop("\"unlist\" method for SimpleRleList objects ", "does not support the 'recursive' argument") ans_values <- unlist(lapply(x@listData, slot, "values"), use.names=FALSE) ans_lengths <- unlist(lapply(x@listData, slot, "lengths"), use.names=FALSE) Rle(ans_values, ans_lengths) } ) setMethod("runValue", "RleList", function(x) { as(lapply(x, runValue), "List") }) setMethod("runValue", "CompressedRleList", function(x) { rle <- unlist(x, use.names=FALSE) rlePart <- PartitioningByWidth(runLength(rle)) listPart <- PartitioningByEnd(x) ## 'rlePart' cannot contain empty ranges so using ## Using 'hit.empty.query.ranges=TRUE' won't affect the result ## (because 'rlePart' cannot contain empty ranges) but it makes ## findOverlaps_Ranges_Partitioning() just a little bit faster. hits <- findOverlaps_Ranges_Partitioning(rlePart, listPart, hit.empty.query.ranges=TRUE) ans_partitioning <- PartitioningByEnd(subjectHits(hits), NG=length(x)) ans_unlistData <- runValue(rle)[queryHits(hits)] ans <- relist(ans_unlistData, ans_partitioning) names(ans) <- names(x) ans } ) setReplaceMethod("runValue", "CompressedRleList", function(x, value) { if (!identical(elementNROWS(ranges(x)), elementNROWS(value))) stop("elementNROWS() of 'x' and 'value' must match") runValue(x@unlistData) <- unlist(value, use.names=FALSE) x }) setReplaceMethod("runValue", "SimpleRleList", function(x, value) { if (!identical(elementNROWS(ranges(x)), elementNROWS(value))) stop("elementNROWS() of 'x' and 'value' must match") x@listData <- mapply(function(rle, v) { runValue(rle) <- v rle }, x, value, SIMPLIFY=FALSE) x }) setMethod("runLength", "RleList", function(x) { as(lapply(x, runLength), "IntegerList") }) setMethod("runLength", "CompressedRleList", function(x) { width(ranges(x)) }) setMethod("ranges", "RleList", function(x, use.names=TRUE, use.mcols=FALSE) { as(lapply(x, ranges, use.names=use.names, use.mcols=use.mcols), "List") }) diceRangesByList <- function(x, list) { listPart <- PartitioningByEnd(list) ## 'x' cannot contain empty ranges so using ## 'hit.empty.query.ranges=TRUE' won't affect the result but ## it makes findOverlaps_Ranges_Partitioning() just a little ## bit faster. hits <- findOverlaps_Ranges_Partitioning(x, listPart, hit.empty.query.ranges=TRUE) ov <- overlapsRanges(x, listPart, hits) ans_unlistData <- shift(ov, 1L - start(listPart)[subjectHits(hits)]) ans_partitioning <- PartitioningByEnd(subjectHits(hits), NG=length(list)) ans <- relist(ans_unlistData, ans_partitioning) names(ans) <- names(list) ans } setMethod("ranges", "CompressedRleList", function(x, use.names=TRUE, use.mcols=FALSE) { rle <- unlist(x, use.names=FALSE) rlePart <- PartitioningByWidth(runLength(rle)) diceRangesByList(rlePart, x) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Factor methods ### setMethod("levels", "FactorList", function(x) { CharacterList(lapply(x, levels)) }) setMethod("levels", "CompressedFactorList", function(x) { setNames(rep(CharacterList(levels(x@unlistData)), length(x)), names(x)) }) setMethod("unlist", "SimpleFactorList", function(x, recursive = TRUE, use.names = TRUE) { levs <- levels(x) if (length(x) > 1L && !all(vapply(levs[-1L], identical, logical(1L), levs[[1L]]))) { stop("inconsistent level sets") } structure(callNextMethod(), levels=as.character(levs[[1L]]), class="factor") }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "show" method. ### .showAtomicList <- function(object, minLines, ...) { len <- length(object) object_names <- names(object) k <- min(minLines, len) d <- len - minLines for (i in seq_len(k)) { if (is.null(object_names)) { label <- i } else { nm <- object_names[[i]] if (is.na(nm)) { label <- "NA" } else { label <- paste0("\"", nm, "\"") } } label <- paste0("[[", label, "]]") if (length(object[[i]]) == 0) { cat(label, " ", sep = "") print(object[[i]]) } else { cat(S4Vectors:::labeledLine(label, object[[i]], labelSep = "", count = FALSE)) } } if (d > 0) cat("...\n<", d, ifelse(d == 1, " more element>\n", " more elements>\n"), sep="") } setMethod("show", "AtomicList", function(object) { cat(classNameForDisplay(object), " of length ", length(object), "\n", sep = "") .showAtomicList(object, 10) } ) setMethod("show", "RleList", function(object) { lo <- length(object) k <- min(5, length(object)) diffK <- lo - 5 cat(classNameForDisplay(object), " of length ", lo, "\n", sep = "") show(as.list(head(object, k))) if (diffK > 0) cat("...\n<", diffK, ifelse(diffK == 1, " more element>\n", " more elements>\n"), sep="") }) IRanges/R/AtomicList-utils.R0000644000175400017540000007574013175713360016705 0ustar00biocbuildbiocbuild### ========================================================================= ### Common operations on AtomicList objects ### ------------------------------------------------------------------------- ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Iteration ### setMethod("lapply", "CompressedAtomicList", function(X, FUN, ...) { if (is(X, "CompressedRleList")) { callNextMethod(X, FUN, ...) } else { lapply(as.list(X), FUN, ...) } }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Group generic methods ### emptyOpsReturnValue <- function(.Generic, e1, e2, compress) { dummy.vector <- do.call(.Generic, list(vector(e1@elementType), vector(e2@elementType))) CoercerToList(NULL, compress)(dummy.vector) } recycleList <- function(x, length.out) { if (length.out %% length(x) > 0L) warning("shorter object is not a multiple of longer object length") rep(x, length.out = length.out) } setMethod("Ops", signature(e1 = "SimpleAtomicList", e2 = "SimpleAtomicList"), function(e1, e2) { if (length(e1) == 0L || length(e2) == 0L) { return(emptyOpsReturnValue(.Generic, e1, e2, compress = FALSE)) } n <- max(length(e1), length(e2)) e1 <- recycleList(e1, n) e2 <- recycleList(e2, n) as(Map(.Generic, e1, e2), "List") }) repLengthOneElements <- function(x, times) { x@unlistData <- rep(x@unlistData, times) x@partitioning@end <- cumsum(times) x } recycleListElements <- function(x, newlen) { x_eltNROWS <- elementNROWS(x) if (identical(x_eltNROWS, newlen)) { return(x) } if (all(x_eltNROWS == 1L)) { ans <- repLengthOneElements(x, newlen) } else { ans <- rep(x, newlen / x_eltNROWS) if (length(unlist(ans, use.names=FALSE)) != sum(newlen)) { stop("Some element lengths are not multiples of their corresponding ", "element length in ", deparse(substitute(x))) } } ans } doBinaryCompressedListOp <- function(OP, e1, e2, skeleton) { if (!missing(skeleton)) { n <- length(skeleton) } else { n <- max(length(e1), length(e2)) } e1 <- recycleList(e1, n) e2 <- recycleList(e2, n) if (missing(skeleton)) { n1 <- elementNROWS(e1) n2 <- elementNROWS(e2) if (any(n1 != n2)) { en <- ifelse(n1 == 0L | n2 == 0L, 0L, pmax.int(n1, n2)) } else { en <- NULL } nms <- names(e1) if (is.null(nms)) nms <- names(e2) } else { en <- elementNROWS(skeleton) nms <- names(skeleton) } if (!is.null(en)) { e1 <- recycleListElements(e1, en) e2 <- recycleListElements(e2, en) } partitioning <- PartitioningByEnd(e1) names(partitioning) <- nms relist(OP(unlist(e1, use.names=FALSE), unlist(e2, use.names=FALSE)), partitioning) } setMethod("Ops", signature(e1 = "CompressedAtomicList", e2 = "CompressedAtomicList"), function(e1, e2) { if (length(e1) == 0L || length(e2) == 0L) { return(emptyOpsReturnValue(.Generic, e1, e2, compress = TRUE)) } doBinaryCompressedListOp(function(x, y) { .Generic <- .Generic callGeneric(x, y) }, e1, e2) }) setMethod("Ops", signature(e1 = "SimpleAtomicList", e2 = "CompressedAtomicList"), function(e1, e2) { if (sum(as.numeric(elementNROWS(e1))) < .Machine$integer.max) e1 <- as(e1, "CompressedList") else e2 <- as(e2, "SimpleList") callGeneric(e1, e2) }) setMethod("Ops", signature(e1 = "CompressedAtomicList", e2 = "SimpleAtomicList"), function(e1, e2) { if (sum(as.numeric(elementNROWS(e2))) < .Machine$integer.max) e2 <- as(e2, "CompressedList") else e1 <- as(e1, "SimpleList") callGeneric(e1, e2) }) setMethod("Ops", signature(e1 = "AtomicList", e2 = "atomic"), function(e1, e2) { e2 <- as(e2, "List") callGeneric(e1, e2) }) setMethod("Ops", signature(e1 = "atomic", e2 = "AtomicList"), function(e1, e2) { e1 <- as(e1, "List") callGeneric(e1, e2) }) setMethod("Ops", signature(e1 = "SimpleAtomicList", e2 = "atomic"), function(e1, e2) { e2 <- as(e2, "SimpleList") callGeneric(e1, e2) }) setMethod("Ops", signature(e1 = "atomic", e2 = "SimpleAtomicList"), function(e1, e2) { e1 <- as(e1, "SimpleList") callGeneric(e1, e2) }) setMethod("Ops", signature(e1 = "CompressedAtomicList", e2 = "atomic"), function(e1, e2) { if (length(e2) > 1) { e2 <- S4Vectors:::recycleVector(e2, length(e1)) e2 <- rep(e2, elementNROWS(e1)) } relist(callGeneric(e1@unlistData, e2), e1) }) setMethod("Ops", signature(e1 = "atomic", e2 = "CompressedAtomicList"), function(e1, e2) { if (length(e1) > 1) { e1 <- S4Vectors:::recycleVector(e1, length(e2)) e1 <- rep(e1, elementNROWS(e2)) } relist(callGeneric(e1, e2@unlistData), e2) }) setMethod("Math", "CompressedAtomicList", function(x) { relist(callGeneric(x@unlistData), x) }) setMethod("cumsum", "CompressedAtomicList", function(x) { xunlist <- unlist(x, use.names=FALSE) xcumsum <- cumsum(as.numeric(xunlist)) partition <- PartitioningByEnd(x) ans <- xcumsum - rep(xcumsum[start(partition)] - xunlist[start(partition)], width(partition)) relist(ans, x) }) setMethod("cumprod", "CompressedAtomicList", function(x) { as(lapply(x, .Generic), "CompressedList") }) setMethod("cummin", "CompressedAtomicList", function(x) { as(lapply(x, .Generic), "CompressedList") }) setMethod("cummax", "CompressedAtomicList", function(x) { as(lapply(x, .Generic), "CompressedList") }) setMethod("Math", "SimpleAtomicList", function(x) as(lapply(x@listData, .Generic), "List")) setMethod("Math2", "CompressedAtomicList", function(x, digits) { if (missing(digits)) digits <- ifelse(.Generic == "round", 0, 6) relist(callGeneric(x@unlistData, digits = digits), x) }) setMethod("Math2", "SimpleAtomicList", function(x, digits) { if (missing(digits)) digits <- ifelse(.Generic == "round", 0, 6) as(lapply(x@listData, .Generic, digits = digits), "List") }) setMethod("Summary", "AtomicList", function(x, ..., na.rm = FALSE) { sapply(x, .Generic, na.rm = na.rm) }) setMethod("any", "CompressedAtomicList", function(x, na.rm = FALSE) { stopifnot(isTRUEorFALSE(na.rm)) ans <- sum(x, na.rm=TRUE) > 0L if (!na.rm) { ans[!ans & any(is.na(x), na.rm=TRUE)] <- NA } ans }) setMethod("all", "CompressedAtomicList", function(x, na.rm = FALSE) { stopifnot(isTRUEorFALSE(na.rm)) ans <- !any(!x, na.rm=TRUE) if (!na.rm) { ans[ans & any(is.na(x), na.rm=TRUE)] <- NA } ans }) setMethod("anyNA", "CompressedAtomicList", function(x, recursive=FALSE) { callNextMethod(x, recursive=FALSE) ## recursion will just slow us down }) rowsumCompressedList <- function(x, ..., na.rm = FALSE) { x_flat <- unlist(x, use.names = FALSE) ans <- vector(class(x_flat), length(x)) non_empty <- elementNROWS(x) > 0 if (is.logical(x_flat)) x_flat <- as.integer(x_flat) ans[non_empty] <- rowsum(x_flat, togroup(PartitioningByWidth(x)), reorder = FALSE, na.rm = na.rm)[,1] setNames(ans, names(x)) } setCompressedListSummaryMethod <- function(fun, where=topenv(parent.frame())) { setCompressedNumericalListMethod(fun, function(x, na.rm = FALSE) { stopifnot(isTRUEorFALSE(na.rm)) .Call(C_fun, x, na.rm, PACKAGE="IRanges") }, where) } setCompressedListSummaryMethod("sum") setCompressedListSummaryMethod("prod") setCompressedListSummaryMethod("min") setCompressedListSummaryMethod("max") setMethods("range", list("CompressedLogicalList", "CompressedIntegerList", "CompressedNumericList"), function(x, na.rm=FALSE) { stopifnot(isTRUEorFALSE(na.rm)) cbind(min(x, na.rm=na.rm), max(x, na.rm=na.rm)) }) setMethod("Summary", "CompressedRleList", function(x, ..., na.rm = FALSE) { toViewFun <- list(max = viewMaxs, min = viewMins, sum = viewSums) if (!is.null(viewFun <- toViewFun[[.Generic]])) { ans <- viewFun(as(x, "RleViews"), na.rm = na.rm) names(ans) <- names(x) ans } else sapply(x, .Generic, na.rm = na.rm) }) setMethod("all", "CompressedRleList", function(x, ..., na.rm = FALSE) { args <- list(...) if (length(args) > 0L) stop("Only a single argument in '...' is supported for now") if (!isTRUEorFALSE(na.rm)) stop("'na.rm' must be TRUE or FALSE") rv <- runValue(x) if (na.rm) rv <- rv[!is.na(rv)] rv_eltNROWS <- elementNROWS(rv) ans <- rv_eltNROWS == 0L singletons <- rv_eltNROWS == 1L ans[singletons] <- unlist(rv, use.names = FALSE)[singletons[togroup(PartitioningByWidth(rv))]] ans }) setMethod("Complex", "CompressedAtomicList", function(z) relist(callGeneric(z@unlistData), z)) setMethod("Complex", "SimpleAtomicList", function(z) as(lapply(z@listData, .Generic), "List")) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### More list-ized methods ### setListMethod <- function(f, inputClass, outputBaseClass, whichArg = 1L, remainingSignature = character(), mapply = FALSE, endoapply = FALSE, applyToUnlist = FALSE, where = topenv(parent.frame())) { fargs <- formals(args(get(f))) args <- sapply(names(fargs), as.name) names(args) <- sub("...", "", names(args), fixed = TRUE) if (applyToUnlist) { call2 <- as.call(c(as.name("@"), args[[whichArg]], "partitioning")) args[[whichArg]] <- as.call(c(as.name("@"), args[[whichArg]], "unlistData")) call1 <- as.call(c(as.name(f), args)) call <- as.call(c(as.name("new2"), paste("Compressed", outputBaseClass, sep=""), unlistData = call1, partitioning = call2, check = FALSE)) } else { args <- c(args[[whichArg]], as.name(f), args[-whichArg]) if (endoapply) { call <- as.call(c(as.name("endoapply"), args)) } else if (missing(outputBaseClass)) { call <- as.call(c(as.name("sapply"), args, list(simplify = TRUE))) } else { if (mapply) { if (length(args) <= 3) { call <- as.call(c(as.name("mapply"), args[c(2:1,3L)], SIMPLIFY = FALSE)) } else { call <- as.call(c(as.name("mapply"), args[c(2:1,3L)], MoreArgs = as.call(c(as.name("list"), tail(args, -3))), SIMPLIFY = FALSE)) } } else { call <- as.call(c(as.name("lapply"), args)) } if (extends(inputClass, "SimpleList")) { call <- as.call(c(as.name("new2"), paste("Simple", outputBaseClass, sep=""), listData = call, check = FALSE)) } else { call <- as.call(c(as.name(outputBaseClass), call, compress = TRUE)) } } } def <- as.function(c(fargs, call)) environment(def) <- parent.frame() setMethod(f, c(rep("ANY", whichArg - 1L), inputClass, remainingSignature), def, where) } setAtomicListMethod <- function(f, inputBaseClass = "AtomicList", outputBaseClass, whichArg = 1L, remainingSignature = character(), mapply = FALSE, endoapply = FALSE, applyToUnlist = FALSE, addRleList = TRUE, rleListOutputBaseClass = "RleList", where = topenv(parent.frame())) { if (missing(outputBaseClass)) { for (i in inputBaseClass) setListMethod(f, i, whichArg = whichArg, remainingSignature = remainingSignature, endoapply = endoapply, where = where) } else if (endoapply) { setListMethod(f, "AtomicList", whichArg = whichArg, remainingSignature = remainingSignature, endoapply = TRUE, where = where) } else { setListMethod(f, paste("Simple", inputBaseClass, sep = ""), outputBaseClass = outputBaseClass, whichArg = whichArg, remainingSignature = remainingSignature, mapply = mapply, where = where) setListMethod(f, paste("Compressed", inputBaseClass, sep = ""), outputBaseClass = outputBaseClass, whichArg = whichArg, remainingSignature, mapply = mapply, applyToUnlist = applyToUnlist, where = where) if (addRleList) { setListMethod(f, "SimpleRleList", outputBaseClass = rleListOutputBaseClass, whichArg = whichArg, remainingSignature = remainingSignature, mapply = mapply, where = where) setListMethod(f, "CompressedRleList", outputBaseClass = rleListOutputBaseClass, whichArg = whichArg, remainingSignature = remainingSignature, mapply = mapply, applyToUnlist = applyToUnlist, where = where) } } } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Logical methods ### setAtomicListMethod("which", inputBaseClass = "LogicalList", outputBaseClass = "IntegerList", rleListOutputBaseClass = "IntegerList") setMethod("which", "CompressedLogicalList", function(x) { x.flat <- unlist(x, use.names = FALSE) part <- PartitioningByEnd(x) which.global <- which(x.flat) group <- findInterval(which.global, start(part)) which.local <- which.global - start(part)[group] + 1L ans <- splitAsList(which.local, factor(group, seq_len(length(x)))) names(ans) <- names(x) ans }) ifelseReturnValue <- function(yes, no, len) { proto <- function(x) new(if (is.atomic(x)) class(x) else x@elementType) v <- logical() v[1L] <- proto(yes)[1L] v[1L] <- proto(no)[1L] v compress <- is(yes, "CompressedList") || is(no, "CompressedList") as(rep(v, length.out = len), if(compress) "CompressedList" else "SimpleList") } setGeneric("ifelse2", function(test, yes, no) standardGeneric("ifelse2")) setMethods("ifelse2", list(c("ANY", "ANY", "List"), c("ANY", "List", "List"), c("ANY", "List", "ANY")), function(test, yes, no) { ans <- ifelseReturnValue(yes, no, length(test)) ok <- !(nas <- is.na(test)) if (any(test[ok])) ans[test & ok] <- rep(yes, length.out = length(ans))[test & ok] if (any(!test[ok])) ans[!test & ok] <- rep(no, length.out = length(ans))[!test & ok] ans[nas] <- NA names(ans) <- names(test) ans }) setMethods("ifelse2", list(c("CompressedLogicalList", "ANY", "ANY"), c("CompressedLogicalList", "ANY", "List"), c("CompressedLogicalList", "List", "ANY"), c("CompressedLogicalList", "List", "List")), function(test, yes, no) { doBinaryCompressedListOp(function(yes, no) { ifelse(unlist(test, use.names=FALSE), yes, no) }, as(yes, "List"), as(no, "List"), test) }) setMethods("ifelse2", list(c("SimpleLogicalList", "ANY", "ANY"), c("SimpleLogicalList", "ANY", "List"), c("SimpleLogicalList", "List", "ANY"), c("SimpleLogicalList", "List", "List")), function(test, yes, no) { as(mapply(ifelse, test, yes, no, SIMPLIFY=FALSE), "List") }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Numerical methods ### ### which.min() and which.max() setMethods("which.min", list("IntegerList", "NumericList", "RleList"), function(x) setNames(as.integer(lapply(x, which.min)), names(x)) ) setMethods("which.max", list("IntegerList", "NumericList", "RleList"), function(x) setNames(as.integer(lapply(x, which.max)), names(x)) ) toglobal <- function(i, x) { start(PartitioningByEnd(x)) + i - 1L } setCompressedListWhichSummaryMethod <- function(fun, where=topenv(parent.frame())) { def <- function(x, global = FALSE) { stopifnot(isTRUEorFALSE(global)) ans <- .Call(C_fun, x) if (global) { ans <- toglobal(ans, x) } ans } setCompressedNumericalListMethod(fun, def, where) } setCompressedListWhichSummaryMethod("which.min") setCompressedListWhichSummaryMethod("which.max") setMethod("which.min", "CompressedRleList", function(x) { viewWhichMins(as(x, "RleViews"), na.rm=TRUE) - c(0L, head(cumsum(elementNROWS(x)), -1)) }) setMethod("which.max", "CompressedRleList", function(x) { viewWhichMaxs(as(x, "RleViews"), na.rm=TRUE) - c(0L, head(cumsum(elementNROWS(x)), -1)) }) for (i in c("IntegerList", "NumericList", "RleList")) { setAtomicListMethod("diff", inputBaseClass = i, endoapply = TRUE) setMethod("pmax", i, function(..., na.rm = FALSE) mendoapply(pmax, ..., MoreArgs = list(na.rm = na.rm))) setMethod("pmin", i, function(..., na.rm = FALSE) mendoapply(pmin, ..., MoreArgs = list(na.rm = na.rm))) setMethod("pmax.int", i, function(..., na.rm = FALSE) mendoapply(pmax.int, ..., MoreArgs = list(na.rm = na.rm))) setMethod("pmin.int", i, function(..., na.rm = FALSE) mendoapply(pmin.int, ..., MoreArgs = list(na.rm = na.rm))) } setMethod("mean", "AtomicList", function(x, ...) sapply(x, mean, ...) ) setMethods("mean", list("CompressedLogicalList", "CompressedIntegerList", "CompressedNumericList", "CompressedRleList"), function(x, trim = 0, na.rm = FALSE) { stopifnot(isTRUEorFALSE(na.rm)) stopifnot(isSingleNumber(trim)) if (trim > 0) { return(callNextMethod()) } x_eltNROWS <- if (na.rm) sum(!is.na(x)) else elementNROWS(x) sum(x, na.rm=na.rm) / x_eltNROWS }) setMethod("var", c("AtomicList", "missing"), function(x, y=NULL, na.rm=FALSE, use) { if (missing(use)) use <- ifelse(na.rm, "na.or.complete", "everything") sapply(x, var, na.rm=na.rm, use=use) } ) setMethod("var", c("AtomicList", "AtomicList"), function(x, y=NULL, na.rm=FALSE, use) { if (missing(use)) use <- ifelse(na.rm, "na.or.complete", "everything") mapply(var, x, y, MoreArgs=list(na.rm=na.rm, use=use)) } ) setMethod("cov", c("AtomicList", "AtomicList"), function(x, y=NULL, use="everything", method=c("pearson", "kendall", "spearman")) mapply(cov, x, y, MoreArgs=list(use=use, method=match.arg(method))) ) setMethod("cor", c("AtomicList", "AtomicList"), function(x, y=NULL, use="everything", method=c("pearson", "kendall", "spearman")) mapply(cor, x, y, MoreArgs=list(use=use, method=match.arg(method))) ) setMethod("sd", "AtomicList", function(x, na.rm=FALSE) sapply(x, sd, na.rm=na.rm) ) setMethod("median", "AtomicList", function(x, na.rm=FALSE) sapply(x, median, na.rm=na.rm) ) setMethod("median", "CompressedAtomicList", function(x, na.rm=FALSE) { stopifnot(isTRUEorFALSE(na.rm)) sx <- sort(x) n <- lengths(sx) half <- (n + 1L)%/%2L even <- n%%2L != 1L ind <- IRanges(half, width=1L+even) NAs <- half == 0L ind <- relist(ind[!NAs], PartitioningByWidth(as.integer(!NAs))) ## ind <- as(half, "IntegerList") ## ind[even] <- ind[even] + as(0:1, "IntegerList") ans <- mean(sx[ind]) if (!na.rm) { NAs <- NAs | anyNA(x) } if (any(NAs)) { ans[NAs] <- as(NA, elementType(x)) } ans }) setMethod("quantile", "AtomicList", function(x, ...) sapply(x, quantile, ...) ) setMethod("mad", "AtomicList", function(x, center=median(x), constant=1.4826, na.rm=FALSE, low=FALSE, high=FALSE) { if (!missing(center)) stop("'center' argument is not supported") sapply(x, mad, constant=constant, na.rm=na.rm, low=low, high=high) } ) setMethod("IQR", "AtomicList", function(x, na.rm=FALSE, type=7) sapply(x, IQR, na.rm=na.rm, type=type) ) diff.AtomicList <- function(x, ...) diff(x, ...) setMethod("diff", "CompressedAtomicList", function(x, lag = 1L, differences = 1L) { stopifnot(isSingleNumber(lag)) stopifnot(isSingleNumber(differences)) r <- x for (i in seq_len(differences)) r <- ptail(r, -lag) - phead(r, -lag) r }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Running window statistic methods ### setAtomicListMethod("smoothEnds", inputBaseClass = "IntegerList", outputBaseClass = "NumericList", addRleList = FALSE) setAtomicListMethod("smoothEnds", inputBaseClass = "NumericList", endoapply = TRUE) setAtomicListMethod("smoothEnds", inputBaseClass = "RleList", endoapply = TRUE) setMethod("runmed", "CompressedIntegerList", function(x, k, endrule = c("median", "keep", "constant"), algorithm = NULL, print.level = 0) NumericList(lapply(x, runmed, k = k, endrule = match.arg(endrule), algorithm = algorithm, print.level = print.level))) setMethod("runmed", "SimpleIntegerList", function(x, k, endrule = c("median", "keep", "constant"), algorithm = NULL, print.level = 0) NumericList(lapply(x, runmed, k = k, endrule = match.arg(endrule), algorithm = algorithm, print.level = print.level), compress = FALSE)) setMethod("runmed", "NumericList", function(x, k, endrule = c("median", "keep", "constant"), algorithm = NULL, print.level = 0) endoapply(x, runmed, k = k, endrule = match.arg(endrule), algorithm = algorithm, print.level = print.level)) setMethod("runmed", "RleList", function(x, k, endrule = c("median", "keep", "constant"), algorithm = NULL, print.level = 0) endoapply(x, runmed, k = k, endrule = match.arg(endrule))) setMethod("runmean", "RleList", function(x, k, endrule = c("drop", "constant"), na.rm = FALSE) endoapply(x, runmean, k = k, endrule = match.arg(endrule), na.rm = na.rm)) setMethod("runsum", "RleList", function(x, k, endrule = c("drop", "constant"), na.rm = FALSE) endoapply(x, runsum, k = k, endrule = match.arg(endrule), na.rm = na.rm)) setMethod("runwtsum", "RleList", function(x, k, wt, endrule = c("drop", "constant"), na.rm = FALSE) endoapply(x, runwtsum, k = k, wt = wt, endrule = match.arg(endrule), na.rm = na.rm)) setMethod("runq", "RleList", function(x, k, i, endrule = c("drop", "constant"), na.rm = FALSE) endoapply(x, runq, k = k, i = i, endrule = match.arg(endrule), na.rm = na.rm)) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Character ### nchar_CompressedList <- function(x, type="chars", allowNA=FALSE) { unlisted_x <- unlist(x, use.names=FALSE) unlisted_ans <- nchar(unlisted_x, type=type, allowNA=allowNA) relist(unlisted_ans, x) } setMethod("nchar", "CompressedCharacterList", nchar_CompressedList) setMethod("nchar", "SimpleCharacterList", nchar_CompressedList) setMethod("nchar", "CompressedRleList", nchar_CompressedList) setMethod("nchar", "SimpleRleList", nchar_CompressedList) ## need vectorized start, end ##setAtomicListMethod("substr") ##setAtomicListMethod("substring") setAtomicListMethod("chartr", inputBaseClass = "CharacterList", outputBaseClass = "CharacterList", whichArg = 3L, applyToUnlist = TRUE) setAtomicListMethod("tolower", inputBaseClass = "CharacterList", outputBaseClass = "CharacterList", applyToUnlist = TRUE) setAtomicListMethod("toupper", inputBaseClass = "CharacterList", outputBaseClass = "CharacterList", applyToUnlist = TRUE) setAtomicListMethod("sub", inputBaseClass = "CharacterList", outputBaseClass = "CharacterList", whichArg = 3L, applyToUnlist = TRUE) setAtomicListMethod("gsub", inputBaseClass = "CharacterList", outputBaseClass = "CharacterList", whichArg = 3L, applyToUnlist = TRUE) ### TODO: grep, grepl setMethod("unstrsplit", "CharacterList", function(x, sep="") unstrsplit(as.list(x), sep=sep) ) setMethod("unstrsplit", "RleList", function(x, sep="") unstrsplit(CharacterList(x, compress=FALSE), sep=sep) ) setMethod("paste", "CompressedAtomicList", function(..., sep=" ", collapse=NULL) { args <- lapply(list(...), as, "CharacterList") x_eltNROWS <- do.call(pmax, lapply(args, elementNROWS)) args <- lapply(args, recycleListElements, x_eltNROWS) unlisted <- lapply(args, unlist, use.names=FALSE) ans <- relist(do.call(paste, c(unlisted, sep=sep)), PartitioningByWidth(x_eltNROWS)) if (!is.null(collapse)) { ans <- unstrsplit(ans, collapse) } ans }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Sorting ### setMethod("rank", "CompressedAtomicList", function (x, na.last = TRUE, ties.method = c("average", "first", "last", "random", "max", "min")) { stopifnot(isTRUE(na.last)) ties.method <- match.arg(ties.method) if (ties.method == "last" || ties.method == "random") stop("'ties.method' last/random not yet supported") p <- PartitioningByEnd(x) o <- order(togroup(p), unlist(x, use.names=FALSE)) r <- as.integer(IRanges(1L, width=width(p))) gp <- PartitioningByEnd(end(Rle(unlist(x, use.names=FALSE)[o]))) v <- switch(ties.method, average=(r[start(gp)] + r[end(gp)])/2, first=r, ## last=, ## random=, max=r[end(gp)], min=r[start(gp)]) if (ties.method != "first") v <- rep(v, width(gp)) r[o] <- v relist(r, x) }) setMethod("order", "CompressedAtomicList", function (..., na.last = TRUE, decreasing = FALSE, method = c("auto", "shell", "radix")) { args <- list(...) if (length(args) != 1L) stop("\"order\" method for CompressedAtomicList objects ", "can only take one input object") x <- args[[1L]] p <- PartitioningByEnd(x) ux <- unlist(x, use.names=FALSE) o <- order(togroup(p), ux, na.last=na.last, decreasing=decreasing, method=method) skeleton <- if (is.na(na.last) && anyNA(ux)) { skeleton <- PartitioningByWidth(width(p) - sum(is.na(x))) } else p relist(o, skeleton) - start(p) + 1L }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Set/comparison methods ### subgrouping <- function(x) { g <- grouping(togroup(PartitioningByEnd(x)), unlist(x, use.names=FALSE)) as(g, "ManyToOneGrouping") } .unique.RleList <- function(x, incomparables=FALSE, ...) unique(runValue(x), incomparables=incomparables, ...) setMethod("unique", "RleList", .unique.RleList) .duplicated.CompressedAtomicList <- function(x, incomparables=FALSE, fromLast=FALSE, nmax=NA, ...) { if (!identical(incomparables, FALSE)) stop("\"duplicated\" method for CompressedList objects ", "does not support the 'incomparables' argument") if (length(list(...)) > 0L) { stop("arguments in '...' are not supported") } stopifnot(isTRUEorFALSE(fromLast)) g <- subgrouping(x) p <- PartitioningByEnd(g) first <- unlist(g)[if (fromLast) end(p) else start(p)] v <- rep(TRUE, length(unlist(g))) v[first] <- FALSE relist(v, x) } setMethod("duplicated", "CompressedAtomicList", .duplicated.CompressedAtomicList) setMethod("selfmatch", "CompressedAtomicList", function(x, global=FALSE) { g <- subgrouping(x) first <- unlist(g)[start(PartitioningByEnd(g))] ux <- unlist(x, use.names=FALSE) ux[unlist(g)] <- rep(first, lengths(g)) ans <- relist(ux, x) if (!global) { ans <- ans - start(ans) + 1L } ans }) IRanges/R/CompressedHitsList-class.R0000644000175400017540000000322213175713360020354 0ustar00biocbuildbiocbuild### ========================================================================= ### CompressedHitsList objects ### ------------------------------------------------------------------------- ### [H.P. - 2015/12/17] Why do we need this? Where is it used? setClass("CompressedHitsList", prototype = prototype(elementType = "Hits", unlistData = new("Hits")), contains="CompressedList") ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Accessors ### setMethod("from", "CompressedHitsList", function(x) from(x@unlistData)) setMethod("to", "CompressedHitsList", function(x) to(x@unlistData)) setMethod("nLnode", "CompressedHitsList", function(x) nLnode(x@unlistData)) setMethod("nRnode", "CompressedHitsList", function(x) nRnode(x@unlistData)) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructor ### CompressedHitsList <- function(hits, query) { if (!(is(query, "CompressedIRangesList"))) stop("'query' must be a 'CompressedIRangesList' object") if (!is(hits, "Hits")) stop("'hits' must be a 'Hits' object") qspace <- space(query) hspace <- as.integer(qspace[queryHits(hits)]) partitioning <- PartitioningByEnd(hspace, names=names(query@partitioning), NG=length(names(query@partitioning))) newCompressedList0("CompressedHitsList", unlistData=hits, partitioning=partitioning) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion ### ## return as.matrix as on Hits, with indices adjusted setMethod("as.matrix", "CompressedHitsList", function(x) { cbind(queryHits=queryHits(x), subjectHits=subjectHits(x)) }) IRanges/R/CompressedList-class.R0000644000175400017540000004170213175713360017531 0ustar00biocbuildbiocbuild### ========================================================================= ### CompressedList objects ### ------------------------------------------------------------------------- setClass("CompressedList", contains="List", representation( "VIRTUAL", unlistData="ANY", partitioning="PartitioningByEnd" ) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Getters and setters ### setMethod("length", "CompressedList", function(x) length(x@partitioning)) setMethod("names", "CompressedList", function(x) names(x@partitioning)) setMethod("elementNROWS", "CompressedList", function(x) { ans <- elementNROWS(x@partitioning) names(ans) <- names(x) ans } ) setReplaceMethod("names", "CompressedList", function(x, value) { names(x@partitioning) <- value x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructor ### ### Use ### IRanges:::newCompressedList0(getClass("MyClass"), ### unlistData, partitioning) ### when calling this from another package. ### ### Hackery to avoid R CMD check warning for using an internal... islistfactor <- function(x) { eval(as.call(list(quote(.Internal), substitute(islistfactor(x, FALSE), list(x=x))))) } compress_listData <- function(x, elementType = NULL) { if (length(x) > 0L) { if (islistfactor(x)) { x <- unlist(x, recursive=FALSE, use.names=FALSE) } else if (length(dim(x[[1L]])) < 2L) { x <- do.call(c, unname(x)) } else { x <- do.call(rbind, unname(x)) } } else { x <- vector() } x } .reconcileMetadatacols <- function(x) { x_mcols <- mcols(x) if (is(x_mcols, "DataFrame") && nrow(x_mcols) == 0L && ncol(x_mcols) == 0L) { x_mcols <- S4Vectors:::make_zero_col_DataFrame(length(x)) mcols(x) <- x_mcols } x } ### Low-level. NOT exported. newCompressedList0 <- function(Class, unlistData, partitioning) { ans <- new2(Class, unlistData=unlistData, partitioning=partitioning, check=FALSE) .reconcileMetadatacols(ans) } ### Low-level. NOT exported. ### Stuff to put in elementMetadata slot can be passed either with ### new_CompressedList_from_list(..., elementMetadata=somestuff) ### or with ### new_CompressedList_from_list(..., mcols=somestuff) ### The latter is the new recommended form. new_CompressedList_from_list <- function(Class, x, ..., mcols) { if (!extends(Class, "CompressedList")) stop("class ", Class, " must extend CompressedList") if (!is.list(x)) stop("'x' must be a list") ans_elementType <- elementType(new(Class)) if (!all(sapply(x, function(xi) extends(class(xi), ans_elementType)))) stop("all elements in 'listData' must be ", ans_elementType, " objects") ans_partitioning <- PartitioningByEnd(x) if (length(x) == 0L) { if (missing(mcols)) return(new2(Class, partitioning=ans_partitioning, ..., check=FALSE)) return(new2(Class, partitioning=ans_partitioning, ..., elementMetadata=mcols, check=FALSE)) } ans_unlistData <- compress_listData(x, ans_elementType) if (missing(mcols)) { ans <- new2(Class, unlistData=ans_unlistData, partitioning=ans_partitioning, ..., check=FALSE) } else { ans <- new2(Class, unlistData=ans_unlistData, partitioning=ans_partitioning, ..., elementMetadata=mcols, check=FALSE) } .reconcileMetadatacols(ans) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Validity. ### .valid.CompressedList.partitioning <- function(x) { dataLength <- NROW(x@unlistData) if (nobj(x@partitioning) != dataLength) "improper partitioning" else NULL } .valid.CompressedList.unlistData <- function(x) { ## FIXME: workaround to support CompressedNormalIRangesList ## elementTypeX <- elementType(x) elementTypeX <- elementType(new(class(x))) if (!extends(class(x@unlistData), elementTypeX)) paste("the 'unlistData' slot must be of class", elementTypeX) else NULL } .valid.CompressedList <- function(x) { c(.valid.CompressedList.unlistData(x), .valid.CompressedList.partitioning(x)) } setValidity2("CompressedList", .valid.CompressedList) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion. ### setMethod("unlist", "CompressedList", function(x, recursive=TRUE, use.names=TRUE) { if (!isTRUEorFALSE(use.names)) stop("'use.names' must be TRUE or FALSE") unlisted_x <- x@unlistData if (use.names) unlisted_x <- S4Vectors:::set_unlisted_names(unlisted_x, x) unlisted_x } ) setAs("ANY", "CompressedList", function(from) coerceToCompressedList(from)) coerceToCompressedList <- function(from, element.type = NULL, ...) { if (is(from, S4Vectors:::listClassName("Compressed", element.type))) return(from) if (is.list(from) || (is(from, "List") && !is(from, "DataFrame"))) { if (is.list(from)) { v <- compress_listData(from, element.type) } else { v <- unlist(from, use.names = FALSE) } part <- PartitioningByEnd(from) } else { v <- from part <- PartitioningByEnd(seq_len(NROW(from))) } if (!is.null(element.type)) { v <- S4Vectors:::coercerToClass(element.type)(v, ...) } to <- relist(v, part) names(to) <- names(from) to } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Subsetting. ### setMethod("extractROWS", "CompressedList", function(x, i) { i <- normalizeSingleBracketSubscript(i, x, as.NSBS=TRUE) ans_eltNROWS <- extractROWS(width(x@partitioning), i) ans_breakpoints <- suppressWarnings(cumsum(ans_eltNROWS)) nbreakpoints <- length(ans_breakpoints) if (nbreakpoints != 0L && is.na(ans_breakpoints[[nbreakpoints]])) stop(wmsg("Subsetting operation on ", class(x), " object 'x' ", "produces a result that is too big to be ", "represented as a CompressedList object. ", "Please try to coerce 'x' to a SimpleList object ", "first (with 'as(x, \"SimpleList\")').")) idx_on_unlisted_x <- IRanges(end=extractROWS(end(x@partitioning), i), width=ans_eltNROWS) ans_unlistData <- extractROWS(x@unlistData, idx_on_unlisted_x) ans_partitioning <- new2("PartitioningByEnd", end=ans_breakpoints, NAMES=extractROWS(names(x), i), check=FALSE) ans_elementMetadata <- extractROWS(x@elementMetadata, i) initialize(x, unlistData=ans_unlistData, partitioning=ans_partitioning, elementMetadata=ans_elementMetadata) } ) setMethod("getListElement", "CompressedList", function(x, i, exact=TRUE) { i <- normalizeDoubleBracketSubscript(i, x, exact=exact, error.if.nomatch=FALSE) if (is.na(i)) return(NULL) unlisted_x <- unlist(x, use.names=FALSE) x_partitioning <- PartitioningByEnd(x) window_start <- start(x_partitioning)[i] window_end <- end(x_partitioning)[i] S4Vectors:::Vector_window(unlisted_x, start=window_start, end=window_end) } ) setReplaceMethod("[[", "CompressedList", function(x, i, j,..., value) { nameValue <- if (is.character(i)) i else "" i <- S4Vectors:::normargSubset2_iOnly(x, i, j, ..., .conditionPrefix="[[<-,CompressedList-method: ") if (is.null(value)) { if (i <= length(x)) # if name did not exist, could be +1 x <- x[-i] } else { value <- try(as(value, elementType(x)), silent = TRUE) if (inherits(value, "try-error")) stop("cannot coerce 'value' to a ", elementType(x), " instance") listData <- as.list(x, use.names = FALSE) listData[[i]] <- value widths <- elementNROWS(x) names(widths) <- NULL widths[i] <- NROW(value) if ((i == length(x) + 1L) && (!is.null(names(x)) || nchar(nameValue) > 0)) { NAMES <- names(x) if (is.null(NAMES)) NAMES <- rep.int("", length(x)) NAMES[i] <- nameValue } else { NAMES <- names(x) } slot(x, "unlistData", check=FALSE) <- compress_listData(listData, elementType(x)) slot(x, "partitioning", check=FALSE) <- new2("PartitioningByEnd", end = cumsum(widths), NAMES = NAMES, check=FALSE) if (i > length(x)) x <- S4Vectors:::rbindRowOfNAsToMetadatacols(x) x } }) setReplaceMethod("$", "CompressedList", function(x, name, value) { x[[name]] <- value x }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Combining. ### .bindROWS <- function(...) { args <- list(...) if (length(dim(args[[1L]])) >= 2L) return(rbind(...)) if (!is.factor(args[[1L]])) return(c(...)) ans_levels <- unique(unlist(lapply(args, levels))) x <- unlist(lapply(args, as.character)) factor(x, levels=ans_levels) } ### Not exported. combine_CompressedList_objects <- function(Class, objects, use.names=TRUE, ignore.mcols=FALSE) { if (!isSingleString(Class)) stop("'Class' must be a single character string") if (!extends(Class, "CompressedList")) stop("'Class' must be the name of a class that extends CompressedList") if (!is.list(objects)) stop("'objects' must be a list") if (!isTRUEorFALSE(use.names)) stop("'use.names' must be TRUE or FALSE") ### TODO: Support 'use.names=TRUE'. if (use.names) stop("'use.names=TRUE' is not supported yet") if (!isTRUEorFALSE(ignore.mcols)) stop("'ignore.mcols' must be TRUE or FALSE") if (length(objects) != 0L) { ## TODO: Implement (in C) fast 'elementIsNull(objects)' in S4Vectors ## that does 'sapply(objects, is.null, USE.NAMES=FALSE)', and use it ## here. null_idx <- which(sapply(objects, is.null, USE.NAMES=FALSE)) if (length(null_idx) != 0L) objects <- objects[-null_idx] } if (length(objects) == 0L) return(new(Class)) ## TODO: Implement (in C) fast 'elementIs(objects, class)' in S4Vectors ## that does 'sapply(objects, is, class, USE.NAMES=FALSE)', and use it ## here. 'elementIs(objects, "NULL")' should work and be equivalent to ## 'elementIsNull(objects)'. if (!all(sapply(objects, is, "CompressedList", USE.NAMES=FALSE))) stop("the objects to combine must be CompressedList objects (or NULLs)") objects_names <- names(objects) names(objects) <- NULL # so lapply(objects, ...) below returns an # unnamed list ## Combine "unlistData" slots. unlistData_slots <- lapply(objects, function(x) x@unlistData) ans_unlistData <- do.call(.bindROWS, unlistData_slots) ## Combine "partitioning" slots. ans_breakpoints <- cumsum(unlist(lapply(objects, elementNROWS))) ans_partitioning <- PartitioningByEnd(ans_breakpoints) ans <- newCompressedList0(Class, ans_unlistData, ans_partitioning) ## Combine "mcols" slots. if (!ignore.mcols) { ans_mcols <- do.call(S4Vectors:::rbind_mcols, objects) rownames(ans_mcols) <- NULL mcols(ans) <- ans_mcols } ans } setMethod("c", "CompressedList", function (x, ..., ignore.mcols=FALSE, recursive=FALSE) { if (!identical(recursive, FALSE)) stop("\"c\" method for CompressedList objects ", "does not support the 'recursive' argument") if (missing(x)) { objects <- list(...) x <- objects[[1L]] } else { objects <- list(x, ...) } combine_CompressedList_objects(class(x), objects, use.names=FALSE, ignore.mcols=ignore.mcols) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Looping. ### ### Cannot really avoid the cost of extracting X[[i]] for all valid i but tries ### to minimize this cost by using 2 tricks: ### 1. Avoids looping on values of i for which X[[i]] has length 0. Instead ### FUN(X[[i]], ...) is computed only once (because it's the same for all ### these values of i) and placed at the corresponding positions in the ### returned list. ### 2. Turn off object validation during the main loop. Note that there is no ### reason to restrict this trick to CompressedList objects and the same ### trick could be used in the "lapply" method for List objects. ### Does NOT propagate the names. lapply_CompressedList <- function(X, FUN, ...) { FUN <- match.fun(FUN) ans <- vector(mode="list", length=length(X)) unlisted_X <- unlist(X, use.names=FALSE) X_partitioning <- PartitioningByEnd(X) X_elt_width <- width(X_partitioning) empty_idx <- which(X_elt_width == 0L) if (length(empty_idx) != 0L) ans[empty_idx] <- list(FUN(extractROWS(unlisted_X, integer(0)), ...)) non_empty_idx <- which(X_elt_width != 0L) if (length(non_empty_idx) == 0L) return(ans) X_elt_start <- start(X_partitioning) X_elt_end <- end(X_partitioning) old_validity_status <- S4Vectors:::disableValidity() S4Vectors:::disableValidity(TRUE) on.exit(S4Vectors:::disableValidity(old_validity_status)) ans[non_empty_idx] <- lapply(non_empty_idx, function(i) FUN(extractROWS(unlisted_X, IRanges(X_elt_start[i], X_elt_end[i])), ...)) S4Vectors:::disableValidity(old_validity_status) for (i in non_empty_idx) { obj <- ans[[i]] if (isS4(obj) && !isTRUE(validObject(obj, test=TRUE))) stop("invalid output element of class \"", class(obj), "\"") } ans } setMethod("lapply", "CompressedList", function(X, FUN, ...) { ans <- lapply_CompressedList(X, FUN, ...) names(ans) <- names(X) ans } ) .updateCompressedList <- function(X, listData) { elementTypeX <- elementType(X) if (!all(sapply(listData, function(Xi) extends(class(Xi), elementTypeX)))) stop("'FUN' must return elements of class ", elementTypeX) if (length(listData) == 0) { end <- integer(0) } else { end <- cumsum(unlist(lapply(listData, NROW), use.names = FALSE)) } initialize(X, unlistData = compress_listData(listData, elementTypeX), partitioning = new2("PartitioningByEnd", end = end, NAMES = names(X), check=FALSE)) } setMethod("endoapply", "CompressedList", function(X, FUN, ...) { .updateCompressedList(X, lapply_CompressedList(X, FUN, ...)) }) setMethod("mendoapply", "CompressedList", function(FUN, ..., MoreArgs = NULL) { .updateCompressedList(list(...)[[1L]], mapply(FUN = match.fun(FUN), ..., MoreArgs = MoreArgs, SIMPLIFY = FALSE)) }) setMethod("revElements", "CompressedList", function(x, i) { i <- normalizeSingleBracketSubscript(i, x, as.NSBS=TRUE) if (length(i) == 0L) return(x) x_eltNROWS <- elementNROWS(x) offset <- cumsum(c(0L, x_eltNROWS[-length(x_eltNROWS)])) rev <- logical(length(x)) rev <- replaceROWS(rev, i, TRUE) ii <- S4Vectors:::fancy_mseq(x_eltNROWS, offset=offset, rev=rev) x@unlistData <- extractROWS(x@unlistData, ii) x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### classNameForDisplay() ### setMethod("classNameForDisplay", "CompressedList", function(x) sub("^Compressed", "", class(x)) ) IRanges/R/CompressedList-comparison.R0000644000175400017540000000543713175713360020603 0ustar00biocbuildbiocbuild### ========================================================================= ### Comparing and ordering CompressedList objects ### ------------------------------------------------------------------------- ### ### Overwrite methods defined in S4Vectors/R/List-comparison.R for List ### objects with optimized methods for CompressedList objects. ### ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Element-wise (aka "parallel") comparison of 2 List objects. ### ### TODO: Add optimized "==" and "<=" methods for CompressedList objects. ### setMethod("!", "CompressedList", function(x) relist(!unlist(x, use.names=FALSE), x) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### match() ### ### The first match method catches CompressedList,list; 'table' is atomic setMethod("match", c("CompressedList", "vector"), function(x, table, nomatch = NA_integer_, incomparables = NULL, ...) { m <- match(x@unlistData, table, nomatch=nomatch, incomparables=incomparables, ...) relist(m, x) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### duplicated() & unique() ### .duplicated.CompressedList <- function(x, incomparables=FALSE, fromLast=FALSE, nmax=NA) { if (!identical(incomparables, FALSE)) stop("\"duplicated\" method for CompressedList objects ", "does not support the 'incomparables' argument") x_unlistData <- x@unlistData sm <- match(x_unlistData, x_unlistData) # doesn't work on an Rle x_group <- rep.int(seq_along(x), elementNROWS(x)) ans_unlistData <- duplicatedIntegerPairs(x_group, sm, fromLast=fromLast) relist(ans_unlistData, x) } setMethod("duplicated", "CompressedList", .duplicated.CompressedList) .unique.CompressedList <- function(x, ...) { is_dup <- duplicated(x, ...) x_unlistData <- x@unlistData keep_idx <- which(!is_dup@unlistData) ans_unlistData <- x_unlistData[keep_idx] x_group <- rep.int(seq_along(x), elementNROWS(x)) ans_group <- x_group[keep_idx] ans_partitioning <- PartitioningByEnd(ans_group, NG=length(x), names=names(x)) relist(ans_unlistData, ans_partitioning) } setMethod("unique", "CompressedList", .unique.CompressedList) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### %in% ### ### The "%in%" method for Vector objects calls is.na() internally. setMethod("is.na", "CompressedList", function(x) relist(is.na(unlist(x, use.names=FALSE)), x) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### order() and related methods. ### ### TODO: Add optimized methods for CompressedList objects. ### IRanges/R/DataFrame-utils.R0000644000175400017540000000111113175713360016436 0ustar00biocbuildbiocbuild### ========================================================================= ### DataFrame utilities ### ------------------------------------------------------------------------- ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Splitting. ### setMethod("relistToClass", "data.frame", function(x) "CompressedSplitDataFrameList" ) setMethod("relistToClass", "DataFrame", function(x) "CompressedSplitDataFrameList" ) setMethod("mstack", "DataFrame", function(..., .index.var = "name") { stack(DataFrameList(...), index.var = .index.var) }) IRanges/R/DataFrameList-class.R0000644000175400017540000003644613175713360017262 0ustar00biocbuildbiocbuild### ========================================================================= ### DataFrameList objects ### ------------------------------------------------------------------------- setClass("DataFrameList", representation("VIRTUAL"), prototype = prototype(elementType = "DataFrame"), contains = "List") setClass("SimpleDataFrameList", contains = c("DataFrameList", "SimpleList")) setClass("CompressedDataFrameList", prototype = prototype(unlistData = new("DataFrame")), contains = c("DataFrameList", "CompressedList")) setClass("SplitDataFrameList", representation("VIRTUAL"), contains = "DataFrameList") setClass("SimpleSplitDataFrameList", contains = c("SplitDataFrameList", "SimpleDataFrameList")) setClass("CompressedSplitDataFrameList", contains = c("SplitDataFrameList", "CompressedDataFrameList")) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Accessor methods. ### setMethod("nrow", "DataFrameList", function(x) { if (length(x) == 0L) 0L else elementNROWS(x) }) setMethod("ncol", "DataFrameList", function(x) { if (length(x) == 0L) 0L else unlist(lapply(x, ncol)) }) setMethod("ncol", "SimpleSplitDataFrameList", function(x) { if (length(x) == 0L) 0L else structure(rep.int(ncol(x[[1L]]), length(x)), names = names(x)) }) setMethod("ncol", "CompressedSplitDataFrameList", function(x) { if (length(x) == 0L) 0L else structure(rep.int(ncol(x@unlistData), length(x)), names = names(x)) }) setMethod("dim", "DataFrameList", function(x) { cbind(nrow(x), ncol(x)) }) setMethod("rownames", "DataFrameList", function(x, do.NULL = TRUE, prefix = "row") { CharacterList(lapply(x, rownames, do.NULL = do.NULL, prefix = prefix)) }) setMethod("colnames", "DataFrameList", function(x, do.NULL = TRUE, prefix = "col") { CharacterList(lapply(x, colnames, do.NULL = do.NULL, prefix = prefix)) }) setMethod("colnames", "SplitDataFrameList", function(x, do.NULL = TRUE, prefix = "col") { if (length(x)) { nms <- colnames(x[[1]], do.NULL = do.NULL, prefix = prefix) rep(CharacterList(nms), length(x)) } else NULL }) setMethod("colnames", "CompressedSplitDataFrameList", function(x, do.NULL = TRUE, prefix = "col") { if (length(x)) { nms <- colnames(x@unlistData, do.NULL = do.NULL, prefix = prefix) rep(CharacterList(nms), length(x)) } else NULL }) setMethod("dimnames", "DataFrameList", function(x) { list(rownames(x), colnames(x)) }) setReplaceMethod("rownames", "SimpleDataFrameList", function(x, value) { if (is.null(value) || is(value, "CharacterList")) { if (is.null(value)) value <- list(NULL) else if (length(x) != length(value)) stop("replacement value must be the same length as x") x@listData <- mapply(function(y, rn) {rownames(y) <- rn; y}, x@listData, value, SIMPLIFY=FALSE) } else { stop("replacement value must be NULL or a CharacterList") } x }) setReplaceMethod("rownames", "CompressedSplitDataFrameList", function(x, value) { if (is.null(value)) { rownames(x@unlistData) <- NULL } else if (is(value, "CharacterList")){ if (length(x) != length(value)) stop("replacement value must be the same length as x") rownames(x@unlistData) <- unlist(value, use.names=FALSE) } else { stop("replacement value must either be NULL or a CharacterList") } x }) setReplaceMethod("colnames", "SimpleDataFrameList", function(x, value) { if (is.null(value)) { x@listData <- lapply(x@listData, function(y) {colnames(y) <- NULL; y}) } else if (is.character(value)) { for (i in seq_len(length(x))) colnames(x@listData[[i]]) <- value } else if (is(value, "CharacterList")){ if (length(x) != length(value)) stop("replacement value must be the same length as x") for (i in seq_len(length(x))) colnames(x@listData[[i]]) <- value[[i]] } else { stop("replacement value must either be NULL or a CharacterList") } x }) setReplaceMethod("colnames", "CompressedSplitDataFrameList", function(x, value) { if (is.null(value)) { colnames(x@unlistData) <- NULL } else if (is.character(value)) { colnames(x@unlistData) <- value } else if (is(value, "CharacterList")){ if (length(x) != length(value)) stop("replacement value must be the same length as x") if (length(x) > 0) colnames(x@unlistData) <- unlist(value[[1L]]) } else { stop("replacement value must either be NULL or a CharacterList") } x }) setReplaceMethod("dimnames", "DataFrameList", function(x, value) { if (!is.list(value)) stop("replacement value must be a list") rownames(x) <- value[[1L]] colnames(x) <- value[[2L]] x }) setGeneric("columnMetadata", function(x, ...) standardGeneric("columnMetadata")) setMethod("columnMetadata", "SimpleSplitDataFrameList", function(x) { if (length(x)) mcols(x[[1]]) else NULL }) setMethod("columnMetadata", "CompressedSplitDataFrameList", function(x) { mcols(x@unlistData) }) setGeneric("columnMetadata<-", function(x, ..., value) standardGeneric("columnMetadata<-")) setReplaceMethod("columnMetadata", "SimpleSplitDataFrameList", function(x, value) { x@listData <- lapply(x@listData, function(xi) { mcols(xi) <- value xi }) x }) setReplaceMethod("columnMetadata", "CompressedSplitDataFrameList", function(x, value) { mcols(x@unlistData) <- value x }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Validity. ### .valid.SplitDataFrameList <- function(x) { if (length(x) && !is(x, "CompressedList")) { firstNames <- colnames(x[[1L]]) l <- as.list(x, use.names = FALSE) if (!all(sapply(l, function(df) identical(firstNames, colnames(df))))) return("column counts or names differ across elements") firstMetaData <- mcols(x[[1L]]) # could be NULL if (!all(sapply(l, function(df) { identical(firstMetaData, mcols(df)) }))) return("metadata columns must be identical across elements") } NULL } setValidity2("SplitDataFrameList", .valid.SplitDataFrameList) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructor. ### DataFrameList <- function(...) { listData <- list(...) if (length(listData) == 1 && is.list(listData[[1L]]) && !is.data.frame(listData[[1L]])) listData <- listData[[1L]] if (length(listData) > 0 && !is(listData[[1L]], "DataFrame")) listData <- lapply(listData, as, "DataFrame") S4Vectors:::new_SimpleList_from_list("SimpleDataFrameList", listData) } SplitDataFrameList <- function(..., compress = TRUE, cbindArgs = FALSE) { if (!isTRUEorFALSE(compress)) stop("'compress' must be TRUE or FALSE") listData <- list(...) if (length(listData) == 1 && (is.list(listData[[1L]]) || is(listData[[1L]], "List")) && !(is.data.frame(listData[[1L]]) || is(listData[[1L]], "DataFrame"))) listData <- listData[[1L]] if (cbindArgs) { if (is.null(names(listData))) names(listData) <- paste("X", seq_len(length(listData)), sep = "") listData <- do.call(Map, c(list(DataFrame), listData)) } as(listData, if (compress) "CompressedSplitDataFrameList" else "SimpleSplitDataFrameList") } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Subsetting. ### setMethod("[", "SimpleSplitDataFrameList", function(x, i, j, ..., drop=TRUE) { if (!missing(j)) x@listData <- lapply(x@listData, function(y) y[,j,drop=FALSE]) if (!missing(i)) x <- callNextMethod(x, i) if (((nargs() - !missing(drop)) > 2) && (length(x@listData) > 0) && (ncol(x@listData[[1L]]) == 1) && (missing(drop) || drop)) { x <- as(lapply(x@listData, "[[", 1), "List") } x }) setMethod("[", "CompressedSplitDataFrameList", function(x, i, j, ..., drop=TRUE) { if (!missing(j)) x@unlistData <- x@unlistData[, j, drop=FALSE] if (!missing(i)) x <- callNextMethod(x, i) if (((nargs() - !missing(drop)) > 2) && (ncol(x@unlistData) == 1) && (missing(drop) || drop)) { x <- relist(x@unlistData[[1L]], x) } x }) setMethod("normalizeSingleBracketReplacementValue", "SplitDataFrameList", function(value, x) { value <- callNextMethod() # call default method rownames(value) <- NULL if (length(x) != 0L && ncol(x)[[1L]] == ncol(value)[[1L]]) colnames(value)[[1L]] <- colnames(x)[[1L]] value } ) setReplaceMethod("[", "SplitDataFrameList", function(x, i, j,..., value) { if (length(list(...)) > 0L) stop("invalid replacement") value <- normalizeSingleBracketReplacementValue(value, x) if (missing(j)) { if (missing(i)) ans <- callNextMethod(x=x, value=value) else ans <- callNextMethod(x=x, i=i, value=value) return(ans) } colind <- setNames(seq_along(commonColnames(x)), commonColnames(x)) if (missing(i) && is.character(j)) { colnames(value) <- j } j <- normalizeSingleBracketSubscript(j, colind, allow.append=missing(i)) if (missing(i)) { y <- value } else { y <- x[, j, drop=FALSE] if (is.list(i) || (is(i, "List") && !is(i, "Ranges"))) { y <- S4Vectors:::lsubset_List_by_List(y, i, value) } else { y[i] <- value } } if (length(y) < length(x)) { y <- rep(y, length.out=length(x)) } if (is(x, "CompressedList")) { x_eltNROWS <- elementNROWS(x) y_eltNROWS <- elementNROWS(y) if (any(x_eltNROWS != y_eltNROWS)) { indices <- IRanges(start(y@partitioning), width=y_eltNROWS) indices <- rep(indices, x_eltNROWS / y_eltNROWS) if (sum(width(indices)) != sum(x_eltNROWS)) { stop("some element lengths of 'x' are not multiples of the ", "corresponding element lengths of 'value'") } y@unlistData <- y@unlistData[indices, , drop=FALSE] } x@unlistData[, j] <- y@unlistData } else if (is(x, "SimpleList")) { indices <- structure(seq_len(length(x)), names = names(x)) x@listData <- lapply(indices, function(k) { z <- x@listData[[k]] z[j] <- y[[k]] z }) } else { stop(class(x), " objects not supported") } x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion ### ## Casting DataFrameList -> DataFrame implies cast to SplitDataFrameList setAs("DataFrameList", "DataFrame", function(from) { as(as(from, "SplitDataFrameList"), "DataFrame") }) setGeneric("commonColnames", function(x) standardGeneric("commonColnames")) setMethod("commonColnames", "CompressedSplitDataFrameList", function(x) colnames(unlist(x, use.names=FALSE))) setMethod("commonColnames", "SplitDataFrameList", function(x) colnames(head(x, 1L))[[1L]]) setAs("SplitDataFrameList", "DataFrame", function(from) { cols <- sapply(commonColnames(from), function(j) from[,j], simplify=FALSE) DataFrame(cols, check.names=FALSE) } ) setAs("ANY", "SplitDataFrameList", function(from) as(from, "CompressedSplitDataFrameList")) setAs("list", "SplitDataFrameList", function(from) as(from, "SimpleSplitDataFrameList")) setAs("SimpleList", "SplitDataFrameList", function(from) as(from, "SimpleSplitDataFrameList")) setAs("DataFrame", "SplitDataFrameList", function(from) as(from, "CompressedSplitDataFrameList")) setAs("ANY", "SimpleSplitDataFrameList", function(from) { new("SimpleSplitDataFrameList", as(from, "SimpleDataFrameList")) }) setAs("ANY", "CompressedSplitDataFrameList", function(from) { coerceToCompressedList(from, "DataFrame") }) setListCoercions("DataFrame") ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Show ### setMethod("show", "SplitDataFrameList", function(object) { k <- length(object) cumsumN <- cumsum(elementNROWS(object)) N <- tail(cumsumN, 1) cat(classNameForDisplay(object), " of length ", k, "\n", sep = "") if (k == 0L) { cat("<0 elements>\n") } else if ((k == 1L) || (N <= 20L)) { show(as.list(object)) } else { sketch <- function(x) c(head(x, 3), "...", tail(x, 3)) if (k >= 3 && cumsumN[3L] <= 20) showK <- 3 else if (k >= 2 && cumsumN[2L] <= 20) showK <- 2 else showK <- 1 diffK <- k - showK show(as.list(head(object, showK))) if (diffK > 0) cat("...\n<", k - showK, ifelse(diffK == 1, " more element>\n", " more elements>\n"), sep="") } }) IRanges/R/DataFrameList-utils.R0000644000175400017540000000344413175713360017305 0ustar00biocbuildbiocbuild### ========================================================================= ### DataFrameList utilities ### ------------------------------------------------------------------------- ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Combining. ### setMethod("cbind", "DataFrameList", function(..., deparse.level=1) mendoapply(cbind, ...)) setMethod("rbind", "DataFrameList", function(..., deparse.level=1) mendoapply(rbind, ...)) setMethod("stack", "DataFrameList", function(x, index.var = "name") { DataFrame(.stack.ind(x, index.var), unlist(x, use.names=FALSE), row.names = unlist(lapply(x, rownames))) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Transforming. ### setClass("SDFLWrapperForTransform", representation(delegate = "SplitDataFrameList"), contains="Vector") setMethod("colnames", "SDFLWrapperForTransform", function(x) { commonColnames(x@delegate) }) setMethod("[[", "SDFLWrapperForTransform", function (x, i, j, ...) { x@delegate[,i] }) setReplaceMethod("[[", "SDFLWrapperForTransform", function(x, i, j, ..., value) { x@delegate[,i] <- value x }) setMethod(S4Vectors:::`column<-`, "SDFLWrapperForTransform", function(x, name, value) { x[[name]] <- value x }) setMethod("as.env", "SDFLWrapperForTransform", function(x, ...) { env <- selectMethod(as.env, "DataTable")(x, ...) S4Vectors:::addSelfRef(x@delegate, env) }) transform.SplitDataFrameList <- function(`_data`, ...) { illConceivedWrapper <- new("SDFLWrapperForTransform", delegate=`_data`) S4Vectors:::transform.DataTable(illConceivedWrapper, ...)@delegate } setMethod("transform", "SplitDataFrameList", transform.SplitDataFrameList) IRanges/R/Grouping-class.R0000644000175400017540000007311713175713360016370 0ustar00biocbuildbiocbuild### ========================================================================= ### Grouping objects ### ------------------------------------------------------------------------- ### ### We call "grouping" an arbitrary mapping from a collection of NO objects ### to a collection of NG groups, or, more formally, a bipartite graph ### between integer sets [1, NO] and [1, NG]. Objects mapped to a given group ### are said to belong to, or to be assigned to, or to be in that group. ### Additionally, the objects in each group are ordered. So for example the ### 2 following groupings are considered different: ### ### Grouping 1: NG = 3, NO = 5 ### group objects ### 1 : 4, 2 ### 2 : ### 3 : 4 ### ### Grouping 2: NG = 3, NO = 5 ### group objects ### 1 : 2, 4 ### 2 : ### 3 : 4 ### ### There are no restriction on the mapping e.g. any object can be mapped ### to 0, 1, or more groups, and can be mapped twice to the same group. Also ### some or all the groups can be empty. ### ### The Grouping class is a virtual class that formalizes the most general ### kind of grouping. More specific groupings (e.g. many-to-one mappings) ### are formalized via specific Grouping subclasses. setClass("Grouping", contains="IntegerList", representation("VIRTUAL")) setGeneric("nobj", function(x) standardGeneric("nobj")) setGeneric("grouplengths", signature="x", function(x, i=NULL) standardGeneric("grouplengths") ) .subset_by_integer <- function(x, i=NULL) { if (is.null(i)) return(x) if (!is.numeric(i)) stop(wmsg("subscript must be NULL or an integer vector")) if (!is.integer(i)) i <- as.integer(i) x_len <- length(x) if (S4Vectors:::anyMissingOrOutside(i, -x_len, x_len)) stop(wmsg("subscript contains NAs or out of bounds indices")) x[i] } setMethod("grouplengths", "Grouping", function(x, i=NULL) { x_grouplens <- elementNROWS(x) .subset_by_integer(x_grouplens, i) } ) setMethod("show", "Grouping", function(object) { NG <- length(object) NO <- nobj(object) cat(class(object), " with ", NG, ifelse(NG == 1, " group ", " groups "), "and ", NO, ifelse(NO == 1, " object\n", " objects\n"), sep="") if (NG == 0L) return(invisible(NULL)) empty_groups <- which(grouplengths(object) == 0L) cat("Nb of empty groups: ", length(empty_groups), " (", 100.00 * length(empty_groups) / NG, "%)\n", sep="") } ) ### ------------------------------------------------------------------------- ### ManyToOneGrouping objects ### ------------------------- ### A ManyToOneGrouping object represents a grouping where every object in ### the collection belongs to exactly one group. setClass("ManyToOneGrouping", contains="Grouping", representation("VIRTUAL")) setMethod("nobj", "ManyToOneGrouping", function(x) sum(grouplengths(x))) setGeneric("members", signature="x", function(x, i) standardGeneric("members") ) setMethod("members", "ManyToOneGrouping", function(x, i) { if (!is.numeric(i)) stop(wmsg("subscript 'i' must be a vector of integers")) if (!is.integer(i)) i <- as.integer(i) sort(unlist(sapply(i, function(ii) x[[ii]]))) } ) setGeneric("vmembers", signature="x", function(x, L) standardGeneric("vmembers") ) setMethod("vmembers", "ManyToOneGrouping", function(x, L) { if (!is.list(L)) stop(wmsg("'L' must be a list of integer vectors")) lapply(L, function(i) members(x, i)) } ) setGeneric("togroup", signature="x", function(x, j=NULL) standardGeneric("togroup") ) ### Works on any ManyToOneGrouping object 'x' for which unlist() and ### elementNROWS() work. setMethod("togroup", "ManyToOneGrouping", function(x, j=NULL) { x_togroup <- unlist(x, use.names=FALSE) x_eltNROWS <- elementNROWS(x) x_togroup[x_togroup] <- rep.int(seq_along(x_eltNROWS), x_eltNROWS) .subset_by_integer(x_togroup, j) } ) setGeneric("togrouplength", signature="x", function(x, j=NULL) standardGeneric("togrouplength") ) setMethod("togrouplength", "ManyToOneGrouping", function(x, j=NULL) grouplengths(x, togroup(x, j)) ) ### ------------------------------------------------------------------------- ### ManyToManyGrouping objects ### ------------------------- ### A ManyToManyGrouping object represents a grouping where objects ### can map to any number of groups. setClass("ManyToManyGrouping", contains="Grouping", representation("VIRTUAL")) ### ------------------------------------------------------------------------- ### BiIndexGrouping objects ### ----------------------- #setClass("BiIndexGrouping", # contains="ManyToOneGrouping", # representation( # group2object="list", # object2group="integer" # ) #) #setMethod("length", "BiIndexGrouping", function(x) length(x@group2object)) #setMethod("nobj", "BiIndexGrouping", function(x) length(x@object2group)) ### ------------------------------------------------------------------------- ### H2LGrouping and Dups objects ### ---------------------------- ### ### High-to-Low Index ManyToOneGrouping objects. ### setClass("H2LGrouping", contains="ManyToOneGrouping", representation( high2low="integer", low2high="list" ) ) ### For storing the grouping implicitly defined by the "duplicated" ### relationship between elements of an arbitrary vector. setClass("Dups", contains="H2LGrouping") ### Two additional accessors for H2LGrouping objects. setGeneric("high2low", function(x) standardGeneric("high2low")) setMethod("high2low", "H2LGrouping", function(x) x@high2low) setGeneric("low2high", function(x) standardGeneric("low2high")) setMethod("low2high", "H2LGrouping", function(x) x@low2high) ### 'length(x)' and 'nobj(x)' are the same. setMethod("length", "H2LGrouping", function(x) length(x@low2high)) setMethod("nobj", "H2LGrouping", function(x) length(x@high2low)) setMethod("getListElement", "H2LGrouping", function(x, i, exact=TRUE) { i <- normalizeDoubleBracketSubscript(i, x, exact=exact, error.if.nomatch=TRUE) if (is.na(x@high2low[i])) c(i, x@low2high[[i]]) else integer() } ) ### Should be more efficient than the default method for ManyToOneGrouping ### objects. setMethod("grouplengths", "H2LGrouping", function(x, i=NULL) { x_grouplens <- elementNROWS(x@low2high) + 1L x_grouplens[!is.na(x@high2low)] <- 0L .subset_by_integer(x_grouplens, i) } ) setMethod("members", "H2LGrouping", function(x, i) { if (!is.numeric(i)) stop(wmsg("subscript 'i' must be a vector of integers")) if (!is.integer(i)) i <- as.integer(i) ## NAs and "subscript out of bounds" are checked at the C level .Call2("H2LGrouping_members", x, i, PACKAGE="IRanges") } ) setMethod("vmembers", "H2LGrouping", function(x, L) { if (!is.list(L)) stop(wmsg("'L' must be a list of integer vectors")) .Call2("H2LGrouping_vmembers", x, L, PACKAGE="IRanges") } ) setMethod("togroup", "H2LGrouping", function(x, j=NULL) { x_togroup <- x@high2low x_togroup[is.na(x_togroup)] <- which(is.na(x_togroup)) .subset_by_integer(x_togroup, j) } ) ### The default method should be as good (if not better) as this. #setMethod("togrouplength", "H2LGrouping", # function(x) # { # ans <- rep.int(1L, length(x)) # mapped_lows <- setdiff(unique(x@high2low), NA) # for (low in mapped_lows) { # ii <- as.integer(c(low, x@low2high[[low]])) # ans[ii] <- length(ii) # } # ans # } #) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### More operations on H2LGrouping objects. These operations are NOT part of ### the core Grouping API. ### ### The rank of group G_i is the number of non-empty groups that are before ### G_i plus one. Or, equivalently, it's the number of non-empty groups with ### an index <= i. setGeneric("grouprank", signature="x", function(x, i=NULL) standardGeneric("grouprank") ) setMethod("grouprank", "H2LGrouping", function(x, i=NULL) { ans <- cumsum(is.na(high2low(x))) if (!is.null(i)) ans <- ans[i] return(ans) } ) ### togrouprank() returns the mapping from objects to group ranks. ### An important property of togrouprank() is that: ### togrouprank(x, neg_idx) ### and ### seq_along(neg_idx) ### are identical, where 'neg_idx' is the vector of the indices of ### the non-empty groups i.e. ### neg_idx <- which(grouplengths(x) != 0L) setGeneric("togrouprank", signature="x", function(x, j=NULL) standardGeneric("togrouprank") ) setMethod("togrouprank", "H2LGrouping", function(x, j=NULL) { to_group <- togroup(x) group_rank <- grouprank(x) ans <- group_rank[to_group] if (!is.null(j)) ans <- ans[j] return(ans) } ) setReplaceMethod("length", "H2LGrouping", function(x, value) { if (!isSingleNumber(value)) stop(wmsg("length must be a single integer")) if (!is.integer(value)) value <- as.integer(value) if (value < 0L) stop(wmsg("length cannot be negative")) if (value > length(x)) stop(wmsg("cannot make a ", class(x), " instance longer")) length(x@high2low) <- value x@low2high <- S4Vectors:::reverseSelfmatchMapping(x@high2low) x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Validity. ### .valid.H2LGrouping <- function(x) { if (!is.integer(x@high2low)) return("the 'high2low' slot must contain an integer vector") if (!all(x@high2low >= 1L, na.rm=TRUE)) return("the 'high2low' slot must contain integer values >= 1") if (!all(x@high2low < seq_along(x@high2low), na.rm=TRUE)) { problem <- c("when mapped, elements in the 'high2low' slot must be mapped ", "to elements at a lower position") return(paste(problem, collapse="")) } if (!all(is.na(x@high2low[x@high2low]))) { problem <- c("when mapped, elements in the 'high2low' slot must be mapped ", "to unmapped elements") return(paste(problem, collapse="")) } if (!is.list(x@low2high)) return("the 'low2high' slot must contain a list") if (length(x@high2low) != length(x@low2high)) return("the 'high2low' and 'low2high' slots must have the same length") if (!identical(S4Vectors:::reverseSelfmatchMapping(x@high2low), x@low2high)) { problem <- c("the 'low2high' slot must contain the reverse mapping ", "of the 'high2low' slot") return(paste(problem, collapse="")) } NULL } setValidity("H2LGrouping", function(object) { problems <- .valid.H2LGrouping(object) if (is.null(problems)) TRUE else problems } ) ### For Dups objects only. .duplicated.Dups <- function(x, incomparables=FALSE) { if (!identical(incomparables, FALSE)) stop(wmsg("\"duplicated\" method for Dups objects ", "only accepts 'incomparables=FALSE'")) !is.na(high2low(x)) } setMethod("duplicated", "Dups", .duplicated.Dups) ### For Dups objects only. setMethod("show", "Dups", function(object) { percentage <- 100.00 * sum(duplicated(object)) / length(object) cat(class(object), " of length ", length(object), " (", percentage, "% of duplicates)\n", sep="") } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructors. ### .newH2LGrouping <- function(Class, high2low) { if (!is.numeric(high2low)) stop(wmsg("'high2low' must be a vector of integers")) if (!is.integer(high2low)) high2low <- as.integer(high2low) new2(Class, high2low=high2low, low2high=S4Vectors:::reverseSelfmatchMapping(high2low), check=FALSE) } H2LGrouping <- function(high2low=integer()) .newH2LGrouping("H2LGrouping", high2low) Dups <- function(high2low=integer()) .newH2LGrouping("Dups", high2low) setMethod("high2low", "ANY", function(x) { ans <- selfmatch(x) ans[ans == seq_along(x)] <- NA_integer_ ans } ) ### ------------------------------------------------------------------------- ### GroupingRanges objects ### ---------------------- ### ### A GroupingRanges object represents a "block-grouping", that is, a ### grouping where each group is a block of adjacent elements in the original ### collection of objects. GroupingRanges objects support the Ranges API (e.g. ### start/end/width) in addition to the Grouping API. ### setClass("GroupingRanges", ## We put Ranges before Grouping so GroupingRanges objects inherit the ## "show" method for Ranges objects instead of the method for Grouping ## objects. contains=c("Ranges", "Grouping"), representation("VIRTUAL") ) ### Overwrite default method with optimized method for GroupingRanges objects. setMethod("grouplengths", "GroupingRanges", function(x, i=NULL) { x_width <- width(x) .subset_by_integer(x_width, i) } ) setClass("GroupingIRanges", contains=c("IRanges", "GroupingRanges")) ### ------------------------------------------------------------------------- ### Partitioning objects ### -------------------- ### ### A Partitioning object is a GroupingRanges object where the blocks fully ### cover the original collection of objects and don't overlap. This makes ### them many-to-one groupings. Furthermore, the blocks must be ordered by ### ascending position on the original collection of objects. ### Note that for a Partitioning object 'x', 'togroup(x)' is sorted in ### increasing order (not necessarily strictly increasing). ### ### The Partitioning class is virtual with 2 concrete direct subclasses: ### PartitioningByEnd and PartitioningByWidth. ### setClass("Partitioning", contains=c("GroupingRanges", "ManyToOneGrouping"), representation( "VIRTUAL", NAMES="character_OR_NULL" # R doesn't like @names !! ), prototype( NAMES=NULL ) ) ### The default methods below assume that the "length + start/end/width" API ### is already implemented. setMethod("getListElement", "Partitioning", function(x, i, exact=TRUE) { i <- normalizeDoubleBracketSubscript(i, x, exact=exact, error.if.nomatch=TRUE) ## The purpose of the code below is to extract 'start(x)[i] - 1' ## (stored in 'ans_shift') and 'width(x)[i]' (stored in 'ans_len') ## in the fastest possible way. Looks like a convoluted way to ## extract those 2 values but it is actually 1000x faster than the ## naive way. ans_shift <- 0L ans_len <- end(x)[i] if (i >= 2L) { ans_shift <- end(x)[i - 1L] ans_len <- ans_len - ans_shift } seq_len(ans_len) + ans_shift } ) ### Overwrite method for ManyToOneGrouping objects with optimized method for ### Partitioning objects. setMethod("togroup", "Partitioning", function(x, j=NULL) { x_width <- width(x) x_togroup <- rep.int(seq_along(x_width), x_width) .subset_by_integer(x_togroup, j) } ) setMethod("names", "Partitioning", function(x) x@NAMES) setReplaceMethod("names", "Partitioning", set_IRanges_names) .valid.Partitioning <- function(x) { if (is.null(names(x))) return(NULL) if (!is.character(names(x))) return("the names must be a character vector or NULL") if (length(names(x)) != length(x)) return("number of names and number of elements differ") NULL } setValidity2("Partitioning", .valid.Partitioning) setMethod("NSBS", "Partitioning", function(i, x, exact=TRUE, strict.upper.bound=TRUE, allow.NAs=FALSE) { i <- range(i) callNextMethod() }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### PartitioningByEnd uses a compact internal representation that allows ### fast mapping from groups to objects. However, it is not efficient for ### mapping from objects to groups. ### setClass("PartitioningByEnd", contains="Partitioning", representation( end="integer" ), prototype( end=integer() ) ) setMethod("end", "PartitioningByEnd", function(x) x@end) ### Overwrite method for Ranges objects with optimized method for ### PartitioningByEnd objects. setMethod("length", "PartitioningByEnd", function(x) length(end(x))) ### Overwrite method for ManyToOneGrouping objects with optimized method for ### PartitioningByEnd objects. setMethod("nobj", "PartitioningByEnd", function(x) S4Vectors:::last_or(end(x), 0L) ) setMethod("start", "PartitioningByEnd", function(x) { x_end <- end(x) if (length(x_end) == 0L) return(integer()) c(1L, x_end[-length(x_end)] + 1L) } ) setMethod("width", "PartitioningByEnd", function(x) S4Vectors:::diffWithInitialZero(end(x)) ) .valid.PartitioningByEnd <- function(x) { if (!is.integer(end(x))) return("the ends must be integers") if (length(x) == 0L) return(NULL) if (S4Vectors:::anyMissing(end(x))) return("the ends cannot be NAs") if (S4Vectors:::isNotSorted(end(x))) return("the ends must be sorted") if (end(x)[1L] < 0L) return("the ends cannot be negative") if (!is.null(names(end(x)))) return("the ends should not be named") NULL } setValidity2("PartitioningByEnd", .valid.PartitioningByEnd) .numeric2end <- function(x=integer(0), NG=NULL) { if (!is.integer(x)) x <- as.integer(x) if (S4Vectors:::anyMissingOrOutside(x, 0L)) stop(wmsg("when 'x' is an integer vector, ", "it cannot contain NAs or negative values")) if (S4Vectors:::isNotSorted(x)) stop(wmsg("when 'x' is an integer vector, ", "it must be sorted")) if (is.null(NG)) return(x) ## When 'NG' (number of groups) is supplied, then 'x' is considered ## to represent the group assignment of a collection of 'length(x)' ## objects. Therefore the values in 'x' must be >= 1 and <= 'NG'. ## ADDITIONALLY, 'x' must be *sorted* (not strictly) so it can be ## reconstructed from the object returned by PartitioningByEnd() ## by doing togroup() on that object. if (!isSingleNumber(NG)) stop(wmsg("'NG' must be either NULL or a single integer")) if (!is.integer(NG)) NG <- as.integer(NG) NO <- length(x) # nb of objects if (NG == 0L) { if (NO != 0L) stop(wmsg("when 'NG' is 0, 'x' must be of length 0")) } else { ## 'x' is expected to be non-decreasing and with values >= 1 ## and <= 'NG'. x <- cumsum(tabulate(x, nbins=NG)) ## 'x[NG]' is guaranteed to be <= 'NO'. if (x[NG] != NO) stop(wmsg("when 'NG' is supplied, values in 'x' must ", "be >= 1 and <= 'NG'")) } x } .prepare_Partitioning_names <- function(names, ans_len, NG, x_names) { if (!is.null(names)) { if (!is.character(names) || length(names) != ans_len) stop(wmsg("'names' must be either NULL or a character vector ", "of length 'NG' (if supplied) or 'length(x)' ", "(if 'NG' is not supplied)")) return(names) } if (is.null(NG)) return(x_names) # should be of length 'ans_len' NULL } PartitioningByEnd <- function(x=integer(0), NG=NULL, names=NULL) { if (is(x, "List") || is.list(x)) { if (!is.null(NG)) warning(wmsg("'NG' argument is ignored when 'x' ", "is a list-like object")) if (is(x, "CompressedList")) { ## Behaves like a getter for the 'partitioning' slot. ans <- x@partitioning if (!is.null(names)) names(ans) <- names return(ans) } if (is(x, "PartitioningByEnd")) { if (!is.null(names)) names(x) <- names return(x) } x_names <- names(x) ans_end <- cumsum(elementNROWS(x)) } else { if (!is.numeric(x)) stop(wmsg("'x' must be either a list-like object or ", "a sorted vector of non-NA non-negative integers")) x_names <- names(x) ans_end <- .numeric2end(x, NG) } ans_names <- .prepare_Partitioning_names(names, length(ans_end), NG, x_names) new2("PartitioningByEnd", end=unname(ans_end), NAMES=ans_names, check=FALSE) } setAs("Ranges", "PartitioningByEnd", function(from) { ans <- PartitioningByEnd(end(from), names=names(from)) if (!identical(start(ans), start(from))) stop(wmsg("the Ranges object to coerce does not represent ", "a partitioning")) ans } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### PartitioningByWidth uses a compact internal representation too. Storing ### the widths instead of the ends would allow the total number of objects ### (nobj(x)) to be greater than 2^31-1 but note that some methods will break ### when this happens, e.g. nobj, end, etc... ### setClass("PartitioningByWidth", contains="Partitioning", representation( width="integer" ), prototype( width=integer() ) ) setMethod("width", "PartitioningByWidth", function(x) x@width) setMethod("end", "PartitioningByWidth", function(x) cumsum(width(x))) setMethod("start", "PartitioningByWidth", function(x) { x_width <- width(x) if (length(x_width) == 0L) return(integer()) c(1L, cumsum(x_width[-length(x_width)]) + 1L) } ) .valid.PartitioningByWidth <- function(x) { if (!is.integer(width(x))) return("the widths must be integers") if (length(x) == 0L) return(NULL) if (S4Vectors:::anyMissingOrOutside(width(x), 0L)) return("the widths cannot be NAs or negative") if (!is.null(names(width(x)))) return("the widths should not be named") NULL } setValidity2("PartitioningByWidth", .valid.PartitioningByWidth) .numeric2width <- function(x=integer(0), NG=NULL) { if (!is.integer(x)) x <- as.integer(x) if (S4Vectors:::anyMissingOrOutside(x, 0L)) stop(wmsg("when 'x' is an integer vector, ", "it cannot contain NAs or negative values")) if (is.null(NG)) return(x) ## When 'NG' (number of groups) is supplied, then 'x' is considered ## to represent the group assignment of a collection of 'length(x)' ## objects. Therefore the values in 'x' must be >= 1 and <= 'NG'. ## ADDITIONALLY, 'x' must be *sorted* (not strictly) so it can be ## reconstructed from the object returned by PartitioningByWidth() ## by doing togroup() on that object. if (S4Vectors:::isNotSorted(x)) stop(wmsg("when 'x' is an integer vector, it must be sorted")) if (!isSingleNumber(NG)) stop(wmsg("'NG' must be either NULL or a single integer")) if (!is.integer(NG)) NG <- as.integer(NG) NO <- length(x) # nb of objects if (NG == 0L) { if (NO != 0L) stop(wmsg("when 'NG' is 0, 'x' must be of length 0")) } else { ## 'x' is expected to be non-decreasing and with values >= 1 ## and <= 'NG'. x <- tabulate(x, nbins=NG) ## 'sum(x)' is guaranteed to be <= 'NO'. if (sum(x) != NO) stop(wmsg("when 'NG' is supplied, values in 'x' must ", "be >= 1 and <= 'NG'")) } x } PartitioningByWidth <- function(x=integer(0), NG=NULL, names=NULL) { if (is(x, "List") || is.list(x)) { if (!is.null(NG)) warning(wmsg("'NG' argument is ignored when 'x' ", "is a list-like object")) if (is(x, "PartitioningByWidth")) { if (!is.null(names)) names(x) <- names return(x) } x_names <- names(x) ans_width <- elementNROWS(x) } else { if (!is.numeric(x)) stop(wmsg("'x' must be either a list-like object or ", "a vector of non-NA non-negative integers")) x_names <- names(x) ans_width <- .numeric2width(x, NG) } ans_names <- .prepare_Partitioning_names(names, length(ans_width), NG, x_names) new2("PartitioningByWidth", width=unname(ans_width), NAMES=ans_names, check=FALSE) } setAs("Ranges", "PartitioningByWidth", function(from) { ans <- PartitioningByWidth(width(from), names(from)) if (!identical(start(ans), start(from))) stop(wmsg("the Ranges object to coerce does not represent ", "a partitioning")) ans } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### PartitioningMap contains PartitioningByEnd and one additional slot, ### 'mapOrder', to specify a different order. This object is used by the ### pack() function in GenomicFiles and is put in @partitioning of a ### GRangesList of pack()ed ranges. 'mapOrder' records the order of the ### unpacked() ranges. ### setClass("PartitioningMap", contains="PartitioningByEnd", representation( mapOrder="integer" ), prototype( mapOrder=integer() ) ) setGeneric("mapOrder", function(x) standardGeneric("mapOrder")) setMethod("mapOrder", "PartitioningMap", function(x) x@mapOrder) .valid.PartitioningMap <- function(x) { if (length(x) == 0L) return(NULL) if (S4Vectors:::anyMissing(mapOrder(x))) return("mapOrder cannot contain NA values") if (any(mapOrder(x) < 0L)) return("mapOrder values cannot be negative") if (!is.null(names(mapOrder(x)))) return("the mapOrder should not be named") if (length(maporder <- mapOrder(x))) { maxorder <- max(maporder) if (max(maporder) > max(end(x))) return("max mapOrder value must be == max(end(object))") } NULL } setValidity2("PartitioningMap", .valid.PartitioningMap) PartitioningMap <- function(x=integer(), mapOrder=integer(), ...) new("PartitioningMap", PartitioningByEnd(x=x), mapOrder=mapOrder, ...) setAs("PartitioningByEnd", "PartitioningMap", function(from) new("PartitioningMap", from, mapOrder=numeric()) ) setMethod("show", "PartitioningMap", function(object) { cat(class(object), " of length ", length(object), "\n") cat("mapOrder: ", mapOrder(object), "\n") print(PartitioningByEnd(object)) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### findOverlaps() ### ### A simple findOverlaps method that doesn't use NCList but works only ### on a subject with *adjacent* ranges sorted non-decreasingly. ### Can be 30% faster or more than the real findOverlaps() (NCList-based) ### when 'query' is such that 'start(query)' and 'end(query)' are also sorted ### non-decreasingly (which is the case if for example 'query' is a ### Partitioning object). ### TODO: Add a "findOverlaps" method for Partitioning,Partitioning in the ### findOverlaps-methods.R file that calls this. findOverlaps_Ranges_Partitioning <- function(query, subject, hit.empty.query.ranges=FALSE, hit.empty.subject.ranges=FALSE) { if (!is(query, "Ranges")) stop(wmsg("'query' must be a Ranges object")) if (!is(subject, "Partitioning")) stop(wmsg("'subject' must be a Partitioning object")) if (!isTRUEorFALSE(hit.empty.query.ranges) || !isTRUEorFALSE(hit.empty.subject.ranges)) stop(wmsg("'hit.empty.query.ranges' and 'hit.empty.subject.ranges' ", "must be TRUE or FALSE")) q_len <- length(query) q_start <- start(query) q_end <- end(query) s_len <- length(subject) s_end <- end(subject) if (!hit.empty.query.ranges) { q_idx <- which(width(query) != 0L) q_start <- q_start[q_idx] q_end <- q_end[q_idx] } if (!hit.empty.subject.ranges) { s_idx <- which(width(subject) != 0L) s_end <- s_end[s_idx] } vec <- c(0L, s_end) + 0.5 q_start2subject <- findInterval(q_start, vec) q_end2subject <- findInterval(q_end, vec) q_hits <- rep.int(seq_along(q_start), q_end2subject - q_start2subject + 1L) s_hits <- S4Vectors:::mseq(q_start2subject, q_end2subject) ## If 'query' is a Partitioning object, all hits are guaranteed to be ## valid. if (!is(query, "Partitioning")) { ## Remove invalid hits. is_valid <- 1L <= s_hits & s_hits <= length(s_end) q_hits <- q_hits[is_valid] s_hits <- s_hits[is_valid] } ## Remap hits to original query/subject. if (!hit.empty.query.ranges) q_hits <- q_idx[q_hits] if (!hit.empty.subject.ranges) s_hits <- s_idx[s_hits] ## Make and return Hits object. Hits(q_hits, s_hits, q_len, s_len, sort.by.query=TRUE) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Old stuff (deprecated & defunct) ### setMethod("togroup", "ANY", function(x, j=NULL) { msg <- wmsg( "Using togroup() on a ", class(x), " object is defunct. ", "Please use togroup(PartitioningByWidth(...)) instead." ) .Defunct(msg=msg) } ) IRanges/R/Hits-class-leftovers.R0000644000175400017540000000554513175713360017514 0ustar00biocbuildbiocbuild### ========================================================================= ### IMPORTANT NOTE - 4/29/2014 ### Most of the stuff that used to be in the IRanges/R/Hits-class.R file was ### moved to the S4Vectors package (to R/Hits-class.R). ### The stuff that could not be moved there was *temporarily* kept here in ### Hits-class-leftovers.R but will need to find a new home (in S4Vectors ### or in IRanges). ### ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion ### setAs("Hits", "DataFrame", function(from) { DataFrame(as.matrix(from), if (!is.null(mcols(from))) mcols(from) else S4Vectors:::make_zero_col_DataFrame(length(from))) }) .as.data.frame.Hits <- function(x, row.names=NULL, optional=FALSE, ...) { as.data.frame(as(x, "DataFrame"), row.names=row.names, optional=optional, ...) } setMethod("as.data.frame", "Hits", .as.data.frame.Hits) ### Turn SortedByQueryHits object 'from' into a PartitioningByEnd object that ### describes the grouping of hits by query. .from_SortedByQueryHits_to_PartitioningByEnd <- function(from) PartitioningByEnd(queryHits(from), NG=queryLength(from)) setAs("SortedByQueryHits", "PartitioningByEnd", .from_SortedByQueryHits_to_PartitioningByEnd ) setAs("SortedByQueryHits", "Partitioning", .from_SortedByQueryHits_to_PartitioningByEnd ) setAs("SortedByQueryHits", "Ranges", .from_SortedByQueryHits_to_PartitioningByEnd ) setAs("SortedByQueryHits", "IRanges", function(from) as(.from_SortedByQueryHits_to_PartitioningByEnd(from), "IRanges") ) ### Turn SortedByQueryHits object 'from' into a CompressedIntegerList object ### with one list element per element in the original query. .from_SortedByQueryHits_to_CompressedIntegerList <- function(from) { ans_partitioning <- .from_SortedByQueryHits_to_PartitioningByEnd(from) relist(subjectHits(from), ans_partitioning) } setAs("SortedByQueryHits", "CompressedIntegerList", .from_SortedByQueryHits_to_CompressedIntegerList ) setAs("SortedByQueryHits", "IntegerList", .from_SortedByQueryHits_to_CompressedIntegerList ) setAs("SortedByQueryHits", "List", .from_SortedByQueryHits_to_CompressedIntegerList ) .as.list.SortedByQueryHits <- function(x) as.list(.from_SortedByQueryHits_to_CompressedIntegerList(x)) setMethod("as.list", "SortedByQueryHits", .as.list.SortedByQueryHits) .from_Hits_to_CompressedIntegerList <- function(from) { as(as(from, "SortedByQueryHits"), "CompressedIntegerList") } setAs("Hits", "List", .from_Hits_to_CompressedIntegerList) setAs("Hits", "IntegerList", .from_Hits_to_CompressedIntegerList) setAs("Hits", "CompressedIntegerList", .from_Hits_to_CompressedIntegerList) setMethod("as.list", "Hits", function(x) as.list(as(x, "SortedByQueryHits"))) setAs("Hits", "Grouping", function(from) ManyToManyGrouping(as(from, "List"), nobj=nRnode(from))) IRanges/R/IPos-class.R0000644000175400017540000003053513175713360015445 0ustar00biocbuildbiocbuild### ========================================================================= ### IPos objects ### ------------------------------------------------------------------------- ### setClass("IPos", contains="Ranges", representation( pos_runs="IRanges" ) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Getters ### setMethod("length", "IPos", function(x) sum(width(x@pos_runs))) setMethod("names", "IPos", function(x) NULL) setReplaceMethod("names", "IPos", function(x, value) { if (!is.null(value)) stop(class(x), " objects don't accept names") x } ) setGeneric("pos", function(x) standardGeneric("pos")) setMethod("pos", "IPos", function(x) as.integer(x@pos_runs)) setMethod("start", "IPos", function(x) pos(x)) setMethod("end", "IPos", function(x) pos(x)) setMethod("width", "IPos", function(x) rep.int(1L, length(x))) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Collapse runs of "stitchable integer ranges" ### ### In a Ranges object 'x', 2 ranges x[i] and x[i+1] are "stitchable" if ### start(x[i+1]) == end(x[i])+1. For example, in the following object: ### 1: .....xxxx............. ### 2: ...xx................. ### 3: .........xxx.......... ### 4: ............xxxxxx.... ### 5: ..................x... ### x[3] and x[4] are stitchable, and x[4] and x[5] are stitchable. So ### x[3], x[4], and x[5] form a run of "stitchable ranges" that will collapse ### into the following single range after stitching: ### .........xxxxxxxxxx... ### Note that x[1] and x[3] are not stitchable because they are not ### consecutive vector elements (but they would if we removed x[2]). ### stitch_Ranges() below takes any Ranges derivative and returns an IRanges ### object (so is NOT an endomorphism). Note that this transformation ### preserves 'sum(width(x))'. ### Also note that this is an "inter range transformation". However unlike ### range(), reduce(), gaps(), or disjoin(), its result depends on the order ### of the elements in the input vector. It's also idempotent like range(), ### reduce(), and disjoin() (gaps() is not). ### TODO: Define and export stitch() generic and method for Ranges objects ### (in inter-range-methods.R). ### Maybe it would also make sense to have an isStitched() generic like we ### have isDisjoint() to provide a quick and easy way to check the state of ### the object before applying the transformation to it. In theory each ### idempotent inter range transformation could have a "state checker" so ### maybe add isReduced() too (range() probably doesn't need one). stitch_Ranges <- function(x) { if (length(x) == 0L) return(IRanges()) x_start <- start(x) x_end <- end(x) ## Find runs of stitchable elements along 'x'. ## Each run is described by the indices of its first ('run_from') and ## last ('run_to') elements in 'x'. ## The runs form a partitioning of 'x'. new_run_idx <- which(x_start[-1L] != x_end[-length(x)] + 1L) run_from <- c(1L, new_run_idx + 1L) run_to <- c(new_run_idx, length(x)) IRanges(x_start[run_from], x_end[run_to]) } ### The runs of positions in an IPos object are guaranteed to be stitched. stitch_IPos <- function(x) x@pos_runs ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Validity ### ### TODO ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructor ### ### Note that if 'pos_runs' is an IPos instance with no metadata or metadata ### columns, then 'identical(IPos(pos_runs), pos_runs)' is TRUE. IPos <- function(pos_runs=IRanges()) { if (!is(pos_runs, "Ranges")) pos_runs <- as(pos_runs, "Ranges", strict=FALSE) suppressWarnings(ans_len <- sum(width(pos_runs))) if (is.na(ans_len)) stop("too many positions in 'pos_runs'") pos_runs <- stitch_Ranges(pos_runs) pos_runs <- pos_runs[width(pos_runs) != 0L] new2("IPos", pos_runs=pos_runs, check=FALSE) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion ### .from_Ranges_to_IPos <- function(from) { if (!all(width(from) == 1L)) stop(wmsg("all the ranges in the ", class(from), " object to ", "coerce to IPos must have a width of 1")) if (!is.null(names(from))) warning(wmsg("because an IPos object cannot hold them, the names ", "on the ", class(from), " object couldn't be ", "propagated during its coercion to IPos")) ans <- IPos(from) mcols(ans) <- mcols(from) ans } setAs("Ranges", "IPos", .from_Ranges_to_IPos) setAs("ANY", "IPos", function(from) .from_Ranges_to_IPos(as(from, "Ranges"))) ### The "as.data.frame" method for Ranges objects works on an IPos object ### but returns a data.frame with identical "start" and "end" columns, and ### a "width" column filled with 1. We overwrite it to return a data.frame ### with a "pos" column instead of the "start" and "end" columns, and no ### "width" column. ### TODO: Turn this into an S3/S4 combo for as.data.frame.IPos setMethod("as.data.frame", "IPos", function(x, row.names=NULL, optional=FALSE, ...) { ans <- data.frame(pos=pos(x), stringsAsFactors=FALSE) x_mcols <- mcols(x) if (!is.null(x_mcols)) ans <- cbind(ans, as.data.frame(x_mcols)) ans } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Subsetting ### ### NOT exported but used in the GenomicRanges package. ### 'pos_runs' must be an IRanges or GRanges object or any range-based ### object as long as it supports start(), end(), width(), and is subsettable. ### 'i' must be a Ranges object with no zero-width ranges. extract_pos_runs_by_ranges <- function(pos_runs, i) { map <- S4Vectors:::map_ranges_to_runs(width(pos_runs), start(i), width(i)) ## Because 'i' has no zero-width ranges, 'mapped_range_span' cannot ## contain zeroes and so 'mapped_range_Ltrim' and 'mapped_range_Rtrim' ## cannot contain garbbage. mapped_range_offset <- map[[1L]] mapped_range_span <- map[[2L]] mapped_range_Ltrim <- map[[3L]] mapped_range_Rtrim <- map[[4L]] run_idx <- S4Vectors:::fancy_mseq(mapped_range_span, mapped_range_offset) pos_runs <- pos_runs[run_idx] if (length(run_idx) != 0L) { Rtrim_idx <- cumsum(mapped_range_span) Ltrim_idx <- c(1L, Rtrim_idx[-length(Rtrim_idx)] + 1L) trimmed_start <- start(pos_runs)[Ltrim_idx] + mapped_range_Ltrim trimmed_end <- end(pos_runs)[Rtrim_idx] - mapped_range_Rtrim start(pos_runs)[Ltrim_idx] <- trimmed_start end(pos_runs)[Rtrim_idx] <- trimmed_end suppressWarnings(new_len <- sum(width(pos_runs))) if (is.na(new_len)) stop("subscript is too big") } pos_runs } setMethod("extractROWS", "IPos", function(x, i) { i <- normalizeSingleBracketSubscript(i, x, as.NSBS=TRUE) ## TODO: Maybe make this the coercion method from NSBS to Ranges. if (is(i, "RangesNSBS")) { ir <- i@subscript ir <- ir[width(ir) != 0L] } else { ir <- as(as.integer(i), "IRanges") } new_pos_runs <- extract_pos_runs_by_ranges(x@pos_runs, ir) x@pos_runs <- stitch_Ranges(new_pos_runs) mcols(x) <- extractROWS(mcols(x), i) x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Show ### .make_naked_matrix_from_IPos <- function(x) { x_len <- length(x) x_mcols <- mcols(x) x_nmc <- if (is.null(x_mcols)) 0L else ncol(x_mcols) ans <- cbind(pos=as.character(pos(x))) if (x_nmc > 0L) { tmp <- do.call(data.frame, c(lapply(x_mcols, showAsCell), list(check.names=FALSE))) ans <- cbind(ans, `|`=rep.int("|", x_len), as.matrix(tmp)) } ans } show_IPos <- function(x, margin="", print.classinfo=FALSE) { x_class <- class(x) x_len <- length(x) x_mcols <- mcols(x) x_nmc <- if (is.null(x_mcols)) 0L else ncol(x_mcols) cat(x_class, " object with ", x_len, " ", ifelse(x_len == 1L, "position", "positions"), " and ", x_nmc, " metadata ", ifelse(x_nmc == 1L, "column", "columns"), ":\n", sep="") ## S4Vectors:::makePrettyMatrixForCompactPrinting() assumes that head() ## and tail() work on 'xx'. xx <- as(x, "IPos") out <- S4Vectors:::makePrettyMatrixForCompactPrinting(xx, .make_naked_matrix_from_IPos) if (print.classinfo) { .COL2CLASS <- c(pos="integer") classinfo <- S4Vectors:::makeClassinfoRowForCompactPrinting(x, .COL2CLASS) ## A sanity check, but this should never happen! stopifnot(identical(colnames(classinfo), colnames(out))) out <- rbind(classinfo, out) } if (nrow(out) != 0L) rownames(out) <- paste0(margin, rownames(out)) ## We set 'max' to 'length(out)' to avoid the getOption("max.print") ## limit that would typically be reached when 'showHeadLines' global ## option is set to Inf. print(out, quote=FALSE, right=TRUE, max=length(out)) } setMethod("show", "IPos", function(object) show_IPos(object, margin=" ", print.classinfo=TRUE) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Combining ### ### Note that supporting "[" and "c" makes "[<-" work out-of-the-box! ### ### 'Class' must be "IPos" or the name of a concrete IPos subclass. ### 'objects' must be a list of IPos objects. ### Returns an instance of class 'Class'. combine_IPos_objects <- function(Class, objects, use.names=TRUE, ignore.mcols=FALSE) { if (!isSingleString(Class)) stop("'Class' must be a single character string") if (!extends(Class, "IPos")) stop("'Class' must be the name of a class that extends IPos") if (!is.list(objects)) stop("'objects' must be a list") if (!isTRUEorFALSE(use.names)) stop("'use.names' must be TRUE or FALSE") ### TODO: Support 'use.names=TRUE'. if (use.names) stop("'use.names=TRUE' is not supported yet") if (!isTRUEorFALSE(ignore.mcols)) stop("'ignore.mcols' must be TRUE or FALSE") if (length(objects) != 0L) { ## TODO: Implement (in C) fast 'elementIsNull(objects)' in S4Vectors ## that does 'sapply(objects, is.null, USE.NAMES=FALSE)', and use it ## here. null_idx <- which(sapply(objects, is.null, USE.NAMES=FALSE)) if (length(null_idx) != 0L) objects <- objects[-null_idx] } if (length(objects) == 0L) return(new(Class)) ## TODO: Implement (in C) fast 'elementIs(objects, class)' in S4Vectors ## that does 'sapply(objects, is, class, USE.NAMES=FALSE)', and use it ## here. 'elementIs(objects, "NULL")' should work and be equivalent to ## 'elementIsNull(objects)'. if (!all(sapply(objects, is, Class, USE.NAMES=FALSE))) stop("the objects to combine must be ", Class, " objects (or NULLs)") objects_names <- names(objects) names(objects) <- NULL # so lapply(objects, ...) below returns an # unnamed list ## Combine "pos_runs" slots. pos_runs_slots <- lapply(objects, function(x) x@pos_runs) ## TODO: Use combine_IRanges_objects() here when it's available. ans_pos_runs <- stitch_Ranges(do.call(c, pos_runs_slots)) suppressWarnings(ans_len <- sum(width(ans_pos_runs))) if (is.na(ans_len)) stop("too many genomic positions to combine") ## Combine "mcols" slots. if (ignore.mcols) { ans_mcols <- NULL } else { ans_mcols <- do.call(S4Vectors:::rbind_mcols, objects) } ## Make 'ans' and return it. new2(Class, pos_runs=ans_pos_runs, elementMetadata=ans_mcols, check=FALSE) } setMethod("c", "IPos", function (x, ..., ignore.mcols=FALSE, recursive=FALSE) { if (!identical(recursive, FALSE)) stop("\"c\" method for IPos objects ", "does not support the 'recursive' argument") if (missing(x)) { objects <- list(...) x <- objects[[1L]] } else { objects <- list(x, ...) } combine_IPos_objects(class(x), objects, use.names=FALSE, ignore.mcols=ignore.mcols) } ) IRanges/R/IRanges-class.R0000644000175400017540000004276613175713360016134 0ustar00biocbuildbiocbuild### ========================================================================= ### IRanges objects ### ------------------------------------------------------------------------- ### ### The IRanges class is a simple container for storing a vector of integer ### ranges. ### setClass("IRanges", contains="Ranges", representation( start="integer", width="integer", NAMES="character_OR_NULL" # R doesn't like @names !! ), prototype( start=integer(), width=integer(), NAMES=NULL ) ) ### A NormalIRanges object is an IRanges object where the ranges are: ### (a) not empty (i.e. they have a non-null width); ### (b) not overlapping; ### (c) ordered from left to right; ### (d) not even adjacent (i.e. there must be a non empty gap between 2 ### consecutive ranges). ### If 'x' is an IRanges object of length >= 2, then 'x' is normal iff: ### start(x)[i] <= end(x)[i] < start(x)[i+1] <= end(x)[i+1] ### for every 1 <= i < length(x). ### If length(x) == 1, then 'x' is normal iff width(x)[1] >= 1. ### If length(x) == 0, then 'x' is normal. setClass("NormalIRanges", contains="IRanges") ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### parallelSlotNames() ### ### Combine the new parallel slots with those of the parent class. Make sure ### to put the new parallel slots *first*. setMethod("parallelSlotNames", "IRanges", function(x) c("start", "width", "NAMES", callNextMethod()) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Getters ### setMethod("start", "IRanges", function(x, ...) x@start) setMethod("width", "IRanges", function(x) x@width) setMethod("names", "IRanges", function(x) x@NAMES) setMethod("ranges", "Ranges", function(x, use.names=TRUE, use.mcols=FALSE) { if (!isTRUEorFALSE(use.names)) stop("'use.names' must be TRUE or FALSE") if (!isTRUEorFALSE(use.mcols)) stop("'use.mcols' must be TRUE or FALSE") ans_start <- start(x) ans_width <- width(x) ans_names <- if (use.names) names(x) else NULL ans_mcols <- if (use.mcols) mcols(x) else NULL new2("IRanges", start=ans_start, width=ans_width, NAMES=ans_names, elementMetadata=ans_mcols, check=FALSE) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### isEmpty() and isNormal() ### .isNormal_IRanges <- function(x) .Call2("IRanges_isNormal", x, PACKAGE="IRanges") setMethod("isNormal", "IRanges", .isNormal_IRanges) ### Fast methods for NormalIRanges objects. setMethod("isEmpty", "NormalIRanges", function(x) length(x) == 0L) setMethod("isNormal", "NormalIRanges", function(x) TRUE) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "max" and "min" methods. ### ### Note: defined for NormalIRanges objects only. ### For an ordinary IRanges object 'x', it's not clear what the semantic ### should be. In particular, should empty ranges be ignored or not? If not ### then we could end up with 'min(x)' > 'max(x)' (e.g. when 'x' is made of 1 ### empty range) which is not nice. Another (and more pragmatic) reason for ### not defining these methods for IRanges objects is that I don't need them ### at the moment. ### setMethod("max", "NormalIRanges", function(x, ..., na.rm) { if (isEmpty(x)) { warning("empty ", class(x), " object; returning -Inf") -Inf } else { end(x)[length(x)] } } ) setMethod("min", "NormalIRanges", function(x, ..., na.rm) { if (isEmpty(x)) { warning("empty ", class(x), " object; returning Inf") Inf } else { start(x)[1L] } } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Validity. ### ### Validity of IRanges objects is taken care of by the validity method for ### Ranges objects. ### ### NormalIRanges objects .valid.NormalIRanges <- function(x) { if (!.isNormal_IRanges(x)) return("object is not normal") NULL } setValidity2("NormalIRanges", .valid.NormalIRanges) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion. ### setAs("Ranges", "IRanges", function(from) ranges(from, use.mcols=TRUE) ) ### Helper function (not exported) used by the "coerce" methods defined in ### IRanges-utils.R. Believe it or not but the implicit "coerce" methods do ### NOT check that they return a valid object! newNormalIRangesFromIRanges <- function(x, check=TRUE) { if (!is(x, "IRanges")) stop("'x' must be an IRanges object") if (!isTRUEorFALSE(check)) stop("'check' must be TRUE or FALSE") ## Check only what needs to be checked. if (check) S4Vectors:::stopIfProblems(.valid.NormalIRanges(x)) class(x) <- "NormalIRanges" x } ### The returned IRanges instance is guaranteed to be normal. setAs("logical", "IRanges", function(from) as(as(from, "NormalIRanges"), "IRanges") ) setAs("logical", "NormalIRanges", function(from) .Call2("NormalIRanges_from_logical", from, PACKAGE="IRanges") ) ### coercion from integer setAs("integer", "IRanges", function(from) .Call2("IRanges_from_integer", from, PACKAGE="IRanges") ) setAs("integer", "NormalIRanges", function(from) newNormalIRangesFromIRanges(as(from, "IRanges")) ) setAs("numeric", "IRanges", function(from) as(as.integer(from), "IRanges")) setAs("numeric", "NormalIRanges", function(from) newNormalIRangesFromIRanges(as(as.integer(from), "IRanges"))) ### coercion from character .from_character_to_IRanges <- function(from) { stopifnot(is.character(from)) if (anyNA(from)) stop(wmsg("converting a character vector to an IRanges object ", "does not support NAs")) error_msg <- wmsg( "The character vector to convert to an IRanges object must ", "contain strings of the form \"start-end\" or \"start..end\", ", "with end >= start - 1, or just \"pos\". For example: \"2501-2900\", ", "\"2501..2900\", or \"740\"." ) ## We want to split on the first occurence of "-" that is preceeded by ## a digit (ignoring and removing the spaces in between if any). from <- sub("([[:digit:]])[[:space:]]*-", "\\1..", from) split2 <- CharacterList(strsplit(from, "..", fixed=TRUE)) split2_eltNROWS <- elementNROWS(split2) if (!all(split2_eltNROWS <= 2L)) stop(error_msg) ans_start <- suppressWarnings(as.integer(phead(split2, n=1L))) ans_end <- suppressWarnings(as.integer(ptail(split2, n=1L))) if (anyNA(ans_start) || anyNA(ans_end)) stop(error_msg) IRanges(ans_start, ans_end, names=names(from)) } setAs("character", "IRanges", .from_character_to_IRanges) .from_factor_to_IRanges <- function(from) { from <- setNames(as.character(from), names(from)) .from_character_to_IRanges(from) } setAs("factor", "IRanges", .from_factor_to_IRanges) setAs("ANY", "Ranges", function(from) as(from, "IRanges")) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Low-level setters for IRanges objects. ### ### All these low-level setters preserve the length of the object. ### The choice was made to implement a "resizing" semantic: ### (1) changing the start preserves the end (so it changes the width) ### (2) changing the end preserves the start (so it changes the width) ### (3) changing the width preserves the start (so it changes the end) ### .set_IRanges_start <- function(x, value, check=TRUE) { if (!isTRUEorFALSE(check)) stop("'check' must be TRUE or FALSE") old_start <- start(x) ## Use 'x@start[]' instead of 'x@start' so the right value is recycled. x@start[] <- S4Vectors:::numeric2integer(value) x@width <- width(x) - start(x) + old_start if (check) validObject(x) x } setReplaceMethod("start", "IRanges", function(x, ..., value) .set_IRanges_start(x, value) ) .set_IRanges_end <- function(x, value, check=TRUE) { if (!isTRUEorFALSE(check)) stop("'check' must be TRUE or FALSE") ## Use 'x@width[]' instead of 'x@width' so the right value is recycled. x@width[] <- width(x) + S4Vectors:::numeric2integer(value) - end(x) if (check) validObject(x) x } setReplaceMethod("end", "IRanges", function(x, ..., value) .set_IRanges_end(x, value) ) .set_IRanges_width <- function(x, value, check=TRUE) { if (!isTRUEorFALSE(check)) stop("'check' must be TRUE or FALSE") ## Use 'x@width[]' instead of 'x@width' so the right value is recycled. x@width[] <- S4Vectors:::numeric2integer(value) if (check) validObject(x) x } setReplaceMethod("width", "IRanges", function(x, ..., value) .set_IRanges_width(x, value) ) set_IRanges_names <- function(x, value) { x@NAMES <- S4Vectors:::normalize_names_replacement_value(value, x) ## No need to validate an IRanges object after setting its names so this ## should be safe. x } setReplaceMethod("names", "IRanges", set_IRanges_names) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "update" method. ### ### This is a convenience method for combining multiple modifications in one ### single call. ### ### It must verify 2 important properties: ### (1) update(x) must be identical to x (doesn't touch x at all) ### (2) update(x, start=start(x), width=width(x), names=names(x)) ### must be identical to x too (but this time it updates x with its own ### content) ### .update_IRanges <- function(object, ...) { valid_argnames <- c("start", "width", "end", "names", "mcols") args <- S4Vectors:::extraArgsAsList(valid_argnames, ...) argnames <- names(args) sew <- c("start", "end", "width") narg_in_sew <- sum(sew %in% argnames) if (narg_in_sew == 3L) stop("at most 2 out of the ", paste("'", sew, "'", sep="", collapse=", "), " arguments can be supplied") if (narg_in_sew == 2L && ("names" %in% argnames || is.null(names(object))) && ("mcols" %in% argnames || is.null(mcols(object)))) { ## The update can change the length of the object. if ("end" %in% argnames) { if ("width" %in% argnames) { width <- args$width start <- args$end - width + 1L } else { start <- args$start width <- args$end - start + 1L } } else { start <- args$start width <- args$width } object <- BiocGenerics:::replaceSlots(object, start=S4Vectors:::numeric2integer(start), width=S4Vectors:::numeric2integer(width), NAMES=args$names, elementMetadata=args$mcols, check=FALSE) return(object) } ## The update preserves the length of the object. if ("start" %in% argnames) object <- .set_IRanges_start(object, args$start, check=FALSE) if ("end" %in% argnames) object <- .set_IRanges_end(object, args$end, check=FALSE) if ("width" %in% argnames) object <- .set_IRanges_width(object, args$width, check=FALSE) if ("names" %in% argnames) names(object) <- args$names if ("mcols" %in% argnames) mcols(object) <- args$mcols object } ### FIXME: need some way of specifying the extent of validity ### checking, like giving the class up to which the object is ### assumed valid. setMethod("update", "IRanges", function(object, ..., check = TRUE) { if (!isTRUEorFALSE(check)) stop("'check' must be TRUE or FALSE") object <- .update_IRanges(object, ...) if (check) validObject(object) object } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Subsetting. ### setMethod("extractROWS", "NormalIRanges", function(x, i) { i <- normalizeSingleBracketSubscript(i, x, as.NSBS=TRUE) if (is(x, "NormalIRanges")) { if (!isStrictlySorted(i)) stop("subscript must extract elements at strictly sorted ", "positions when\n subsetting a ", class(x), " object") } callNextMethod() } ) setMethod("replaceROWS", "IRanges", function(x, i, value) { i <- normalizeSingleBracketSubscript(i, x, as.NSBS=TRUE) ans_start <- replaceROWS(start(x), i, start(value)) ans_width <- replaceROWS(width(x), i, width(value)) ans_mcols <- replaceROWS(mcols(x), i, mcols(value)) update(x, start=ans_start, width=ans_width, mcols=ans_mcols, check=FALSE) } ) setMethod("replaceROWS", "NormalIRanges", function(x, i, value) { ans <- callNextMethod() validObject(ans) ans } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Combining. ### ### The "c" method for IRanges objects is implemented to behave like an ### endomorphism i.e. to return an object of the same class as 'x'. In ### particular 'c(x)' now returns 'x' and not 'as(x, "IRanges")'. ### It's easy to implement specific "c" methods for IRanges subclasses. ### Typically they just need to do something like: ### ### old_val <- S4Vectors:::disableValidity() ### on.exit(S4Vectors:::disableValidity(old_val)) ### S4Vectors:::disableValidity(TRUE) ### ans <- callNextMethod(x, ..., recursive=FALSE) ### ... ### ### and to take care of the additional slots (aka the subclass-specific ### slots). If there aren't any additional slots (e.g. NormalIRanges), or ### if the additional slots don't need to be modified (e.g. the "subject" ### slot of the Views subclass), then no need to implement a specific method ### at all. ### ### In the case of NormalIRanges objects, 'c(x1, x2)' will fail if the result ### is not normal, but 'c(as(x1, "IRanges"), x2)' or 'c(IRanges(), x1, x2)' ### would work. Note that, in general, 'c(IRanges(), x)' is not the same as ### 'c(x, IRanges())' (the former is equivalent to 'as(x, IRanges)' and the ### latter to 'c(x)' or 'x'). ### Also note that the user needs to be carefull when passing named arguments ### to c() (there is no good reason to do this in the first place) because of ### the following pitfalls: ### (1) If all the arguments are named (e.g. 'c(a=x1, b=x2)') then the first ### argument must be an IRanges *instance* otherwise dispatch will fail. ### It's not clear why dispatch works when 'x1' is an IRanges instance ### because, in that case, formal argument 'x' is missing. It's even ### less clear why it fails when 'x1' is an IRanges object without being ### an IRanges instance. For example: ### x1 <- IRanges(1, 11) ### x2 <- IRanges(22, 33) ### ## works as expected: ### c(a=x1, b=x2) ### ## works as expected: ### c(a=x1, asNormalIRanges(x2)) ### ## dispatch fails (the default "c" method is selected) ### c(a=asNormalIRanges(x1), b=x2)) ### (2) When named and unnamed arguments are mixed and no named argument has ### name 'x' (e.g. 'c(a=x1, x2)'), then, following the standard rules of ### argument matching in R, one would expect that the first unnamed ### argument will match formal argument 'x'. This is more or less what ### happens: ### > c(a=x1, x2) ### IRanges object: ### start end width ### [1] 2 22 21 ### [2] 1 11 11 ### but there are some surprises: ### > c(a=x1, TRUE) ### Error in c(a = x1, TRUE) : ### all arguments in '...' must be logical objects (or NULLs) ### > c(a=asNormalIRanges(x1), TRUE) ### $a ### NormalIRanges object: ### start end width ### [1] 1 11 11 ### ### [[2]] ### [1] TRUE ### setMethod("c", "IRanges", function(x, ..., ignore.mcols=FALSE, recursive=FALSE) { if (!identical(recursive, FALSE)) stop("\"c\" method for IRanges objects ", "does not support the 'recursive' argument") if (!isTRUEorFALSE(ignore.mcols)) stop("'ignore.mcols' must be TRUE or FALSE") if (missing(x)) { args <- unname(list(...)) x <- args[[1L]] } else { args <- unname(list(x, ...)) } if (length(args) == 1L) return(x) arg_is_null <- sapply(args, is.null) if (any(arg_is_null)) args[arg_is_null] <- NULL # remove NULL elements by setting them to NULL! if (!all(sapply(args, is, class(x)))) stop("all arguments in '...' must be ", class(x), " objects (or NULLs)") ans_start <- unlist(lapply(args, start), use.names=FALSE) ans_width <- unlist(lapply(args, width), use.names=FALSE) names_list <- lapply(args, names) arg_has_no_names <- sapply(names_list, is.null) if (all(arg_has_no_names)) { ans_names <- NULL } else { names_list[arg_has_no_names] <- lapply(args[arg_has_no_names], function(arg) character(length(arg))) ans_names <- unlist(names_list, use.names=FALSE) } if (ignore.mcols) { ans_mcols <- NULL } else { ans_mcols <- do.call(S4Vectors:::rbind_mcols, args) } update(x, start=ans_start, width=ans_width, names=ans_names, mcols=ans_mcols) } ) IRanges/R/IRanges-constructor.R0000644000175400017540000001547613175713360017412 0ustar00biocbuildbiocbuild### ========================================================================= ### The IRanges constructor ### ------------------------------------------------------------------------- ### ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The SEW0 interface: start=NULL/end=NULL/width=NULL ### .normargSEW0 <- function(x, argname) { if (is.null(x)) return(integer()) if (!is.numeric(x) && !(is.atomic(x) && all(is.na(x)))) stop("'", argname, "' must be a numeric vector (or NULL)") if (!is.integer(x)) x <- as.integer(x) x } ### Some of the functions that support the SEW0 interface: IRanges(), Views(), ### etc... solveUserSEW0 <- function(start=NULL, end=NULL, width=NULL) { start <- .normargSEW0(start, "start") end <- .normargSEW0(end, "end") width <- .normargSEW0(width, "width") L1 <- length(start) L2 <- length(end) L3 <- length(width) L123 <- c(L1, L2, L3) max123 <- max(L123) ## We want IRanges(start=integer(0), width=5) and ## IRanges(end=integer(0), width=5) to work and return an empty IRanges ## object. if (max123 == 0L || L1 == 0L && L2 == 0L && L3 == 1L) return(new("IRanges")) ## Recycle start/end/width. if (L1 < max123) { if (L1 == 0L) start <- rep.int(NA_integer_, max123) else start <- S4Vectors:::recycleVector(start, max123) } if (L2 < max123) { if (L2 == 0L) end <- rep.int(NA_integer_, max123) else end <- S4Vectors:::recycleVector(end, max123) } if (L3 < max123) { if (L3 == 0L) width <- rep.int(NA_integer_, max123) else width <- S4Vectors:::recycleVector(width, max123) } .Call2("solve_user_SEW0", start, end, width, PACKAGE="IRanges") } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The safe and user-friendly "IRanges" constructor. ### IRanges <- function(start=NULL, end=NULL, width=NULL, names=NULL) { if (is(start, "Ranges")) { if (!is.null(end) || !is.null(width)) stop("'end' and 'width' must be NULLs ", "when 'start' is a Ranges object") ans <- new2("IRanges", start=start(start), width=width(start), NAMES=names, check=FALSE) return(ans) } if ((is.logical(start) && !all(is.na(start))) || is(start, "Rle")) { if (is(start, "Rle") && !is.logical(runValue(start))) stop("'start' is an Rle, but not a logical Rle object") if (!is.null(end) || !is.null(width)) stop("'end' and 'width' must be NULLs when 'start' is a logical ", "vector or logical Rle") ## The returned IRanges instance is guaranteed to be normal. ans <- as(start, "IRanges") names(ans) <- names return(ans) } ans <- solveUserSEW0(start=start, end=end, width=width) names(ans) <- names return(ans) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The SEW interface: start=NA/end=NA/width=NA ### ### Some of the functions that support the SEW interface: narrow(), ### XVector::subseq(), XVector::xvcopy(), Biostrings::BStringSet() (and ### family), BSgenome::getSeq(), etc... ### .normargSEW <- function(x, argname) { if (!S4Vectors:::isNumericOrNAs(x)) stop("'", argname, "' must be a vector of integers") if (!is.integer(x)) x <- as.integer(x) x } ### Use of 'rep.refwidths=TRUE' is supported only when 'refwidths' is of ### length 1. ### If 'rep.refwidths=FALSE' (the default) then 'start', 'end' and 'width' ### are recycled to 'length(refwidths)' (it's an error if one of them is ### longer than 'refwidths'). Otherwise, 'refwidths' is replicated L times ### where L is the length of the longest of 'start', 'end' and 'width'. ### The returned value is an IRanges object of the same length as 'refwidths' ### (after replication if 'rep.refwidths=TRUE'). solveUserSEW <- function(refwidths, start=NA, end=NA, width=NA, rep.refwidths=FALSE, translate.negative.coord=TRUE, allow.nonnarrowing=FALSE) { if (!is.numeric(refwidths)) stop("'refwidths' must be a vector of integers") if (!is.integer(refwidths)) refwidths <- as.integer(refwidths) start <- .normargSEW(start, "start") end <- .normargSEW(end, "end") width <- .normargSEW(width, "width") ## From here, 'refwidths', 'start', 'end' and 'width' are guaranteed to be ## integer vectors. NAs in 'start', 'end' and 'width' are OK but not in ## 'refwidths' so this should be checked at the C level. if (!isTRUEorFALSE(rep.refwidths)) stop("'rep.refwidths' must be TRUE or FALSE") if (!isTRUEorFALSE(translate.negative.coord)) stop("'translate.negative.coord' must be TRUE or FALSE") if (!isTRUEorFALSE(allow.nonnarrowing)) stop("'allow.nonnarrowing' must be TRUE or FALSE") Lsew <- c(length(start), length(end), length(width)) maxLsew <- max(Lsew) minLsew <- min(Lsew) if (minLsew == 0L && maxLsew > 1L) stop("'start', 'end' and 'width' cannot mix zero-length ", "and longer-than-one vectors") ## Check 'start', 'end', and 'width' *without* recycling them. Recycling ## is done at the C level. if (rep.refwidths) { if (length(refwidths) != 1L) stop("'rep.refwidths=TRUE' can be used only when 'refwidths' ", "is of length 1") ## 'ans_len' is the length of the longest of 'start', 'end' ## and 'width'. if (minLsew == 0L) { ans_len <- 0L } else { ans_len <- maxLsew } refwidths <- rep.int(refwidths, ans_len) } else { ans_len <- length(refwidths) if (ans_len == 0L) { if (maxLsew > 1L) stop("'start', 'end' or 'width' is longer than 'refwidths'") } else { if (minLsew == 0L) stop("cannot recycle empty 'start', 'end' or 'width'") if (maxLsew > ans_len) stop("'start', 'end' or 'width' is longer than 'refwidths'") } } .Call2("solve_user_SEW", refwidths, start, end, width, translate.negative.coord, allow.nonnarrowing, PACKAGE="IRanges") } ### Returns an IRanges instance of length 1. Not exported. solveUserSEWForSingleSeq <- function(x_length, start=NA, end=NA, width=NA) { solved_SEW <- try(solveUserSEW(x_length, start=start, end=end, width=width), silent = TRUE) if (is(solved_SEW, "try-error")) stop("Invalid sequence coordinates.\n", " Please make sure the supplied 'start', 'end' and 'width' arguments\n", " are defining a region that is within the limits of the sequence.") solved_SEW } IRanges/R/IRanges-utils.R0000644000175400017540000001477213175713360016163 0ustar00biocbuildbiocbuild### ========================================================================= ### Utility functions for creating or modifying IRanges objects ### ------------------------------------------------------------------------- ### ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "successiveIRanges" function. ### ### Note that the returned IRanges object is guaranteed to be normal in the ### following cases: ### (a) when length(width) == 0 ### (b) when length(width) == 1 and width > 0 ### (c) when length(width) >= 2 and all(width > 0) and all(gapwidth > 0) ### However, the function doesn't try to turn the result into a NormalIRanges ### object. ### successiveIRanges <- function(width, gapwidth=0, from=1) { if (!is.numeric(width)) stop("'width' must be an integer vector") if (length(width) == 0L) return(IRanges()) if (!is.integer(width)) width <- as.integer(width) # this drops the names else if (!is.null(names(width))) names(width) <- NULL # unname() used to be broken on 0-length vectors if (S4Vectors:::anyMissingOrOutside(width, 0L)) stop("'width' cannot contain NAs or negative values") if (!is.numeric(gapwidth)) stop("'gapwidth' must be an integer vector") if (!is.integer(gapwidth)) gapwidth <- as.integer(gapwidth) if (S4Vectors:::anyMissing(gapwidth)) stop("'gapwidth' cannot contain NAs") if (length(gapwidth) != length(width) - 1L) { if (length(gapwidth) != 1L) stop("'gapwidth' must a single integer or an integer vector ", "with one less element than the 'width' vector") gapwidth <- rep.int(gapwidth, length(width) - 1L) } if (!isSingleNumber(from)) stop("'from' must be a single integer") if (!is.integer(from)) from <- as.integer(from) ans_start <- cumsum(width[-length(width)] + gapwidth) ans_start <- from + c(0L, ans_start) ## 'ans_start' could contain NAs in case of an integer overflow in ## cumsum(), hence the use of 'check=TRUE' here: new2("IRanges", start=ans_start, width=width, check=TRUE) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### slidingIRanges() ### slidingIRanges <- function(len, width, shift = 1L) { start <- seq(1L, len-width, by=shift) end <- seq(width, len, by=shift) IRanges(start, end) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### breakInChunks() ### ### TODO: Should not be in IRanges-utils.R because it returns a ### PartitioningByEnd object, not an IRanges object. So move it to another ### file, e.g. to Partitioning-class.R. breakInChunks() is actually a ### specialized PartitioningByEnd constructor. ### .normarg_totalsize <- function(totalsize) { if (!isSingleNumber(totalsize)) stop("'totalsize' must be a single integer") if (!is.integer(totalsize)) totalsize <- as.integer(totalsize) if (totalsize < 0L) stop("'totalsize' cannot be negative") totalsize } .normarg_chunksize_or_nchunk <- function(chunksize, totalsize, what) { if (!isSingleNumber(chunksize)) stop("'", what, "' must be a single integer") if (!is.integer(chunksize)) chunksize <- as.integer(chunksize) if (chunksize < 0L) stop("'", what, "' cannot be negative") if (chunksize == 0L && totalsize != 0L) stop("'", what, "' can be 0 only if 'totalsize' is 0") chunksize } ### TODO: Argument names and order is inconsistent with tileGenome(). ### Reconcile them! breakInChunks <- function(totalsize, chunksize, nchunk) { totalsize <- .normarg_totalsize(totalsize) if (!missing(chunksize)) { if (!missing(nchunk)) stop("only one of 'chunksize' and 'nchunk' can be specified") ## All chunks will have the requested size, except maybe the last one. chunksize <- .normarg_chunksize_or_nchunk(chunksize, totalsize, "chunksize") if (totalsize == 0L) return(PartitioningByEnd()) quot <- totalsize %/% chunksize # integer division ans_end <- cumsum(rep.int(chunksize, quot)) if (quot == 0L || ans_end[[quot]] != totalsize) ans_end <- c(ans_end, totalsize) } else { if (missing(nchunk)) stop("one of 'chunksize' and 'nchunk' must be specified") ## All chunks will have more or less the same size, with the difference ## between smallest and biggest chunks guaranteed to be <= 1. nchunk <- .normarg_chunksize_or_nchunk(nchunk, totalsize, "nchunk") if (nchunk == 0L) return(PartitioningByEnd()) chunksize <- totalsize / nchunk # floating point division ans_end <- as.integer(cumsum(rep.int(chunksize, nchunk))) ## The last value in 'ans_end' *should* be 'totalsize' but there is ## always some uncertainty about what coercing the result of a floating ## point operation to integer will produce. So we set this value ## manually to 'totalsize' just in case. ans_end[[nchunk]] <- totalsize } PartitioningByEnd(ans_end) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "centeredIRanges" function. ### centeredIRanges <- function(center, flank) { if (!is.numeric(center)) stop("'center' must be a numeric vector") if (!is.numeric(flank)) stop("'flank' must be a numeric vector") IRanges(start=center-flank, end=center+flank) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "whichAsIRanges" function. ### ### Note that unlike the standard which() function, whichAsIRanges() drops ### the names of 'x'. ### whichAsIRanges <- function(x) { if (!is.logical(x)) stop("'x' must be a logical vector") as(x, "NormalIRanges") } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercing an IRanges object to a NormalIRanges object. ### asNormalIRanges <- function(x, force=TRUE) { if (!is(x, "Ranges")) stop("'x' must be an Ranges object") else if (!is(x, "IRanges")) x <- as(x, "IRanges") if (!isTRUEorFALSE(force)) stop("'force' must be TRUE or FALSE") if (force) x <- reduce(x, drop.empty.ranges=TRUE) newNormalIRangesFromIRanges(x, check=!force) } .asNormalIRanges <- function(from) asNormalIRanges(from, force=TRUE) setAs("IRanges", "NormalIRanges", .asNormalIRanges) IRanges/R/List-class-leftovers.R0000644000175400017540000000667513175713360017525 0ustar00biocbuildbiocbuild### ========================================================================= ### IMPORTANT NOTE - 9/4/2014 ### Most of the stuff that used to be in the IRanges/R/List-class.R file ### was moved to the S4Vectors package (to R/List-class.R). ### The stuff that could not be moved there was *temporarily* kept here in ### List-class-leftovers.R but will need to find a new home (in S4Vectors ### or in IRanges). ### ## NOTE: while the 'c' function does not have an 'x', the generic does ## c() is a primitive, so 'x' can be missing; dispatch is by position, ## although sometimes this does not work so well, so it's best to keep ## names off the parameters whenever feasible. setMethod("c", "SimpleList", function(x, ..., recursive = FALSE) { slot(x, "listData") <- do.call(c, lapply(unname(list(x, ...)), as.list)) if (!is.null(mcols(x))) mcols(x) <- S4Vectors:::rbind_mcols(x, ...) x }) .stack.ind <- function(x, index.var = "name") { if (length(names(x)) > 0) { spaceLabels <- names(x) } else { spaceLabels <- seq_len(length(x)) } ind <- Rle(factor(spaceLabels, levels = unique(spaceLabels)), elementNROWS(x)) do.call(DataFrame, structure(list(ind), names = index.var)) } ### FIXME: need a recursive argument, when TRUE we call stack on ### unlist result, instead of coercing to DataFrame. setMethod("stack", "List", function(x, index.var = "name", value.var = "value", name.var = NULL) { value <- unlist(x, use.names=FALSE) index <- .stack.ind(x, index.var) unlistsToVector <- is(value, "Vector") if (unlistsToVector) { df <- cbind(index, S4Vectors:::ensureMcols(unname(value))) } else { df <- DataFrame(index, as(unname(value), "DataFrame")) colnames(df)[2] <- value.var } if (!is.null(name.var)) { nms <- as.character(unlist(lapply(x, names))) if (length(nms) == 0L) { rngs <- IRanges(1L, width=elementNROWS(x)) nms <- as.integer(rngs) } else { nms <- factor(nms, unique(nms)) } df[[name.var]] <- nms df <- df[c(index.var, name.var, value.var)] } if (!is.null(mcols(x))) { df <- cbind(df, mcols(x)[togroup(PartitioningByEnd(x)),,drop=FALSE]) } if (unlistsToVector) { mcols(value) <- df value } else { df } }) setMethod("stack", "matrix", function(x, row.var = names(dimnames(x))[1L], col.var = names(dimnames(x))[2L]) { l <- x attributes(l) <- NULL lens <- elementNROWS(l) rn <- rownames(x) if (is.null(rn)) rn <- seq_along(nrow(x)) cn <- colnames(x) if (is.null(cn)) cn <- seq_along(ncol(x)) ans <- DataFrame(row=rep(rn[row(x)], lens), col=rep(Rle(cn, rep(nrow(x), ncol(x))), lens), stack(List(l))) if (is.null(row.var)) row.var <- "row" if (is.null(col.var)) col.var <- "col" colnames(ans) <- c(row.var, col.var) ans }) IRanges/R/ListGrouping-class.R0000644000175400017540000001141113175713360017211 0ustar00biocbuildbiocbuild### ========================================================================= ### Grouping objects implemented with an IntegerList ### ------------------------------------------------------------------------- setClass("SimpleGrouping", ### TODO: contain VIRTUAL after R 3.4 release contains=c("Grouping", "SimpleIntegerList")) setClass("CompressedGrouping", ### TODO: contain VIRTUAL after R 3.4 release contains=c("Grouping", "CompressedIntegerList")) setClass("SimpleManyToOneGrouping", contains=c("ManyToOneGrouping", "SimpleGrouping")) setClass("CompressedManyToOneGrouping", contains=c("ManyToOneGrouping", "CompressedGrouping")) setClass("BaseManyToManyGrouping", representation(nobj="integer"), ### TODO: contain VIRTUAL after R 3.4 release contains="ManyToManyGrouping", validity=function(object) { if (!isSingleNumber(object@nobj)) "'nobj' must be a single, non-NA number" }) setClass("SimpleManyToManyGrouping", contains=c("BaseManyToManyGrouping", "SimpleGrouping")) setClass("CompressedManyToManyGrouping", contains=c("BaseManyToManyGrouping", "CompressedGrouping")) ### ------------------------------------------------------------------------- ### Grouping API implementation ### ---------------------------- ### setMethod("grouplengths", "CompressedGrouping", function(x, i=NULL) grouplengths(PartitioningByEnd(x), i)) setMethod("nobj", "CompressedManyToOneGrouping", function(x) nobj(PartitioningByEnd(x))) setMethod("nobj", "BaseManyToManyGrouping", function(x) x@nobj) ### ------------------------------------------------------------------------- ### Constructors ### ---------------------------- ### ManyToOneGrouping <- function(..., compress=TRUE) { CompressedOrSimple <- if (compress) "Compressed" else "Simple" Class <- paste0(CompressedOrSimple, "ManyToOneGrouping") new(Class, IntegerList(..., compress=compress)) } ManyToManyGrouping <- function(..., nobj, compress=TRUE) { CompressedOrSimple <- if (compress) "Compressed" else "Simple" Class <- paste0(CompressedOrSimple, "ManyToManyGrouping") new(Class, IntegerList(..., compress=compress), nobj=nobj) } ### ------------------------------------------------------------------------- ### Coercion ### ---------------------------- ### setOldClass("grouping") ## utils::relist dipatches only on 'skeleton' so this is here instead of in R setMethod("relist", c("grouping", "missing"), function(flesh, skeleton) { relist(as.integer(flesh), PartitioningByEnd(attr(flesh, "ends"))) }) setMethod("split", c("ANY", "ManyToOneGrouping"), function(x, f, drop=FALSE) { stopifnot(isTRUEorFALSE(drop)) ans <- extractList(x, f) if (drop) { ans <- ans[lengths(ans) > 0L] } ans }) setAs("grouping", "Grouping", function(from) { as(from, "ManyToOneGrouping") }) setAs("grouping", "ManyToOneGrouping", function(from) { ManyToOneGrouping(relist(from), compress=TRUE) }) setAs("vector", "Grouping", function(from) { if (anyNA(from)) as(from, "ManyToManyGrouping") else as(from, "ManyToOneGrouping") }) setAs("vector", "ManyToOneGrouping", function(from) { to <- as(grouping(from), "Grouping") names(to) <- from[unlist(to)[end(PartitioningByEnd(to))]] to }) setAs("factor", "ManyToOneGrouping", function(from) { ManyToOneGrouping(splitAsList(seq_along(from), from)) }) setAs("vector", "ManyToManyGrouping", function(from) { g <- as(from, "ManyToOneGrouping") if (anyNA(from)) g <- g[-length(g)] ManyToManyGrouping(g, nobj=length(from)) }) setAs("ManyToOneGrouping", "factor", function(from) { levels <- if (!is.null(names(from))) { names(from) } else { as.character(seq_along(from)) } structure(togroup(from), levels=levels, class="factor") }) setMethod("as.factor", "ManyToOneGrouping", function(x) { as(x, "factor") }) makeGroupNames <- function(x) { if (is.null(x)) { x <- character(length(x)) } ind <- which(x == "") x[ind] <- paste("Group", ind, sep = ".") x } levelCols <- function(by) { DataFrame(expand.grid(lapply(by, levels))) } setAs("FactorList", "Grouping", function(from) { l <- as.list(from) names(l) <- makeGroupNames(names(from)) as(DataFrame(l), "Grouping") }) setAs("DataFrame", "Grouping", function(from) { factors <- lapply(from, as.factor) l <- splitAsList(seq_len(nrow(from)), factors) mcols(l) <- levelCols(factors) if (anyNA(from, recursive=TRUE)) { ManyToManyGrouping(l, nobj=nrow(from)) } else { ManyToOneGrouping(l) } }) IRanges/R/MaskCollection-class.R0000644000175400017540000003004313175713360017474 0ustar00biocbuildbiocbuild### ========================================================================= ### MaskCollection objects ### ------------------------------------------------------------------------- setClass("MaskCollection", contains="RangesList", representation( nir_list="list", # a list of NormalIRanges objects width="integer", active="logical", NAMES="character", # R doesn't like @names !! desc="character" ), prototype( nir_list=list(), width=0L, active=logical(0), NAMES=as.character(NA), desc=as.character(NA) ) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "length" and accessor methods. ### setGeneric("nir_list", function(x) standardGeneric("nir_list")) setMethod("nir_list", "MaskCollection", function(x) x@nir_list) setMethod("length", "MaskCollection", function(x) length(nir_list(x))) setMethod("width", "MaskCollection", function(x) x@width) setMethod("active", "MaskCollection", function(x) { ans <- x@active names(ans) <- names(x) ans } ) setReplaceMethod("active", "MaskCollection", function(x, value) { if (!is.logical(value) || S4Vectors:::anyMissing(value)) stop("'value' must be a logical vector with no NAs") x@active[] <- value x } ) setMethod("names", "MaskCollection", function(x) if (length(x@NAMES) == 1 && is.na(x@NAMES)) NULL else x@NAMES ) setReplaceMethod("names", "MaskCollection", function(x, value) { if (is.null(value)) { x@NAMES <- NA_character_ return(x) } value <- as.character(value) ii <- is.na(value) if (any(ii)) value[ii] <- "" if (length(value) > length(x)) stop("too many names") if (length(value) < length(x)) value <- c(value, character(length(x) - length(value))) x@NAMES <- value x } ) setGeneric("desc", function(x) standardGeneric("desc")) setMethod("desc", "MaskCollection", function(x) if (length(x@desc) == 1 && is.na(x@desc)) NULL else x@desc ) setGeneric("desc<-", signature="x", function(x, value) standardGeneric("desc<-") ) setReplaceMethod("desc", "MaskCollection", function(x, value) { if (is.null(value)) { x@desc <- as.character(NA) return(x) } if (!is.character(value)) stop("'value' must be NULL or a character vector") ii <- is.na(value) if (any(ii)) value[ii] <- "" if (length(value) > length(x)) stop("too many names") if (length(value) < length(x)) value <- c(value, character(length(x) - length(value))) x@desc <- value x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Validity. ### .valid.MaskCollection.width <- function(x) { if (!isSingleInteger(width(x)) || width(x) < 0) return("the width of the collection must be a single non-negative integer") NULL } .valid.MaskCollection.nir_list <- function(x) { if (!is.list(nir_list(x)) || !all(sapply(nir_list(x), function(nir) is(nir, "NormalIRanges")))) return("the 'nir_list' slot must contain a list of NormalIRanges objects") if (!all(1 <= min(x)) || !all(max(x) <= width(x))) return("the min and max of the masks must be >= 1 and <= width of the collection") NULL } .valid.MaskCollection.active <- function(x) { if (!is.logical(active(x)) || S4Vectors:::anyMissing(active(x))) return("the 'active' slot must be a logical vector with no NAs") if (length(active(x)) != length(x)) return("the length of the 'active' slot differs from the length of the object") NULL } .valid.MaskCollection.names <- function(x) { if (S4Vectors:::anyMissing(names(x))) return("the names must be non-NA strings") NULL } .valid.MaskCollection.desc <- function(x) { if (!is.character(x@desc)) return("the 'desc' slot must contain a character vector") if (is.null(desc(x))) return(NULL) if (S4Vectors:::anyMissing(desc(x))) return("the descriptions must be non-NA strings") if (length(desc(x)) != length(x)) return("number of descriptions and number of elements differ") NULL } .valid.MaskCollection <- function(x) { ## The 'width' slot needs to be checked separately and we must return ## if it's invalid. This is because .valid.MaskCollection.nir_list() ## won't work properly if 'x@width' is NA. problems <- .valid.MaskCollection.width(x) if (!is.null(problems)) return(problems) c(.valid.MaskCollection.nir_list(x), .valid.MaskCollection.active(x), .valid.MaskCollection.names(x), .valid.MaskCollection.desc(x)) } setValidity2("MaskCollection", .valid.MaskCollection) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The safe and user-friendly "Mask" constructor. ### Mask <- function(mask.width, start=NULL, end=NULL, width=NULL) { nir <- asNormalIRanges(IRanges(start=start, end=end, width=width), force=FALSE) new2("MaskCollection", nir_list=list(nir), width=S4Vectors:::numeric2integer(mask.width), active=TRUE, check=FALSE) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "max" and "min" methods. ### setMethod("max", "MaskCollection", function(x, ..., na.rm) { if (length(x) == 0) return(integer(0)) sapply(nir_list(x), max) } ) setMethod("min", "MaskCollection", function(x, ..., na.rm) { if (length(x) == 0) return(integer(0)) sapply(nir_list(x), min) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "maskedwidth" and "maskedratio" generics and methods. ### setGeneric("maskedwidth", function(x) standardGeneric("maskedwidth")) setMethod("maskedwidth", "MaskCollection", function(x) { nir_list <- nir_list(x) if (length(nir_list) == 0) integer(0) else sapply(nir_list, function(nir) sum(width(nir))) } ) setGeneric("maskedratio", function(x) standardGeneric("maskedratio")) setMethod("maskedratio", "MaskCollection", function(x) maskedwidth(x) / width(x)) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Subsetting. ### setMethod("getListElement", "MaskCollection", function(x, i, exact=TRUE) { i <- normalizeDoubleBracketSubscript(i, x, exact=exact, error.if.nomatch=TRUE) nir_list(x)[[i]] } ) ### Always behaves like an endomorphism (i.e. ignores the 'drop' argument and ### behaves like if it was actually set to FALSE). setMethod("extractROWS", "MaskCollection", function(x, i) { i <- normalizeSingleBracketSubscript(i, x, as.NSBS=TRUE) if (anyDuplicated(i)) stop("subscript would generate duplicated elements") slot(x, "nir_list", check=FALSE) <- extractROWS(nir_list(x), i) slot(x, "active", check=FALSE) <- extractROWS(active(x), i) if (!is.null(names(x))) slot(x, "NAMES", check=FALSE) <- extractROWS(names(x), i) if (!is.null(desc(x))) slot(x, "desc", check=FALSE) <- extractROWS(desc(x), i) mcols(x) <- extractROWS(mcols(x), i) x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "append" method. ### ### TODO: Be more consistent with "[" which doesn't allow subscripts with ### duplicated positive values in order to make it harder for the user to ### produce a MaskCollection object with duplicated names. ### The "append" method below makes this too easy (with append(x, x)). ### .append.names.or.desc <- function(nm1, l1, nm2, l2, after) { if (is.null(nm1) && is.null(nm2)) return(as.character(NA)) if (is.null(nm1)) nm1 <- rep.int("", l1) if (is.null(nm2)) nm2 <- rep.int("", l2) append(nm1, nm2, after=after) } setMethod("append", c("MaskCollection", "MaskCollection"), function(x, values, after=length(x)) { if (width(values) != width(x)) stop("'x' and 'values' must have the same width") if (!isSingleNumber(after)) stop("'after' must be a single number") if (length(values) == 0) return(x) ans_nir_list <- append(nir_list(x), nir_list(values), after=after) ans_active <- append(active(x), active(values), after=after) l1 <- length(x) l2 <- length(values) ans_NAMES <- .append.names.or.desc(names(x), l1, names(values), l2, after) ans_desc <- .append.names.or.desc(desc(x), l1, desc(values), l2, after) ## This transformation must be atomic. x@nir_list <- ans_nir_list x@active <- ans_active x@NAMES <- ans_NAMES x@desc <- ans_desc x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### collapse() ### setGeneric("collapse", function(x) standardGeneric("collapse")) ### Always return a MaskCollection object of length 1 where the mask is active. setMethod("collapse", "MaskCollection", function(x) { keep_it <- active(x) if (!all(keep_it)) x <- x[keep_it] if (length(x) == 1) return(x) nir_list <- nir_list(x) if (length(nir_list) == 0) { nir1 <- new("NormalIRanges") } else { start1 <- unlist(lapply(nir_list, start)) width1 <- unlist(lapply(nir_list, width)) ranges <- new2("IRanges", start=start1, width=width1, check=FALSE) nir1 <- asNormalIRanges(ranges, force=TRUE) } ## This transformation must be atomic. x@nir_list <- list(nir1) x@active <- TRUE x@NAMES <- as.character(NA) x@desc <- as.character(NA) x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion. ### ### From a MaskCollection object to a NormalIRanges object. setAs("MaskCollection", "NormalIRanges", function(from) collapse(from)[[1L]] ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "show" method. ### MaskCollection.show_frame <- function(x) { lx <- length(x) cat("masks:") if (lx == 0) { cat(" NONE\n") } else { cat("\n") ## Explictely specify 'row.names=NULL' otherwise data.frame() will ## try to use the names of the first component that has suitable ## names, which could be 'active(x)' (3rd component) if 'x' has names. frame <- data.frame(maskedwidth=maskedwidth(x), maskedratio=maskedratio(x), active=active(x), row.names=NULL, check.names=FALSE) frame$names <- names(x) frame$desc <- desc(x) show(frame) if (lx >= 2) { margin <- format("", width=nchar(as.character(lx))) cat("all masks together:\n") mask0 <- collapse(`active<-`(x, TRUE)) frame <- data.frame(maskedwidth=maskedwidth(mask0), maskedratio=maskedratio(mask0), check.names=FALSE) row.names(frame) <- margin show(frame) if (sum(active(x)) < lx) { cat("all active masks together:\n") mask1 <- collapse(x) frame <- data.frame(maskedwidth=maskedwidth(mask1), maskedratio=maskedratio(mask1), check.names=FALSE) row.names(frame) <- margin show(frame) } } } } setMethod("show", "MaskCollection", function(object) { lo <- length(object) cat(class(object), " of length ", lo, " and width ", width(object), "\n", sep="") MaskCollection.show_frame(object) } ) IRanges/R/NCList-class.R0000644000175400017540000003543513175713360015733 0ustar00biocbuildbiocbuild### ========================================================================= ### NCList and NCLists objects ### ------------------------------------------------------------------------- ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### An S4 implementation of Nested Containment List (NCList). ### ### We deliberately do NOT extend IRanges. setClass("NCList", contains="Ranges", representation( nclist="integer", ranges="IRanges" ) ) setMethod("length", "NCList", function(x) length(x@ranges)) setMethod("names", "NCList", function(x) names(x@ranges)) setMethod("start", "NCList", function(x, ...) start(x@ranges)) setMethod("end", "NCList", function(x, ...) end(x@ranges)) setMethod("width", "NCList", function(x) width(x@ranges)) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### .shift_ranges_to_first_circle() and ### .shift_ranges_in_groups_to_first_circle() ### ### TODO: Move to intra-range-methods.R, rename (e.g. shiftToFirstCircle()), ### make it a generic with methods for IRanges and IRangesList, export, and ### document. ### ### Returns a single integer. .normarg_circle.length1 <- function(circle.length) { msg <- "'circle.length' must be a single positive integer or NA" if (!isSingleNumberOrNA(circle.length)) stop(msg) if (!is.integer(circle.length)) circle.length <- as.integer(circle.length) if (!is.na(circle.length) && circle.length <= 0L) stop(msg) circle.length } ### Returns an integer vector of length 'x_len'. .normarg_circle.length2 <- function(circle.length, x_len, what) { msg <- c("'circle.length' must be an integer vector ", "with positive or NA values") if (!is.atomic(circle.length)) stop(msg) if (!(length(circle.length) == 1L || length(circle.length) == x_len)) stop("'circle.length' must have length 1 or length of ", what) all_NAs <- all(is.na(circle.length)) if (!(all_NAs || is.numeric(circle.length))) stop(msg) if (!is.integer(circle.length)) circle.length <- as.integer(circle.length) if (!all_NAs && min(circle.length, na.rm=TRUE) <= 0L) stop(msg) if (length(circle.length) == x_len) return(circle.length) rep.int(circle.length, x_len) } ### 'circle.length' assumed to have length 1 or length of 'x'. .shift_ranges_to_first_circle <- function(x, circle.length) { if (all(is.na(circle.length))) return(x) x_start0 <- start(x) - 1L # 0-based start x_shift0 <- x_start0 %% circle.length - x_start0 x_shift0[is.na(x_shift0)] <- 0L shift(x, x_shift0) } ### 'length(circle.length)' assumed to be >= 'length(x_groups)'. .shift_ranges_in_groups_to_first_circle <- function(x, x_groups, circle.length) { circle.length <- head(circle.length, n=length(x_groups)) if (all(is.na(circle.length))) return(x) unlisted_groups <- unlist(x_groups, use.names=FALSE) circle_len <- rep.int(NA_integer_, length(x)) circle_len[unlisted_groups + 1L] <- rep.int(circle.length, elementNROWS(x_groups)) .shift_ranges_to_first_circle(x, circle_len) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### NCList constructor ### ### Returns an external pointer to the NCList C struct. .NCList_xp <- function(x_start, x_end, x_subset) { ans <- .Call2("NCList_new", PACKAGE="IRanges") reg.finalizer(ans, function(e) .Call("NCList_free", e, PACKAGE="IRanges") ) .Call2("NCList_build", ans, x_start, x_end, x_subset, PACKAGE="IRanges") } .nclist <- function(x_start, x_end, x_subset=NULL) { nclist_xp <- .NCList_xp(x_start, x_end, x_subset) .Call2("new_NCListAsINTSXP_from_NCList", nclist_xp, PACKAGE="IRanges") } NCList <- function(x, circle.length=NA_integer_) { if (!is(x, "Ranges")) stop("'x' must be a Ranges object") if (!is(x, "IRanges")) x <- as(x, "IRanges") ans_mcols <- mcols(x) mcols(x) <- NULL circle.length <- .normarg_circle.length1(circle.length) x <- .shift_ranges_to_first_circle(x, circle.length) x_nclist <- .nclist(start(x), end(x)) new2("NCList", nclist=x_nclist, ranges=x, elementMetadata=ans_mcols, check=FALSE) } setAs("Ranges", "NCList", function(from) NCList(from)) ### NOT exported. print_NCList <- function(x) { if (!is(x, "NCList")) stop("'x' must be an NCList object") .Call2("NCListAsINTSXP_print", x@nclist, start(x@ranges), end(x@ranges), PACKAGE="IRanges") invisible(NULL) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### findOverlaps_NCList() ### ### NOT exported. findOverlaps_NCList <- function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "extend", "equal"), select=c("all", "first", "last", "arbitrary", "count"), circle.length=NA_integer_) { if (!(is(query, "Ranges") && is(subject, "Ranges"))) stop("'query' and 'subject' must be Ranges objects") if (!isSingleNumber(maxgap)) stop("'maxgap' must be a single integer") if (!is.integer(maxgap)) maxgap <- as.integer(maxgap) if (!isSingleNumber(minoverlap)) stop("'minoverlap' must be a single integer") if (!is.integer(minoverlap)) minoverlap <- as.integer(minoverlap) type <- match.arg(type) select <- match.arg(select) circle.length <- .normarg_circle.length1(circle.length) if (is(subject, "NCList")) { nclist <- subject@nclist nclist_is_q <- FALSE query <- .shift_ranges_to_first_circle(query, circle.length) } else if (is(query, "NCList")) { nclist <- query@nclist nclist_is_q <- TRUE subject <- .shift_ranges_to_first_circle(subject, circle.length) } else { ## We'll do "on-the-fly preprocessing". nclist <- NULL nclist_is_q <- NA query <- .shift_ranges_to_first_circle(query, circle.length) subject <- .shift_ranges_to_first_circle(subject, circle.length) } .Call2("NCList_find_overlaps", start(query), end(query), start(subject), end(subject), nclist, nclist_is_q, maxgap, minoverlap, type, select, circle.length, PACKAGE="IRanges") } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Representation of a list of NCList objects ### setClass("NCLists", contains="RangesList", representation( nclists="list", rglist="CompressedIRangesList" ), prototype( elementType="NCList" ) ) setMethod("parallelSlotNames", "NCLists", function(x) c("nclists", "rglist", callNextMethod()) ) ### TODO: Move rglist() generic from GenomicRanges to IRanges #setMethod("rglist", "NCLists", function(x, ...) x@rglist) setMethod("ranges", "NCLists", function(x, use.names=TRUE, use.mcols=FALSE) { if (!isTRUEorFALSE(use.names)) stop("'use.names' must be TRUE or FALSE") if (!isTRUEorFALSE(use.mcols)) stop("'use.mcols' must be TRUE or FALSE") ans <- x@rglist if (!use.names) names(ans) <- NULL if (use.mcols) mcols(ans) <- mcols(x) ans } ) setMethod("length", "NCLists", function(x) length(x@rglist)) setMethod("names", "NCLists", function(x) names(x@rglist)) setMethod("start", "NCLists", function(x, ...) start(x@rglist)) setMethod("end", "NCLists", function(x, ...) end(x@rglist)) setMethod("width", "NCLists", function(x) width(x@rglist)) setMethod("elementNROWS", "NCLists", function(x) elementNROWS(x@rglist)) setMethod("getListElement", "NCLists", function (x, i, exact=TRUE) { i <- normalizeDoubleBracketSubscript(i, x, exact=exact, error.if.nomatch=TRUE) new2("NCList", nclist=x@nclists[[i]], ranges=x@rglist[[i]], check=FALSE) } ) setAs("NCLists", "CompressedIRangesList", function(from) ranges(from, use.mcols=TRUE) ) setAs("NCLists", "IRangesList", function(from) ranges(from, use.mcols=TRUE) ) .extract_groups_from_RangesList <- function(x) { x_partitioning <- PartitioningByEnd(x) relist(as.integer(x_partitioning) - 1L, x_partitioning) } .nclists <- function(x, x_groups) { x_start <- start(x) x_end <- end(x) lapply(x_groups, function(group) .nclist(x_start, x_end, x_subset=group)) } ### NCLists constructor. NCLists <- function(x, circle.length=NA_integer_) { if (!is(x, "RangesList")) stop("'x' must be a RangesList object") if (!is(x, "CompressedIRangesList")) x <- as(x, "CompressedIRangesList") ans_mcols <- mcols(x) mcols(x) <- NULL unlisted_x <- unlist(x, use.names=FALSE) x_groups <- .extract_groups_from_RangesList(x) circle.length <- .normarg_circle.length2(circle.length, length(x_groups), "'x'") unlisted_x <- .shift_ranges_in_groups_to_first_circle( unlisted_x, x_groups, circle.length) x <- relist(unlisted_x, x) x_nclists <- .nclists(unlisted_x, x_groups) new2("NCLists", nclists=x_nclists, rglist=x, elementMetadata=ans_mcols, check=FALSE) } setAs("RangesList", "NCLists", function(from) NCLists(from)) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### NCList_find_overlaps_in_groups() ### ### NOT exported. Workhorse behind findOverlaps_NCLists() below and behind ### GenomicRanges:::findOverlaps_GNCList(). NCList_find_overlaps_in_groups <- function( q, q_space, q_groups, s, s_space, s_groups, nclists, nclist_is_q, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "extend", "equal"), select=c("all", "first", "last", "arbitrary", "count"), circle.length) { if (!(is(q, "Ranges") && is(s, "Ranges"))) stop("'q' and 's' must be Ranges object") if (!is(q_groups, "CompressedIntegerList")) stop("'q_groups' must be a CompressedIntegerList object") if (!is(s_groups, "CompressedIntegerList")) stop("'s_groups' must be a CompressedIntegerList object") if (!isSingleNumber(maxgap)) stop("'maxgap' must be a single integer") if (!is.integer(maxgap)) maxgap <- as.integer(maxgap) if (!isSingleNumber(minoverlap)) stop("'minoverlap' must be a single integer") if (!is.integer(minoverlap)) minoverlap <- as.integer(minoverlap) type <- match.arg(type) select <- match.arg(select) q_circle_len <- circle.length q_circle_len[which(nclist_is_q)] <- NA_integer_ q <- .shift_ranges_in_groups_to_first_circle(q, q_groups, q_circle_len) s_circle_len <- circle.length s_circle_len[which(!nclist_is_q)] <- NA_integer_ s <- .shift_ranges_in_groups_to_first_circle(s, s_groups, s_circle_len) .Call2("NCList_find_overlaps_in_groups", start(q), end(q), q_space, q_groups, start(s), end(s), s_space, s_groups, nclists, nclist_is_q, maxgap, minoverlap, type, select, circle.length, PACKAGE="IRanges") } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### findOverlaps_NCLists() ### .split_and_remap_hits <- function(all_hits, query, subject, select) { ## Compute list element lengths and offsets for 'query'. query_partitioning <- PartitioningByEnd(query) query_eltNROWS <- width(query_partitioning) query_offsets <- start(query_partitioning) - 1L ## Compute list element lengths and offsets for 'subject'. subject_partitioning <- PartitioningByEnd(subject) subject_eltNROWS <- width(subject_partitioning) subject_offsets <- start(subject_partitioning) - 1L if (select != "all") { ans <- head(relist(all_hits, query), n=length(subject)) if (select != "count") ans <- ans - head(subject_offsets, n=length(ans)) return(ans) } q_hits <- queryHits(all_hits) query_breakpoints <- end(query_partitioning) h_skeleton <- PartitioningByEnd(findInterval(query_breakpoints, q_hits)) lapply(seq_len(min(length(query), length(subject))), function(i) { hits <- all_hits[h_skeleton[[i]]] hits@from <- hits@from - query_offsets[[i]] hits@to <- hits@to - subject_offsets[[i]] hits@nLnode <- query_eltNROWS[[i]] hits@nRnode <- subject_eltNROWS[[i]] hits }) } ### NOT exported. ### Return an ordinary list of: ### (a) SortedByQueryHits objects if 'select' is "all". In that case the ### list has the length of the shortest of 'query' or 'subject'. ### (b) integer vectors if 'select' is not "all". In that case the list is ### parallel to and has the same shape as 'query'. findOverlaps_NCLists <- function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "extend", "equal"), select=c("all", "first", "last", "arbitrary", "count"), circle.length=NA_integer_) { if (!(is(query, "RangesList") && is(subject, "RangesList"))) stop("'query' and 'subject' must be RangesList objects") type <- match.arg(type) select <- match.arg(select) circle.length <- .normarg_circle.length2(circle.length, max(length(query), length(subject)), "longest of 'query' or 'subject'") if (is(subject, "NCLists")) { nclists <- subject@nclists nclist_is_q <- rep.int(FALSE, length(nclists)) subject <- subject@rglist } else if (is(query, "NCLists")) { nclists <- query@nclists nclist_is_q <- rep.int(TRUE, length(nclists)) query <- query@rglist } else { ## We'll do "on-the-fly preprocessing". NG <- min(length(query), length(subject)) nclists <- vector(mode="list", length=NG) nclist_is_q <- rep.int(NA, length(nclists)) } if (!is(query, "CompressedIRangesList")) query <- as(query, "CompressedIRangesList") q <- unlist(query, use.names=FALSE) q_groups <- .extract_groups_from_RangesList(query) if (!is(subject, "CompressedIRangesList")) subject <- as(subject, "CompressedIRangesList") s <- unlist(subject, use.names=FALSE) s_groups <- .extract_groups_from_RangesList(subject) all_hits <- NCList_find_overlaps_in_groups( q, NULL, q_groups, s, NULL, s_groups, nclists, nclist_is_q, maxgap, minoverlap, type, select, circle.length) .split_and_remap_hits(all_hits, query, subject, select) } IRanges/R/RangedData-class.R0000644000175400017540000010012113175713360016552 0ustar00biocbuildbiocbuild### ========================================================================= ### RangedData objects ### ------------------------------------------------------------------------- ## For keeping data with your ranges ## There are two design aims: ## 1) Efficiency when data is large (i.e. apply by chromosome) ## 2) Convenience when data is not so large (i.e. unrolling the data) ## The ranges are stored in a RangesList, while the data is stored in ## a SplitDataFrameList. The RangesList is uncompressed, because ## users will likely want to apply over each Ranges separately, as ## they are usually in separate spaces. Also, it is difficult to ## compress RangesLists, as lists containing Views or NCLists ## are uncompressible. The SplitDataFrameList should be compressed, ## because it's cheap to create from a split factor and, more ## importantly, cheap to get and set columns along the entire dataset, ## which is common. Usually the data columns are atomic vectors and ## thus trivially compressed. It does, however, incur a slight ## performance penalty when applying over the RangedData. setClass("RangedData", contains = c("DataTable", "List"), representation(ranges = "RangesList", values = "SplitDataFrameList"), prototype = prototype(ranges = new("SimpleRangesList"), values = new("CompressedSplitDataFrameList"))) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Accessor methods. ### setMethod("values", "RangedData", function(x) x@values) setReplaceMethod("values", "RangedData", function(x, value) { if (extends(class(value), "SplitDataFrameList")) { if (!identical(elementNROWS(values(x)), elementNROWS(value))) stop("'value' must have same elementNROWS ", "as current 'values'") } else if (extends(class(value), "DataFrame")) { value <- split(value, space(x)) } else { stop("'value' must extend class SplitDataFrameList or DataFrame") } if (is.null(rownames(value)) && !is.null(rownames(x))) rownames(value) <- rownames(x) else if (!identical(rownames(value), rownames(values(x)))) stop("rownames of 'value', if non-NULL, must match the ", "rownames of 'x'") x@values <- value x }) setMethod("ranges", "RangedData", function(x, use.names=TRUE, use.mcols=FALSE) x@ranges ) setReplaceMethod("ranges", "RangedData", function(x, value) { if (extends(class(value), "RangesList")) { if (!identical(lapply(ranges(x), names), lapply(value, names))) stop("'value' must have same length and names as current 'ranges'") } else if (extends(class(value), "IRanges")) { value <- split(value, space(x)) } else { stop("'value' must extend class RangesList or IRanges") } x@ranges <- value x }) ## range delegates setMethod("start", "RangedData", function(x) { start(unlist(ranges(x), use.names=FALSE)) }) setMethod("end", "RangedData", function(x) { end(unlist(ranges(x), use.names=FALSE)) }) setMethod("width", "RangedData", function(x) { width(unlist(ranges(x), use.names=FALSE)) }) setReplaceMethod("start", "RangedData", function(x, ..., value) { start(ranges(x), ...) <- value x }) setReplaceMethod("end", "RangedData", function(x, ..., value) { end(ranges(x), ...) <- value x }) setReplaceMethod("width", "RangedData", function(x, ..., value) { width(ranges(x), ...) <- value x }) setMethod("length", "RangedData", function(x) length(ranges(x))) setMethod("names", "RangedData", function(x) names(ranges(x))) setReplaceMethod("names", "RangedData", function(x, value) { if (!is.null(value) && !is.character(value)) stop("'value' must be NULL or a character vector") names(x@ranges) <- value names(x@values) <- value x }) setMethod("elementNROWS", "RangedData", function(x) elementNROWS(ranges(x))) setMethod("space", "RangedData", function(x) space(ranges(x))) setMethod("universe", "RangedData", function(x) universe(ranges(x))) setReplaceMethod("universe", "RangedData", function(x, value) { universe(x@ranges) <- value x }) setMethod("score", "RangedData", function(x) { score <- x[["score"]] ## if (is.null(score) && ncol(x) > 0 && is.numeric(x[[1L]])) ## score <- x[[1L]] score }) setReplaceMethod("score", "RangedData", function(x, value) { if (!is.numeric(value)) stop("score must be numeric") if (length(value) != nrow(x)) stop("number of scores must equal the number of rows") x[["score"]] <- value x }) ## values delegates setMethod("nrow", "RangedData", function(x) { sum(nrow(values(x))) }) setMethod("ncol", "RangedData", function(x) { ncol(values(x))[[1L]] }) setMethod("rownames", "RangedData", function(x, do.NULL = TRUE, prefix = "row") { rn <- unlist(rownames(values(x), do.NULL = do.NULL, prefix = prefix), use.names=FALSE) if (length(rn) == 0) rn <- NULL rn }) setMethod("colnames", "RangedData", function(x, do.NULL = TRUE, prefix = "col") { if (length(x) == 0) character() else colnames(values(x), do.NULL = do.NULL, prefix = prefix)[[1L]] }) setReplaceMethod("rownames", "RangedData", function(x, value) { if (!is.null(value)) { if (length(value) != nrow(x)) { stop("invalid 'row.names' length") } else { if (!is.character(value)) value <- as.character(value) ends <- cumsum(elementNROWS(x)) value <- new("CompressedCharacterList", unlistData = value, partitioning = PartitioningByEnd(ends)) } } ranges <- ranges(x) for(i in seq_len(length(ranges))) { names(ranges[[i]]) <- value[[i]] } x@ranges <- ranges rownames(x@values) <- value x }) setReplaceMethod("colnames", "RangedData", function(x, value) { colnames(x@values) <- value x }) setMethod("columnMetadata", "RangedData", function(x) { columnMetadata(values(x)) }) setReplaceMethod("columnMetadata", "RangedData", function(x, value) { columnMetadata(values(x)) <- value x }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Validity. ### .valid.RangedData.ranges <- function(x) { if (!identical(lapply(ranges(x), length), lapply(values(x), nrow))) "'ranges' and 'values' must be of the same length and have the same names" else if (!identical(unlist(lapply(ranges(x), names), use.names=FALSE), rownames(x))) "the names of the ranges must equal the rownames" else NULL } .valid.RangedData.names <- function(x) { nms <- names(x) if (length(nms) != length(x)) "length(names(x)) must equal length(x)" else if (!is.character(nms) || S4Vectors:::anyMissing(nms) || anyDuplicated(nms)) "names(x) must be a character vector without any NA's or duplicates" else NULL } .valid.RangedData <- function(x) { c(.valid.RangedData.ranges(x), .valid.RangedData.names(x)) } setValidity2("RangedData", .valid.RangedData) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructor. ### ## creates a single-element RangedData (unless splitter (space) is specified) RangedData <- function(ranges = IRanges(), ..., space = NULL, universe = NULL) { hasDots <- (((nargs() - !missing(space)) - !missing(universe)) > 1) if (!is.null(universe)) { msg <- wmsg("The 'universe' argument of the RangedData() ", "constructor function is deprecated.") .Deprecated(msg=msg) if (!isSingleString(universe)) stop("'universe' must be a single string") } if (is(ranges, "RangesList")) { if (!is.null(space)) warning("since 'class(ranges)' extends RangesList, 'space' argument is ignored") if (is.null(names(ranges))) names(ranges) <- as.character(seq_len(length(ranges))) space <- Rle(factor(names(ranges), levels = names(ranges)), elementNROWS(ranges)) N <- sum(elementNROWS(ranges)) NAMES <- unlist(lapply(ranges, names), use.names=FALSE) } else { if (!is(ranges, "Ranges")) { coerced <- try(as(ranges, "RangedData"), silent=TRUE) if (is(coerced, "RangedData")) return(coerced) stop("'ranges' must be a Ranges or directly coercible to RangedData") } N <- length(ranges) NAMES <- names(ranges) if (is.null(space)) { if (N == 0) space <- Rle(factor()) else space <- Rle(factor("1")) } else if (!is(space, "Rle")) { space <- Rle(space) } if (!is.factor(runValue(space))) runValue(space) <- factor(runValue(space)) if (length(space) != N) { if (length(space) == 0L) stop("'space' is a 0-length vector but length of 'ranges' is > 0") ## We make an exception to the "length(space) must be <= N" rule when ## N != 0L so we can support the direct creation of RangedData objects ## with 0 rows across 1 or more user-specified spaces like in: ## RangedData(ranges=IRanges(), space=letters) if (N != 0L && length(space) > N) stop("length of 'space' greater than length of 'ranges'") if (N %% length(space) != 0) stop("length of 'ranges' not a multiple of 'space' length") space <- rep(space, length.out = N) } if (!is(ranges, "IRanges")) ranges <- as(ranges, "IRanges") ranges <- split(ranges, space) } if (!is.null(universe)) universe(ranges) <- universe if (hasDots) { args <- list(...) if (length(args) == 1L && is(args[[1L]], "SplitDataFrameList")) { values <- unlist(args[[1L]], use.names=FALSE) } else { values <- DataFrame(...) } } else values <- S4Vectors:::make_zero_col_DataFrame(N) if (N != nrow(values)) { if (nrow(values) > N) stop("length of value(s) in '...' greater than length of 'ranges'") if (nrow(values) == 0 || N %% nrow(values) != 0) stop("length of 'ranges' not a multiple of length of value(s) in '...'") rind <- S4Vectors:::recycleVector(seq_len(nrow(values)), N) values <- values[rind,,drop=FALSE] } rownames(values) <- NAMES ## ensure these are identical values <- split(values, space) new2("RangedData", ranges = ranges, values = values, check=FALSE) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Subsetting. ### ## The extraction operator delegates to the values (extracts columns) setMethod("[[", "RangedData", function(x, i, j, ...) { dotArgs <- list(...) if (length(dotArgs) > 0) dotArgs <- dotArgs[names(dotArgs) != "exact"] if (!missing(j) || length(dotArgs) > 0) stop("invalid subsetting") if (missing(i)) stop("subscript is missing") if (!is.character(i) && !is.numeric(i)) stop("invalid subscript type") if (length(i) < 1L) stop("attempt to select less than one element") if (length(i) > 1L) stop("attempt to select more than one element") if (is.numeric(i) && !is.na(i) && (i < 1L || i > ncol(x))) stop("subscript out of bounds") if (is.na(i) || (is.character(i) && !(i %in% c("space", "ranges", colnames(x))))) NULL else if (i == "space") space(x) else if (i == "ranges") unlist(ranges(x), use.names=FALSE) else unlist(values(x), use.names=FALSE)[[i]] }) setReplaceMethod("[[", "RangedData", function(x, i, j,..., value) { if (!missing(j) || length(list(...)) > 0) stop("invalid subsetting") if (missing(i)) stop("subscript is missing") if (!is.character(i) && !is.numeric(i)) stop("invalid subscript type") if (length(i) < 1L) stop("attempt to select less than one element") if (length(i) > 1L) stop("attempt to select more than one element") if (is.numeric(i) && (i < 1L || i > ncol(x) + 1L)) stop("subscript out of bounds") if (i == "space") stop("cannot replace \"space\" information") if (i == "ranges") { ranges(x) <- value } else { nrx <- nrow(x) lv <- length(value) if (!is.null(value) && (nrx != lv)) { if ((nrx == 0) || (nrx %% lv != 0)) stop(paste(lv, "elements in value to replace", nrx, "elements")) else value <- rep(value, length.out = nrx) } nrows <- elementNROWS(values(x)) inds <- seq_len(length(x)) spaces <- factor(rep.int(inds, nrows), inds) values <- unlist(values(x), use.names=FALSE) values[[i]] <- value x@values <- split(values, spaces) names(x@values) <- names(x) } x }) setReplaceMethod("$", "RangedData", function(x, name, value) { x[[name]] <- value x }) ### Supported index types: numeric, logical, character, NULL and missing. ## Two index modes: ## - list style ([i]): subsets by range space (e.g. chromosome) ## - matrix style ([i,j]): subsets the data frame setMethod("[", "RangedData", function(x, i, j, ..., drop=FALSE) { if (length(list(...)) > 0) stop("parameters in '...' not supported") if (missing(i) && missing(j)) return(x) checkIndex <- function(i, lx, nms) { if (!is.atomic(i)) return("invalid subscript type") if (is.numeric(i)) { if (!is.integer(i)) i <- as.integer(i) if (S4Vectors:::anyMissingOrOutside(i, upper = lx)) return("subscript contains NAs or out of bounds indices") if (S4Vectors:::anyMissingOrOutside(i, 0L, lx) && S4Vectors:::anyMissingOrOutside(i, upper = 0L)) return("negative and positive indices cannot be mixed") } else if (is.logical(i)) { if (S4Vectors:::anyMissing(i)) return("subscript contains NAs") if (length(i) > lx) return("subscript out of bounds") } else if ((is.character(i) || is.factor(i))) { if (S4Vectors:::anyMissing(i)) return("subscript contains NAs") if (S4Vectors:::anyMissing(match(i, nms))) return("mismatching names") } else if (!is.null(i)) { return("invalid subscript type") } NULL } mstyle <- nargs() > 2 if (mstyle) { ranges <- ranges(x) values <- values(x) if (!missing(j)) { prob <- checkIndex(j, ncol(x), colnames(x)) if (!is.null(prob)) stop("selecting cols: ", prob) values <- values[, j, drop=FALSE] } if (!missing(i)) { if (is(i, "RangesList")) stop("subsetting a RangedData object ", "by a RangesList subscript is not supported") if (is(i, "LogicalList")) { x_eltNROWS <- elementNROWS(ranges(x)) whichRep <- which(x_eltNROWS != elementNROWS(i)) for (k in whichRep) i[[k]] <- rep(i[[k]], length.out = x_eltNROWS[k]) i <- unlist(i, use.names=FALSE) } else if (is(i, "IntegerList")) { itemp <- LogicalList(lapply(elementNROWS(ranges(x)), rep, x = FALSE)) for (k in seq_len(length(itemp))) itemp[[k]][i[[k]]] <- TRUE i <- unlist(itemp, use.names=FALSE) } prob <- checkIndex(i, nrow(x), rownames(x)) if (!is.null(prob)) stop("selecting rows: ", prob) if (is.numeric(i) && any(i < 0)) i <- setdiff(seq(nrow(x)), -i) if (is.logical(i)) { igroup <- factor(rep.int(seq_len(length(x)), elementNROWS(x)), levels = seq_len(length(x))) if (length(i) < nrow(x)) i <- rep(i, length.out = nrow(x)) } else { if (is.null(i)) i <- integer(0) if (is.factor(i)) i <- as.character(i) if (is.character(i)) { dummy <- seq_len(nrow(x)) names(dummy) <- rownames(x) i <- dummy[i] if (S4Vectors:::anyMissing(i)) ## cannot subset by NAs yet stop("invalid rownames specified") } starts <- cumsum(c(1L, head(elementNROWS(x), -1))) igroup <- factor(findInterval(i, starts), levels = seq_len(length(x))) if (anyDuplicated(runValue(Rle(igroup)))) stop("cannot mix row indices from different spaces") i <- i - (starts - 1L)[as.integer(igroup)] } isplit <- split(i, igroup) names(isplit) <- names(x) ranges <- S4Vectors:::subset_List_by_List(ranges, isplit) values <- S4Vectors:::subset_List_by_List(values, isplit) if (drop) { ok <- (elementNROWS(ranges) > 0) ranges <- ranges[ok] values <- values[ok] } } } else { if (!missing(i)) { prob <- checkIndex(i, length(x), names(x)) if (!is.null(prob)) stop("selecting spaces: ", prob) ranges <- ranges(x)[i] values <- values(x)[i] } } x@ranges <- ranges x@values <- values x }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Combining and splitting. ### setMethod("c", "RangedData", function(x, ..., recursive = FALSE) { if (!identical(recursive, FALSE)) stop("\"c\" method for RangedData objects ", "does not support the 'recursive' argument") if (missing(x)) rds <- unname(list(...)) else rds <- unname(list(x, ...)) rd <- rds[[1L]] if (!all(sapply(rds, is, "RangedData"))) stop("all arguments in '...' must be RangedData objects") nms <- lapply(rds, ## figure out names like 'c' on an ordinary vector function(rd) structure(logical(length(rd)), names = names(rd))) nms <- names(do.call(c, nms)) names(rds) <- NULL # critical for dispatch to work ranges <- do.call(c, lapply(rds, ranges)) values <- do.call(c, lapply(rds, values)) names(ranges) <- nms rd@ranges <- ranges names(values) <- nms rd@values <- values rd }) setMethod("rbind", "RangedData", function(..., deparse.level=1) { args <- unname(list(...)) rls <- lapply(args, ranges) nms <- unique(unlist(lapply(args, names), use.names=FALSE)) rls <- lapply(rls, function(x) {y <- as.list(x)[nms];names(y) <- nms;y}) dfs <- lapply(args, function(x) {y <- as.list(values(x))[nms];names(y) <- nms;y}) safe.c <- function(...) { x <- list(...) do.call(c, x[!sapply(x, is.null)]) } rls <- IRangesList(do.call(Map, c(list(safe.c), rls))) safe.rbind <- function(...) { x <- list(...) do.call(rbind, x[!sapply(x, is.null)]) } dfs <- SplitDataFrameList(do.call(Map, c(list(safe.rbind), dfs))) for (i in seq_len(length(rls))) names(rls[[i]]) <- rownames(dfs[[i]]) initialize(args[[1L]], ranges = rls, values = dfs) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion ### ### The 2 functions, as.data.frame.RangesList() and ### as.data.frame.DataFrameList() are needed for as.data.frame.RangedData(). ### ### A new as.data.frame,List method was implemented in BioC 2.15 and ### is now used by all List classes. Because the RangedData class is being ### phased out, we want to retain the old behavior. In order to do that ### we have to keep these 2 helpers because as.data.frame.RangedData() ### uses old methods from both RangesList and DataFrameList. ### ### These helpers are not exported. .as.data.frame.RangesList <- function(x, row.names=NULL, optional=FALSE, ...) { if (!(is.null(row.names) || is.character(row.names))) stop("'row.names' must be NULL or a character vector") x <- as(x, "CompressedIRangesList") spaceLevels <- seq_len(length(x)) if (length(names(x)) > 0) { spaceLabels <- names(x) } else { spaceLabels <- as.character(spaceLevels) } data.frame(space = factor(rep.int(seq_len(length(x)), elementNROWS(x)), levels = spaceLevels, labels = spaceLabels), as.data.frame(unlist(x, use.names = FALSE)), row.names = row.names, stringsAsFactors = FALSE) } .as.data.frame.DataFrameList <- function(x, row.names=NULL, optional=FALSE, ...) { if (!(is.null(row.names) || is.character(row.names))) stop("'row.names' must be NULL or a character vector") if (!missing(optional) || length(list(...))) warning("'optional' and arguments in '...' ignored") stacked <- stack(x) if (is.null(row.names)) row.names <- rownames(stacked) as.data.frame(stacked, row.names = row.names, optional = optional) } .as.data.frame.RangedData <- function(x, row.names=NULL, optional=FALSE, ...) { if (!(is.null(row.names) || is.character(row.names))) stop("'row.names' must be NULL or a character vector") if (!missing(optional) || length(list(...))) warning("'optional' and arguments in '...' ignored") data.frame(.as.data.frame.RangesList(ranges(x)), .as.data.frame.DataFrameList(values(x))[-1L], row.names = row.names, stringsAsFactors = FALSE) } setMethod("as.data.frame", "RangedData", .as.data.frame.RangedData) setAs("RangedData", "DataFrame", function(from) { DataFrame(as.data.frame(ranges(from)), unlist(values(from), use.names=FALSE)) }) setAs("Rle", "RangedData", function(from) { new2("RangedData", ranges = IRangesList("1" = successiveIRanges(runLength(from))), values = SplitDataFrameList("1" = DataFrame(score = runValue(from))), metadata = metadata(from), check = FALSE) }) setAs("RleList", "RangedData", function(from) { ranges <- IRangesList(lapply(from, function(x) successiveIRanges(runLength(x)))) values <- SplitDataFrameList(lapply(from, function(x) DataFrame(score = runValue(x)))) if (is.null(names(from))) { nms <- as.character(seq_len(length(from))) names(ranges) <- nms names(values) <- nms } new2("RangedData", ranges = ranges, values = values, metadata = metadata(from), elementMetadata = elementMetadata(from), check = FALSE) }) setAs("RleViewsList", "RangedData", function(from) { subject <- subject(from) from_ranges <- restrict(ranges(from), 1L, elementNROWS(subject), keep.all.ranges = TRUE) ### FIXME: do we want to insert NAs for out of bounds views? score <- subject[from_ranges] score_part <- as(lapply(width(from_ranges), PartitioningByWidth), "RangesList") score_ranges <- ranges(score) ol <- findOverlaps(score_ranges, score_part) offind <- as(lapply(ol, subjectHits), "IntegerList") offset <- (start(from_ranges) - start(score_part))[offind] ranges <- shift(ranges(ol, score_ranges, score_part), offset) viewNames <- lapply(from_ranges, function(x) { if (is.null(names(x))) seq_len(length(x)) else names(x) }) RangedData(ranges, score = unlist(runValue(score), use.names = FALSE)[queryHits(ol)], view = unlist(viewNames, use.names = FALSE)[subjectHits(ol)]) }) setAs("Ranges", "RangedData", function(from) { RangedData(from) }) setAs("RangesList", "RangedData", function(from) { from_names <- names(from) if (is.null(from_names) || anyDuplicated(from_names)) stop("cannot coerce a RangesList object with no names ", "or duplicated names to a RangedData object") unlisted_from <- unlist(from, use.names=FALSE) unlisted_values <- mcols(unlisted_from) mcols(unlisted_from) <- NULL ans_ranges <- relist(unlisted_from, skeleton=from) metadata(ans_ranges) <- metadata(from) if (!is(unlisted_values, "DataFrame")) { if (!is.null(unlisted_values)) warning("could not propagate the inner metadata columns of ", "'from' (accessed with 'mcols(unlist(from))') ", "to the data columns (aka values) of the returned ", "RangedData object") unlisted_values <- S4Vectors:::make_zero_col_DataFrame(length(unlisted_from)) } ans_values <- newCompressedList0("CompressedSplitDataFrameList", unlisted_values, PartitioningByEnd(ans_ranges)) new2("RangedData", ranges=ans_ranges, values=ans_values, #metadata=metadata(from), elementMetadata=elementMetadata(from), check=FALSE) } ) .fromRangedDataToCompressedIRangesList <- function(from) { ans <- ranges(from) ## Propagate 'values(from)'. ans_unlisted_values <- unlist(values(from), use.names=FALSE) mcols(ans@unlistData) <- ans_unlisted_values ans } setAs("RangedData", "CompressedIRangesList", .fromRangedDataToCompressedIRangesList ) setAs("RangedData", "IRangesList", .fromRangedDataToCompressedIRangesList) setAs("RangedData", "RangesList", .fromRangedDataToCompressedIRangesList) setMethod("as.env", "RangedData", function(x, enclos = parent.frame(2)) { env <- callNextMethod(x, enclos) makeAccessorBinding <- function(fun, name = deparse(substitute(fun))) { makeActiveBinding(name, function() { val <- fun(x) rm(list=name, envir=env) assign(name, val, env) ## cache for further use val }, env) } makeAccessorBinding(ranges) makeAccessorBinding(space) makeAccessorBinding(start) makeAccessorBinding(width) makeAccessorBinding(end) env }) .RangedData_fromDataFrame <- function(from) { required <- c("start", "end") if (!all(required %in% colnames(from))) stop("'from' must at least include a 'start' and 'end' column") datacols <- setdiff(colnames(from), c(required, "space", "width")) RangedData(IRanges(from$start, from$end), from[,datacols,drop=FALSE], space = from$space) } setAs("data.frame", "RangedData", .RangedData_fromDataFrame) setAs("DataTable", "RangedData", .RangedData_fromDataFrame) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Show ### setMethod("show", "RangedData", function(object) { nr <- nrow(object) nc <- ncol(object) lo <- length(object) cat(class(object), " with ", nr, ifelse(nr == 1, " row and ", " rows and "), nc, ifelse(nc == 1, " value column across ", " value columns across "), lo, ifelse(lo == 1, " space\n", " spaces\n"), sep = "") if (nr > 0) { nms <- rownames(object) if (nr < 20) { ranges <- unlist(ranges(object), use.names=FALSE) values <- unlist(values(object), use.names=FALSE) out <- cbind(space = as.character(space(object)), ranges = showAsCell(ranges), "|" = rep.int("|", nr)) if (nc > 0) out <- cbind(out, as.matrix(format(do.call(data.frame, lapply(values, showAsCell))))) if (is.null(nms)) rownames(out) <- as.character(seq_len(nr)) else rownames(out) <- nms classinfo <- matrix(c("", "", "|", unlist(lapply(values, function(x) { paste("<", classNameForDisplay(x), ">", sep = "") }), use.names = FALSE)), nrow = 1, dimnames = list("", colnames(out))) } else { top <- object[1:9, ] topRanges <- unlist(ranges(top), use.names=FALSE) topValues <- unlist(values(top), use.names=FALSE) bottom <- object[(nr-8L):nr, ] bottomRanges <- unlist(ranges(bottom), use.names=FALSE) bottomValues <- unlist(values(bottom), use.names=FALSE) out <- rbind(cbind(space = as.character(space(top)), ranges = showAsCell(topRanges), "|" = rep.int("|", 9)), rbind(rep.int("...", 3)), cbind(space = as.character(space(bottom)), ranges = showAsCell(bottomRanges), "|" = rep.int("|", 9))) if (nc > 0) out <- cbind(out, rbind(as.matrix(format(do.call(data.frame, lapply(topValues, showAsCell)))), rbind(rep.int("...", nc)), rbind(as.matrix(format(do.call(data.frame, lapply(bottomValues, showAsCell))))))) if (is.null(nms)) { rownames(out) <- c(as.character(1:9), "...", as.character((nr-8L):nr)) } else { rownames(out) <- c(head(nms, 9), "...", tail(nms, 9)) } classinfo <- matrix(c("", "", "|", unlist(lapply(topValues, function(x) { paste("<", classNameForDisplay(x), ">", sep = "") }), use.names = FALSE)), nrow = 1, dimnames = list("", colnames(out))) } out <- rbind(classinfo, out) print(out, quote = FALSE, right = TRUE) } }) IRanges/R/RangedData-utils.R0000644000175400017540000000443713175713360016622 0ustar00biocbuildbiocbuild### ========================================================================= ### RangedData utilities ### ------------------------------------------------------------------------- ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Applying ### setMethod("lapply", "RangedData", function(X, FUN, ...) { FUN <- match.fun(FUN) inds <- structure(seq(length(X)), names = names(X)) lapply(inds, function(i) FUN(X[i], ...)) }) setMethod("endoapply", "RangedData", function(X, FUN, ...) { FUN <- match.fun(FUN) ans <- try(do.call(c, unname(lapply(X, FUN, ...))), silent = TRUE) if (inherits(ans, "try-error") || (class(ans) != class(X))) stop("'FUN' did not produce an endomorphism") ans }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### within() ### setMethod("within", "RangedData", function(data, expr, ...) { e <- list2env(as.list(as(data, "DataFrame"))) e$ranges <- ranges(data) S4Vectors:::safeEval(substitute(expr), e, S4Vectors:::top_prenv(expr)) reserved <- c("ranges", "start", "end", "width", "space") l <- mget(setdiff(ls(e), reserved), e) l <- l[!sapply(l, is.null)] nD <- length(del <- setdiff(colnames(data), (nl <- names(l)))) for (nm in nl) data[[nm]] <- l[[nm]] for (nm in del) data[[nm]] <- NULL if (!identical(ranges(data), e$ranges)) ranges(data) <- e$ranges else { if (!identical(start(data), e$start)) start(data) <- e$start if (!identical(end(data), e$end)) end(data) <- e$end if (!identical(width(data), e$width)) width(data) <- e$width } data }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Merging (TODO) ### #setMethod("merge", "RangedData", # function(x, y, by = 1, all = FALSE, all.x = all, all.y = all, # resolver = intersect, sort = TRUE, suffixes = c(".x",".y")) # { # # }) IRanges/R/RangedSelection-class.R0000644000175400017540000000307713175713360017642 0ustar00biocbuildbiocbuild### ========================================================================= ### Selection of features and columns by intervals and column names ### ------------------------------------------------------------------------- setClass("RangedSelection", representation(ranges = "RangesList", colnames = "character")) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Accessor methods. ### setMethod("ranges", "RangedSelection", function(x, use.names=TRUE, use.mcols=FALSE) x@ranges ) setReplaceMethod("ranges", "RangedSelection", function(x, value) { x@ranges <- value x }) setMethod("colnames", "RangedSelection", function(x, do.NULL = TRUE, prefix = "col") x@colnames) setReplaceMethod("colnames", "RangedSelection", function(x, value) { x@colnames <- value x }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructor. ### RangedSelection <- function(ranges = RangesList(), colnames = character()) { if (!is(ranges, "RangesList")) stop("'ranges' must be a RangesList") if (!is.character(colnames) || S4Vectors:::anyMissing(colnames)) stop("'colnames' must be a character vector without missing values") new("RangedSelection", ranges = ranges, colnames = colnames) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion. ### setAs("RangesList", "RangedSelection", function(from) RangedSelection(from)) IRanges/R/Ranges-class.R0000644000175400017540000002152113175713360016005 0ustar00biocbuildbiocbuild### ========================================================================= ### Ranges objects ### ------------------------------------------------------------------------- ### ### Ranges is a virtual class that serves as the base for all range containers ### Conceptually Ranges are closed, one-dimensional intervals with integer end ### points and on the domain of integers. ### setClass("Ranges", contains="IntegerList", representation("VIRTUAL")) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Getters/setters. ### setMethod("length", "Ranges", function(x) length(width(x))) ### Without this definition, we inherit the method for Vector objects ### which is very inefficient on Ranges objects! setMethod("elementNROWS", "Ranges", function(x) setNames(width(x), names(x))) ### The 3 default methods below provide a formalization of the relationship ### between the starts/widths/ends of a Ranges object. Of course Ranges ### subclasses need to implement at least 2 of them! ### Note that when width(x)[i] is 0, then end(x)[i] is start(x)[i] - 1 setMethod("start", "Ranges", function(x, ...) {1L - width(x) + end(x)}) setMethod("width", "Ranges", function(x) {end(x) - start(x) + 1L}) setMethod("end", "Ranges", function(x, ...) {width(x) - 1L + start(x)}) setGeneric("mid", function(x, ...) standardGeneric("mid")) setMethod("mid", "Ranges", function(x) start(x) + as.integer((width(x)-1) / 2)) setMethod("update", "Ranges", function(object, ...) as(update(as(object, "IRanges"), ...), class(object)) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Validity. ### ### The checking of the names(x) is taken care of by the validity method for ### Vector objects. .valid.Ranges <- function(x) { x_start <- start(x) x_end <- end(x) x_width <- width(x) validity_failures <- .Call2("valid_Ranges", x_start, x_end, x_width, PACKAGE="IRanges") if (!is.null(validity_failures)) return(validity_failures) if (!(is.null(names(x_start)) && is.null(names(x_end)) && is.null(names(x_width)))) return(paste0("'start(x)', 'end(x)', and 'width(x)' ", "cannot have names on them")) NULL } setValidity2("Ranges", .valid.Ranges) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion. ### ### Propagate the names. setMethod("as.character", "Ranges", function(x) { if (length(x) == 0L) return(setNames(character(0), names(x))) x_start <- start(x) x_end <- end(x) ans <- paste0(x_start, "-", x_end) idx <- which(x_start == x_end) ans[idx] <- as.character(x_start)[idx] names(ans) <- names(x) ans } ) ### The as.factor() generic doesn't have the ... argument so this method ### cannot support the 'ignore.strand' argument. setMethod("as.factor", "Ranges", function(x) factor(as.character(x), levels=as.character(sort(unique(x)))) ) setMethod("as.matrix", "Ranges", function(x, ...) matrix(data=c(start(x), width(x)), ncol=2, dimnames=list(names(x), NULL)) ) .as.data.frame.Ranges <- function(x, row.names=NULL, optional=FALSE, ...) { if (!(is.null(row.names) || is.character(row.names))) stop("'row.names' must be NULL or a character vector") ans <- data.frame(start=start(x), end=end(x), width=width(x), row.names=row.names, check.rows=TRUE, check.names=FALSE, stringsAsFactors=FALSE) ans$names <- names(x) ans } setMethod("as.data.frame", "Ranges", .as.data.frame.Ranges) setMethod("as.integer", "Ranges", function(x, ...) S4Vectors:::fancy_mseq(width(x), offset=start(x)-1L) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### More stuff. ### ### TODO: Reorganize this ### setMethod("unlist", "Ranges", function(x, recursive=TRUE, use.names=TRUE) { if (!identical(recursive, TRUE)) stop("\"unlist\" method for Ranges objects ", "does not support the 'recursive' argument") if (!isTRUEorFALSE(use.names)) stop("'use.names' must be TRUE or FALSE") ans <- as.integer(x) # 'ans' should have no names stopifnot(is.null(names(ans))) # sanity check if (use.names && !is.null(names(x))) names(ans) <- rep.int(names(x), elementNROWS(x)) ans } ) setMethod("getListElement", "Ranges", function(x, i, exact=TRUE) { i <- normalizeDoubleBracketSubscript(i, x, exact=exact, error.if.nomatch=TRUE) ans_shift <- start(x)[i] - 1L ans_length <- width(x)[i] seq_len(ans_length) + ans_shift } ) .make_naked_matrix_from_Ranges <- function(x) { x_len <- length(x) x_mcols <- mcols(x) x_nmc <- if (is.null(x_mcols)) 0L else ncol(x_mcols) ans <- cbind(start=as.character(start(x)), end=as.character(end(x)), width=as.character(width(x))) if (x_nmc > 0L) { tmp <- do.call(data.frame, c(lapply(x_mcols, showAsCell), list(check.names=FALSE))) ans <- cbind(ans, `|`=rep.int("|", x_len), as.matrix(tmp)) } ans } showRanges <- function(x, margin="", print.classinfo=FALSE) { x_class <- class(x) x_len <- length(x) x_mcols <- mcols(x) x_nmc <- if (is.null(x_mcols)) 0L else ncol(x_mcols) cat(x_class, " object with ", x_len, " ", ifelse(x_len == 1L, "range", "ranges"), " and ", x_nmc, " metadata ", ifelse(x_nmc == 1L, "column", "columns"), ":\n", sep="") ## S4Vectors:::makePrettyMatrixForCompactPrinting() assumes that 'x' is ## subsettable but not all Ranges objects are (and if they are, ## subsetting them could be costly). However IRanges objects are assumed ## to be subsettable so if 'x' is not one then we turn it into one (this ## coercion is expected to work on any Ranges object). if (!is(x, "IRanges")) x <- as(x, "IRanges", strict=FALSE) out <- S4Vectors:::makePrettyMatrixForCompactPrinting(x, .make_naked_matrix_from_Ranges) if (print.classinfo) { .COL2CLASS <- c( start="integer", end="integer", width="integer" ) classinfo <- S4Vectors:::makeClassinfoRowForCompactPrinting(x, .COL2CLASS) ## A sanity check, but this should never happen! stopifnot(identical(colnames(classinfo), colnames(out))) out <- rbind(classinfo, out) } if (nrow(out) != 0L) rownames(out) <- paste0(margin, rownames(out)) ## We set 'max' to 'length(out)' to avoid the getOption("max.print") ## limit that would typically be reached when 'showHeadLines' global ## option is set to Inf. print(out, quote=FALSE, right=TRUE, max=length(out)) } setMethod("show", "Ranges", function(object) showRanges(object, margin=" ", print.classinfo=TRUE) ) setMethod("showAsCell", "Ranges", function(object) { if (length(object) == 0L) return(character(0)) paste("[", format(start(object)), ", ", format(end(object)), "]", sep = "") } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### isEmpty() and isNormal() ### ### All of them test a Ranges object as a whole and return a single TRUE or ### FALSE. ### ### A Ranges object is considered empty iff all its ranges are empty. setMethod("isEmpty", "Ranges", function(x) all(width(x) == 0L)) setGeneric("isNormal", function(x, ...) standardGeneric("isNormal")) setMethod("isNormal", "Ranges", function(x) { all_ok <- all(width(x) >= 1L) if (length(x) >= 2) all_ok <- all_ok && all(start(x)[-1L] - end(x)[-length(x)] >= 2L) all_ok } ) setGeneric("whichFirstNotNormal", function(x) standardGeneric("whichFirstNotNormal") ) setMethod("whichFirstNotNormal", "Ranges", function(x) { is_ok <- width(x) >= 1L if (length(x) >= 2) is_ok <- is_ok & c(TRUE, start(x)[-1L] - end(x)[-length(x)] >= 2L) which(!is_ok)[1L] } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Subsetting ### ### TODO: "extractROWS" and most of the Ranges endomorphisms are only ### defined for IRanges objects. Need to fix up the update mechanism, so that ### they can be defined on Ranges. "extractROWS" and other endomorphisms ### are currently implemented as wrappers that coerce to IRanges, which is not ### efficient so not a general, long-term solution. setMethod("extractROWS", "Ranges", function(x, i) as(callNextMethod(as(x, "IRanges", strict=FALSE), i), class(x)) ) IRanges/R/Ranges-comparison.R0000644000175400017540000001245413175713360017057 0ustar00biocbuildbiocbuild### ========================================================================= ### Comparing and ordering ranges ### ------------------------------------------------------------------------- ### setMethod("pcompareRecursively", "Ranges", function(x) FALSE) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### pcompare() ### ### Ranges are ordered by starting position first and then by width. ### This way, the space of ranges is totally ordered. ### This "pcompare" method returns one of the 13 predefined codes (>= -6 and ### <= 6) described in the man page. The signs of those codes reflect this ### order. ### setMethod("pcompare", c("Ranges", "Ranges"), function(x, y) { .Call2("Ranges_pcompare", start(x), width(x), start(y), width(y), PACKAGE="IRanges") } ) rangeComparisonCodeToLetter <- function(code) { if (!is.integer(code)) stop("'code' must be an integer vector") code <- code + 7L code[code < 1L | 14L < code] <- 14L levels <- c(letters[1:13], "X") structure(code, levels=levels, class="factor") } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### match() ### setMethod("match", c("Ranges", "Ranges"), function(x, table, nomatch=NA_integer_, incomparables=NULL, method=c("auto", "quick", "hash")) { if (!is.null(incomparables)) stop("\"match\" method for Ranges objects ", "only accepts 'incomparables=NULL'") ## Equivalent to (but faster than): ## findOverlaps(x, table, type="equal", select="first") ## except when 'x' or 'table' contain empty ranges. matchIntegerPairs(start(x), width(x), start(table), width(table), nomatch=nomatch, method=method) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### selfmatch() ### setMethod("selfmatch", "Ranges", function(x, method=c("auto", "quick", "hash")) selfmatchIntegerPairs(start(x), width(x), method=method) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### order() and related methods. ### ### is.unsorted(), order(), sort(), rank() on Ranges derivatives are ### consistent with the order implied by pcompare(). ### is.unsorted() is a quick/cheap way of checking whether a Ranges ### derivative is already sorted, e.g., called prior to a costly sort. ### sort() and rank() will work out-of-the-box on a Ranges derivative thanks ### to the method for List objects (which delegates to the method for Vector ### objects). ### .Ranges_as_IntegerPairs <- function(x) { a <- start(x) b <- width(x) list(a, b) } setMethod("is.unsorted", "Ranges", function(x, na.rm=FALSE, strictly=FALSE) { if (!identical(na.rm, FALSE)) warning("\"is.unsorted\" method for Ranges objects ", "ignores the 'na.rm' argument") if (!isTRUEorFALSE(strictly)) stop("'strictly' must be TRUE of FALSE") ## It seems that creating the integer pairs below is faster when ## 'x' is already sorted (TODO: Investigate why). Therefore, and ## somewhat counterintuitively, is.unsorted() can be faster when 'x' ## is already sorted (which, in theory, is the worst-case scenario ## because S4Vectors:::sortedIntegerPairs() will then need to take a ## full walk on 'x') than when it is unsorted (in which case ## S4Vectors:::sortedIntegerPairs() might stop walking on 'x' after ## checking its first 2 elements only -- the best-case scenario). pairs <- .Ranges_as_IntegerPairs(x) !S4Vectors:::sortedIntegerPairs(pairs[[1L]], pairs[[2L]], strictly=strictly) } ) .order_Ranges <- function(x, decreasing=FALSE) { if (!isTRUEorFALSE(decreasing)) stop("'decreasing' must be TRUE or FALSE") pairs <- .Ranges_as_IntegerPairs(x) orderIntegerPairs(pairs[[1L]], pairs[[2L]], decreasing=decreasing) } ### 'na.last' is pointless (Ranges objects don't contain NAs) so is ignored. ### 'method' is also ignored at the moment. setMethod("order", "Ranges", function(..., na.last=TRUE, decreasing=FALSE, method=c("auto", "shell", "radix")) { ## Turn off this warning for now since it triggers spurious warnings ## when calling sort() on a RangesList object. The root of the ## problem is inconsistent defaults for 'na.last' between order() and ## sort(), as reported here: ## https://stat.ethz.ch/pipermail/r-devel/2015-November/072012.html #if (!identical(na.last, TRUE)) # warning("\"order\" method for Ranges objects ", # "ignores the 'na.last' argument") if (!isTRUEorFALSE(decreasing)) stop("'decreasing' must be TRUE or FALSE") ## All arguments in '...' are guaranteed to be Ranges objects. args <- list(...) if (length(args) == 1L) return(.order_Ranges(args[[1L]], decreasing)) order_args <- c(unlist(lapply(args, .Ranges_as_IntegerPairs), recursive=FALSE, use.names=FALSE), list(na.last=na.last, decreasing=decreasing)) do.call(order, order_args) } ) IRanges/R/RangesList-class.R0000644000175400017540000006032313175713360016644 0ustar00biocbuildbiocbuild### ========================================================================= ### RangesList objects ### ------------------------------------------------------------------------- ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### RangesList ### ### Accepts any type of Ranges object as an element. ### setClass("RangesList", representation("VIRTUAL"), prototype = prototype(elementType = "Ranges"), contains = "List") setClass("SimpleRangesList", prototype = prototype(elementType = "Ranges"), contains = c("RangesList", "SimpleList")) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### IRangesList ### setClass("IRangesList", representation("VIRTUAL"), prototype = prototype(elementType = "IRanges"), contains = "RangesList") setClass("CompressedIRangesList", prototype = prototype(elementType = "IRanges", unlistData = new("IRanges")), contains = c("IRangesList", "CompressedList")) setClass("SimpleIRangesList", prototype = prototype(elementType = "IRanges"), contains = c("IRangesList", "SimpleRangesList")) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### NormalIRangesList ### setClass("NormalIRangesList", representation("VIRTUAL"), prototype = prototype(elementType = "NormalIRanges"), contains = "IRangesList") ### CompressedNormalIRangesList cannot hold NormalIRanges as its elements, ### due to the compression combining everything into a single ### NormalIRanges (which could easily become non-normal). So just have it ### hold IRanges, instead. setClass("CompressedNormalIRangesList", prototype = prototype(elementType = "IRanges", unlistData = new("IRanges")), contains = c("NormalIRangesList", "CompressedIRangesList")) setClass("SimpleNormalIRangesList", prototype = prototype(elementType = "NormalIRanges"), contains = c("NormalIRangesList", "SimpleIRangesList")) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Validity. ### .valid.NormalIRangesList <- function(x) { if (!all(isNormal(x))) return("at least one element of object is not normal") NULL } setValidity2("NormalIRangesList", .valid.NormalIRangesList) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Accessor methods. ### setMethod("start", "RangesList", function(x) S4Vectors:::new_SimpleList_from_list("SimpleIntegerList", lapply(x, start))) setMethod("end", "RangesList", function(x) S4Vectors:::new_SimpleList_from_list("SimpleIntegerList", lapply(x, end))) setMethod("width", "RangesList", function(x) S4Vectors:::new_SimpleList_from_list("SimpleIntegerList", lapply(x, width))) setGeneric(".replaceSEW", signature="x", # not exported function(x, FUN, ..., value) standardGeneric(".replaceSEW")) setMethod(".replaceSEW", "RangesList", function(x, FUN, ..., value) { if (extends(class(value), "IntegerList")) { if (!identical(lapply(x, names), lapply(value, names)) && !all(elementNROWS(x) == elementNROWS(value))) stop("'value' must have same length and names as current 'ranges'") } else if (is.numeric(value)) { lelts <- sum(elementNROWS(x)) if (lelts != length(value)) value <- rep(value, length.out = lelts) if (!is.integer(value)) value <- as.integer(value) value <- split(value, factor(space(x), names(x))) } else { stop("'value' must extend class IntegerList or integer") } FUN <- match.fun(FUN) for (i in seq_len(length(x))) x[[i]] <- FUN(x[[i]], ..., value = value[[i]]) x } ) setReplaceMethod("start", "RangesList", function(x, ..., value) .replaceSEW(x, "start<-", ..., value=value) ) setReplaceMethod("end", "RangesList", function(x, ..., value) .replaceSEW(x, "end<-", ..., value=value) ) setReplaceMethod("width", "RangesList", function(x, ..., value) .replaceSEW(x, "width<-", ..., value=value) ) setMethod("start", "CompressedIRangesList", function(x) new2("CompressedIntegerList", unlistData = start(unlist(x, use.names=FALSE)), partitioning = x@partitioning, check=FALSE)) setMethod("end", "CompressedIRangesList", function(x) new2("CompressedIntegerList", unlistData = end(unlist(x, use.names=FALSE)), partitioning = x@partitioning, check=FALSE)) setMethod("width", "CompressedIRangesList", function(x) new2("CompressedIntegerList", unlistData = width(unlist(x, use.names=FALSE)), partitioning = x@partitioning, check=FALSE)) setMethod(".replaceSEW", "CompressedIRangesList", function(x, FUN, ..., value) { if (extends(class(value), "IntegerList")) { if (!identical(lapply(x, names), lapply(value, names)) && !all(elementNROWS(x) == elementNROWS(value))) stop("'value' must have same length and names as current 'ranges'") value <- unlist(value) } else if (is.numeric(value)) { lelts <- sum(elementNROWS(x)) if (lelts != length(value)) value <- rep(value, length.out = lelts) if (!is.integer(value)) value <- as.integer(value) } else { stop("'value' must extend class IntegerList or integer") } FUN <- match.fun(FUN) slot(x, "unlistData", check=FALSE) <- FUN(x@unlistData, ..., value = value) x } ) setMethod("space", "RangesList", function(x) { space <- names(x) if (!is.null(space)) space <- factor(rep.int(space, elementNROWS(x)), unique(space)) space }) ### TODO: Why not define this at the List level? Or even at the Vector level? setGeneric("universe", function(x) standardGeneric("universe")) setMethod("universe", "RangesList", function(x) { .Deprecated(msg="The universe() getter is deprecated.") metadata(x)$universe }) setGeneric("universe<-", function(x, value) standardGeneric("universe<-")) setReplaceMethod("universe", "RangesList", function(x, value) { .Deprecated(msg="The universe() setter is deprecated.") if (!is.null(value) && !isSingleString(value)) stop("'value' must be a single string or NULL") metadata(x)$universe <- value x }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### isNormal() ### ### Test the list elements of a RangesList object 'x' individually and return ### a vector of TRUE's or FALSE's parallel to 'x'. More precisely, is ### equivalent to 'sapply(x, FUN)', when FUN is 'isNormal'. ### setMethod("isNormal", "RangesList", function(x, use.names=FALSE) vapply(x, isNormal, logical(1), USE.NAMES=use.names) ) setMethod("isNormal", "SimpleIRangesList", function(x, use.names=FALSE) .Call2("SimpleIRangesList_isNormal", x, use.names, PACKAGE="IRanges") ) setMethod("isNormal", "CompressedIRangesList", function(x, use.names=FALSE) .Call2("CompressedIRangesList_isNormal", x, use.names, PACKAGE="IRanges") ) setMethod("whichFirstNotNormal", "RangesList", function(x) unlist(lapply(x, whichFirstNotNormal))) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion from list-like object to IRangesList object ### ### Try to turn an arbitrary list-like object into an ordinary list of ### IRanges objects. .as_list_of_IRanges <- function(from) { if (is(from, "Ranges")) { if (!is(from, "IRanges")) from <- as(from, "IRanges", strict=FALSE) along_idx <- setNames(seq_along(from), names(from)) names(from) <- NULL mcols(from) <- NULL lapply(along_idx, function(i) from[i]) } else { lapply(from, as, "IRanges", strict=FALSE) } } ### From ordinary list to IRangesList .from_list_to_CompressedIRangesList <- function(from) { from <- .as_list_of_IRanges(from) new_CompressedList_from_list("CompressedIRangesList", from) } .from_list_to_SimpleIRangesList <- function(from) { from <- .as_list_of_IRanges(from) S4Vectors:::new_SimpleList_from_list("SimpleIRangesList", from) } setAs("list", "CompressedIRangesList", .from_list_to_CompressedIRangesList) setAs("list", "SimpleIRangesList", .from_list_to_SimpleIRangesList) setAs("list", "IRangesList", .from_list_to_SimpleIRangesList) ### From List to IRangesList .from_List_to_CompressedIRangesList <- function(from) { new_CompressedList_from_list("CompressedIRangesList", .as_list_of_IRanges(from), metadata=metadata(from), mcols=mcols(from)) } ### Ranges objects are List objects so this case is already covered by the ### .from_List_to_CompressedIRangesList() helper above. However, we can ### implement it much more efficiently. .from_Ranges_to_CompressedIRangesList <- function(from) { if (!is(from, "IRanges")) from <- as(from, "IRanges", strict=FALSE) ans_partitioning <- PartitioningByEnd(seq_along(from), names=names(from)) names(from) <- NULL ans_mcols <- mcols(from) mcols(from) <- NULL ans <- relist(from, ans_partitioning) mcols(ans) <- ans_mcols ans } .from_List_to_SimpleIRangesList <- function(from) { S4Vectors:::new_SimpleList_from_list("SimpleIRangesList", .as_list_of_IRanges(from), metadata=metadata(from), mcols=mcols(from)) } setAs("List", "CompressedIRangesList", .from_List_to_CompressedIRangesList) setAs("Ranges", "CompressedIRangesList", .from_Ranges_to_CompressedIRangesList) setAs("List", "SimpleIRangesList", .from_List_to_SimpleIRangesList) ### Automatic coercion method from RangesList to SimpleIRangesList silently ### returns a broken object (unfortunately these dummy automatic coercion ### methods don't bother to validate the object they return). So we overwrite ### it. setAs("RangesList", "SimpleIRangesList", .from_List_to_SimpleIRangesList) setAs("SimpleRangesList", "SimpleIRangesList", .from_List_to_SimpleIRangesList) setAs("List", "IRangesList", function(from) { if (is(from, "CompressedList") || is(from, "Ranges")) as(from, "CompressedIRangesList") else as(from, "SimpleIRangesList") } ) ### This case is already covered by the List-to-CompressedIRangesList coercion ### above. However, we can implement it much more efficiently. setAs("CompressedRleList", "CompressedIRangesList", function(from) { if ((length(from) > 0) && (!is.logical(runValue(from[[1L]])) || S4Vectors:::anyMissing(runValue(from[[1L]])))) stop("cannot coerce a non-logical 'RleList' or a logical 'RleList' ", "with NAs to a CompressedIRangesList object") ranges <- as(unlist(from, use.names = FALSE), "IRanges") to <- diceRangesByList(ranges, from) metadata(to) <- metadata(from) mcols(to) <- mcols(from) to }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructor. ### RangesList <- function(..., universe = NULL) { if (!is.null(universe)) { msg <- wmsg("The 'universe' argument of the RangesList() ", "constructor function is deprecated.") .Deprecated(msg=msg) if (!isSingleString(universe)) stop("'universe' must be a single string or NULL") } ranges <- list(...) if (length(ranges) == 1 && is.list(ranges[[1L]])) ranges <- ranges[[1L]] if (!all(sapply(ranges, is, "Ranges"))) stop("all elements in '...' must be Ranges objects") ans <- S4Vectors:::new_SimpleList_from_list("SimpleRangesList", ranges) if (!is.null(universe)) universe(ans) <- universe ans } IRangesList <- function(..., universe=NULL, compress=TRUE) { if (!isTRUEorFALSE(compress)) stop("'compress' must be TRUE or FALSE") if (!is.null(universe)) { msg <- wmsg("The 'universe' argument of the IRangesList() ", "constructor function is deprecated.") .Deprecated(msg=msg) if (!isSingleString(universe)) stop("'universe' must be a single string or NULL") } args <- list(...) if (length(args) == 2L && setequal(names(args), c("start", "end")) && !is(args[[1L]], "Ranges") && !is(args[[2L]], "Ranges")) { if (!compress) stop(wmsg("'compress' must be TRUE when passing the 'start' ", "and 'end' arguments")) ans_start <- IntegerList(args[["start"]], compress=TRUE) ans_end <- IntegerList(args[["end"]], compress=TRUE) ans_partitioning <- PartitioningByEnd(ans_start) if (!identical(ans_partitioning, PartitioningByEnd(ans_end))) stop("'start' and 'end' are not compatible") unlisted_start <- unlist(ans_start, use.names=FALSE) unlisted_end <- unlist(ans_end, use.names=FALSE) unlisted_ans <- IRanges(start=unlisted_start, end=unlisted_end) ans <- relist(unlisted_ans, ans_partitioning) } else { if (length(args) == 1L) { x1 <- args[[1L]] if (is.list(x1) || (is(x1, "List") && !is(x1, "Ranges"))) args <- x1 } if (compress) ans <- as(args, "CompressedIRangesList") else ans <- as(args, "SimpleIRangesList") } if (!is.null(universe)) universe(ans) <- universe ans } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Subsetting. ### setMethod("getListElement", "CompressedNormalIRangesList", function(x, i, exact=TRUE) newNormalIRangesFromIRanges(callNextMethod()) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "show" method. ### showRangesList <- function(x, with.header=TRUE) { x_len <- length(x) if (with.header) cat(classNameForDisplay(x), " of length ", x_len, "\n", sep = "") if (x_len == 0L) return(invisible(NULL)) cumsumN <- end(PartitioningByEnd(x)) N <- tail(cumsumN, 1) if (x_len <= 5L && N <= 20L) { show(as.list(x)) return(invisible(NULL)) } if (x_len >= 3L && cumsumN[3L] <= 20L) { showK <- 3L } else if (x_len >= 2L && cumsumN[2L] <= 20L) { showK <- 2L } else { showK <- 1L } show(as.list(x[seq_len(showK)])) diffK <- x_len - showK if (diffK > 0L) cat("...\n<", x_len - showK, ifelse(diffK == 1L, " more element>\n", " more elements>\n"), sep="") } setMethod("show", "RangesList", function(object) showRangesList(object)) setMethod("showAsCell", "RangesList", function(object) { unlist(lapply(object, function(x) { if (length(x) <= 3) paste(showAsCell(x), collapse = " ") else paste(c(showAsCell(head(x, 3)), "..."), collapse = " ") }), use.names = FALSE) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### More coercions ### .as.list.CompressedNormalIRangesList <- function(x, use.names=TRUE) { if (!isTRUEorFALSE(use.names)) stop("'use.names' must be TRUE or FALSE") ans <- lapply_CompressedList(x, newNormalIRangesFromIRanges) if (use.names) names(ans) <- names(x) ans } setMethod("as.list", "CompressedNormalIRangesList", .as.list.CompressedNormalIRangesList) setMethod("unlist", "SimpleNormalIRangesList", function(x, recursive = TRUE, use.names = TRUE) { x <- S4Vectors:::new_SimpleList_from_list("SimpleIRangesList", lapply(x, as, "IRanges")) callGeneric() }) setAs("list", "RangesList", function(from) { S4Vectors:::coerceToSimpleList(from, "Ranges") }) setAs("Ranges", "RangesList", function(from) as(from, "IRangesList")) setAs("RangesList", "SimpleRangesList", function(from) S4Vectors:::new_SimpleList_from_list("SimpleRangesList", lapply(from, as, "Ranges"), metadata = metadata(from), mcols = mcols(from))) ### Coercion from RangesList to NormalIRangesList. .from_RangesList_to_SimpleNormalIRangesList <- function(from) { S4Vectors:::new_SimpleList_from_list("SimpleNormalIRangesList", lapply(from, as, "NormalIRanges"), mcols=mcols(from), metadata=metadata(from)) } setAs("RangesList", "SimpleNormalIRangesList", .from_RangesList_to_SimpleNormalIRangesList ) setAs("SimpleIRangesList", "SimpleNormalIRangesList", .from_RangesList_to_SimpleNormalIRangesList ) setAs("NormalIRangesList", "CompressedNormalIRangesList", function(from) { ans <- as(from, "CompressedIRangesList", strict=FALSE) class(ans) <- "CompressedNormalIRangesList" ans } ) setAs("CompressedIRangesList", "CompressedNormalIRangesList", function(from) { if (!all(isNormal(from))) from <- reduce(from, drop.empty.ranges=TRUE) class(from) <- "CompressedNormalIRangesList" from } ) setAs("RangesList", "CompressedNormalIRangesList", function(from) { as(as(from, "CompressedIRangesList", strict=FALSE), "CompressedNormalIRangesList") } ) setAs("RangesList", "NormalIRangesList", function(from) { if (is(from, "SimpleRangesList")) as(from, "SimpleNormalIRangesList") else as(from, "CompressedNormalIRangesList") } ) ### Coercion from LogicalList to NormalIRangesList. setAs("LogicalList", "NormalIRangesList", function(from) { if (is(from, "CompressedList")) as(from, "CompressedNormalIRangesList") else as(from, "SimpleNormalIRangesList") }) setAs("LogicalList", "CompressedNormalIRangesList", function(from) new_CompressedList_from_list("CompressedNormalIRangesList", lapply(from, as, "NormalIRanges"), metadata = metadata(from), mcols = mcols(from))) setAs("LogicalList", "SimpleNormalIRangesList", function(from) S4Vectors:::new_SimpleList_from_list("SimpleNormalIRangesList", lapply(from, as, "NormalIRanges"), metadata = metadata(from), mcols = mcols(from))) ### Coercion from RleList to NormalIRangesList. setAs("RleList", "NormalIRangesList", function(from) { if (is(from, "CompressedList")) as(from, "CompressedNormalIRangesList") else as(from, "SimpleNormalIRangesList") }) setAs("RleList", "CompressedNormalIRangesList", function(from) { if ((length(from) > 0) && (!is.logical(runValue(from[[1L]])) || S4Vectors:::anyMissing(runValue(from[[1L]])))) stop("cannot coerce a non-logical 'RleList' or a logical 'RleList' ", "with NAs to a CompressedNormalIRangesList object") new_CompressedList_from_list("CompressedNormalIRangesList", lapply(from, as, "NormalIRanges"), metadata = metadata(from), mcols = mcols(from)) }) setAs("RleList", "SimpleNormalIRangesList", function(from) { if ((length(from) > 0) && (!is.logical(runValue(from[[1L]])) || S4Vectors:::anyMissing(runValue(from[[1L]])))) stop("cannot coerce a non-logical 'RleList' or a logical 'RleList' ", "with NAs to a SimpleNormalIRangesList object") S4Vectors:::new_SimpleList_from_list("SimpleNormalIRangesList", lapply(from, as, "NormalIRanges"), metadata = metadata(from), mcols = mcols(from)) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### merge() ### ### Merges various RangesList objects into a single RangesList object. ### The merging is either by name (if all the RangesList objects have names), ### or by position (if any RangesList object is missing names). ### When merging by name, and in case of duplicated names within a given ### RangesList, the elements corresponding to the duplicated names are ignored. ### When merging by position, all the RangesList objects must have the same ### length. ### Note that the "range" method for RangesList objects expects "merge" to ### behave like this. .RangesList.merge <- function(...) { args <- unname(list(...)) if (length(args) == 0L) stop("nothing to merge") x <- args[[1L]] spaceList <- lapply(args, names) names <- spaces <- unique(do.call(c, spaceList)) if (any(sapply(spaceList, is.null))) { ## Merging by position. if (!all(unlist(lapply(args, length)) == length(x))) stop("if any RangesList objects to merge are missing names, ", "all must have same length") names <- NULL spaces <- seq_len(length(x)) } ranges <- lapply(spaces, function(space) { r <- lapply(args, `[[`, space) do.call(c, r[!sapply(r, is.null)]) }) names(ranges) <- names if (is(x, "CompressedList")) ans <- new_CompressedList_from_list(class(x), ranges) else ans <- S4Vectors:::new_SimpleList_from_list(class(x), ranges) ans } setMethod("merge", c("RangesList", "missing"), function(x, y, ...) .RangesList.merge(x, ...) ) setMethod("merge", c("missing", "RangesList"), function(x, y, ...) .RangesList.merge(y, ...) ) setMethod("merge", c("RangesList", "RangesList"), function(x, y, ...) .RangesList.merge(x, y, ...) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "max" and "min" methods for NormalIRangesList objects. ### CompressedNormalIRangesList.max <- function(x, use.names) { if (!is(x, "CompressedNormalIRangesList")) stop("'x' must be a CompressedNormalIRangesList object") use.names <- S4Vectors:::normargUseNames(use.names) .Call2("CompressedNormalIRangesList_max", x, use.names, PACKAGE="IRanges") } setMethod("max", "CompressedNormalIRangesList", function(x, ..., na.rm) CompressedNormalIRangesList.max(x, TRUE)) setMethod("max", "SimpleNormalIRangesList", function(x, ..., na.rm) .Call2("SimpleNormalIRangesList_max", x, PACKAGE="IRanges")) CompressedNormalIRangesList.min <- function(x, use.names) { if (!is(x, "CompressedNormalIRangesList")) stop("'x' must be a CompressedNormalIRangesList object") use.names <- S4Vectors:::normargUseNames(use.names) .Call2("CompressedNormalIRangesList_min", x, use.names, PACKAGE="IRanges") } setMethod("min", "CompressedNormalIRangesList", function(x, ..., na.rm) CompressedNormalIRangesList.min(x, TRUE)) setMethod("min", "SimpleNormalIRangesList", function(x, ..., na.rm) .Call2("SimpleNormalIRangesList_min", x, PACKAGE="IRanges")) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "summary" method. ### setMethod("summary", "CompressedIRangesList", function(object) .Call2("CompressedIRangesList_summary", object, PACKAGE="IRanges")) IRanges/R/Rle-class-leftovers.R0000644000175400017540000000552013175713360017320 0ustar00biocbuildbiocbuild### ========================================================================= ### IMPORTANT NOTE - 7/2/2014 ### Most of the stuff that used to be in the IRanges/R/Rle-class.R file was ### moved to the S4Vectors package (to R/Rle-class.R and R/Rle-utils.R). ### The stuff that could not be moved there was *temporarily* kept here in ### Rle-class-leftovers.R but will need to find a new home (in S4Vectors ### or in IRanges). ### ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Accessor methods. ### setMethod("ranges", "Rle", function(x, use.names=TRUE, use.mcols=FALSE) IRanges(start(x), width=width(x)) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion ### setAs("Rle", "IRanges", function(from) { if (!is.logical(runValue(from)) || S4Vectors:::anyMissing(runValue(from))) stop("cannot coerce a non-logical 'Rle' or a logical 'Rle' ", "with NAs to an IRanges object") keep <- runValue(from) ## The returned IRanges instance is guaranteed to be normal. ans_start <- start(from)[keep] ans_width <- runLength(from)[keep] new2("IRanges", start=ans_start, width=ans_width, check=FALSE) }) setAs("Rle", "NormalIRanges", function(from) newNormalIRangesFromIRanges(as(from, "IRanges"), check=FALSE)) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### General methods ### setGeneric("findRange", signature = "vec", function(x, vec) standardGeneric("findRange")) setMethod("findRange", signature = c(vec = "Rle"), function(x, vec) { run <- findRun(x, vec) if (S4Vectors:::anyMissing(run)) stop("all 'x' values must be in [1, 'length(vec)']") IRanges(start = start(vec)[run], width = width(vec)[run], names = names(x)) }) setGeneric("orderAsRanges", signature = c("x"), # not exported function(x, na.last = TRUE, decreasing = FALSE) standardGeneric("orderAsRanges")) setMethod("orderAsRanges", "Rle", function(x, na.last = TRUE, decreasing = FALSE) { ord <- base::order(runValue(x), na.last = na.last, decreasing = decreasing) new2("IRanges", start = start(x)[ord], width = runLength(x)[ord], check = FALSE) }) setGeneric("splitRanges", signature = "x", function(x) standardGeneric("splitRanges")) setMethod("splitRanges", "Rle", function(x) { split(IRanges(start = start(x), width = runLength(x)), runValue(x)) }) setMethod("splitRanges", "vector_OR_factor", function(x) { callGeneric(Rle(x)) }) IRanges/R/RleViews-class.R0000644000175400017540000001041713175713360016330 0ustar00biocbuildbiocbuild### ========================================================================= ### RleViews objects ### ------------------------------------------------------------------------- ### ### The RleViews class is the basic container for storing a set of views ### (start/end locations) on the same Rle object, called the "subject" ### vector. setClass("RleViews", contains=c("Views", "RleList"), representation( subject="Rle" ) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### User-friendly constructor. ### setMethod("Views", "Rle", function(subject, start=NULL, end=NULL, width=NULL, names=NULL) new_Views(subject, start=start, end=end, width=width, names=names, Class="RleViews") ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion. ### setAs("AtomicList", "RleViews", function(from) { to <- Views(as(unlist(from, use.names = FALSE), "Rle"), PartitioningByEnd(from)) names(to) <- names(from) to }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "show" method. ### ### The 2 helper functions below convert a given view on an Rle object ### into a character-string. ### Both assume that 'start' <= 'end' (so they don't check it) and ### padd the result with spaces to produce the "margin effect" ### if 'start' or 'end' are out of limits. RleViews.show_vframe_header <- function(iW, startW, endW, widthW) { cat(format("", width=iW+1), format("start", width=startW, justify="right"), " ", format("end", width=endW, justify="right"), " ", format("width", width=widthW, justify="right"), "\n", sep="") } RleViews.show_vframe_line <- function(x, i, iW, startW, endW, widthW) { lsx <- length(subject(x)) start <- start(x)[i] end <- end(x)[i] width <- end - start + 1 snippetWidth <- getOption("width") - 10 - iW - startW - endW - widthW if (width > 0 && lsx > 0 && start <= lsx && end >= 1) { snippetStart <- max(min(start,lsx),1) snippetEnd <- max(min(end,lsx,start + snippetWidth),1) snippet <- format(as.vector(extractROWS(subject(x), IRanges(snippetStart, snippetEnd)))) snippet <- snippet[cumsum(nchar(snippet) + 1L) < snippetWidth] if (length(snippet) < width) { snippet <- c(snippet, "...") } snippet <- paste(snippet, collapse = " ") } else { snippet <- " " } cat(format(paste("[", i,"]", sep=""), width=iW, justify="right"), " ", format(start, width=startW, justify="right"), " ", format(end, width=endW, justify="right"), " ", format(width, width=widthW, justify="right"), " ", "[", snippet, "]\n", sep="") } ### 'half_nrow' must be >= 1 RleViews.show_vframe <- function(x, half_nrow=9L) { cat("\nviews:") lx <- length(x) if (lx == 0) cat(" NONE\n") else { cat("\n") iW <- nchar(as.character(lx)) + 2 # 2 for the brackets startMax <- max(start(x)) startW <- max(nchar(startMax), nchar("start")) endMax <- max(end(x)) endW <- max(nchar(endMax), nchar("end")) widthMax <- max(width(x)) widthW <- max(nchar(widthMax), nchar("width")) RleViews.show_vframe_header(iW, startW, endW, widthW) if (lx <= 2*half_nrow+1) { for (i in seq_len(lx)) RleViews.show_vframe_line(x, i, iW, startW, endW, widthW) } else { for (i in 1:half_nrow) RleViews.show_vframe_line(x, i, iW, startW, endW, widthW) cat(format("...", width=iW, justify="right"), " ", format("...", width=startW, justify="right"), " ", format("...", width=endW, justify="right"), " ", format("...", width=widthW, justify="right"), " ...\n", sep="") for (i in (lx-half_nrow+1L):lx) RleViews.show_vframe_line(x, i, iW, startW, endW, widthW) } } } setMethod("show", "RleViews", function(object) { cat("Views on a ", length(subject(object)), "-length Rle subject\n", sep="") RleViews.show_vframe(object) } ) IRanges/R/RleViews-utils.R0000644000175400017540000000443413175713360016365 0ustar00biocbuildbiocbuild### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "viewApply", "viewMins", "viewMaxs", and "viewSums" generics and ### methods. ### setMethod("viewApply", "RleViews", function(X, FUN, ..., simplify = TRUE) { X <- trim(X) ans <- aggregate(subject(X), start = structure(start(X), names = names(X)), end = end(X), FUN = FUN, ..., simplify = simplify) if (!simplify) { ans <- S4Vectors:::new_SimpleList_from_list("SimpleList", ans, metadata=metadata(X), mcols=mcols(X)) } ans }) setMethod("viewMins", "RleViews", function(x, na.rm = FALSE) .Call2("RleViews_viewMins", trim(x), na.rm, PACKAGE="IRanges")) setMethod("viewMaxs", "RleViews", function(x, na.rm = FALSE) .Call2("RleViews_viewMaxs", trim(x), na.rm, PACKAGE="IRanges")) setMethod("viewSums", "RleViews", function(x, na.rm = FALSE) .Call2("RleViews_viewSums", trim(x), na.rm, PACKAGE="IRanges")) setMethod("viewMeans", "RleViews", function(x, na.rm = FALSE) .Call2("RleViews_viewMeans", trim(x), na.rm, PACKAGE="IRanges")) setMethod("viewWhichMins", "RleViews", function(x, na.rm = FALSE) .Call2("RleViews_viewWhichMins", trim(x), na.rm, PACKAGE="IRanges")) setMethod("viewWhichMaxs", "RleViews", function(x, na.rm = FALSE) .Call2("RleViews_viewWhichMaxs", trim(x), na.rm, PACKAGE="IRanges")) setMethod("viewRangeMaxs", "RleViews", function(x, na.rm = FALSE) { maxs <- viewWhichMaxs(trim(x), na.rm = na.rm) if (S4Vectors:::anyMissing(maxs)) stop("missing values present, set 'na.rm = TRUE'") findRange(maxs, subject(x)) }) setMethod("viewRangeMins", "RleViews", function(x, na.rm = FALSE) { mins <- viewWhichMins(trim(x), na.rm = na.rm) if (S4Vectors:::anyMissing(mins)) stop("missing values present, set 'na.rm = TRUE'") findRange(mins, subject(x)) }) IRanges/R/RleViewsList-class.R0000644000175400017540000000776113175713360017174 0ustar00biocbuildbiocbuild### ========================================================================= ### RleViewsList objects ### ------------------------------------------------------------------------- setClass("RleViewsList", representation("VIRTUAL"), prototype = prototype(elementType = "RleViews"), contains = "ViewsList") setClass("SimpleRleViewsList", prototype = prototype(elementType = "RleViews"), contains = c("RleViewsList", "SimpleViewsList")) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Accessor. ### setMethod("subject", "SimpleRleViewsList", function(x) S4Vectors:::new_SimpleList_from_list("SimpleRleList", lapply(x, slot, "subject")) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructor. ### setMethod("Views", "RleList", function(subject, start=NULL, end=NULL, width=NULL, names=NULL) RleViewsList(rleList = subject, rangesList = start)) RleViewsList <- function(..., rleList, rangesList, universe = NULL) { if (!is.null(universe)) { msg <- wmsg("The 'universe' argument of the RleViewsList() ", "constructor function is deprecated.") .Deprecated(msg=msg) if (!isSingleString(universe)) stop(wmsg("'universe' must be a single string or NULL")) } views <- list(...) if (!missing(rleList) && !missing(rangesList)) { if (length(views) > 0) stop(wmsg("'...' must be empty when 'rleList' and 'rangesList' ", "are specified")) if (!is(rleList, "RleList")) stop(wmsg("'rleList' must be a RleList object")) if (!is(rangesList, "RangesList")) { rangesList <- try(IRangesList(rangesList), silent = TRUE) if (inherits(rangesList, "try-error")) stop(wmsg("'rangesList' must be a RangesList object")) } if (length(rleList) != length(rangesList)) stop("'rleList' and 'rangesList' must have the same length") rleList_names <- names(rleList) rangesList_names <- names(rangesList) if (!(is.null(rleList_names) || is.null(rangesList_names) || identical(rleList_names, rangesList_names))) { if (anyDuplicated(rleList_names,) || anyDuplicated(rangesList_names)) stop(wmsg("when both 'rleList' and 'rangesList' have names, ", "the names on each object cannot have duplicates")) if (!setequal(rleList_names, rangesList_names)) stop(wmsg("when both 'rleList' and 'rangesList' have names, ", "the set of names must be the same on each object")) warning(wmsg("'rleList' was reordered so that its names ", "match the names on 'rangesList'")) rleList <- rleList[rangesList_names] } views <- Map(Views, rleList, rangesList) } else if ((length(views) > 0) && (!missing(rleList) || !missing(rangesList))) { stop(wmsg("cannot specify 'rleList' or 'rangesList' ", "when specifying '...'")) } else { if (length(views) == 1 && is.list(views[[1L]])) views <- views[[1L]] if (!all(sapply(views, is, "RleViews"))) stop(wmsg("all elements in '...' must be RleViews objects")) } ans <- S4Vectors:::new_SimpleList_from_list("SimpleRleViewsList", views) if (!is.null(universe)) universe(ans) <- universe ans } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion. ### setAs("RleViewsList", "IRangesList", function(from) IRangesList(lapply(from, as, "IRanges"))) setAs("RleViewsList", "CompressedIRangesList", function(from) IRangesList(lapply(from, as, "IRanges"), compress=TRUE)) setAs("RleViewsList", "SimpleIRangesList", function(from) IRangesList(lapply(from, as, "IRanges"), compress=FALSE)) IRanges/R/RleViewsList-utils.R0000644000175400017540000000737513175713360017230 0ustar00biocbuildbiocbuild### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "viewApply", "viewMins", "viewMaxs", and "viewSums" generics and ### methods. ### setMethod("viewApply", "RleViewsList", function(X, FUN, ..., simplify = TRUE) { ans_listData <- lapply(structure(seq_along(X), names=names(X)), function(i) { ans_elt <- aggregate( subject(X[[i]]), start=structure(start(X[[i]]), names=names(start(X[[i]]))), end=end(X[[i]]), FUN=FUN, ..., simplify=simplify) if (!simplify) ans_elt <- S4Vectors:::new_SimpleList_from_list("SimpleList", ans_elt, metadata=metadata(X[[i]]), mcols=mcols(X[[i]])) ans_elt }) S4Vectors:::new_SimpleList_from_list("SimpleList", ans_listData, metadata=metadata(X), mcols=mcols(X))}) .summaryRleViewsList <- function(x, FUN, na.rm = FALSE, outputListType = NULL) { FUN <- match.fun(FUN) if (length(x) == 0) { outputListType <- "SimpleList" listData <- list() } else { if (is.null(outputListType)) { valuesClass <- class(runValue(subject(x[[1L]]))) if (valuesClass == "integer" || valuesClass == "logical") outputListType <- "SimpleIntegerList" else if (valuesClass == "numeric") outputListType <- "SimpleNumericList" else stop("cannot compute numeric summary over a non-numeric Rle") } listData <- lapply(structure(seq_len(length(x)), names = names(x)), function(i) FUN(x[[i]], na.rm = na.rm)) } S4Vectors:::new_SimpleList_from_list(outputListType, listData, metadata = metadata(x), mcols = mcols(x)) } setMethod("viewMins", "RleViewsList", function(x, na.rm = FALSE) .summaryRleViewsList(x, FUN = viewMins, na.rm = na.rm)) setMethod("viewMaxs", "RleViewsList", function(x, na.rm = FALSE) .summaryRleViewsList(x, FUN = viewMaxs, na.rm = na.rm)) setMethod("viewSums", "RleViewsList", function(x, na.rm = FALSE) .summaryRleViewsList(x, FUN = viewSums, na.rm = na.rm)) setMethod("viewMeans", "RleViewsList", function(x, na.rm = FALSE) .summaryRleViewsList(x, FUN = viewMeans, na.rm = na.rm, outputListType = "SimpleNumericList")) setMethod("viewWhichMins", "RleViewsList", function(x, na.rm = FALSE) .summaryRleViewsList(x, FUN = viewWhichMins, na.rm = na.rm, outputListType = "SimpleIntegerList")) setMethod("viewWhichMaxs", "RleViewsList", function(x, na.rm = FALSE) .summaryRleViewsList(x, FUN = viewWhichMaxs, na.rm = na.rm, outputListType = "SimpleIntegerList")) setMethod("viewRangeMaxs", "RleViewsList", function(x, na.rm = FALSE) .summaryRleViewsList(x, FUN = viewRangeMaxs, na.rm = na.rm, outputListType = "SimpleIRangesList")) setMethod("viewRangeMins", "RleViewsList", function(x, na.rm = FALSE) .summaryRleViewsList(x, FUN = viewRangeMins, na.rm = na.rm, outputListType = "SimpleIRangesList")) IRanges/R/Vector-class-leftovers.R0000644000175400017540000001305013175713360020035 0ustar00biocbuildbiocbuild### ========================================================================= ### IMPORTANT NOTE - 4/29/2014 ### Most of the stuff that used to be in the IRanges/R/Vector-class.R file ### was moved to the S4Vectors package (to R/Vector-class.R). ### The stuff that could not be moved there was *temporarily* kept here in ### Vector-class-leftovers.R but will need to find a new home (in S4Vectors ### or in IRanges). ### ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Other subsetting-related operations ### ### S3/S4 combo for window<-.Vector `window<-.Vector` <- function(x, start=NA, end=NA, width=NA, ..., value) { window(x, start, end, width, ...) <- value x } `.window<-.Vector` <- function(x, start=NA, end=NA, width=NA, ..., value) { i <- solveUserSEWForSingleSeq(NROW(x), start, end, width) li <- width(i) if (li == 0L) { ## Surprisingly, in that case, `[<-` on standard vectors does not ## even look at 'value'. So neither do we... return(x) } lv <- NROW(value) if (lv == 0L) stop("replacement has length zero") value <- normalizeSingleBracketReplacementValue(value, x) if (li != lv) { if (li %% lv != 0L) warning("number of values supplied is not a sub-multiple ", "of the number of values to be replaced") value <- extractROWS(value, rep(seq_len(lv), length.out=li)) } c(window(x, end=start(i)-1L), value, window(x, start=end(i)+1L)) } setReplaceMethod("window", "Vector", `.window<-.Vector`) ### S3/S4 combo for window<-.vector `window<-.vector` <- `window<-.Vector` setReplaceMethod("window", "vector", `window<-.vector`) ### S3/S4 combo for window<-.factor `window<-.factor` <- function(x, start=NA, end=NA, width=NA, ..., value) { levels <- levels(x) x <- as.character(x) value <- as.character(value) factor(callGeneric(), levels=levels) } setReplaceMethod("window", "factor", `window<-.factor`) setMethod("rev", "Vector", function(x) { if (length(x) == 0L) return(x) x[length(x):1] } ) setMethod("rep", "Vector", function(x, ...) x[rep(seq_len(length(x)), ...)]) setMethod("subset", "Vector", function(x, subset, select, drop = FALSE, ...) { i <- S4Vectors:::evalqForSubset(subset, x, ...) if (!is.null(mcols(x))) { j <- S4Vectors:::evalqForSelect(select, mcols(x), ...) mcols(x) <- mcols(x)[,j,drop=FALSE] } x[i, drop=drop] }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### mstack() ### setGeneric("mstack", function(..., .index.var = "name") standardGeneric("mstack"), signature = "...") BiocGenerics:::apply_hotfix73465(getGeneric("mstack")) setMethod("mstack", "Vector", function(..., .index.var = "name") { if (!isSingleString(.index.var)) stop("'.index.var' must be a single, non-NA string") args <- list(...) combined <- compress_listData(args) df <- .stack.ind(args, .index.var) if (!is.null(mcols(combined))) df <- cbind(df, mcols(combined)) mcols(combined) <- df combined }) setMethod("mstack", "vector", function(..., .index.var = "name") { if (!isSingleString(.index.var)) stop("'.index.var' must be a single, non-NA string") args <- list(...) combined <- compress_listData(args) df <- DataFrame(.stack.ind(args, .index.var), combined) if (ncol(df) == 2L) colnames(df)[2] <- "value" df }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Looping methods ### #.tapplyDefault <- base::tapply #environment(.tapplyDefault) <- topenv() .tapplyDefault <- function(X, INDEX, FUN = NULL, ..., simplify = TRUE) { if (!is.null(FUN)) FUN <- match.fun(FUN) if (missing(INDEX)) stop("'INDEX' is missing") if (!is(INDEX, "RleList")) { if (!is.list(INDEX) && !is(INDEX, "Rle")) INDEX <- Rle(INDEX) INDEX <- RleList(INDEX) } nI <- length(INDEX) namelist <- vector("list", nI) names(namelist) <- names(INDEX) extent <- integer(nI) nx <- NROW(X) one <- 1L group <- Rle(one, nx) ngroup <- one for (i in seq_len(nI)) { index <- INDEX[[i]] if (!is.factor(runValue(index))) runValue(index) <- factor(runValue(index)) offset <- index runValue(offset) <- ngroup * (as.integer(runValue(index)) - one) if (length(index) != nx) stop("arguments must have same length") namelist[[i]] <- levels(index) extent[i] <- nlevels(index) group <- group + offset ngroup <- ngroup * nlevels(index) } if (is.null(FUN)) return(as.vector(group)) groupRanges <- splitRanges(group) ans <- lapply(groupRanges, function(i) FUN(extractROWS(X, i), ...)) index <- as.integer(names(ans)) if (simplify && all(unlist(lapply(ans, length), use.names=FALSE) == 1L)) { ansmat <- array(dim = extent, dimnames = namelist) ans <- unlist(ans, recursive = FALSE) } else { ansmat <- array(vector("list", prod(extent)), dim = extent, dimnames = namelist) } if (length(index) > 0) { names(ans) <- NULL ansmat[index] <- ans } ansmat } setMethod("tapply", c("Vector", "ANY"), .tapplyDefault) setMethod("tapply", c("ANY", "Vector"), .tapplyDefault) setMethod("tapply", c("Vector", "Vector"), .tapplyDefault) IRanges/R/Views-class.R0000644000175400017540000002746113175713360015674 0ustar00biocbuildbiocbuild### ========================================================================= ### Views objects ### ------------------------------------------------------------------------- ### ### The Views virtual class is a general container for storing a set of views ### on an arbitrary Vector object, called the "subject". ### setClass("Views", contains="List", representation( "VIRTUAL", subject="Vector", ranges="IRanges" ) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Accessor methods. ### setGeneric("subject", function(x) standardGeneric("subject")) setMethod("subject", "Views", function(x) x@subject) setMethod("ranges", "Views", function(x, use.names=TRUE, use.mcols=FALSE) x@ranges ) setGeneric("ranges<-", function(x, ..., value) standardGeneric("ranges<-")) setReplaceMethod("ranges", "Views", function(x, ..., value) { stop("ranges setter for Views objects not ready yet") } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Methods derived from the IRanges interface. ### setMethod("length", "Views", function(x) length(ranges(x))) setMethod("start", "Views", function(x, ...) start(ranges(x))) setMethod("end", "Views", function(x, ...) end(ranges(x))) setMethod("width", "Views", function(x) width(ranges(x))) setMethod("names", "Views", function(x) names(ranges(x))) setReplaceMethod("start", "Views", function(x, ..., value) { start(x@ranges, ...) <- value x } ) setReplaceMethod("end", "Views", function(x, ..., value) { end(x@ranges, ...) <- value x } ) setReplaceMethod("width", "Views", function(x, ..., value) { width(x@ranges, ...) <- value x } ) setReplaceMethod("names", "Views", function(x, value) { names(x@ranges) <- value x } ) setMethod("extractROWS", "Views", function(x, i) { i <- normalizeSingleBracketSubscript(i, x, as.NSBS=TRUE) x@ranges <- extractROWS(ranges(x), i) mcols(x) <- extractROWS(mcols(x), i) x } ) setMethod("elementNROWS", "Views", function(x) width(x)) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Validity. ### .valid.Views.width <- function(x) { if (length(width(x)) != 0L && min(width(x)) <= 0L) return("null widths are not allowed") NULL } setValidity2("Views", .valid.Views.width) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructor ### ### The low-level "Views" constructor. ### NOT exported but used in XVector, Biostrings, and triplex packages. ### TODO: - add a 'check.limits' arg (default to TRUE) for raising an error if ### some views are "out of limits". new_Views <- function(subject, start=NULL, end=NULL, width=NULL, names=NULL, Class=NULL) { if (is(start, "Ranges")) { if (!is.null(end) || !is.null(width)) stop(wmsg("'end' and 'width' must be NULLs when ", "'start' is a Ranges object")) ans_ranges <- start if (class(ans_ranges) != "IRanges") ans_ranges <- as(ans_ranges, "IRanges") ## Keep the names that are already in 'ranges' unless the 'names' arg ## was specified. if (!is.null(names)) names(ans_ranges) <- names ans_mcols <- mcols(ans_ranges) mcols(ans_ranges) <- NULL } else { ans_ranges <- IRanges(start=start, end=end, width=width, names=names) ans_mcols <- NULL } if (is.null(Class)) Class <- paste(class(subject), "Views", sep="") new2(Class, subject=subject, ranges=ans_ranges, elementMetadata=ans_mcols, check=FALSE) } ### The user-friendly "Views" constructor. ### TODO: Same as for the new_Views() function above. setGeneric("Views", signature="subject", function(subject, start=NULL, end=NULL, width=NULL, names=NULL) standardGeneric("Views") ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion. ### ### Returns a single view covering the entire sequence. setAs("Vector", "Views", function(from) Views(from, start=1L, width=length(from)) ) setAs("Views", "Ranges", function(from) ranges(from)) setAs("Views", "IRanges", function(from) ranges(from)) ### Unfortunately, even if we've already defined the IRanges->NormalIRanges ### "coerce" method to override the silly implicit one, we still need to ### define the ->NormalIRanges ones for every that contains ### IRanges. Otherwise, again, 'as(x, "NormalIRanges")' would call another ### silly implicit method when 'x' is a instance. ### Yes, this is another S4 "feature": ### https://stat.ethz.ch/pipermail/r-devel/2008-April/049027.html setAs("Views", "NormalIRanges", function(from) asNormalIRanges(ranges(from), force=TRUE) ) setMethod("as.matrix", "Views", function(x, rev = FALSE, max.width = NA) { x_ranges <- restrict(ranges(x), start = 1L) if (is.na(max.width)) { max.width <- max(width(x_ranges)) } rev <- S4Vectors:::recycleVector(rev, length(x)) part <- PartitioningByWidth(x_ranges) ord <- S4Vectors:::mseq(ifelse(rev, end(part), start(part)), ifelse(rev, start(part), end(part))) v <- extractROWS(subject(x), x_ranges)[ord] v_fill <- rep.int(NA, max.width * length(x)) part <- PartitioningByWidth(rep(max.width, length(x))) i <- as.integer(IRanges(start(part), width = width(x_ranges))) v_fill[i] <- as.vector(v) matrix(v_fill, ncol = max.width, byrow = TRUE, dimnames = list(names(x), NULL)) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Extracting a view. ### setMethod("getListElement", "Views", function(x, i, exact=TRUE) { i <- normalizeDoubleBracketSubscript(i, x, exact=exact, error.if.nomatch=TRUE) start <- start(x)[i] end <- end(x)[i] if (start < 1L || end > length(subject(x))) stop("view is out of limits") extractROWS(subject(x), IRanges(start, end)) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Combining Views objects. ### setMethod("c", "Views", function(x, ..., ignore.mcols=FALSE, recursive=FALSE) { if (!identical(recursive, FALSE)) stop("\"c\" method for Views objects ", "does not support the 'recursive' argument") if (!isTRUEorFALSE(ignore.mcols)) stop("'ignore.mcols' must be TRUE or FALSE") if (missing(x)) { args <- unname(list(...)) x <- args[[1L]] } else { args <- unname(list(x, ...)) } if (length(args) == 1L) return(x) arg_is_null <- sapply(args, is.null) if (any(arg_is_null)) args[arg_is_null] <- NULL # remove NULL elements by setting them to NULL! if (!all(sapply(args, is, class(x)))) stop("all arguments in '...' must be ", class(x), " objects (or NULLs)") ok <- sapply(args, function(arg) isTRUE(all.equal(subject(arg), subject(x)))) if (!all(ok)) stop("all Views objects to combine must have the same subject") x@ranges <- do.call(c, lapply(args, ranges)) if (ignore.mcols) { mcols(x) <- NULL } else { mcols(x) <- do.call(S4Vectors:::rbind_mcols, args) } validObject(x) x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "trim" function. ### setGeneric("trim", signature="x", function(x, use.names=TRUE, ...) standardGeneric("trim") ) setMethod("trim", "Views", function(x, use.names=TRUE) { if (length(x) == 0L) return(x) if (min(start(x)) >= 1L && max(end(x)) <= length(subject(x))) return(x) x@ranges <- restrict(ranges(x), start=1L, end=length(subject(x)), keep.all.ranges=TRUE, use.names=use.names) x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "subviews" function. ### ### TODO: - add a 'check.limits' arg (default to TRUE) for raising an error if ### some views are "out of limits" setGeneric("subviews", signature="x", function(x, start=NA, end=NA, width=NA, use.names=TRUE) standardGeneric("subviews") ) setMethod("subviews", "Views", function(x, start=NA, end=NA, width=NA, use.names=TRUE) trim(narrow(x, start=start, end=end, width=width, use.names=use.names)) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "successiveViews" function. ### successiveViews <- function(subject, width, gapwidth=0, from=1) { ranges <- successiveIRanges(width, gapwidth=gapwidth, from=from) Views(subject, ranges) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "slidingViews" function. ### slidingViews <- function(subject, width, shift = 1L) { ranges <- slidingIRanges(length(subject), width, shift) Views(subject, ranges) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "viewApply" function. ### setGeneric("viewApply", signature="X", function(X, FUN, ..., simplify = TRUE) standardGeneric("viewApply") ) setMethod("viewApply", "Views", function(X, FUN, ..., simplify = TRUE) { X <- trim(X) Xsubject <- subject(X) Xstart <- start(X) Xwidth <- width(X) ans <- sapply(structure(seq_len(length(X)), names=names(X)), function(i) FUN(extractROWS(Xsubject, IRanges(start=Xstart[i], width=Xwidth[i])), ...), simplify = simplify) if (!simplify) { ans <- S4Vectors:::new_SimpleList_from_list("SimpleList", ans, metadata = metadata(X), mcols = mcols(X)) } ans } ) setGeneric("viewMins", signature="x", function(x, na.rm = FALSE) standardGeneric("viewMins")) setGeneric("viewMaxs", signature="x", function(x, na.rm = FALSE) standardGeneric("viewMaxs")) setGeneric("viewSums", signature="x", function(x, na.rm = FALSE) standardGeneric("viewSums")) setGeneric("viewMeans", signature="x", function(x, na.rm = FALSE) standardGeneric("viewMeans")) setGeneric("viewWhichMins", signature="x", function(x, na.rm = FALSE) standardGeneric("viewWhichMins")) setGeneric("viewWhichMaxs", signature="x", function(x, na.rm = FALSE) standardGeneric("viewWhichMaxs")) setGeneric("viewRangeMaxs", function(x, na.rm = FALSE) standardGeneric("viewRangeMaxs")) setGeneric("viewRangeMins", function(x, na.rm = FALSE) standardGeneric("viewRangeMins")) setMethod("Summary", "Views", function(x, ..., na.rm = FALSE) { viewSummaryFunMap <- list(min = viewMins, max = viewMaxs, sum = viewSums) viewSummaryFun <- viewSummaryFunMap[[.Generic]] if (!is.null(viewSummaryFun)) { if (length(list(...))) stop("Passing multiple arguments to '", .Generic, "' is not supported.") viewSummaryFun(x, na.rm = na.rm) } else { Summary(ranges(x), ..., na.rm = na.rm) } }) setMethod("mean", "Views", viewMeans) setMethod("which.max", "Views", function(x) { viewWhichMaxs(x, na.rm = TRUE) }) setMethod("which.min", "Views", function(x) { viewWhichMins(x, na.rm = TRUE) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Show ### setMethod("showAsCell", "Views", function(object) { showAsCell(as(object, relistToClass(subject(object)))) }) IRanges/R/ViewsList-class.R0000644000175400017540000000552013175713360016520 0ustar00biocbuildbiocbuild### ========================================================================= ### ViewsList objects ### ------------------------------------------------------------------------- setClass("ViewsList", contains="List", representation("VIRTUAL"), prototype(elementType="Views") ) setClass("SimpleViewsList", contains=c("ViewsList", "SimpleList"), representation("VIRTUAL"), prototype(elementType="Views") ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Accessor methods. ### setMethod("ranges", "SimpleViewsList", function(x, use.names=TRUE, use.mcols=FALSE) S4Vectors:::new_SimpleList_from_list("SimpleIRangesList", lapply(x, ranges, use.names=use.names, use.mcols=use.mcols)) ) setMethod("start", "SimpleViewsList", function(x, ...) start(ranges(x))) setMethod("end", "SimpleViewsList", function(x, ...) end(ranges(x))) setMethod("width", "SimpleViewsList", function(x) width(ranges(x))) ### TODO: Why not define this at the List level? Or even at the Vector level? setMethod("universe", "ViewsList", function(x) { .Deprecated(msg="The universe() getter is deprecated.") ### FIXME: for compatibility with older versions, eventually emit warning if (is.null(metadata(x)) || is.character(metadata(x))) metadata(x) else metadata(x)$universe }) ### TODO: Why not define this at the List level? Or even at the Vector level? setReplaceMethod("universe", "ViewsList", function(x, value) { .Deprecated(msg="The universe() setter is deprecated.") if (!is.null(value) && !isSingleString(value)) stop("'value' must be a single string or NULL") metadata(x)$universe <- value x }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion ### setMethod("as.matrix", "ViewsList", function(x, rev = FALSE, use.names = FALSE) { if (!isTRUEorFALSE(use.names)) stop("use.names must be TRUE or FALSE") rev <- normargAtomicList1(rev, LogicalList, length(x)) max_width <- max(max(width(restrict(ranges(x), start = 1L)))) m <- do.call(rbind, mapply(as.matrix, x, rev, IntegerList(max_width), SIMPLIFY = FALSE)) nms <- names(x) if (!is.null(nms) && use.names) { nms <- rep(nms, elementNROWS(x)) rownms <- rownames(m) if (is.null(rownms)) rownms <- as.integer(IRanges(1L, width = elementNROWS(x))) rownames(m) <- paste(nms, rownms, sep = ".") } m }) IRanges/R/cbind-Rle-methods.R0000644000175400017540000000325413175713360016726 0ustar00biocbuildbiocbuild### ========================================================================= ### Binding Rle or RleList objects together ### ------------------------------------------------------------------------- ### Return a DataFrame object with 1 row per run. Its first column is ### "runLength" and is followed by 1 column per supplied Rle object. setMethod("cbind", "Rle", function(...) { args <- list(...) args_names <- names(args) if (is.null(args_names)) { noname_idx <- seq_along(args) } else { noname_idx <- which(args_names %in% c("", NA_character_)) } if (length(noname_idx) != 0L) names(args)[noname_idx] <- paste0("V", noname_idx) ## TODO: Add 'with.revmap' arg to disjoin method for Ranges object. ## Then use that feature to avoid the call to findOverlaps() below. ans_runs <- disjoin(do.call(c, unname(lapply(args, ranges)))) DataFrame( runLength=width(ans_runs), DataFrame( lapply(args, function(x) { run_idx <- findOverlaps(ans_runs, ranges(x), type="within", select="arbitrary") runValue(x)[run_idx] }) ) ) } ) ### The supplied RleList objects are recycled the "mapply way" if necessary. ### Return a CompressedSplitDataFrameList object parallel to the longest ### supplied RleList object. setMethod("cbind", "RleList", function(...) { args <- list(...) DF_list <- do.call(mapply, c(list(cbind), args, list(SIMPLIFY=FALSE))) as(DF_list, "CompressedSplitDataFrameList") } ) IRanges/R/coverage-methods.R0000644000175400017540000002662613175713360016732 0ustar00biocbuildbiocbuild### ========================================================================= ### coverage() ### ------------------------------------------------------------------------- ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### .IRanges.coverage() and .CompressedIRangesList.coverage() ### ### These 2 internal helpers are the workhorses behind most "coverage" ### methods. All the hard work is almost entirely performed at the C level. ### Only some argument checking/normalization plus the "folding" of the ### result are performed in R. ### .fold_and_truncate_coverage <- function(cvg, circle.length, width) { cvg <- fold(cvg, circle.length) if (is.na(width)) return(cvg) head(cvg, n=width) } ### Returns an Rle object. .IRanges.coverage <- function(x, shift=0L, width=NULL, weight=1L, circle.length=NA, method=c("auto", "sort", "hash")) { ## Check 'x'. if (!is(x, "IRanges")) stop("'x' must be an IRanges object") ## 'shift' will be checked at the C level. if (is(shift, "Rle")) shift <- S4Vectors:::decodeRle(shift) ## Check 'width'. if (is.null(width)) { width <- NA_integer_ } else if (!isSingleNumberOrNA(width)) { stop("'width' must be NULL or a single integer") } else if (!is.integer(width)) { width <- as.integer(width) } ## 'weight' will be checked at the C level. if (is(weight, "Rle")) weight <- S4Vectors:::decodeRle(weight) ## Check 'circle.length'. if (!isSingleNumberOrNA(circle.length)) stop("'circle.length' must be a single integer") if (!is.integer(circle.length)) circle.length <- as.integer(circle.length) ## Check 'method'. method <- match.arg(method) ## Ready to go... ans <- .Call2("IRanges_coverage", x, shift, width, weight, circle.length, method, PACKAGE="IRanges") if (is.na(circle.length)) return(ans) .fold_and_truncate_coverage(ans, circle.length, width) } ### Returns an ordinary list. .normarg_shift_or_weight <- function(arg, arg.label) { if (!is.list(arg)) { if (!(is.numeric(arg) || (is(arg, "Rle") && is.numeric(runValue(arg))) || is(arg, "List"))) stop("'", arg.label, "' must be a numeric vector ", "or a list-like object") arg <- as.list(arg) } if (length(arg) != 0L) { idx <- which(sapply(arg, is, "Rle")) if (length(idx) != 0L) arg[idx] <- lapply(arg[idx], S4Vectors:::decodeRle) } arg } .check_arg_names <- function(arg, arg.label, x_names, x_names.label) { arg_names <- names(arg) if (!(is.null(arg_names) || identical(arg_names, x_names))) stop("when '", arg.label, "' has names, ", "they must be identical to ", x_names.label) } ## Some packages like easyRNASeq or TEQC pass 'width' as a named list-like ## object where each list element is a single number, an NA, or a NULL, when ## calling coverage() on a RangesList or RangedData object. They do so because, ## for whatever reason, we've been supporting this for a while, and also ## because, in the case of the method for RangedData objects, the arg default ## for 'width' used to be such a list (a named list of NULLs in that case). ## However, it never really made sense to support a named list-like object for ## 'width', and it makes even less sense now that the signature of the method ## for RangedData objects has been modified (as of BioC 2.13) to use the same ## arg defaults as the coverage() generic and all other methods. ## TODO: Deprecate support for this. Preferred 'width' form: NULL or an integer ## vector. An that's it. .unlist_width <- function(width, x_names, x_names.label) { if (!identical(names(width), x_names)) stop("when 'width' is a list-like object, it must be named ", "and its names must be identical to ", x_names.label) width_eltNROWS <- elementNROWS(width) if (!all(width_eltNROWS <= 1L)) stop("when 'width' is a list-like object, each list element ", "should contain at most 1 element or be NULL") width[width_eltNROWS == 0L] <- NA_integer_ setNames(unlist(width, use.names=FALSE), x_names) } ### Returns a SimpleRleList object of the length of 'x'. .CompressedIRangesList.coverage <- function(x, shift=0L, width=NULL, weight=1L, circle.length=NA, method=c("auto", "sort", "hash"), x_names.label="'x' names") { ## Check 'x'. if (!is(x, "CompressedIRangesList")) stop("'x' must be a CompressedIRangesList object") x_names <- names(x) ## Check and normalize 'shift'. shift <- .normarg_shift_or_weight(shift, "shift") .check_arg_names(shift, "shift", x_names, x_names.label) ## Check and normalize 'width'. if (is.null(width)) { width <- NA_integer_ } else { if (is.numeric(width)) { .check_arg_names(width, "width", x_names, x_names.label) } else if (is.list(width) || is(width, "List")) { width <- .unlist_width(width, x_names, x_names.label) } else { ## We purposedly omit to mention that 'width' can also be a named ## list-like object because this will be deprecated soon (this is ## why it's not documented in man/coverage-methods.Rd either). stop("'width' must be NULL or an integer vector") } if (!is.integer(width)) width <- setNames(as.integer(width), names(width)) } ## Check and normalize 'weight'. weight <- .normarg_shift_or_weight(weight, "weight") .check_arg_names(weight, "weight", x_names, x_names.label) ## Check and normalize 'circle.length'. if (identical(circle.length, NA)) { circle.length <- NA_integer_ } else if (!is.numeric(circle.length)) { stop("'circle.length' must be an integer vector") } else if (!is.integer(circle.length)) { circle.length <- setNames(as.integer(circle.length), names(circle.length)) } .check_arg_names(circle.length, "circle.length", x_names, x_names.label) ## Check and normalize 'method'. method <- match.arg(method) ## Ready to go... ans_listData <- .Call2("CompressedIRangesList_coverage", x, shift, width, weight, circle.length, method, PACKAGE="IRanges") ## "Fold" the coverage vectors in 'ans_listData' associated with a ## circular sequence. ## Note that the C code should have raised an error or warning already if ## the length of 'circle.length' or 'width' didn't allow proprer recycling ## to the length of 'x'. So using silent 'rep( , length.out=length(x))' is ## safe. circle.length <- rep(circle.length, length.out=length(x)) fold_idx <- which(!is.na(circle.length)) if (length(fold_idx) != 0L) { width <- rep(width, length.out=length(x)) ## Because we "fold" the coverage vectors in an lapply() loop, it will ## be inefficient if 'x' has a lot of list elements associated with a ## circular sequence. ans_listData[fold_idx] <- lapply(fold_idx, function(i) .fold_and_truncate_coverage(ans_listData[[i]], circle.length[i], width[i])) } names(ans_listData) <- names(x) S4Vectors:::new_SimpleList_from_list("SimpleRleList", ans_listData, metadata=metadata(x), mcols=mcols(x)) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### coverage() generic and methods. ### setGeneric("coverage", signature="x", function(x, shift=0L, width=NULL, weight=1L, ...) standardGeneric("coverage") ) setMethod("coverage", "Ranges", function(x, shift=0L, width=NULL, weight=1L, method=c("auto", "sort", "hash")) { if (isSingleString(weight)) { x_mcols <- mcols(x) if (!is(x_mcols, "DataTable") || sum(colnames(x_mcols) == weight) != 1L) stop("'mcols(x)' has 0 or more than 1 \"", weight, "\" columns") weight <- x_mcols[[weight]] } .IRanges.coverage(as(x, "IRanges"), shift=shift, width=width, weight=weight, method=method) } ) setMethod("coverage", "Views", function(x, shift=0L, width=NULL, weight=1L, method=c("auto", "sort", "hash")) { if (is.null(width)) width <- length(subject(x)) coverage(as(x, "IRanges"), shift=shift, width=width, weight=weight, method=method) } ) setMethod("coverage", "RangesList", function(x, shift=0L, width=NULL, weight=1L, method=c("auto", "sort", "hash")) { x_mcols <- mcols(x) x_mcolnames <- colnames(x_mcols) if (isSingleString(shift)) { if (!(shift %in% x_mcolnames)) stop("the string supplied for 'shift' (\"", shift, "\")", "is not a valid metadata column name of 'x'") shift <- x_mcols[[shift]] } if (isSingleString(width)) { if (!(width %in% x_mcolnames)) stop("the string supplied for 'width' (\"", width, "\")", "is not a valid metadata column name of 'x'") width <- x_mcols[[width]] } if (isSingleString(weight)) { if (!(weight %in% x_mcolnames)) stop("the string supplied for 'weight' (\"", weight, "\")", "is not a valid metadata column name of 'x'") weight <- x_mcols[[weight]] } .CompressedIRangesList.coverage(as(x, "CompressedIRangesList"), shift=shift, width=width, weight=weight, method=method) } ) .coverage_RangedData_deprecation_msg <- c( "The \"coverage\" method for RangedData objects is deprecated ", "and won't be replaced. Please migrate your code to use GRanges or ", "GRangesList objects instead. RangedData objects will be deprecated ", "soon (their use has been discouraged since BioC 2.12, that is, since ", "2014). See IMPORTANT NOTE in ?RangedData" ) setMethod("coverage", "RangedData", function(x, shift=0L, width=NULL, weight=1L, method=c("auto", "sort", "hash")) { .Deprecated(msg=wmsg(.coverage_RangedData_deprecation_msg)) x_ranges <- ranges(x) if (length(metadata(x)) > 0) metadata(x_ranges) <- metadata(x) varnames <- colnames(x) if (isSingleString(shift) && (shift %in% varnames)) shift <- values(x)[, shift] if (isSingleString(weight) && (weight %in% varnames)) weight <- values(x)[, weight] coverage(x_ranges, shift=shift, width=width, weight=weight, method=method) } ) IRanges/R/extractList.R0000644000175400017540000003315713175713360016001 0ustar00biocbuildbiocbuild### ========================================================================= ### Group elements of a vector-like object into a list-like object ### ------------------------------------------------------------------------- ### ### What should go in this file? ### ### - All "relist" methods defined in IRanges should go here. ### - extractList() generic and default method. ### ### TODO: Maybe put the default methods for the reverse transformations here ### (unlist, unsplit, and unsplit<-). ### ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### relist() ### setMethod("relist", c("ANY", "PartitioningByEnd"), function(flesh, skeleton) { ans_class <- relistToClass(flesh) skeleton_len <- length(skeleton) if (skeleton_len == 0L) { flesh_len2 <- 0L } else { flesh_len2 <- end(skeleton)[skeleton_len] } if (NROW(flesh) != flesh_len2) stop("shape of 'skeleton' is not compatible with 'NROW(flesh)'") if (extends(ans_class, "CompressedList")) return(newCompressedList0(ans_class, flesh, skeleton)) if (!extends(ans_class, "SimpleList")) stop("don't know how to split or relist a ", class(flesh), " object as a ", ans_class, " object") listData <- lapply(skeleton, function(i) extractROWS(flesh, i)) ## TODO: Once "window" methods have been revisited/tested and ## 'window(flesh, start=start, end=end)' is guaranteed to do the ## right thing for any 'flesh' object (in particular it subsets a ## data.frame-like object along the rows), then replace the line above ## by the code below (which should be more efficient): #skeleton_start <- start(skeleton) #skeleton_end <- end(skeleton) #FUN <- function(start, end) window(flesh, start=start, end=end) #names(skeleton_start) <- names(skeleton) #listData <- mapply(FUN, skeleton_start, skeleton_end) ## or, if we don't trust mapply(): #skeleton_start <- start(skeleton) #skeleton_end <- end(skeleton) #X <- seq_len(skeleton_len) #names(X) <- names(skeleton) #listData <- lapply(X, function(i) window(flesh, # start=skeleton_start[i], # end=skeleton_end[i])) S4Vectors:::new_SimpleList_from_list(ans_class, listData) } ) setMethod("relist", c("ANY", "List"), function(flesh, skeleton) { relist(flesh, PartitioningByEnd(skeleton)) } ) setMethod("relist", c("Vector", "list"), function(flesh, skeleton) { relist(flesh, PartitioningByEnd(skeleton)) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### splitAsList() ### ### 'f' is assumed to be an integer vector with no NAs. .splitAsList_by_integer <- function(x, f, drop) { if (length(f) > NROW(x)) stop("'f' cannot be longer than 'NROW(x)' when it's an integer vector") if (!identical(drop, FALSE)) warning("'drop' is ignored when 'f' is an integer vector") f_is_not_sorted <- S4Vectors:::isNotSorted(f) if (f_is_not_sorted) { idx <- base::order(f) f <- f[idx] x <- extractROWS(x, idx) } tmp <- Rle(f) f <- cumsum(runLength(tmp)) names(f) <- as.character(runValue(tmp)) f <- PartitioningByEnd(f) relist(x, f) } ### 'f' is assumed to be a factor with no NAs. .splitAsList_by_factor <- function(x, f, drop) { x_NROW <- NROW(x) f_len <- length(f) f_levels <- levels(f) f <- as.integer(f) if (f_len > x_NROW) f <- head(f, n=x_NROW) f_is_not_sorted <- S4Vectors:::isNotSorted(f) if (f_is_not_sorted) { idx <- base::order(f) x <- extractROWS(x, idx) } f <- tabulate(f, nbins=length(f_levels)) names(f) <- f_levels if (drop) f <- f[f != 0L] f <- cumsum(f) f <- PartitioningByEnd(f) relist(x, f) } ### 'f' is assumed to be an integer-Rle object with no NAs. .splitAsList_by_integer_Rle <- function(x, f, drop) { if (length(f) > NROW(x)) stop("'f' cannot be longer than data when it's an integer-Rle") if (!identical(drop, FALSE)) warning("'drop' is ignored when 'f' is an integer-Rle") f_vals <- runValue(f) f_lens <- runLength(f) f_is_not_sorted <- S4Vectors:::isNotSorted(f_vals) if (f_is_not_sorted) { idx <- base::order(f_vals) xranges <- successiveIRanges(f_lens)[idx] f_vals <- f_vals[idx] f_lens <- f_lens[idx] x <- extractROWS(x, xranges) } tmp <- Rle(f_vals, f_lens) f <- cumsum(runLength(tmp)) names(f) <- as.character(runValue(tmp)) f <- PartitioningByEnd(f) relist(x, f) } ### 'f' is assumed to be an Rle object with no NAs. .splitAsList_by_Rle <- function(x, f, drop) { x_NROW <- NROW(x) f_len <- length(f) f_vals <- runValue(f) if (!is.factor(f_vals)) { f_vals <- as.factor(f_vals) if (f_len > x_NROW) { runValue(f) <- f_vals f <- head(f, n=x_NROW) f_vals <- runValue(f) } } else if (f_len > x_NROW) { f <- head(f, n=x_NROW) f_vals <- runValue(f) } f_lens <- runLength(f) f_levels <- levels(f_vals) f_vals <- as.integer(f_vals) f_is_not_sorted <- S4Vectors:::isNotSorted(f_vals) if (f_is_not_sorted) { idx <- base::order(f_vals) xranges <- successiveIRanges(f_lens)[idx] x <- extractROWS(x, xranges) f <- S4Vectors:::tabulate2(f_vals, nbins=length(f_levels), weight=f_lens) if (drop) { f_levels <- f_levels[f != 0L] f <- f[f != 0L] } } else if (length(f_vals) == length(f_levels) || drop) { if (drop) f_levels <- as.character(runValue(f)) f <- f_lens } else { f <- integer(length(f_levels)) f[f_vals] <- f_lens } names(f) <- f_levels f <- cumsum(f) f <- PartitioningByEnd(f) relist(x, f) } setGeneric("splitAsList", signature=c("x", "f"), function(x, f, drop=FALSE, ...) standardGeneric("splitAsList") ) toFactor <- function(x) { if (is(x, "Rle")) { runValue(x) <- as.factor(runValue(x)) x } else as.factor(x) } ### Took this out of the still-in-incubation LazyList package interaction2 <- function(factors) { nI <- length(factors) nx <- length(factors[[1L]]) factors <- lapply(factors, toFactor) useRle <- any(vapply(factors, is, logical(1), "Rle")) if (useRle) { group <- as(factors[[1L]], "Rle") runValue(group) <- as.integer(runValue(group)) } else { group <- as.integer(factors[[1L]]) } ngroup <- nlevels(factors[[1L]]) for (i in tail(seq_len(nI), -1L)) { index <- factors[[i]] if (useRle) { offset <- as(index, "Rle") runValue(offset) <- ngroup * (as.integer(runValue(offset)) - 1L) } else { offset <- ngroup * (as.integer(index) - 1L) } group <- group + offset ngroup <- ngroup * nlevels(index) } if (useRle) { runValue(group) <- structure(runValue(group), levels=as.character(seq_len(ngroup)), class="factor") group } else { structure(group, levels=as.character(seq_len(ngroup)), class="factor") } } normSplitFactor <- function(f, x) { if (is(f, "formula")) { if (length(f) == 3L) stop("formula 'f' should not have a left hand side") f <- S4Vectors:::formulaValues(x, f) } if (is.list(f) || is(f, "List")) { if (length(f) == 1L) { f <- toFactor(f[[1L]]) } else { f <- interaction2(f) } } f_len <- length(f) if (f_len < NROW(x)) { if (f_len == 0L) stop("split factor has length 0 but 'NROW(x)' is > 0") if (NROW(x) %% f_len != 0L) warning("'NROW(x)' is not a multiple of split factor length") f <- rep(f, length.out=NROW(x)) } f } ## about 3X faster than as.factor on a ~450k tx ids ## caveats: no NAs, and radix sort of levels does not support all encodings ## todo: Would be faster if sort() returned grouping info, ## but then we might coalesce this with the order/split. ## todo: if we could pass na.rm=TRUE to grouping(), NAs would be handled as.factor2 <- function(x) { if (is.factor(x)) return(x) if (is.null(x)) return(factor()) g <- grouping(x) p <- PartitioningByEnd(relist(g)) levs <- as.character(x[g[end(p)]]) if (is.character(x)) { o <- order(levs, method="radix") map <- integer(length(levs)) # or rep(NA_integer_, length(levs)) for NAs map[o] <- seq_along(o) ref <- map[togroup(p)] levs <- levs[o] } else { ref <- togroup(p) } f <- integer(length(x)) f[g] <- ref structure(f, levels=levs, class="factor") } splitAsList_default <- function(x, f, drop=FALSE) { if (!isTRUEorFALSE(drop)) stop("'drop' must be TRUE or FALSE") f <- normSplitFactor(f, x) if (anyNA(f)) { keep_idx <- which(!is.na(f)) x <- extractROWS(x, keep_idx) f <- f[keep_idx] } if (is.integer(f)) return(.splitAsList_by_integer(x, f, drop)) if (!is(f, "Rle")) { f <- as.factor2(f) return(.splitAsList_by_factor(x, f, drop)) } ## From now on, 'f' is guaranteed to be an Rle. f_vals <- runValue(f) if (!((is.vector(f_vals) && is.atomic(f_vals)) || is.factor(f_vals))) stop("'f' must be an atomic vector or a factor (possibly in Rle form)") if (is.integer(f_vals)) return(.splitAsList_by_integer_Rle(x, f, drop)) return(.splitAsList_by_Rle(x, f, drop)) } setMethod("splitAsList", c("ANY", "ANY"), function(x, f, drop=FALSE) splitAsList_default(x, f, drop=drop) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### extractList() ### ### Would extractGroups be a better name for this? ### Or extractGroupedROWS? (analog to extractROWS, except that the ROWS are ### grouped). ### ### 'x' must be a vector-like object and 'i' a list-like object. ### Must return a list-like object parallel to 'i' and with same "shape" as ### 'i' (i.e. same elementNROWS). If 'i' has names, they should be ### propagated to the returned value. The list elements of the returned value ### must have the class of 'x'. setGeneric("extractList", function(x, i) standardGeneric("extractList")) ### Default method. setMethod("extractList", c("ANY", "ANY"), function(x, i) { if (is(i, "Ranges")) return(relist(extractROWS(x, i), i)) if (is.list(i)) { unlisted_i <- unlist(i, recursive=FALSE, use.names=FALSE) } else if (is(i, "List")) { ## The various "unlist" methods for List derivatives don't know ## how to operate recursively and don't support the 'recursive' ## arg. unlisted_i <- unlist(i, use.names=FALSE) } else { stop("'i' must be a list-like object") } relist(extractROWS(x, unlisted_i), i) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### regroupBySupergroup() ### ### A very efficient way to concatenate groups of successive list elements ### in 'x'. ### 'x' must be a list-like object (typically a CompressedList object). ### 'supergroups' must be an object that defines a partitioning of ### 'seq_along(x)' (i.e. it could be used to do ### 'relist(seq_along(x), supergroups)'). It will be immediately replaced with ### 'PartitioningByEnd(supergroups)' so it should be an object that is ### accepted by the PartitioningByEnd() constructor (note that this constructor ### is a no-op if 'supergroups' is already a PartitioningByEnd object). ### Return a list-like object of the same elementType() as 'x' and parallel ### to 'supergroups'. The names on 'supergroups' are propagated but not the ### metadata columns. ### ### Some properties: ### - Behaves as an endomorphism on a CompressedList or PartitioningByEnd ### object. ### - This ### regroupBySupergroup(x, length(x))[[1L]] ### is equivalent to ### unlist(x, use.names=FALSE) ### ### Other possible names for regroupBySupergroup: regroup, ### mergeGroupsInSupergroups, combineGroupsOfListElements, ### unlistGroupsOfListElements, unlistBySupergroup. ### ### TODO: Maybe export and document this? regroupBySupergroup <- function(x, supergroups) { supergroups <- PartitioningByEnd(supergroups) x_breakpoints <- end(PartitioningByEnd(x)) ans_breakpoints <- x_breakpoints[end(supergroups)] nleading0s <- length(supergroups) - length(ans_breakpoints) if (nleading0s != 0L) ans_breakpoints <- c(rep.int(0L, nleading0s), ans_breakpoints) ans_partitioning <- PartitioningByEnd(ans_breakpoints, names=names(supergroups)) if (is(x, "PartitioningByEnd")) return(ans_partitioning) relist(unlist(x, use.names=FALSE), ans_partitioning) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### resplit() and regroup() ### ### Similar to regroupBySupergroup() but there is no assumption that ### the new grouping is a super-grouping of the current grouping. For ### resplit(), the grouping is expressed as a factor, although it is ### effectively a synonym of regroup(), since the latter coerces the ### input to a Grouping. ### resplit <- function(x, f) { regroup(x, f) } regroup <- function(x, g) { g <- as(g, "Grouping") gends <- end(PartitioningByEnd(g)) xg <- x[unlist(g, use.names=FALSE)] p <- PartitioningByEnd(end(PartitioningByEnd(xg))[gends]) names(p) <- names(g) relist(unlist(xg, use.names=FALSE, recursive=FALSE), p) } IRanges/R/findOverlaps-methods.R0000644000175400017540000010205013175713360017555 0ustar00biocbuildbiocbuild### ========================================================================= ### findOverlaps (and related) methods ### ------------------------------------------------------------------------- ### ## internal generic setGeneric("process_self_hits", signature="x", # not exported function(x, select=c("all", "first", "last", "arbitrary"), drop.self=FALSE, drop.redundant=FALSE) standardGeneric("process_self_hits")) setMethod("process_self_hits", "SortedByQueryHits", function(x, select=c("all", "first", "last", "arbitrary"), drop.self=FALSE, drop.redundant=FALSE) { x <- as(x, "SortedByQuerySelfHits") select <- match.arg(select) if (!isTRUEorFALSE(drop.self)) stop("'drop.self' must be TRUE or FALSE") if (!isTRUEorFALSE(drop.redundant)) stop("'drop.redundant' must be TRUE or FALSE") if (drop.self) { self_idx <- which(isSelfHit(x)) if (length(self_idx) != 0L) x <- x[-self_idx] } if (drop.redundant) { redundant_idx <- which(isRedundantHit(x)) if (length(redundant_idx) != 0L) x <- x[-redundant_idx] } selectHits(x, select=select) } ) setMethod("process_self_hits", "SortedByQueryHitsList", function(x, select=c("all", "first", "last", "arbitrary"), drop.self=FALSE, drop.redundant=FALSE) { x <- as(x, "SortedByQuerySelfHitsList") select <- match.arg(select) ans <- lapply(x, process_self_hits, select, drop.self, drop.redundant) if (select != "all") return(IntegerList(ans)) S4Vectors:::new_SimpleList_from_list("HitsList", ans, subjectOffsets = x@subjectOffsets) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### findOverlaps() ### ### Find objects in the query that overlap those in the subject. ### setGeneric("findOverlaps", signature=c("query", "subject"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), select=c("all", "first", "last", "arbitrary"), ...) standardGeneric("findOverlaps") ) findOverlaps_Ranges <- function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), select=c("all", "first", "last", "arbitrary")) { type <- match.arg(type) select <- match.arg(select) findOverlaps_NCList(query, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) } setMethod("findOverlaps", c("Ranges", "Ranges"), findOverlaps_Ranges) setMethod("findOverlaps", c("Vector", "missing"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), select=c("all", "first", "last", "arbitrary"), ..., drop.self=FALSE, drop.redundant=FALSE) { select <- match.arg(select) result <- findOverlaps(query, query, maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type), select="all", ...) process_self_hits(result, select, drop.self, drop.redundant) } ) setMethod("findOverlaps", c("integer", "Ranges"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), select=c("all", "first", "last", "arbitrary")) { findOverlaps(IRanges(query, query), subject, maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type), select=match.arg(select)) } ) setMethod("findOverlaps", c("Views", "Views"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), select=c("all", "first", "last", "arbitrary")) { findOverlaps(ranges(query), ranges(subject), maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type), select=match.arg(select)) } ) setMethod("findOverlaps", c("Views", "Vector"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), select=c("all", "first", "last", "arbitrary")) { findOverlaps(ranges(query), subject, maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type), select=match.arg(select)) } ) setMethod("findOverlaps", c("Vector", "Views"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), select=c("all", "first", "last", "arbitrary")) { findOverlaps(query, ranges(subject), maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type), select=match.arg(select)) } ) # might consider making this the following: # setMethod("findOverlaps", c("RangesList", "RangesList"), # function(query, subject, maxgap = -1L, minoverlap = 0L, # type = c("any", "start", "end", "within", "equal"), # select = c("all", "first", "last", "arbitrary"), # drop = FALSE) # { # findOverlaps(query, NCLists(query), # maxgap = maxgap, minoverlap = minoverlap, # type = match.arg(type), select = match.arg(select), drop = drop) # } # ) setMethod("findOverlaps", c("RangesList", "RangesList"), function(query, subject, maxgap = -1L, minoverlap = 0L, type = c("any", "start", "end", "within", "equal"), select = c("all", "first", "last", "arbitrary"), drop = FALSE) { type <- match.arg(type) select <- match.arg(select) query <- as.list(query) subject <- as.list(subject) origSubject <- subject if (!is.null(names(subject)) && !is.null(names(query))) { subject <- subject[names(query)] names(subject) <- names(query) # get rid of NA's in names } else { subject <- subject[seq_along(query)] } ## NULL's are introduced where they do not match ## We replace those with empty IRanges subject[sapply(subject, is.null)] <- list(IRanges()) ans <- lapply(seq_len(length(subject)), function(i) { findOverlaps(query[[i]], subject[[i]], maxgap = maxgap, minoverlap = minoverlap, type = type, select = select) }) names(ans) <- names(subject) if (select == "all") { ans <- HitsList(ans, origSubject) } else if (drop) { off <- head(c(0L, cumsum(sapply(origSubject, length))), -1) names(off) <- names(origSubject) if (is.null(names(ans))) off <- off[seq_along(ans)] else off <- off[names(ans)] ans <- unlist(ans, use.names=FALSE) + rep.int(unname(off), sapply(ans, length)) } else { ans <- IntegerList(ans) } ans }) setMethod("findOverlaps", c("ViewsList", "ViewsList"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), select=c("all", "first", "last", "arbitrary"), drop=FALSE) { findOverlaps(ranges(query), ranges(subject), maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type), select=match.arg(select), drop=drop) } ) setMethod("findOverlaps", c("ViewsList", "Vector"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), select=c("all", "first", "last", "arbitrary"), drop=FALSE) { findOverlaps(ranges(query), subject, maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type), select=match.arg(select), drop=drop) } ) setMethod("findOverlaps", c("Vector", "ViewsList"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), select=c("all", "first", "last", "arbitrary"), drop=FALSE) { findOverlaps(query, ranges(subject), maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type), select=match.arg(select), drop=drop) } ) setMethod("findOverlaps", c("RangedData", "RangedData"), function(query, subject, maxgap = -1L, minoverlap = 0L, type = c("any", "start", "end", "within", "equal"), select = c("all", "first", "last", "arbitrary"), drop = FALSE) { findOverlaps(ranges(query), ranges(subject), maxgap = maxgap, minoverlap = minoverlap, type = match.arg(type), select = match.arg(select), drop = drop) }) setMethod("findOverlaps", c("RangedData", "RangesList"), function(query, subject, maxgap = -1L, minoverlap = 0L, type = c("any", "start", "end", "within", "equal"), select = c("all", "first", "last", "arbitrary"), drop = FALSE) { findOverlaps(ranges(query), subject, maxgap = maxgap, minoverlap = minoverlap, type = match.arg(type), select = match.arg(select), drop = drop) }) setMethod("findOverlaps", c("RangesList", "RangedData"), function(query, subject, maxgap = -1L, minoverlap = 0L, type = c("any", "start", "end", "within", "equal"), select = c("all", "first", "last", "arbitrary"), drop = FALSE) { findOverlaps(query, ranges(subject), maxgap = maxgap, minoverlap = minoverlap, type = match.arg(type), select = match.arg(select), drop = drop) }) setMethod("findOverlaps", c("Pairs", "missing"), function (query, subject, maxgap = -1L, minoverlap = 0L, type = c("any", "start", "end", "within", "equal"), select = c("all", "first", "last", "arbitrary"), ...) { findOverlaps(zipup(query), maxgap=maxgap, minoverlap=minoverlap, type=type, select=select, ...) }) setMethod("findOverlaps", c("Pairs", "ANY"), function (query, subject, maxgap = -1L, minoverlap = 0L, type = c("any", "start", "end", "within", "equal"), select = c("all", "first", "last", "arbitrary"), ...) { findOverlaps(zipup(query), subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select, ...) }) setMethod("findOverlaps", c("ANY", "Pairs"), function (query, subject, maxgap = -1L, minoverlap = 0L, type = c("any", "start", "end", "within", "equal"), select = c("all", "first", "last", "arbitrary"), ...) { findOverlaps(query, zipup(subject), maxgap=maxgap, minoverlap=minoverlap, type=type, select=select, ...) }) setMethod("findOverlaps", c("Pairs", "Pairs"), function (query, subject, maxgap = -1L, minoverlap = 0L, type = c("any", "start", "end", "within", "equal"), select = c("all", "first", "last", "arbitrary"), ...) { findOverlaps(zipup(query), zipup(subject), maxgap=maxgap, minoverlap=minoverlap, type=type, select=select, ...) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### countOverlaps() ### setGeneric("countOverlaps", signature=c("query", "subject"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), ...) standardGeneric("countOverlaps") ) .countOverlaps_default <- function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), ...) { type <- match.arg(type) if (missing(subject)) { hits <- findOverlaps(query, maxgap=maxgap, minoverlap=minoverlap, type=type, ...) } else { hits <- findOverlaps(query, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, ...) } ans <- countQueryHits(hits) names(ans) <- names(query) ans } setMethod("countOverlaps", c("Vector", "Vector"), .countOverlaps_default) setMethod("countOverlaps", c("Vector", "missing"), .countOverlaps_default) countOverlaps_Ranges <- function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal")) { type <- match.arg(type) ans <- findOverlaps_NCList(query, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select="count") names(ans) <- names(query) ans } setMethod("countOverlaps", c("Ranges", "Ranges"), countOverlaps_Ranges) setMethod("countOverlaps", c("RangesList", "RangesList"), function(query, subject, maxgap=-1L, minoverlap=0L, type = c("any", "start", "end", "within", "equal")) { IntegerList(mapply(countOverlaps, query, subject, MoreArgs = list(maxgap = maxgap, minoverlap = minoverlap, type = match.arg(type)), SIMPLIFY = FALSE)) }) setMethod("countOverlaps", c("ViewsList", "ViewsList"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal")) { countOverlaps(ranges(query), ranges(subject), maxgap=maxgap, minoverlap=minoverlap, type=type) } ) setMethod("countOverlaps", c("ViewsList", "Vector"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal")) { countOverlaps(ranges(query), subject, maxgap=maxgap, minoverlap=minoverlap, type=type) } ) setMethod("countOverlaps", c("Vector", "ViewsList"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal")) { countOverlaps(query, ranges(subject), maxgap=maxgap, minoverlap=minoverlap, type=type) } ) setMethod("countOverlaps", c("RangedData", "RangedData"), function(query, subject, maxgap = -1L, minoverlap = 0L, type = c("any", "start", "end", "within", "equal")) { countOverlaps(ranges(query), ranges(subject), maxgap = maxgap, minoverlap = minoverlap, type = match.arg(type)) }) setMethod("countOverlaps", c("RangedData", "RangesList"), function(query, subject, maxgap = -1L, minoverlap = 0L, type = c("any", "start", "end", "within", "equal")) { countOverlaps(ranges(query), subject, maxgap = maxgap, minoverlap = minoverlap, type = match.arg(type)) }) setMethod("countOverlaps", c("RangesList", "RangedData"), function(query, subject, maxgap = -1L, minoverlap = 0L, type = c("any", "start", "end", "within", "equal")) { countOverlaps(query, ranges(subject), maxgap = maxgap, minoverlap = minoverlap, type = match.arg(type)) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### overlapsAny() ### ### Same args and signature as countOverlaps(). setGeneric("overlapsAny", signature=c("query", "subject"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), ...) standardGeneric("overlapsAny") ) .overlapsAny_default <- function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), ...) { type <- match.arg(type) if (missing(subject)) { ahit <- findOverlaps(query, maxgap=maxgap, minoverlap=minoverlap, type=type, select="arbitrary", ...) } else { ahit <- findOverlaps(query, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select="arbitrary", ...) } !is.na(ahit) } setMethod("overlapsAny", c("Vector", "Vector"), .overlapsAny_default) setMethod("overlapsAny", c("Vector", "missing"), .overlapsAny_default) setMethod("overlapsAny", c("RangesList", "RangesList"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), ...) { query <- as.list(query) subject <- as.list(subject) type <- match.arg(type) if (!is.null(names(query)) && !is.null(names(subject))) { subject <- subject[names(query)] names(subject) <- names(query) # get rid of NA's in names } else { subject <- subject[seq_along(query)] } ## NULL's are introduced where they do not match ## We replace those with empty IRanges subject[sapply(subject, is.null)] <- list(IRanges()) LogicalList(lapply(structure(seq_len(length(query)), names = names(query)), function(i) overlapsAny(query[[i]], subject[[i]], maxgap=maxgap, minoverlap=minoverlap, type=type, ...))) } ) setMethod("overlapsAny", c("ViewsList", "ViewsList"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), ...) { overlapsAny(ranges(query), ranges(subject), maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type), ...) } ) setMethod("overlapsAny", c("ViewsList", "Vector"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), ...) { overlapsAny(ranges(query), subject, maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type), ...) } ) setMethod("overlapsAny", c("Vector", "ViewsList"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), ...) { overlapsAny(query, ranges(subject), maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type), ...) } ) .overlapsAny_RangedData_deprecation_msg <- c( "The \"overlapsAny\" methods for RangedData objects are deprecated ", "and won't be replaced. Please migrate your code to use GRanges or ", "GRangesList objects instead. RangedData objects will be deprecated ", "soon (their use has been discouraged since BioC 2.12, that is, since ", "2014). See IMPORTANT NOTE in ?RangedData" ) setMethod("overlapsAny", c("RangedData", "RangedData"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), ...) { .Deprecated(msg=wmsg(.overlapsAny_RangedData_deprecation_msg)) overlapsAny(ranges(query), ranges(subject), maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type), ...) } ) setMethod("overlapsAny", c("RangedData", "RangesList"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), ...) { .Deprecated(msg=wmsg(.overlapsAny_RangedData_deprecation_msg)) overlapsAny(ranges(query), subject, maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type), ...) } ) setMethod("overlapsAny", c("RangesList", "RangedData"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), ...) { .Deprecated(msg=.wmsg(overlapsAny_RangedData_deprecation_msg)) overlapsAny(query, ranges(subject), maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type), ...) } ) ### Convenience wrappers for the 3 most common use cases. `%over%` <- function(query, subject) overlapsAny(query, subject) `%within%` <- function(query, subject) overlapsAny(query, subject, type="within") `%outside%` <- function(query, subject) !overlapsAny(query, subject) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### subsetByOverlaps() ### ### First 2 arguments are 'x' and 'ranges' like for the ### transcriptsByOverlaps(), exonsByOverlaps(), and cdsByOverlaps() functions ### from the GenomicFeatures package and the snpsByOverlaps() function from ### the BSgenome package. setGeneric("subsetByOverlaps", signature=c("x", "ranges"), function(x, ranges, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), invert=FALSE, ...) standardGeneric("subsetByOverlaps") ) setMethod("subsetByOverlaps", c("Vector", "Vector"), function(x, ranges, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), invert=FALSE, ...) { ov_any <- overlapsAny(x, ranges, maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type), ...) if (invert) ov_any <- !ov_any x[ov_any] } ) .subsetByOverlaps_RangedData_deprecation_msg <- c( "The \"subsetByOverlaps\" methods for RangedData objects are deprecated ", "and won't be replaced. Please migrate your code to use GRanges or ", "GRangesList objects instead. RangedData objects will be deprecated ", "soon (their use has been discouraged since BioC 2.12, that is, since ", "2014). See IMPORTANT NOTE in ?RangedData" ) setMethod("subsetByOverlaps", c("RangedData", "RangedData"), function(x, ranges, maxgap = -1L, minoverlap = 0L, type = c("any", "start", "end", "within", "equal"), invert = FALSE) { .Deprecated(msg=wmsg(.subsetByOverlaps_RangedData_deprecation_msg)) ov_any <- unlist(!is.na(findOverlaps(ranges(x), ranges(ranges), maxgap = maxgap, minoverlap = minoverlap, type = match.arg(type), select = "arbitrary")), use.names=FALSE) if (invert) ov_any <- !ov_any x[ov_any] }) setMethod("subsetByOverlaps", c("RangedData", "RangesList"), function(x, ranges, maxgap = -1L, minoverlap = 0L, type = c("any", "start", "end", "within", "equal"), invert = FALSE) { .Deprecated(msg=wmsg(.subsetByOverlaps_RangedData_deprecation_msg)) ov_any <- unlist(!is.na(findOverlaps(ranges(x), ranges, maxgap = maxgap, minoverlap = minoverlap, type = match.arg(type), select = "arbitrary")), use.names=FALSE) if (invert) ov_any <- !ov_any x[ov_any] }) setMethod("subsetByOverlaps", c("RangesList", "RangedData"), function(x, ranges, maxgap = -1L, minoverlap = 0L, type = c("any", "start", "end", "within", "equal"), invert = FALSE) { .Deprecated(msg=wmsg(.subsetByOverlaps_RangedData_deprecation_msg)) ov_any <- !is.na(findOverlaps(x, ranges(ranges), maxgap = maxgap, minoverlap = minoverlap, type = match.arg(type), select = "arbitrary")) if (invert) ov_any <- !ov_any x[ov_any] }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### overlapsRanges() ### ### Extracts the actual regions of intersection between the overlapping ranges. ### setGeneric("overlapsRanges", signature=c("query", "subject"), function(query, subject, hits=NULL, ...) standardGeneric("overlapsRanges") ) setMethod("overlapsRanges", c("Ranges", "Ranges"), function(query, subject, hits=NULL, ...) { if (is.null(hits)) { hits <- findOverlaps(query, subject, ...) } else { if (!is(hits, "Hits")) stop("'hits' must be a Hits object") if (length(list(...)) != 0L) stop(wmsg("Extra arguments are only accepted when the 'hits' ", "argument is not supplied, in which case they are ", "passed to the internal call to findOverlaps(). ", "See ?overlapsRanges for more information.")) if (queryLength(hits) != length(query) || subjectLength(hits) != length(subject)) stop("'hits' is not compatible with 'query' and 'subject'") } ### Could be replaced by 1-liner: ### pintersect(query[queryHits(hits)], subject[subjectHits(hits)]) ### but will fail if 'query' or 'subject' is a kind of Ranges object ### that cannot be subsetted (e.g. Partitioning object). m <- as.matrix(hits) qstart <- start(query)[m[,1L]] qend <- end(query)[m[,1L]] sstart <- start(subject)[m[,2L]] send <- end(subject)[m[,2L]] IRanges(pmax.int(qstart, sstart), pmin.int(send, qend)) } ) setMethod("overlapsRanges", c("RangesList", "RangesList"), function(query, subject, hits=NULL, ...) { if (is.null(hits)) { hits <- findOverlaps(query, subject, ...) } else { if (!is(hits, "HitsList")) stop("'hits' must be a HitsList object") if (length(list(...)) != 0L) stop(wmsg("Extra arguments are only accepted when the 'hits' ", "argument is not supplied, in which case they are ", "passed to the internal call to findOverlaps(). ", "See ?overlapsRanges for more information.")) if (length(hits) != length(query) || length(hits) != length(subject)) stop("'query', 'subject', and 'hits' must have the same length") } queries <- as.list(query, use.names = FALSE) subjects <- as.list(subject, use.names = FALSE) els <- as.list(hits, use.names = FALSE) ans <- do.call(RangesList, lapply(seq_len(length(hits)), function(i) { overlapsRanges(queries[[i]], subjects[[i]], els[[i]]) })) names(ans) <- names(hits) ans } ) setMethod("ranges", "Hits", function(x, use.names=TRUE, use.mcols=FALSE, query, subject) { msg <- c("\"ranges\" method for Hits objects is deprecated. ", "Please use overlapsRanges() instead.") .Deprecated(msg=wmsg(msg)) query_is_missing <- missing(query) subject_is_missing <- missing(subject) if (query_is_missing || subject_is_missing) { query <- if (subject_is_missing) use.names else use.mcols subject <- if (query_is_missing) use.mcols else use.names } overlapsRanges(query, subject, x) }) setMethod("ranges", "HitsList", function(x, use.names=TRUE, use.mcols=FALSE, query, subject) { msg <- c("\"ranges\" method for HitsList objects is deprecated. ", "Please use overlapsRanges() instead.") .Deprecated(msg=wmsg(msg)) query_is_missing <- missing(query) subject_is_missing <- missing(subject) if (query_is_missing || subject_is_missing) { query <- if (subject_is_missing) use.names else use.mcols subject <- if (query_is_missing) use.mcols else use.names } overlapsRanges(query, subject, x) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### poverlaps() ### setGeneric("poverlaps", signature=c("query", "subject"), function(query, subject, maxgap=0L, minoverlap=1L, type=c("any", "start", "end", "within", "equal"), ...) standardGeneric("poverlaps") ) setMethod("poverlaps", c("Ranges", "Ranges"), function(query, subject, maxgap=0L, minoverlap=1L, type=c("any", "start", "end", "within", "equal")) { stopifnot(isSingleNumber(maxgap)) stopifnot(isSingleNumber(minoverlap)) type <- match.arg(type) if (type == "any") { query <- query + maxgap } else if (type == "within") { if (maxgap > 0L) { warning("'maxgap' is ignored when type=='within'") } return(start(query) >= start(subject) & end(query) <= end(subject) & width(query) >= minoverlap) } amount <- pmin(end(query), end(subject)) - pmax(start(query), start(subject)) + 1L overlaps <- amount >= minoverlap samePos <- function(x, y) { x <= (y + maxgap) & x >= (y - maxgap) } keep <- switch(type, any = TRUE, start = samePos(start(query), start(subject)), end = samePos(end(query), end(subject)), equal = samePos(start(query), start(subject)) & samePos(end(query), end(subject))) overlaps & keep } ) setMethod("poverlaps", c("integer", "Ranges"), function(query, subject, maxgap=0L, minoverlap=1L, type=c("any", "start", "end", "within", "equal")) { poverlaps(IRanges(query, width=1L), subject, maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type)) }) setMethod("poverlaps", c("Ranges", "integer"), function(query, subject, maxgap=0L, minoverlap=1L, type=c("any", "start", "end", "within", "equal")) { poverlaps(query, IRanges(subject, width=1L), maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type)) }) ### Convenience operators for poverlaps() `%pover%` <- function(query, subject) poverlaps(query, subject) `%pwithin%` <- function(query, subject) poverlaps(query, subject, type="within") `%poutside%` <- function(query, subject) !poverlaps(query, subject) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Merge two sets of ranges by overlap into a DataFrame ### mergeByOverlaps <- function(query, subject, ...) { hits <- findOverlaps(query, subject, ...) query_df <- as(extractROWS(query, queryHits(hits)), "DataFrame") colnames(query_df)[1L] <- deparse(substitute(query)) subject_df <- as(extractROWS(subject, subjectHits(hits)), "DataFrame") colnames(subject_df)[1L] <- deparse(substitute(subject)) cbind(query_df, subject_df) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Convenience for dereferencing overlap hits to a Pairs ### findOverlapPairs <- function(query, subject, ...) { hits <- findOverlaps(query, subject, ...) if (missing(subject)) { subject <- query } Pairs(query, subject, hits=hits) } IRanges/R/inter-range-methods.R0000644000175400017540000004636613175713360017355 0ustar00biocbuildbiocbuild### ========================================================================= ### Inter-range methods ### ------------------------------------------------------------------------- ### ### TODO: We need a ranges() setter for Views objects that provides this ### functionality. Once we have it, use it instead of this. .set_Views_ranges <- function(x, new_ranges) { new_mcols <- mcols(new_ranges) mcols(new_ranges) <- NULL BiocGenerics:::replaceSlots(x, ranges=new_ranges, elementMetadata=new_mcols, check=FALSE) } ### NOT exported but used in the GenomicRanges package global2local_revmap <- function(unlisted_revmap, y, x) { offsets <- rep.int(start(PartitioningByEnd(x)) - 1L, elementNROWS(y)) unlisted_revmap - offsets } ### NOT exported but used in the GenomicFeatures package local2global_revmap <- function(unlisted_revmap, y, x) { offsets <- rep.int(start(PartitioningByEnd(x)) - 1L, elementNROWS(y)) unlisted_revmap + offsets } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### range() ### ### Always return an IRanges (or NormalIRanges) *instance* whatever Ranges ### derivative the input is, so does NOT act like an endomorphism in general. setMethod("range", "Ranges", function(x, ..., with.revmap=FALSE, na.rm=FALSE) { if (!isTRUEorFALSE(with.revmap)) stop("'with.revmap' must be TRUE or FALSE") if (!identical(na.rm, FALSE)) warning("\"range\" method for Ranges objects ", "ignores the 'na.rm' argument") args <- unname(list(x, ...)) ## TODO: Replace line below with ## args <- lapply(args, ranges) ## when ranges() works on Ranges objects. args <- lapply(args, function(arg) IRanges(start(arg), width=width(arg))) ir <- do.call(c, args) ans <- .Call2("IRanges_range", ir, PACKAGE="IRanges") if (is(x, "NormalIRanges")) ans <- as(ans, "NormalIRanges") if (with.revmap){ mcols(ans) <- DataFrame(revmap=IntegerList(seq_along(ir))) } ans } ) ### Overwrite above method with optimized method for IPos objects. ### Like the above method, return an IRanges instance. setMethod("range", "IPos", function(x, ..., with.revmap=FALSE, ignore.strand=FALSE, na.rm=FALSE) callGeneric(stitch_IPos(x), ..., with.revmap=with.revmap, ignore.strand=ignore.strand, na.rm=na.rm) ) setMethod("range", "RangesList", function(x, ..., with.revmap=FALSE, na.rm=FALSE) { if (length(list(x, ...)) >= 2L) x <- merge(x, ...) endoapply(x, range, with.revmap=with.revmap) } ) ### Equivalent to, but much faster than, 'endoapply(x, range)'. .range_CompressedIRangesList <- function(x, with.revmap=FALSE) { ## 'x_start' and 'x_end' are CompressedIntegerList objects with the ## same shape as 'x'. x_start <- start(x) x_end <- end(x) ## TEMPORARY HACK! if (!requireNamespace("XVector", quietly=TRUE)) stop("the XVector package is required by the \"range\" method ", "for CompressedIRangesList objects") ## 'sv' and 'ev' are XIntegerViews objects (see XVector package). sv <- Views(x_start@unlistData, x_start@partitioning) ev <- Views(x_end@unlistData, x_end@partitioning) is_not_empty_view <- width(sv) != 0L # same as 'width(ev) != 0L' unlisted_ans <- IRanges(viewMins(sv)[is_not_empty_view], viewMaxs(ev)[is_not_empty_view]) ans_partitioning <- PartitioningByEnd(cumsum(is_not_empty_view)) if (with.revmap) { x_partitioning <- unname(PartitioningByEnd(x)) global_revmap <- relist(seq_along(unlist(x, use.names=FALSE)), x_partitioning[width(x_partitioning) != 0L]) local_revmap <- global2local_revmap(global_revmap, ans_partitioning, x) mcols(unlisted_ans)$revmap <- local_revmap } ans <- relist(unlisted_ans, ans_partitioning) names(ans) <- names(x) ans } setMethod("range", "CompressedIRangesList", function(x, ..., with.revmap=FALSE, na.rm=FALSE) { if (!isTRUEorFALSE(with.revmap)) stop("'with.revmap' must be TRUE or FALSE") if (length(list(x, ...)) >= 2L) x <- merge(x, ...) .range_CompressedIRangesList(x, with.revmap=with.revmap) } ) .range_RangedData_deprecation_msg <- c( "The \"range\" method for RangedData objects is deprecated ", "and won't be replaced. Please migrate your code to use GRanges or ", "GRangesList objects instead. RangedData objects will be deprecated ", "soon (their use has been discouraged since BioC 2.12, that is, since ", "2014). See IMPORTANT NOTE in ?RangedData" ) setMethod("range", "RangedData", function(x, ..., na.rm) { .Deprecated(msg=wmsg(.range_RangedData_deprecation_msg)) args <- list(x, ...) rangeLists <- lapply(args, ranges) do.call(range, rangeLists) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### reduce() ### setGeneric("reduce", signature="x", function(x, drop.empty.ranges=FALSE, ...) standardGeneric("reduce") ) ### Always return an IRanges (or NormalIRanges) *instance* whatever Ranges ### derivative the input is, so does NOT act like an endomorphism in general. setMethod("reduce", "Ranges", function(x, drop.empty.ranges=FALSE, min.gapwidth=1L, with.revmap=FALSE, with.inframe.attrib=FALSE) { if (!isTRUEorFALSE(drop.empty.ranges)) stop("'drop.empty.ranges' must be TRUE or FALSE") if (!isSingleNumber(min.gapwidth)) stop("'min.gapwidth' must be a single integer") if (!is.integer(min.gapwidth)) min.gapwidth <- as.integer(min.gapwidth) if (min.gapwidth < 0L) stop("'min.gapwidth' must be non-negative") if (!isTRUEorFALSE(with.revmap)) stop("'with.revmap' must be TRUE or FALSE") if (!isTRUEorFALSE(with.inframe.attrib)) stop("'with.inframe.attrib' must be TRUE or FALSE") C_ans <- .Call2("Ranges_reduce", start(x), width(x), drop.empty.ranges, min.gapwidth, with.revmap, with.inframe.attrib, PACKAGE="IRanges") ans <- new2("IRanges", start=C_ans$start, width=C_ans$width, check=FALSE) if (is(x, "NormalIRanges")) ans <- as(ans, "NormalIRanges") if (with.revmap) { mcols(ans) <- DataFrame(revmap=IntegerList(C_ans$revmap)) } if (with.inframe.attrib) { inframe <- new2("IRanges", start=C_ans$inframe.start, width=width(x), check=FALSE) attr(ans, "inframe") <- inframe } ans } ) setMethod("reduce", "Views", function(x, drop.empty.ranges=FALSE, min.gapwidth=1L, with.revmap=FALSE, with.inframe.attrib=FALSE) { new_ranges <- callGeneric(x@ranges, drop.empty.ranges=drop.empty.ranges, min.gapwidth=min.gapwidth, with.revmap=with.revmap, with.inframe.attrib=with.inframe.attrib) .set_Views_ranges(x, new_ranges) } ) setMethod("reduce", "RangesList", function(x, drop.empty.ranges=FALSE, min.gapwidth=1L, with.revmap=FALSE, with.inframe.attrib=FALSE) { endoapply(x, reduce, drop.empty.ranges = drop.empty.ranges, min.gapwidth = min.gapwidth, with.revmap=with.revmap, with.inframe.attrib = with.inframe.attrib) } ) ### 'with.inframe.attrib' is ignored for now. ### TODO: Support 'with.inframe.attrib=TRUE'. .reduce_CompressedIRangesList <- function(x, drop.empty.ranges=FALSE, min.gapwidth=1L, with.revmap=FALSE, with.inframe.attrib=FALSE) { if (!isTRUEorFALSE(drop.empty.ranges)) stop("'drop.empty.ranges' must be TRUE or FALSE") if (!isSingleNumber(min.gapwidth)) stop("'min.gapwidth' must be a single integer") if (!is.integer(min.gapwidth)) min.gapwidth <- as.integer(min.gapwidth) if (min.gapwidth < 0L) stop("'min.gapwidth' must be non-negative") if (!isTRUEorFALSE(with.revmap)) stop("'with.revmap' must be TRUE or FALSE") if (!identical(with.inframe.attrib, FALSE)) stop("'with.inframe.attrib' argument not yet supported ", "when reducing a CompressedIRangesList object") C_ans <- .Call2("CompressedIRangesList_reduce", x, drop.empty.ranges, min.gapwidth, with.revmap, PACKAGE="IRanges") unlisted_ans <- new2("IRanges", start=C_ans$start, width=C_ans$width, check=FALSE) if (with.revmap) mcols(unlisted_ans) <- DataFrame(revmap=IntegerList(C_ans$revmap)) ans_partitioning <- PartitioningByEnd(C_ans$breakpoints) names(ans_partitioning) <- names(x) relist(unlisted_ans, ans_partitioning) } setMethod("reduce", "CompressedIRangesList", .reduce_CompressedIRangesList) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### gaps() ### setGeneric("gaps", signature="x", function(x, start=NA, end=NA) standardGeneric("gaps") ) ### Always return an IRanges (or NormalIRanges) *instance* whatever Ranges ### derivative the input is, so does NOT act like an endomorphism in general. setMethod("gaps", "Ranges", function(x, start=NA, end=NA) { start <- S4Vectors:::normargSingleStartOrNA(start) end <- S4Vectors:::normargSingleEndOrNA(end) C_ans <- .Call2("IRanges_gaps", start(x), width(x), start, end, PACKAGE="IRanges") ans <- new2("IRanges", start=C_ans$start, width=C_ans$width, check=FALSE) if (is(x, "NormalIRanges")) ans <- as(ans, "NormalIRanges") ans } ) setMethod("gaps", "Views", function(x, start=NA, end=NA) { if (!isSingleNumberOrNA(start)) stop("'start' must be a single integer") if (!is.integer(start)) start <- as.integer(start) if (!isSingleNumberOrNA(end)) stop("'end' must be a single integer") if (!is.integer(end)) end <- as.integer(end) if (is.na(start)) start <- 1L if (is.na(end)) end <- length(subject(x)) new_ranges <- gaps(x@ranges, start=start, end=end) .set_Views_ranges(x, new_ranges) } ) .gaps_RangesList <- function(x, start=NA, end=NA) { x_len <- length(x) if (!S4Vectors:::isNumericOrNAs(start)) stop("'start' must be an integer vector or NA") if (!is.integer(start)) start <- as.integer(start) if (!S4Vectors:::isNumericOrNAs(end)) stop("'end' must be an integer vector or NA") if (!is.integer(end)) end <- as.integer(end) start <- IntegerList(as.list(S4Vectors:::recycleVector(start, x_len))) end <- IntegerList(as.list(S4Vectors:::recycleVector(end, x_len))) mendoapply(gaps, x, start = start, end = end) } setMethod("gaps", "RangesList", .gaps_RangesList) .gaps_CompressedIRangesList <- function(x, start=NA, end=NA) { ## Normalize 'start'. if (!S4Vectors:::isNumericOrNAs(start)) stop("'start' must be an integer vector or NA") if (!is.integer(start)) start <- as.integer(start) if (length(start) != 1L) start <- S4Vectors:::V_recycle(start, x, x_what="start", skeleton_what="x") ## Normalize 'end'. if (!S4Vectors:::isNumericOrNAs(end)) stop("'end' must be an integer vector or NA") if (!is.integer(end)) end <- as.integer(end) if (length(end) != 1L) end <- S4Vectors:::V_recycle(end, x, x_what="end", skeleton_what="x") chunksize <- 10000000L if (length(x) <= chunksize) { ## Process all at once. ans <- .Call2("CompressedIRangesList_gaps", x, start, end, PACKAGE="IRanges") return(ans) } ## Process by chunk. verbose <- getOption("verbose", default=FALSE) chunks <- as(breakInChunks(length(x), chunksize), "IRanges") ans_chunks <- lapply(seq_along(chunks), function(i) { if (verbose) cat("Processing chunk #", i, "/", length(chunks), " ... ", sep="") chunk <- chunks[i] x_chunk <- extractROWS(x, chunk) start_chunk <- if (length(start) == 1L) start else extractROWS(start, chunk) end_chunk <- if (length(end) == 1L) end else extractROWS(end, chunk) ans_chunk <- .gaps_CompressedIRangesList(x_chunk, start=start_chunk, end=end_chunk) if (verbose) cat("OK\n") ans_chunk }) do.call(c, ans_chunks) } setMethod("gaps", "CompressedIRangesList", .gaps_CompressedIRangesList) ### 'start' and 'end' are ignored. setMethod("gaps", "MaskCollection", function(x, start=NA, end=NA) { start <- 1L end <- width(x) x@nir_list <- lapply(nir_list(x), function(nir) gaps(nir, start=start, end=end) ) x@NAMES <- as.character(NA) x@desc <- as.character(NA) x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### disjoin() ### setGeneric("disjoin", function(x, ...) standardGeneric("disjoin")) ### Always return an IRanges *instance* whatever Ranges derivative the input ### is, so does NOT act like an endomorphism in general. ### FIXME: Does not properly handle zero-width ranges at the moment e.g. ### disjoin(IRanges(c(1, 11, 13), width=c(2, 5, 0)) returns ### IRanges(c(1, 11, 13), width=c(2, 2, 3)) when it should return ### IRanges(c(1, 11, 13, 13), width=c(2, 2, 0, 3)). setMethod("disjoin", "Ranges", function(x, with.revmap=FALSE) { if (!isTRUEorFALSE(with.revmap)) stop("'with.revmap' must be TRUE or FALSE") ## starts: original starts and end+1 when inside another interval ## ends: original ends and start-1 when inside another interval starts <- unique(start(x)) ends <- unique(end(x)) adj_start <- head(sort(unique(c(starts, ends + 1L))), -1L) adj_end <- tail(sort(unique(c(ends, starts - 1L))), -1L) adj_width <- adj_end - adj_start + 1L adj <- new2("IRanges", start=adj_start, width=adj_width, check=FALSE) adj <- subsetByOverlaps(adj, x, minoverlap=1L) if (with.revmap) mcols(adj)$revmap <- as(sort(findOverlaps(adj, x)),"List") adj } ) ### Basically a no-op but returns a NormalIRanges *instance* for consistency ### with how the other inter-range transformations (range(), reduce(), gaps()) ### behave on a NormalIRanges object. setMethod("disjoin", "NormalIRanges", function(x) as(x, "NormalIRanges")) setMethod("disjoin", "RangesList", function(x, with.revmap=FALSE) endoapply(x, disjoin, with.revmap=with.revmap) ) setMethod("disjoin", "CompressedIRangesList", function(x, with.revmap=FALSE, ...) { if (!isTRUEorFALSE(with.revmap)) stop("'with.revmap' must be TRUE or FALSE") .wunlist <- function(x) ## unlist CompressedIntegerList, with integer(0) as 0 { w <- integer(length(x)) w[elementNROWS(x) != 0L] <- unlist(x, use.names=FALSE) w } rng <- range(x) if (sum(.wunlist(width(rng) + 1)) > .Machine$integer.max) return(endoapply(x, disjoin, with.revmap=with.revmap, ...)) ## localize coordinates off0 <- head(.wunlist(width(rng) + 1L), -1L) offset <- c(1L, cumsum(off0)) - .wunlist(start(rng)) local <- unlist(shift(x, offset), use.names=FALSE) ## disjoin d <- disjoin(local, with.revmap=with.revmap, ...) vec <- unlist(start(shift(rng, offset)), use.names=FALSE) lvls <- factor(seq_along(x)) lvls0 <- lvls[elementNROWS(rng) != 0] f <- lvls0[findInterval(start(d), vec)] ans <- split(d, f) ## globalize coordinates ans <- shift(ans, -offset) ## localize 'revmap' if (with.revmap) { unlisted_ans <- unlist(ans, use.names=FALSE) global_revmap <- mcols(unlisted_ans)$revmap local_revmap <- global2local_revmap(global_revmap, ans, x) mcols(unlisted_ans)$revmap <- local_revmap ans <- relist(unlisted_ans, ans) } names(ans) <- names(x) ans }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### isDisjoint() ### setGeneric("isDisjoint", function(x, ...) standardGeneric("isDisjoint")) setMethod("isDisjoint", "Ranges", function(x) { x_len <- length(x) if (x_len < 2L) return(TRUE) x_start <- start(x) x_end <- end(x) oo <- order(x) start2 <- x_start[oo] end2 <- x_end[oo] all(start2[-1L] > end2[-x_len]) } ) ### Overwrite above method with optimized method for IPos objects. setMethod("isDisjoint", "IPos", function(x) callGeneric(stitch_IPos(x))) setMethod("isDisjoint", "NormalIRanges", function(x) TRUE) setMethod("isDisjoint", "RangesList", function(x) vapply(x, isDisjoint, logical(1)) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### disjointBins() ### ## make intervals disjoint by segregating them into separate Ranges setGeneric("disjointBins", function(x, ...) standardGeneric("disjointBins")) setMethod("disjointBins", "Ranges", function(x) { x_ord <- NULL if (S4Vectors:::isNotSorted(start(x))) { # minimize work for sorted ranges (common) x_ord <- order(x) x <- x[x_ord] } bins <- .Call2("Ranges_disjointBins", start(x), width(x), PACKAGE="IRanges") if (!is.null(x_ord)) { rev_ord <- integer(length(x_ord)) rev_ord[x_ord] <- seq_along(rev_ord) bins <- bins[rev_ord] } names(bins) <- names(x) bins } ) ### Overwrite above method with trivial method for NormalIRanges objects. setMethod("disjointBins", "NormalIRanges", function(x) rep.int(1L, length(x))) setMethod("disjointBins", "RangesList", function(x) as(lapply(x, disjointBins), "IntegerList") ) IRanges/R/intra-range-methods.R0000644000175400017540000005751213175713360017344 0ustar00biocbuildbiocbuild### ========================================================================= ### Intra-range methods ### ------------------------------------------------------------------------- ### normargAtomicList1 <- function(arg, List, lx, argname = deparse(substitute(arg))) { if (is.vector(arg)) arg <- List(as.list(S4Vectors:::recycleVector(arg, lx))) else if (!is(arg, "AtomicList")) stop("'", argname,"' must be a vector or AtomicList object") arg } normargAtomicList2 <- function(arg, List, lx, x_eltNROWS, argname = deparse(substitute(arg))) { if (!(is.vector(arg) && length(arg) == 1L)) { if (is.vector(arg)) arg <- as(rep(S4Vectors:::recycleVector(arg, lx), x_eltNROWS), class(unlist(List()))) else { if (!is(arg, "AtomicList")) stop("'arg' must be a vector or AtomicList object") if (!isTRUE(all.equal(elementNROWS(arg), x_eltNROWS, check.attributes=FALSE))) arg <- mapply(S4Vectors:::recycleVector, arg, List(as.list(x_eltNROWS))) arg <- unlist(arg, use.names=FALSE) } } else if (is.list(arg)){ arg <- unlist(arg, use.names=FALSE) } arg } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### shift() ### setGeneric("shift", signature="x", function(x, shift=0L, use.names=TRUE) standardGeneric("shift") ) setMethod("shift", "Ranges", function(x, shift=0L, use.names=TRUE) { if (is(x, "NormalIRanges") && !isSingleNumber(shift)) stop("'shift' must be a single number when shifting ", "a NormalIRanges object") shift <- recycleIntegerArg(shift, "shift", length(x)) new_start <- start(x) + shift if (is(x, "IRanges")) { x@start <- new_start } else { x <- update(x, start=new_start, width=width(x), check=FALSE) } if (!S4Vectors:::normargUseNames(use.names)) names(x) <- NULL validObject(x) x } ) ### Overwrite above method with optimized method for IPos objects. ### An IPos object cannot hold names so the 'use.names' arg has no effect. ### NOTE: We only support shifting by a single value at the moment! setMethod("shift", "IPos", function(x, shift=0L, use.names=TRUE) { if (!is.numeric(shift)) stop("'shift' must be a numeric vector") if (!is.integer(shift)) shift <- as.integer(shift) if (length(shift) != 1L) { if (length(shift) != length(x)) stop("'shift' must be a single number or have the ", "length of 'x' when shifting an IPos object") if (length(shift) != 0L) { if (!isConstant(shift)) stop("'shift' must be constant when shifting ", "an IPos object") shift <- shift[[1L]] } } x@pos_runs <- callGeneric(x@pos_runs, shift=shift) x } ) setMethod("shift", "Views", function(x, shift=0L, use.names=TRUE) { x@ranges <- shift(ranges(x), shift=shift, use.names=use.names) x } ) setMethod("shift", "RangesList", function(x, shift=0L, use.names = TRUE) { lx <- length(x) shift <- normargAtomicList1(shift, IntegerList, lx) mendoapply("shift", x = x, shift = shift, MoreArgs = list(use.names = use.names)) }) setMethod("shift", "CompressedIRangesList", function(x, shift=0L, use.names = TRUE) { lx <- length(x) x_eltNROWS <- elementNROWS(x) shift <- normargAtomicList2(shift, IntegerList, lx, x_eltNROWS) slot(x, "unlistData", check=FALSE) <- shift(x@unlistData, shift = shift, use.names = use.names) x }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### narrow() ### setGeneric("narrow", signature="x", function(x, start=NA, end=NA, width=NA, use.names=TRUE) standardGeneric("narrow") ) setMethod("narrow", "Ranges", function(x, start=NA, end=NA, width=NA, use.names=TRUE) { if (is(x, "NormalIRanges")) stop("narrowing a NormalIRanges object is not supported") solved_SEW <- solveUserSEW(width(x), start=start, end=end, width=width) ans_start <- start(x) + start(solved_SEW) - 1L ans_width <- width(solved_SEW) x <- update(x, start=ans_start, width=ans_width, check=FALSE) if (!S4Vectors:::normargUseNames(use.names)) names(x) <- NULL x } ) setMethod("narrow", "Views", function(x, start=NA, end=NA, width=NA, use.names=TRUE) { x@ranges <- narrow(ranges(x), start=start, end=end, width=width, use.names=use.names) x } ) setMethod("narrow", "RangesList", function(x, start = NA, end = NA, width = NA, use.names = TRUE) { lx <- length(x) start <- normargAtomicList1(start, IntegerList, lx) end <- normargAtomicList1(end, IntegerList, lx) width <- normargAtomicList1(width, IntegerList, lx) mendoapply(narrow, x = x, start = start, end = end, width = width, MoreArgs = list(use.names = use.names)) }) setMethod("narrow", "CompressedIRangesList", function(x, start = NA, end = NA, width = NA, use.names = TRUE) { lx <- length(x) x_eltNROWS <- elementNROWS(x) start <- normargAtomicList2(start, IntegerList, lx, x_eltNROWS) end <- normargAtomicList2(end, IntegerList, lx, x_eltNROWS) width <- normargAtomicList2(width, IntegerList, lx, x_eltNROWS) slot(x, "unlistData", check=FALSE) <- narrow(x@unlistData, start = start, end = end, width = width, use.names = use.names) x }) setMethod("narrow", "MaskCollection", function(x, start=NA, end=NA, width=NA, use.names=TRUE) { solved_SEW <- solveUserSEWForSingleSeq(width(x), start, end, width) solved_start <- start(solved_SEW) solved_end <- end(solved_SEW) solved_width <- width(solved_SEW) x@nir_list <- lapply(nir_list(x), function(nir) shift(restrict(nir, start=solved_start, end=solved_end), 1L - solved_start) ) x@width <- solved_width if (!S4Vectors:::normargUseNames(use.names)) names(x) <- NULL x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### resize() ### setGeneric("resize", signature="x", function(x, width, fix="start", use.names=TRUE, ...) standardGeneric("resize") ) setMethod("resize", "Ranges", function(x, width, fix="start", use.names=TRUE) { if (is(x, "NormalIRanges")) stop("resizing a NormalIRanges object is not supported") lx <- length(x) if (!is.numeric(width) || S4Vectors:::anyMissing(width)) stop("'width' must be a numeric vector without NA's") if (!is.integer(width)) width <- as.integer(width) if (S4Vectors:::anyMissingOrOutside(width, 0L)) stop("'width' values must be non-negative") if (!(is.character(fix) || (is(fix, "Rle") && is.character(runValue(fix)))) || (length(fix) == 0L && length(x) > 0L) || (length(setdiff(unique(fix), c("start", "end", "center"))) > 0)) { stop("'fix' must be a character vector or character Rle ", "with values in \"start\", \"end\", and \"center\"") } if (!is(fix, "Rle")) fix <- Rle(fix) if (length(fix) != lx) fix <- rep(fix, length.out = lx) ans_width <- S4Vectors:::recycleVector(width, lx) ans_start <- start(x) if (!identical(runValue(fix), "start")) { fixEnd <- as(fix == "end", "IRanges") if (length(fixEnd) > 0) { value <- extractROWS(ans_start, fixEnd) + (extractROWS(width(x), fixEnd) - extractROWS(ans_width, fixEnd)) ans_start <- replaceROWS(ans_start, fixEnd, value) } fixCenter <- as(fix == "center", "IRanges") if (length(fixCenter) > 0) { value <- extractROWS(ans_start, fixCenter) + (extractROWS(width(x), fixCenter) - extractROWS(ans_width, fixCenter)) %/% 2L ans_start <- replaceROWS(ans_start, fixCenter, value) } } x <- update(x, start=ans_start, width=ans_width, check=FALSE) if (!S4Vectors:::normargUseNames(use.names)) names(x) <- NULL x } ) setMethod("resize", "RangesList", function(x, width, fix = "start", use.names = TRUE) { lx <- length(x) width <- normargAtomicList1(width, IntegerList, lx) fix <- normargAtomicList1(fix, CharacterList, lx) mendoapply(resize, x = x, width = width, fix = fix, MoreArgs = list(use.names = use.names)) }) setMethod("resize", "CompressedIRangesList", function(x, width, fix = "start", use.names = TRUE) { lx <- length(x) x_eltNROWS <- elementNROWS(x) width <- normargAtomicList2(width, IntegerList, lx, x_eltNROWS) fix <- normargAtomicList2(fix, CharacterList, lx, x_eltNROWS) slot(x, "unlistData", check=FALSE) <- resize(x@unlistData, width = width, fix = fix, use.names = use.names) x }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### flank() ### setGeneric("flank", signature="x", function(x, width, start=TRUE, both=FALSE, use.names=TRUE, ...) standardGeneric("flank") ) setMethod("flank", "Ranges", function(x, width, start=TRUE, both=FALSE, use.names=TRUE) { if (is(x, "NormalIRanges")) stop("flanking a NormalIRanges object is not supported") width <- recycleIntegerArg(width, "width", length(x)) if (!is.logical(start) || S4Vectors:::anyMissing(start)) stop("'start' must be logical without NA's") start <- S4Vectors:::recycleVector(unname(start), length(x)) if (!isTRUEorFALSE(both)) stop("'both' must be TRUE or FALSE") ans_start <- integer(length(x)) if (both) { idx1 <- which(start) idx2 <- which(!start) width <- abs(width) ans_width <- 2L * width ans_start[idx1] <- start(x)[idx1] - width[idx1] ans_start[idx2] <- end(x)[idx2] - width[idx2] + 1L } else { idx1a <- which(start & width >= 0L) idx1b <- which(start & width < 0L) idx2a <- which(!start & width >= 0L) idx2b <- which(!start & width < 0L) ans_width <- abs(width) ans_start[idx1a] <- start(x)[idx1a] - width[idx1a] ans_start[idx1b] <- start(x)[idx1b] ans_start[idx2a] <- end(x)[idx2a] + 1L ans_start[idx2b] <- end(x)[idx2b] + width[idx2b] + 1L } x <- update(x, start=ans_start, width=ans_width, check=FALSE) if (!S4Vectors:::normargUseNames(use.names)) names(x) <- NULL x } ) setMethod("flank", "RangesList", function(x, width, start = TRUE, both = FALSE, use.names = TRUE) { lx <- length(x) width <- normargAtomicList1(width, IntegerList, lx) start <- normargAtomicList1(start, LogicalList, lx) mendoapply(flank, x = x, width = width, start = start, MoreArgs = list(both = both, use.names = use.names)) }) setMethod("flank", "CompressedIRangesList", function(x, width, start = TRUE, both = FALSE, use.names = TRUE) { lx <- length(x) x_eltNROWS <- elementNROWS(x) width <- normargAtomicList2(width, IntegerList, lx, x_eltNROWS) start <- normargAtomicList2(start, LogicalList, lx, x_eltNROWS) slot(x, "unlistData", check=FALSE) <- flank(x@unlistData, width = width, start = start, both = both, use.names = use.names) x }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### promoters() ### setGeneric("promoters", signature="x", function(x, upstream=2000, downstream=200, ...) standardGeneric("promoters") ) setMethod("promoters", "Ranges", function(x, upstream=2000, downstream=200, ...) { if (is(x, "NormalIRanges")) stop("promoters on a NormalIRanges object is not supported") if (!isSingleNumber(upstream)) stop("'upstream' must be a single integer") if (!is.integer(upstream)) upstream <- as.numeric(upstream) if (!isSingleNumber(downstream)) stop("'downstream' must be a single integer") if (!is.integer(downstream)) downstream <- as.numeric(downstream) if (upstream < 0 | downstream < 0) stop("'upstream' and 'downstream' must be integers >= 0") st <- start(x) start(x) <- st - upstream end(x) <- st + downstream - 1L x } ) setMethod("promoters", "Views", function(x, upstream=2000, downstream=200, ...) { x@ranges <- promoters(ranges(x), upstream, downstream) x } ) setMethod("promoters", "RangesList", function(x, upstream=2000, downstream=200, ...) { endoapply(x, promoters, upstream = upstream, downstream = downstream) } ) setMethod("promoters", "CompressedIRangesList", function(x, upstream=2000, downstream=200, ...) { slot(x, "unlistData", check=FALSE) <- promoters(x@unlistData, upstream = upstream, downstream = downstream, ...) x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### reflect() ### setGeneric("reflect", signature="x", function(x, bounds, use.names=TRUE) standardGeneric("reflect") ) setMethod("reflect", "Ranges", function(x, bounds, use.names=TRUE) { if (is(x, "NormalIRanges")) stop("reflecting a NormalIRanges object is not supported") if (!is(bounds, "Ranges")) stop("'bounds' must be a Ranges object") if (length(x) > 1 && length(bounds) == 0) stop("'bounds' is an empty Ranges object") if (length(x) < length(bounds)) bounds <- head(bounds, length(x)) ans_start <- (2L * start(bounds) + width(bounds) - 1L) - end(x) x <- update(x, start=ans_start, width=width(x), check=FALSE) if (!S4Vectors:::normargUseNames(use.names)) names(x) <- NULL x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### restrict() ### setGeneric("restrict", signature="x", function(x, start=NA, end=NA, keep.all.ranges=FALSE, use.names=TRUE) standardGeneric("restrict") ) ### We distinguish 4 exclusive types of relationship between a range in 'x' ### and its corresponding restriction interval (specified by 'start' and ### 'end'): ### - Type 1: They have a non-empty intersection. ### - Type 2: The restriction interval is empty but its bounds are within ### the range in 'x'. ### - Type 3: The restriction interval is not empty and is adjacent to the ### range in 'x' i.e. the range in 'x' ends at start - 1 or starts ### at end - 1. ### ### drop.ranges.mode: ### 0L: Ranges in 'x' that are empty after restriction are dropped. ### 1L: Ranges in 'x' that are not overlapping and not even adjacent ### with the region of restriction are dropped. ### "Not overlapping and not adjacent" means that they end strictly ### before start - 1 or start strictly after end + 1. ### Those that are not overlapping but are however adjacent are kept ### (and are empty after restriction). ### 2L: All ranges in 'x' are kept after restriction. ### Ranges ### Note that the only mode compatible with a NormalIRanges object is 0L. Ranges.restrict <- function(x, start, end, drop.ranges.mode, use.names) { if (!S4Vectors:::isNumericOrNAs(start)) stop("'start' must be a vector of integers") if (!is.integer(start)) start <- as.integer(start) if (!S4Vectors:::isNumericOrNAs(end)) stop("'end' must be a vector of integers") if (!is.integer(end)) end <- as.integer(end) if (length(x) != 0L) { if (length(start) == 0L || length(start) > length(x)) stop("invalid 'start' length") if (length(end) == 0L || length(end) > length(x)) stop("invalid 'end' length") } start <- S4Vectors:::recycleVector(start, length(x)) end <- S4Vectors:::recycleVector(end, length(x)) use.names <- S4Vectors:::normargUseNames(use.names) ans_start <- start(x) ans_end <- end(x) if (use.names) ans_names <- names(x) else ans_names <- NULL ans_mcols <- mcols(x) ## Compare ranges in 'x' with 'start'. if (drop.ranges.mode == 0L) far_too_left <- !is.na(start) & (ans_end < start) else far_too_left <- !is.na(start) & (ans_end < start - 1L) if (drop.ranges.mode == 2L) { ans_end[far_too_left] <- start[far_too_left] - 1L } else { ## Drop the ranges that are far too left with respect to the ## region of restriction. keep_idx <- which(!far_too_left) ans_start <- ans_start[keep_idx] ans_end <- ans_end[keep_idx] if (!is.null(ans_names)) ans_names <- ans_names[keep_idx] ans_mcols <- extractROWS(ans_mcols, keep_idx) start <- start[keep_idx] end <- end[keep_idx] } ## Fix 'ans_start'. too_left <- !is.na(start) & (ans_start < start) ans_start[too_left] <- start[too_left] ## Compare ranges in 'x' with 'end'. if (drop.ranges.mode == 0L) far_too_right <- !is.na(end) & (ans_start > end) else far_too_right <- !is.na(end) & (ans_start > end + 1L) if (drop.ranges.mode == 2L) { ans_start[far_too_right] <- end[far_too_right] + 1L } else { ## Drop the ranges that are far too right with respect to the ## region of restriction. keep_idx <- which(!far_too_right) ans_start <- ans_start[keep_idx] ans_end <- ans_end[keep_idx] if (!is.null(ans_names)) ans_names <- ans_names[keep_idx] ans_mcols <- extractROWS(ans_mcols, keep_idx) start <- start[keep_idx] end <- end[keep_idx] } ## Fix 'ans_end'. too_right <- !is.na(end) & (ans_end > end) ans_end[too_right] <- end[too_right] ans_width <- ans_end - ans_start + 1L update(x, start=ans_start, width=ans_width, names=ans_names, mcols=ans_mcols, check=FALSE) } setMethod("restrict", "Ranges", function(x, start=NA, end=NA, keep.all.ranges=FALSE, use.names=TRUE) { if (!isTRUEorFALSE(keep.all.ranges)) stop("'keep.all.ranges' must be TRUE or FALSE") use.names <- S4Vectors:::normargUseNames(use.names) if (is(x, "NormalIRanges")) { if (keep.all.ranges) stop("'keep.all.ranges=TRUE' is not supported ", "when 'x' is a NormalIRanges object") drop.ranges.mode <- 0L } else { if (keep.all.ranges) drop.ranges.mode <- 2L else drop.ranges.mode <- 1L } Ranges.restrict(x, start, end, drop.ranges.mode, use.names) } ) setMethod("restrict", "RangesList", function(x, start = NA, end = NA, keep.all.ranges = FALSE, use.names = TRUE) { lx <- length(x) start <- normargAtomicList1(start, IntegerList, lx) end <- normargAtomicList1(end, IntegerList, lx) mendoapply(restrict, x, start = start, end = end, MoreArgs = list(keep.all.ranges = keep.all.ranges, use.names = use.names)) }) setMethod("restrict", "CompressedIRangesList", function(x, start = NA, end = NA, keep.all.ranges = FALSE, use.names = TRUE) { if (!isTRUEorFALSE(keep.all.ranges)) stop("'keep.all.ranges' must be TRUE or FALSE") if (keep.all.ranges) { lx <- length(x) x_eltNROWS <- elementNROWS(x) start <- normargAtomicList2(start, IntegerList, lx, x_eltNROWS) end <- normargAtomicList2(end, IntegerList, lx, x_eltNROWS) slot(x, "unlistData", check=FALSE) <- restrict(x@unlistData, start = start, end = end, keep.all.ranges = keep.all.ranges, use.names = use.names) } else x <- callNextMethod() x }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### threebands() ### setGeneric("threebands", signature="x", function(x, start=NA, end=NA, width=NA) standardGeneric("threebands") ) ### Method for IRanges only! setMethod("threebands", "IRanges", function(x, start=NA, end=NA, width=NA) { middle <- narrow(x, start=start, end=end, width=width, use.names=FALSE) left <- right <- middle left@start <- start(x) left@width <- start(middle) - start(x) right@start <- end(middle) + 1L right@width <- end(x) - end(middle) list(left=left, middle=middle, right=right) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Zooming (symmetrically scales the width). ### ### TODO: Implement a zoom() generic and methods and make the "Ops" method ### below a simple convenience wrapper for zoom(). Recommend the use of zoom() ### over "Ops" methods in packages and scripts. Reserve "Ops" methods as a ### convenience when working interactively. ### setMethod("Ops", c("Ranges", "numeric"), function(e1, e2) { if (S4Vectors:::anyMissing(e2)) stop("NA not allowed as zoom factor") if ((length(e1) < length(e2) && length(e1)) || (length(e1) && !length(e2)) || (length(e1) %% length(e2) != 0)) stop("zoom factor length not a multiple of number of ranges") if (.Generic == "*") { e2 <- ifelse(e2 < 0, abs(1/e2), e2) r <- e1 mid <- (start(r)+end(r))/2 w <- width(r)/e2 update(r, start = ceiling(mid - w/2), width = floor(w), check = FALSE) } else { if (.Generic == "-") { e2 <- -e2 .Generic <- "+" } if (.Generic == "+") { if (any(-e2*2 > width(e1))) stop("adjustment would result in ranges with negative widths") update(e1, start = start(e1) - e2, end = end(e1) + e2, check = FALSE) } } } ) setMethod("Ops", c("RangesList", "numeric"), function(e1, e2) { for (i in seq_len(length(e1))) e1[[i]] <- callGeneric(e1[[i]], e2) e1 }) setMethod("Ops", c("CompressedIRangesList", "numeric"), function(e1, e2) { relist(callGeneric(unlist(e1, use.names = FALSE), e2), e1) }) IRanges/R/multisplit.R0000644000175400017540000000065713175713360015700 0ustar00biocbuildbiocbuild### ========================================================================= ### multisplit() ### ------------------------------------------------------------------------- ### multisplit <- function(x, f) { if (!is.list(f) && !is(f, "List")) stop("'f' must be a list") if (length(x) != length(f)) stop("Length of 'f' must equal length of 'x'") splitAsList(rep(x, elementNROWS(f)), unlist(f, use.names = FALSE)) } IRanges/R/nearest-methods.R0000644000175400017540000001523513175713360016572 0ustar00biocbuildbiocbuild### ========================================================================= ### nearest (and related) methods ### ------------------------------------------------------------------------- ### setClassUnion("RangesORmissing", c("Ranges", "missing")) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### precede() and follow() ### setGeneric("precede", function(x, subject = x, ...) standardGeneric("precede")) setMethod("precede", c("Ranges", "RangesORmissing"), function(x, subject, select = c("first", "all")) { select <- match.arg(select) s <- start(subject) ord <- NULL if (S4Vectors:::isNotSorted(s)) { ord <- base::order(s) s <- s[ord] } if (select == "all") { srle <- Rle(s) s <- runValue(srle) } i <- findInterval(end(x), s) + 1L i[i > length(s)] <- NA if (select == "all") { vectorToHits(i, srle, ord) } else { if (!is.null(ord)) i <- ord[i] i } } ) setGeneric("follow", function(x, subject = x, ...) standardGeneric("follow")) setMethod("follow", c("Ranges", "RangesORmissing"), function(x, subject, select = c("last", "all")) { select <- match.arg(select) e <- end(subject) ord <- NULL if (S4Vectors:::isNotSorted(e)) { ord <- base::order(e) e <- e[ord] } if (select == "all") { srle <- Rle(e) e <- runValue(srle) } i <- findInterval(start(x) - 1L, e) i[i == 0] <- NA if (select == "all") { vectorToHits(i, srle, ord) } else { if (!is.null(ord)) i <- ord[i] i } } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### nearest() ### ### Used in GenomicRanges. ### TODO: Move to Hits-class.R vectorToHits <- function(i, srle, ord) { lx <- length(i) v <- !is.na(i) i <- i[v] w <- width(srle)[i] subj <- as.integer(IRanges(start(srle)[i], width=w)) m <- cbind(queryHits = rep(seq(lx)[v], w), subjectHits = if (!is.null(ord)) ord[subj] else subj) if (!is.null(ord)) m <- m[orderIntegerPairs(m[,1L], m[,2L]),,drop=FALSE] Hits(m[ , 1L], m[ , 2L], lx, length(srle), sort.by.query=TRUE) } setGeneric("nearest", function(x, subject, ...) standardGeneric("nearest")) setMethod("nearest", c("Ranges", "RangesORmissing"), function(x, subject, select = c("arbitrary", "all")) { select <- match.arg(select) if (!missing(subject)) { ol <- findOverlaps(x, subject, maxgap = 0L, select = select) } else { subject <- x ol <- findOverlaps(x, maxgap = 0L, select = select, drop.self = TRUE) } if (select == "all") { olv <- selectHits(ol, select="first") } else olv <- ol x <- x[is.na(olv)] before <- precede(x, subject, if (select == "all") "all" else "first") after <- follow(x, subject, if (select == "all") "all" else "last") if (select == "all") { before0 <- before before <- selectHits(before, select="first") after0 <- after after <- selectHits(after, select="first") } leftdist <- (start(subject)[before] - end(x)) rightdist <- (start(x) - end(subject)[after]) left <- leftdist < rightdist left[is.na(left)] <- is.na(after)[is.na(left)] if (select == "all") { filterHits <- function(hits, i) { m <- as.matrix(hits[as(hits, "IRanges")[i]]) m[,1L] <- map[m[,1L]] m } map <- which(is.na(olv)) right <- !left left[leftdist == rightdist] <- TRUE m <- rbind(as.matrix(ol), filterHits(before0, left), filterHits(after0, right)) m <- m[orderIntegerPairs(m[,1L], m[,2L]),, drop=FALSE] ## unname() required because in case 'm' has only 1 row ## 'm[ , 1L]' and 'm[ , 2L]' will return a named atomic vector ol@from <- unname(m[ , 1L]) ol@to <- unname(m[ , 2L]) } else { olv[is.na(olv)] <- ifelse(left, before, after) ol <- olv } ol }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### distance() ### setGeneric("distance", function(x, y, ...) standardGeneric("distance")) setMethod("distance", c("Ranges", "Ranges"), function(x, y) { max_start <- pmax.int(start(x), start(y)) min_end <- pmin.int(end(x), end(y)) pmax.int(max_start - min_end - 1L, 0L) } ) setMethod("distance", c("Pairs", "missing"), function(x, y) { distance(first(x), second(x)) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### distanceToNearest() ### setGeneric("distanceToNearest", function(x, subject = x, ...) standardGeneric("distanceToNearest")) setMethod("distanceToNearest", c("Ranges", "RangesORmissing"), function(x, subject, select = c("arbitrary", "all")) { select <- match.arg(select) if (missing(subject)) { subject <- x x_nearest <- nearest(x, select = select) } else { x_nearest <- nearest(x, subject, select = select) } if (select == "arbitrary") { queryHits <- seq_along(x)[!is.na(x_nearest)] subjectHits <- x_nearest[!is.na(x_nearest)] } else { queryHits <- queryHits(x_nearest) subjectHits <- subjectHits(x_nearest) } if (!length(subjectHits) || all(is.na(subjectHits))) { Hits(nLnode=length(x), nRnode=length(subject), distance=integer(0), sort.by.query=TRUE) } else { distance = distance(x[queryHits], subject[subjectHits]) Hits(queryHits, subjectHits, length(x), length(subject), distance, sort.by.query=TRUE) } } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### selectNearest() ### selectNearest <- function(hits, x, subject) { hits <- as(hits, "SortedByQueryHits") hitsByQuery <- relist(hits, as(hits, "Partitioning")) dist <- distance(x[queryHits(hits)], subject[subjectHits(hits)]) distByQuery <- relist(dist, hitsByQuery) unlist(hitsByQuery[distByQuery == min(distByQuery)]) } IRanges/R/range-squeezers.R0000644000175400017540000000160713175713360016606 0ustar00biocbuildbiocbuild### ========================================================================= ### Generic functions for squeezing the ranges out of a range-based object ### ------------------------------------------------------------------------- ### Extract the ranges as an IRanges object. setGeneric("ranges", signature="x", function(x, use.names=TRUE, use.mcols=FALSE, ...) standardGeneric("ranges") ) ### Extract the ranges as an IRangesList object. setGeneric("rglist", signature="x", function(x, use.names=TRUE, use.mcols=FALSE, ...) standardGeneric("rglist") ) ### Pairs method. setMethod("rglist", "Pairs", function(x, use.names=TRUE, use.mcols=FALSE) { stopifnot(isTRUEorFALSE(use.mcols)) rl <- zipup(ranges(first(x)), ranges(second(x))) if (!use.mcols) { mcols(rl) <- NULL } rl }) IRanges/R/read.Mask.R0000644000175400017540000003372013175713360015274 0ustar00biocbuildbiocbuild### ========================================================================= ### Read a mask from a file ### ----------------------- ### ### From an NCBI "agp" file (for chrY in hs b36v3): ### library(BSgenome.Hsapiens.NCBI.b36v3) ### file1 <- system.file("extdata", "hs_b36v3_chrY.agp", package="IRanges") ### mask1 <- read.agpMask(file1, seqname="chrY", mask.width=length(Hsapiens$chrY)) ### ### From an UCSC "gap" file (for chrY in hg18): ### library(BSgenome.Hsapiens.UCSC.hg18) ### file2 <- system.file("extdata", "chrY_gap.txt", package="IRanges") ### mask2 <- read.gapMask(file2, seqname="chrY", mask.width=length(Hsapiens$chrY)) ### ### From an UCSC "lift" file (for hg18): ### file3 <- system.file("extdata", "hg18liftAll.lft", package="IRanges") ### mask3 <- read.liftMask(file3, seqname="chr1") ### ### From a RepeatMasker .out file (for chrM in ce2): ### library(BSgenome.Celegans.UCSC.ce2) ### file4 <- system.file("extdata", "ce2chrM.fa.out", package="IRanges") ### mask4 <- read.rmMask(file4, seqname="chrM", mask.width=length(Celegans$chrM)) ### ### From a Tandem Repeats Finder .bed file (for chrM in ce2): ### file5 <- system.file("extdata", "ce2chrM.bed", package="IRanges") ### mask5 <- read.trfMask(file5, seqname="chrM", mask.width=length(Celegans$chrM)) ### ### ------------------------------------------------------------------------- .showDistinctSeqnamesAndStop <- function(seqnames) { distinct_seqnames <- paste("\"", unique(seqnames), "\"", sep="") distinct_seqnames <- paste(distinct_seqnames, collapse=", ") stop(length(distinct_seqnames), " distinct seqnames found in this file: ", distinct_seqnames) } .newEmptyMask <- function(seqname, mask.width, mask.name, mask.desc, nofound_what="information") { msg <- paste("No ", nofound_what, " found for sequence \"", seqname, "\" in this file. ", sep="") if (is.na(mask.width)) stop(msg, "Please use the\n", " 'mask.width' argument to specify the width of the empty mask to\n", " return (i.e. the length of the sequence this mask will be put on).") warning(msg, "returning empty mask") ans <- Mask(mask.width) # empty mask names(ans) <- mask.name desc(ans) <- paste(mask.desc, "(empty)") ans } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### read.agpMask() and read.gapMask() extract the AGAPS mask from an NCBI ### "agp" file or a UCSC "gap" file, respectively. ### .guessGapFileCOL2CLASS <- function(file) { ## UCSC "gap" files generally have the 9 columns below except for some ## organisms like Rhesus that have only 8 columns (no 'bin' column). COL2CLASS <- c( `bin`="integer", `chrom`="character", `chr_start`="integer", `chr_stop`="integer", `part_no`="integer", `part_type`="character", `gap_len`="integer", `gap_type`="character", `bridge`="character" ) line1 <- try(read.table(file, sep="\t", col.names=names(COL2CLASS), colClasses=COL2CLASS, nrows=1L, check.names=FALSE), silent=TRUE) if (!inherits(line1, "try-error")) return(COL2CLASS) COL2CLASS <- COL2CLASS[-1L] line1 <- try(read.table(file, sep="\t", col.names=names(COL2CLASS), colClasses=COL2CLASS, nrows=1L, check.names=FALSE), silent=TRUE) if (!inherits(line1, "try-error")) return(COL2CLASS) stop("unable to guess the column names in \"gap\" file '", file, "', sorry") } .read.agpORgapFile <- function(agp_or_gap, file) { if (agp_or_gap == "agp") { COL2CLASS <- c( `chrom`="character", `chr_start`="integer", `chr_stop`="integer", `part_no`="integer", `part_type`="character", `gap_len`="character", `gap_type`="character", `linkage`="character", `empty`="character" ) } else if (agp_or_gap == "gap") { COL2CLASS <- .guessGapFileCOL2CLASS(file) } else { stop("read.Mask internal error: please report") } COLS <- c( "chrom", "chr_start", "chr_stop", "part_type", "gap_len", "gap_type" ) COL2CLASS[!(names(COL2CLASS) %in% COLS)] <- "NULL" data <- read.table(file, sep="\t", col.names=names(COL2CLASS), colClasses=COL2CLASS, check.names=FALSE, fill=TRUE) } .read.agpORgapMask <- function(agp_or_gap, file, seqname, mask.width, gap.types, use.gap.types) { if (!isSingleString(seqname)) stop("'seqname' must be a single string") if (!isSingleNumberOrNA(mask.width)) stop("'mask.width' must be a single integer or 'NA'") if (!is.integer(mask.width)) mask.width <- as.integer(mask.width) if (!is.null(gap.types) && (!is.character(gap.types) || S4Vectors:::anyMissing(gap.types) || anyDuplicated(gap.types))) stop("'gap.types' must be 'NULL' or a character vector ", "with no NAs and no duplicated") if (!isTRUEorFALSE(use.gap.types)) stop("'use.gap.types' must be TRUE or FALSE") data <- .read.agpORgapFile(agp_or_gap, file) if (seqname == "?") .showDistinctSeqnamesAndStop(data$chrom) data <- data[data$chrom == seqname, ] ii <- data$part_type == "N" if (agp_or_gap == "agp") { data <- data[ii, ] } else if (!all(ii)) { warning("gap file contains gaps with a part_type that is not N") } if (length(gap.types) == 1 && gap.types == "?") { found_types <- paste("\"", unique(data$gap_type), "\"", sep="") found_types <- paste(found_types, collapse=", ") stop("gap types found in this file for sequence \"", seqname, "\": ", found_types) } mask.name <- "AGAPS" mask.desc <- "assembly gaps" if (!is.null(gap.types)) { data <- data[data$gap_type %in% gap.types, ] mask.desc <- paste(mask.desc, " [type=", paste(gap.types, collapse="|"), "]", sep="") } if (nrow(data) == 0) return(.newEmptyMask(seqname, mask.width, mask.name, mask.desc, mask.desc)) if (agp_or_gap == "agp") ranges_start <- data$chr_start else ranges_start <- data$chr_start + 1L ranges <- IRanges(start=ranges_start, width=as.integer(data$gap_len)) ## Sanity check if (!identical(end(ranges), data$chr_stop)) stop("broken \"", agp_or_gap, "\" file: contains inconsistent ", "chr_start/chr_stop/gap_len values ", "for assembly gaps in sequence \"", seqname, "\"") if (use.gap.types) { names(ranges) <- data$gap_type if (S4Vectors:::isNotStrictlySorted(start(ranges))) ranges <- ranges[base::order(start(ranges))] if (!isNormal(ranges)) stop("cannot use the gap types when some gaps are adjacent or overlap") nir1 <- asNormalIRanges(ranges, force=FALSE) } else { nir1 <- asNormalIRanges(ranges, force=TRUE) } ## Don't use new2(): the validity of the new mask needs to be checked! new2("MaskCollection", nir_list=list(nir1), width=mask.width, active=TRUE, NAMES=mask.name, desc=mask.desc, check=FALSE) } read.agpMask <- function(file, seqname="?", mask.width=NA, gap.types=NULL, use.gap.types=FALSE) .read.agpORgapMask("agp", file, seqname, mask.width, gap.types, use.gap.types) read.gapMask <- function(file, seqname="?", mask.width=NA, gap.types=NULL, use.gap.types=FALSE) .read.agpORgapMask("gap", file, seqname, mask.width, gap.types, use.gap.types) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### read.liftMask() extracts the AGAPS mask from a UCSC "lift" file. ### .read.liftFile <- function(file) { COL2CLASS <- c( `offset`="integer", `xxxx`="NULL", # not sure how to call this `width`="integer", `seqname`="character", `seqlen`="integer" ) read.table(file, col.names=names(COL2CLASS), colClasses=COL2CLASS, check.names=FALSE) } read.liftMask <- function(file, seqname="?", mask.width=NA) { if (!isSingleString(seqname)) stop("'seqname' must be a single string") if (!isSingleNumberOrNA(mask.width)) stop("'mask.width' must be a single integer or 'NA'") if (!is.integer(mask.width)) mask.width <- as.integer(mask.width) data <- .read.liftFile(file) if (seqname == "?") .showDistinctSeqnamesAndStop(data$seqname) data <- data[data$seqname == seqname, ] if (nrow(data) == 0) return(.newEmptyMask(seqname, mask.width, "AGAPS", "assembly gaps")) ## Sanity checks seqlen0 <- unique(data$seqlen) if (length(seqlen0) != 1) stop("broken \"lift\" file: contains different lengths ", "for sequence \"", seqname, "\"") if (!is.na(mask.width) && mask.width != seqlen0) stop("when supplied, 'mask.width' must match the length found ", "in the file for sequence \"", seqname, "\"") contigs0 <- IRanges(start=data$offset+1, width=data$width) contigs1 <- asNormalIRanges(contigs0, force=TRUE) if (length(contigs1) != length(contigs0)) warning("some contigs are adjacent or overlapping") contigs <- Mask(seqlen0, start=start(contigs1), width=width(contigs1)) ans <- gaps(contigs) names(ans) <- "AGAPS" desc(ans) <- "assembly gaps" ans } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### read.rmMask() extracts the RM mask from a RepeatMasker .out file. ### ### See http://www.repeatmasker.org/webrepeatmaskerhelp.html for a ### description of the RepeatMasker output format. ### .read.rmFile <- function(file) { COL2CLASS <- c( `SW_score`="integer", `perc_div`="numeric", `perc_del`="numeric", `perc_ins`="numeric", `query_sequence`="character", `begin_in_query`="integer", `end_in_query`="integer", `left_in_query`="character", `C`="character", `matching_repeat`="character", `repeat_class_or_family`="character", `begin_in_repeat`="integer", `end_in_repeat`="integer", `left_in_repeat`="character", `ID`="character" ) COLS <- c("query_sequence", "begin_in_query", "end_in_query", "ID") COL2CLASS[!(names(COL2CLASS) %in% COLS)] <- "NULL" read.table(file, col.names=names(COL2CLASS), colClasses=COL2CLASS, skip=3, check.names=FALSE) } read.rmMask <- function(file, seqname="?", mask.width=NA, use.IDs=FALSE) { if (!isSingleString(seqname)) stop("'seqname' must be a single string") if (!isSingleNumberOrNA(mask.width)) stop("'mask.width' must be a single integer or 'NA'") if (!is.integer(mask.width)) mask.width <- as.integer(mask.width) if (!isTRUEorFALSE(use.IDs)) stop("'use.IDs' must be TRUE or FALSE") data <- .read.rmFile(file) if (seqname == "?") .showDistinctSeqnamesAndStop(data$query_sequence) data <- data[data$query_sequence == seqname, ] if (nrow(data) == 0) return(.newEmptyMask(seqname, mask.width, "RM", "RepeatMasker")) ranges <- IRanges(start=data$begin_in_query, end=data$end_in_query) if (use.IDs) { names(ranges) <- data$ID if (S4Vectors:::isNotStrictlySorted(start(ranges))) ranges <- ranges[base::order(start(ranges))] if (!isNormal(ranges)) stop("cannot use the repeat IDs when some repeats are adjacent or overlap") nir1 <- asNormalIRanges(ranges, force=FALSE) } else { nir1 <- asNormalIRanges(ranges, force=TRUE) } ## Don't use new2(): the validity of the new mask needs to be checked! new2("MaskCollection", nir_list=list(nir1), width=mask.width, active=TRUE, NAMES="RM", desc="RepeatMasker", check=FALSE) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### read.trfMask() extracts the TRF mask from a Tandem Repeats Finder .bed ### file. ### .read.trfFile <- function(file) { COL2CLASS <- c( `chrom`="character", `chromStart`="integer", `chromEnd`="integer", `name`="character", `period`="integer", `copyNum`="numeric", `consensusSize`="integer", `perMatch`="integer", `perIndel`="integer", `score`="integer", `A`="integer", `C`="integer", `G`="integer", `T`="integer", `entropy`="numeric", `sequence`="character" ) COLS <- c("chrom", "chromStart", "chromEnd") COL2CLASS[!(names(COL2CLASS) %in% COLS)] <- "NULL" read.table(file, col.names=names(COL2CLASS), colClasses=COL2CLASS, check.names=FALSE) } read.trfMask <- function(file, seqname="?", mask.width=NA) { if (!isSingleString(seqname)) stop("'seqname' must be a single string") if (!isSingleNumberOrNA(mask.width)) stop("'mask.width' must be a single integer or 'NA'") if (!is.integer(mask.width)) mask.width <- as.integer(mask.width) data <- .read.trfFile(file) if (seqname == "?") .showDistinctSeqnamesAndStop(data$chrom) data <- data[data$chrom == seqname, ] if (nrow(data) == 0) return(.newEmptyMask(seqname, mask.width, "TRF", "Tandem Repeats Finder")) ranges <- IRanges(start=data$chromStart+1, end=data$chromEnd) nir1 <- asNormalIRanges(ranges, force=TRUE) ## Don't use new2(): the validity of the new mask needs to be checked! new2("MaskCollection", nir_list=list(nir1), width=mask.width, active=TRUE, NAMES="TRF", desc="Tandem Repeats Finder", check=FALSE) } IRanges/R/reverse-methods.R0000644000175400017540000000546613175713360016611 0ustar00biocbuildbiocbuild### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "reverse" generic and methods. ### setGeneric("reverse", function(x, ...) standardGeneric("reverse")) setMethod("reverse", "character", function(x, ...) { if (length(x) == 0) return(x) sapply(strsplit(x, NULL, fixed=TRUE), function(xx) paste(rev(xx), collapse="")) } ) ### This method does NOT preserve normality. .IRanges.reverse <- function(x, ...) { if (length(x) == 0L) return(x) args <- S4Vectors:::extraArgsAsList(NULL, ...) argnames <- names(args) n2p <- match(c("start", "end", "use.names"), argnames) if (is.na(n2p[1L])) { start <- min(start(x)) } else { start <- args[[n2p[1L]]] if (!is.numeric(start)) stop("'start' must be a vector of integers") if (!is.integer(start)) start <- as.integer(start) if (S4Vectors:::anyMissing(start)) stop("'start' contains NAs") } if (is.na(n2p[2L])) { end <- max(end(x)) } else { end <- args[[n2p[2L]]] if (!is.numeric(end)) stop("'end' must be a vector of integers") if (!is.integer(end)) end <- as.integer(end) if (S4Vectors:::anyMissing(end)) stop("'end' contains NAs") } if (!is.na(n2p[3L]) && !S4Vectors:::normargUseNames(args[[n2p[3L]]])) x <- set_IRanges_names(x, NULL) ## WARNING: -end(x) *must* appear first in this expression if we want ## the supplied 'start' and 'end' to be recycled properly. ## Remember that in R, because of the recycling, addition of numeric ## vectors of different lengths is not associative i.e. in general ## '(x + y) + z' is not the same as 'x + (y + z)'. For example: ## (integer(6) + 1:2) + 1:3 and integer(6) + (1:2 + 1:3) ## are not the same. x@start[] <- -end(x) + start + end x } setMethod("reverse", "IRanges", .IRanges.reverse) setMethod("reverse", "NormalIRanges", function(x, ...) { ## callNextMethod() temporarily breaks 'x' as a NormalIRanges object ## because the returned ranges are ordered from right to left. x <- callNextMethod() update(x, start=rev(start(x)), width=rev(width(x)), names=rev(names(x)), mcols=S4Vectors:::revROWS(mcols(x))) } ) setMethod("reverse", "Views", function(x, ...) { x@subject <- rev(subject(x)) x@ranges <- reverse(ranges(x), start=1L, end=length(subject(x))) x } ) setMethod("reverse", "MaskCollection", function(x, ...) { start <- 1L end <- width(x) x@nir_list <- lapply(nir_list(x), function(nir) reverse(nir, start=start, end=end) ) x } ) IRanges/R/seqapply.R0000644000175400017540000000221613175713360015321 0ustar00biocbuildbiocbuild### ========================================================================= ### The stuff in this file should go somewhere else, probably close to ### splitAsList() (currently extractList.R) ### ------------------------------------------------------------------------- ### ## NOT exported. `splitAsList<-` <- function(x, f, drop = FALSE, ..., value) { if (!isTRUEorFALSE(drop)) stop("'drop' must be TRUE or FALSE") if (length(x) != length(f)) stop("Length of 'f' must equal the length of 'x'") ind <- splitAsList(seq_len(length(x)), f, drop = drop) if (length(ind) != length(value)) stop("Length of 'value' must equal the length of a split on 'f'") x[unlist(ind, use.names=FALSE)] <- unlist(value, use.names = FALSE) x } setMethod("unsplit", "List", function(value, f, drop = FALSE) { value_flat <- unlist(value, use.names = FALSE) if (length(value_flat) != length(f)) stop("Length of 'unlist(value)' must equal length of 'f'") splitAsList(value_flat, f, drop = drop) <- value value_flat }) setReplaceMethod("split", "Vector", function(x, f, drop = FALSE, ..., value) { splitAsList(x, f, drop = drop, ...) <- value x }) IRanges/R/setops-methods.R0000644000175400017540000002544013175713360016445 0ustar00biocbuildbiocbuild### ========================================================================= ### Set operations ### ------------------------------------------------------------------------- ### ### 1) Vector-wise set operations: union, intersect, setdiff ### ### When the input are Ranges objects, the functions in that group interpret ### each supplied object ('x' or 'y') as a set of integer values. Therefore, ### if 2 IRanges objects 'x1' and 'x2' represent the same set of integers, ### then each of these functions will return the same result when 'x1' is ### replaced with 'x2' in the input. The returned IRanges object is ### guaranteed to be normal but is *not* promoted to NormalIRanges. ### ### 2) Element-wise (aka "parallel") set operations: punion, pintersect, ### psetdiff, pgap ### ### The functions in that group take 2 *objects* of the same length and ### return an object of the same class and length as the first argument. ### ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### union() ### ### Always return an IRanges *instance* whatever Ranges derivatives are passed ### to it (e.g. IPos, NCList or NormalIRanges), so does NOT act like an ### endomorphism in general. setMethod("union", c("Ranges", "Ranges"), function(x, y) { ## We downgrade 'x' to an IRanges instance so 'c(x, y)' is guaranteed ## to work (even e.g. if 'x' is a NormalIRanges object). x <- as(x, "IRanges", strict=TRUE) reduce(c(x, y), drop.empty.ranges=TRUE) } ) setMethod("union", c("RangesList", "RangesList"), function(x, y) mendoapply(union, x, y)) setMethod("union", c("CompressedIRangesList", "CompressedIRangesList"), function(x, y) { len <- max(length(x), length(y)) if (length(x) != len) x <- x[S4Vectors:::recycleVector(seq_len(length(x)), len)] if (length(y) != len) y <- y[S4Vectors:::recycleVector(seq_len(length(y)), len)] xy <- c(unlist(x, use.names = FALSE), unlist(y, use.names = FALSE)) xy_list <- split(xy, factor(c(togroup(PartitioningByWidth(x)), togroup(PartitioningByWidth(y))), seq_len(length(x)))) names(xy_list) <- names(x) reduce(xy_list, drop.empty.ranges=TRUE) }) setMethod("union", c("Pairs", "missing"), function(x, y, ...) { callGeneric(first(x), second(x), ...) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### intersect() ### ### Always return an IRanges *instance* whatever Ranges derivatives are passed ### to it (e.g. IPos, NCList or NormalIRanges), so does NOT act like an ### endomorphism in general. setMethod("intersect", c("Ranges", "Ranges"), function(x, y) { if (length(x) == 0L) return(x) start <- min(c(start(x), start(y))) end <- max(c(end(x), end(y))) setdiff(x, gaps(y, start=start, end=end)) } ) setMethod("intersect", c("RangesList", "RangesList"), function(x, y) mendoapply(intersect, x, y)) setMethod("intersect", c("CompressedIRangesList", "CompressedIRangesList"), function(x, y) { nonempty <- elementNROWS(x) != 0L rx <- unlist(range(x), use.names = FALSE) startx <- integer() startx[nonempty] <- start(rx) endx <- integer() endx[nonempty] <- end(rx) setdiff(x, gaps(y, start = startx, end = endx)) }) setMethod("intersect", c("Pairs", "missing"), function(x, y, ...) { callGeneric(first(x), second(x), ...) }) setMethod("intersect", c("CompressedAtomicList", "CompressedAtomicList"), function(x, y) { fx <- if (!is(x, "IntegerList")) as(x, "FactorList") else x fy <- if (!is(y, "IntegerList")) as(y, "FactorList") else y m <- S4Vectors:::matchIntegerPairs(togroup(PartitioningByEnd(x)), unlist(fx, use.names=FALSE), togroup(PartitioningByEnd(y)), unlist(fy, use.names=FALSE), nomatch=0L) m[duplicated(m)] <- 0L x[relist(m > 0L, x)] }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### setdiff() ### ### Always return an IRanges *instance* whatever Ranges derivatives are passed ### to it (e.g. IPos, NCList or NormalIRanges), so does NOT act like an ### endomorphism in general. setMethod("setdiff", c("Ranges", "Ranges"), function(x, y) { if (length(x) == 0L) return(x) start <- min(c(start(x), start(y))) end <- max(c(end(x), end(y))) gaps(union(gaps(x, start=start, end=end), y), start=start, end=end) } ) setMethod("setdiff", c("RangesList", "RangesList"), function(x, y) mendoapply(setdiff, x, y)) setMethod("setdiff", c("CompressedIRangesList", "CompressedIRangesList"), function(x, y) { nonempty <- elementNROWS(x) != 0L rx <- unlist(range(x), use.names = FALSE) startx <- rep(NA_integer_, length(x)) startx[nonempty] <- start(rx) endx <- rep(NA_integer_, length(x)) endx[nonempty] <- end(rx) gaps(union(gaps(x), y), start = startx, end = endx) }) setMethod("setdiff", c("Pairs", "missing"), function(x, y, ...) { callGeneric(first(x), second(x), ...) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### punion() ### setGeneric("punion", signature=c("x", "y"), function(x, y, ...) standardGeneric("punion") ) setMethod("punion", c("Ranges", "Ranges"), function(x, y, fill.gap=FALSE) { if (length(x) != length(y)) stop("'x' and 'y' must have the same length") if (!isTRUEorFALSE(fill.gap)) stop("'fill.gap' must be TRUE or FALSE") if (!fill.gap) { gap <- pmax.int(start(x), start(y)) - pmin.int(end(x), end(y)) - 1L if (any(gap > 0L)) stop("some pair of ranges have a gap within ", "the 2 members of the pair.\n", " Use 'fill.gap=TRUE' to enforce their ", "union by filling the gap.") } ans_start <- pmin.int(start(x), start(y)) ans_end <- pmax.int(end(x), end(y)) ans_names <- names(x) if (is.null(ans_names)) ans_names <- names(y) IRanges(start=ans_start, end=ans_end, names=ans_names) } ) setMethod("punion", c("Pairs", "missing"), function(x, y, ...) { callGeneric(first(x), second(x), ...) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### pintersect() ### setGeneric("pintersect", signature=c("x", "y"), function(x, y, ...) standardGeneric("pintersect") ) setMethod("pintersect", c("Ranges", "Ranges"), function(x, y, resolve.empty=c("none", "max.start", "start.x")) { if (length(x) != length(y)) stop("'x' and 'y' must have the same length") ans_start <- pmax.int(start(x), start(y)) ans_end <- pmin.int(end(x), end(y)) ans_width <- ans_end - ans_start + 1L keep_empty_x <- width(x) == 0L if (any(keep_empty_x)) { keep_empty_x <- keep_empty_x & ((start(x) >= start(y) & start(x) <= end(y)) | (start(x) == start(y) & width(y) == 0L)) } if (any(keep_empty_x)) { ans_start[keep_empty_x] <- start(x)[keep_empty_x] ans_width[keep_empty_x] <- 0L } keep_empty_y <- width(y) == 0L if (any(keep_empty_y)) { keep_empty_y <- keep_empty_y & start(y) >= start(x) & start(y) <= end(x) } if (any(keep_empty_y)) { ans_start[keep_empty_y] <- start(y)[keep_empty_y] ans_width[keep_empty_y] <- 0L } check_empty <- ans_width < 0L check_empty[keep_empty_x | keep_empty_y] <- FALSE if (any(check_empty)) { resolve.empty <- match.arg(resolve.empty) if (resolve.empty == "none") { stop("some intersections produce ambiguous empty ranges.\n", " Use argument 'resolve.empty' to resolve them.") } else { ans_width[check_empty] <- 0L if (resolve.empty == "start.x") ans_start[check_empty] <- start(x)[check_empty] } } ans_names <- names(x) if (is.null(ans_names)) ans_names <- names(y) IRanges(start=ans_start, width=ans_width, names=ans_names) } ) setMethod("pintersect", c("Pairs", "missing"), function(x, y, ...) { callGeneric(first(x), second(x), ...) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### psetdiff() ### setGeneric("psetdiff", signature=c("x", "y"), function(x, y, ...) standardGeneric("psetdiff") ) setMethod("psetdiff", c("Ranges", "Ranges"), function(x, y) { if (length(x) != length(y)) stop("'x' and 'y' must have the same length") ans_start <- start(x) ans_end <- end(x) if (any((start(y) > ans_start) & (end(y) < ans_end))) stop("some ranges in 'y' have their end points strictly inside\n", " the range in 'x' that they need to be subtracted from.\n", " Cannot subtract them.") start2 <- pmax.int(ans_start, start(y)) end2 <- pmin.int(ans_end, end(y)) ii <- start2 <= end2 jj <- end2 == ans_end kk <- ii & jj ans_end[kk] <- start2[kk] - 1L kk <- ii & (!jj) ans_start[kk] <- end2[kk] + 1L ans_names <- names(x) if (is.null(ans_names)) ans_names <- names(y) IRanges(start=ans_start, end=ans_end, names=ans_names) } ) setMethod("psetdiff", c("Pairs", "missing"), function(x, y, ...) { callGeneric(first(x), second(x), ...) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### pgap() ### setGeneric("pgap", signature=c("x", "y"), function(x, y, ...) standardGeneric("pgap") ) setMethod("pgap", c("Ranges", "Ranges"), function(x, y) { if (length(x) != length(y)) stop("'x' and 'y' must have the same length") ans_end_plus1 <- pmax.int(start(x), start(y)) ans_start <- pmin.int(end(x), end(y)) + 1L ans_width <- ans_end_plus1 - ans_start ans_width[ans_width < 0L] <- 0L ans_names <- names(x) if (is.null(ans_names)) ans_names <- names(y) IRanges(start=ans_start, width=ans_width, names=ans_names) } ) IRanges/R/slice-methods.R0000644000175400017540000000662213175713360016230 0ustar00biocbuildbiocbuild### ========================================================================= ### Slice the bread ### ------------------------------------------------------------------------- setGeneric("slice", signature="x", function(x, lower=-Inf, upper=Inf, ...) standardGeneric("slice")) setMethod("slice", "Rle", function(x, lower = -Inf, upper = Inf, includeLower = TRUE, includeUpper = TRUE, rangesOnly = FALSE) { if (!isSingleNumber(lower)) { stop("'lower' must be a single number") } if (!isSingleNumber(upper)) { stop("'upper' must be a single number") } if (!isTRUEorFALSE(includeLower)) { stop("'includeLower' must be TRUE or FALSE") } if (!isTRUEorFALSE(includeUpper)) { stop("'includeUpper' must be TRUE or FALSE") } if (!isTRUEorFALSE(rangesOnly)) { stop("'rangesOnly' must be TRUE or FALSE") } if (lower == -Inf) { ranges <- Rle(TRUE, length(x)) } else if (includeLower) { ranges <- (x >= lower) } else { ranges <- (x > lower) } if (upper < Inf) { if (includeUpper) { ranges <- ranges & (x <= upper) } else { ranges <- ranges & (x < upper) } } if (rangesOnly) { as(ranges, "IRanges") } else { Views(x, ranges) } }) setMethod("slice", "RleList", function(x, lower = -Inf, upper = Inf, includeLower = TRUE, includeUpper = TRUE, rangesOnly = FALSE) { if (!isSingleNumber(lower)) stop("'lower' must be a single number") if (!isSingleNumber(upper)) stop("'upper' must be a single number") if (!isTRUEorFALSE(includeLower)) stop("'includeLower' must be TRUE or FALSE") if (!isTRUEorFALSE(includeUpper)) stop("'includeUpper' must be TRUE or FALSE") if (!isTRUEorFALSE(rangesOnly)) stop("'rangesOnly' must be TRUE or FALSE") if (lower == -Inf) { ranges <- RleList(lapply(elementNROWS(x), function(len) Rle(TRUE, len)), compress=FALSE) } else if (includeLower) { ranges <- (x >= lower) } else { ranges <- (x > lower) } if (upper < Inf) { if (includeUpper) { ranges <- ranges & (x <= upper) } else { ranges <- ranges & (x < upper) } } if (rangesOnly) { as(ranges, "CompressedIRangesList") } else { RleViewsList(rleList = x, rangesList = as(ranges, "SimpleIRangesList")) } }) setMethod("slice", "ANY", function(x, lower=-Inf, upper=Inf, ...) { slice(as(x, "Rle"), lower=lower, upper=upper, ...) }) IRanges/R/subsetting-utils.R0000644000175400017540000000477113175713360017020 0ustar00biocbuildbiocbuild### ========================================================================= ### Subsetting utility functions ### ------------------------------------------------------------------------- ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### RangesNSBS objects. ### setClass("RangesNSBS", # not exported contains="NSBS", representation( subscript="Ranges" ) #prototype( # subscript=IRanges() #) ) ### Construction methods. ### Supplied arguments are trusted so we don't check them! setMethod("NSBS", "Ranges", function(i, x, exact=TRUE, strict.upper.bound=TRUE, allow.NAs=FALSE) { i_len <- length(i) if (i_len == 0L) { ## Return a NativeNSBS object of length 0. i <- NULL return(callGeneric()) } x_NROW <- NROW(x) if (min(start(i)) < 1L || strict.upper.bound && max(end(i)) > x_NROW) S4Vectors:::.subscript_error("subscript contains out-of-bounds ", "ranges") if (i_len == 1L) { ans <- new2("RangeNSBS", subscript=c(start(i), end(i)), upper_bound=x_NROW, upper_bound_is_strict=strict.upper.bound, check=FALSE) return(ans) } new2("RangesNSBS", subscript=i, upper_bound=x_NROW, upper_bound_is_strict=strict.upper.bound, check=FALSE) } ) ### Other methods. setMethod("length", "RangesNSBS", function(x) sum(width(x@subscript))) setMethod("anyDuplicated", "RangesNSBS", function(x, incomparables=FALSE, ...) !isDisjoint(x@subscript) ) setMethod("isStrictlySorted", "RangesNSBS", function(x) isNormal(x@subscript)) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### "extractROWS" methods for subsetting *by* a Ranges object. ### setMethod("extractROWS", c("vector_OR_factor", "RangesNSBS"), function(x, i) { start <- start(i@subscript) width <- width(i@subscript) S4Vectors:::extract_ranges_from_vector_OR_factor(x, start, width) } ) setMethod("extractROWS", c("Rle", "RangesNSBS"), function(x, i) { start <- start(i@subscript) width <- width(i@subscript) ans <- S4Vectors:::extract_ranges_from_Rle(x, start, width) mcols(ans) <- extractROWS(mcols(x), i) ans } ) IRanges/R/tile-methods.R0000644000175400017540000000572613175713360016072 0ustar00biocbuildbiocbuild### ========================================================================= ### "tile" methods ### ------------------------------------------------------------------------- ### ### TODO: We have a profileration of tools for creating these "sliding ### windows" or "tiles": successiveIRanges(), tileGenome(), tile(), and now ### slidingWindows(). With no visible coherent naming scheme. Introducing ### a new verb each time we get frustrated because the existing tools don't ### let us create tiles or windows exactly the way we'd like for the use case ### of the day is not a sustainable strategy in the long run. This just adds ### more and more confusion for the end-user. ### So some effort will need to be done towards unification of all these ### tools. H.P. -- Oct 16, 2016. setGeneric("tile", function(x, n, width, ...) standardGeneric("tile"), signature="x") setMethod("tile", "Ranges", function(x, n, width, ...) { if (!missing(n)) { if (!missing(width)) stop("only one of 'n' and 'width' can be specified") if (any(IRanges::width(x) < n)) stop("some width(x) are less than 'n'") if (any(n < 0L)) stop("some 'n' are negative") n <- S4Vectors:::recycleVector(n, length(x)) } if (!missing(width)) { if (!missing(n)) stop("only one of 'n' and 'width' can be specified") if (any(width < 0L)) stop("some 'width' are negative") n <- ceiling(width(x) / width) } width <- IRanges::width(x) / n ## The floor() is intentional for compatibility with Jim Kent's BigWig code ## tileGenome() uses ceiling() instead tile.end <- floor(as.integer(IRanges(rep(1L, length(n)), width=n)) * rep(width, n)) tile.end.abs <- tile.end + rep(start(x), n) - 1L tile.width <- S4Vectors:::diffWithInitialZero(as.integer(tile.end.abs)) p <- PartitioningByWidth(n) tile.width[start(p)] <- tile.end[start(p)] relist(IRanges(width=tile.width, end=tile.end.abs), p) }) ### ========================================================================= ### "slidingWindows" methods ### ------------------------------------------------------------------------- ### setGeneric("slidingWindows", function(x, width, step = 1L, ...) standardGeneric("slidingWindows"), signature="x") setMethod("slidingWindows", "Ranges", function(x, width, step = 1L) { if (!isSingleNumber(width)) stop("'width' must be a single, non-NA number") if (!isSingleNumber(step)) stop("'step' must be a single, non-NA number") if (any(width < 0L)) stop("some 'width' are negative") if (any(step < 0L)) stop("some 'step' are negative") n <- ceiling(pmax(width(x) - width, 0L) / step) + 1L window.starts <- as.integer(IRanges(rep(0L, length(n)), width=n)) * step + 1L windows <- restrict(IRanges(window.starts, width=width), end=rep(width(x), n)) windows.abs <- shift(windows, rep(start(x), n) - 1L) relist(windows.abs, PartitioningByWidth(n)) }) IRanges/R/zzz.R0000644000175400017540000000022113175713360014312 0ustar00biocbuildbiocbuild### .onUnload <- function(libpath) { library.dynam.unload("IRanges", libpath) } .test <- function() BiocGenerics:::testPackage("IRanges") IRanges/TODO0000644000175400017540000001212513175713360013627 0ustar00biocbuildbiocbuildImmediate TODO list ------------------- - Bug fix: Combining RangedData objects is currently broken (IRanges 1.9.20): library(IRanges) ranges <- IRanges(c(1,2,3),c(4,5,6)) rd1 <- RangedData(ranges) rd2 <- RangedData(shift(ranges, 100)) rd <- c(rd1, rd2) # Seems to work (with some warnings)... validObject(rd) # but returns an invalid object! - Herve: Make the MaskCollection class a derivative of the SimpleIRangesList class. - Herve: Use a different name for "reverse" method for IRanges and MaskCollection objects. Seems like, for IRanges objects, reverse() and reflect() are doing the same thing, so I should just keep (and eventually adapt) the latter. Also, I should add a "reflect" method for SimpleIRangesList objects that would do what the current "reverse" method for MaskCollection objects does. Once this is done, adapt R/reverse.R file in Biostrings to use reflect() instead of reverse() wherever needed. - Clean up endomorphisms. Long term TODO list ------------------- o RangesList: - parallel rbind - binary ops: "nearest", "intersect", "setdiff", "union" - 'y' omitted: become n-ary ops on items in collection - 'y' specified: performed element-wise - unary ops: "coverage" etc are vectorized o DataTable: - group generics (Math, Ops, Summary) o SplitDataFrameList: - rbind o IO: - xscan() - read data directly into XVector objects ------------------------------------- Conceptual framework (by Michael) ------------------------------------- Basic problem: We have lots of (long) data series and need a way to efficiently represent and manipulate them. A series is a vector, except that the positions of the elements are meaningful. That is, we often expect strong auto-correlation. We have an abstraction called "Vector" for representing these series. There are currently two optimized means of storing long series: 1) Externally, currently only in memory, in XVector derivatives. The main benefit here is avoiding unnecessary copying, though there is potential for vectors stored in databases and flat files on disk (but this is outside our use case). 2) Run-length encoding (Rle class). This is a classic means of compressing discrete-valued series. It is very efficient, as long as there are long runs of equal value. Rle, so far, is far ahead of XVector in terms of direct usefulness. If XVector were implemented with an environment, rather than an external pointer, adding functionality would be easier. Could carry some things over from externalVector. As the sequence of observations in a series is important, we often want to manipulate specific regions of the series. We can use the window() function to select a particular region from a Vector, and a logical Rle can represent a selection of multiple regions. A slightly more general representation, that supports overlapping regions, is the Ranges class. A Ranges object holds any number of start,width pairs that describe closed intervals representing the set of integers that fall within the endpoints. The primary implementation is IRanges, which stores the information as two integer vectors. Often the endpoints of the intervals are interesting independent of the underlying sequence. Many utilities are implemented for manipulating and analyzing Ranges. These include: 1) overlap detection 2) nearest neighbors: precede, follow, nearest 3) set operations: (p)union, (p)intersect, (p)setdiff, gaps 4) coverage, too bio specific? rename to 'table'? 5) resolving overlap: reduce and (soon) collapse 6) transformations: flank, reflect, restrict, narrow... 7) (soon) mapping/alignment There are two ways to explicitly pair a Ranges object with a Vector: 1) Masking, as in MaskedXString, where only the elements outside of the Ranges are considered by an operation. 2) Views, which are essentially lists of subsequences. This relies in the fly-weight pattern for efficiency. Several fast paths, like viewSums and viewMaxs, are implemented. There is an RleViews and an XIntegerViews (is this one currently used at all?). Views are limited to subsequences derived from a single sequence. For more general lists of sequences, we have a separate framework, based on the List class. The List optionally ensures that all of its elements are derived from a specified type, and it also aims to efficiently represent a major use case of lists: splitting a vector by a factor. The indices of the elements with each factor level are stored, but there is no physical split of the vector into separate list elements. A special case that often occurs in data analysis is a list containing a set of variables in the same dataset. This problem is solved by 'data.frame' in base R, and we have an equivalent DataFrame class that can hold any type of R object, as long as it has a vector semantic. Many of the important data structures have List analogs. These include all atomic types, as well as: * SplitDataFrameList: a list of DataFrames that have the same columns (usually the result of a split) * RangesList: Essentially just a list of Ranges objects, but often used for splitting Ranges by their "space" (e.g. chromosome) IRanges/build/0000755000175400017540000000000013175724757014251 5ustar00biocbuildbiocbuildIRanges/build/vignette.rds0000644000175400017540000000034113175724757016606 0ustar00biocbuildbiocbuildb```b`fab`b2 1# ' JKO-/K-*L- +GS"嘧WRR\PEȰ44%( VQ@#4%bXZ Bid9S+`&aaqI"$apq2]a>o8`>ONI,F6WJbI^ZP?9IRanges/inst/0000755000175400017540000000000013175724757014127 5ustar00biocbuildbiocbuildIRanges/inst/CITATION0000644000175400017540000000165413175713360015256 0ustar00biocbuildbiocbuildcitEntry(entry="article", title = "Software for Computing and Annotating Genomic Ranges", author = personList( as.person("Michael Lawrence" ), as.person("Wolfgang Huber" ), as.person("Herv\\'e Pag\\`es" ), as.person("Patrick Aboyoun" ), as.person("Marc Carlson" ), as.person("Robert Gentleman" ), as.person("Martin Morgan" ), as.person("Vincent Carey" )), year = 2013, journal = "{PLoS} Computational Biology", volume = "9", issue = "8", doi = "10.1371/journal.pcbi.1003118", url = "http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003118", textVersion = "Lawrence M, Huber W, Pag\\`es H, Aboyoun P, Carlson M, et al. (2013) Software for Computing and Annotating Genomic Ranges. PLoS Comput Biol 9(8): e1003118. doi:10.1371/journal.pcbi.1003118" ) IRanges/inst/doc/0000755000175400017540000000000013175724757014674 5ustar00biocbuildbiocbuildIRanges/inst/doc/IRangesOverview.R0000644000175400017540000003060113175724756020075 0ustar00biocbuildbiocbuild### R code from vignette source 'IRangesOverview.Rnw' ### Encoding: UTF-8 ################################################### ### code chunk number 1: options ################################################### options(width=72) ################################################### ### code chunk number 2: biocLite (eval = FALSE) ################################################### ## source("http://bioconductor.org/biocLite.R") ## biocLite("IRanges") ################################################### ### code chunk number 3: initialize ################################################### library(IRanges) ################################################### ### code chunk number 4: initialize ################################################### set.seed(0) lambda <- c(rep(0.001, 4500), seq(0.001, 10, length = 500), seq(10, 0.001, length = 500)) xVector <- Rle(rpois(1e7, lambda)) yVector <- Rle(rpois(1e7, lambda[c(251:length(lambda), 1:250)])) ################################################### ### code chunk number 5: basic-ops ################################################### length(xVector) xVector[1] zVector <- c(xVector, yVector) ################################################### ### code chunk number 6: seq-extraction ################################################### xSnippet <- xVector[IRanges(4751, 4760)] xSnippet head(xSnippet) tail(xSnippet) rev(xSnippet) rep(xSnippet, 2) subset(xSnippet, xSnippet >= 5L) ################################################### ### code chunk number 7: seq-combine ################################################### c(xSnippet, rev(xSnippet)) append(xSnippet, xSnippet, after = 3) ################################################### ### code chunk number 8: aggregate ################################################### xSnippet aggregate(xSnippet, start = 1:8, width = 3, FUN = median) ################################################### ### code chunk number 9: shiftApply-cor ################################################### cor(xVector, yVector) shifts <- seq(235, 265, by=3) corrs <- shiftApply(shifts, yVector, xVector, FUN = cor) ################################################### ### code chunk number 10: figshiftcorrs ################################################### plot(shifts, corrs) ################################################### ### code chunk number 11: Rle-construction ################################################### xRle <- Rle(xVector) yRle <- Rle(yVector) xRle yRle ################################################### ### code chunk number 12: Rle-vector-compare ################################################### as.vector(object.size(xRle) / object.size(xVector)) identical(as.vector(xRle), xVector) ################################################### ### code chunk number 13: Rle-accessors ################################################### head(runValue(xRle)) head(runLength(xRle)) ################################################### ### code chunk number 14: Rle-ops ################################################### xRle > 0 xRle + yRle xRle > 0 | yRle > 0 ################################################### ### code chunk number 15: Rle-summary ################################################### range(xRle) sum(xRle > 0 | yRle > 0) ################################################### ### code chunk number 16: Rle-math ################################################### log1p(xRle) ################################################### ### code chunk number 17: Rle-cor ################################################### cor(xRle, yRle) shiftApply(249:251, yRle, xRle, FUN = function(x, y) var(x, y) / (sd(x) * sd(y))) ################################################### ### code chunk number 18: list-intro ################################################### getClassDef("RleList") ################################################### ### code chunk number 19: list-construct ################################################### args(IntegerList) cIntList1 <- IntegerList(x = xVector, y = yVector) cIntList1 sIntList2 <- IntegerList(x = xVector, y = yVector, compress = FALSE) sIntList2 ## sparse integer list xExploded <- lapply(xVector[1:5000], function(x) seq_len(x)) cIntList2 <- IntegerList(xExploded) sIntList2 <- IntegerList(xExploded, compress = FALSE) object.size(cIntList2) object.size(sIntList2) ################################################### ### code chunk number 20: list-length ################################################### length(cIntList2) Rle(elementNROWS(cIntList2)) ################################################### ### code chunk number 21: list-lapply ################################################### system.time(sapply(xExploded, mean)) system.time(sapply(sIntList2, mean)) system.time(sapply(cIntList2, mean)) identical(sapply(xExploded, mean), sapply(sIntList2, mean)) identical(sapply(xExploded, mean), sapply(cIntList2, mean)) ################################################### ### code chunk number 22: list-groupgenerics ################################################### xRleList <- RleList(xRle, 2L * rev(xRle)) yRleList <- RleList(yRle, 2L * rev(yRle)) xRleList > 0 xRleList + yRleList sum(xRleList > 0 | yRleList > 0) ################################################### ### code chunk number 23: list-endoapply ################################################### safe.max <- function(x) { if(length(x)) max(x) else integer(0) } endoapply(sIntList2, safe.max) endoapply(cIntList2, safe.max) endoapply(sIntList2, safe.max)[[1]] ################################################### ### code chunk number 24: iranges-constructor ################################################### ir1 <- IRanges(start = 1:10, width = 10:1) ir2 <- IRanges(start = 1:10, end = 11) ir3 <- IRanges(end = 11, width = 10:1) identical(ir1, ir2) & identical(ir2, ir3) ir <- IRanges(c(1, 8, 14, 15, 19, 34, 40), width = c(12, 6, 6, 15, 6, 2, 7)) ################################################### ### code chunk number 25: iranges-start ################################################### start(ir) ################################################### ### code chunk number 26: iranges-end ################################################### end(ir) ################################################### ### code chunk number 27: iranges-width ################################################### width(ir) ################################################### ### code chunk number 28: iranges-subset-numeric ################################################### ir[1:4] ################################################### ### code chunk number 29: iranges-subset-logical ################################################### ir[start(ir) <= 15] ################################################### ### code chunk number 30: ranges-extraction ################################################### ir[[1]] ################################################### ### code chunk number 31: plotRanges ################################################### plotRanges <- function(x, xlim = x, main = deparse(substitute(x)), col = "black", sep = 0.5, ...) { height <- 1 if (is(xlim, "Ranges")) xlim <- c(min(start(xlim)), max(end(xlim))) bins <- disjointBins(IRanges(start(x), end(x) + 1)) plot.new() plot.window(xlim, c(0, max(bins)*(height + sep))) ybottom <- bins * (sep + height) - height rect(start(x)-0.5, ybottom, end(x)+0.5, ybottom + height, col = col, ...) title(main) axis(1) } ################################################### ### code chunk number 32: ir-plotRanges ################################################### plotRanges(ir) ################################################### ### code chunk number 33: ranges-reduce ################################################### reduce(ir) plotRanges(reduce(ir)) ################################################### ### code chunk number 34: rangeslist-contructor ################################################### rl <- RangesList(ir, rev(ir)) ################################################### ### code chunk number 35: rangeslist-start ################################################### start(rl) ################################################### ### code chunk number 36: bracket-ranges ################################################### irextract <- IRanges(start = c(4501, 4901) , width = 100) xRle[irextract] ################################################### ### code chunk number 37: overlap-ranges ################################################### ol <- findOverlaps(ir, reduce(ir)) as.matrix(ol) ################################################### ### code chunk number 38: ranges-coverage ################################################### cov <- coverage(ir) plotRanges(ir) cov <- as.vector(cov) mat <- cbind(seq_along(cov)-0.5, cov) d <- diff(cov) != 0 mat <- rbind(cbind(mat[d,1]+1, mat[d,2]), mat) mat <- mat[order(mat[,1]),] lines(mat, col="red", lwd=4) axis(2) ################################################### ### code chunk number 39: ranges-shift ################################################### shift(ir, 10) ################################################### ### code chunk number 40: ranges-narrow ################################################### narrow(ir, start=1:5, width=2) ################################################### ### code chunk number 41: ranges-restrict ################################################### restrict(ir, start=2, end=3) ################################################### ### code chunk number 42: ranges-threebands ################################################### threebands(ir, start=1:5, width=2) ################################################### ### code chunk number 43: ranges-plus ################################################### ir + seq_len(length(ir)) ################################################### ### code chunk number 44: ranges-asterisk ################################################### ir * -2 # double the width ################################################### ### code chunk number 45: ranges-disjoin ################################################### disjoin(ir) plotRanges(disjoin(ir)) ################################################### ### code chunk number 46: ranges-disjointBins ################################################### disjointBins(ir) ################################################### ### code chunk number 47: ranges-reflect ################################################### reflect(ir, IRanges(start(ir), width=width(ir)*2)) ################################################### ### code chunk number 48: ranges-flank ################################################### flank(ir, width = seq_len(length(ir))) ################################################### ### code chunk number 49: ranges-gaps ################################################### gaps(ir, start=1, end=50) plotRanges(gaps(ir, start=1, end=50), c(1,50)) ################################################### ### code chunk number 50: ranges-pgap ################################################### ################################################### ### code chunk number 51: ranges-union ################################################### ################################################### ### code chunk number 52: ranges-punion ################################################### ################################################### ### code chunk number 53: ranges-intersect ################################################### ################################################### ### code chunk number 54: ranges-pintersect ################################################### ################################################### ### code chunk number 55: ranges-setdiff ################################################### ################################################### ### code chunk number 56: ranges-psetdiff ################################################### ################################################### ### code chunk number 57: Views-constructors ################################################### xViews <- Views(xRle, xRle >= 1) xViews <- slice(xRle, 1) xViewsList <- slice(xRleList, 1) ################################################### ### code chunk number 58: views-looping ################################################### head(viewSums(xViews)) viewSums(xViewsList) head(viewMaxs(xViews)) viewMaxs(xViewsList) ################################################### ### code chunk number 59: sessionInfo ################################################### toLatex(sessionInfo()) IRanges/inst/doc/IRangesOverview.Rnw0000644000175400017540000007512613175713360020442 0ustar00biocbuildbiocbuild%\VignetteIndexEntry{An Introduction to IRanges} %\VignetteDepends{} %\VignetteKeywords{Ranges} %\VignettePackage{IRanges} \documentclass[10pt]{article} \usepackage{times} \usepackage{hyperref} \textwidth=6.5in \textheight=8.5in %\parskip=.3cm \oddsidemargin=-.1in \evensidemargin=-.1in \headheight=-.3in \newcommand{\Rfunction}[1]{{\texttt{#1}}} \newcommand{\Robject}[1]{{\texttt{#1}}} \newcommand{\Rpackage}[1]{{\textit{#1}}} \newcommand{\Rmethod}[1]{{\texttt{#1}}} \newcommand{\Rfunarg}[1]{{\texttt{#1}}} \newcommand{\Rclass}[1]{{\textit{#1}}} \newcommand{\Rcode}[1]{{\texttt{#1}}} \newcommand{\software}[1]{\textsf{#1}} \newcommand{\R}{\software{R}} \newcommand{\IRanges}{\Rpackage{IRanges}} \title{An Introduction to \IRanges{}} \author{Patrick Aboyoun, Michael Lawrence, Herv\'e Pag\`es} \date{\today} \begin{document} \maketitle <>= options(width=72) @ \section{Introduction} The \IRanges{} package is designed to represent sequences, ranges representing indices along those sequences, and data related to those ranges. In this vignette, we will rely on simple, illustrative example datasets, rather than large, real-world data, so that each data structure and algorithm can be explained in an intuitive, graphical manner. We expect that packages that apply \IRanges{} to a particular problem domain will provide vignettes with relevant, realistic examples. The \IRanges{} package is available at bioconductor.org and can be downloaded via \Rfunction{biocLite}: <>= source("http://bioconductor.org/biocLite.R") biocLite("IRanges") @ <>= library(IRanges) @ \section{Vector objects} In the context of the \IRanges{} package, a sequence is an ordered finite collection of elements. The \IRanges{} packages represents two types of objects as sequences: (1) atomic sequences and (2) lists (or non-atomic sequences). The following subsections describe each in turn. All \IRanges{}-derived sequences inherit from the \Rclass{Vector} virtual class. \subsection{Atomic Vectors} In \R{}, atomic sequences are typically stored in atomic vectors. The \IRanges{} package includes an additional atomic sequence object type, \Rclass{Rle}, which compresses an atomic sequence through run-length encoding. We begin our discussion of atomic sequences using two \Rclass{Rle} vectors. <>= set.seed(0) lambda <- c(rep(0.001, 4500), seq(0.001, 10, length = 500), seq(10, 0.001, length = 500)) xVector <- Rle(rpois(1e7, lambda)) yVector <- Rle(rpois(1e7, lambda[c(251:length(lambda), 1:250)])) @ All atomic sequences in \R{} have three main properties: (1) a notion of length or number of elements, (2) the ability to extract elements to create new atomic sequences, and (3) the ability to be combined with one or more atomic sequences to form larger atomic sequences. The main functions for these three operations are \Rfunction{length}, \Rfunction{[}, and \Rfunction{c}. <>= length(xVector) xVector[1] zVector <- c(xVector, yVector) @ While these three methods may seem trivial, they provide a great deal of power and many atomic sequence manipulations can be constructed using them. \subsubsection{Vector Subsetting} As with ordinary \R{} atomic vectors, it is often necessary to subset one sequence from another. When this subsetting does not duplicate or reorder the elements being extracted, the result is called a \textit{subsequence}. In general, the \Rfunction{[} function can be used to construct a new sequence or extract a subsequence, but its interface is often inconvenient and not amenable to optimization. To compensate for this, the \IRanges{} package supports seven additional functions for sequence extraction: \begin{enumerate} \item \Rfunction{window} - Extracts a subsequence over a specified region. \item \Rfunction{subset} - Extracts the subsequence specified by a logical vector. \item \Rfunction{head} - Extracts a consecutive subsequence containing the first n elements. \item \Rfunction{tail} - Extracts a consecutive subsequence containing the last n elements. \item \Rfunction{rev} - Creates a new sequence with the elements in the reverse order. \item \Rfunction{rep} - Creates a new sequence by repeating sequence elements. \end{enumerate} The following code illustrates how these functions are used on an ordinary \R{} \Rclass{integer} vector: <>= xSnippet <- xVector[IRanges(4751, 4760)] xSnippet head(xSnippet) tail(xSnippet) rev(xSnippet) rep(xSnippet, 2) subset(xSnippet, xSnippet >= 5L) @ \subsubsection{Combining Vectors} The \IRanges{} package uses two generic functions, \Rfunction{c} and \Rfunction{append}, for combining two \Rclass{Vector} objects. The methods for \Rclass{Vector} objects follow the definition that these two functions are given the the \Rpackage{base} package. <>= c(xSnippet, rev(xSnippet)) append(xSnippet, xSnippet, after = 3) @ \subsubsection{Looping over Vectors and Vector subsets} In \R{}, \Rfunction{for} looping can be an expensive operation. To compensate for this, \IRanges{} uses three generics, \Rfunction{endoapply}, \Rfunction{lapply}, and \Rfunction{sapply}, for looping over sequences and two generics, \Rfunction{aggregate} and \Rfunction{shiftApply}, to perform calculations over subsequences. The \Rfunction{lapply} and \Rfunction{sapply} functions are familiar to many \R{} users since they are the standard functions for looping over the elements of an \R{} \Rclass{list} object. The \Rfunction{endoapply} function performs an endomorphism equivalent to \Rfunction{lapply}, i.e. returns a \Rclass{Vector} object of the same class as the input rather than a \Rclass{list} object. More will be given on these three functions in the Lists subsection. The \Rfunction{aggregate} function combines sequence extraction functionality of the \Rfunction{window} function with looping capabilities of the \Rfunction{sapply} function. For example, here is some code to compute medians across a moving window of width 3 using the function \Rfunction{aggregate}: <>= xSnippet aggregate(xSnippet, start = 1:8, width = 3, FUN = median) @ The \Rfunction{shiftApply} function is a looping operation involving two sequences whose elements are lined up via a positional shift operation. For example, the elements of \Robject{xVector} and \Robject{yVector} were simulated from Poisson distributions with the mean of element i from \Robject{yVector} being equivalent to the mean of element i + 250 from \Robject{xVector}. If we did not know the size of the shift, we could estimate it by finding the shift that maximizes the correlation between \Robject{xVector} and \Robject{yVector}. <>= cor(xVector, yVector) shifts <- seq(235, 265, by=3) corrs <- shiftApply(shifts, yVector, xVector, FUN = cor) @ % <>= plot(shifts, corrs) @ The result is shown in Fig.~\ref{figshiftcorrs}. \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-figshiftcorrs} \caption{\label{figshiftcorrs}% Correlation between \Robject{xVector} and \Robject{yVector} for various shifts.} \end{center} \end{figure} \subsubsection{Run Length Encoding} Up until this point we have used \R{} atomic vectors to represent atomic sequences, but there are times when these object become too large to manage in memory. When there are lots of consecutive repeats in the sequence, the data can be compressed and managed in memory through a run-length encoding where a data value is paired with a run length. For example, the sequence \{1, 1, 1, 2, 3, 3\} can be represented as values = \{1, 2, 3\}, run lengths = \{3, 1, 2\}. The \Rclass{Rle} class in \IRanges{} is used to represent a run-length encoded (compressed) sequence of \Rclass{logical}, \Rclass{integer}, \Rclass{numeric}, \Rclass{complex}, \Rclass{character}, or \Rclass{raw} values. One way to construct an \Rclass{Rle} object is through the \Rclass{Rle} constructor function: <>= xRle <- Rle(xVector) yRle <- Rle(yVector) xRle yRle @ When there are lots of consecutive repeats, the memory savings through an RLE can be quite dramatic. For example, the \Robject{xRle} object occupies less than one quarter of the space of the original \Robject{xVector} object, while storing the same information: <>= as.vector(object.size(xRle) / object.size(xVector)) identical(as.vector(xRle), xVector) @ The functions \Rfunction{runValue} and \Rfunction{runLength} extract the run values and run lengths from an \Rclass{Rle} object respectively: <>= head(runValue(xRle)) head(runLength(xRle)) @ The \Rclass{Rle} class supports many of the basic methods associated with \R{} atomic vectors including the Ops, Math, Math2, Summary, and Complex group generics. Here is a example of manipulating \Rclass{Rle} objects using methods from the Ops group: <>= xRle > 0 xRle + yRle xRle > 0 | yRle > 0 @ Here are some from the Summary group: <>= range(xRle) sum(xRle > 0 | yRle > 0) @ And here is one from the Math group: <>= log1p(xRle) @ As with the atomic vectors, the \Rfunction{cor} and \Rfunction{shiftApply} functions operate on \Rclass{Rle} objects: <>= cor(xRle, yRle) shiftApply(249:251, yRle, xRle, FUN = function(x, y) var(x, y) / (sd(x) * sd(y))) @ For more information on the methods supported by the \Rclass{Rle} class, consult the \Rcode{Rle} man page. \subsection{Lists} In many data analysis situation there is a desire to organize and manipulate multiple objects simultaneously. Typically this is done in \R{} through the usage of a list. While a list serves as a generic container, it does not confer any information about the specific class of its elements, provides no infrastructure to ensure type safety, and the S3 and S4 method dispatch mechanisms do not support method selection for lists with homogeneous object types. The \Rclass{List} virtual class defined in the \IRanges{} package addresses these issues. \Rclass{List} is a direct extension of \Rclass{Vector}. \subsubsection{Lists of Atomic Vectors} The first type of lists we consider are those containing atomic sequences such as \Rclass{integer} vectors or \Rclass{Rle} objects. We may wish to define a method that retrieves the length of each atomic sequence element, without special type checking. To enable this, we define collection classes such as \Rclass{IntegerList} and \Rclass{RleList}, which inherit from the \Rclass{List} virtual class, for representing lists of \Rclass{integer} vectors and \Rclass{Rle} objects respectively. <>= getClassDef("RleList") @ As the class definition above shows, the \Rclass{RleList} class is virtual with subclasses \Rclass{SimpleRleList} and \Rclass{CompressedRleList}. A \Rclass{SimpleRleList} class uses a regular \R{} list to store the underlying elements and the \Rclass{CompressedRleList} class stores the elements in an unlisted form and keeps track of where the element breaks are. The former ``simple list" class is useful when the Rle elements are long and the latter ``compressed list" class is useful when the list is long and/or sparse (i.e. a number of the list elements have length 0). In fact, all of the atomic vector types (raw, logical, integer, numeric, complex, and character) have similar list classes that derive from the \Rclass{List} virtual class. For example, there is an \Rclass{IntegerList} virtual class with subclasses \Rclass{SimpleIntegerList} and \Rclass{CompressedIntegerList}. Each of the list classes for atomic sequences, be they stored as vectors or \Rclass{Rle} objects, have a constructor function with a name of the appropriate list virtual class, such as \Rclass{IntegerList}, and an optional argument \Rfunarg{compress} that takes an argument to specify whether or not to create the simple list object type or the compressed list object type. The default is to create the compressed list object type. <>= args(IntegerList) cIntList1 <- IntegerList(x = xVector, y = yVector) cIntList1 sIntList2 <- IntegerList(x = xVector, y = yVector, compress = FALSE) sIntList2 ## sparse integer list xExploded <- lapply(xVector[1:5000], function(x) seq_len(x)) cIntList2 <- IntegerList(xExploded) sIntList2 <- IntegerList(xExploded, compress = FALSE) object.size(cIntList2) object.size(sIntList2) @ The \Rfunction{length} function returns the number of elements in a \Rclass{Vector}-derived object and, for a \Rclass{List}-derived object like ``simple list" or ``compressed list", the \Rfunction{elementNROWS} function returns an integer vector containing the lengths of each of the elements: <>= length(cIntList2) Rle(elementNROWS(cIntList2)) @ Just as with ordinary \R{} \Rclass{list} objects, \Rclass{List}-derived object support the \Rfunction{[[} for element extraction, \Rfunction{c} for combining, and \Rfunction{lapply}/\Rfunction{sapply} for looping. When looping over sparse lists, the ``compressed list" classes can be much faster during computations since only the non-empty elements are looped over during the \Rfunction{lapply}/\Rfunction{sapply} computation and all the empty elements are assigned the appropriate value based on their status. <>= system.time(sapply(xExploded, mean)) system.time(sapply(sIntList2, mean)) system.time(sapply(cIntList2, mean)) identical(sapply(xExploded, mean), sapply(sIntList2, mean)) identical(sapply(xExploded, mean), sapply(cIntList2, mean)) @ Unlist ordinary \R{} \Rclass{list} objects, \Rclass{AtomicList} objects support the \Rfunction{Ops} (e.g. \Rfunction{+}, \Rfunction{==}, \Rfunction{\&}), \Rfunction{Math} (e.g. \Rfunction{log}, \Rfunction{sqrt}), \Rfunction{Math2} (e.g. \Rfunction{round}, \Rfunction{signif}), \Rfunction{Summary} (e.g. \Rfunction{min}, \Rfunction{max}, \Rfunction{sum}), and \Rfunction{Complex} (e.g. \Rfunction{Re}, \Rfunction{Im}) group generics. <>= xRleList <- RleList(xRle, 2L * rev(xRle)) yRleList <- RleList(yRle, 2L * rev(yRle)) xRleList > 0 xRleList + yRleList sum(xRleList > 0 | yRleList > 0) @ Since these atomic lists inherit from \Rclass{List}, they can also use the looping function \Rfunction{endoapply} to perform endomorphisms. <>= safe.max <- function(x) { if(length(x)) max(x) else integer(0) } endoapply(sIntList2, safe.max) endoapply(cIntList2, safe.max) endoapply(sIntList2, safe.max)[[1]] @ \section{Data Tables} To Do: \Rclass{DataTable}, \Rclass{DataFrame}, \Rclass{DataFrameList}, \Rclass{SplitDataFrameList} \section{Vector Annotations} Often when one has a collection of objects, there is a need to attach metadata that describes the collection in some way. Two kinds of metadata can be attached to a \Rclass{Vector} object: \begin{enumerate} \item Metadata about the object as a whole: this metadata is accessed via the \Rfunction{metadata} accessor and is represented as an ordinary \Rclass{list}; \item Metadata about the individual elements of the object: this metadata is accessed via the \Rfunction{mcols} accessor (\Rfunction{mcols} stands for {\it metadata columns}) and is represented as a \Rclass{DataTable} object (i.e. as an instance of a concrete subclass of \Rclass{DataTable}, e.g. a \Rclass{DataFrame} object). This \Rclass{DataTable} object can be thought of as the result of binding together one or several vector-like objects (the metadata columns) of the same length as the \Rclass{Vector} object. Each row of the \Rclass{DataTable} object annotates the corresponding element of the \Rclass{Vector} object. \end{enumerate} \section{Vector Ranges} When analyzing sequences, we are often interested in particular consecutive subsequences. For example, the \Sexpr{letters} vector could be considered a sequence of lower-case letters, in alphabetical order. We would call the first five letters (\textit{a} to \textit{e}) a consecutive subsequence, while the subsequence containing only the vowels would not be consecutive. It is not uncommon for an analysis task to focus only on the geometry of the regions, while ignoring the underlying sequence values. A list of indices would be a simple way to select a subsequence. However, a sparser representation for consecutive subsequences would be a range, a pairing of a start position and a width, as used when extracting sequences with \Rfunction{window}. When analyzing subsequences in \IRanges{}, each range is treated as an observation. The virtual \Rclass{Ranges} class represents lists of ranges, or, equivalently and as a derivative \Rclass{IntegerList}, sequences of consecutive integers. The most commonly used implementation of \Rclass{Ranges} is \Rclass{IRanges}, which stores the starts and widths as ordinary integer vectors. To construct an \Rclass{IRanges} instance, we call the \Rfunction{IRanges} constructor. Ranges are normally specified by passing two out of the three parameters: start, end and width (see \Rcode{help(IRanges)} for more information). % <>= ir1 <- IRanges(start = 1:10, width = 10:1) ir2 <- IRanges(start = 1:10, end = 11) ir3 <- IRanges(end = 11, width = 10:1) identical(ir1, ir2) & identical(ir2, ir3) ir <- IRanges(c(1, 8, 14, 15, 19, 34, 40), width = c(12, 6, 6, 15, 6, 2, 7)) @ % All of the above calls construct an \Rclass{IRanges} instance with the same ranges, using different combinations of the \Rfunarg{start}, \Rfunarg{end} and \Rfunarg{width} parameters. Accessing the starts, widths and ends is supported by every \Rclass{Ranges} implementation. <>= start(ir) @ <>= end(ir) @ <>= width(ir) @ For \Rclass{IRanges} and some other \Rclass{Ranges} derivatives, subsetting is also supported, by numeric and logical indices. <>= ir[1:4] @ <>= ir[start(ir) <= 15] @ One may think of each range as a sequence of integer ranges, and \Rclass{Ranges} is, in fact, derived from \Rclass{IntegerList}. <>= ir[[1]] @ In order to illustrate range operations, we'll create a function to plot ranges. <>= plotRanges <- function(x, xlim = x, main = deparse(substitute(x)), col = "black", sep = 0.5, ...) { height <- 1 if (is(xlim, "Ranges")) xlim <- c(min(start(xlim)), max(end(xlim))) bins <- disjointBins(IRanges(start(x), end(x) + 1)) plot.new() plot.window(xlim, c(0, max(bins)*(height + sep))) ybottom <- bins * (sep + height) - height rect(start(x)-0.5, ybottom, end(x)+0.5, ybottom + height, col = col, ...) title(main) axis(1) } @ <>= plotRanges(ir) @ \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-ir-plotRanges} \caption{\label{fig-ir-plotRanges}% Plot of original ranges.} \end{center} \end{figure} \subsection{Normality} Sometimes, it is necessary to formally represent a subsequence, where no elements are repeated and order is preserved. Also, it is occasionally useful to think of a \Rclass{Ranges} object as a set, where no elements are repeated and order does not matter. While every \Rclass{Ranges} object, as a \Rclass{Vector} derivative, has an implicit ordering, one can enforce the same ordering for all such objects, so that ordering becomes inconsequential within that context. The \Rclass{NormalIRanges} class formally represents either a subsequence encoding or a set of integers. By definition a Ranges object is said to be \textit{normal} when its ranges are: (a) not empty (i.e. they have a non-null width); (b) not overlapping; (c) ordered from left to right; (d) not even adjacent (i.e. there must be a non empty gap between 2 consecutive ranges). There are three main advantages of using a \textit{normal} \Rclass{Ranges} object: (1) it guarantees a subsequence encoding or set of integers, (2) it is compact in terms of the number of ranges, and (3) it uniquely identifies its information, which simplifies comparisons. The \Rfunction{reduce} function reduces any \Rclass{Ranges} object to a \Rclass{NormalIRanges} by merging redundant ranges. <>= reduce(ir) plotRanges(reduce(ir)) @ \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-ranges-reduce} \caption{\label{fig-ranges-reduce}% Plot of reduced ranges.} \end{center} \end{figure} \subsection{Lists of \Rclass{Ranges} objects} It is common to manipulate collections of \Rclass{Ranges} objects during an analysis. Thus, the \IRanges{} package defines some specific classes for working with multiple \Rclass{Ranges} objects. The \Rclass{RangesList} class asserts that each element is a \Rclass{Ranges} object and provides convenience methods, such as \Rfunction{start}, \Rfunction{end} and \Rfunction{width} accessors that return \Rclass{IntegerList} objects, aligning with the \Rclass{RangesList} object. To explicitly construct a \Rclass{RangesList}, use the \Rfunction{RangesList} function. <>= rl <- RangesList(ir, rev(ir)) @ % <>= start(rl) @ \subsection{Vector Extraction} As the elements of a \Rclass{Ranges} object encode consecutive subsequences, they may be used directly in sequence extraction. Note that when a \textit{normal} \Rclass{Ranges} is given as the index, the result is a subsequence, as no elements are repeated or reordered. If the sequence is a \Rclass{Vector} subclass (i.e. not an ordinary \Rclass{vector}), the canonical \Rfunction{[} function accepts a \Rclass{Ranges} instance. % <>= irextract <- IRanges(start = c(4501, 4901) , width = 100) xRle[irextract] @ % \subsection{Finding Overlapping Ranges} The function \Rfunction{findOverlaps} detects overlaps between two \Rclass{Ranges} objects. <>= ol <- findOverlaps(ir, reduce(ir)) as.matrix(ol) @ \subsection{Counting Overlapping Ranges} The function \Rfunction{coverage} counts the number of ranges over each position. <>= cov <- coverage(ir) plotRanges(ir) cov <- as.vector(cov) mat <- cbind(seq_along(cov)-0.5, cov) d <- diff(cov) != 0 mat <- rbind(cbind(mat[d,1]+1, mat[d,2]), mat) mat <- mat[order(mat[,1]),] lines(mat, col="red", lwd=4) axis(2) @ \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-ranges-coverage} \caption{\label{fig-ranges-coverage}% Plot of ranges with accumulated coverage.} \end{center} \end{figure} \subsection{Finding Neighboring Ranges} The \Rfunction{nearest} function finds the nearest neighbor ranges (overlapping is zero distance). The \Rfunction{precede} and \Rfunction{follow} functions find the non-overlapping nearest neighbors on a specific side. \subsection{Transforming Ranges} Utilities are available for transforming a \Rclass{Ranges} object in a variety of ways. Some transformations, like \Rfunction{reduce} introduced above, can be dramatic, while others are simple per-range adjustments of the starts, ends or widths. \subsubsection{Adjusting starts, ends and widths} Perhaps the simplest transformation is to adjust the start values by a scalar offset, as performed by the \Rfunction{shift} function. Below, we shift all ranges forward 10 positions. % <>= shift(ir, 10) @ There are several other ways to transform ranges. These include \Rfunction{narrow}, \Rfunction{resize}, \Rfunction{flank}, \Rfunction{reflect}, \Rfunction{restrict}, and \Rfunction{threebands}. For example \Rfunction{narrow} supports the adjustment of start, end and width values, which should be relative to each range. These adjustments are vectorized over the ranges. As its name suggests, the ranges can only be narrowed. % <>= narrow(ir, start=1:5, width=2) @ The \Rfunction{restrict} function ensures every range falls within a set of bounds. Ranges are contracted as necessary, and the ranges that fall completely outside of but not adjacent to the bounds are dropped, by default. % <>= restrict(ir, start=2, end=3) @ The \Rfunction{threebands} function extends \Rfunction{narrow} so that the remaining left and right regions adjacent to the narrowed region are also returned in separate \Rclass{Ranges} objects. % <>= threebands(ir, start=1:5, width=2) @ The arithmetic operators \Rfunction{+}, \Rfunction{-} and \Rfunction{*} change both the start and the end/width by symmetrically expanding or contracting each range. Adding or subtracting a numeric (integer) vector to a \Rclass{Ranges} causes each range to be expanded or contracted on each side by the corresponding value in the numeric vector. <>= ir + seq_len(length(ir)) @ % The \Rfunction{*} operator symmetrically magnifies a \Rclass{Ranges} object by a factor, where positive contracts (zooms in) and negative expands (zooms out). % <>= ir * -2 # double the width @ WARNING: The semantic of these arithmetic operators might be revisited at some point. Please restrict their use to the context of interactive visualization (where they arguably provide some convenience) but avoid to use them programmatically. \subsubsection{Making ranges disjoint} A more complex type of operation is making a set of ranges disjoint, \textit{i.e.} non-overlapping. For example, \Rfunction{threebands} returns a disjoint set of three ranges for each input range. The \Rfunction{disjoin} function makes a \Rclass{Ranges} object disjoint by fragmenting it into the widest ranges where the set of overlapping ranges is the same. <>= disjoin(ir) plotRanges(disjoin(ir)) @ \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-ranges-disjoin} \caption{\label{fig-ranges-disjoin}% Plot of disjoined ranges.} \end{center} \end{figure} A variant of \Rfunction{disjoin} is \Rfunction{disjointBins}, which divides the ranges into bins, such that the ranges in each bin are disjoint. The return value is an integer vector of the bins. <>= disjointBins(ir) @ \subsubsection{Other transformations} Other transformations include \Rfunction{reflect} and \Rfunction{flank}. The former ``flips'' each range within a set of common reference bounds. <>= reflect(ir, IRanges(start(ir), width=width(ir)*2)) @ % The \Rfunction{flank} returns ranges of a specified width that flank, to the left (default) or right, each input range. One use case of this is forming promoter regions for a set of genes. <>= flank(ir, width = seq_len(length(ir))) @ % \subsection{Set Operations} Sometimes, it is useful to consider a \Rclass{Ranges} object as a set of integers, although there is always an implicit ordering. This is formalized by \Rclass{NormalIRanges}, above, and we now present versions of the traditional mathematical set operations \textit{complement}, \textit{union}, \textit{intersect}, and \textit{difference} for \Rclass{Ranges} objects. There are two variants for each operation. The first treats each \Rclass{Ranges} object as a set and returns a \textit{normal} value, while the other has a ``parallel'' semantic like \Rfunction{pmin}/\Rfunction{pmax} and performs the operation for each range pairing separately. The \textit{complement} operation is implemented by the \Rfunction{gaps} and \Rfunction{pgap} functions. By default, \Rfunction{gaps} will return the ranges that fall between the ranges in the (normalized) input. It is possible to specify a set of bounds, so that flanking ranges are included. <>= gaps(ir, start=1, end=50) plotRanges(gaps(ir, start=1, end=50), c(1,50)) @ \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-ranges-gaps} \caption{\label{fig-ranges-gap}% Plot of gaps from ranges.} \end{center} \end{figure} \Rfunction{pgap} considers each parallel pairing between two \Rclass{Ranges} objects and finds the range, if any, between them. Note that the function name is singular, suggesting that only one range is returned per range in the input. <>= @ The remaining operations, \textit{union}, \textit{intersect} and \textit{difference} are implemented by the \Rfunction{[p]union}, \Rfunction{[p]intersect} and \Rfunction{[p]setdiff} functions, respectively. These are relatively self-explanatory. <>= @ <>= @ <>= @ <>= @ <>= @ <>= @ % \subsection{Mapping Ranges Between Vectors} \section{Vector Views} The \IRanges{} package provides the virtual \Rclass{Views} class, which stores a sequence together with an \Rclass{IRanges} object defining ranges on the sequence. Each range is said to represent a \textit{view} onto the sequence. Here, we will demonstrate the \Rclass{RleViews} class, where the sequence is of class \Rclass{Rle}. Other \Rclass{Views} implementations exist, such as \Rclass{XStringViews} in the \Rpackage{Biostrings} package. \subsection{Creating Views} There are two basic constructors for creating views: the \Rfunction{Views} function based on indicators and the \Rfunction{slice} based on numeric boundaries. <>= xViews <- Views(xRle, xRle >= 1) xViews <- slice(xRle, 1) xViewsList <- slice(xRleList, 1) @ \subsection{Aggregating Views} While \Rfunction{sapply} can be used to loop over each window, the native functions \Rfunction{viewMaxs}, \Rfunction{viewMins}, \Rfunction{viewSums}, and \Rfunction{viewMeans} provide fast looping to calculate their respective statistical summaries. <>= head(viewSums(xViews)) viewSums(xViewsList) head(viewMaxs(xViews)) viewMaxs(xViewsList) @ \section{IRanges in Biological Sequence Analysis} The \IRanges{} packages was primarily designed with biological sequence analysis in mind and Table \ref{table:bioseq} shows how some biological sequence analysis concepts are represented in the \IRanges{} class system. \begin{table}[ht] \begin{center} \begin{tabular}{l|l} \hline Biological Entity & \Rclass{Vector} Subclass \\ \hline Genome browser track(s) & \Rclass{GRanges}/\Rclass{GRangesList} \\ Coverage across chromosomes/contigs & \Rclass{RleList} \\ Mapped ranges to genome & \Rclass{CompressedIRangesList} \\ Data (sans ranges) across chroms/contigs & \Rclass{SplitDataFrameList} \\ \hline \end{tabular} \end{center} \caption{\Rclass{Vector} subclasses for Biological Sequence Analysis} \label{table:bioseq} \end{table} \pagebreak[4] \section{Session Information} \begin{table*}[tbp] \begin{minipage}{\textwidth} <>= toLatex(sessionInfo()) @ \end{minipage} \caption{\label{tab:sessioninfo}% The output of \Rfunction{sessionInfo} on the build system after running this vignette.} \end{table*} \end{document} IRanges/inst/doc/IRangesOverview.pdf0000644000175400017540000070273413175724756020462 0ustar00biocbuildbiocbuild%PDF-1.5 % 117 0 obj << /Length 2441 /Filter /FlateDecode >> stream xYKW9QȨH8@q.`98%οO=ɦDxrA]]]7o4\m<:¥Q}%|οO 9[o,f69ۧwnH)Z< wc}WIJ\ cК"ߢy&t.6) iё@mkiC3&d!M{[eS*F,ʥ&( |I OJGQy.sgFW!sggպ՝ T#6S歀do.mXi= bRoP//R98$)m>os.lGM)Mc0(Xv6/ \+x˝&)r'T4fޙɉ}P5%"<B-Li{1o#В?p{N}ej5LɌ( }} kc%ZQ֍5~m@tVhnZb7R{81^Ҥ{Idiْ1ɃB݇.|'xV qGb S iwV5-4Q jwrl\]/[(2d>CH$uO5{1qq 5L tw!|Ad2,0B1{=1ft+/5>uQ+2]"^W/:\y5\U%dT4Fi)8LhKҡ砖CI3k_d^1Ee~O2+D*RE8ZKt-d֙L2~KaZhF(02A2yb#bex 9gVWKX8az[q2D]OWE^!]t!24eݤɹLźQRif/Рȓ0`5{EJk7\Ytj¬=ak)Di '*[^]/g( T8 & E4о?|!OHtUt>~oj1D*-P0_zԣ1>wyj=|]ȧf26 U%Nʏˁ6F~k4\(+ |xY$o-aSɍ'Lg-St8/8q';9AlCF|,eOA?ֲIYiMӓ~'^fuHd'+ފ^\@n@"NMgDiGkC\"o %WS(-!G$a@.Tdg.pgǚG5X-O;VLQ5 CnFwzvQ>6)4ף`;`-lp,-1^ykЬC,ʕ0Ou"8&\@'O4^ [ekA/&P;=)jt 澿<|OQFVi~">WG@d JB VV:^%\:/_JG]QHW#ML KtyxxCj;x9q u9ͩw%%> stream xXK6W(1+zEH=F{HsZctm9w^(Y^'fhK9~Cw$PEr=Kc"TY͖M|h|G0,F   0i:gi[X(j#(f5[|y++ %Gv J?^Pp*g(REZM  ' }H)*DFe[( /D&"{._aCyΫXR"W2U))a&qmFs(D[VDT̗sƢv^_AZitR|\|ϋrцW SʲHzTy@;abdF׻9nKҒGRV g`Y#ppI3  iielkYW|~Bl0azvm5QifX1 z"hT$]y4 ~mK9gPL.G<`w}\(dnV9e2$Ce- as8scEO&i-}<]85jKˍ6 QKb,te0La?Fc)%Szp_Ԡ'ek='7zz`g/2J͘<ڷ`' qQFMf4q \r endstream endobj 143 0 obj << /Length 2461 /Filter /FlateDecode >> stream xZKsϯQMnU*Ie&5I6\6{%VVQק Iɖfvg9ФltouyUsu}{e+LU0[ -l?x]^'<;x־; 4+f݊_Vk/rn<ó6\[;?˭po_'4Y'|kt+^0D* |mX“9Ȫ==*zzF'F3!սEw^:OguP40|!Q_QwQտΊ_XJLJߘ ʹ]LTqٟs)Pֻn*3k3ʁJ&K$b Fw(D 3Z9N\OsmfH_I<\ eR]Nz::TzrYMd9&b_eWXM2+gRD.)ÑvO'Y;:1HcTIOanAs-pm2H2Q9 衚 |aa^X .$HRp1Z3izIM &tudˍ7:}٪A'.q#vGԇdcYߖgS\G7ld& ,qRPK%GME[b2,ώ-}|L ,qldH]ɨk 4; bxTrv=/T]}аj2n\'=>{쾙ٕbgǐOy7!xX<ꄖLMAଂ3KKl.K'1}ʰDoXn6[tN@w05Yԝld)=|n"PKdi$opQ4-UV1(9dxgę:Ɲy57A$SEˠ!6ejt`L @eg޼in=fT6Y>+;! h:M>Q*MKuWƞY[XJ09sv#/޵gOk9>O:~?=u3k(V }|qc&4Zfio w+UJFzt}Ԃ+X־otQPɴjhCd%D~C1 @/S?=C[8C9I= >ʊYm[ԷpŏTCN;%lPAo B fK{JFL6+9{wK^{o,v>~"¶2OlK Z߭_;]w$.XV*ٜ.MOA% }7tҥ>;F%c+&p~n$rb7gaC>re$bՒ_"!{^bVe"_B%@]ͱ.5 endstream endobj 149 0 obj << /Length 1656 /Filter /FlateDecode >> stream xڥX[o6~ϯ Tx`Kt(6pl+;7R4H;?R # #Wz4]_|P/ MOҲJ^YѯşrXƅ H <WQxh4뻕{!%ӝGpaZ*` Q4I*WU"4 J8FO%! cCp#? F@Bi+#1,YҫUadwxSnFKE@3 s~1>$ЈG nyHWTy٫m*apN/,{Yy;(~ r %VwWT!YZ[KH㋀n=\}QJE r7sֳ3e;Dv" TDn5N9U+`d#*/"AB;M' OrA*})oXSdU*)%W]%þWDI:P?Pߧsd] T|~R5MB)=H]\ > endstream endobj 146 0 obj << /Type /XObject /Subtype /Form /FormType 1 /PTEX.FileName (/tmp/RtmpK2CQE6/Rbuild728849649488/IRanges/vignettes/IRangesOverview-figshiftcorrs.pdf) /PTEX.PageNumber 1 /PTEX.InfoDict 153 0 R /BBox [0 0 360 360] /Resources << /ProcSet [ /PDF /Text ] /Font << /F1 154 0 R/F2 155 0 R>> /ExtGState << >>/ColorSpace << /sRGB 156 0 R >>>> /Length 590 /Filter /FlateDecode >> stream xVMk1W̱=t}_B =܆8- ֒ln^S3k; Wp `Z100t ~OpR?W7_~(;lY];hHH x}Fa>S'B @sDYrd v.-K&H=Z^ süCYrw|L$xg1.%wo\ĸ,5!r#pF-3+e4P/-a0.ȏ71$V;^ɢՍ/ldb'!`/Gk?o?SZ-Oa(8#v\@9l\nOEq^$RNm bґzhѤ^ffyK\3|=6WW(.et}BS~A%R'N#e+_`l?>^;S]_^2" 2XX:Nshve If06 v3yV L# |zZhhddQl'px5n7:o5Jwg?#c*;`{?ܰ endstream endobj 158 0 obj << /Alternate /DeviceRGB /N 3 /Length 2596 /Filter /FlateDecode >> stream xwTSϽ7PkhRH H.*1 J"6DTpDQ2(C"QDqpId߼y͛~kg}ֺLX Xňg` lpBF|،l *?Y"1P\8=W%Oɘ4M0J"Y2Vs,[|e92<se'9`2&ctI@o|N6(.sSdl-c(2-yH_/XZ.$&\SM07#1ؙYrfYym";8980m-m(]v^DW~ emi]P`/u}q|^R,g+\Kk)/C_|Rax8t1C^7nfzDp 柇u$/ED˦L L[B@ٹЖX!@~(* {d+} G͋љς}WL$cGD2QZ4 E@@A(q`1D `'u46ptc48.`R0) @Rt CXCP%CBH@Rf[(t CQhz#0 Zl`O828.p|O×X ?:0FBx$ !i@ڐH[EE1PL ⢖V6QP>U(j MFkt,:.FW8c1L&ӎ9ƌaX: rbl1 {{{;}#tp8_\8"Ey.,X%%Gщ1-9ҀKl.oo/O$&'=JvMޞxǥ{=Vs\x ‰N柜>ucKz=s/ol|ϝ?y ^d]ps~:;/;]7|WpQoH!ɻVsnYs}ҽ~4] =>=:`;cܱ'?e~!ańD#G&}'/?^xI֓?+\wx20;5\ӯ_etWf^Qs-mw3+?~O~ endstream endobj 161 0 obj << /Length 1481 /Filter /FlateDecode >> stream xXKoFWH%|`-lEh@UϠ?Zr)8(Âٙof? R7h0MQqs6i?pmm6<FBo0G Z´5[¢{lVw1toι?k b|o9e4$,7V.'o<ޑ t$P ?|Cp2 'ۚlER&1mI-WlŢVa`J있5{ٞiOp q@5F [!)iczv\sZ3vSq4/R3)q.%PӞ }ٙ %MxX*VxΫ~Ky_`1GϦ34jj]&G3Vϸ^;^@g,[.nʏI8rvm+; "b(mܾYZUY)i|Fw}ʍl=LԔQ@-hx?ʵl=[k*0YCJ1%7v!8~nJSOe*W;Ui˵۾8{:H3 ؗ/gQ[nWvha )r.V&3|9f^=I#; agAꛙ4r͆Sa!1ִAݼ&axUAU3~f+ZaVTH, Mye^D uF-GW R\YRU1ߖ)](] aVK0mf{ q_y`"]r7GJ_ G. odDX7bKhVq{C7Ǭ`A«Xn$VሴQ 0ob#n[ ' 8! x|4Zr-GUwh{J(ĢXbk>_]X5jdR(kc)2jgj K=:KUc9?XSw.> stream xYKFW>I ͗,Ā7Nij'8%QKR8ȏO=M9Y]U]գ{uq6Ƀ<1f=IlM|y8YMO9&6pzϏx_=<%S _:b~(x,&04f-eO;Qh>WG]QMUͪx?iձk^ێ[Y7D:}{؞@<![?jGi U-ם*(}!{s*s-ILaDzJ͗y4$&YHA`&Ed-rnѥblKalI8dWXwTB Ml4!5Ϧ+X'\:DYp5":z͖/ SxQf Y[6IOwL|r2F Sv-:B;Y6f%Æ+v!ĥ t$kUq{<4&Ҥ\^MLS4"*m=ӈđe7ݸEĎ*Ahys$sL#4JmL8ĵ{s(IVRcX]u>jLȉ& 5NM+YBOPlr -&d])w)ʡvDUx.9 m<1,4R4h,…eMb (6ze@U{ .q]BսݑjĤC!ӗ-J0[4C%OeP S[*bn:H8" )%[3Egh[}q(5rY hXIxĦ@!uZNF_6Q$}6(WfVOT&1" Yv,f㞁:?vwevP-^exce>" %"Ցng9(lx"$e4֍fa;KrlF4qϕCV.N g5B >[ZԠCOcM)04fQ zْ7h+1~x E|UHYgSh{dikC5)l% 0r,8+b؟@#Mh,N =) endstream endobj 169 0 obj << /Length 2197 /Filter /FlateDecode >> stream xZݏ6_a\_dfDG; wHH! n|$o6ۿE]磹;\!7Ùp_>xUʳ|<*UJϖQ7_Ls_͟:#_^s q6 ȲY^3Nm 2`Ʃ4~mQvqez+E*5xf\#Nv9ޞ!#BFUUbzL T@"RZV֪riڹ˒Zב:asvLyjp.MV埧 S+Z1Y^LZl:&'c`ez:`SP[ ؉iw{~OL ;>7V8ѭLvyjREM`]{/MלI\OKYSYe'WyTD)?kK/6vvՠɫq׃&:#VLu8ү)3~¶9FiWzF+F2 zJ X6#At48ׇ85MD,% bM^f6̋(9 ;kKH–$!Iiʮ{#Q;"0^Á^LnHv{h@=s etƈtaɨ8CURKNKs}yCU>4D)mARM$Ҡ1 UYSt?>g/8v쬤]ԉ;ȗa(H]3ZA% um֋r#lksXi!juMjEkZ\"| (%Hj[-><"5^enB %ֶZl"ծࠐYAY*Z-U@i劰 Y!wdLfɓyiitU j}*2B S\=^={Ӫ"goyMUn*0QO%6W1GҢOdfL]q'`2'OR0U2*l^2aXS{|F{}Ǿ!5'5C-$ҿqyi [󲐳B^:b_SFXt|Ӌ3.$&3'y)lMNziAMцl:F ~y}IiΦ8dQ$ǮאHigoUaqIĎȯ0d}_Qxk߲ A1WYg%HxC.|֦屨pʵfηQQ, ,9c 2Ĺܴvp>b D;D&6rNI0L2e>NT9 Lx*+$>Ȝ FOؐ Td &0eb=^Pc~ʣ=bLbgE"U{#"+?TO{1 3H5cUh&ND? g2o@s;p{ǀc8Y{tycCq^RF^]Γ1zޣ;m=֝;G'HxO<?N\E#nr<6Y'/~qw61*'_C%{zg_(ߧ b׈ U4"P&}u<``҈$LoA^^|4&D*yhP6r<2.ǜTzpyy^[UV5].2N.`~ʤxuՠ}yRky* pA7]M!R ?)hܟE]Ch1|-0BW|wx)XRׅS{7(.g4>㕥߇tr2n)~бX9bh2iծ"2U⹜/]a6[Tʤʭ2lk&ԱLXY68W bT.6o WB=][nq;9@> stream xڭXYoF~篘(܃"(uk eV2tyɿ7JQdncvgVJI'dɤKI&)r$ #'R1R,f̏%m,,$&c46 ]vd2|<h$EFQj%kO V* ؕZ4% s ,SOddyeS)-y $` n$pzA$8k3|SEY2SyFe JG2D83 dMpiʪl iW)e:XB)bDSnT2F ?:Dq;L e1:t yv=CX&&_`_!i'$.ç6}=ZeS'ͦC8ȹ8J8CG9ҙؠ@L9Imo'QJhԙ-,l#}f;Xz.HQAC>ȈxżDduOZM¶lTS] {H'} e-sA?5mYtw=j7;w2vulPF]5/ 0Ea:oÐW~??A]ɻ 2ԶcP!#nȇGۈ+qv|3~5J_^VGU[}Rq7@շYw_wa<ŬF QuŢ?}DQ6E7_O+ +qЀ%/ $XJ@a1; sm(2,"oew^U y߀H)nf`&L_N-Q~'8 ։0*c^.&)S.ccWmsփ)_Ab%a%fϽvm2!O:ZV?c3p;~@n6Y) @o ώ~ٱˎ{}@ 4! endstream endobj 175 0 obj << /Length 1638 /Filter /FlateDecode >> stream xYKo7W=ː\4"(8!a-Ɋ=\%ΫEsvyp8CzvDAJ#b`+,/lsk=UQ* \- pBGyU|MXV"QLQ f-;sQ~X1u[pȫ+k7Z9)otFB(WK_"}C.|B5HHM7oJ^,Zb[޼ӻx^m~Uq!9c3Ui9wxj{|Fzf:2D/Mȸqm$qҫŇ6ئ9;/D3W6ӵIMq,܈_.\B]{AoDmSz+ ɱ3-1+ņdeHqp kwCtIT.LO Vn%RJ 9Vz|*+somW1Ga> +[̑w1keVqq!b9@:y^S_^S+s`K3\}5Xʖb_i +=s3qب'CQ}ըnS:%RW Nߗ9Sy*Vne(Lk&,WfL蠢XƪPP=<@&e~ als)/~'wJb)j{mJ n7ew!ۂ]Dϑdɔ()5# |uqq[#J$ j)>Vs0"p6R|G _wSv26GOAEwssp "K@\FݽZw31hf7N^}$:LtmDnDPR(DEQHE{m~%> stream xXo0~#Hų؉6S6Vw4lMm}8ᄈ;ގzoKhU4P1K9KF7U@2KX eshNf-n ѿápp=`(R"`֪NoyzG1^SR[4Kȸ}ahM ,1_6=E"!=-݂ljeէƴ2h,خ HZn:00XS;%JHmߛ8qsv (- (4ur[|J r')0l~Ha.ڴRjK=ƒ ]ApiᱦRLw|lXg Z8I%<(fItP*`uS3! ]# W$RY)gƦ3&c,],3LM2s;2RI c%0#7 ƼxA "Ǘu%"A0xqK]OV4/M$%p\X`̕2lLrV(1 _ru6kvÝ⼰ |onkFi9Grf+mN+%qBzQzvYE1#/4~O܆"yb{\e\ }C{G^ JI3x8/iE]?vu?NNPNPNPu'?Hv2^3*-Y) /#JJ)vR_T_k|o@'#%bрYh~` endstream endobj 183 0 obj << /Length 3190 /Filter /FlateDecode >> stream xڽZY~BF4$W*+c#}ꃇvv'FfD6ȿ}S(電g5E^5n}cV.Ji}.Tvj˲2ݢ1Yn Cq߾~S4yC*R.j&qˬe} Kox5ɚ䦄gnQU NӍ ʪ./jD =,ue/o%7ټ.go# 非O6TҩeFWLՈzX4e֭N+~߳Α}ϫ0$ 큖){Ǜ}ǣ8U^}'Қ?4$2gvҍ[3GtlGʶjk Dw]!B[Y7 ׸8݃wx)v+Z%gѢ4kY"!RgtLO> m H%[C$'{D-v${bG#׭ٸI8c< 4$)^ S9VpJku ѕ[%Z [yAЃ0!Bpl"~䱘&a;>ɌU5&5*TgTW o:WM߅-kd&w1IiÌQJOeJb Ƨ~]IzU * GR%D6f}J*j6*q<l8J2 cԑ}<ʞx0f " %q!$PsB;s\m t)GRh 7E@YU@ixYp_ Uld(*}/T cWƸh\UkλY~@{6ONt\\c:LtHrMNx#_'>e ͤ{¸J.>9]>1r Ja{};7 H^Ӆgusn˰$k(LwGوʹv i^G (%\"U:ctm_bN6[!l@!'UTD"i+ɴijf,iUn@q!ElVܕr+-ϛr_0$ޢ>:R糷YFOUc@  A B3".Ў~:BRR=oy[6Xb;qPwBHv>z[ P؇[NZgBkR AJ e;Դ{KQ8X%TMS'`gM:n}ӶWX|ؓ7zm"uװz7sֱߞ}t^mn=A<%h%JaH11E^K%=HLM*| O uhO!rQCJ񦏔W[[ny7Md;WOqd"CÑhFx& = dbB8Zq5i*X/O@sHNVz+!Iw ,_dա bo#S]U^ZK2$ ;y PV8kzyurm&AסzBA ,!iuZE>/mwuC' >um˨QIpǔՇmOc3k~t?cW3>S}cCS\Oq: >K䅡`$`wZt(Gmx_"HAj ;{˷5FJ2(L^OTxyseiC8b`qv H.{`Fhtm]<ժ̎v~1JT&&Q65KP\ K*|ܖ 7yWԍ({1(d px9z[T i5 *F:}&Mc- &V1ؘΣo,!ƷED"iǼQCfĄ(\ɫ]!! en@a'`ZhN!l_ *k*ُئaPU ^IMsQCv0Zԝ/Mn[ux67`Lf_6 (G몔=Bup_ĺGpڀhip7̩gu7ːp.|̩)x D\'Q(EAˀUB/1ow 'Kοt݄u)ftvXhw/&ˁ \OL \By! _ULTK5[;mm9lDs6RJ=c"#6ɓdiY#YMCǯK7-x_c 4jIʹI;J|1h@9503M/fTOk *C~ $ɂjYA<0W|>KE1Btr@w!`:}w:ngH}n~p]r䥭} L` g 0lJ> +aK==?[:i#=c?ף4gqޡQ|;Թo4&~eNaa(+5,TZ$E< I.PTJ~4'-IMԎ l$)뻯 7 endstream endobj 190 0 obj << /Length 1288 /Filter /FlateDecode >> stream xXK6 %3ٵjɖECآ@bnIg6WE{)=EQ2E~")li$B9 ǔjmYd43Eg*Kn ب|za+=EրvZ"bgvqݐ'9+·" +<ڨ{`Ky }SK@HkVN5fq#B]F w qdϴl\ :+;>iÞ*ڸei]$=IPik쪕PaܴjɄz$)KId\q),SM?XriTK1A ,ĩ@̥K+6"f6glCDkƈ6 1,ƭ͚hJlsz?a{Yw$kmMF&:l5ۙ}1ShYk/pu4lk%8!o0(ەj7MEf"J˺D(*a]wdAl ei0ƪf5/As7jǡZϴjD/>yug ߟPjkl1511mXĝ"UOE\cgvrQ޹gj# rHÅ#6}khdXGW^]\}\xÆ@=.uZ7)/(B$=F_Fnx"=إ7ۺ<>QsA9S)+ ӂsW6XglMm_&T:լ.r\פpG xC:h QВ`~YSo);W1fXT6Wz0$m, gU@+v#}=G95^3g:޺mFl<4 1p3N6TUȌw㕅R8́_#^/_ ;pEtnD=ne#;;c~n|ϼ0K+QY+ >V'Ws}ˊuY3/rJr˿6b_Pޯ(Pf "װW?ͯN7 endstream endobj 195 0 obj << /Length 1869 /Filter /FlateDecode >> stream xڭX[o6~ϯ Ԭx VlbbI:[J}is#M5r/Ĥg%陮gA[Uv^|PPlJӗ[3i s\d,_^<}Qf12FMb[>ෛ/y6_8g_a)]uoa2Ra.2!djݿٞ9c9!rP3+C "< WʹV,t@pZUe+=_hm\j9و0x`~8% lsD-xtU0ZyeGJ,bSٜVw'{Mhy~+ZF X0AD;<Ó;`]F30uuL=Wt b :Id##KVV 0qC;"7Mj5uj)XC!#yEhz#k[r϶ġHs^-J?v5ip5_xW'}<@!I{D؉9p]?(˼"͖HY8c0`Ql!8*V=`gJ/ AH'Vb]\0&kM^9lҨL*]dǼ5Ͻ)"6wU$5O7 EϺ ڥ8y* Ж Pr-np<5FT)w) 9-G]J\9 #d7lZ:6x>}]6|Iwih a~.,D st^'PQ{ Ti*u5J=Ur kG)RV&(8'Y7lf^RIhWF`$m%Z((g8]`JĶ g Єhk.cM][V_J|QN)'Qpzsj`)mm^$Rm$F_6N=|.o{~otϑSr ߍKm+v1֊T v|5t3yƶ6u#kYEb>vK#w{'xǞb3ԞD( 6:.]ϖ]֬Խ`h? r*_qhĸ?ƞ1+.ǎl[I<_Qd|բII,q,n!dO?c|BUt'&&ɒ>*$x[+ Qi_` l5DF6B sͽ056l3e6l ȍ}Pt vy4?\6 /jGQzI3 endstream endobj 187 0 obj << /Type /XObject /Subtype /Form /FormType 1 /PTEX.FileName (/tmp/RtmpK2CQE6/Rbuild728849649488/IRanges/vignettes/IRangesOverview-ir-plotRanges.pdf) /PTEX.PageNumber 1 /PTEX.InfoDict 199 0 R /BBox [0 0 432 162] /Resources << /ProcSet [ /PDF /Text ] /Font << /F2 200 0 R/F3 201 0 R>> /ExtGState << >>/ColorSpace << /sRGB 202 0 R >>>> /Length 338 /Filter /FlateDecode >> stream xOK1益13mQPA#4c7GF%/CU1X endstream endobj 204 0 obj << /Alternate /DeviceRGB /N 3 /Length 2596 /Filter /FlateDecode >> stream xwTSϽ7PkhRH H.*1 J"6DTpDQ2(C"QDqpId߼y͛~kg}ֺLX Xňg` lpBF|،l *?Y"1P\8=W%Oɘ4M0J"Y2Vs,[|e92<se'9`2&ctI@o|N6(.sSdl-c(2-yH_/XZ.$&\SM07#1ؙYrfYym";8980m-m(]v^DW~ emi]P`/u}q|^R,g+\Kk)/C_|Rax8t1C^7nfzDp 柇u$/ED˦L L[B@ٹЖX!@~(* {d+} G͋љς}WL$cGD2QZ4 E@@A(q`1D `'u46ptc48.`R0) @Rt CXCP%CBH@Rf[(t CQhz#0 Zl`O828.p|O×X ?:0FBx$ !i@ڐH[EE1PL ⢖V6QP>U(j MFkt,:.FW8c1L&ӎ9ƌaX: rbl1 {{{;}#tp8_\8"Ey.,X%%Gщ1-9ҀKl.oo/O$&'=JvMޞxǥ{=Vs\x ‰N柜>ucKz=s/ol|ϝ?y ^d]ps~:;/;]7|WpQoH!ɻVsnYs}ҽ~4] =>=:`;cܱ'?e~!ańD#G&}'/?^xI֓?+\wx20;5\ӯ_etWf^Qs-mw3+?~O~ endstream endobj 192 0 obj << /Type /XObject /Subtype /Form /FormType 1 /PTEX.FileName (/tmp/RtmpK2CQE6/Rbuild728849649488/IRanges/vignettes/IRangesOverview-ranges-reduce.pdf) /PTEX.PageNumber 1 /PTEX.InfoDict 205 0 R /BBox [0 0 432 162] /Resources << /ProcSet [ /PDF /Text ] /Font << /F2 206 0 R/F3 207 0 R>> /ExtGState << >>/ColorSpace << /sRGB 208 0 R >>>> /Length 316 /Filter /FlateDecode >> stream xAO13WhlA<&1bvw˲뷯32&`5SlKHN@J x7W{cZ_x l4OϰX$=k3 aB4g-1+)UG2Ւ\gfTKIT/C,,ԣ`[-?u7QuՔ_ J;I{(,0;í rukVys~\`HfI=v/ endstream endobj 210 0 obj << /Alternate /DeviceRGB /N 3 /Length 2596 /Filter /FlateDecode >> stream xwTSϽ7PkhRH H.*1 J"6DTpDQ2(C"QDqpId߼y͛~kg}ֺLX Xňg` lpBF|،l *?Y"1P\8=W%Oɘ4M0J"Y2Vs,[|e92<se'9`2&ctI@o|N6(.sSdl-c(2-yH_/XZ.$&\SM07#1ؙYrfYym";8980m-m(]v^DW~ emi]P`/u}q|^R,g+\Kk)/C_|Rax8t1C^7nfzDp 柇u$/ED˦L L[B@ٹЖX!@~(* {d+} G͋љς}WL$cGD2QZ4 E@@A(q`1D `'u46ptc48.`R0) @Rt CXCP%CBH@Rf[(t CQhz#0 Zl`O828.p|O×X ?:0FBx$ !i@ڐH[EE1PL ⢖V6QP>U(j MFkt,:.FW8c1L&ӎ9ƌaX: rbl1 {{{;}#tp8_\8"Ey.,X%%Gщ1-9ҀKl.oo/O$&'=JvMޞxǥ{=Vs\x ‰N柜>ucKz=s/ol|ϝ?y ^d]ps~:;/;]7|WpQoH!ɻVsnYs}ҽ~4] =>=:`;cܱ'?e~!ańD#G&}'/?^xI֓?+\wx20;5\ӯ_etWf^Qs-mw3+?~O~ endstream endobj 213 0 obj << /Length 2090 /Filter /FlateDecode >> stream xڽY[o~_GsEehtE ,y. EV&Rv23Fi{YDw?|Jk8gV7wҪzU6ٮ>gn3e~ƻ7CYTZ^Lᗵ p^1֪_1ޮ7~=\QԫF5)qFRYJj\\c7>ixxe>#( ^}õ@KlĔkUE$fE SVfa < "ЊT-koWBg7kSg~:'IU`0}Ll}VVkw Jzʐ3J7GjO$EB~x4۔ǁM]M.6B!܏諘v}b}ş~ Ix꽫[$XmTiSЏf`j/c! USdVBCY3UբiLAȏ ?~X\.@W΀7tfi2<="5!GOd}GxG_ $zJ|Hg6ǻ:Ab ? *= ⭸ƕJP),v W"ح^ni# &m=TZWi@"Jq7rb_N@ܪGhUu7ٿF- xQ'S0T *O7EIX\%~ÜSaj#Y+,So{8 /xo6h9._Advi!ٹYڽF֧-(%!n}b¢>H'8LI_<)6vs$NcI$fn(Sƣ03ExK]Z+'j, &S2V>|Z-8%C@)Yt٧3``꼥#}NUsGZh]S—|rqnbn=<N3ߺ0RÒ5}MlK_dioei0qĊ_E$>G(JG5&Tm&TE!EM@y/BH_?'7d;\n+K<$(aPe ;=(|!ιqJT]ٔۈ9f h~ HrٿCB n vp<* te!3-Hy==1 U\aDYtw t Dѿ`QνVsmؘߦk]V⧓1m[Jy@eםd[!<^ jF^N%Ch[p/SMvZ6w\vUYڂb.k bK.(ÂrgjwvNGN.z6:q8)U9'ݍ\M-`'^~ůGoz> stream xڭXKD09E<4( TrZ~=ȖBQ[Z4=GrS3U9e<7換ʕ4 iJ*pξn~??24bY-@̊ZeyUٛn-<#o1&_SkM >N{_ isX38af\ɓpip]1é?JZbnŏwEst)P:*++J=O m| Sy緼u?)& ޾aRb$eڑuP{xg5xej৒72[x>scr-d$5vi׹HtQD2bLlH _w#jrGO+HY>#X顣wR LUFB́"eRm$5ݘu/@۳8#P*3by ϙ":2k '2&rWO`ֵ_Y􄫗'Kw5nKN!0M#[AQ!|\S7m UG4kJ)(H }  VB3θdukʇY+#-k?Æ+%HnQd|)R(~\.1{fN}͑$&aSt<$=͖2%ZڛdKTCƕQ•py e#T;ʔ]Ocaᒆ{Y%88IQX#V8ڒ8Z6`{z^p>)()rX6yr6b [F0,TNIV2=ʅ<*Ms+i'x!<{.O)D]%.O!'EU!\PUވR{%뫡-a爈iZ~Afv9,xƺX@NrV"ghɣbx8G6R$/8&jMn %zp"lC[1H ;7(]@3pU #ReҠ&(1c*H$Rci:tVP6T0w+|HxWPP(l> u^sxؐs[,6"{e>s&Ҹe~͎җ9%Qt &`.{#\ ̪#X͈ME;_J9(͟A (*_>77:٫5)Z/JJ jE\湏J|-Ĭ6][?hiS95W5) XzMxOGe_P)ts;.2}T28q*jh`*Cu ZǪ:;4轎n!ț)Ak[XXS1|S-˛UTE-6bM0'a=[\bQqMzqƠ؏6R݊&hqA6,.'0}V`|RUq1`98[q;܅ utX QVW"Q  .XϥOd+8].;~E:/'"N]JxZ_l7,bnj{k͹y k>9ȴy}E }H L5&죏5/󠉮&baaU8+˰4/JQ/5q.ˍԈ*܆~x>xDU@i a^eg)^f#21$.cxϡ,@]0e5MWPcíA_c$\tϞ:m%~ kJ18l׋?(,n)1$h;.ަP] a4vP֯m&ԙ{~; #{Wf@\=pfۚQn_y[v endstream endobj 217 0 obj << /Type /XObject /Subtype /Form /FormType 1 /PTEX.FileName (/tmp/RtmpK2CQE6/Rbuild728849649488/IRanges/vignettes/IRangesOverview-ranges-coverage.pdf) /PTEX.PageNumber 1 /PTEX.InfoDict 223 0 R /BBox [0 0 432 162] /Resources << /ProcSet [ /PDF /Text ] /Font << /F2 224 0 R/F3 225 0 R>> /ExtGState << >>/ColorSpace << /sRGB 226 0 R >>>> /Length 651 /Filter /FlateDecode >> stream xnT1 y /anJ * Qu.x}$-mG]j9!8K6|F :#g 7?+1oo|ŭhy{g'}7zk} ? ]*&$dGHC@;C0$оfQNUC,ǘf5{$^X۹,:AҌ&QGiv'^t_&,+GtGx5.ͻCeve}ԯ:` `AxdC2fzIevzwCh7Ńe`:r LtJqV rݵ^,'SHS.=kI c'h.KhteR\꣕r@74]k`u"<(aR%ut5$:z$phɞ-m6sL}`B;K4m0s•Z+)؁,e;E7Rf?bہЇ Sy1x [btOs5@,b5z؈y,G[%f$() l" #ʤjJm`B7ɣ,^;?6TvDF%L+ 9օ>]]uU A?eݛ)եߤV3ɵi#< endstream endobj 228 0 obj << /Alternate /DeviceRGB /N 3 /Length 2596 /Filter /FlateDecode >> stream xwTSϽ7PkhRH H.*1 J"6DTpDQ2(C"QDqpId߼y͛~kg}ֺLX Xňg` lpBF|،l *?Y"1P\8=W%Oɘ4M0J"Y2Vs,[|e92<se'9`2&ctI@o|N6(.sSdl-c(2-yH_/XZ.$&\SM07#1ؙYrfYym";8980m-m(]v^DW~ emi]P`/u}q|^R,g+\Kk)/C_|Rax8t1C^7nfzDp 柇u$/ED˦L L[B@ٹЖX!@~(* {d+} G͋љς}WL$cGD2QZ4 E@@A(q`1D `'u46ptc48.`R0) @Rt CXCP%CBH@Rf[(t CQhz#0 Zl`O828.p|O×X ?:0FBx$ !i@ڐH[EE1PL ⢖V6QP>U(j MFkt,:.FW8c1L&ӎ9ƌaX: rbl1 {{{;}#tp8_\8"Ey.,X%%Gщ1-9ҀKl.oo/O$&'=JvMޞxǥ{=Vs\x ‰N柜>ucKz=s/ol|ϝ?y ^d]ps~:;/;]7|WpQoH!ɻVsnYs}ҽ~4] =>=:`;cܱ'?e~!ańD#G&}'/?^xI֓?+\wx20;5\ӯ_etWf^Qs-mw3+?~O~ endstream endobj 231 0 obj << /Length 1003 /Filter /FlateDecode >> stream xWKoFW ܢ@GC$Zd2iR]̒+6ASć7_Wo*a*1U+-OM>%˵i񴆵zlwt waKOO$- Tγ?I9Y$w&c5ր7x Fm{^_8n큘­ٳ,)lX{XhAk&Z0Q$Yv̥17FnL\ġʟ#63Au=,W%@.݉B"sYA3 J &KVՑpd%a K$\FG|"] iq-nw=H`\۱^0! Xxv ӓK|4&ӹwLTi뎞ϗVnk:TTx+q%k`Yn"&3Ƌ{z ^_MېKLpۥs>g }2rrpG@jdVX;(Hi_! Aw=uso$ʔw@bZ5V{!Ѝ3G7gy..(c-+=rNvbb_s}pVۜLp+&f-%EepjQ+~y- _QHفKeT'.HÄAh^*|h񅿠ˮ g6w\S y1ǸED针NfƼ _-~un%|-ğ[RZ-?ߎiQplwGoLQqR_g:HNA*oCj sx Y%,,?aa:qp|z(GusBӆhӆUK0i#"O 9(s~()`c !Jf* ;Ϳ.Y$1^| endstream endobj 235 0 obj << /Length 1841 /Filter /FlateDecode >> stream xXK6Whr3_@ 4E 4(zHBkkN"˵$__΃%K)pH z䙱RVjQ,lWūe]ٹu֕Ot2QCTypJ\U]yk]Y cSN<߼ߔ> :  {Wή4,sܘBwgiw]$H+ks\Ր6+^6a? q%SQrx`=MߑڰYE^̕7˕3(JjQ=W8&YTWAd5\D'˹:,Qž^Ϊn{f$DuͫFuy4YA&?yVDA,]=IuC}`թ=J`eEe]*c#Ws "?У&~Fzv'|.׸C4z~}OXz_‘F ;p\-ƗWMN7CEWi(m|\QIq@ ]3vFh͇9}gЁ#KO}8+h!}VP>`W|qL'_@Z$3sqe")^ qh $q^{< c@lb2d/~D^IB\D9*DE'`Чط>91! 騧7[Z [B "aCW)q۰*Lv.rCZ 9NkjQ{ 爪I,P+S h4#dDpe}QzIh',)1zE bx듓z:muCሻp6,k~+4 33)X/T3S?Rsv4Uo=y)I;'irR\\h*QL ]PHw/l:j  mb,(z鏄`d~&50-pj:OzBsMh8*ZUwk4V4S&~CܬƋg*qHxK uzz󻂳U8H- DFUzSo`7B*9GY3Y$e5Fdzx8ĴgrUSC<^3j n (bÌ?D/6J?{/ q:&χ9s"-oӶ/})t#oD{e@Tj[]Z%δcWxt@m>䴀 z;ME.E79C1RFozƯ† yUdIR" WKG! R#<W79 endstream endobj 240 0 obj << /Length 1125 /Filter /FlateDecode >> stream xWKo7W,zZ^or@ր{I[ at%%3P*'h咜o|(V(ƹ*?f.gð$H&n﷦e?~iUz6Ej+m1[Rj)L--TxN ]M+Ti^M+U;Lg5e$_١2.!|U6}7FT̀欖dOHu'@7*xg? kqah<QAЇ 'u9N 8l12~6IpI,a -a0^qLK !6CD.(#3ٟ'D C#ٝDtX۶! { wNǷ;QG! yx:I3cꌣ#º#;6a= 1a;,.1c}-`#lq}M΄U0E@(,cz&#+yAљ:&˔^%Lvp 9פ-[y-g˟rlBuJ9CEAc"ƭ`ςQq`UJf/`7y 2]]TƨB,gԼ\1h5C RƟre s~֑u\M2WxR6.vyx{fϞ/[ԑzECX1Mޟ6YneZX=$Sپi*Y?׌yikaĆ^rK& v X gzڗoHC{};?nK!LA ޤ{d\qkV7mF|aa6_»oJ%+w&hCZvFHu"%tKm5aa9-Jsn)"&Ʈō4%v46ٿ1nC2ql/Eı endstream endobj 237 0 obj << /Type /XObject /Subtype /Form /FormType 1 /PTEX.FileName (/tmp/RtmpK2CQE6/Rbuild728849649488/IRanges/vignettes/IRangesOverview-ranges-disjoin.pdf) /PTEX.PageNumber 1 /PTEX.InfoDict 243 0 R /BBox [0 0 432 162] /Resources << /ProcSet [ /PDF /Text ] /Font << /F2 244 0 R/F3 245 0 R>> /ExtGState << >>/ColorSpace << /sRGB 246 0 R >>>> /Length 366 /Filter /FlateDecode >> stream xN0EY3~TT H,(F͂NGXɵGws ( pQjp {4 7xsXWBknly/g3%^^AFS ?2 xD\94|Ȃ2'0f*9Js> stream xwTSϽ7PkhRH H.*1 J"6DTpDQ2(C"QDqpId߼y͛~kg}ֺLX Xňg` lpBF|،l *?Y"1P\8=W%Oɘ4M0J"Y2Vs,[|e92<se'9`2&ctI@o|N6(.sSdl-c(2-yH_/XZ.$&\SM07#1ؙYrfYym";8980m-m(]v^DW~ emi]P`/u}q|^R,g+\Kk)/C_|Rax8t1C^7nfzDp 柇u$/ED˦L L[B@ٹЖX!@~(* {d+} G͋љς}WL$cGD2QZ4 E@@A(q`1D `'u46ptc48.`R0) @Rt CXCP%CBH@Rf[(t CQhz#0 Zl`O828.p|O×X ?:0FBx$ !i@ڐH[EE1PL ⢖V6QP>U(j MFkt,:.FW8c1L&ӎ9ƌaX: rbl1 {{{;}#tp8_\8"Ey.,X%%Gщ1-9ҀKl.oo/O$&'=JvMޞxǥ{=Vs\x ‰N柜>ucKz=s/ol|ϝ?y ^d]ps~:;/;]7|WpQoH!ɻVsnYs}ҽ~4] =>=:`;cܱ'?e~!ańD#G&}'/?^xI֓?+\wx20;5\ӯ_etWf^Qs-mw3+?~O~ endstream endobj 252 0 obj << /Length 2110 /Filter /FlateDecode >> stream xYݏ߿ OIA҇=ɾBk^f{;_(K] % 3Ùg*?RʕW&sڞn~ɜUICJۓ_7[xrUe~u_i3ʕTnu[}~Q#gx6[c֬]C=wyN3۩ 3n/2DzU-XUzs&$.!M "I,C]Nd|Dz$ w!o}xUk'k&R5bk!IDEBB#f/ݪ<Ȝ3Ji"P(~B0bGtu`ġȔH) D'4%r4)z#p7FhpZ"M3]!Y~qr3?ج(@))z5\vڐc-?=s&%Vj +/j||#]|z[/cS# ?@, xEֶGy9a5l!Æ?,2R.(+ټ~ 1a) 'ҡ.~G# ni rssۈ;vmЂ'h(W ;lr?Ց ib8J"7:Aiȼ_R=mQr 0>SS&,B/OE7tXd. j 5ɼ`@%0YzdȀXQc8KhDplEot@& M+l`V4u }@pAϋ.? F-+cp.18l+`;C hD{iZ<9< ꬴ0%\/Q(cu|]Brڭ)c1WI}5$E!瀹tK-AxD5 _/YUKnh.LR'Q2hen[<(ǚکRGr AჄx[u#:.۠?lU*tINsKCK\G H`3 \ǚ,A| [/LT<#׏p$PنH+!LDH*! IXmJ,pas`>Km%Zh3c >(k{c3rjƄѰ}'ˬW'=]D\1nZ`o,Nc}!9qTs^gPwR /g}7浠|,ա|E7ӥ ݻO/LZ!g^NP/.S/}1SjWcB _(6؏Ue_(6OX\pۉ0BE⃢pDnOos&^9p;X˽lI-gs;9V$QZM9p0,åXߏs~Qu"Ķݍm]Pwj9YQ@ 24WpG\QW@M3 ~GgElI, }Z!pO%Ԑ.G>$xvvj,wSg> /ExtGState << >>/ColorSpace << /sRGB 258 0 R >>>> /Length 353 /Filter /FlateDecode >> stream xKK1gB> stream xwTSϽ7PkhRH H.*1 J"6DTpDQ2(C"QDqpId߼y͛~kg}ֺLX Xňg` lpBF|،l *?Y"1P\8=W%Oɘ4M0J"Y2Vs,[|e92<se'9`2&ctI@o|N6(.sSdl-c(2-yH_/XZ.$&\SM07#1ؙYrfYym";8980m-m(]v^DW~ emi]P`/u}q|^R,g+\Kk)/C_|Rax8t1C^7nfzDp 柇u$/ED˦L L[B@ٹЖX!@~(* {d+} G͋љς}WL$cGD2QZ4 E@@A(q`1D `'u46ptc48.`R0) @Rt CXCP%CBH@Rf[(t CQhz#0 Zl`O828.p|O×X ?:0FBx$ !i@ڐH[EE1PL ⢖V6QP>U(j MFkt,:.FW8c1L&ӎ9ƌaX: rbl1 {{{;}#tp8_\8"Ey.,X%%Gщ1-9ҀKl.oo/O$&'=JvMޞxǥ{=Vs\x ‰N柜>ucKz=s/ol|ϝ?y ^d]ps~:;/;]7|WpQoH!ɻVsnYs}ҽ~4] =>=:`;cܱ'?e~!ańD#G&}'/?^xI֓?+\wx20;5\ӯ_etWf^Qs-mw3+?~O~ endstream endobj 264 0 obj << /Length 2031 /Filter /FlateDecode >> stream xڽYK6W(1#RE )RE"-栵^׿pH&mh__<{C9D|rq5)*Qh=ɫT\,&LәR*p_u>\< +]}7O?^{IiDjIjϹAbm3 5FY);#x: `T2aLRN 1U"3Elw΁޴\gt1(e:ݣKr4h\Y?LKL1b93oa1aԃ=kO?ҾwPovG[<JEsE,(LJV| U:y?2̲nt2SRi $GO +> ̫eOM+PNKa5ʅU4eu:[U )8L LPwmy{8k+@+Ɗ\=lZzw@"ѶES0uD_J!qx:!GLon?ԲXUN0UPe컆!<zwr{*c3 |N>d(e-tĸq1iYa^gv\\#%=LU&qfhE\izv8Ȍ]YKkJ,fA)dgyϓd]N9](D5$?#apC0FPĝ C_;fCGܪRUc~/mBB "*e4=aYrHm'|hQ ^z,rUsno©8J=pKXPUɠĞ?ʢ O9;_RRyI{v:3{*K&*KҷXx Lhcn7MS/aJjxcce?Xhs2E46)^P)^adZV+mxxwtf%}GC$@* :>yqiG`~}ݸ3^-vvf[ŝäAǍ uC^ݸ"8ET`K]ޒɯ#o٩hsq9ۓqy9wdJ7&ܒwD@Jܯv%8WLp^["WTw^:jWm,, :vU-xf5|cc|W" d]iHcR`4[Z}Y* G*g~" ][|2SSTx|2 &iqnTV3Eä́0Ycn8IN*T8PQgn_)`G "t#k\@0Gx/>ޟ(^~55xHxuT` ^P/F)DAh1  Oq =ua":Te}g0kVtv:V0j[j}0~bEW{&tα=Qb~|rQW)`u fw|coFv:vi'XtvWۂ^?0%ls> stream xXmo9ί؏T}ҝDҤ6is@+D*oeǻK 4/t6klg^<Wel0Uԟ>}I &4"Xf4tk4Xf ld2βzѧ$M㆔~?汰54pd3}S;־L,2i X*<Ш'dZmckXzA, Al&:L a}T绸[W1#HlCH|~58Z+M~tkX۱PҞ1ne'l-"rjnﭼxsTxz$ρ594en<PٖG>0_'?1EdP EڍSFx޻*S}Yw=$ޢV޼[{ {]6sk$gP^I)h ~Zns^mvÄ>A>Ĺl74!CVLUEx|[obe΋$ %pVCnڮ" ؖ~S ֏SB)UKWdv37sOo3-m<^c չoc_֔M5+TMb6g *Vy7T5ŞjCӤhV=R5Ӻ`'jǗB8U`ӷ0,KOό9 Swòr ^SL7~ bu].4NZw{q θ 0P說ԝPN[箒Ź ͛L{콧uǕ۱]+|^\ËMB_]hHǹ(1IxXUⰝ/-].z.IywA/w W1i3(.t,1HeȂw<)Gq.^L:c{ endstream endobj 274 0 obj << /Length 149 /Filter /FlateDecode >> stream x3135R0P0Bc3csCB.c46K$r9yr+p{E=}JJS ]  b<]00 @0?`d=0s@f d'n.WO@.sud endstream endobj 172 0 obj << /Type /ObjStm /N 100 /First 867 /Length 1859 /Filter /FlateDecode >> stream xZYoG~_яQ}/N$6CØɳH/9 ~ҌFXDGuu]MD'Ak97 0I ,QD£nP"["z^/b \))g4谄 Vư otiAB I3@H ÅNdJDs0<)d Cb8X?J-$3 &S1p²[tY< {>%ag*  "5X!\ @ M`-3S-TcMh8"`X6!du\I VO@ݹDaeyTZ!_ Zڈ0L@ L,Z`<$=ΘML"0kaX  ר8_#0 IvBldM -!V$B b5  I@`oѲA1aaP!L,tLhٺŋBZ.P`0-^[k57ҧ"c9m|m~^-oF\ +V1)0+χ uU3Z-5]PrVۮY].?K&X\a.L C..9XЉKb3qq.}!hwx9cLvd22`0@jCOR18}lG@ty V ͐_S_508CA?'VUyT#-l8ۺcnU~x_OMaӜL6jţV:I'uX7wMNS78Hf7rR2! JԃLetwn4Rw_zjEwq6 NQ >dg[Fix_2DIÀq$Kɇ1E{>DH_!necnݞrQ돑^GǁGM背{yit>.KV؈PKq绽c>F;\iNn)tZ ?'`p 4KLgK3^+[fժEm&{u.y01$5r:/MU:e+7B}|7Mn^Is)ؑ`GH&3҅$>k@6f5<G8LBztYvI" ~`)3af >^]Dz@cUg@[~,U͢l/wWlN[? 1jԾW}~(_Y~>p ]`Xƿn~?KH7nUTV<}[mQ%%5҈4"% &qϿл권,g@Z]gzp$}*;=N^oofO &܎.Vm$N$<).=c endstream endobj 291 0 obj << /Length1 1932 /Length2 14534 /Length3 0 /Length 15720 /Filter /FlateDecode >> stream xڍP ݃ >$ n-ܝ`$8-}j}ս*2U &1Kgs $ `e`feeGҴ;#GlŐpdf7@`gge4(1䝝 D* go7[k[<h-l||<nfN%3 -@6` ? '#Zi A@7%fKcFh؂Rh8[=܀7 fd tEh)T\N"0nÑӟfΎ.fN޶N+[ @EZf9YA4s9ٛy:L -0{@n.`3Ypf)'K gGGG~n@{}NΞNAVNVa¢d&BGf XYYyy@W†.?ljz+ 2nV/BdcXZ@k['Vw&G̢/ 'wU;{|L\66nNÇjfwsr[3@ח !+qߔuoFi"mfM;m vRuɁvAmIz-Um6_r? j0vYؿ]S|[ )dlǖ7!!./:Zb 3V'`CH,RE<ST/e8,7/E|o^AЛO"N7 /![VP;b/_Ooe a/?mYu7_\Qry[Uڿg{K Oa |!۸Ջ2x+_\<2$Zocz-->}긫#dڛIc]r LWqV,y}uGFt.-Q$^}f4TX1߳v=t+d֝gW@xž~5Swh(à9knJ}|bj_/xQB<˟ U>1 e*vaQvxZd9h[L)iH}CJeRd?Sk$BZ2|!>9Kg9Hr+ݙtlFF6 bbYYF)PtCkso&;h0b:?/䣹jAsQW͈Ac N4I'_4#;8iE|<#H JgZRhf@u9\dE )d49%dzH$3h,Y{dc1˰Y~g |5p<XQПT'=SьSfK-0}2 ]ȹUyz()xMܜ| *{&![J>1]psY,P{^Qdna硰O hx# 'ϬdHZOB^YϧK?6C [YE~LVlsAD" |eS~f/܁ IZE\g-De܄)a/n\[/p-E <+o[ P4-!)\AYShp[A(ߧ4-6 #?,ČS~ckB3 ڏ_?;M!AbPEe/n)u"043R@jhqqx̲\ZdGހ!HK3`KR -TӖbv9 MI1t2w w`hM%7^Xꆡi1GbbS]QƇ}a󵆲AGzX&ec /8XThdˈ TC| ӛ;ܡ_'AJ՘\>D]ӀъcFǕTE^dH&j2Y2&]ʎ!)d[qHko w[K #rR}W+㖈E:qɊG}sjWML^wPi0#ޫyqBFp.v B8Kfm -eý OhZ_-4UaOl}._1eAaԮF3=43b9mo?qF)_vq[9%jL43e917:VpBLU$!Ҙm0xh ~[\ArXʳܚomxki  CӴ6L^s> )+%HKNa6 ]o6=RP\B;Ӓ}C9vŗ3~+6bعԝ @{8~藐h KnxS9ot.BRYͷ*%ۘ.1tCneȹklWHpƳE{VY.~_;yxIm߅7]3`dVKu3*R9jev`5;)kH87Wڔ=Q!VZ/ujZ>~gFjἾBFE&}v. ~sкSU_jLo00ib(ahFn$W'r\A ;C*mX gv&9h {g!bN|auaԔ#@u)5/6Z׌dNNaKoHIf>Y)۲R,y+ ]0]c' 2.WV׮GYTGu5U7> ^B9#J7+5z*(%!= t7{Q$ME 8,D3O쑠6b*k7UE 4EcR1}Xh 0% _S0~4; ؖ(Ļ311A([}0َE /"sT.nI*\ -<:^p~ˏ(^t$h{>`>m͇*Vѷ/xP>ZQb=yx* ѽw$]Z އf>["guoH/$fz儘+ZWٔe[/WMeiLIjŏ~&/# fG{)4g%\}BK`u?<ӂ%W`$fJ񰉉g7[#9r@*iٕD+|}e#ȠY} ݹY*|YtDRPދϔq9{3wI`wYG)!2"1C8]e w[b4R(/i 2ua uͶЦtC9anlA'g=H."T-~c`?߸E='EwSZOt"  |YV13($h&>t[[x8 ^xgORzԌǭ#2\35cWTJt+H|}D'?Ϝu@:-{؊!>R&{ sL1FbUAIMG$`lBME#Ya<ڏ5s^GC6vČr^ܮr)Qm]'ftH<<0s[lRT uj*\<r]%ecY]eVԌÆ?ί6 YOWtqDdL^s/okOhg?Iwd1AU9xY7yKU_l<owci˰Y&YW(`ĉuҐEPA3F^߿@%% aJW=(Ds@+q>q4xxa@]rN_r×rNб^[G9_=f=ChkXShݍk $ \|(b y$HJ&~eu%3:aިQɩ5ރFzYѼ#0rr|IJ`E&CDLO[m`ٺ3ݴN=`k:/vwWL"EMɒvm%;W䄩fD,l,SsbQĨ`hKTˤAl4H3pHJ_:!L z`$'x/ 6ո|ᢦsci~)ҥW~ftmX8q-ԑUY 3jIYSJnJNbG#z9d#bܙBkB`pD7C2<MU"u(VȾ2=w:Ǟ5T dN(@\>m*G7U.i7ʺ.[CM5m q2ş^l JZCN}UDkۄAGYQ MM(h._x1$]; 8@ ,XS=-?$p$%kYנ ՞ |_FT$\s!w(E~ڷ )!\> ȡ\輴~=Z98Oae\SƋ*\9 2ko%vwiCi:!x4=+[ fQTOojU(NKubC3Zq|ק=QN dV`/c|[H Rt$|GZwy|-8 g Rw$"3˷@n9S:aU&+;|@V|Sp4jvq -o6?>%aN;sM?W3>^˪TaO\9e Cr:^\ H} 99``!ɶrLLo7v@9UEoHΆ4_k $ty0l!̟ 'f5C֖IKQ&0i: Mtݜ;Saw $*JU4x\$ b[G89/%W#ۢނW8!w8U}%?j|I ϭ0sHcz D)P`Lg|fmzn/H3-C4c SAݮ*d3C e nj sm^\pu2bKyBISmY<튆7gTjj'(cwAozPTaHU}/>t;CsL9OrteP(8eB`) SDufZPZjW[(`D]K3FmqI[='Drvyz8!RMcÇ+CZrȢmSWMobz0 l◦+W5MB.MZ ; Ɲ:/G[>S-~J)BXԁZ 1ٱ)-5ɾqHKtS'ѝ@bw7ndW uQ6a NGpy QrsQ0#\V1  kwB!\th[Fc`IX jY+Fyˣurwt+XtR؏Nm_jE:5k`⍭F7n]5?3Џ#q C杲Di&hLQ(vJ\.+e[0v0Oc~E0YuIkiKgnGlhnK)\G XZܶ+AF Aw@JLT(i$ a)sk_J:3ܝbϙ>Bon'dtmuЈ:}r#HB؎,u ^?tNMnHسs\e1,Qqb `?B;K[)g-n@ $%bnN¼10V+&tC=5B;]Rqy)y@:'<+d4SpO:v:$yҊ{i$iCmG -$55&ΠDžI8Cg*xKCєKJgsNMuAB\Ǡ`>m0)%v6 ?-C!nZ}U^GG;+9|Bw)]eLp*J`J}njAe(i[MH?ighFir]{:frdM4~p 6b|Oy] aMJju"ޔ=JPY/e[І[;4 f,V's(ynT= Jjܻ_ ?ջLn %7fM}u$gLt&O-FrtJi鹸h~ߨ!kx!G"wTxL9\1l~mBjIaLm^Z\{Y=0~(HdF@`閽dWjɺA^$DG154j(͘EB껽eVsvd.YNZDJTtI]/2NT<'(cD`Ł`s:QOL9'rVh՜'p@0 yIk-hިB8-|9$\D!TxRO4\PR] X{Ol@K8:oOi5:d)G1h%CM/P{1Lnj<_Gl( Mpw0=Q!6-(PIwG== 8I vu}xZV!tc ^|Hi=,K jK|M!͖ > }tM0@LgѪmT]>0+ieq0`=voGc% I(lLO{}:墸lRg\O=\uՓ$94IExl“6YM 7:K 51YP#{ 3&'-xrx<4]b.;0ͮYդe>%N{;a^)R>ꢶ!tQd7uıX q }yo) X}^YL9q +)u5n48 F#|LUN\3|.\<łk`ĉZOd 2(YĎ,kCBw5]7nmxҏqC-a~O7#>" nQ 5`ty(S%ʡE}FU^jrOnyU{pu0 z ,61G6LQx~vէy^"Vmwq9]SPqK 3o%l)&/8F5B|QU5dȇXϞrIT9)#ܽb>%QE AMB+ajPܐ=\=4ضj?PgWjo =Bl/3:pV Z[)D7VupPiv4`\* A/SҘt o^ z]D ph&JcƴU6DN퍼<|᜹5]l\p|WdKm@d7qV@0sbRSJUadu?몉}/ZSmV29K]G+RHq1^{ͩP RRz#;6Giz䲌|.wDN%5iwSSn)\KSX4?gQD(ɛՈ&|ݻ%Tj&r~Uv &)dŤfLJ.=UdcWb3c:˪Eg% Έy.\y$O^Z,'Z 9iH0ØC.}/d}DNVˍ. Qy~Q^RwV;gy7HZCs3 - E2s%xa1e7]qFoCbW3.=3dD >λ)os;I׃͘+ A>78K<8ʉ8]jh{zx\o8XNWA+B j=ө Qݸ1&H[yla`Ei2RYo*(Pg|j*/?g]iI1+1 ^幺 R>(pp1$n T)`\CeHvHSl BJ Ϲy"ﻝf(r=*r'f{e5\; yj52;qQ>Br d݊11K=l.m=P˽;~PA08e))Be λżk)BTd\%cʳ=9 'Xru©t (ot-NI{.yeuweV w ]sԜv<{y۬鲽7c0i:K ݒ^X.,NuL\5g\T]|)(!qc.΃dx!R ͊Z~WmeHnT: =^mMy*cT'Gө$SCCHwp_o/|{WU& BQ +J]v<:JݫƶD eS"SOVL6e <. OS181+?V ݇^9_U+SIxuEke5:dcI}XDWz4{|DbVe/8T-*zce;mDuօ(l+56J([$CaU=؜tvB-O& |!IY4:S›L*zԮ1}u9R "yGiX~%eTԺbQ9q)SV*/gE-S`$Ŵp5 nhމҷ@ڈ\?45MYd/o~a!cra uw\"4٫}XPt5fdiԈ&$ufԽs]|[9uއkLOc8BSK=IcFj9.n530;cBMaeԏ|R%Dz&]Y/VgdHq+"4U\9F褉R6IsdWXI"F!Fux4RvsQI/{;٥YaڜD O+-k6k8lN:8K 86%9WWfj.j~OC|e[+b-%/vH"pՍQ vRv* AfGبdRrX{M~6?׶'\7˭~mvKEh R s9^=VCknFo;+%$4/[8Ze{"|h`pYTP&K<];:M)1U< Jl zL>%_s!Pr|Nh}u}~ֳq/{ۤ|2Q 60VZ/e _ 轏FIVM?]AE҉2BԢ֯Lﴚ[ PkҺq+3LK&J}P8cwl 2[U_2G4L>", &N'I}Yʼnykwbq`-Emwg'Jich 1p/2;{\'4)֞\i բk;tT}9oԮD\~jK{xg".Fp M6} 9w 0 TlG"ގ~dꑫ3Ӵqxh:N=tq|NŷML \fP{Ȼm,p*xtQӷS$ve1n_76HijF؛A>+R͟~Vtܓ.)]:}y!/_*(ib`{RO \0cAM{;( ی O@t@)NH>hM`D pVޥ ah6H X@_ {!@5z(sZor3Fp~AT&OQx@ K7*EQ$s]v&OPd&_)hrP[YFy[6%s=^^cբ5Ix\jdm@!hT5O؛CHb?K"n%UJ /Mx"!<~9|d* gA|o v//2!׫ґsdbe^nl)+B(Z5Ƒ{|s)0EsC%e9HU; aJ> -d5iBvMYfrv(:WB+4Khmr rId7]GKo\w5{HE7KWg^gOvc 㙂_7?pô<s>ީGFZ#|3dm(ޫP &"Z_(O'*ё~*J_V)yYl47X=sx.V:u[nٽj&#Lq1z g-G'̋:2?+,˖+xZn\K.p1. endstream endobj 293 0 obj << /Length1 2063 /Length2 14024 /Length3 0 /Length 15262 /Filter /FlateDecode >> stream xڍPҀ apw> Kp%\>}ޫ{j"&F I['FV&>+ Rd9"lk 1L`kuXXXXl,,Qu]&& Rlf?hi DA`c @d~h ANFɎՕ hd`&Dp;TA  ௒@kпKcB%P5ur:oV`c㛉 &#PKY_ 7_woclc0[JLNnN _@+G7{ l4zS;u @RD|9;V6KؘZ[lO2~;/?d 11 g;O6`{guގ,,,\=flWuw;Bֿj򴳵 z:]@'g?KdAd/~@mX,? 3r3+|ДQwں<9lVVv^7' (΃叭-_?){h h6 ͟Aed1~y67yr)ȁ`+kM(ؾUkuEmLL "6fVm#Q2Q;k\uEۀm=-FV#{.c˷m&ޖCJۚel\m89ohr{L6No&('Y䯣YqY%K,f?`Co>Л?E`VCo^Y/ Co[_f6Aŷ2r(dG-%S?,:;M?[@[Vԭ [6mooߙ2#~ dll|ueοMcOVj[s|+Y#ۛ{s)῿? qq֘?Т*{WƝ1i$ZFEVGwik"][47H^<kޅ4}Wi~l:ӌ07{$RK@Ĩqˋ ?KFvYL{gTl{)_#s;*\rHS>EPf91aNcf#zc^g|X)Ugs 'Oś,̏ 7%֟jEsT/ o mMGlRXecfVsL :zͤ9Y:($wZ}kOZ }f tC9i⾿sDH}h8jNgxCsR}ѷI(95vnGKwLu@)|60rOalj?uzE0GJ m+ ܄lj†;ebɢH&&_&e'N]*XK5t2L\2r͗ $(  :*H@ O׶5O۵b "z{bSwCSI,FcJLqw(vFDݱH!X-Tͽpchcېי.Vys yQԀbVSL~~} lG 6c'_Ko\6b6cEBgy\]mivC]]Q5+NL)70#7`6`y`p$.{/a~^$c8f=5`NbG?L;yVN2h g.EImpӡ\>/:wqm{q&漂!Bz()adnmnV|]һx-< O'B %*10Wiax2 gB[ScE׽B%#㑣<ڐp qɔ¯\Mcx]%: hܣp djq6sbF#k {YV0ߔeRbDpe$稦XjM$/}ӄjp1dmG2" ɹ6:/2I rS<)^ۊrOo]T3ʘv=}YN$.Kk8FB$Ką?eI9gK 0i=sz x)COх}qO<6룿.c%+V]!w˛]Z]9)΃(8m~8Z W 0mNu rCmPXW{hחcQBr ݧ*%ajű;]y}Z5{^:JqJgwq] 5^so-RϺ![.8:Akhqpjmxr(}&W51}{ J9TȪ%VBMU'ȫqOΕV5&7 E!qet`K]v#5CpP~*‚NML%g r FlF (Nz7xqGG >g NHKvuA1Ԍ{Ǎq0×إ! Sl:􅠢23S`|W}%\ԣ+eU7^: h(aE/4M-[ISwLܟ:yYf̪,qd2i1(IiQ\za6X^黪@o-Wo@StY&8l˸Jn@$+ZGfK˼[Ugg4M6U9ZPy;M«E޻ ǔђPxK_6\ ֏dAv}#uh>%+tlʮW"౧6x)@HY QHE5ʮ%jy n)_ _Yh6k9B(8.JT?әӧB~ BffrN" $ E>4?蜇n}Mz |[kh5Kq F;*͜Ox)j% h)h _nKU\ $ ڢ"v`5-8*OL>`hhY9+OٖB7|ԹMBu S:yBBh"ϖv.lM“X˞A;[Rs??VeY>:wyۤ4Le1ZێFd  O#Oop᤼(> m"Y:ST$AH&uװ}n%CZ'*6..d0 ‘̷_i$Z7q`@"g+bb $l:&xX5)[FO_0"6Ʃx1s {3uLѧVOt*tCV^L8nd'x6C[ VZU8!n9`nM7Dh(ޠ4V`2F1u&0eg-%Wr6 ä3璽:]IOǚHVn:f53'A\ybH38dVbUghM˔DIh'(m4g%Ӕ!Г|g[OH2A~btnŽzO{ $ HH' =P?i%~7-fߍP2uLkq03pg}ƣ1~H˷5Ra&U:E(6y9X'ɽ${qNIՏ̞G*b^%&C [/79' I\̿4=FC'SG,ռ~Ȇx M:mmY!V{{{ڋBNSІy(ʯY%#@;>=! -'M#ͻI՝ҹJFŹ^ZqPUDvsuExϗO6UKEqa+bNn7H,Kꝛ 3+Wz0lDƍHg[6 k 62HR`j>Vt+漦ˡ%-S ;ۣaPJޡ`6ޢ4V ްi! غL #YVDs6PO?LaC&:*S8|yc)%CC6S{w^VUFLbɇxHP2[$mR,ȍc)F[* S > p?u6BMŻ'hk_NsSG%mvbzp- `t F؃{+0 g͜ƆcDjSM4nv: ŋ`0ԥ/u"RWV?ip*^C΁t ,ҵ௧C7ʠ@N pl :g< @] k9,\hJdf .+p6Ǽy3ݪCs6m7A((GOCP|l \K_.?Wy7z^T UzNMҲDVPtU)mi^4w<\դM뎖@җAUW|6!/-\Y?f,.e!$"g.R\B+tR܊6?}OZw ǫ~$a3֢7MjS%q _5Ai!Ecݤs^qm?ϓ'Ua?3(& !.?iQ~^Pjۻz\c_¿D8y3J^^M[ ^ o E_D?<hFbD6jIxAT41 3T^~ф3пW6)qWU]W»=S$"-h/RD +e#fx.V-Qֵ ijoKD<-f#DŽХg킝b&ߝzxu8#/:8VtB&#{xZu8RނV`|p3M +b;oy.fM%[Ҧ;PBg"A ߿kp'N#߫5&j]VVȇQSbn7ܟLo H(?ȭڭ0lu~8d)U^ΨT^y),.:z  ʘH"{ ~&o(qYe[ W##.Mf-Ngbx.z \w w?zHk|dT0U|3*wqo l~tVh2qĞaOCD̆Tx<M{ꈒΙF+od嗢c; m!bnjL4 &Ni7W_CĺxCY~WȨ4=wVkg֭g-IRZs7{h:],.D܇pxcJ *4Ig{we"~S8G׋PzyY$Z )etժ>ADΦM~2a>O΢+3 )`[M&9_lV=tzR;U#ֶpx ٭,Lleu~\NE.BM:DX`U*eu&-Z.n/$MQQ`Ko#krFLX0@ҭ;X4awA=tv.7…|Or趠 W wRB]UnRrׯ[RV{l_.X„)>+Wn2) ruDZӺ>D}? vX΋a  C44PҬ'1g<.3f?hD˞/OLr+xʽoh6xaQ"AyQ}[Äa|bT\.J7Ar2uIs`l>ԗd!8x^k**uj\7r<ۅ)VL׼rQ{ Egꛀ>2]"1T˿M0Cކґv^G=Wib2{C%ZʷgḣZxY#l&Zc?] D:UsB t(fȒd;-؅AHs*~3EcG{Rh{U%6M"㥠J Q{ !~gO/V mLRH5|޽&AB"QnrL)/_-O`K!Fnzm]4Xj!(l>{꒬ s0Ę:B0 7xAB;:?SUc<JĸrxFaKTۼdsq>~6O%ss7`Ge3ãZҏX HAp;X1i$M.JxOV:I`t5Q)k0l5OP.},wT+e}Mny>Jd?Hë31(TtsîAu)&aXMTlfmlu0F́W}ui\'B5gLt,yI;tCX3J{5T}3Cd5gġ<=?o^ǝGNDMi9i`YC<]V_$LtP "Ĥo1 WΘ}I舄HZ ieg.r{] U}@s>9b N:y9k(II4&qp53']7U9&x5^Cl|o)'QG#-H,cG1:C ُ Kےt+ʝ|Ud}.ۼ樑 H KkyĨ2-?ܽTxzH ivׇO$..e, ISZgyp'?;O: 9ս x\='.9Jk#"dSMMWQX# DC|:}C#rHQud% bni}ij\ջZuB27,{ӫ#ّËJ(_R%)*GڤmMUFలAguS~X'xz g0 Є!@\\9>:&'˖In-V:ZMUi| &cg[ J=(b sڈ5^̒f^\Hu[ZNS)`D$; 9:JUĕz*DZ fg Z-9(!r8+]e,HHmr ߭ό;%'udHE>X e~X>rѣ=G:se UZ#W֮e=?L_43/`I5 :j9bP% NzQ?ףZ5Y~nMIF^~KDyY;R$zY' GŎBI#a"jבu*j3Cg_)pAP+Xt/dVUݷPKT؂9jC_&)&bh̺ܯJ_ GZeyi9}0 taA +`XсuB68$]- o*&?@R:,IJukV-eK6R/.=r׀;cIxa5  ϯLY D,,G*>aF$pc3~ iwB+eG&h)s0]EGӠ<6e a [$F^s1 `=eFl06*Uu#PRlZL5-vzH`rH!ްvByfʩ[Ւ1鮽ǮHdWxb34NZv.WÚt'Z‘ik eJ<(QItʴDvjyغY W7-[[Z鬄?1S\:zEmR2Fhw8F6+ɿM)Ĵ֞+)^ZWƴvT7"Vz~饀o9 ijϞ?KqҨc/(2?Z;x!W1=vsbQ5 yk%!?Z>J'@%b #r=$X!j-sL"3aA(]u0](qg,t OE=}rKPhן5à1udz?AނGvc;TY Erw}mc"0ljjfW~glfGA~V#e[5>CGѮ屝nN< . vw:bdzZ) ](jAd3V%8Я.g<,y~jP;b5'!Wp$#dBZӚP TB툧 9BSg%Nho(F *r]<ۚnҶ;,8`ēb'[Na8E+i5oAnfoW+GC޸hBE3]y7h5]ṊsOԘ*ot}ǨN hdhI&a?KB#^zwe4 {CL6]NĀ'L9'NFheyL{`vXn@ 4ntp¤ؿ΢f\ڧ+rPNzb1ujǓ4GR~qr}ᱳw ('"\s-LܱVV &e#o5p *%_ :;;wma𖰈ۆXY(&*ǔekJsqVg}@_"(26,q} î Koݟ u;|="]aP_&jaBի׌j c[:dI|OTxY 9}H3a(5.=s-.n\hM+ZdT3,\%<ͤ e0ѲefD[n1BS_~dFչ9psHZ:c[ͻJ7 PMCR/ы_-"5[&κ0Ē?ȆSHF`@b/_o56&|@Ƶtn\* !e sȷ|>PZiRY7`7shiCYY]>X!0meF8DUfc4Ii#9^o C%!2:`,M:aҒ]ۭ*0ypw^~|eff-Qn)0ܴ'rU{Cu%h _k^iEM}\duIGLG(}jY~(0gWZOԑstÎtQ0J6J } ܗ`j>mz/^a?#qK*&Nc,Ttc[\.Jeɬ@qUVEM9}܄NUqQCq݃tq0v9hx,}Nɽּ2&lԆ'W 8XwO1߸ʜwV?C'k}uS65FMvU[P(DGAq< [$xKς Ш3}$#᫰u/WA;&lEwä/ fJh3"^[17ms)@~Dsb`=ǹ֭'ܞ &ڎF%b:!81?R4z:D,FVenV!N;II֔k3W. ;_jDq$S}O:2CJjf0^w}nZ\MN-b9rPV/bt`t3e8z~ps!:Frk(V:vngY3ޚ-E ?%6>.pt ?x.x8豅[A:~89QE­%4<(Zkm?yNANb ưUgYYҞe}5qILOoUb82{er=-t}AȊ4R}[8Ad#6\mhlP>yP(A}x& ብt/S@|^ βqTX5x>)gND T&Lh=H߿+{67\d-睙fĭw]UWy4 &SǶ!xHfΟĐ^yEyErc-զ$3mw%2:r 54FT{msKRgZ9DX"x2|Gģ_$}_se T"׆\hÛ?Q-C,jPMhYg/^f`͍Ǽ8;Tˊ 7Pqk\+dSA5$TJGЅ98WX֬&$qAsZ5szLf@Ed~4ߞ5E /!0BP']ypTqe~k}"]hCmgC|%xsZ%DL-vy7HcಲlB]|tuCM^q^)q9,4m%|p!<6'9؜߰YtgNW+k4$Q\rakv15glD\컸*-/jzzb<Д3#kqaNGߕwtw%$*D_B J?(<8F xBj>oLёC{X4U϶#Kj$gSud,شb3q|>> Xn.N'4Rs vdxi;T˜+ ɯQdCLjlC#"(XD\ XAԳ#dzfMi7( D%xf ?G}gMV,2,Sʆ>tA99B} H^McobdЀ4ˉ ^VG3AYd,:qZ:"Hz)':̯W- "/b첍mU2xN'M-(<C`Exa'3Ż^!4]AhU*C/<##Yp.g]x sg/!$TT}f&H&+(}FxՃB|J5Zp#x$ _i~f|ՁXV +*'Z:HẔaY]v8UȊTOibٖHtGq%.IK M  x9[f7(Ŭ"N#iD*\}KyA=S5-*RfyhEC;ɜcQ%{PcyݽpnpP=cA?y?/d >_,02L]DS{ .Y"hN E}14T5eKY@ K?kCԅcbg>i}Х{(-ʴ{ endstream endobj 295 0 obj << /Length1 1464 /Length2 7257 /Length3 0 /Length 8240 /Filter /FlateDecode >> stream xڍWuTݷK.ANRKF$;AA@WwwkP:23脋ſ̦/tuo(qar :Dn Vbh:k-f#i%0b{RxbK!~)I3(YȘ+7.ss͌tLTGX) w8"3Vu,GjΓ~4MNv]rcnw8[y|F' F7' )FF+M~>|!q,_tNjAn>݀5/WYAbDt5f\pDeGY[kjAOb5(\m9'u`BЗ@`|-z`_ ݁ 1CƯ0*k/KHbg5 B_*%m:5V8Pew/JD׷%Vֽsr4=ucHEm}"ލ^^Eq,fSĆ!TBEc `#PXMygRP1XI.QG"F|]JMh|Z!Z">LҮ iv[SQo,C IylyCPV8h_|[ (`zbF:lB>\ZEY1[.j|u"¡m}jE`jasG򟇸2ff9umNMml\& ~L(pm#p$[ uf?ģCN/ wB׼gz_HZ!kh˥4*E揇' DsYؙ5X%_)ܘ eL#zR}U c1"eFDԸ0^ {(nOҭ։.tO;j+zrC9yojvLI˪&l+^߉@]JW2R?j!uN)8nxz# J}xzq6 Nď'h~ZБZnjMʔ4k _:Z.4ˇ}_2D1Di7Kҗ$MiX 0n̑53 f(ݬ7iG#MoHM*-G}'MO\]ckQt_[7c[SÆ~Z, S0ҧ%6} .q3MZ?ƟUk P3U{¼J6eɃOuX׬ػGS s⹒)r$Ǽ%Q<}#xa DT K͚Rxs栍q= ~q뗋rӜc&rЊctoO1c~ži$II^/E?jU߈g>qG,CSbU7# }Rje q H+|w@]\qe.1}8.pW ^2s[ܚN~yנ38Kv6[ǃ$֟L Hk!mTeO7G,=N B=@oHK<4ycTD=0ݰfm6Uden—ZgN)feǣQƙvG`4|c kmܫb|QOnxN&`r~Bb{GctJ||rFDM-KYZՒEG)vrI"],c | ><8ZؼVQv]]>:zec^ <ġwijViqkxl _uSъfQl.ed2j. ҷ'DoHSμ EJ|8J"HF ϽA)IűRr3l0 edž=d5ۈR0!➹hOj]ZLC( O3ht,]Sxle 6& YLG\ݠ lwc?Dq}֖sd1ѵYMtxAt}.MùTn?oJbc6pS=k͏x $qz.?;ћhkQjw)Mygu ;w>5ʫϯĵpXG(R*#iL1u'&iG9f3V .+~jR\}Y ܢVxh%G)NI+MO(Pw$5V3K{N԰=#i* f 4+׎w):rc3[{ڲCNMPys),^QEE@28ERwS=\sd>I!A6Ϡsy{43]["+--#oUe+Er/žiκ#SpN:pr3 UWRFh@Y3e/L}7Xlw)6U=1ֿPLJ,0ҔX^=Й]Az8Aߜl7,o'*In3E`۠RVʼn)9ʹ.yҲD:F˳~oc_GvtAwn /Ä5{=VCCe>0a;K8ރCIgܫ8{j7IyfN8]s"ۿ}P;n$}lYDZDQ NfJhM?_R 9-NĔt֟cumq>So9]ͅ]O`bW?-w\rS;.E(e3,&q>fYLJkDe M]~gʈ``n=#KRծTgP(0Gg)Kh>ք/O^ J|XRpNQ&5'EgYyR*\LIA#ch/r4)|uN&% }ܬ0GI}ʹh¸Dߢ+r&h[d^ 1dklh:`Jq&bGaPPM wq!;1G< ȳYNF7赗̂C9Wm|,]@hcC1U;4`7ՙt^]7JWuxENkEɓ3%F`}:ztaJ@qg O1Y%6Ebiie]E wW3jR04ISY#| BS8.+\wߢ@5k!Qv>f }u'讳nRp $_܋Xti*mmdz(˶%}VOHN!bĒ3?-[j(o0U6 >圠qЗQ(3*;:g+KDˏn0/?9 ߀LzmWJebuaJwy1Ik> ZLÛxY&ǵ17>Lu=g|pcXG9v|r&# /RzNЖD) Gܵ6._4+>_⬛,g7ݴ꯾9MU}\訾Ϲ`93oNX9;@z%6a>GfvZy6DmhzI8"3Zd wg"]Zv zٴ{}E-g2F+F7j!r]'1S;{?Elu}2+p3%qQ<*2u3!ԇ]״&If^PL|l;GVsa؈GM9 5.(ӫ-T-͙1Dz}J z$Mr{V P`dW 7HR26#yíń7:#zel|+S\:d9KU|[FR3pmB[Dm),2 ȁxgHzjLP̮+SJMV)ƺ ]vFY^s8LR>c?csJ Igwk]3-Eԡɽ}KR )z `Xy\!CDŽcCΩ.:͌*Sj+@n0EEʋݯUv 崿Knj7 OGJ,Ip8ȟJJvJU2#mtWD Y-8YEA'd1*#!mkL6k3-ufŷAp7X5+3:W=X-%Ri0_+Tk:'qL*<(y>6~ Qм!lCLFR!W8]}h1?;p"G}lI>ZKy ;dҜ7z! =ѓܑ/Vup-r>,Cs.4LYGX;+%FeY*HjrhNߒ]1H{xo+ .=*5e+&WUZ_Hϩ(ʎ _is2E q$qS&KK>b"^K罁зˇajL+&Rci͇Bů%][NiʝOgJ$M!pl l)4jMH Ȗ,$pʸʛo2oG[TX?{ǛCsYf2jVyyq endstream endobj 297 0 obj << /Length1 2527 /Length2 21780 /Length3 0 /Length 23219 /Filter /FlateDecode >> stream xڌP]иwwo%8ww. Cәof:~%ҽ߆DIA(` Waa03123"PPY-F:X9e 4vč]Av7[ Cg^@ `tAsprt#ڔE/w oj E45:Z]ՑÃ΅Baj P݁fTƈ@PrG`a V@{ P(:1ǀXYKoDVr665us4[rc{߆ƶ. cwc+[c27H(A<Sg+GWF+%2uw~V@SP۽Ͽ"܀6,fff.6 4dMT#TBq1v\݀~>+!̬L]&@ +{? 1 '@{,ߟ>ˤ. LOՉ:x|ؙ Kz_%c񗯴lAmO^r)8z̦?,W_.67ے߄$lmGmlgeoҺ_SM?VhffҮƠ boa6ZHXy͔\M-ٖO=P`@GtuV_* of+' 4d΢_K `bwp@~O$[0A\&?$>q1$  `L;I"r@A\ P.Eܠ\ Pt?]EW@ Pt?] P.&("4HK:_=P;[:)\@E oBO߂?dܜbXAX5K/GK_ _T_= E_zj:A:I?jP1 2Gh,:OAY;'?a5/~Aќ\f&[b_#`uOX 7 ApH\@oj_Ġ*]L2W@nA<<_DOr &o?3usM_/!@'ay/ĺ>ބ,Şf: ϲs l Mm ;>՝Է+/>' mI| T&OD¿gP}qlwrFQ*|:Xc,laOySr!F^PEI<.)+!-ƥ',FLi [&kg5VeQd},XR6W2]nVc3Wx%ns,&/a[eSxQQL{0~4":sgv}WU%juxY!:R٥㢘!) Qz3oBv&Nb&3SA_)-NAU,?({$^ҏ%P;9S )m-"{$ahW|QUB2Mř;>ze->O $;ԟ^ͣZ=DVe 7NGƠD`;٘ NJ+J޽-$Zo{?ǵԸկ'P?Lɛ,P"Ȍ۵XH%YFcéD) !d=lH}\OL%$! uT蛯S"(2b2`2`>\0udASS=' RILP$- zb:D^?QdYg5j8m=}naH=F? Ɏ-RRX$zˌeл9f^wKl-EC~((M6>ywiZM)T;9pFģ3i*Kj8Ի&"򈆘 i:v8 :]sYN r$~'7ye옢/NI꼬B}~k$a[XE'#XZ}wA: "hB]"fS)Sj:PwBK#gF9YM=8>u?92% g|'#:[|{B9'Ԇ78ᗼwC:U,D1qt&a|>kcAl؇|K(5VU Dayz¤Oj&4bmm-jzsVLfcHZ+Sm<+|Tl;i] l ]bQz`yzǢ_V:#FV+wEKrKYR&,YlrzI(6^(+Ɗ~sK X-Ԍ&aa3\mMlߺ< &AmƜ>g1Zvcv;Y&}令m@phaNw0"9y>wxKim\A!'M+stE! Kȸ[,gt$Dzocyp4&/Q?Ws3=⦉cR9ô]\c7F -&oV@jİW{0awͷV!8~-p"nb;$p\{^?C 2mD>'渓0?6GI$DZx:aAɓ=P*4V$;4mLR6-)=*]ԉ3SJGAV.vїF&& O,ezWAݚm$I ^br}wOv^54VxQG}wr**ƶ'_ sx9j6ŌmFm$(d]9\@n|~"N>prwAYFjuU>`dqYO%{*T4ÛSՠ>8.Wh9d4!hulsbLˊ_rzWkZm fizšrKb`U Zoyw]dꬅyV} @,o 뇲DlpUe._.Ү[Kc? )aqi$Hh7zʛEL}K?pBuJ&U~[H-Fr86, 0ʗҺqTjNl-[_AM9;u4TcG9pi9<z"3 L-LڤլQ֒d~ӘOJxg57 H JQAw$C1r$>Ѩd6!9K.ה:ɤBFjRɊT*syi9b-ME ahE7XbǦhm U2eJTbBA!/g߿Ń4.C$4*AA*┽#6_[w6*i:Yo>Ʌ0j/& ,Om rϳvݼVɝR׮SL8zI^szNQ\ixC~P%4^`?n0HFl ͒ѣS?s_#B1.E&[u)aN&a-jgQJx5>|2ŦE$ŝ*p؎j\uO}l+ IvAkQASdC2K/TWOs1cۗu,a+-x, / fZ%e8`=Oڡv.DE>nh[}[Ꜿ8vF;"㍋Y1#U)\$#0Tӿn3,mܞѝ+s{S=M AP0"MjR ݆&p.򂒦N}F+1m$!OTh׎|;qq00Y<}h !7r{|Jqؐ9dIpΔ5t9Jv" hkCnQ~/1RՕq/¿fL{bx76ޭA":SkmĈ~_<{TxaUAС>;]$1qKf:C>>ZGQv~ 6Iuwz c!)@OwYxӀ©j G!Hl@̬C%aAk~MK]7-晆̼oZ[tsΐڈ!9 DzQbtl,DZ~3˽\$!>$"_| @\o#T:U2ۈ 'YK,15o WX&ooلA}x_Za:i/zms!9t $(˩0*{o{xτ$.paUzט?Ӵ[nj.uJ7[`\Ac =@jBkJ Q1F7ww1дza#o M8"))Grz';a bTpV{Qg M}{ۮD9rqF7ܨKt]|C? 7bFWu]-\"%,~5k^Jz[MT t7k5PȁkVMt+ޘ{,/8UCdz||6BsrH:z{+?L(6WEU9x{T/ d8T;X50A]OK+SN;~m0J4>3i1# ?k8w+ԱlYv| Y5=JI'S3c.Xzqjδw5G%F1>',gFncdWn߇4"9;P4L2cV1BfduL4cݏSUZ@p ;(Q С\[uPYh6U/Lrv(b_s X*G:JעًۂY@>A%3eF |D[I46#сskh9(.,Ŧl0K?b-F 4fz =`prf'Qƹx2݅|{IT(| pri}6'E^NS)7WÚ{Ӎ""B݉RQk*~X@[,\9$&6>йM% ׅڶ" Y6[<#S9љ %U##(RqN]$+?&j#o~݃؁,BbכN͙Y|5hG3=ftw;1/0Uch |hLUR5&;(lm=Z#6XPa1-J-{䣩ù禎Rg,~B:PaދXwxx/ |xFAd8zڜ4 ]=e{OQ_~)tf)@W˲HLȭg Sr#%Ve=q XW ѕ*,B.׉'P|i5`l}ێgf0Ɉc(P\r]s_~Yyy:%QkO`=R\] 2{]ɀ]VtDK\5Zq(g4++Y)O3|j_H?{LB R؊uǭIHr2uk|];m`P bM)Pe%68VTF#C_UFh Mbj]d;[O9̩eA.q#4>|lqU9T袴j<#/̏#|lhե~?7'FmF{3f-S~TB_Ψ:C'R#}Nb~#蟄>_cyYZ! (2_Xa12Ky&f/v뜛/`b#eϘsiMlp gRK LB|Br K4~)Oi֮kƟ+V+|6_}Mǹ*"pG1BߡQBI-{`DdZsЭ# 0U"N\i˫\e9u_U|'l.r!ё "uh6!: BZ~~xvǎt9jN6xe\wN?zocd{E$=2=֮}i8GɎ͔+0.?))}dU8|$.nWM V8oH$U[p q UlE3F[VS;LIYux!"ZuOe&kDl,%C JԿ1 Nɭ+lVޡ+Vޟ]j%8efX#ӵuDd?Rך͒BJZ*yAVr7'̽ѻ[ZߏI|͍¦UƫRUhܬxjj}vM"~ùW6¢KmsKח:iEaTBMjV P똍MxyXj0 3}q JotyblPBB=j&"/4sɡnL¯8}lt~܋7Z'y}+!J:|g;RԿ s&`, q⮱ }i'7=Cn||[kS׹vIbi-$^jgf<mx.[4e,ysrob)TV#*&1zC[g[n`IʔkL/Źհ6P4:Ee9";6#9݂3TM/MD$ 0+Gj NA rOkub2'kMr=R~4+ni=?#|Y+Ca/T.wIQ= wo(9w9vHz 2~y:O\9M xR .ҹz_?닫WH)3i }4)1 }A*Q]aϯۺ__S|#yֽrߕ&sb I6VP-]VK{<% WxL~e[{{y}}8+3+_bJ-$ɤ%/F}β 3ScM^$ q>}OI'uypGPGqP865ya>]ž},Ǐ/F#l%%G!-tmV D҅I)1"Q IDl:t+&s _l"9"/e<6^`kL\z !Dܝr2GRw+/p;(ZNj?z9Т$O 1";X2+/@? amI+mAY}Q؜EBdG 9-K:1 hڝѼ(1LHRHʐCI>[PdV-Bz=3rIچO=#'{Ip Q},(ٛ;K^bqf%[`rsюG{zl{{"`x)_oJ>ʨHMt>\zRj;%`p]a!2^V(%UV{}-DxR!"A۰ ( Կ|yn> 0K9ph~b ק [\62"(pي!]$²6ڐv.KW+N65j5"*;NˣKrrnŻUǗM3s)^jQ>঵#8QBtZxrz+iMcTXT1-6-^5ɼ[gL&ް 8>bAc" U.?! 'զ8"rxll:R`~<Oh$rÍ@+O_7p:.HM/&9~M5zQ5,L㭁\eJ]ĭY [1ф8 6G~鐏,l,iscV{K3CA:Wb"ePx CXD; T+ fTpJq5|mT?9>H~ᓫ,NfU7s(NJi^akRRK9a4OǒL38{ejR ͿlVid_{͌~Q޿}6ӥ=EX/='` yoC+ɋx7!jKzc ZVE^xsG:.Lk08Xχ6!xK#ڹE`Nwj0wO(4 !(#7L9x??Jx '97Ks YXnmx drf]|4&SrqN1@TBT%l,qR Y.b0mDU-Qf2ikW뜳N2YY@龅;M`W_RΉm-o-TE^S Rv^izoP2bTcvr\xҙmUy3Ci.[;sVdȷ #:G<4~M0&^㇇h٫;1?|<䰄;SBvn!oC֛~mq-Zѫۇ*@yR&0=>uelfzv?-gԵUzT"+L!tvҠB Y^9`Tfj"R_/L @I}&;7/ituţM-|s %HiM}U<L-ޑ*D%( AClf^4x@,x[o&TC[vp61NXr~vMvLj/.m·k:EFNB]Ơ/1?}`%ta{s92ZOߝxrY ?V[ ϙIW{/Q-p62+nBDl&!lឲ/h֋:A%Cl?L UYݩ[[EӶcz ~JA }bab!tU wEB[W!wU΋Vrc2)"YRB O`3yȰH3xTUyrG.O5 ƴo~olc WK1< `7==k{Z8u f=hQ^ qC+[TTذb>beykAom0J -w(ltM@w dqfL3jɱ"١/݃FoNÆLЫXHhl%tMdvr-xY@($~QdI -=?4OHƷҶ4kJQcYM9V-ܖ8<]sUu4 2tf';NjӔW(wEՊs*gkW{epJGO \&Cn@k_G +7PFeGr5 Q)y}kRp >ic9.JR PCXJ}q$ zB@.ʚzZ&GDvkM$\ {iv7Y5I`MEq(9jo:.W=8Royt`Y[ w?m8̚qW&q ](yZ:KjKٿm|d: j̈́|S  Ӻ nvtFЉ.֙:mYt3T!u {8^WX <2>ӎ}/qx(sn(延U/9eLTd 8 5u84_M _L&I8hiDAnD7}8UꆞW^1t`-69գi\o]jr g# ,wV3s$U2uD]'B""ZBW|⵽eu֒}kMH)C߰(ebqܧ4Nl!mƌֻ{PEQᮠ%G$t?AY|^[7F\SY'}?L،@ 80\`1t֊u'c4OT+]l?J+pl0%b2DdbiQ_+*ewLmEĵ{LyJ ܸ*hN=?el2ZH'7$ylLs35;y:wV3G`l;Bd*^'B{/7ik%sĩ+S>wI1VXl[UKjw9R6N29i_YZͰ%4X'z$M;pfGǶ_vzڲsC^pjpDS2ΦQԲєGc)qm-6E8Ri1;A-%&ڇ{%uޗ9|TDe9 aӇ j:!hE1#ELxHئ`%J]ΑurM%z70_3bu/gjPGCS9p@Gd`OuНG! @ǟ-l~  X'=[s_'(_Uf9+K2%$h/o<lcr[%ՇM.it`)PBHK.h3sY "$ ]7?E@ eJ!Ul@XJ޷9ODqgLULͷ!;T cїg>`.o/Ú(emћ'G%R|e`N4B(oE'.'jdMt~ BɽmX;BOؾ=Cz!Į-P]ʫhq{# U2bVȚB|WWS{yG;i3YX46|< LƋ̍4l3KDgъxPXuُa?-!C̰dAxOe@.~WB) lQy%}=v6N^Pu wO޳k4J_C('x4jȑcЊ_M2:,fnjin|W!0]ΜuŌU\ʑ {g*ĶuVcKԆUete^a5I;a^vY]hO"/c)d-9v:`ᅭ[^[?Н"x?viƟcbޱNwZ3@F;XC͌(]$r/:C$iӯ=UodP?jWtuH5_K( 7Y4zmoV-iRv&PPDmL#NRmdyZPWKf yK3¨$=>ʳZ")˰9 P2>_ԗo_̻?dgtm Q* tY;ZڋvGǬ|9df8*nF|cQ iqN M6Н"wP#,[/xxM,s~vs > rB/)wVrv7_Op@Lq*uHXj@j+e|qiwS:wCe̘Տt &s&.N_ tَ> Bb>厵JgǙ 3Kl^aes2~\!FZ*bt?)Ic[ETUԯTPݭڻI 0}br'B2̮XӲ]8O5SnNhD{u5W0nĜ=#ذ\QĝIq| bz&of+ &Ï1cOy+ί3HGQ`ݵ+2Q0qc X >Kl"U4[>rW'qd:<!tK}ѥ+|M4#=M)DQhQ̤ԼmxՒ8#jb.zq|Q6tLt}6X0rx-wp;=,0S |Ty;2;JKVq "qzU]>= ^CfDh^2Zw&Vm#?/orb"%C5O#GPrst6vW)nZ u^bNIHlJ_6= 5<L]}47.H¯F(Veπr]Aiw(Qz8ϧg2~* :YXS4û ܏?0!I9zޔr!5"n^w<&N#u:]|uZ4UCe9v>ߍ pC;4W ;Z})W2Ǭ{yMd\.Z)L5Я `G܆UDijXDSH؁/u8Ai&8k%njO'VTDN|d%M~p 8O(iS W<1$/఑*RX v%f`-%rD";!c(ML€WMS^ZaW u~ LAP~ 0袥F໣1|)4g# A''lZky%'גVI pQU8z?Y%n9y1+;}!嵊06{d+HZqncuF^0ߵ8I7κGe,C!J C][)=Pט&q<0np'W$3|<jT]SJ ;UrI2C4eǜ2᳒.>UO8áݏIK9;7SlӶ$x;E%_cm.D lU`+_u/7]-^ 7F8W^C Y ¥M~%.1JLL@'?cC1ijmD`_Y8mm#tuXPA¨Jwyu4LaEDmo 1Y/Rvk*Sar::WB.g*,7cX>q Tn lNd~#P> w?LqO)aoy2Z0:uDEU̬$.*Je"? qyPfw@#i=>!3dwLأFCf ]P#vQF+:u6W t+ Kl# kBu"2 @G5Q4aπ'bM9{ןđ{0ݑ>v3V'إ\V:k/ڻڽr֣9z $ v̮Dnxu^Ö%JR]-xw3Y8'JNޏ4z3#\fY0Eq4]X֛'`}hIVZW#z~ϸHG78X:R2lQ)Y3l"Ng*nyc+8,#Tڍ-OKѥtNr}*|2ք٨9ԥ9xfQ.?׳4P ks6@Ў04%sR.su/RL!֪ Iς9׆ID`ЇsB9JӼ `gy߇%/b[Qe=\;x95k bG[TDp3@ G/:QrB+/Ţe\uBlaSq3JJ$ 4nYwfvB(|j\ꓔw AV)RTs c]-:~Վ"FQ"Y\ͫnד_.ϔ̩Z(6$FP@U3 ?N>a>l;vâݐEaoS%ȵoF)t7IM,EĚYT#/C) 3O{w~Iz gj(2ʑ~ mA~`&Ziӧ|(߉}U^Y˹CYgO-W(DP]~}ox4'5B47o15`kpJH]~T>ٮzg` Gə%QwόB8BHg)$ G!6W)) }ev[T % ÿ#];xvddq :K)q{`V"ĵE(F)$qxjU 'b~y[U b&,ǫ"ɅnʄWT|z~3.;3i=,G=+ r%|;dHpwargKhBQfa;?tLdv`oH)nԷlkVX$45)&y,/=|Z3Wшc%Vw˙:3ݧ{Tp5^$H?W, [t>TIDEpѺDPvo9&|9ƤƀPiLRMbFuyi'A+s5]!xƛr}hyp Ez5E*P>r_hX%{rhdYb=WXۇtvf:z52LV|ӌᒨ}H/ -L rF:孋#CA+u*3h @o0XL$W@.x|;F袭҆ &sȎ 9馹4roN(V&c{-c"9+YX9nIh١|:g}VOT%V` J,bDYkMCl;qì$Ќrf[+L!c9$,S/FnXYEwv׵uHW IUVSUp;0ӕ5*!S kBGLPL$Ph:㒤R`X[ +i=1>t\eDdAO Q4KY$"vezU2y־WHMp0뢇TvChs@_5J_AX;A+9ZR9r LWeaKvLtK[qC!ZM=mR" ˖K%[V8]=m ߡ+‰Y|$We DH Y]%,w 4<cxxY#>{f?g5R8.9a¡kzmKV/-s]$Vj{5)>@̫`LKuNa2~Eo_ݡjnƈ8@/ƓMh܋UN}VB@c C 7"`u6a`I91.c'Ƙas| 3LJs;Xc.v?)x lZZ2+c'9Eu|J*3bx._[0ED5fjFnvWNk̹O{hR^]%)Yzx쫐o&AS[O|T7_!+"j\t7A,Dݭ1},X+ڻ*݁,"ޯcWd,Y%9]RFJ]X=ŝza:K+wEaj[PDbNƱĖz;HA2\B#G۵8q;ԜKU %0+.E4c,0~N蘄DHIwW E7PoxN G%Hj!œ5^> stream xڍT-wwšSܡ{ Hq.-n-N)^yݞd5=צVdp4:]Y98R\n6.T::-W{?fT:,/t}I]yʎ`='/#D t( T:)G'Os `4gp + ؘe5Ṣ9hnr֮NBl@6G(+5@,@OƆJвqۮh{s9 l4N d ?W0 [,mAUY%6W+lwq|mfτ:d%bbc{DiOYl!O2>vOoXڀ-,aĮ qv)HCy6Y\<35f^ y_o'G' _K p|o 1wl?Aˇ@p,/')xrXx9~^gQǟP#@fOv &Rq|-GF_J+O࿳4u:{Cx֬wVda =[m\dm 5Ws]ۀAj.6_+'=oȿ\2`sGB @OTg!q9WKv6sy<_#%~B\v?ؕ A]?H |f͟_n=xxV+YAn {SX >Oh/ܜݿs<_ޞ;}~~|AN|]!? /_ͺ >7/܏O\/d%s7k?Q_օtHynLm1z/B>a"%3UgC${W6e/ŗZےܿMИnG] /<@!gyp oTuvTǻ蓃/mTA/bю6 ,3˚%~J̌{ŚRL`A=.6{;Q˥Āwd[r/Eh޻h2:Kit=׶ d̄n\_]z|C u""OɊ:Tጿj͓==v]N#As5Ar=763`j>hh|*~c E맀VuZ+!SEMf{ ǁΝQ d i4wYnWd14ə鹭܍jhoИ2=Po"lk?ITW d#6ÅoMlIo(F>2V~>?6.`f+ҔBWgt<Kf"Ћ(T,jPȓwo$`)<|N:I/8"O&Ѓ:G#$j-/6EjVv)95uj`}ʁ}\G#9)waV~A)C0drA~G3Lܚp0sDJ.;B7=8s]MT 34P㟅^ز4%)_P1- 7GtPTQ}Z7${mVӻyDhv1))1FZn3r+z)Ń)rŸQ sSZ1)"T9M7SWz 3x fR YH#6 ҙp/8 N߲X5OL*z6KMl5"] $GVDV u쬌j[ n UOkxQ3C~1[R^M2\ut#2@GKQWk9Lci}()#{EVao$MY!p-7NSR|s׾HnߡCǤ3؂p6_E+BDHx`.g"FP~ __CT8~[kR@4T҇z ~} S8? D6؟Q*)kS,dQ=߰:w#8rs8BA&Mqy9+!xTN*x>tpZj6!n5f1]S:1R,u 9n+hE-+_QFI[Г7/ݺIͽ#׺dи$qP.*=Ʊ$ >6^NLn 6T(bHH$[q*'9^{GiFnqK5Y/,Ztc8IR=/g }DX*"nlo oRH&WL혵Лʌ{x5X e*%]@?V F ˇd\T&'OؖoE{"2E;[RU#^4Έ0;Kg]`>HG$4'++[{\+YܘW'=j+gc̣vY6iLoⲍ;*1><}1ԁj~>\Ì_h?hk5lZ[h0{839^Yr]#3/M*o4 >&X=dS?m2cxxͪTH?\Ysz_Xj=(1!CԤW=g:ޓ AeJB854PKzoAI }Dz#i.b`N.fUB/im͝`Xk 5/(aM^%'_PJ[8}~=UXjS%(勪V66k;c3bcgOuO^+lMeG0qIJ o!VA%씐ݒ =j5wPG)]/my!B71wjf!a„j4EVӠJ3^ IfUxZm-;U{NvL3$#NR,#-ʱ(Kjt%:}t52mΑWP%Q_a?1S1 Sl'?ʨ׷<}U4Z̉1p}5G PH>2 B?SuV{m+\.h7yv4ߪ<ĺz7Fz6\)r:, uE艛όY>EI-cy^q_shjWu9&E?OFUcD:P5ϩ$YS1KUxu{BDDA0  UIxo$X|aYJ.6c0, -/jIxp0K^] p ^ *_.عիE@$ҩ$ƒbl*2UIh7RՈp!Khsz;lh4Q+^F.'Ps]j9u?"Jeu?}C]ЫhI8[!`TG뒷07WQ"j1(á &.* i-~aߢO5 JQeە0ӄksDߘKjnL:++:^ZzU].j "E9rAe ^ѽ!*:Hm6 p3"\՗pwz")|"jR!+oVg._jRn?q[v%\D"Y?Qk~Mx3&fKNYTfxW!*E޶l{gf$6OSd:ჿ'p,Qx#"}vmYݷz̵VSc3D#ǃFM;|P>k7+)Kekw(cFM/0k٦ LJY)pR CVVmՋՄ2(kKOaA13twFj/_/}l 8DO† }A2i+%\|ԇ%Nkyhw}}ѩaXx^'hĊn\g! xAA(XcK>MSMsm@ ʀ]BD6=yIפ%N~ Ӎj4LRBzd*45Vo"uB-pU-fP򈠯ƝJ0|(2lXyc**Ad=T>T'ovEyޱ2ٸ#eA fi/%XCȊiO#=NtrVK e-muG42我͔ õ+ Քt?"{!̹m {M.4ЀoTkÈ=zYLu  - C7CKDbV^S?ѐVR1|CetQkiz68r8QНdr>\D<99jz#?:=S8-B:T esx[lV0)ED&WwKr qgm9d$X@&#:IL,ײa؜抷 st;vCLx/[ޤbRܴ <io?kStl~&fF XG1np)Yv%n/JCTo!2k|drM.q'I%D8,$X$oP|)FZ) Oy<̶>t&:[L/S"ዢHWxI(=D?^Z)Wmiêu&hqFNOk~/FܮKMb%|!$iI EZnZ/Y>ʣrza& y bS&dkCt%@2p56[!g.Ugz#Nt]c (Z |rJ$,2aXX *0'k,|a)- CyF* }%;r{PM,?SHvFYΒ-9;4ˊVϜ ȁbgYءzS; E-Rwb)Fr38@VY3{:{R*ݷd DIRCw:QR5M]Dת8P w ު 9YD,W+p1u4hXix}W)XU[""\1bc&tOD=%C[1 &r&D$t+ȪXq=N(j 1*j40M<(Əq8: R4(PPe=i.~d8t _eMpgfuV 8t4!,C!LN![fdWH4 n|)g.Gz (l L$$u[\xb)ʝA|9Jܵ,TC4w6M̍E<=ab1gF SJ%Ȓ?VSFo8DSUXqE9ԯvS(b𣹪}+P;v@1ÀSӦ *52!ؼHeg߰ yr B%f=vg%0A~3au|J4ތHߴh5h}d xP^ݐC08޻&S[IĨƎWv2ʧ |kϼP(Zms=PTqw.*Ihhi0u_ZAk1y1Yy1G=1c?jJ$ٷۈr)gM̹/zcN:[T X&Tߗ4 8TNUo#%2=({+|*VapW1כN6u]|g6̑ >^{(b 0`Aʁ{ 6ՈW&Sכ{z:τkDs7Wa oC{{i3fDE onH^-/)^.T,v]AUKu,;ss8޳}놼[聛EI=l;벉d E˞ck-i45=oQgeqngR]@.&/HnytMJVD`X}ʋ3Q]%uZ6)J3`$#-6n'0 kM$~/5$&u}7C}x /bWuN"6 `JU&I KT%RCdЪď 3881?/Qxy%2{cR;">,k5 .vTB`ߴqjv?8 c7D]x"/ˊki .!d;zy){qf 7& +wDʱ 1%t /'FѕizwV@| yD ~ PjtIC!G5b=*ď{֒M&3LŶʈsc4L{E1 RFl6~\̔a5B¨e˹|hò,͊2`\1Wb!__ A!SpWQz3OB"`W hfmakQcBBc%x\W?(dfܡ~~‚'>Ʃ7xӊ}:*7ep/ܸ1R{,>GoLg..^R~蔕10'rvL`SYԉ}JZ(͒mDm p}R%qi8Y͸uiT4?ɽ|P^̒p3a ^].qx=^ޅw&_';2|nHz3}Z?M(t l/gbr } ~|@Zmq4Cc}$rw3: ?eȓ_M 8V[:-O$DH>m[igL،>%0#a*YiY[q᫐3dn0N.}OnY:QѱhDƻ(TgwN)LREH; J{RI D f|Qd[o(9X('g,gz8S!;g0RƁezC>j 83j]Y6ٻ$u>0B}N$2^dq*7]!#niմi]ʋ^3yV:ނD7%線6} q5|oq ^J䈛+Kg!βmHm;Gf9tOAQu 2jc:3ϤMQQx Prd=m('fg+7yH60 M!i[zݯ\[6$a/^F: xio0ϟƛr$kpz cK~UƜ8.m' ,"57#bHpEӡr+jrFRKa )Ny]QvbQ3$L.à޽b[c>91- vJyfdhzP5 ՘O36.0_hۏdjZ5,4[1ND2OKs{j`2=#懔gAv~g1+/ʯ`#cD"M%Klp:V߇-bpt x:Xt<6! Fޒ0­Ɩ)mzu'k 0PLWFU=]{n H 6xgE@UKAEO?[ ˔z'7t4+ًF+ye sW*)+鰗g4gʝ)"rd]4DQoNq%E^IʣRÍ򜾁Ciy\VI0A nL?GH;RuCL1-Fqe78_m{}U~kW(C -mM~2%$ۏӼ5{ycDW.3/.w!v }1ֵ5>(f6ӇoS${w1؊HK? w-RdxLLJ*{A_MfCcwg y%F蛞lˆQuXXN ̏lՀǨ7-!>[|nϧBy"1[zĵ_ D D)NJ Մo?Y#ݣ@(`~ 6b2RRusg:{Gj#wx{`}-𗜾P;}FłHWF9תlN.OVzWI| GQ4o^;~2v-ցs cijQ186fzVVYnaׅ0mr#߬>7P4vud^L9C\o.] "͛+Lɇ-9jAk]EHgQ~;es)Zr\ݥbCZ;WH],&į*&9ZzH+?dˊXWI':#hLDuKAU #0Ij1X(i*yǦ%> stream xڍvT[6 ҡt %!0tww0 0]J !% tHH|95k<] 3. spqdux|\ܼzPw'ȟflfG+~oǩ#'77[?WjP 7lfY+~<XlaaAi  w{~E0 C!>**f,zyyq`n\W; 6qzBl4@0̸zP?[w/+pop!p  p8@WY GϽpU_ 0s}p;- TPrv6ANn|'9 uE.md0?9+|>?N lp_$l=w;4du/  ߩp[@fw? {yߵ4Xָ7)?iRprf π{z_q?51쿽99P77F C, BA*NnOpWo~p<5aO WW6x~,T"~JAO}zu bk (^h%' M}4V&e!Z\]c=&ö#t֟>bP="bVFc+)NݡdBD>V?/4^i&/FgT *.hW[?HS_KeWgl7*n +Z#FNz. yя(k c@!?}׳"f[n`3#3Ǘ2E A-h="ܢO( 2ӂ=L+jH^Pkn5 Q#)wt1/;rsmGLC ʪ9Ma;JsjQЁC{]#qMvj˱k@ޱ&Dvg5lwr0*ʦŅs"|l'SVy%U4] :4SsJh ?ofsEey?ex^o2 mCpg?4mdتQGem典)r-~RxALyP 3}vRi٬\b8=1-oX[n+8g=G:2@^F!灆 Marl:=o.PPH5mW\Uf?e".$L;)`$i|SC0Wt )S!} G4 1dإw`x>uPهZ/Hqwߨޫ^E>-:MSيy QL' O-y 1fl-ۗ@֩F =8;8?YohBkm:E>lw3xdj=CSE1gEW[ q3;#:Bђmv_^|c čuZF>,rV n=HG2Ӥxnp]Riɘ+&9%me7)<9,*!80QC$%k p9U4ee/(sGe{M<&Pv$<(%܌&5^pn M:w ֔dBUvq\u*&$DQ`;hGG΍.Ӟ8D >b}QH_~$%j'%KOta(jǮHJ< #rX_+eBB۔ҹnAKeScA~+5K]ʜ8DUCt88*KV{2 [QAXf`pkBnzGup7ttiF(5l*ZQoi@;򭊤iէi ڽ-ЩvryID}.991=[]%aLzwxΖ`O`C1Wt?" 0?8#ӕZ p3x]c3 |sR|#ozsN4; gV q!dl?x7aZ@TD&K՗>K'} 7&/5Q1bVɿ?6]}tw]مPxUKj#5XW;quɸO4KF1d%>,k↰('|;olkmXk^+ -ȕyrAQU7Sa'nuZ2"Bn @dՎ0mAti\r3-d(ՑpӐrvXG uhv9O&cto¯p?f%x/z}"v+Y{1Ih_WHi57q,֗ 9Iw= Av$ VKٔ~ 5pyx?d9Ṯ/wɫ'MYYC|'EޝUg q[5TL'EO˴`8%KṔS,#B /3#R ?s$kPi^CC.z1o =JA`6'y=rzcN(3-Wf oxYպh{ٖ8lU(xYȪL5K -^a]&{sLS[z6_Qb>"U9'I׫ˈ? XX$0S0b6UBݓ.G!NƔW9P2gV5Y VOq*s٧byjQ eb+&mtFlukp8r!<`H,2yqꓟdQ"Wr.);7$eڻ*քÙ,p?:Y4f:-neWc#4d9Hp }ޢ ο&Sj@"e+g36 Q.i߃y'䒝 Yi .FV$ț "]9Z,;>?3N=~V*Y,S'݂ 3|l.w*9xJ=gݭGbT$Jv/}W$DZЖo=t/0҅^сPVsΩʂver#cw,t X6.#3$'t2.NG sꈀyǾף>LB039I}b1|V)S\Z No,CHJSC/UޱCbjU( %{_LQWoktB̛SVOԤ2r b heiBI97 {*$.?`)UR3I?M"|JaEm$.n w~5;06`LiY`OE1D+IvӵGǩ(#:ȟ{oq"I»x'j_ǀ_j27y}JB&Y.=<"VңUZl@L]!uW+=VY~iUAĕp'ۑ2f@"Brnj&|<ͪ4Mza6̓x1,@>E<;ܿ`4=ڣKYKFAD&+ցzq0ԱMv-NzDR Bguf[-+^c9qt)MڌRo wZ_bN>eL1oX,!9ُ8-*K慏cŧT&B\ub4-[Tk T(P %k_\:يjg͟N^J1@Fa29uJOx;j`'X"وDa^~!,A-~XNQgSV*2O&=Y \g\6esr!k>ޞn4, 4 G }l& ovcVF=1?])dz(7'v/6 ^[ E*}vjRM~^Gn f1f{5Ѵ`w|WPtlsR'TY$h=kTŅdj_p~FنHꦵjzN̈ Y8⏓0mn۹DMR1Za"/[Gʨ&9yhgorQ:K 5#VҐqe}&#DRz/B&.(Z=SჃoSn]$CVlh} N҈DPPUa`}ʁ KA 趏xRI*. ƸY@,R5{}dM|{"vOd@@_IŬ٬J .rΒ ctQRvynRM YjZuH&h{n<%aWե`ߘ}~D5w1Nemdhv\w"TD;EFJ38ZIc]a^&t}D-xDtA9Q;j "fm?4͋|݆*Z৴ň7r= ߻/J[=YvH] jF` endstream endobj 303 0 obj << /Length1 2607 /Length2 16930 /Length3 0 /Length 18438 /Filter /FlateDecode >> stream xڌuTmJw4Hww7=t3t#JHwHI -xY\;P3Y8ALl̬ Eu 6V++3++; GDtuqrFh $MA`SE'G= `ge+?@ wr!QI89{XY@kN`a;@jcnP4YMN6@P Z@,,̦nNVtO5@ tZ~ P2uS3@*u'K+ۘN@W8>@]N t¿ ix&qPV`y Mݜ6f`%o SkOn6 7f7U7ZBrC+yoq]3;PN?V`bee,2fD/%o1_g'g%%f\݁+!,lA3#vho W/>+x?3ϙ4?Ew298l<n>n'\-|ܨAϞK <@y7`b5 osvM_6;Npo"j@s63'6n6^@ bkgoTqr}XYkv ?"qX 0TJS`Q8k?AG?<ǿJ9LnfljbcCS׿\Ḿ@K_b[l=?? =xHəOm/ "pֻW_,Xa[XxE[W_X&VVM[1kogk_`__|vAps w$˟P\`.Gi8ꯔ7ӟNy up?h9#ߣ_,qk"Mtw˟$tqw-Kj3? \.sϛ[fof8?q+ tg1zfw'7zQFtws1; 3 |k#-9 օՈy2srk޾q"Cn1C(}!̻MɑT҇V @$i' %Ex\/ďr@;yK1?Ӂy.zn ɝ޾2iyA%Cȩ}*?܏NIVې4d{y=_ؕX&s}\?e@ufD_x;Ǻ;މvB~c,LwX4kA:1+*^[4'XRzd\P9t[DK"kx\/GA3glP|R ϓ{!1=au &|LpjqGql h i(HC^|Ȉ ,=cTďa6stN>o;[WB,yߴywnwx/:>S !b/mo1 '!~>\+0pm#7{5؎NPgd![!^ľzW!U 1OX,PZlR8j R1xghtl`g2MB rznIɗ+vln.X7tW!pϿ]F}$R7 o9rxiN{WoajYPzN;#BR3}0S:<$^FmYF3Ch ]>вaQ?KJPxbdz.ӌȑ:_P7«azKTxQ+?0hǰakAɇ<7_NThӸHY+F^E/t߼P-'W:-T߷}*hnG$ 0du5#hsc"4pP զbN 0[!jly:rA2o(b'd,-+KpӉDCwZV)n}q'$B準6~\"8Ϭ{N)lؖ\q]YfN1jf(FM#[&QJ%~KϔzjP5bϘ<6ޝȗhw­1+0}xKSvρZ͈kѼ(Z9(B~Y=-ETlf-t@_ll?zGQ$ +TU,*SO8!|3 ^6??lXОe Xjt2n|I _pEkbc녪NϔOv$~:Y?}XgvN0E򚬤P{׾&T07r8 #bDYKr2YמRY݄j60t0KFI,Wꡲ(sEb8fJN81S'atM3s4@]ma46 dX\_OiD#?$?win<"CSV{h5GyG~Ϩ]f֬;1 ,m-:?,{lI|Ȣ򐷲^^ޒ= )Pm2oH0 qaȻ=AeuM*sMOvQ˞&7NF?=&6|d"̸kӾosdgRF&rn>DcH5+c'2{ѷ626p1+/JmN^!=<S^uvJ5DN܅}Lp|_* : "y8x9>BZ]FYIgsF%όsX \t!R'w|Al"оs3̵*CҦ;sMɴk%Y S@X^F v!5sm 5iػUex҇ĉ|*1ft0܍kS)Z݄ƹP8R0z7rp--R&5#wߡjȗ3J˒F$l1 Y֖}0Od=(Phu=ޛ/Z\s3 P_ي{0%# )p 6Ba&CJV2~@wqbI"SI]p#T?z ZV]iO_T:LI^&}g>OcI1ʡ0s9g3rnhT<׏s ChThYx='` '$~쀪5IӨ՚AIgeq d!̛4TY{ vCr[rk}fϑry_A\qQݠ` m|iGJ6X=eBCQM0mHm ?6οtu]&)X4~n?Td};ղzFZj#7Ў쎁|1T9{wqD@_Lr:w'fF8ߜT QH{بzk[ )zEՎV$:~q>IngJۙ ]T gK>hi_؇:O]w̿Ⱦ^D0a!xJ3}ѵxSP7!,=o9Wa18Rxm!1 ?wΟLkIv&W6 0kl媹 띻~u3GZG,wiN"#ɀbbf3} {'}5&\ܫ=s7plsr^3LkJGAo_"Q`2.e_"+}'ңr(MHh6cj^MQ{`]"hlqIvL@Z.(~}}'@->EUĉUTܜĵ.ejD[} u.\"Tq=)]~d{:TQuk2 qп}+@@Z_g2+֛[ǭ3x}ax^G.>E-ǻ;+1{r8WZENP.?&h'b=\n)RbLE `ok5=>LSk32߅24)aa"O,Z{MDAz]u/<CY'aS,d0oTAA*n\$ sFڰ$+z :x5\#J_ƻu.ryXaO_ L}1 RA*=!F]t/%0Lu5Q5Q 0[5%!LMz^JJ%A~τ{j?LgnxX (6M%9ddFuJƼ߄y zad$[O~b1/#k(nΞG]f.|xM!aXontX5̱[`P1k-&'T=/({N$l!O1#HDlbFeP{V/빩&/vx^0W7[*߫Y-B$( $dn5&[MƪN_^ +|d v窽DVƸ g >Ν.3-I)IEZ.W΢g--;cN| V~|;IQ2C9DMi>>OI/LxnL6—*l PջEb."2vgh=>SKF8.)cİdoָ@ˠ_23o?#_Z"pq!" }C|B]ARFM:cWM#2J(W`Cr|BDy"brZ~IrTs2bՓwUr֘ÙiPKu{l#E~IQ &Ə:JDmeX4kzB'Tem^Mʨ}+ ^{K1cʘ8\ zRg V@ 9`y1Ʊ;)vNur)*#l\W. M)2g 6daY˘wvTF DkR^d:Ev1VF(7j+e8lkPnzyx O)5݄?ȋy6rp[ զ4/5\# `zP|^N"_˫ĸ@cIwO- ȑd5a) v< sDZ(=5xۃXk.'xB5pD¨j+2NpoθH//9)%p<2e#ghsM`'sN|x2?e #V27v⫫QE^ ~a[ɔ,[;PuIRoy?džsRHN n'a-=lCN2J^S{a*_7VfE)__u/P*Bኺ3NCH|f>AC/ ctu>'ԙwf'/k}3}pV(p::s'ڎLj&"{mJV .VvX#F:)]2Ga]Z2Tf,e]Oj՞kUz+#C?ꍔ}Baa6E:6Ju:TCԞ©"BKjCQhjU~6懾4xޢ;ѩILW*Z˶ey =mMqĈxFPt4;$ $wN`mE9߳nr~7 nΔ%gVƀ;dMHF#C6DaTϺ+Fnj]Ra!/,ؖ@بl|"{Jԓ՞VH,jY g%uV1a08GvMxOD{pC͊amM;5#xEmA)X1mD.4Cv>`0m]ĴÕ ׋;S_?wUƾsە rE'X[h}.E7&JKrfE>Y 4l⮩L\l:-؈ӪjX"}msuG&T{ Ih(iۃt(ځ,t&o>pM"[Dۢf"y[u5ܠ7n?U}/&M񱽓)X=yAm]jĚًl5C yp.U&W+JAAXDao/­2.J7\H n+_(B.3@}Sc)+Jz/C:+bYz%&HyX9α jI=ꆠ 1& {az7(>a} !E C~ڴnl:XSi;CHaJ1Xۋ H>fk+jvەmmxOڄ٨miO VaGFGHwl>1+Cs55˝lGwG)mÏ%b" x`~9uA%־4k$cb$ $Rv@^]סSgyd-ۯİq>F0F^XijF˝L(0}zlꩦ u,j 5\L5`ﰲYeDk?$8T9B=Я4yߟ]K;gh኎Da7F:\ "ң)Q3|l4K,Fhu'$|^J^ %J3a ߷7J;Oiv-/H(%>LIiCGy~eI*hbfF!_LO wE>[T-? { $z 52bosZ< 3ء-%&T]Φ [V Q+k떱K3>8ENfd}SI_"HV^ه͋}bZ;4,av]OmUO ܺ)ʴJ<@OQq+s@)OFGnjZ;x0H{TҪ(57m03/HɩC)%ό6(_"WLGaPLiҪ3B'0>txR㬹'.ھ,s95Y[<:J\TjϢa2bY:뺋тSoeݖ"YGtdgwditF oEl5_ߟzɨKʾ %,%]yw[eҲanH](D3$s4a)okNiFs|Nl~Vaԋ&^?+;"V>##,/]h {r~J@.( y?^ct#'aPp?RG5Р?$\4ׅњ_Ylwu- S~Z:C3P+pk,p]G}'6M5< \Q"WCve' R,Q՟ '=ӭH^|2,û~ՏLjiyqAwav]ؓ(X8%ϴ7}+2j,zS+lUr7bP'RgD.q^>j}XY i@ɉʧu+'DxF&~LJT^>u\TZ"5#p!q>3zveFRin?{Ľ#qdY?,Ob/sŜ91YC&ՂFM9]w8^/z THD[.ޝk"@lT7Mxf8PrEA5$&xY빒a\ǫf aIt`]~#{d瀧 4'#B4>Z~7IRufeXt"w姬[OKKX‚3fo>95b'|ֲY]ǂ_`>e&v9n"akZ33'dԅZjHDQ^`FU/l)f Ӗ6d!ƍpB*CHx#rZ*1*S<&::/%uBvg%^֧\`|4% .ƺ>r,Pډu@Qam  ob HDȘ31J 9u~~b<mBZC*̤Z6e&^0~r&ÖVnJӅLZyz!vD3[8cMyY?=lŕ AzvJt"u*_*컶 MR)e)gxV- ~>jm¾'zj:{tͷD4 H)gL6Yja[~By{tɝL=f,Z,nٷ7Ʈ=E @ȤFm 58Q\UvfV;?*`켱LAxGkq=AyxI7iQH>+eӲCRd,jx_)B#~AL[-HәɲL\ R00Sb3CnE|QڂT=+A8.QxRȯ7VjJfÒ9]*pJܯ2Efo\*$ ,^ORp<[fXFHAB}IRX#}fd@@-<۠<4]8TPND;9k4ɪeQ"A^u5pY(UHD.h4OzWR,YYQf\Jm1:Q{vL 4|ыj}WlOcukinĀbN~TEK" I?JxW%0HWTQ>'C+9A53uq͎z4{@^X٬HY*^YҾbs\v>%?ψvLJ֢%ʱ[ZMt`4 syS5Y ƖXJ> +B66d*ZNwY(YGQ%COɡ'/JHY#H'G \2sH .tvA'5݉#n1BO 9l'yȜCBf(Grb'a‘=L|U9;:i_ޡm f/Ve},}s*Tf}2wQ+d9u-CH1_Y fQnW{Ôoc5YLHt%c~c%F^pخwur:H/quQӥaa\۬g=[ z/LY{4$rCnF|H qn?,a]1h[>hsi\]M[2pg^㭭$7SFh[-X]fy\ZZ#HN[=i bږ%]zSŽct[%#~l9:al$8鳶V|{O3ču p趝(ݩ\V7Fc%"0`6Hq4&&3ҷ>8ߔu7lrO79zGdRBwz%bEɌ'DO0XVB Q.5߂ni o02-Dz'CϤЏk2 v+7G6HA-Gyznt^΂/AuC,ܬ`x .!Ō*Ӿ"]YjI".D]|^SiJjC1Dn}EfB&wm,e1iTxx7Yēm(RhW1M}/ 3t_.n׬xW}\o"K ׼֔yypMjt-]Jde3SC7:1ՂӀN@3](J A͕FPV'q*e?[| p';-WI41ޤS.ZP`q"!{u24܄*N֊x ;P+tL8fUcv5d-YZ"bh& g{j- 9%6'*J(~2YVV\p݄/Ej&{RAN6! P \ml(H,<:)RRXAMr{NG_}fl#Tc*+8)D{V3 $)ME[JʑaD/e0V(5y.'w֮s뵛z+0j=/MoQhCThIvy=Qs3#N65ogu[sqE2rs"e: O>#/4[R<ѱ^Wkny$.3AZʼn.@%]@%R(z!nr'l7BRCP=rбa쒦1mocF/O9P *-`WFK,9tq T/5NP$Qw#:L ov5"[Yzd;3K焾a[gwg`LTUX"hg0:498G\yh1'#q ަgiOaA-ݪˈ yȗ.SE* %$aQ;Tz:!Vp7@RP p MַᎸWxHپ-R+0- ?XR=0oWu1}OVW4Biyg~BBx[5IW~a-b Ǒ H!!'`;lR $X0Ⱥ)q-nH˺fH->]*{-B{dȘVGIfћ;B^Cl,e,XjWFN~_S_)^. s!ew?sHq6H<+e A߁U\$<愾 *ʴ4U{( endstream endobj 305 0 obj << /Length1 1370 /Length2 5960 /Length3 0 /Length 6892 /Filter /FlateDecode >> stream xڍwT6R HK7 tw03 ] !-4()%!ݍ}[us]ZD%m(aH.^n @ ~nItma҇!p d `Ma @$7&ȁ=67 A8Lpo7=ߏ5+**;vA0@ 8V;:pk(H'7 wd؀R AC$w񃝡NuGފ@ ~+C )W buwo|+iӿ E(@ 6P_|ˮKiNPDz\ nex@ܒ rYm~ɌOPq@\|yohMcGކp7_w*h2#ۭ~_mͿϿ xAq&ա-gUT\˟0[z"_s}U?q)'Hќ, b92 KVA,qvAhlvS&hQ[$L\ wV\"VE7g脀. +ݺmDǸhdJGfꮫ5w*Cqd۷ޞ|Jp" be(H2(2'c](1G[iuiexE}gmF_CE)"W`|d}hF/jN~0(.5IҪSPbE,f촗oC!vv5!}Yw_,a!o.oqهW؁G[U,JLقdOhBS+B>1| 3^iAK c݇'EB/=${&Q%:(wDq"F4g]L21~by*WH 4:t8|-0B ja)-9'Vuj:0 @{<=- mE ݖJ6rJeCޖ7FcsC;۫MAU-gi@1 ELCӳВe # '%EIP?I{pC2bo7j9>B ]MbeFtsWc ?mO9uJКoD^):4$Fչݣ 9x)&UTǾi1 טmJrHƑH)z!%_B 2~Xrz]Z^|.̣8*oX!YI:4DF:ɢ85鵣v]E+ %r$s۱s(e3C$vol6 Gkч AI9*4Gv;?+$GvoK-$Y-^ayr+!@Yg)ǡ%,gAt\ZM~™ԴzgvQI0l72ʎ_9 LQ`gYS7޴Fwt~n0#7W&DX%/KRTH#P71v,3V\hj$\ۺd`8 XdM:$w*@^EWk'銳#], jL|1܋3iwcݹ7^݈n/Hn>}0Xy'A `?->P*t.WtPD:xX-dL.Z{|J Dr^x@ݻ@Pg ]h9sēSIa/ Id?A9[IP >=~fMk0#(3uVHw BGfo`3ZHڼ)͝۝R*c9kG{?LFOokw-qaKP_з fVd=џoK#3df½̭ eԜC ۂ.pjRUpY˻LXkP~+h;+ӱð<wE&\ǫ8{X͍pNX]ꛃW .s Ke6@FqO 5YH aQCs;N)v x8aN˕SdCЭuop,a2jL@GR+=_v7e2t=3h18P .Q̛dݲ:#cAN([ߦVV=>EN]ZyZL.dk*ƭٗ d:ep9xBr;֋p3V? O&-& |ga0$_/cY##Loz#< a~ɠ?IUD|GֱrwE "Y[7@f|,Lz2͜ߪP dΞ^hBOhggs$t8@6\AubTWj<,Ue_޴ͻ#p_ɂjͥ־3N*C&F:9Տދ:D-XW`/q.R.+DWzJR̾i}.zv:~P/F !-rMN *,P~ ߞ jV_ Yçb4%7h|}Z^O/=+ʊ٫O9XӕnegM^Э2KYTruÛ`T;e U"o6o)cSh4&l&"7%"a wã:mL*yloIkew͚XU@fù))o,].` gmc;uM) _0v! KҜ%G Z\ݯ7GJL|pu+!y]>KR,IyCUrUMӐm3[˲cV-CRJ V>Ԋ Dy>mtU >CH:\wX}s-#5{(^c+)RE;}two$P$$Zڶ膔E0Zq? 2⦓L8uRI1mg21oL)˴R|îrC+`2?,KDIlK-9.hq,ܩ}fjs˨{sS<*{۟:#AZ؏DrZ+nt$% 0Pe+4M+?qbdJѦhi#IXԹ> &CP8vI!Cu3\CVݷ.У&%B]ϓ'>‚^ &sFt':z\͵srKO̺o(J|m=I!Jt.e6 n"V'Gq*OR{8O`̚AYrVD0EW1lL'KVT,IJDlεQNx3etr 8z ;I9kyW++mC\+iy63b6 = ]졯{xlPǽ l+Kz|,G^c ԟ2.j8$hF$\8! d)/de[ o r! mp Ű\2PfŸ4,*8F|Y_WmdL|;+fVll]Wcb$*F/jdZ%̄j,*eHFoTl֙.6ƃ<@;zB~tPV A>/zMY@i.[>wW/ҳ+QȾ: 3𨟿$r bj`Dz0Tq_~0=T$r ޳7 }?@Li eb % :{&22JG{j:&_Q:>/` 5uP]̰q>`}ì֊*Hm#PjV;?M2/&~N6fXHJctFCMʻ,n(ZRD^H3_hI(NY3sa^=nq0FphOLZIL&5Rpv]3S+7a/~Mg%S?Q]);"J^(SJȺT0V HH}<ϗ4Mg@Z/:.{,n5ܘU ?4\0Pb{2# G::6 >[dbAN;zv#&]zU>ص> '^ HDJ~F`7 Ҫ!gC?ʏ׺B7ǭFLZ Go`2*NZ[*&O4J_3֢pؖp]cF+ ajƼcuXameđMAl]5v]2I?T6WTa!+kY7lH "|~1-fv֫̀.b9(&#> stream xڍxTSۺ5E:[@zM@@{'@ $HBUz " HQtA@@}c72^7֜;LTviQXQ6@b@/9m'㵄ych@yXM (odn Po 6@싀 h Cƫ Faqu@ YYO7F X7'"!6)ݰX~~~b`OUQPLa/ 52 3/ ac0΀D@`( .ypf:#//_@J'W"w0A{zQ+@FbX4. F .8M70o#A (+ n5PP5' ŐO =@z~Wp 5KQmsaR@Yii)v󇸉*`mqcBp,>tsEsEgZ}ucBŵ u4/*$*! @ 4@FF <`ğ>#VGdjO"g.C40nBpC,%wD `O2\,NhP %]^,+ѢbM? jBb_v_zC"P0c4 ˇiu5P4$`oopq+)@J(7b(4~,(wC`H,1.'5a0lq ruޠ'axe'>f0N A>IUgh}`VO —x몇%#_8hI4i&TvvV2}?]`6mewSjlVlJQ.=.U쨍=u2]%=DjW"׀uw^ ޗ<*믗cU4>LVǧ*uV\÷L,}t3mXҧP-ݗ>Mhy#6|Lmn" yTyx9~BqSOYteՙ>DΝ@Y [J#9Vo.t,t:\>c7Dľ<DZsfN^ty>τZw=Ց:}Ec)!m{uL)J 5-."4ђU]N*1ymcxvJ,5%ǧjoI|Ѧu+ xwi^ u5D.އz=nΕ؛ԋ-UCbNtUwIߓl[4"3{gBםdJfy6͟eT0wT95EC}X뤍Ի%+9LFV;Z*hY:0> B8Pu/h)S-(1<7x+RI CϽoZ^l:u!gK}/nm$"@Vv8+1m46)s/K5/%֝$گյW@3Gv{|;߬ 9M42*sSfOe 952؇sh| ER;*ʗH|sHa$S>gSR "1GϘ W(d.kOf|dbJ ;k%sΰJk ׿|1 K~ЬBd'|JR6M׶R[kzcQnvdL1ñPߴ9!fn7J] /(_׌ K LȺUradV[KIEld`T3#^QZeM÷biޞo殩ḟ|{6X,58 ?}ḕшb^M#Hwh{ˌ=_ Qʐ!{Ew֬=35SG4Y)O1|QQS? )W}ZjZe6.Kʺq3'S:ʕVUY*E-嚯4NvqGOU9M(f cCP^'P#Kͳj( #x1_kxW "6><|11 s8R!?r 3rܨ~9UnlM58sx]}z-}\6{whwCփ<&E;=KE}: ÝyT0u4ѻ|@-5%nވ`ػr?q餱rS-Oo~%8ҝqcaC/h{+ "kzʼЦiо,3LG 5?S*=x~ob8εA2U) ~K%D/-m|歡rEOKWTtUeN;N;Xx֑wE6yrnG`N Q%]hlν{ž){IUsaP:Hu[VcQ먛_~ʦ{&?L, OҐq V@j+TCt,{+Qu[)z~& @tb q}wO^8OP8^pNvzlʲt0c/ fB63XGR}S(HԣeRRV0F1*)j+mY&' e,}X@C1v7#ohHҸ{m4 @ɠe#YoTʻ-4OM wc NQIwo _l\=~F*䵦9L%J݃mGuDե]m|6>ח-be *e Hعcrv131O(ظx '#9dCJHs`Z )/s-بCŶ5XYGǀXӂՎ|1n6? s/qƼ'XtrCZvOP5^U˓< d[A~DNTRg嚟\g{nnmTҮvy2y8FŹlyf IGlY'cc8ӫEOֻxUڥ{{Iob֨Ik/3V1̈́44E(0=s1^##7!x//֭oȧyCEkYϣHJ3j[煷^y0ΥP3Ivʫ*qH'm*:Tfd/ZX"sSه/ xTkL|OGW&pL7Y΄Pօʗ Zlzq9Wt_a|JH?X5ݡu,i[`x}EtF@{ֽqsZ9[@qzEZCYy7k*.AF:hy=E\1zM7lizzC[=uUw^H&s ]VŽ?`3ϸӣ9}ǜ Mz/jRO @9"pϣu;I݂.X_AaY=I%<&'Yk7yհRWrup0+'MZGs4Vo\ؒL|yp}$DA%rgmW D%wHN p5h۫tX7afk񟭩gwx  (p̣)lh_r)ΐ/_kICV}[LOPcܸ螰ha]uŽ_,΍;&bbkflz`2tJZSR5OYՃD9hj"OrsUHo7?~jhYj6uc!{Y*[[}ȒE$`~ue^MΈ!X-cvm:N;qaj 3Wg3/¯8*ku.W8^% yj_Ff(/}te$ Cw|WwzOB f%^$U9|m,wƗ35:zM(J3j|5]񪎜R1yY@S ti.ӥ%y`c]z:RtcYUKYWOGw#K3dX*S@9=כo8ro7}{k=vxEnRndx&XPqDl{tvZa73f{y?߅;{EaݱP)R˵&G4O=`IhD= 0:Ab^&ufA<@-'%~"_pCJKoݬwZ5 P~XpgVF(>O>;L8 ࿴b6ZHcK|\wxԮq*m(~#́.P3Nh{BȼXJit:mM8XM*:G9~)7^^G>1!|ad.+;7]X|U!¡J#Rq|Hvc)UJ}̅Tb3 ]Z#pocO>W0P[T>ymtK u_¼U$Ͻ9{o֐i<4 &ͅ& t8Jl/e6u2}9ŗ ~ MN?v.WƀqirݘJv;kjBa9UG ɯB$"}PP V|xdg CBaΝZ !aB=,[_MdAcnɏj#2^4m lE˂/qd̜=Kt&(P焰 +YExvHS@$6gG{MLE0weǓOd#v n6M h)7xCIDO`1Ez4 Dnm:f;S*qŗ,ocw]n_ L!-yI݇9 F+*X)m(~K P8bW rÐCUj$J!ru쫃,RQh RƎ=^8G3~O\K߁I]g2 =5Jio0&AmZLbk3dOu*m첳(u 24P1G'`^5x/*}0g$rX[^ˋj$Z|r4`R J#~_roę> stream xڍP[nKpXp`]C஗ݧ>罪{*c1&*ꌢ掦@)G#+ @\QC†@Ia R j q C&a0TttȹX\||,,61ttH[r@WJqG'/kK+GИXyyX8M@V@f&vuG3k =<|+察~|Lbfi U 5i_'`eep|8*&qNSf? B ߱?&{Y8Y>~H_zMc1n-Ptk *ͭVd o'\df\}v8Z|xǒ~"/cw^I3G󿖍 `bq aJs翆p|ptAb8̢M\f,7%K,fc &vSo_a0M~Yo7}D/~i7}L&^f6~ ?ڏz,Y9>CfQ?}߿;?Nd;}?jQ?#qbnBzlz 䙹|_o> 4CX^p4 m%`ܟd@ޓw?P xfϋ^qVdbk͇\ma CU|W1%yڰǍ,-.|ƑUs}9,u)dL4,;DhhFGX\,䟄ORGcl;+ ۩N7B?{f)z%ߴј gڋUlڂHNv/ӔzǐK5j9ILSP  ^j4'BtkRljX:f"XtK5XǧXE;)Pf~{ "%)j3 yV>/ٸ4uv3s;5s0F< XPS`XtDBodV  S2cq^Acm[*mq,Bj+k.X{hDKͩ{z]v{'͚q(2C30$1^M(9:[ĺ:2Zi$?UOٹJ&0jjv+vyoj'a-L1fv ڐgNJNrI1V~LdS:/TWAbKsaaPLiu9Qz[vy'(`0!UL>DFM)Lc0`UQY, CŢ<`Ui|@U8U*n:8@R$3dUErn.QE|v|yӐ] g^m~ޫsŗ)Q,xIs"+U73%b eOT8<9v)Qٱ3$[3́eYAlcwCϑ S4!Xn8T3y2Gcv6[ӫ,h-iu@ےCD#-WiQ]Ըi6X-nnJ+$rTX88e\+{qUr\kPtI>մg/9x5_ܯǟ{C]w% *Kt.芾sLCYYZyZ %.)J30vsż`h>^d3UE=n/ RZ8b\ٚnc ty%h?>Xz'!7s]w G`:d`}X|]<~mT?X7fH{\Sè0%RܞPwѴ}d wK<A>530Tk\9$j<6%!\$ pc$MsAW j#OB'm0IZuەpobkUT!kDNEVt-%>mCt«;#jtMbmim!EeR+ԍ7aK y[ڟqƥ ۠c"*դEϝӢY:= [R_8h =Q6LwkᇤN0a!*@1{4[| ې({;'a4$XEH Lu q"DLq-VBxKZץ/b!7^VO߉'k }3p["3?UvqH%uekiԩ4l*5sv/K[|yI_3}g]°8cE$@gȩZ\~\$֒@ otdFz6q.Z~9-9/&G]0j`Wj0c ѷ C)zcD:,HGM R$z$x`@l|*v\-*h”wyQI% E`|i=Ԕ_#7U'A/"%B`:+DR+a&wK=Z.T)ki*Q| od& ftR8\Qm ;5ٯFsZzmM.Zel"Y_ʌTp|MxN F#mkU8 NNvѱcS ¯cЮF6W'x49?͢#LaGF_YbL % y_qƤ!*XrBe"꽬Յǽ9 uWwdL9-T!G 'I.N{[*TxYG]󵻈MBȗR>fP?5s~Ѩ3瓔&f}'̹gDleh=PdX Q_=b 1sMj 2""/?$%ڝS1C;-#_C;Ik--<"AzϠ[*ilrW ?u@.d%MnM6H1Qsa-uyyA4lB5n~awZ=oݟKXJupdd0ptJbfֽcZL00@" _N+kMvyxEr/u󘕳W3`"[ul ]E1<[Xae?v YY)M[hsa+5r7\s)wFG#ϱ:՘QϱLڏݛoj%.&MRiC/pbXZ7’Bh-XA,[ x-%6ZS^?[Z8r\a`Nž*|Iā+cgµ|oS)!(D^DŽ . AH/8\Aj\ pe`!cjȯ%ͼNNFK<j9g* ,&JJp%~$n-f7Ӷ'!1/]DǞ ovkX┹D54FjfMg-+;@^^ K][ٟS藥*^=t/NbSH}-KdR]5sHoIdXUSypQ"YAl2pxtgu,H5oby7<2kdC?HnB*Ac\p\Z[J{g/|q߸IOR#ĸ mZV< ʖ+q8"m(_սĸJ@F}!mycH:cxR-=my> b!!#*NަW/ E#w2~p5,^+Iވ`VXL $V}k>$ cQ+kac5$lف!ݗǀJ<]l\kN)@oih$2jθ^e;lfʗ(qDжkZyI޵suI0Gi$|~ s@7I]#LXNB .m ~O28&i-x:d' i?گW19̺ l&ΫG>U":d[jXѷq |rJZXkrsq{Tv\^w)g9  pDc5 >;)}m{zNZD* vM}7hNe56Ag֟N3\>Eųz etQ&jO!oڛ\“%H6$5qv z̋y[w$#?0gS$&RiqDdKA֧oK @A_O[9T烦RUbEI4 Lg$ζQl3kr4jqTM J%~4K?"Xt©S|>CN*#ʰcmQ&a S).3qAO}K /ģoRXՀ詼LuBL)ndLh *q҅gv4#`Um*ScMWX+" #tU@nH* >KٌXZRAd];0ڻ^Nv_6!%.;h(鳸LŦ&ZcPb(<Ԇ$#zJ0HWe #Lh~oKThUOB@8pj!GWAeBZRY!L}H}}jW3z8h"wDR> 6x7Y34 |LE޹CY+IƠW iA(LhO^M*5og09۟ zqNU06n^3, )YbUr?&X^2g>*K~I0 \\IR `pi3TjE6ѹv([Ƿ'tg#twSZtm -ɨXCZ^Q)O -KG[ eKj8 _B"4OhPvw-FžoDE A] ~\ag9ӱJ%#~Ys9 Zd1vump?SVuc&Q!."7dq|P_4US`Pi49rƯ(s&݀foy'T7ބ{,ߛ*ܽ383`d?ވJĦ3AtN]}^icu1x\7n9mΥOD)wi}fx-LÍ+a"OZ=\c>8rt\ݒ9^483؂JYӜ0[_&lP/éQ4qpWG8徾Ḭ˛I#uup< ˺n8aX^)v7'Tz-D![3JBCL9>'[!` P"m'5˸>˙=-ok` ]Uzƍdϗ ү"뽝kI1OC+>Uʡ1/`B^,[O;\P/[^")%f@*fsBkHIdxhO;UJ<XU\@BL>{iXm^:h9yXa!,_WE̻'|i$#.EY_״"V?~UE3uLI e]L#v3G}ݐ^[4%||;0?L#_oy*>BۀwWC=M q0JBHH"Z'? tim'Mtۼ^Teh•0|m0%sHZis{P{&&vB.![SL˥[2cv1g+Gg7p2Zsʆq\ R92 l7 WJLM8 tP!8kWeؘ8G6 ?IAvx6#R j ux:wFX6+MdQfiUux y{`YK 4@$?4:>Oa5$BFb0u/+L1N6ɋo>m_sͤ34&6=lxDK8\bw#\gOזxJPwDs XUb"h\>)#--YԠ޹dXtӌ3 VJǗS]Q`5J%+9w2֥Ck0 DIIڎW-E7?MfAL~Rݩh `47pJpzDž6 쵴-=Qa&UUwyBǀq (v(ö^on &͎jǻ׵ĢHB(r~G~v! 4&ɧR%U|ys#9dO@9>YPkݑITGU=^+gv[7VbK.$˲re7>2| fn:}l?BrøA s njzjsuI/^EwRX:W N7&/Y%- /'D'pD cQ&_ܖpu<8{)̵%5rXU숔l5;ׄ Jt7&3弁K]QZNܣE=ͳO}$hP V\ab#]Üj Kba)> Mp9R>s[y٧IY=OMNlߥitPT,ѥLnv׹0v}0":u[¾UjN`ΠQGטLr~JIܯp>4>,j7zrTRFcb!/y/*ȕob'Z!s溝U4太9a( S~¶B>/k[|QL&^PTw eؘ-}7jůsVڹ0E'iCOh@(}Gr]5[fAsIח]ӦnQs|{,O7cZM#tDaW1 xgfSk/W H`r-Ȝ9Qv[C r M(݆Hm) c5729pEG^aVPzPm}!+Ҟ+"~:]rg m%>ҭ8g{\`; s' ~ B29\8ݑі1ဖZdHi[?zX>D`pn[:+yY^ .a{L몣.whNW{g۸9)bKoQJe/P f?ODk|oWmhDZ?("MfVqOGW=Ő¼\=٧K6{Z"j3Y/bSL6 BpIBy _18ṫ}UQs ")kB$9<9ߩ`! ;:/2ŸbM2ijo h`"R.̌$~AX%Mj] 6Z=.O<*游fDJ :Q>OlțOoC4.LA%ք""N"z 9f8|ɤ?D5"6z75$Z=yJSMpcnExzuv'eQ 5^z!La(iX.X=IE.(o/~ͶiD:ީDcblֳJH· &hOs}Ww;!A$*~86Q"qhb\9qj+,ob@|#cg8:8hZ=.|`%>m4ʩ}َ=S3H&{ -G,0^(2,@fKrkazI_N'XC;h-^V֢eS2OŻ­5B(? ǹQ_BߑZТ6r5wsL*}78=d¾`'Q#H7G`爤ϴ+Q"ݲ ;3"v4i[x'r=B{_ٓ;:iA$.5)qD踗Ci#';w#Պ\nlV ONqh')>5֪J`|"2Qr1#G0IJBJEug4p)3V|Y|U12_O?3= 먗4# *W =\b6&~ȞaIɭljhy4XN&(]Oӑ\=y .QMKeŋ ߃9_u{:Gif,_/Sئca./FH'̾ڑbT.`llX:$+H//dhk9ށ:ēgT1u6T Aw]ɯ!5?ɣ~ ѭK^J;u xUx3-_FXAsVSa 3Z|JI #L%%R1 7v2~~@d;of {+6?"l4W`00GkT:sTmEV^ G2x?UķNVV'64>K~yxkxM.=Qp_ҒZCvP $sFeߨ6!g e4^X-x露\&%Z8 ory&Heh,xșER4)I1E%܆9 P[Ow[͈(d &oOp7zߝ| vC U}Z(U_2KH2E/g<+a&`{J[RمLµؘVO?f}B tA:yQLJTX{ ؏CipmKA2e,2uƓpثМ5$FX13qݙ=UUC@A)/]Κu`|dzk#Țs '|%5n.ږ?F`H3-t[ 4 %Jj^v吟yD_5պOV/ yJTAɓjWT!ՄN2.cH7Ȅ툦)uZrVAޖehB~L߬>j3TƍV:zd6\ RmF3_\33@NN~6bߝQ~(]򬧥qg7dսX:e*WLs?Pڿu6w&~͊,j =9igt@Iu0j${R#r6q@_Cȹ OM?lZ&ݼOeJ G+ޭ#=\D|Ǒ<@DY^,)H g0-PQ`==wom΁Ps0˴M${=֭X5ᰃ;Sک趋fșp,a\|W*55U{55ĀI%kf 毰U;&Ѵ1Ia!K~t((|^JtGH̉7 P#d\gd D4qo#V"FjDKʖDK IGsx.%%iar$S5Ǔ*pTnr64pp\Ï.v .IYz_9HcKHYȂkܞ^d|U=V >S {:|PS-R|³ksWԀtu[ }xbγHAB>oM 5:uĖLJ<v+*4;JF40KCﳒݞ4|H[zrA<{px>PUk_,m$@^0H~-O 굳 s>5f''@2~s8 㠂ntJu޹B>pFVp"6N\%91O6 XfXC!SKi)ft!V VR{-:\1=E,*Rhv;8㼘LaNgg1YRt>,\ )JTjv*~Тs޽36:SPUaiߦ~X:[ф|Umw[mܻϥr,hh~*u'u0_k$(Bg iv dOEӢr^{LTxMi6˾#(I-d做DE_ bU"V.8ӣ`'uѯn ]2 \޸OtR*cp/5Y;vr]]`W.K~FL~mo`t 2nUn38J>Eo{Tm1P;"50&J)w.$ʺ}9/"{EGl{$M%%bngH Qm;T|ԷS^IQ7ެ&lL`@ ܜt[Xxt\ˍ" )AW0WgP;_uM@IHƋ||y骴aMœ_cjz&N~ʡǸItkУ̅g}j849:?9Q/9sI TI ä*0UIvqx&E],_fONzmj@4 #Ҫ-$<) Tq㨪s-u Y1Hm3\NV3R3cH?PO}XyPy ^W"h ̝ &&F|[bUakp-t<*CA(T4d܋[QX~MǃkNbr._c@G{Cݧ%WvnԪykO=` K-B.*TB+CBg}ffi!7Rm%%1Ά endstream endobj 311 0 obj << /Length1 1464 /Length2 7166 /Length3 0 /Length 8164 /Filter /FlateDecode >> stream xڍT6Liґ"{ ]iRBJIҋt&RHDAI^7Qzoz a qA#! uP]r=P @]~(6!T0M?U8 w?@z!;_cy #0̵ @R@; u}7/ ~nH70= 1(m P GS ˾>h'k!.>?#'ńBB :>@H_^-{]Ky\?_Bt_UDߎ=\\~ǹW5s=0*E^kP_Յ=\7 @ՠpf4H/?1PǿX\ u\g@A_Swv"~MXLA >DkF UiMfN^G~Pے ǒ Bc ]P_u?(/S (m] u߼oEyàDHLSUXqr(in>ɒٓ=qᕘ9EWuΦ goci'jm Pʝ~Gxh75ۈ hvz+,dkr+?*sܺU)@s㶤 NXRfc`vbߏd&na>l|:~|Z8@f8ıԢv w~CX0-HaNr݇ %7Mɻ;IS<bnB_O(|{Z{ґ =z/Ws]xư#AYYO;JW3[u7S6$RCsiU`=Pّlxu mZNjFjLt Ǹ).|9˷疢)Tnˌ*8_<; ^s[[s_qSi!W;782++kƄ)߲RVJ =eW.U;! p'CIƒC(lCQ.@wo8a,U.0\u+!ږ C*əny}Ļ+* oު@8E8~" 3GX(&*-ť۷ n+ZW%HRfLOu' &N2JxzSæ`"]T #Y?ݬ\cguvNSczeL@~PEra/E``9@3/3 aK,Zdy.(\LFn(h{AQb>#̲,F&"SZm ؕʏbb؂1,f(}ʆy$`p.+|tL˽8n2#jSi⯶CcV(F-[\`B^֐y7k9V.sXki겐r,F|s*4j/Ў Ri^FK34O/}q"L"]%dFD c&rN!׬>(׀Kr4,4iktN\շ)MW վm^je+q`BBȊ_׷ I[:9;b Gc8d䤣&*-Oʌ$r/F<ĉqdh' mtEƦNJ&W/p*](xJh4"~? *-vfUOF}~3<94DUR9#xL%J=Z=s yizmR^~\$s՚y_HENwja6 Cbv!.u<;A8Ȫ~Ͼlp&iDu:MMKA-F,;cO䠼sOD M3lzl9mެ![! eqS%GB^|NX.`uIbZ$<ւgqSc0 ֟rtdv}ur;MQ qS)2PkaW1sǍg5:׽$A+̛SJeKlOKzL(D7)cD+o:E;op]+Z /pXtUs>Hgp 3;Ӿ5 Y: ]pSQkC<$R.c(sV=0>*̠uƐHc!ȢzpۆHTL S=+gy2WZ$Z<#BF:I[ҿlu7AyApG{7{F~*SuOTWݶ=ޓ[j!Sė_ t; ymeP%\}?d|Rej\F ,8$T.ͰTs^S|Nyd^A?Iǫ9g8kt!D t[pgBK}h[Kɓ}Zʒo=44ᬩgj2\Om6wէ22(7Ȫ/ՍF+#:m4NU'E~@;R8@};&elV8gwCC?u&mn ufɁ+JE#w"$:Wz:0kw{/nnHYsY.fF;m"#'qk75YTcOY8Ȣ0tV<~7 'Ǎ#S!22IK8e ~ E>(vtW ڔ=évlNF_dæNR?Z$V+T'}z+heZkJ W~CK 5/q/cn `.z̉VT'_lj=ncgϢx+j=89-%Ģ7gYn425G+gbŎpI3p~9L\8{`-LlV&(`γb /;I ]~kƮ\iW*m̄Ur{s q/7<.P_/F+5^%d;/kTф>31Yt#dM{++עOMDoi.kd|[q0Oco='OH`v7¼~f$< /%`y!~E -&hži_g[/}"9+4ß}םukTsTfݑ' hOEōSWez~cT?* 8d4tSMFj|L2_֎Qk0e&ؤ {ɚ^{{gJ )Rsn5X)}B>`PM>yĴ2,ĸKb/.McI*#4>GPk-JH_S=]f?>N=:NI|w4"$؟fO5٬:.{ߕ]nq7|M^u ,s?tmghV=0)Z -țE_->4tsMsξU8 >pN1_ {2>,NdhVyN$Q!D4uM i>51ewo>qnj6VFc8UZd2,EѾڡq@VŘ/QF'Gu?8O(UР쌌[HMmx肋R[waڣ[l6Qm\o<^D*mkW1f!E}ÿ;×"_(+SP>.|Rno_D.UVHҧ̲ؠCٷv|5~ B4.YS\ǸCB*9<x{ʽ;4ٹ!#ƞD1L™vz[b?xhX&`o7&w=9:tEMA|ֱqQ )4U\omӿvO_&N,=c(d53҆8Vdfo5K;YqzY il8Tx'zV^(tũ=R j;ēYϲ-7c!dM9IQZ7#.Kyٮh[{w@~p1=dScuvД |taZ{Mq\{ tj:3H0Uua-ICQp{clmYǕ̑γBtDuk89ܧrXMʌ{Ec ~8T'1JyQ"!%ciRr#oZp-%ת`2|7(³Ț]*i7e^Gq7Ox.56=hJCf+C)/nvCE 5A$Xn[M/ƥԞl6Ml=i<(.x!B =W\D9v~L U1l͖7+4x)H P{1S+8ŅiO~uY??d թ״EuO+2[&&kw/{Q8R/|5L p}q?~F̷JdSgvm$* ܖ&D.=Hk: 'pl^w'U$Hم/s~i~ yZ1S׈0trzw"(99<9w~ ##xʦnWn:7nu |,WMozsQ*fUjƧ[Ur3v:ZYpeDV uKnNbG(ػ+.y"!LwC H.|D- &pθRZki輙wՂ7GLmRCb4:ͪ")\pq6ʋi%#ݪ:s(}WW_#Hks˙0MSyES׵jHB#1,lNxNeu Xu@\NB]k M=YFxᤣp-݇xO3W䈱ZrhE3ޛ8lT(ey:fO4CSJ NƲuYE1mQ#k_&ڂItwsD`sRT[ ν9xq6u7et͐!g I]^L:oI HdDy2.1eekՎHu8R5kYnlN?q c+qx;n;sXES]زGj{| [n0'Bb]or7bcG)ET#;OQx,j^$.S<ݤŇvK%|x*!v_T줓ъ.lG>xY'¿}H)yaBI5 dWM]mU_+8{V 6uQ0$KhKl %v{C$;kFϏ:$<|^leǟjύf a_! |W,N(w }9C[E:pگWк7 pueyxmI޸CvBw)> 4l?'(O kOB1ՔA#SK/F}2Eu1K;0}LԹNg%Xʞ0@> stream xڌP cAww .IpA;w nNggbi{{uXA^(nm@URbf01201+9X+#WٛY[B; eRfV33'v<Q}'3#,@ hG.bmjgfbO!5/w%P ` 1[ ̀ٙAҞ΄l` 9K[.djfXA.VF@;()ZX?t`f`'޿Yohhmiojfe06e\VF -ANf R }*>{C;3{{352j%w~fv@CP]>\s+kg+"c3+#e90*[:%ELv&&&.Vt14eMjK[ ` *if };v@O+13 @3+?Ab0\Lc0K4aFV:bF!ϲh. @ `gpps<7YSdA]oNOA Xr֠;!aߌ:7#qG T1[Ym[GZ6@#3Gt킐?47s)95+3 3+@t24K2YZ^2v+hX̠m45F+k T'r?(qE . `q2 f`% b@]@ SJ]A v?ħA-m@;b =h8L P ́9?l? P vP0Ck [bi'')VPFvedRz[GfqfӿbV[;db'"Ho) (w?Nh/ _AЙ 'cP,~o= dCeھAMԠlAl@U9U6濥{l2@cfu_ق^3m"U@# 3H~2'otW;A'?-MڟDAw7_ _˄ݿ:AP3@PPA nrEr' @v2z]pKֆ_k>8!UMw_@]nK(aeu[Vp&%sdzn엩VIc>X|z%=[_sf)l[G.$\{^ ھѠ{%Z9s8$i.\go͠eMIyDklD=̹)wj@ޢNQ $Ja/D\df[# >єDf[%leUAMUԳ9FZ5vvȕo* 5usu\U?>>t4ނ\m^} ĐƿVLVufeEGؙ#L^B 2 z^j'Pm5:l%wV +iΨF23~ED$Z#WS 208άj{RA V1.ݒ'#TRښ㏍aPMIara4jr>kSdGJwB؁ND117n ,O'~0"]\:ç8]dKs'nI+yTcKcue2&zWBT3v,~e5P=vgJv+xd'5H~˦9=$QUD]cw&N ^#vSI/BP?g!--~R` 6^X⭝d%u|XKb RZ}W@ cZ4LkVY>ckڭ؂c)QwVaɶ͜0Esq6lLRf.5O$ffi:LN '.K0I k2H|Y" 3 #("e_SG{(OGcC?9ܣ%uf,KKgnАwL- k }|*dDl6OyuR.9Aloq-Wq*{D[tf}6Ƃ&-*N})fCFkp1?uʖ#qCV_Y~Ab83#D>ߏdIju6D ]<w<]RQGՌ 9*)(D.G>lղLwyމYDĞ@w%idA/Q&Ĕ!˻O'o>O𲯥)X f, fEǪGWUuT}KZHN% 7&1B*U=:`W>\zoJ9A sXf>PtE(b$iKzS:z`uT 3Ж)Ҕ[3Yb rt<a,_96/t")}VnűlM{ݩȺz/)r]P++/J宆Ő26p&JBmV+4Q%6qn{;+0ՁFPR, iE1*'[moϯ5eM:GRu_SeE'(NpxDNƤs;\>ѣ;m=ϲFg/nz{4cF1P0  !4Lٶg>->rʰ}DȽ> vL6O\0 2}?ڢWOj0?x, JH16|l ]KYۈ@ {ǹ^)cNm|x?$E:YʓQnʽKY[$k~Vb1{C׶}/oe+1Z1#ݘhLSqPX@Y)0Ś;_X\3šG)Ei$A}dB :HWO8O z`J(!0'Sƙ#R T-"'9Br-#qJ SMu#"vw o!Yw{j 6j?l$~0'LQ J$$PWBo_{ᵋXd|z8.L(F1Q%c0_84C1`^' l|U[gkɱBV 9(9~>nYxIP/V7RD8<2Q +ܟzXzGAPe3ׄ烄IHxA8_Y \+?εx[>!YoBWCcd3 l`n6'`9l` 7O%)v5Rc`uSGȯzO8nb a`\ V(:$OѢquQ"2+)U'; ԣ.Yyu֑ M{R\B"A8 TKxy/v7eG,Fhan<Bu%)l8p/>`so6bY5ON?W"?txTx>w-rMZɀ>򁨾+$9[k/HN5|*%[ܼU׎x{u0@Ib9p Kuˑ4Nh#R)f,tٺAt4]HƆ3Yvu?*L,"ϓb+'>*^j °}{]?bj 5W_s3 ?CdH|Е\oM`tсrs7qA4cSj}j>`aN1x2 ;nA}{~_E81ے3JM_Jf NN|)ĒjSeaOؒ{7 ֛i광7y4^-:V[~*L %%2n4)+ƠdP:않[w ]!I4V^d&buTRjEfC'!qN#{AWؙ<0N4$|Єɮ [_$%Gέ!kŅ=bvU[Qra=WcQ2c҇)q5E3e8Hr |I>w7g|WKy[4No]>ȅ?H(֢TĝЌJNE~+7=x1t~_ cKkNi2gR R8OuIiM;/ D0 f|wJqy1z!!S-F)h1) /fe#cO};}WF\ /˯fƶ7'`^TɎQ qԒ5BhQ ;tE-5uv2'uWRW;Gb\wUup#v~pwbC6`*@ !lwA ^X]% zK)GI?yB2{u4ʅmBZrB 7u:SCِ#60E0iZAi2^{G] ڵ9OkOR ޺ZJCh魟]ꁢ#q$MdƍI˩램FvBY)h^Y.桇 pfMPq r:ُH8jx4ta#Ԝ"4 [^O܁=lzźoRYT4,YyHA^hف*/sNj9xy@Iꮇ\$Y$DQ؁Í'Kg!jg<#(L3VLj/hJi^ǏY:9q <}Ref9Q(!x&f/@uc^rDHJ%zJc P~[}:1sSKxG*-Vctc_ AەNy<"Z>PQk`ч}R)ՄT8^&ـh:$,17t U+ד 4-J-CnQï#ϡ.éJR>4զuQ`G1k`ܣv?q}I R\'KsTNڪ8ӳ׶ 3Qo5h g)3Gcw@*T'{rZJnvgl K^݁xgƱMm;n*"wWCDH,ӍKV$LթUv)9n.LE?TCqi/&E\BKS$Njq^G*䑤7?MO@*#<Ԕv]ξSҳQČ&!j۔)a>wb" {F,\z9}%mCK'u%EYm8Z_>A ױc $ !moƀ-8*l_~<_,Ӓ!J) :0X]0z?ivJBqbtR[xtd{1!FքUIkUW`YyvyiO LyG >vMM0 <^z~mn4Si'=6|@ODo5M'fX܊,P)('#GOepM.@ .~VyR,F3uArnFrY\ipuIF?š3/9m0Oۯ0= ɀ(nҾ4KL)P7 gMKsדpi5SJ)0 Efbb!9VVBFCBg}T; 8|Z e]Gc4.LQ@H$S.j9G>Lhe\w+uǖ]>d¹ΪuMjQ2b1Y<2|޷yc3 _cs%'F}YcF2P:VQ͊l3鱄*:PA c9kp'2!~MrT?s$]$ηwbĝZX|Y~QjWJKGJ0'd4H`/YJQL NqmOο-C76B[KmƜ0#,-LL^E?yjy-`hAaO)d3T(Rg%/^T9JN+_/DU|Jū[ 1Us{?v˟զ}CyY;na>7{Ά tg7*)1zєf u\2&kFSdΧJMv5|M33rpxQɹS> A~vx,ុCk2U ILWge<W๿D%Kȧ2;šE|OL#`y{ZegG?Pt O}@uRQy8GrX@g,=fa>[bT1ZXIc%X9"sֽu4w0ǖjcJCЯ춓c+ s0Q :ks*@H!Xr[X8jU#w v;]_;U/OՏ֯ hѯKu&_GePY +;s-o$eׁީc\~v*'HX)\DRVg3,񾊐nYK`y6$ᤂQ66مVCW~'=AOӐu# YXc> :V/>8wuppXGJb-re)Rw<*]J}W&VӓkF_1x 7zzoF/%|E1Ɉy9%?{&$GU!3l"M Wᝌȕ/+.4̉Vp~eOeLjx2pC8>V9L*qs%[?ֹec]g~=re+0)b&GKxq!mbYl()4V a^w:<~l:ng@a>KKJ-Js2bLOrBc[+Kb9"RgOD ?=;zUbEOʂnf(R*u?}5x<r߻ro5LJ0إIT\b?yT6hPj&(0>vyQw!dgQ ⫻޴ؑZ0\c$5QJmTO]Kb(La~W+P%_RqC.T|Ю ;pGA?aWV@,U"l\q{fvF¬q4*o cI{d .2sM?MY S*)_۰v?F7ZZɆLޚBlI)B"A~ӱ5Y?b.ĄW=1i]י^{vM2~ʄqTr (>c19:2^1,}%y Q33'f{_2,*%qw`g.< J\ál(۳:IET`5lEķI.|{^^m=Ҟ;a O?!܉0"ZNX\-oi{Rq)6sI΄W@]>@H=)D*Ǻ1?? *DG2ܠ G _^~W*Q0,Z%ܙ8KVO-6`[墳)#K{n枽/DJ"rlQ௦^*cƬ|/5fYHC!ʬ+Q``I֞)/މ mBFVzi5&OuˉGYCCh{s %4?oe<퐭6X"ĠyI.кQq5D䓆35\ X >b|_9&.ᾓ\t_{s;{BL #D jEYV*+B$?INc+|vQwR8jcчM6 *;<]*onyZCws ݙZE,)66Ssѽn'Z/Sdy,C!=^ l{`텳Q=w n!7B8>8&Q>UIþnKZk&~ Fqr,ul(5۟ʶbf!)T"Iә,[!g,mXNzCN%8;pcA7%=m?2MړrG/3*{T +6"3= L\UD%޼" IDrUcwEQwꐮle8hߖ'}ᗉIBJiK=NY9q]!  [% =iK$t,k; |V38VH:QwȂ-STHF=J''eB٫.N-!Y Tby'yiY189%H8:4Vtn2w$/JBQf/do3kqtc@tgΔ)I0&po [=ߢ0_9ᛤ'ѧ4l,yPT%t]o;B^n+)ND>P uK:Q^;սFDy;eK [)pB*cqmöFw%\( ?1Ӑsl10Vʶ!tzPȠʥ$ݹ@Ā`$\N"Աa]Anۡ羰YgfȵE|8$ƓtFgϕ}4Yu&J-CN}%<-QzA*Lg[];/ާs$OzXaŝNncL qk$<Z]r_cHaz0:b:hX%z:<^#P+*1^AnfrGz؉CX 4Se0bRre DXPSܶ|uPf^*ҕtGp9Ԥ~C*¿5 m3֜g=8LN^?d;90;fDB]څÜ30k; :%aDQ^99^ rN?  ,RŴ`Ê[8}o * Cicg|$=5DP5 F6]2o룒?S7ᕾiZ ?d\6C_\^jNǀ\Q249_uف* 5C\5˒. u7lB1]4g/m=flMX8?P#O( ;$gD8\d5`1W #uP,Ҙ6Ї1\\\xnQ&Щf ;iv =] +]yp\/,>5g~4t]s-"8-R |%z/t1JH{_p"F1M'3x#_7lȭw#Gpj ){aX7lѕg_*+x]o\yj m] svvHӟ0)_CW qoەЍ7JMsH:1>dܧ<^dfi|3CMȊ+ Whn\*t*Nڕg/B|6Y%1N\Uhc(vcsK[Z.F{Q;S*4ʦ$q򕬘שQʷM}\3<&@~JSTcO#ן >*/3˧0 .%nM_˥;ְ.Gm#2ĀE|t*t_r|\ސ0 Q84xRBQA?>:<) {,M[h9sf${DEfH)0JH7{R|,:P/oY7w4 wbo"w50饏jXioJUጎ{2<ݫɮ 1,26f6 ;`/7I&XNtRg,UYC9'QȏRR9$#'+t \u/,C?>&߼G:°! -u:fyi7JНev.xn0ƀ̪ng2#Y H dUg,d7snpS~nSg ʩ@M|ZYTn_qj 9X -I z)smR>!yWk5;A:#xzwD}0T6{d#[b",8ls&y-hD>D oॣ,񽙥+mLG).Qδp1y1_<-=Sz<^$յf,Z}xe^zqԿ]=ݍޮ?V!lؔ2U{~Yg}\SD^?+uG樠)LJ2=׌ &LGGgwmIEQ"!0afRL?{QZO=*贩PN?peӕ֭Rw۞ I?1NQ4!7f]Zq}!/‚()ߢSq.eܼ~fQo->@ P尥r)yqEe`"BR`}6* "q nk u5g\2o$!ëmGS۟[//`.&-5ivѲp5h(K( bEw14\*5ph/[Qr|9JѽLY[r測r5NuX9&dd xςrnPbeF8oN ;`4)Ex"REP'0W]%v\1kN!},[`(K:O2֩u(zftJ&1g"W&o2c0+D#>$i{{}1xNƲ mO>oO=v-Va?yxulZr/<7u龞bp;<2c5cgQR,SSh=cXAgWԘ<+=\0k"5%|c51t^G xU:岏PG-f,JŒUB*<|B&/i85 yf&k.\װDM4W_!W|c0}V?u!h|JKLqw XE]''gԘsU i M=[]efU Xr Tt_7Pvmvs;&!P^&4=!%8x[Sd?ӓ %  Iv9kDFHps83azå &OƖ*֑幏dbңQU-""NN ;'P(ţz~FKĿлJkևa?B"ӝUM@%eGrpWO&̡.c/iL&7F!gD$)`wƖXpAH vkl#AzZ# P@lݦxqM1G1Ufʭ\Ct6yUfYۈ4Z~+)q36 ;o'5s<%OsR(ү$OF#\]rW,&b:~u]EL$.[H'Bk\cYuVprg?ZpKS6D6g.,0lsKܡ|&vpOKH䓥/rWx)#S:R@p/.&KN~|%oذǹZ[z55vŦĢÂ|(2ߤZF*&qG5ndvu^ Gd^Y-= 4شA'zTNb֧$\Gqkq-|[M5jŐ8Z yeձ]cWWtk>]%S9d95l3ȾэA.\S0 @T \$x'X&hW'J WH# SSaCW͜ #$v{< G葓" zxAό u3Eh T">&4ܳUOW}G‡H iTy.2yB]yvfyl͓AȞZd| O_S@|]H,(R2WԺذ: /}萎j!aZ8f݋wUy@7~MS BW7?)n_!=lEma[įW?3Sy~%W0ΒC}z t,P:D ܬ\a&:djZ^_U(gܳ* ]roO?A%r{ǃ+_EFR(A!wL#(`-$'Bur#hGSTnwCF(g-8jWɂجQ6(zH e! pA C(o,(CJvе9w h22[sM}1; K51x򬂠9fZR;ꗋ/WfվR+ڱ qBkPmP¹`þa>=;q'IDHU zas*#bCVYt GewxLch ZǕ?z'j~+.ߞ5)~ǨŵMdjLJEA902\ %>NnrIv?ڹol ]*.HM$l8W.oiRJ>1P_0DT0Y]t',0eXzS.ޟV їl9.GÉH+ Dck#iEۻdW֊ eS;ajA׃@S#^,&rXdD&>s?\WHbghhc~@(+ܰ<A#M}U 9[8/$3; okin)IaR `<̏^#gH:d4+uxMJ.u_jvuP !pD6WbFIe[ܜ2ݾo`k6քsww߮W ô_q6᷄yRO67e*h13!zgŠ_6s 2 q~;u>[X-&)2\z6?*sL0XYb\z'f%L4~}}h#*6s83_$ WM=AnL 85)=s!sa$o͝#"˦@R.6a2Y,a XRf)1k:LCrj賨@Q8O!c endstream endobj 333 0 obj << /Author()/Title()/Subject()/Creator(LaTeX with hyperref package)/Producer(pdfTeX-1.40.16)/Keywords() /CreationDate (D:20171030183838-04'00') /ModDate (D:20171030183838-04'00') /Trapped /False /PTEX.Fullbanner (This is pdfTeX, Version 3.14159265-2.6-1.40.16 (TeX Live 2015/Debian) kpathsea version 6.2.1) >> endobj 281 0 obj << /Type /ObjStm /N 84 /First 763 /Length 3762 /Filter /FlateDecode >> stream x[YsF~S[0LU*UZ˲-)>%7pc @6Te zzYI/3-%U>&̈́/p2 i RȡȔ a$T(O6)>ΪLEf$R @Xa3+-R|f5hWȬvlp(:s"L(XNy2 ]ÄD'l̼ J k`3o <ld+\.2 ,! Rr'` LiX( ڏ$@gk0@A:?\-!NCI!HY gj@bف442B!&A0 t|jpD )@ `'!*Il%Vpa w mUVIt( è10>Nc7jHQV"^-rxᑢ} 3N}L/Y!zZ Wra,*\}Yq^/+܉V|Gm2zeLB!AYKceD\ 5%ƕ57shkCWt8`p7QPkm %;ؑX#\ p  Nz%@Rjqj ]#L;k0@*Ǥxǝ[4U!U+i,k\YW?WPad GǠǎj )OxHĀC"AREE IPNJሦ>1cFq"5-]TG(fN ɓXUmx|.3АDzJQY:?t$h >:aA,+fr):Jӹļ>Æ~r Wk4`輵Xt" 㧅 HQ: ~V?K1mHXF2NGVLe##´CQ &q@Yv|%イx{1y2^및o!^l [`C{-X 340(`DT8 #FA9ŶJ*%B=F=)GF[Ep;'.#L EA;44\4/  n~&D$Ox%/ EkI @72ZYhT Uk fإr:(RпBVJګyVJ%kȥu p{<G3< j5UA@sy̱<+nng X{;IW,a9ѼZTt ڢO{5ֆXsŇ_Ш5EMX`i1-Nz +cRs»'Xn 4JHqu 4Dt ٴ(kԔ"e{)%J:P|'e:`LqNFkSģB|W=]q~(l\^-&bDϧ;yu;Qt|4s<~<}36HA T8WlxY9GoG9=ǫtru0?_wo(q2nNjr=$M>/j~w7ί2||o|'-g<>/uȗ\U]ec?G0 觟 ߼=~ɀ&2XV-=ȞhG1wChi/t>~O[NllvxqKv'Nz< I4m[މP SUE9V#B>h8ufBFmMN]SN_󓋋J[Z;!Zc흔SŽ[ʸm<Ϯ77Ȑ"\J`4-UM1Ol嗫s,1hܬrll}w Nnfog j(TUtd|L4"&y¾%ʀÇUۓOГێހ\%UЙ(o;T`JeۿӋOΞí2UK+T?뿺e j%-j:M٪:TN?f4[ :swp_cZs6JÚ`|=NokeE^_ęVndG(2?,+Md\n?Q}u^oz)nw/R]l/[w7q[%}ۭy[;~ej?Nվy'ȻF{x@mv]6rOIv^#ߞy> ]{j̼J?<@W2H7]ж{vozSMtƓy <ϸH2 1J]x|Ea,=>Dxʤ`Yx|F|Ma˰2ys"h#z5bÞcO+'R=~ZdձU]]Ƣ(ӈ-QmGѷXTszsm"Ko Jk\mS m+ MK} ocRC}-+En_|^cAvz2jkh[v[i<ж5nS1=ymk"&@okc8؞8:fKL۳L 釘VTuV/t kLڶ)n^mRu_-rRu_u*MOިbՆ8 T?Wn"2-]a0TIqB֘Jee T֐ ?,Fg!W ;2s_ZZ_ٮASM-|)ZWrqҜ$0=_흯Ƌ.NţUy/LBH A!% bL&ʊc,[XΐDFm4Y/J"$UZ *9kfpV6 DI.٣{h8SBX'iޚm6 om*)W2%ǡZkҥbt f@Vj~0Cm6MƜsEBs)GFeyghk(L6ז~אk&xY&OM?/ѝp8^ICjnсߘ: a ] /Length 817 /Filter /FlateDecode >> stream x;lOaj^u*VRTb E`5$ Fb#1*#߳<9w53%PmsdAuR utE2 Y* d!vTȶ|p  1pE,"&ی-Z` Y%&l (%%kr:إ`Y=l$k [dȚ5lԓ'k6d[bWddU`50#jX9gs*rbLf70Ohô7W}r_bP*@(K۔ݲfT'n z@/ 0w+ }nUwdw`?+Qc`[e|L5e[K#Ԕ[,_:683n#/d9Qq48BA#?e|D۹eT>N-p^#Eעt, Ub"t*8 Pvn1S(wWWUjWekx%[e ۉ,] uMekRhtMѡV'Omv*Ň6gdFA-̒`':ݞiuu{{Av :t(^u`e{xslLIpmҵP#|(8hNLIȖIM|O:HxrO~*)YE*un>=}`@_^i@;93]s1)_ endstream endobj startxref 229803 %%EOF IRanges/inst/extdata/0000755000175400017540000000000013175713360015545 5ustar00biocbuildbiocbuildIRanges/inst/extdata/ce2chrM.bed0000644000175400017540000000027113175713360017504 0ustar00biocbuildbiocbuildchrM 13357 13651 trf 2 162.5 2 67 19 173 47 0 0 52 1.00 TA chrM 13436 13585 trf 7 23.9 7 66 28 61 46 0 0 53 1.00 ATTATAA chrM 13406 13658 trf 9 28.7 9 66 15 63 45 0 0 54 0.99 TATTATATT IRanges/inst/extdata/ce2chrM.fa.out0000644000175400017540000000714313175713360020153 0ustar00biocbuildbiocbuild SW perc perc perc query position in query matching repeat position in repeat score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID 210 19.5 0.0 7.2 chrM 433 515 (13279) + T-rich Low_complexity 2 78 (0) 1 28 6.1 0.0 0.0 chrM 543 591 (13203) + AT_rich Low_complexity 1 49 (0) 2 34 2.4 0.0 0.0 chrM 1345 1385 (12409) + AT_rich Low_complexity 1 41 (0) 3 26 8.2 0.0 0.0 chrM 2464 2524 (11270) + AT_rich Low_complexity 1 61 (0) 4 239 21.5 0.8 9.7 chrM 2566 2699 (11095) + T-rich Low_complexity 3 124 (0) 5 27 7.3 0.0 0.0 chrM 3922 3976 (9818) + AT_rich Low_complexity 1 55 (0) 6 37 5.7 0.0 3.3 chrM 4206 4296 (9498) + AT_rich Low_complexity 1 88 (0) 7 22 5.6 0.0 0.0 chrM 4759 4794 (9000) + AT_rich Low_complexity 1 36 (0) 8 28 2.9 0.0 0.0 chrM 5437 5471 (8323) + AT_rich Low_complexity 1 35 (0) 9 228 18.5 1.9 0.0 chrM 5538 5591 (8203) + T-rich Low_complexity 5 59 (0) 10 22 0.0 0.0 0.0 chrM 6584 6605 (7189) + AT_rich Low_complexity 1 22 (0) 11 36 9.7 0.0 0.0 chrM 6699 6811 (6983) + AT_rich Low_complexity 1 113 (0) 12 21 8.9 0.0 0.0 chrM 6978 7033 (6761) + AT_rich Low_complexity 1 56 (0) 13 26 8.8 0.0 0.0 chrM 7521 7588 (6206) + AT_rich Low_complexity 1 68 (0) 14 30 8.3 0.0 0.0 chrM 7794 7865 (5929) + AT_rich Low_complexity 1 72 (0) 15 21 0.0 0.0 0.0 chrM 8054 8074 (5720) + AT_rich Low_complexity 1 21 (0) 16 39 5.0 0.0 0.0 chrM 10561 10620 (3174) + AT_rich Low_complexity 1 60 (0) 17 207 16.8 6.6 4.7 chrM 10656 10761 (3033) + (TTTAA)n Simple_repeat 1 108 (0) 18 32 2.6 0.0 0.0 chrM 10947 10985 (2809) + AT_rich Low_complexity 1 39 (0) 19 25 0.0 0.0 0.0 chrM 11805 11829 (1965) + AT_rich Low_complexity 1 25 (0) 20 21 3.6 0.0 0.0 chrM 12027 12054 (1740) + AT_rich Low_complexity 1 28 (0) 21 24 3.2 0.0 0.0 chrM 12320 12350 (1444) + AT_rich Low_complexity 1 31 (0) 22 27 8.1 0.0 0.0 chrM 12727 12788 (1006) + AT_rich Low_complexity 1 62 (0) 23 22 9.9 0.0 0.0 chrM 12883 12953 (841) + AT_rich Low_complexity 1 71 (0) 24 60 7.2 0.0 1.9 chrM 13006 13161 (633) + AT_rich Low_complexity 1 153 (0) 25 28 9.1 0.0 0.0 chrM 13193 13269 (525) + AT_rich Low_complexity 1 77 (0) 26 468 6.9 9.4 0.0 chrM 13358 13516 (278) + (TA)n Simple_repeat 1 174 (0) 27 344 8.5 8.3 2.4 chrM 13491 13658 (136) + (TTATA)n Simple_repeat 2 179 (0) 28 IRanges/inst/extdata/hg18liftAll.lft0000644000175400017540000004065713175713360020347 0ustar00biocbuildbiocbuild0 1/NT_077402 167280 chr1 247249719 217280 1/NT_077911 40302 chr1 247249719 307582 1/NT_077912 153649 chr1 247249719 511231 1/NT_004350 2112849 chr1 247249719 2674080 1/NT_004321 1161048 chr1 247249719 3895128 1/NT_004547 1440092 chr1 247249719 5385220 1/NT_021937 7590365 chr1 247249719 13025585 1/NT_113791 116914 chr1 247249719 13192499 1/NT_077382 237250 chr1 247249719 13479749 1/NT_004873 3518496 chr1 247249719 17048245 1/NT_004610 12702424 chr1 247249719 29800669 1/NT_032977 73835825 chr1 247249719 103686494 1/NT_113792 157344 chr1 247249719 103893838 1/NT_019273 16604841 chr1 247249719 120548679 1/NT_086586 189539 chr1 247249719 120788218 1/NT_077389 398739 chr1 247249719 141476957 1/NT_113793 195588 chr1 247249719 141722545 1/NT_113794 186739 chr1 247249719 141959284 1/NT_113795 175055 chr1 247249719 142184339 1/NT_113796 201709 chr1 247249719 142436048 1/NT_113797 126477 chr1 247249719 142612525 1/NT_079485 194615 chr1 247249719 142857140 1/NT_079497 78698 chr1 247249719 142985838 1/NT_077932 127263 chr1 247249719 143163101 1/NT_077933 170669 chr1 247249719 143383770 1/NT_113798 38311 chr1 247249719 143522081 1/NT_004434 1022394 chr1 247249719 144594475 1/NT_034398 281532 chr1 247249719 144926007 1/NT_034400 1566655 chr1 247249719 146542662 1/NT_113799 185320 chr1 247249719 146777982 1/NT_079483 172789 chr1 247249719 147000771 1/NT_034401 220313 chr1 247249719 147271084 1/NT_034403 455185 chr1 247249719 147776269 1/NT_004487 56413061 chr1 247249719 204239330 1/NT_086602 259514 chr1 247249719 204548844 1/NT_021877 17265625 chr1 247249719 221864469 1/NT_004559 11394365 chr1 247249719 233308834 1/NT_004836 13665999 chr1 247249719 247024833 1/NT_032968 174886 chr1 247249719 50000 10/NT_077567 5577110 chr10 135374737 5677110 10/NT_077569 12337571 chr10 135374737 18064681 10/NT_008705 20794160 chr10 135374737 38908841 10/NT_024133 286100 chr10 135374737 41674941 10/NT_079540 191752 chr10 135374737 41916693 10/NT_033985 3830277 chr10 135374737 45896970 10/NT_031847 952205 chr10 135374737 46999175 10/NT_077570 263307 chr10 135374737 47412482 10/NT_077571 163231 chr10 135374737 47725713 10/NT_030772 989829 chr10 135374737 48865542 10/NT_017696 1941874 chr10 135374737 50857416 10/NT_035036 211435 chr10 135374737 51118851 10/NT_008583 30112613 chr10 135374737 81241464 10/NT_030059 44617998 chr10 135374737 125909462 10/NT_035040 2696597 chr10 135374737 128656059 10/NT_008818 4615335 chr10 135374737 133281394 10/NT_025835 246123 chr10 135374737 133577517 10/NT_017795 1797220 chr10 135374737 50000 11/NT_035113 1102759 chr11 134452384 1169335 11/NT_009237 49571094 chr11 134452384 50947429 11/NT_035158 503352 chr11 134452384 54450781 11/NT_033903 14395596 chr11 134452384 68848982 11/NT_078088 588129 chr11 134452384 69454899 11/NT_033927 17911127 chr11 134452384 87378026 11/NT_008984 8549206 chr11 134452384 95942794 11/NT_033899 38509590 chr11 134452384 16000 12/NT_009759 7043293 chr12 132349534 7132293 12/NT_009714 27615668 chr12 132349534 36142961 12/NT_029419 38648979 chr12 132349534 75041940 12/NT_019546 32815934 chr12 132349534 107914874 12/NT_009775 13091146 chr12 132349534 121156020 12/NT_009755 10116045 chr12 132349534 131317065 12/NT_024477 972469 chr12 132349534 17918000 13/NT_024524 67740325 chr13 114142980 85708325 13/NT_009952 25443670 chr13 114142980 111551995 13/NT_027140 1821999 chr13 114142980 113473994 13/NT_077627 184056 chr13 114142980 113758050 13/NT_024498 369930 chr13 114142980 18070000 14/NT_026437 88290585 chr14 106368585 18260000 15/NT_037852 1403478 chr15 100338915 19763478 15/NT_077631 334079 chr15 100338915 20197557 15/NT_078094 868737 chr15 100338915 21116294 15/NT_026446 3571299 chr15 100338915 24731593 15/NT_078095 438013 chr15 100338915 25270606 15/NT_010280 1108140 chr15 100338915 26478746 15/NT_078096 417989 chr15 100338915 26996735 15/NT_010194 53619965 chr15 100338915 80676700 15/NT_077661 2098777 chr15 100338915 82835477 15/NT_010274 13510195 chr15 100338915 96367672 15/NT_035325 3971243 chr15 100338915 0 16/NT_037887 8576922 chr16 88827254 8594422 16/NT_010393 25336229 chr16 88827254 34030651 16/NT_024773 1112651 chr16 88827254 44943302 16/NT_010498 42003582 chr16 88827254 86966884 16/NT_010542 1855370 chr16 88827254 0 17/NT_024972 296854 chr17 78774742 343376 17/NT_010718 21163833 chr17 78774742 21607209 17/NT_024862 579924 chr17 78774742 22287133 17/NT_010799 9412828 chr17 78774742 31799961 17/NT_078100 1629269 chr17 78774742 33529230 17/NT_010755 5072491 chr17 78774742 38701721 17/NT_010783 24793602 chr17 78774742 63585323 17/NT_010641 11472733 chr17 78774742 75211056 17/NT_024871 2103126 chr17 78774742 77379182 17/NT_010663 1275560 chr17 78774742 0 18/NT_010859 15400898 chr18 76117153 16764896 18/NT_010966 33548238 chr18 76117153 50360134 18/NT_025028 20074199 chr18 76117153 70462333 18/NT_025004 3388475 chr18 76117153 73872808 18/NT_010879 2244345 chr18 76117153 11000 19/NT_011255 7286004 chr19 63811651 7302004 19/NT_077812 1291194 chr19 63811651 8598198 19/NT_011295 15825424 chr19 63811651 32423622 19/NT_011109 31383029 chr19 63811651 0 2/NT_022327 1254071 chr2 242951149 1255071 2/NT_022221 2252116 chr2 242951149 3557187 2/NT_022139 1426129 chr2 242951149 5083316 2/NT_005334 11088087 chr2 242951149 16221403 2/NT_015926 4791168 chr2 242951149 21037571 2/NT_022184 68373980 chr2 242951149 89561551 2/NT_032994 397279 chr2 242951149 90958830 2/NT_034508 731068 chr2 242951149 94689898 2/NT_026970 2594449 chr2 242951149 97296847 2/NT_022171 12173457 chr2 242951149 109612304 2/NT_034485 736346 chr2 242951149 110498650 2/NT_077407 359898 chr2 242951149 111008548 2/NT_022135 38390280 chr2 242951149 149498828 2/NT_005403 84213157 chr2 242951149 233731985 2/NT_005120 5688986 chr2 242951149 239453971 2/NT_113800 12944 chr2 242951149 239496915 2/NT_022173 952154 chr2 242951149 240474069 2/NT_005416 2277080 chr2 242951149 8000 20/NT_011387 26259569 chr20 62435964 28033230 20/NT_025215 234339 chr20 62435964 29267569 20/NT_028392 5092930 chr20 62435964 34380499 20/NT_011362 26144333 chr20 62435964 60551882 20/NT_035608 71932 chr20 62435964 60733814 20/NT_011333 1702150 chr20 62435964 9719767 21/NT_029490 490233 chr21 46944323 13260000 21/NT_011512 28617429 chr21 46944323 41878628 21/NT_030188 1627105 chr21 46944323 43507092 21/NT_011515 3437231 chr21 46944323 14430000 22/NT_028395 647850 chr22 49691432 15227850 22/NT_011519 3661581 chr22 49691432 18939431 22/NT_011520 23276302 chr22 49691432 42227733 22/NT_011521 830225 chr22 49691432 43107958 22/NT_011523 4248192 chr22 49691432 47366250 22/NT_011525 1384186 chr22 49691432 48767136 22/NT_019197 320440 chr22 49691432 49089176 22/NT_113818 17927 chr22 49691432 49126803 22/NT_011526 464629 chr22 49691432 0 22_h2_hap1/NT_113959 63661 chr22_h2_hap1 63661 35000 3/NT_022517 66080833 chr3 199501827 66375833 3/NT_022459 24211711 chr3 199501827 94987544 3/NT_005612 100530253 chr3 199501827 195537797 3/NT_005535 1299866 chr3 199501827 196864663 3/NT_029928 2582164 chr3 199501827 0 4/NT_037622 1413146 chr4 191273063 1464146 4/NT_006081 2419310 chr4 191273063 3963456 4/NT_006051 4820284 chr4 191273063 8933740 4/NT_006316 22487426 chr4 191273063 31492166 4/NT_022794 976586 chr4 191273063 32527752 4/NT_016297 7445039 chr4 191273063 39992791 4/NT_006238 9040907 chr4 191273063 49183698 4/NT_037645 171176 chr4 191273063 52354874 4/NT_022853 7074452 chr4 191273063 59479326 4/NT_022778 9796115 chr4 191273063 69375441 4/NT_077444 2161413 chr4 191273063 71711854 4/NT_006216 3929449 chr4 191273063 75671303 4/NT_016354 92123751 chr4 191273063 167825054 4/NT_022792 23438009 chr4 191273063 63000 5/NT_006576 46378398 chr5 180857866 49441398 5/NT_006713 42230486 chr5 180857866 91711884 5/NT_023148 5878002 chr5 180857866 97612886 5/NT_034772 41199371 chr5 180857866 138817257 5/NT_029289 16301663 chr5 180857866 155123020 5/NT_023133 25714846 chr5 180857866 0 5_h2_hap1/NT_113801 1146088 chr5_h2_hap1 1794870 1186088 5_h2_hap1/NT_113802 608782 chr5_h2_hap1 1794870 5000 6/NT_034880 9194728 chr6 170899992 9249728 6/NT_007592 48945890 chr6 170899992 58245618 6/NT_033172 642507 chr6 170899992 61938125 6/NT_033948 248423 chr6 170899992 62236548 6/NT_007299 33500716 chr6 170899992 95937264 6/NT_025741 61645385 chr6 170899992 157632649 6/NT_007422 10134273 chr6 170899992 167784922 6/NT_007302 2236975 chr6 170899992 170171897 6/NT_007583 725095 chr6 170899992 0 6_cox_hap1/NT_113891 4731698 chr6_cox_hap1 4731698 0 6_qbl_hap2/NT_113892 475847 chr6_qbl_hap2 4565931 635157 6_qbl_hap2/NT_113893 1999704 chr6_qbl_hap2 4565931 2686011 6_qbl_hap2/NT_113894 291112 chr6_qbl_hap2 4565931 3003554 6_qbl_hap2/NT_113895 266639 chr6_qbl_hap2 4565931 3322986 6_qbl_hap2/NT_113896 623992 chr6_qbl_hap2 4565931 3973953 6_qbl_hap2/NT_113897 591978 chr6_qbl_hap2 4565931 34000 7/NT_029998 293567 chr7 158821424 477567 7/NT_007819 47690382 chr7 158821424 48207949 7/NT_030008 2130176 chr7 158821424 50378125 7/NT_033968 6577293 chr7 158821424 57005418 7/NT_023629 1052855 chr7 158821424 61058273 7/NT_023603 256182 chr7 158821424 61364455 7/NT_077528 190137 chr7 158821424 61604592 7/NT_007758 12749068 chr7 158821424 74603660 7/NT_007933 64426257 chr7 158821424 139054917 7/NT_007914 14846650 chr7 158821424 154001567 7/NT_034885 736332 chr7 158821424 154817899 7/NT_007741 4003525 chr7 158821424 0 8/NT_023736 7462059 chr8 146274826 7562059 8/NT_077531 4537293 chr8 146274826 12199352 8/NT_030737 9464880 chr8 146274826 21681632 8/NT_023666 8051036 chr8 146274826 29798768 8/NT_007995 14159284 chr8 146274826 46958052 8/NT_023678 1291149 chr8 146274826 48309201 8/NT_008183 38454502 chr8 146274826 86851003 8/NT_008046 57155273 chr8 146274826 144106276 8/NT_023684 1290020 chr8 146274826 145403396 8/NT_037704 871430 chr8 146274826 0 9/NT_008413 39653686 chr9 140273252 39703686 9/NT_086745 261110 chr9 140273252 40014796 9/NT_078049 208233 chr9 140273252 40273029 9/NT_113811 142805 chr9 140273252 40465834 9/NT_078041 464507 chr9 140273252 40980341 9/NT_078042 152873 chr9 140273252 41183214 9/NT_113812 172579 chr9 140273252 41405793 9/NT_078043 1198158 chr9 140273252 42653951 9/NT_078055 549743 chr9 140273252 43253694 9/NT_078045 632871 chr9 140273252 43936565 9/NT_079529 680077 chr9 140273252 44666642 9/NT_078077 181647 chr9 140273252 44898289 9/NT_078051 291910 chr9 140273252 45240199 9/NT_078053 465318 chr9 140273252 45755517 9/NT_113813 350909 chr9 140273252 46156426 9/NT_086759 194609 chr9 140273252 46401035 9/NT_086755 498918 chr9 140273252 46949953 9/NT_078078 157546 chr9 140273252 65207499 9/NT_078052 450681 chr9 140273252 65708180 9/NT_113814 223855 chr9 140273252 65982035 9/NT_113815 162441 chr9 140273252 66194476 9/NT_078058 159539 chr9 140273252 66404015 9/NT_078059 199148 chr9 140273252 66653163 9/NT_078065 194491 chr9 140273252 66897654 9/NT_079533 158462 chr9 140273252 67106116 9/NT_078066 471702 chr9 140273252 67627818 9/NT_078067 376183 chr9 140273252 68054001 9/NT_078068 174765 chr9 140273252 68278766 9/NT_078069 289439 chr9 140273252 68618205 9/NT_078070 682157 chr9 140273252 69350362 9/NT_078071 158187 chr9 140273252 69558549 9/NT_113816 187806 chr9 140273252 69796355 9/NT_113817 178933 chr9 140273252 70025288 9/NT_023935 21507948 chr9 140273252 91583236 9/NT_079535 85380 chr9 140273252 91718616 9/NT_008470 40394265 chr9 140273252 132212881 9/NT_035014 3818133 chr9 140273252 136231014 9/NT_019501 2075804 chr9 140273252 138336818 9/NT_024000 1936434 chr9 140273252 0 M/NC_001807 16571 chrM 16571 0 X/NT_086925 34821 chrX 154913754 84821 X/NT_078115 86563 chrX 154913754 201384 X/NT_028413 766173 chrX 154913754 1017557 X/NT_086929 36556 chrX 154913754 1104113 X/NT_086931 80121 chrX 154913754 1274234 X/NT_033330 754004 chrX 154913754 2128238 X/NT_011757 34879939 chrX 154913754 37033177 X/NT_079573 12096764 chrX 154913754 49179941 X/NT_086939 680972 chrX 154913754 50040913 X/NT_011638 2371726 chrX 154913754 52462639 X/NT_011630 6136098 chrX 154913754 61598737 X/NT_011669 14971611 chrX 154913754 76590348 X/NT_011651 36813576 chrX 154913754 113473924 X/NT_028405 2122394 chrX 154913754 115616318 X/NT_011786 27718692 chrX 154913754 143365010 X/NT_011681 5427710 chrX 154913754 148832720 X/NT_011726 6081034 chrX 154913754 0 Y/NT_113967 34821 chrY 57772954 84821 Y/NT_113968 86563 chrY 57772954 201384 Y/NT_113969 766173 chrY 57772954 1017557 Y/NT_113970 36556 chrY 57772954 1104113 Y/NT_113971 80121 chrY 57772954 1274234 Y/NT_113972 754004 chrY 57772954 2128238 Y/NT_113973 581282 chrY 57772954 2709520 Y/NT_011896 6265435 chrY 57772954 9024955 Y/NT_086998 276367 chrY 57772954 9901322 Y/NT_011878 813231 chrY 57772954 11214553 Y/NT_087001 39401 chrY 57772954 11653954 Y/NT_113819 554624 chrY 57772954 12308578 Y/NT_011875 10002238 chrY 57772954 22360816 Y/NT_011903 4867933 chrY 57772954 57228749 Y/NT_025975 98295 chrY 57772954 57377044 Y/NT_091573 66393 chrY 57772954 57443437 Y/NT_113974 329517 chrY 57772954 0 1/NT_113870 145186 chr1_random 1663265 195186 1/NT_113871 197748 chr1_random 1663265 442934 1/NT_113872 183763 chr1_random 1663265 676697 1/NT_113873 51825 chr1_random 1663265 778522 1/NT_113874 136815 chr1_random 1663265 965337 1/NT_113875 114056 chr1_random 1663265 1129393 1/NT_113876 25994 chr1_random 1663265 1205387 1/NT_113877 208942 chr1_random 1663265 1464329 1/NT_113878 106433 chr1_random 1663265 1620762 1/NT_113879 42503 chr1_random 1663265 0 10/NT_113918 113275 chr10_random 113275 0 11/NT_113919 40524 chr11_random 215294 90524 11/NT_113920 35155 chr11_random 215294 175679 11/NT_113921 39615 chr11_random 215294 0 13/NT_113923 186858 chr13_random 186858 0 15/NT_113924 139260 chr15_random 784346 189260 15/NT_113925 168820 chr15_random 784346 408080 15/NT_113926 119514 chr15_random 784346 577594 15/NT_113927 111864 chr15_random 784346 739458 15/NT_113928 44888 chr15_random 784346 0 16/NT_113929 105485 chr16_random 105485 0 17/NT_113930 174588 chr17_random 2617613 224588 17/NT_113931 186078 chr17_random 2617613 460666 17/NT_113932 104495 chr17_random 2617613 615161 17/NT_113933 142595 chr17_random 2617613 807756 17/NT_113934 120350 chr17_random 2617613 978106 17/NT_113935 185449 chr17_random 2617613 1213555 17/NT_113936 163628 chr17_random 2617613 1427183 17/NT_113937 37443 chr17_random 2617613 1514626 17/NT_113938 45226 chr17_random 2617613 1609852 17/NT_113939 147354 chr17_random 2617613 1807206 17/NT_113940 19187 chr17_random 2617613 1876393 17/NT_113941 37498 chr17_random 2617613 1963891 17/NT_113942 117663 chr17_random 2617613 2131554 17/NT_113943 81310 chr17_random 2617613 2262864 17/NT_113944 182567 chr17_random 2617613 2495431 17/NT_113945 41001 chr17_random 2617613 2586432 17/NT_113946 31181 chr17_random 2617613 0 18/NT_113947 4262 chr18_random 4262 0 19/NT_113948 92689 chr19_random 301858 142689 19/NT_113949 159169 chr19_random 301858 0 2/NT_113880 185571 chr2_random 185571 0 21/NT_113950 28709 chr21_random 1679693 78709 21/NT_113951 152296 chr21_random 1679693 281005 21/NT_113952 184355 chr21_random 1679693 515360 21/NT_113953 131056 chr21_random 1679693 696416 21/NT_113954 129889 chr21_random 1679693 876305 21/NT_113955 178865 chr21_random 1679693 1105170 21/NT_113956 150002 chr21_random 1679693 1305172 21/NT_113957 166452 chr21_random 1679693 1521624 21/NT_113958 158069 chr21_random 1679693 0 22/NT_113960 40752 chr22_random 257318 90752 22/NT_113961 166566 chr22_random 257318 0 3/NT_113881 146010 chr3_random 749256 196010 3/NT_113882 172475 chr3_random 749256 418485 3/NT_113883 137703 chr3_random 749256 606188 3/NT_113884 143068 chr3_random 749256 0 4/NT_113885 189789 chr4_random 842648 239789 4/NT_113886 96249 chr4_random 842648 386038 4/NT_113887 3994 chr4_random 842648 440032 4/NT_113888 191469 chr4_random 842648 681501 4/NT_113889 161147 chr4_random 842648 0 5/NT_113890 143687 chr5_random 143687 0 6/NT_113898 1305230 chr6_random 1875562 1355230 6/NT_113899 520332 chr6_random 1875562 0 7/NT_113900 112804 chr7_random 549659 162804 7/NT_113901 182896 chr7_random 549659 395700 7/NT_113902 153959 chr7_random 549659 0 8/NT_113903 12854 chr8_random 943810 62854 8/NT_113904 50950 chr8_random 943810 163804 8/NT_113905 183161 chr8_random 943810 396965 8/NT_113906 46082 chr8_random 943810 493047 8/NT_113907 37175 chr8_random 943810 580222 8/NT_113908 13036 chr8_random 943810 643258 8/NT_113909 38914 chr8_random 943810 732172 8/NT_113910 211638 chr8_random 943810 0 9/NT_113911 36148 chr9_random 1146434 86148 9/NT_113912 185143 chr9_random 1146434 321291 9/NT_113913 154740 chr9_random 1146434 526031 9/NT_113914 90085 chr9_random 1146434 666116 9/NT_113915 187035 chr9_random 1146434 903151 9/NT_113916 173443 chr9_random 1146434 1126594 9/NT_113917 19840 chr9_random 1146434 0 X/NT_113962 217385 chrX_random 1719168 267385 X/NT_113963 24360 chrX_random 1719168 341745 X/NT_113964 204131 chrX_random 1719168 595876 X/NT_113965 1005289 chrX_random 1719168 1651165 X/NT_113966 68003 chrX_random 1719168 IRanges/inst/extdata/hs_b36v3_chrY.agp0000644000175400017540000003070513175713360020565 0ustar00biocbuildbiocbuild# # Homo sapiens chromosome Y, reference assembly, complete sequence # # This file provides assembly instructions for sequence NC_000024 # included in reference assembly of NCBI build 36 (HGSC Finished Genome v4.0). # #chrom chr_start chr_stop part_no part_type comp_id/gap_len comp_type/gap_type comp_end/linkage orientation/empty chrY 1 34821 1 F BX640545.2 1 34821 + chrY 34822 84821 2 N 50000 contig no chrY 84822 122592 3 F AL954722.18 1 37771 + chrY 122593 157464 4 F BX537334.4 1 34872 - chrY 157465 171384 5 F BX000483.7 1999 15918 + chrY 171385 201384 6 N 30000 contig no chrY 201385 232395 7 F AL954664.17 8190 39200 - chrY 232396 265528 8 F BX000476.5 208 33340 + chrY 265529 482251 9 F AL732314.18 2001 218723 + chrY 482252 550112 10 F BX004827.18 51695 119555 + chrY 550113 723877 11 F AL683871.15 2001 175765 + chrY 723878 837875 12 F AL672311.26 2001 115998 + chrY 837876 967557 13 F AL672277.20 2001 131682 + chrY 967558 1017557 14 N 50000 contig no chrY 1017558 1054113 15 F BX908402.3 1 36556 + chrY 1054114 1104113 16 N 50000 contig no chrY 1104114 1147822 17 F BX649635.3 1 43709 + chrY 1147823 1184234 18 F BX901949.9 1685 38096 + chrY 1184235 1274234 19 N 90000 contig no chrY 1274235 1307891 20 F BX908382.8 1 33657 + chrY 1307892 1390773 21 F BX649553.6 2001 84882 + chrY 1390774 1425289 22 F BX901885.7 1 34516 + chrY 1425290 1458955 23 F BX119906.16 2001 35666 + chrY 1458956 1618348 24 F AL683870.15 1001 160393 + chrY 1618349 1661433 25 F AL691415.17 1 43085 + chrY 1661434 1849259 26 F AL683807.22 1 187826 + chrY 1849260 1966556 27 F AL672040.10 1 117297 + chrY 1966557 2028238 28 F CR381640.8 2001 63682 + chrY 2028239 2128238 29 N 100000 contig no chrY 2128239 2165561 30 F CR856018.10 1 37323 + chrY 2165562 2202890 31 F CR381696.5 2000 39328 + chrY 2202891 2280454 32 F BX649443.16 2001 79564 + chrY 2280455 2333896 33 F BX119919.5 1 53442 - chrY 2333897 2514591 34 F AC079176.15 1 180695 - chrY 2514592 2593088 35 F AC097314.27 1 78497 - chrY 2593089 2709520 36 F AC006209.25 23323 139754 - chrY 2709521 2838553 37 F AC006040.3 57272 186304 + chrY 2838554 2845472 38 F AC074181.1 1 6919 + chrY 2845473 2999955 39 F AC006157.2 1 154483 + chrY 2999956 3170037 40 F AC006032.2 1 170082 + chrY 3170038 3247254 41 F AC006152.4 1 77217 + chrY 3247255 3316539 42 F AC011305.2 1 69285 + chrY 3316540 3471638 43 F AC009479.4 1 155099 + chrY 3471639 3567129 44 F AC019058.4 1 95491 + chrY 3567130 3650550 45 F AC024038.6 1 83421 + chrY 3650551 3827222 46 F AC012078.3 1 176672 + chrY 3827223 3945458 47 F AC010094.5 1 118236 + chrY 3945459 4015462 48 F AC010737.4 1 70004 + chrY 4015463 4164401 49 F AC010084.3 1 148939 + chrY 4164402 4316800 50 F AC010905.3 1 152399 + chrY 4316801 4335619 51 F AC010106.2 1 18819 + chrY 4335620 4367320 52 F AC024703.5 1 31701 + chrY 4367321 4576407 53 F AC012077.4 1 209087 + chrY 4576408 4712669 54 F AC010142.4 1 136262 + chrY 4712670 4829264 55 F AC019060.5 1 116595 + chrY 4829265 4917080 56 F AC023423.5 1 87816 + chrY 4917081 5077603 57 F AC010722.2 1 160523 + chrY 5077604 5237705 58 F AC010685.3 1 160102 + chrY 5237706 5281650 59 F AC010129.3 1 43945 + chrY 5281651 5445957 60 F AC012067.2 1 164307 + chrY 5445958 5520761 61 F AC012667.2 1 74804 + chrY 5520762 5697342 62 F AC010081.4 1 176581 + chrY 5697343 5811187 63 F AC010874.3 1 113845 + chrY 5811188 5947272 64 F AC010977.4 1 136085 + chrY 5947273 6015798 65 F AC016681.2 1 68526 + chrY 6015799 6163940 66 F AC010140.3 1 148142 + chrY 6163941 6386252 67 F AC006335.2 1 222312 + chrY 6386253 6441403 68 F AC010154.3 1 55151 + chrY 6441404 6607550 69 F AC010144.4 1 166147 + chrY 6607551 6623520 70 F AC010728.4 1 15970 + chrY 6623521 6823534 71 F AC013412.3 1 200014 + chrY 6823535 6877729 72 F AC011297.3 1 54195 + chrY 6877730 7044045 73 F AC012068.5 1 166316 + chrY 7044046 7154095 74 F AC010104.3 1 110050 + chrY 7154096 7233943 75 F AC010143.3 1 79848 + chrY 7233944 7405192 76 F AC007284.4 1 171249 + chrY 7405193 7521780 77 F AC007247.5 1 116588 + chrY 7521781 7638889 78 F AC007274.3 1 117109 + chrY 7638890 7808593 79 F AC007275.4 1 169704 + chrY 7808594 7883829 80 F AC010678.4 1 75236 + chrY 7883830 7997798 81 F AC010902.4 1 113969 + chrY 7997799 8184715 82 F AC016749.4 1 186917 + chrY 8184716 8278929 83 F AC051663.9 1 94214 + chrY 8278930 8370383 84 F AC025731.12 1 91454 + chrY 8370384 8544585 85 F AC016991.5 1 174202 + chrY 8544586 8634069 86 F AC064829.6 1 89484 + chrY 8634070 8774803 87 F AC009491.3 1 140734 + chrY 8774804 8962150 88 F AC007967.3 1 187347 + chrY 8962151 8972934 89 F AC068719.3 1 10784 + chrY 8972935 8974955 90 F AC079126.3 1 2021 + chrY 8974956 9024955 91 N 50000 clone no chrY 9024956 9030327 92 F AC079125.4 1 5372 + chrY 9030328 9178185 93 F AC009952.4 1 147858 + chrY 9178186 9190963 94 F AC025732.9 1 12778 + chrY 9190964 9301322 95 F AC006158.6 1 110359 + chrY 9301323 9901322 96 N 600000 clone no chrY 9901323 10013703 97 F AC006156.5 1 112381 + chrY 10013704 10088698 98 F AC025819.7 1 74995 + chrY 10088699 10250877 99 F AC017019.3 1 162179 + chrY 10250878 10283186 100 F AC010891.2 1 32309 + chrY 10283187 10456943 101 F AC006986.3 1 173757 + chrY 10456944 10622784 102 F AC006987.2 1 165841 + chrY 10622785 10714553 103 F AC010970.3 1 91769 + chrY 10714554 11214553 104 N 500000 clone no chrY 11214554 11253954 105 F AC069323.5 1 39401 + chrY 11253955 11653954 106 N 400000 centromere no chrY 11653955 11738549 107 F AC140113.3 71061 155655 - chrY 11738550 11861114 108 F AC134878.3 33249 155813 - chrY 11861115 12003063 109 F AC134882.2 10706 152654 - chrY 12003064 12208578 110 F AC134879.3 1 205515 - chrY 12208579 12308578 111 N 100000 centromere no chrY 12308579 12468100 112 F AC011293.5 1 159522 + chrY 12468101 12581699 113 F AC012502.3 1 113599 + chrY 12581700 12759636 114 F AC011302.3 1 177937 + chrY 12759637 12838587 115 F AC013735.5 1 78951 + chrY 12838588 12911566 116 F AC004772.2 40021 112999 + chrY 12911567 12936024 117 F AC005942.2 1 24458 - chrY 12936025 13059669 118 F AC002992.1 2001 125645 + chrY 13059670 13234892 119 F AC004617.2 1 175223 + chrY 13234893 13319390 120 F AC004810.1 1 84498 - chrY 13319391 13515290 121 F AC002531.1 2001 197900 + chrY 13515291 13619376 122 F AC004474.1 44195 148280 + chrY 13619377 13664255 123 F AC006565.4 1 44879 - chrY 13664256 13879980 124 F AC005820.1 1 215725 - chrY 13879981 13952171 125 F AC010877.3 64839 137029 + chrY 13952172 14123978 126 F AC006376.2 1 171807 + chrY 14123979 14159738 127 F AC007004.3 1 35760 + chrY 14159739 14264748 128 F AC006383.2 1 105010 + chrY 14264749 14466902 129 F AC006371.2 1 202154 + chrY 14466903 14639848 130 F AC006370.2 1 172946 + chrY 14639849 14742756 131 F AC018677.3 1 102908 + chrY 14742757 14779500 132 F AC010720.4 1 36744 + chrY 14779501 14953720 133 F AC010723.3 1 174220 + chrY 14953721 14981864 134 F AC019191.4 1 28144 + chrY 14981865 15158188 135 F AC010726.4 1 176324 + chrY 15158189 15267382 136 F AC010979.3 1 109194 + chrY 15267383 15447103 137 F AC010879.2 1 179721 + chrY 15447104 15512578 138 F AC011903.4 1 65475 + chrY 15512579 15557501 139 F AC017032.3 1 44923 + chrY 15557502 15714967 140 F AC006989.3 1 157466 + chrY 15714968 15795304 141 F AC011289.4 1 80337 + chrY 15795305 15909741 142 F AC010972.3 1 114437 + chrY 15909742 15967360 143 F AC007007.3 1 57619 + chrY 15967361 16111420 144 F AC006998.3 1 144060 + chrY 16111421 16274030 145 F AC006382.3 1 162610 + chrY 16274031 16413824 146 F AC006462.3 1 139794 + chrY 16413825 16509580 147 F AC006336.4 1 95756 + chrY 16509581 16528817 148 F AC016671.3 1 19237 + chrY 16528818 16695020 149 F AC017020.4 1 166203 + chrY 16695021 16899873 150 F AC011749.2 1 204853 + chrY 16899874 16926987 151 F AC053516.10 1 27114 + chrY 16926988 17103738 152 F AC010135.3 1 176751 + chrY 17103739 17166324 153 F AC010128.3 1 62586 + chrY 17166325 17345839 154 F AC011751.2 1 179515 + chrY 17345840 17488621 155 F AC016678.4 1 142782 + chrY 17488622 17510633 156 F AC015979.4 1 22012 + chrY 17510634 17659631 157 F AC007034.4 1 148998 + chrY 17659632 17766833 158 F AC007043.3 1 107202 + chrY 17766834 17879511 159 F AC006999.2 1 112678 + chrY 17879512 17886218 160 F AC007042.3 1 6707 + chrY 17886219 17923463 161 F AC091329.3 1 37245 + chrY 17923464 18051552 162 F AC007972.4 1 128089 + chrY 18051553 18230132 163 F AC015978.4 1 178580 + chrY 18230133 18244934 164 F AC068704.4 1 14802 + chrY 18244935 18443908 165 F AC007742.4 1 198974 + chrY 18443909 18474219 166 F AC095381.1 1 30311 + chrY 18474220 18653273 167 F AC009976.4 1 179054 + chrY 18653274 18698502 168 F AC095380.1 1 45229 + chrY 18698503 18861991 169 F AC024183.4 1 163489 + chrY 18861992 19023659 170 F AC007241.3 1 161668 + chrY 19023660 19090333 171 F AC069130.6 1 66674 + chrY 19090334 19096847 172 F AC073962.5 1 6514 + chrY 19096848 19134213 173 F AC068541.7 1 37366 + chrY 19134214 19290370 174 F AC022486.4 1 156157 + chrY 19290371 19464252 175 F AC007379.2 1 173882 + chrY 19464253 19623962 176 F AC009235.4 1 159710 + chrY 19623963 19808739 177 F AC007244.2 1 184777 + chrY 19808740 19871998 178 F AC021210.4 1 63259 + chrY 19871999 20023872 179 F AC010133.4 1 151874 + chrY 20023873 20066067 180 F AC012062.4 1 42195 + chrY 20066068 20189568 181 F AC010137.3 1 123501 + chrY 20189569 20221039 182 F AC009977.4 1 31471 + chrY 20221040 20400963 183 F AC010889.3 1 179924 + chrY 20400964 20472584 184 F AC010151.3 1 71621 + chrY 20472585 20667823 185 F AC009233.3 1 195239 + chrY 20667824 20778857 186 F AC079157.3 1 111034 + chrY 20778858 20829067 187 F AC079261.2 1 50210 + chrY 20829068 20846452 188 F AC079156.4 1 17385 + chrY 20846453 20889975 189 F AC024250.6 1 43523 + chrY 20889976 20991065 190 F AC009240.6 1 101090 + chrY 20991066 21140390 191 F AC011745.4 1 149325 + chrY 21140391 21249363 192 F AC007678.3 1 108973 + chrY 21249364 21437477 193 F AC009494.2 1 188114 + chrY 21437478 21456819 194 F AC026061.8 1 19342 + chrY 21456820 21557572 195 F AC009489.3 1 100753 + chrY 21557573 21722251 196 F AC007876.2 1 164679 + chrY 21722252 21858422 197 F AC009239.3 1 136171 + chrY 21858423 22029157 198 F AC010086.4 1 170735 + chrY 22029158 22150215 199 F AC010141.2 1 121058 + chrY 22150216 22310816 200 F AC021107.3 1 160601 + chrY 22310817 22360816 201 N 50000 clone no chrY 22360817 22384194 202 F AC078938.3 1 23378 + chrY 22384195 22403616 203 F AC024236.5 1 19422 + chrY 22403617 22587631 204 F AC007322.4 1 184015 + chrY 22587632 22688058 205 F AC007359.3 1 100427 + chrY 22688059 22757212 206 F AC023342.3 1 69154 + chrY 22757213 22817822 207 F AC025227.6 1 60610 + chrY 22817823 23005516 208 F AC007320.3 1 187694 + chrY 23005517 23210553 209 F AC008175.2 1 205037 + chrY 23210554 23230057 210 F AC016694.2 1 19504 + chrY 23230058 23386813 211 F AC010080.2 1 156756 + chrY 23386814 23417437 212 F AC016911.6 1 30624 + chrY 23417438 23583695 213 F AC006366.4 201 166458 - chrY 23583696 23691182 214 F AC010088.4 1 107487 + chrY 23691183 23794414 215 F AC053490.2 1 103232 + chrY 23794415 23811321 216 F AC007039.6 1 16907 + chrY 23811322 23991466 217 F AC006983.4 1 180145 + chrY 23991467 24140496 218 F AC009947.2 1 149030 + chrY 24140497 24157833 219 F AC016707.2 1 17337 + chrY 24157834 24324069 220 F AC016752.2 1 166236 + chrY 24324070 24353833 221 F AC025246.6 1 29764 + chrY 24353834 24474584 222 F AC073649.3 1 120751 + chrY 24474585 24538834 223 F AC073893.4 1 64250 + chrY 24538835 24614390 224 F AC068601.8 1 75556 + chrY 24614391 24751872 225 F AC023274.2 1 137482 + chrY 24751873 24941229 226 F AC012005.4 1 189357 + chrY 24941230 24948693 227 F AC013465.4 1 7464 + chrY 24948694 25102779 228 F AC016698.3 1 154086 + chrY 25102780 25206726 229 F AC010153.3 1 103947 + chrY 25206727 25291998 230 F AC025735.4 1 85272 + chrY 25291999 25395154 231 F AC010089.4 1 103156 + chrY 25395155 25397101 232 F AC006982.3 1 1947 + chrY 25397102 25572891 233 F AC006338.6 1 175790 + chrY 25572892 25724743 234 F AC016728.4 1 151852 + chrY 25724744 25888772 235 F AC006386.4 1 164029 + chrY 25888773 26066341 236 F AC006328.5 1 177569 + chrY 26066342 26203218 237 F AC007562.4 1 136877 + chrY 26203219 26348595 238 F AC010682.3 1 145377 + chrY 26348596 26443316 239 F AC017005.7 1 94721 + chrY 26443317 26625199 240 F AC007965.3 1 181883 + chrY 26625200 26798241 241 F AC006991.3 1 173042 + chrY 26798242 26906387 242 F AC024067.4 1 108146 + chrY 26906388 27086206 243 F AC013734.4 1 179819 + chrY 27086207 27194539 244 F AC019099.6 1 108333 + chrY 27194540 27228749 245 F AC073880.5 1 34210 + chrY 27228750 57228749 246 N 30000000 heterochromatin no chrY 57228750 57327044 247 F AC068123.5 1 98295 + chrY 57327045 57377044 248 N 50000 clone no chrY 57377045 57443437 249 F AC025226.4 101674 168066 - chrY 57443438 57614293 250 F AJ271735.1 69145 240000 + chrY 57614294 57772954 251 F AJ271736.1 1 158661 + IRanges/inst/include/0000755000175400017540000000000013175713360015536 5ustar00biocbuildbiocbuildIRanges/inst/include/IRanges_defines.h0000644000175400017540000000312213175713360020732 0ustar00biocbuildbiocbuild/***************************************************************************** IRanges C interface: typedefs and defines ----------------------------------------- The IRanges C interface is split in 2 files: 1. IRanges_defines.h (this file): contains the typedefs and defines of the interface. 2. IRanges_interface.h (in this directory): contains the prototypes of the IRanges C routines that are part of the interface. Please consult IRanges_interface.h for how to use this interface in your package. *****************************************************************************/ #ifndef IRANGES_DEFINES_H #define IRANGES_DEFINES_H #include "S4Vectors_defines.h" #include #include /* * *_holder structs. */ typedef struct compressed_chars_list_holder { int length; const char *unlisted; const int *breakpoints; } CompressedCharsList_holder; typedef struct compressed_ints_list_holder { int length; const int *unlisted; const int *breakpoints; } CompressedIntsList_holder; typedef struct compressed_doubles_list_holder { int length; const double *unlisted; const int *breakpoints; } CompressedDoublesList_holder; typedef struct iranges_holder { const char *classname; int is_constant_width; int length; const int *width; const int *start; const int *end; int SEXP_offset; /* offset in 'names' member below */ SEXP names; } IRanges_holder; typedef struct compressed_iranges_list_holder { const char *classname; int length; const int *end; IRanges_holder unlistData_holder; } CompressedIRangesList_holder; #endif IRanges/inst/include/IRanges_interface.h0000644000175400017540000000631713175713360021266 0ustar00biocbuildbiocbuild/***************************************************************************** IRanges C interface: prototypes ------------------------------- The IRanges C interface is split in 2 files: 1. IRanges_defines.h (in this directory): contains the typedefs and defines of the interface. 2. IRanges_interface.h (this file): contains the prototypes of the IRanges C routines that are part of the interface. *****************************************************************************/ #include "IRanges_defines.h" /* * Comparing ranges. * (see Ranges_comparison.c) */ int overlap_code( int x_start, int x_width, int y_start, int y_width ); int invert_overlap_code(int code); /* * Low-level manipulation of IRanges objects. * (see IRanges_class.c) */ SEXP get_IRanges_start(SEXP x); SEXP get_IRanges_width(SEXP x); SEXP get_IRanges_names(SEXP x); int get_IRanges_length(SEXP x); IRanges_holder hold_IRanges(SEXP x); int get_length_from_IRanges_holder(const IRanges_holder *x_holder); int get_width_elt_from_IRanges_holder(const IRanges_holder *x_holder, int i); int get_start_elt_from_IRanges_holder(const IRanges_holder *x_holder, int i); int get_end_elt_from_IRanges_holder(const IRanges_holder *x_holder, int i); SEXP get_names_elt_from_IRanges_holder(const IRanges_holder *x_holder, int i); IRanges_holder get_linear_subset_from_IRanges_holder(const IRanges_holder *x_holder, int offset, int length); void set_IRanges_names(SEXP x, SEXP names); void copy_IRanges_slots(SEXP x, SEXP x0); SEXP new_IRanges(const char *classname, SEXP start, SEXP width, SEXP names); SEXP new_IRanges_from_IntPairAE(const char *classname, const IntPairAE *intpair_ae); SEXP new_list_of_IRanges_from_IntPairAEAE(const char *element_type, const IntPairAEAE *intpair_aeae); SEXP alloc_IRanges(const char *classname, int length); /* * Low-level manipulation of Grouping objects. * (see Grouping_class.c) */ SEXP get_H2LGrouping_high2low(SEXP x); SEXP get_H2LGrouping_low2high(SEXP x); SEXP get_Partitioning_names(SEXP x); SEXP get_PartitioningByEnd_end(SEXP x); SEXP new_PartitioningByEnd(const char *classname, SEXP end, SEXP names); /* * Low-level manipulation of CompressedList objects. * (see CompressedList_class.c) */ SEXP get_CompressedList_unlistData(SEXP x); SEXP get_CompressedList_partitioning(SEXP x); int get_CompressedList_length(SEXP x); SEXP get_CompressedList_names(SEXP x); SEXP new_CompressedList(const char *classname, SEXP unlistData, SEXP partitioning); CompressedIntsList_holder hold_CompressedIntegerList(SEXP x); int get_length_from_CompressedIntsList_holder(const CompressedIntsList_holder *x_holder); Ints_holder get_elt_from_CompressedIntsList_holder(const CompressedIntsList_holder *x_holder, int i); /* * Low-level manipulation of CompressedIRangesList objects. * (see CompressedIRangesList_class.c) */ CompressedIRangesList_holder hold_CompressedIRangesList(SEXP x); int get_length_from_CompressedIRangesList_holder(const CompressedIRangesList_holder *x_holder); IRanges_holder get_elt_from_CompressedIRangesList_holder(const CompressedIRangesList_holder *x_holder, int i); int get_eltNROWS_from_CompressedIRangesList_holder(const CompressedIRangesList_holder *x_holder, int i); IRanges/inst/include/_IRanges_stubs.c0000644000175400017540000001271213175713360020614 0ustar00biocbuildbiocbuild#include "IRanges_interface.h" #define DEFINE_CCALLABLE_STUB(retT, stubname, Targs, args) \ typedef retT(*__ ## stubname ## _funtype__)Targs; \ retT stubname Targs \ { \ static __ ## stubname ## _funtype__ fun = NULL; \ if (fun == NULL) \ fun = (__ ## stubname ## _funtype__) R_GetCCallable("IRanges", "_" #stubname); \ return fun args; \ } /* * Using the above macro when retT (the returned type) is void will make Sun * Studio 12 C compiler unhappy. So we need to use the following macro to * handle that case. */ #define DEFINE_NOVALUE_CCALLABLE_STUB(stubname, Targs, args) \ typedef void(*__ ## stubname ## _funtype__)Targs; \ void stubname Targs \ { \ static __ ## stubname ## _funtype__ fun = NULL; \ if (fun == NULL) \ fun = (__ ## stubname ## _funtype__) R_GetCCallable("IRanges", "_" #stubname); \ fun args; \ return; \ } /* * Stubs for callables defined in Ranges_comparison.c */ DEFINE_CCALLABLE_STUB(int, overlap_code, (int x_start, int x_width, int y_start, int y_width), ( x_start, x_width, y_start, y_width) ) DEFINE_CCALLABLE_STUB(int, invert_overlap_code, (int code), ( code) ) /* * Stubs for callables defined in IRanges_class.c */ DEFINE_CCALLABLE_STUB(SEXP, get_IRanges_start, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(SEXP, get_IRanges_width, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(SEXP, get_IRanges_names, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(int, get_IRanges_length, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(IRanges_holder, hold_IRanges, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(int, get_length_from_IRanges_holder, (const IRanges_holder *x_holder), ( x_holder) ) DEFINE_CCALLABLE_STUB(int, get_width_elt_from_IRanges_holder, (const IRanges_holder *x_holder, int i), ( x_holder, i) ) DEFINE_CCALLABLE_STUB(int, get_start_elt_from_IRanges_holder, (const IRanges_holder *x_holder, int i), ( x_holder, i) ) DEFINE_CCALLABLE_STUB(int, get_end_elt_from_IRanges_holder, (const IRanges_holder *x_holder, int i), ( x_holder, i) ) DEFINE_CCALLABLE_STUB(SEXP, get_names_elt_from_IRanges_holder, (const IRanges_holder *x_holder, int i), ( x_holder, i) ) DEFINE_CCALLABLE_STUB(IRanges_holder, get_linear_subset_from_IRanges_holder, (const IRanges_holder *x_holder, int offset, int length), ( x_holder, offset, length) ) DEFINE_NOVALUE_CCALLABLE_STUB(set_IRanges_names, (SEXP x, SEXP names), ( x, names) ) DEFINE_NOVALUE_CCALLABLE_STUB(copy_IRanges_slots, (SEXP x, SEXP x0), ( x, x0) ) DEFINE_CCALLABLE_STUB(SEXP, new_IRanges, (const char *classname, SEXP start, SEXP width, SEXP names), ( classname, start, width, names) ) DEFINE_CCALLABLE_STUB(SEXP, new_IRanges_from_IntPairAE, (const char *classname, const IntPairAE *intpair_ae), ( classname, intpair_ae) ) DEFINE_CCALLABLE_STUB(SEXP, new_list_of_IRanges_from_IntPairAEAE, (const char *element_type, const IntPairAEAE *intpair_aeae), ( element_type, intpair_aeae) ) DEFINE_CCALLABLE_STUB(SEXP, alloc_IRanges, (const char *classname, int length), ( classname, length) ) /* * Stubs for callables defined in Grouping_class.c */ DEFINE_CCALLABLE_STUB(SEXP, get_H2LGrouping_high2low, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(SEXP, get_H2LGrouping_low2high, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(SEXP, get_Partitioning_names, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(SEXP, get_PartitioningByEnd_end, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(SEXP, new_PartitioningByEnd, (const char *classname, SEXP end, SEXP names), ( classname, end, names) ) /* * Stubs for callables defined in CompressedList_class.c */ DEFINE_CCALLABLE_STUB(SEXP, get_CompressedList_unlistData, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(SEXP, get_CompressedList_partitioning, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(int, get_CompressedList_length, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(SEXP, get_CompressedList_names, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(SEXP, new_CompressedList, (const char *classname, SEXP unlistData, SEXP partitioning), ( classname, unlistData, partitioning) ) DEFINE_CCALLABLE_STUB(CompressedIntsList_holder, hold_CompressedIntegerList, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(int, get_length_from_CompressedIntsList_holder, (const CompressedIntsList_holder *x_holder), ( x_holder) ) DEFINE_CCALLABLE_STUB(Ints_holder, get_elt_from_CompressedIntsList_holder, (const CompressedIntsList_holder *x_holder, int i), ( x_holder, i) ) /* * Stubs for callables defined in CompressedIRangesList_class.c */ DEFINE_CCALLABLE_STUB(CompressedIRangesList_holder, hold_CompressedIRangesList, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(int, get_length_from_CompressedIRangesList_holder, (const CompressedIRangesList_holder *x_holder), ( x_holder) ) DEFINE_CCALLABLE_STUB(IRanges_holder, get_elt_from_CompressedIRangesList_holder, (const CompressedIRangesList_holder *x_holder, int i), ( x_holder, i) ) DEFINE_CCALLABLE_STUB(int, get_eltNROWS_from_CompressedIRangesList_holder, (const CompressedIRangesList_holder *x_holder, int i), ( x_holder, i) ) IRanges/inst/unitTests/0000755000175400017540000000000013175713360016115 5ustar00biocbuildbiocbuildIRanges/inst/unitTests/test_AtomicList-class.R0000644000175400017540000000712013175713360022452 0ustar00biocbuildbiocbuildtest_AtomicList_constructors <- function() { subclasses <- c(logical="LogicalList", integer="IntegerList", #double="NumericList", numeric="NumericList", complex="ComplexList", character="CharacterList", raw="RawList", Rle="RleList") for (elt_type in names(subclasses)) { subclass <- subclasses[[elt_type]] constructor <- get(subclass) vec1 <- get(elt_type)(6) vec2 <- get(elt_type)(8) target <- list(A=vec1, B=vec2) for (compress in c(TRUE, FALSE)) { current <- constructor(A=vec1, B=vec2, compress=compress) checkTrue(is(current, subclass)) checkIdentical(compress, is(current, "CompressedList")) checkIdentical(elt_type, elementType(current)) checkIdentical(target, as.list(current)) checkIdentical(unname(target), as.list(current, use.names=FALSE)) } } } test_AtomicList_general <- function() { vec1 <- c(1L,2L,NA,3L,NA,5L,2L,8L) vec2 <- c(NA,15L,45L,20L,NA,1L,15L,100L,80L,5L,NA) for (compress in c(TRUE, FALSE)) { for (type in c("IntegerList", "RleList")) { list1 <- do.call(type, list(one = vec1, vec2, compress = compress)) checkIdentical(as.list(list1 %in% c(1L, 5L)), lapply(list1, "%in%", c(1L, 5L))) checkIdentical(lapply(list1 %in% IntegerList(one = vec1, vec2, compress = compress), as.vector), mapply("%in%", lapply(list1, as.vector), list(one = vec1, vec2))) checkIdentical(as.list(is.na(list1)), lapply(list1, is.na)) checkIdentical(as.list(match(list1, c(1L, 5L))), lapply(list1, match, c(1L, 5L))) checkIdentical(lapply(match(list1, IntegerList(one = vec1, vec2, compress = compress)), as.vector), mapply(match, lapply(list1, as.vector), list(one = vec1, vec2))) checkIdentical(as.list(sort(list1)), lapply(list1, sort)) checkIdentical(as.list(unique(list1)), lapply(list1, unique)) } } } test_RleList_methods <- function() { x1 <- RleList(11:15, 15L, integer(0), 15:16, compress=FALSE) x2 <- RleList(11:15, 15L, integer(0), 15:16, compress=TRUE) checkIdentical(as(runValue(x1), "CompressedIntegerList"), runValue(x2)) checkIdentical(as(runLength(x1), "CompressedIntegerList"), runLength(x2)) checkIdentical(as(ranges(x1), "CompressedIRangesList"), ranges(x2)) a1 <- Rle(1, 999722111) a2 <- 20 * a1 a <- RleList(a1, a2, compress=TRUE) b1 <- c(a1, a1) b2 <- 20 * b1 b <- RleList(b1, b2, compress=FALSE) ## FIXME: 'a1 <= 19:21' is taking forever and eats up all the memory in ## BioC <= 2.12! Seems like 'a1' is expanded to integer vector first, which ## is not good :-/ #for (y in list(8L, 8, 19:21)) { for (y in list(8L, 8)) { ## With a CompressedRleList target <- RleList(a1 <= y, a2 <= y, compress=TRUE) current <- a <= y checkIdentical(target, current) ## With a SimpleRleList target <- RleList(b1 <= y, b2 <= y, compress=FALSE) current <- b <= y checkIdentical(target, current) } } IRanges/inst/unitTests/test_AtomicList-utils.R0000644000175400017540000001706713175713360022520 0ustar00biocbuildbiocbuildtest_AtomicList_GroupGenerics <- function() { vec1 <- c(1L,2L,3L,5L,2L,8L) vec2 <- c(15L,45L,20L,1L,15L,100L,80L,5L) for (compress in c(TRUE, FALSE)) { for (type in c("IntegerList", "RleList")) { list1 <- do.call(type, list(one = vec1, vec2, compress = compress)) checkIdentical(as.list(list1 + list1), Map("+", list1, list1)) checkIdentical(as.list(log(list1)), lapply(list1, log)) checkIdentical(as.list(round(sqrt(list1))), lapply(list1, function(x) round(sqrt(x)))) checkIdentical(sum(list1), sapply(list1, sum)) } } } test_AtomicList_logical <- function() { vec1 <- c(TRUE,NA,FALSE, NA) vec2 <- c(TRUE,TRUE,FALSE,FALSE,TRUE,FALSE,TRUE,TRUE,TRUE) for (compress in c(TRUE, FALSE)) { for (type in c("LogicalList", "RleList")) { list1 <- do.call(type, list(one = vec1, vec2, compress = compress)) checkIdentical(as.list(!list1), lapply(list1, "!")) checkIdentical(as.list(which(list1)), lapply(list1, which)) } } } test_AtomicList_numerical <- function() { vec1 <- c(1L,2L,NA,3L,NA,5L,2L,8L) vec2 <- c(NA,15L,45L,20L,NA,1L,15L,100L,80L,5L,NA) for (compress in c(TRUE, FALSE)) { for (type in c("IntegerList", "RleList")) { list1 <- do.call(type, list(one = vec1, vec2, compress = compress)) list2 <- endoapply(list1, rev) checkIdentical(as.list(diff(list1)), lapply(list1, diff)) checkIdentical(as.list(pmax(list1, list2)), mapply(pmax, list1, list2)) checkIdentical(as.list(pmin(list1, list2)), mapply(pmin, list1, list2)) checkIdentical(as.list(pmax.int(list1, list2)), mapply(pmax.int, list1, list2)) checkIdentical(as.list(pmin.int(list1, list2)), mapply(pmin.int, list1, list2)) checkIdentical(mean(list1, na.rm=TRUE), sapply(list1, mean, na.rm=TRUE)) checkIdentical(var(list1, na.rm=TRUE), sapply(list1, var, na.rm=TRUE)) checkIdentical(cov(list1, list2, use="complete.obs"), mapply(cov, list1, list2, MoreArgs = list(use="complete.obs"))) checkIdentical(cor(list1, list2, use="complete.obs"), mapply(cor, list1, list2, MoreArgs = list(use="complete.obs"))) checkIdentical(sd(list1, na.rm=TRUE), sapply(list1, sd, na.rm=TRUE)) checkIdentical(median(list1, na.rm=TRUE), sapply(list1, median, na.rm=TRUE)) checkIdentical(quantile(list1, na.rm=TRUE), sapply(list1, quantile, na.rm=TRUE)) checkIdentical(mad(list1, na.rm=TRUE), sapply(list1, mad, na.rm=TRUE)) checkIdentical(IQR(list1, na.rm=TRUE), sapply(list1, IQR, na.rm=TRUE)) vec3 <- (-20:20)^2 vec3[c(1,10,21,41)] <- c(100L, 30L, 400L, 470L) list3 <- do.call(type, list(one = vec3, rev(vec3), compress = compress)) checkIdentical(as.list(smoothEnds(list3)), lapply(list3, smoothEnds)) checkIdentical(as.list(runmed(list3, 7)), lapply(list3, function(x) { y <- runmed(x, 7) if (type != "RleList") y <- as.vector(y) y })) } } } test_AtomicList_character <- function() { txt <- c("The", "licenses", "for", "most", "software", "are", "designed", "to", "take", "away", "your", "freedom", "to", "share", "and", "change", "it.", "", "By", "contrast,", "the", "GNU", "General", "Public", "License", "is", "intended", "to", "guarantee", "your", "freedom", "to", "share", "and", "change", "free", "software", "--", "to", "make", "sure", "the", "software", "is", "free", "for", "all", "its", "users") for (compress in c(TRUE, FALSE)) { for (type in c("CharacterList", "RleList")) { list1 <- do.call(type, list(one = txt, rev(txt), compress = compress)) checkIdentical(as.list(nchar(list1)), lapply(list1, nchar)) checkIdentical(as.list(chartr("@!*", "alo", list1)), lapply(list1, chartr, old="@!*", new="alo")) checkIdentical(as.list(tolower(list1)), lapply(list1, tolower)) checkIdentical(as.list(toupper(list1)), lapply(list1, toupper)) checkIdentical(as.list(sub("[b-e]",".", list1)), lapply(list1, sub, pattern="[b-e]", replacement=".")) checkIdentical(as.list(gsub("[b-e]",".", list1)), lapply(list1, gsub, pattern="[b-e]", replacement=".")) } } } test_RleList_methods <- function() { ## na.rm x <- RleList(c(NA,1,1), c(1L,NA_integer_,1L), c(1,Inf,1,-Inf),compress=TRUE) target <- RleList(c(1,2), c(1L,1L), c(Inf,Inf,-Inf)) current <- runsum(x,2, na.rm = TRUE) checkIdentical(target, current) target <- RleList(c(NA,2), c(NA_integer_,NA_integer_), c(Inf,Inf,-Inf)) current <- runsum(x,2, na.rm = FALSE) checkIdentical(target, current) target <- RleList(c(2,4), c(2,2), c(Inf, Inf, -Inf)) current <- runwtsum(x,2, c(2,2), na.rm = TRUE) checkIdentical(target, current) target <- RleList(c(NA,4), c(NA_real_,NA_real_), c(Inf,Inf,-Inf)) current <- runwtsum(x,2, c(2,2), na.rm = FALSE) checkIdentical(target, current) target <- RleList(c(1,1), c(1,1), c(Inf,Inf,-Inf)) current <- runmean(x, 2, na.rm = TRUE) checkIdentical(target, current) target <- RleList(c(NA,1), c(NA_real_, NA_real_), c(Inf, Inf, -Inf)) current <- runmean(x, 2, na.rm = FALSE) checkIdentical(target, current) x <- RleList(c(NA,1,2), c(2L,NA_integer_,1L), c(1,Inf,1,-Inf),compress=TRUE) target <- RleList(c(1,2), c(2L,1L), c(Inf,Inf,1)) current <- runq(x, 2, 2, na.rm = TRUE) checkIdentical(target, current) target <- RleList(c(NA,2), c(NA_integer_, NA_integer_), c(Inf, Inf, 1)) current <- runq(x, 2, 2, na.rm = FALSE) checkIdentical(target, current) ## Binary operations between an RleList and an atomic vector: a1 <- Rle(1, 999722111) a2 <- 20 * a1 a <- RleList(a1, a2, compress=TRUE) b1 <- c(a1, a1) b2 <- 20 * b1 b <- RleList(b1, b2, compress=FALSE) for (y in list(8L, 8)) { ## With a CompressedRleList target <- RleList(a1 + y, a2 + y, compress=TRUE) current <- a + y checkIdentical(target, current) target <- RleList(a1 * y, a2 * y, compress=TRUE) current <- a * y checkIdentical(target, current) target <- RleList(a1 / y, a2 / y, compress=TRUE) current <- a / y checkIdentical(target, current) ## With a SimpleRleList target <- RleList(b1 + y, b2 + y, compress=FALSE) current <- b + y checkIdentical(target, current) target <- RleList(b1 * y, b2 * y, compress=FALSE) current <- b * y checkIdentical(target, current) target <- RleList(b1 / y, b2 / y, compress=FALSE) current <- b / y checkIdentical(target, current) } } IRanges/inst/unitTests/test_DataFrame-utils.R0000644000175400017540000000070313175713360022261 0ustar00biocbuildbiocbuild ## splitting test_DataFrame_splitting <- function() { data(swiss) rn <- rownames(swiss) sw <- DataFrame(swiss, row.names=rn) swisssplit <- split(swiss, swiss$Education) ## split swsplit <- split(sw, sw[["Education"]]) checkTrue(validObject(swsplit)) checkIdentical(as.list(lapply(swsplit, as.data.frame)), swisssplit) checkTrue(validObject(split(DataFrame(IRanges(1:26, 1:26), LETTERS), letters))) } IRanges/inst/unitTests/test_DataFrameList.R0000644000175400017540000001776713175713360022001 0ustar00biocbuildbiocbuildtest_DataFrameList_construction <- function() { checkDFL2dfl <- function(DFL, dfl) { checkIdentical(lapply(as.list(DFL), as.data.frame), dfl) } data(airquality) data(swiss) checkDFL2dfl(DataFrameList(swiss, airquality), list(swiss, airquality)) } test_SplitDataFrameList_construction <- function() { checkDFL2dfl <- function(DFL, dfl) { checkIdentical(lapply(as.list(DFL), as.data.frame), dfl) } striprownames <- function(x) { lapply(x, function(y) { rownames(y) <- NULL y }) } data(airquality) data(swiss) aq <- DataFrame(airquality) sw <- DataFrame(swiss, row.names=rownames(swiss)) aqsplit1 <- split(aq, aq[["Month"]]) aqsplit2 <- SplitDataFrameList(lapply(split(airquality, airquality[["Month"]]), as, "DataFrame")) checkIdentical(aqsplit1, aqsplit2) swsplit1 <- split(sw, sw[["Education"]]) swsplit2 <- SplitDataFrameList(lapply(split(swiss, swiss[["Education"]]), as, "DataFrame")) checkIdentical(swsplit1, swsplit2) for (compress in c(TRUE, FALSE)) { airqualitysplit <- striprownames(split(airquality, airquality[["Month"]])) aqsplit <- SplitDataFrameList(as.list(split(aq, aq[["Month"]])), compress = compress) checkDFL2dfl(aqsplit, airqualitysplit) swisssplit <- split(swiss, swiss[["Education"]]) swsplit <- SplitDataFrameList(as.list(split(sw, sw[["Education"]])), compress = compress) checkDFL2dfl(swsplit, swisssplit) } } test_DataFrameList_subset <- function() { checkDFL2dfl <- function(DFL, dfl) { checkIdentical(lapply(as.list(DFL), as.data.frame), dfl) } data(airquality) data(swiss) DFL1 <- DataFrameList(swiss, airquality) dfl1 <- list(swiss, airquality) checkDFL2dfl(DFL1[], dfl1[]) checkDFL2dfl(DFL1[1], dfl1[1]) checkDFL2dfl(DFL1[2:1], dfl1[2:1]) checkIdentical(as.data.frame(DFL1[[2]]), airquality) checkException(DFL1[[3]], silent = TRUE) DFL2 <- DataFrameList(s = swiss, a = airquality) dfl2 <- list(s = swiss, a = airquality) checkDFL2dfl(DFL2[], dfl2[]) checkDFL2dfl(DFL2[1], dfl2[1]) checkDFL2dfl(DFL2["a"], dfl2["a"]) checkDFL2dfl(DFL2[c("a", "s")], dfl2[c("a", "s")]) checkIdentical(as.data.frame(DFL2[["a"]]), airquality) checkIdentical(DFL2[["z"]], NULL) } test_SplitDataFrameList_subset <- function() { checkDFL2dfl <- function(DFL, dfl) { checkIdentical(lapply(as.list(DFL), as.data.frame), dfl) } data(swiss) sw <- DataFrame(swiss, row.names = rownames(swiss)) for (compress in c(TRUE, FALSE)) { swsplit <- SplitDataFrameList(as.list(split(sw, sw[["Education"]])), compress = compress) swisssplit <- split(swiss, swiss[["Education"]]) checkDFL2dfl(swsplit[], swisssplit[]) checkDFL2dfl(swsplit[1], swisssplit[1]) checkDFL2dfl(swsplit[2:1], swisssplit[2:1]) checkIdentical(as.data.frame(swsplit[[2]]), swisssplit[[2]]) checkIdentical(swsplit[["A"]], NULL) checkException(swsplit[[30]], silent = TRUE) checkIdentical(as.list(swsplit[,1]), split(swiss[[1]], swiss[["Education"]])) checkIdentical(as.list(swsplit[,"Examination"]), split(swiss[["Examination"]], swiss[["Education"]])) } } test_SplitDataFrameList_as.data.frame <- function() { checkDFL2dfl <- function(DFL, dfl, compress) { target <- data.frame(group = togroup(PartitioningByWidth(dfl)), group_name = names(dfl)[togroup(PartitioningByWidth(dfl))], do.call(rbind, dfl), stringsAsFactors=FALSE, row.names=NULL) rownames(target) <- unlist(lapply(dfl, row.names), use.names = FALSE) checkIdentical(target, as.data.frame(DFL)) } data(swiss) sw <- DataFrame(swiss, row.names = rownames(swiss)) for (compress in c(TRUE, FALSE)) { swsplit <- SplitDataFrameList(as.list(split(sw, sw[["Education"]])), compress = compress) swisssplit <- split(swiss, swiss[["Education"]]) checkDFL2dfl(swsplit, swisssplit, compress) } } test_DataFrameList_replace <- function() { checkDFL2dfl <- function(DFL, dfl) { checkIdentical(lapply(as.list(DFL), as.data.frame), dfl) } data(airquality) data(swiss) DFL1 <- DataFrameList(swiss, airquality) dfl1 <- list(swiss, airquality) DFL1[] <- DFL1[1] dfl1[] <- dfl1[1] checkDFL2dfl(DFL1, dfl1) DFL1 <- DataFrameList(swiss, airquality) dfl1 <- list(swiss, airquality) DFL1[2] <- DFL1[1] dfl1[2] <- dfl1[1] checkDFL2dfl(DFL1, dfl1) DFL1 <- DataFrameList(swiss, airquality) dfl1 <- list(swiss, airquality) DFL1[[1]][[1]] <- DFL1[[1]][[1]] + 1L dfl1[[1]][[1]] <- dfl1[[1]][[1]] + 1L checkDFL2dfl(DFL1, dfl1) } test_SplitDataFrameList_replace <- function() { checkDFL2dfl <- function(DFL, dfl) { checkIdentical(lapply(as.list(DFL), as.data.frame), dfl) } striprownames <- function(x) { lapply(x, function(y) { rownames(y) <- NULL y }) } data(airquality) data(swiss) swiss2 <- swiss rownames(swiss2) <- NULL sw2 <- DataFrame(swiss2) for (compress in c(TRUE, FALSE)) { swiss2split <- striprownames(split(swiss2, swiss2[["Education"]])) sw2split <- SplitDataFrameList(as.list(split(sw2, sw2[["Education"]])), compress = compress) swiss2split[] <- swiss2split[1] sw2split[] <- sw2split[1] checkDFL2dfl(sw2split, swiss2split) swiss2split <- striprownames(split(swiss2, swiss2[["Education"]])) sw2split <- SplitDataFrameList(as.list(split(sw2, sw2[["Education"]])), compress = compress) swiss2split[c(2, 4, 5)] <- swiss2split[1] sw2split[c(2, 4, 5)] <- sw2split[1] checkDFL2dfl(sw2split, swiss2split) swiss2split <- striprownames(split(swiss2, swiss2[["Education"]])) swiss2split <- lapply(swiss2split, function(x) {x[["Examination"]] <- x[["Examination"]] + 1L; x}) sw2split <- SplitDataFrameList(as.list(split(sw2, sw2[["Education"]])), compress = compress) sw2split[,"Examination"] <- sw2split[,"Examination"] + 1L checkDFL2dfl(sw2split, swiss2split) swiss2split <- striprownames(split(swiss2, swiss2[["Education"]])) swiss2split <- lapply(swiss2split, function(x) { x[["Examination"]][x[["Examination"]] > 22] <- x[["Examination"]][x[["Examination"]] > 22] + 1L x }) sw2split <- SplitDataFrameList(as.list(split(sw2, sw2[["Education"]])), compress = compress) sw2split[sw2split[, "Examination"] > 22, "Examination"] <- sw2split[sw2split[, "Examination"] > 22,"Examination"] + 1L checkDFL2dfl(sw2split, swiss2split) } } test_DataFrameList_transform <- function() { DF <- DataFrame(state.division, state.region, state.area) DFL <- split(DF, DF$state.division) # NICER: split(DF, ~ state.devision) DFL <- transform(DFL, total.area=sum(state.area[state.region!="South"]), fraction=ifelse2(total.area == 0, 0, state.area/total.area)) ANS <- DataFrame(lapply(unlist(DFL, use.names=FALSE), unname)) df <- as.data.frame(DF) df$total.area <- with(subset(df, state.region != "South"), sapply(split(state.area, state.division), sum))[df$state.division] df$fraction <- with(df, ifelse(total.area == 0, 0, state.area/total.area)) df <- df[order(df$state.division),] rownames(df) <- NULL checkIdentical(ANS, DataFrame(df)) } IRanges/inst/unitTests/test_Grouping-class.R0000644000175400017540000001316413175713360022201 0ustar00biocbuildbiocbuild### test_PartitioningByEnd <- function() { ## on a numeric vector, NG not supplied current0 <- PartitioningByEnd() checkTrue(validObject(current0)) target <- new("PartitioningByEnd") checkIdentical(target, current0) breakpoints <- c(0, 5, 5, 8) current1 <- PartitioningByEnd(breakpoints) checkTrue(validObject(current1)) checkIdentical(4L, length(current1)) checkIdentical(as.integer(breakpoints), end(current1)) checkIdentical(end(current1), cumsum(width(current1))) checkIdentical(NULL, names(current1)) checkException(PartitioningByEnd(breakpoints, names=letters), silent=TRUE) current2 <- PartitioningByEnd(breakpoints, names=letters[1:4]) checkTrue(validObject(current2)) checkIdentical(letters[1:4], names(current2)) names(breakpoints) <- names(current2) current3 <- PartitioningByEnd(breakpoints) checkIdentical(current2, current3) current4 <- PartitioningByEnd(breakpoints, names=LETTERS[4:1]) checkIdentical(LETTERS[4:1], names(current4)) breakpoints <- rep.int(0, 1000) current5 <- PartitioningByEnd(breakpoints) checkTrue(validObject(current5)) checkIdentical(as.integer(breakpoints), end(current5)) checkIdentical(end(current5), cumsum(width(current5))) ## on a PartitioningByEnd object checkIdentical(current1, PartitioningByEnd(current1)) # no-op checkIdentical(current2, PartitioningByEnd(current2)) # no-op checkException(PartitioningByEnd(current2, names=LETTERS), silent=TRUE) current6 <- PartitioningByEnd(current2, names=names(current4)) checkTrue(validObject(current6)) checkIdentical(names(current4), names(current6)) ## on CompressedList, SimpleList, IRanges, and list objects do_checks <- function(x) { checkIdentical(current1, PartitioningByEnd(x)) checkException(PartitioningByEnd(x, names=letters), silent=TRUE) checkIdentical(current2, PartitioningByEnd(x, names=names(current2))) names(x) <- names(current2) checkIdentical(current2, PartitioningByEnd(x)) checkIdentical(current4, PartitioningByEnd(x, names=names(current4))) } x <- RleList(Rle(), Rle(-3, 5), Rle(), Rle(1:0, c(2,1)), compress=TRUE) do_checks(x) do_checks(as(x, "SimpleList")) do_checks(as.list(x)) x <- IRanges(seq(148, by=-50, length.out=4), width=width(current1)) do_checks(x) do_checks(as.list(x)) do_checks(list(NULL, integer(5), complex(0), raw(3))) } test_PartitioningByWidth <- function() { ## on a numeric vector, NG not supplied current0 <- PartitioningByWidth() checkTrue(validObject(current0)) target <- new("PartitioningByWidth") checkIdentical(target, current0) widths <- c(0, 5, 0, 3) current1 <- PartitioningByWidth(widths) checkTrue(validObject(current1)) checkIdentical(4L, length(current1)) checkIdentical(as.integer(widths), width(current1)) checkIdentical(end(current1), cumsum(width(current1))) checkIdentical(NULL, names(current1)) checkException(PartitioningByWidth(widths, names=letters), silent=TRUE) current2 <- PartitioningByWidth(widths, names=letters[1:4]) checkTrue(validObject(current2)) checkIdentical(letters[1:4], names(current2)) names(widths) <- names(current2) current3 <- PartitioningByWidth(widths) checkIdentical(current2, current3) current4 <- PartitioningByWidth(widths, names=LETTERS[4:1]) checkIdentical(LETTERS[4:1], names(current4)) widths <- rep.int(0, 1000) current5 <- PartitioningByWidth(widths) checkTrue(validObject(current5)) checkIdentical(as.integer(widths), width(current5)) checkIdentical(end(current5), cumsum(width(current5))) ## on a PartitioningByWidth object checkIdentical(current1, PartitioningByWidth(current1)) # no-op checkIdentical(current2, PartitioningByWidth(current2)) # no-op checkException(PartitioningByWidth(current2, names=LETTERS), silent=TRUE) current6 <- PartitioningByWidth(current2, names=names(current4)) checkTrue(validObject(current6)) checkIdentical(names(current4), names(current6)) ## on CompressedList, SimpleList, IRanges, and list objects do_checks <- function(x) { checkIdentical(current1, PartitioningByWidth(x)) checkException(PartitioningByWidth(x, names=letters), silent=TRUE) checkIdentical(current2, PartitioningByWidth(x, names=names(current2))) names(x) <- names(current2) checkIdentical(current2, PartitioningByWidth(x)) checkIdentical(current4, PartitioningByWidth(x, names=names(current4))) } x <- RleList(Rle(), Rle(-3, 5), Rle(), Rle(1:0, c(2,1)), compress=TRUE) do_checks(x) do_checks(as(x, "SimpleList")) do_checks(as.list(x)) x <- IRanges(seq(148, by=-50, length.out=4), width=width(current1)) do_checks(x) do_checks(as.list(x)) do_checks(list(NULL, integer(5), complex(0), raw(3))) } test_PartitioningByEndOrWidth_NG_supplied <- function() { for (class in c("PartitioningByEnd", "PartitioningByWidth")) { CONSTRUCTOR <- get(class) x <- c(3, 3, 4, 6) NG <- 8 current1 <- CONSTRUCTOR(x, NG) checkTrue(is(current1, class)) checkTrue(validObject(current1)) checkIdentical(8L, length(current1)) checkIdentical(tabulate(x, nbins=NG), width(current1)) checkException(CONSTRUCTOR(x, NG, names=letters[1:4]), silent=TRUE) current2 <- CONSTRUCTOR(x, NG, names=letters[1:8]) checkTrue(validObject(current2)) checkIdentical(letters[1:8], names(current2)) names(x) <- letters[1:4] current3 <- CONSTRUCTOR(x, NG) checkIdentical(current1, current3) } } IRanges/inst/unitTests/test_HitsList.R0000644000175400017540000000037013175713360021042 0ustar00biocbuildbiocbuildtest_HitsList_as_matrix <- function() { x <- IRangesList(chr1=IRanges(1, 5), chr2=IRanges(6, 10)) y <- IRangesList(chr2=IRanges(8, 10)) checkIdentical(as.matrix(findOverlaps(x, y)), cbind(queryHits = 2L, subjectHits = 1L)) } IRanges/inst/unitTests/test_IRanges-class.R0000644000175400017540000000400713175713360021733 0ustar00biocbuildbiocbuildtest_IRanges_names <- function() { range1 <- IRanges(start=c(1,2,3), end=c(5,2,8)) checkIdentical(names(range1), NULL) nms <- c("a", NA, "b") names(range1) <- nms checkIdentical(names(range1), nms) checkTrue(validObject(nms)) names(range1) <- NULL checkTrue(validObject(nms)) checkIdentical(names(range1), NULL) names(range1) <- "a" checkTrue(validObject(range1)) checkIdentical(names(range1), c("a", NA, NA)) checkException(names(range1) <- c("a", "b", "c", "d"), silent = TRUE) } test_Ranges_isDisjoint <- function() { ir1 <- IRanges(c(2,5,1), c(3,7,3)) ir2 <- IRanges(c(2,9,5), c(3,9,6)) ir3 <- IRanges(1, 5) checkIdentical(isDisjoint(ir1), FALSE) checkIdentical(isDisjoint(ir2), TRUE) checkIdentical(isDisjoint(ir3), TRUE) ## Handling of zero-width ranges current <- sapply(11:17, function(i) isDisjoint(IRanges(c(12, i), width=c(4, 0)))) target <- rep(c(TRUE, FALSE, TRUE), c(2, 3, 2)) checkIdentical(target, current) } test_IRanges_combine <- function() { range <- IRanges(start=c(1,2,3,1), end=c(5,2,8,3)) srange <- split(range, start(range) == 1) checkIdentical(srange, as(RangesList(`FALSE` = range[2:3], `TRUE` = range[c(1,4)]), "CompressedIRangesList")) checkIdentical(do.call(c, unname(as.list(srange))), IRanges(c(2,3,1,1), c(2,8,5,3))) ir1 <- IRanges(1, 10) ir2 <- IRanges(c(1, 15), width=5) mcols(ir2) <- DataFrame(score=1:2) checkIdentical(mcols(c(ir1, ir2)), DataFrame(score = c(NA, 1L, 2L))) ## Combining multiple IRanges object with varying mcols mcols(ir1) <- DataFrame(gc=0.78) checkException(c(ir1, ir2), silent=TRUE) checkIdentical(mcols(c(ir1, ir2, ignore.mcols=TRUE)), NULL) } test_IRanges_annotation <- function() { range <- IRanges(c(1, 4), c(5, 7)) mcols(range) <- DataFrame(a = 1:2) checkIdentical(mcols(range)[,1], 1:2) checkIdentical(mcols(range[2:1])[,1], 2:1) checkIdentical(mcols(c(range,range))[,1], rep(1:2,2)) } IRanges/inst/unitTests/test_NCList-class.R0000644000175400017540000005116113175713360021542 0ustar00biocbuildbiocbuild### findOverlaps_NCList <- IRanges:::findOverlaps_NCList findOverlaps_NCLists <- IRanges:::findOverlaps_NCLists .transpose_hits <- function(hits) { if (is.list(hits)) return(lapply(hits, .transpose_hits)) t(hits) } ### Used in the unit tests for GNCList located in GenomicRanges. .compare_hits <- function(target, current) { if (is.list(target) || is(target, "List") && is.list(current) || is(current, "List")) return(all(mapply(.compare_hits, target, current))) identical(.transpose_hits(target), .transpose_hits(current)) } ### Used in the unit tests for GNCList located in GenomicRanges. .make_Hits_from_q2s <- function(q2s, s_len) { q_hits <- rep.int(seq_along(q2s), elementNROWS(q2s)) s_hits <- as.integer(unlist(q2s, use.names=FALSE)) Hits(q_hits, s_hits, length(q2s), s_len, sort.by.query=TRUE) } .make_Hits_from_s2q <- function(s2q, q_len) .transpose_hits(.make_Hits_from_q2s(s2q, q_len)) .select_hits <- function(x, select) { if (is.list(x)) return(lapply(x, .select_hits, select)) selectHits(x, select) } ### Vectorized. Return -1 if the query and subject overlap (i.e. if ### end(query) < start(subject) and end(subject) < start(query) are both ### false). Otherwise (i.e. if they are disjoint), return the width of the ### gap between them. Note that a gap width of 0 means that they are adjacent. ### TODO: Rename this pgapWidth(), make it a generic with various methods ### (at least one for Ranges and one for GenomicRanges objects), and export it. .gapwidth <- function(query, subject) { ifelse(end(query) < start(subject), start(subject) - end(query), ifelse(end(subject) < start(query), start(query) - end(subject), 0L)) - 1L } ### Vectorized. ### TODO: Rename this poverlapWidth(), make it a generic with various methods ### (at least one for Ranges and one for GenomicRanges objects), and export it. .overlapwidth <- function(query, subject) { score <- pmin.int(end(query), end(subject)) - pmax.int(start(query), start(subject)) + 1L pmax.int(score, 0L) } ### Used in the unit tests for GNCList located in GenomicRanges. .get_query_overlaps <- function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "extend", "equal")) { type <- match.arg(type) if (type == "any" && maxgap != -1L && minoverlap != 0L) stop("when 'type' is \"any\", at least one of 'maxgap' ", "and 'minoverlap' must be set to its default value") overlapwidth <- .overlapwidth(query, subject) ok <- overlapwidth >= minoverlap if (type == "any") { gapwidth <- .gapwidth(query, subject) ok <- ok & gapwidth <= maxgap return(ok) } if (maxgap == -1L) maxgap <- 0L if (type != "end") d1 <- abs(start(subject) - start(query)) if (type != "start") d2 <- abs(end(subject) - end(query)) if (type == "start") return(ok & d1 <= maxgap) if (type == "end") return(ok & d2 <= maxgap) if (type == "equal") return(ok & d1 <= maxgap & d2 <= maxgap) if (type == "within") { ok2 <- start(query) >= start(subject) & end(query) <= end(subject) } else { # type == "extend" ok2 <- start(query) <= start(subject) & end(query) >= end(subject) } ok <- ok & ok2 if (maxgap > 0L) ok <- ok & (d1 + d2) <= maxgap ok } .findOverlaps_naive <- function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "extend", "equal"), select=c("all", "first", "last", "arbitrary", "count")) { type <- match.arg(type) select <- match.arg(select) hits_per_query <- lapply(seq_along(query), function(i) which(.get_query_overlaps(query[i], subject, maxgap=maxgap, minoverlap=minoverlap, type=type))) hits <- .make_Hits_from_q2s(hits_per_query, length(subject)) selectHits(hits, select=select) } test_NCList <- function() { x <- IRanges(rep.int(1:6, 6:1), c(0:5, 1:5, 2:5, 3:5, 4:5, 5), names=LETTERS[1:21]) mcols(x) <- DataFrame(score=seq(0.7, by=0.045, length.out=21)) nclist <- NCList(x) checkTrue(is(nclist, "NCList")) checkTrue(validObject(nclist, complete=TRUE)) checkIdentical(x, ranges(nclist, use.mcols=TRUE)) checkIdentical(length(x), length(nclist)) checkIdentical(names(x), names(nclist)) checkIdentical(start(x), start(nclist)) checkIdentical(end(x), end(nclist)) checkIdentical(width(x), width(nclist)) checkIdentical(x, as(nclist, "IRanges")) checkIdentical(x[-6], as(nclist[-6], "IRanges")) } ### Test findOverlaps_NCList() *default* behavior, that is, with all optional ### arguments (i.e. 'maxgap', 'minoverlap', 'type', 'select', and ### 'circle.length') set to their default value. test_findOverlaps_NCList <- function() { query <- IRanges(-3:7, width=3) subject <- IRanges(rep.int(1:6, 6:1), c(0:5, 1:5, 2:5, 3:5, 4:5, 5)) target0 <- .findOverlaps_naive(query, subject) current <- findOverlaps_NCList(query, NCList(subject)) checkTrue(.compare_hits(target0, current)) current <- findOverlaps_NCList(NCList(query), subject) checkTrue(.compare_hits(target0, current)) current <- findOverlaps_NCList(query, subject) checkTrue(.compare_hits(target0, current)) ## Shuffle query and/or subject elements. permute_input <- function(q_perm, s_perm) { q_revperm <- integer(length(q_perm)) q_revperm[q_perm] <- seq_along(q_perm) s_revperm <- integer(length(s_perm)) s_revperm[s_perm] <- seq_along(s_perm) target <- remapHits(target0, Lnodes.remapping=q_revperm, new.nLnode=length(q_perm), Rnodes.remapping=s_revperm, new.nRnode=length(s_perm)) current <- findOverlaps_NCList(query[q_perm], NCList(subject[s_perm])) checkTrue(.compare_hits(target, current)) current <- findOverlaps_NCList(NCList(query[q_perm]), subject[s_perm]) checkTrue(.compare_hits(target, current)) current <- findOverlaps_NCList(query[q_perm], subject[s_perm]) checkTrue(.compare_hits(target, current)) } q_perm <- rev(seq_along(query)) s_perm <- rev(seq_along(subject)) permute_input(q_perm, seq_along(subject)) # reverse query permute_input(seq_along(query), s_perm) # reverse subject permute_input(q_perm, s_perm) # reverse both set.seed(97) for (i in 1:33) { ## random permutations q_perm <- sample(length(query)) s_perm <- sample(length(subject)) permute_input(q_perm, seq_along(subject)) permute_input(seq_along(query), s_perm) permute_input(q_perm, s_perm) } } test_findOverlaps_NCList_with_filtering <- function() { query <- IRanges(-3:7, width=3) subject <- IRanges(rep.int(1:6, 6:1), c(0:5, 1:5, 2:5, 3:5, 4:5, 5)) pp_query <- NCList(query) pp_subject <- NCList(subject) for (type in c("any", "start", "end", "within", "extend", "equal")) { for (maxgap in -1:3) { if (type != "any" || maxgap == -1L) max_minoverlap <- 4L else max_minoverlap <- 0L for (minoverlap in 0:max_minoverlap) { for (select in c("all", "first", "last", "count")) { ## query - subject target <- .findOverlaps_naive(query, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) current <- findOverlaps_NCList(query, pp_subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) checkTrue(.compare_hits(target, current)) current <- findOverlaps_NCList(pp_query, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) checkTrue(.compare_hits(target, current)) current <- findOverlaps_NCList(query, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) checkTrue(.compare_hits(target, current)) ## subject - query target <- .findOverlaps_naive(subject, query, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) current <- findOverlaps_NCList(pp_subject, query, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) checkTrue(.compare_hits(target, current)) current <- findOverlaps_NCList(subject, pp_query, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) checkTrue(.compare_hits(target, current)) current <- findOverlaps_NCList(subject, query, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) checkTrue(.compare_hits(target, current)) ## subject - subject target <- .findOverlaps_naive(subject, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) current <- findOverlaps_NCList(pp_subject, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) checkTrue(.compare_hits(target, current)) current <- findOverlaps_NCList(subject, pp_subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) checkTrue(.compare_hits(target, current)) current <- findOverlaps_NCList(subject, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) checkTrue(.compare_hits(target, current)) } } } } } ### Only test "start" and "end" types at the moment. test_findOverlaps_NCList_special_types <- function() { x <- IRanges(10, 10) x1 <- IRanges(10, 9) y1 <- IRanges(start=c(7, 7, 13, 13), width=c(2, 0, 2, 0)) stopifnot(all(abs(start(x) - start(y1)) == 3L)) stopifnot(all(abs(start(x1) - start(y1)) == 3L)) x2 <- IRanges(11, 10) y2 <- IRanges(end=c(7, 7, 13, 13), width=c(2, 0, 2, 0)) stopifnot(all(abs(end(x) - end(y2)) == 3L)) stopifnot(all(abs(end(x2) - end(y2)) == 3L)) test_maxgap_and_type <- function(maxgap, minoverlap, nhit) { hits <- findOverlaps(x, y1, maxgap=maxgap, minoverlap=minoverlap, type="start") checkEquals(nhit, length(hits)) hits <- findOverlaps(y1, x, maxgap=maxgap, minoverlap=minoverlap, type="start") checkEquals(nhit, length(hits)) hits <- findOverlaps(x1, y1, maxgap=maxgap, minoverlap=minoverlap, type="start") checkEquals(nhit, length(hits)) hits <- findOverlaps(y1, x1, maxgap=maxgap, minoverlap=minoverlap, type="start") checkEquals(nhit, length(hits)) hits <- findOverlaps(x, y2, maxgap=maxgap, minoverlap=minoverlap, type="end") checkEquals(nhit, length(hits)) hits <- findOverlaps(y2, x, maxgap=maxgap, minoverlap=minoverlap, type="end") checkEquals(nhit, length(hits)) hits <- findOverlaps(x2, y2, maxgap=maxgap, minoverlap=minoverlap, type="end") checkEquals(nhit, length(hits)) hits <- findOverlaps(y2, x2, maxgap=maxgap, minoverlap=minoverlap, type="end") checkEquals(nhit, length(hits)) } ## no hits for (maxgap in -1:2) { test_maxgap_and_type(maxgap, minoverlap=1L, 0L) test_maxgap_and_type(maxgap, minoverlap=0L, 0L) } for (maxgap in 3:5) { ## no hits test_maxgap_and_type(maxgap, minoverlap=1L, 0L) ## 4 hits test_maxgap_and_type(maxgap, minoverlap=0L, 4L) } } .test_arbitrary_selection <- function(query, subject) { pp_query <- NCList(query) pp_subject <- NCList(subject) for (type in c("any", "start", "end", "within", "extend", "equal")) { for (maxgap in -1:3) { if (type != "any" || maxgap == -1L) max_minoverlap <- 4L else max_minoverlap <- 0L for (minoverlap in 0:max_minoverlap) { target <- as(.findOverlaps_naive(query, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select="all"), "CompressedIntegerList") target_idx0 <- elementNROWS(target) == 0L check_arbitrary_hits <- function(current) { current_idx0 <- is.na(current) checkIdentical(target_idx0, current_idx0) current <- as(current, "CompressedIntegerList") checkTrue(all(current_idx0 | as.logical(current %in% target))) } current <- findOverlaps_NCList(query, pp_subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select="arbitrary") check_arbitrary_hits(current) current <- findOverlaps_NCList(pp_query, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select="arbitrary") check_arbitrary_hits(current) current <- findOverlaps_NCList(query, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select="arbitrary") check_arbitrary_hits(current) } } } } test_findOverlaps_NCList_arbitrary <- function() { query <- IRanges(4:3, 6) subject <- IRanges(2:4, 10) .test_arbitrary_selection(query, subject) query <- IRanges(-3:7, width=3) subject <- IRanges(rep.int(1:6, 6:1), c(0:5, 1:5, 2:5, 3:5, 4:5, 5)) .test_arbitrary_selection(query, subject) } .test_circularity <- function(query0, subject0, circle_length, target0, pp, findOverlaps_pp, type) { for (i in -2:2) { query <- shift(query0, shift=i*circle_length) pp_query <- pp(query, circle.length=circle_length) for (j in -2:2) { subject <- shift(subject0, shift=j*circle_length) pp_subject <- pp(subject, circle.length=circle_length) for (select in c("all", "first", "last", "count")) { target <- .select_hits(target0, select=select) current <- findOverlaps_pp(query, pp_subject, type=type, select=select, circle.length=circle_length) checkTrue(.compare_hits(target, current)) current <- findOverlaps_pp(pp_query, subject, type=type, select=select, circle.length=circle_length) checkTrue(.compare_hits(target, current)) current <- findOverlaps_pp(query, subject, type=type, select=select, circle.length=circle_length) checkTrue(.compare_hits(target, current)) target <- .select_hits(.transpose_hits(target0), select=select) current <- findOverlaps_pp(pp_subject, query, type=type, select=select, circle.length=circle_length) checkTrue(.compare_hits(target, current)) current <- findOverlaps_pp(subject, pp_query, type=type, select=select, circle.length=circle_length) checkTrue(.compare_hits(target, current)) current <- findOverlaps_pp(subject, query, type=type, select=select, circle.length=circle_length) checkTrue(.compare_hits(target, current)) } } } } test_findOverlaps_NCList_with_circular_space <- function() { query <- IRanges(-2:17, width=3) subject <- IRanges(c(4, -1, 599), c(7, 0, 999)) circle_length <- 10L ## type "any" s2q <- list(c(5:10, 15:20L), c(1:3, 10:13, 20L), 1:20) target <- .make_Hits_from_s2q(s2q, length(query)) .test_circularity(query, subject, circle_length, target, NCList, findOverlaps_NCList, "any") ## type "start" s2q <- lapply(start(subject), function(s) which((start(query) - s) %% circle_length == 0L)) target <- .make_Hits_from_s2q(s2q, length(query)) .test_circularity(query, subject, circle_length, target, NCList, findOverlaps_NCList, "start") ## type "end" s2q <- lapply(end(subject), function(e) which((end(query) - e) %% circle_length == 0L)) target <- .make_Hits_from_s2q(s2q, length(query)) .test_circularity(query, subject, circle_length, target, NCList, findOverlaps_NCList, "end") } test_NCLists <- function() { x1 <- IRanges(-3:7, width=3) x2 <- IRanges() x3 <- IRanges(rep.int(1:6, 6:1), c(0:5, 1:5, 2:5, 3:5, 4:5, 5)) x <- IRangesList(x1=x1, x2=x2, x3=x3) mcols(x) <- DataFrame(label=c("first", "second", "third")) nclists <- NCLists(x) checkTrue(is(nclists, "NCLists")) checkTrue(validObject(nclists, complete=TRUE)) checkIdentical(x, ranges(nclists, use.mcols=TRUE)) checkIdentical(length(x), length(nclists)) checkIdentical(names(x), names(nclists)) checkIdentical(start(x), start(nclists)) checkIdentical(end(x), end(nclists)) checkIdentical(width(x), width(nclists)) checkIdentical(x, as(nclists, "IRangesList")) checkIdentical(x[-1], as(nclists[-1], "IRangesList")) checkIdentical(elementNROWS(x), elementNROWS(nclists)) nclist <- nclists[[3]] checkTrue(is(nclist, "NCList")) checkTrue(validObject(nclist, complete=TRUE)) checkIdentical(x3, as(nclist, "IRanges")) } test_findOverlaps_NCLists <- function() { ir1 <- IRanges(-3:7, width=3) ir2 <- IRanges(rep.int(1:6, 6:1), c(0:5, 1:5, 2:5, 3:5, 4:5, 5)) target0 <- mapply(findOverlaps_NCList, list(ir1, ir2), list(ir2, ir1)) for (compress in c(TRUE, FALSE)) { query <- IRangesList(ir1, ir2, IRanges(2, 7), compress=compress) pp_query <- NCLists(query) subject <- IRangesList(ir2, ir1, compress=compress) pp_subject <- NCLists(subject) for (select in c("all", "first", "last", "count")) { target <- .select_hits(target0, select=select) current <- findOverlaps_NCLists(query, pp_subject, select=select) checkTrue(.compare_hits(target, current)) current <- findOverlaps_NCLists(pp_query, subject, select=select) checkTrue(.compare_hits(target, current)) current <- findOverlaps_NCLists(query, subject, select=select) checkTrue(.compare_hits(target, current)) } } } test_findOverlaps_NCLists_with_circular_space <- function() { query1 <- IRanges(-2:17, width=3) subject1 <- IRanges(c(4, -1, 599), c(7, 0, 999)) query <- IRangesList(query1, IRanges(), subject1) subject <- IRangesList(subject1, IRanges(), query1) circle_length <- c(10L, NA_integer_, 10L) s2q <- list(c(5:10, 15:20L), c(1:3, 10:13, 20L), 1:20) target1 <- .make_Hits_from_s2q(s2q, length(query1)) target2 <- .make_Hits_from_s2q(list(), 0) target3 <- .transpose_hits(target1) target <- list(target1, target2, target3) .test_circularity(query, subject, circle_length, target, NCLists, findOverlaps_NCLists, "any") } IRanges/inst/unitTests/test_Ranges-comparison.R0000644000175400017540000000366113175713360022674 0ustar00biocbuildbiocbuildtest_Ranges_pcompare <- function() { x1 <- IRanges(6:16, width=4) y <- IRanges(11, 14) target <- c(-6:-4, -4L, -4L, 0L, 4L, 4L, 4:6) checkIdentical(target, pcompare(x1, y)) checkIdentical(-target, pcompare(y, x1)) x2 <- IRanges(4:16, width=6) target <- c(-6:-4, -4L, -4L, -3L, -2L, 1L, 4L, 4L, 4:6) checkIdentical(target, pcompare(x2, y)) checkIdentical(-target, pcompare(y, x2)) x3 <- IRanges(8:16, width=2) target <- c(-6:-4, -1L, 2L, 3L, 4:6) checkIdentical(target, pcompare(x3, y)) checkIdentical(-target, pcompare(y, x3)) ## Moving a 0-width range over a non 0-width range. ## Note that when the end of the 0-width range is equal to the start of ## the non 0-width range minus 1, returning code -5 (which describes ## a situation of adjacent ranges) seems appropriate. ## However, one could argue that returning code -1 (which describes a ## situation where one range is inside the other) would also be ## appropriate, because, in that case, the two ranges have the same start. ## So the question really is whether the 0-width range should be considered ## *outside* or *inside* the non 0-width range. ## It's an arbitrary choice and we chose the former. x0 <- IRanges(10:16, width=0) target <- c(-6:-5, 2L, 2L, 2L, 5:6) checkIdentical(target, pcompare(x0, y)) checkIdentical(-target, pcompare(y, x0)) ## Moving a 0-width range over a 0-width range. y0 <- IRanges(13, 12) target <- c(-6L, -6L, -6L, 0L, 6L, 6L, 6L) checkIdentical(target, pcompare(x0, y0)) checkIdentical(-target, pcompare(y0, x0)) } test_Ranges_order <- function() { ir1 <- IRanges(c(2,5,1,5), c(3,7,3,6)) ir1.sort <- IRanges(c(1,2,5,5), c(3,3,6,7)) ir1.rev <- IRanges(c(5,5,2,1), c(7,6,3,3)) checkIdentical(sort(ir1), ir1.sort) checkIdentical(sort(ir1, decreasing=TRUE), ir1.rev) checkException(sort(ir1, decreasing=NA), silent = TRUE) } IRanges/inst/unitTests/test_RangesList-class.R0000644000175400017540000001114013175713360022452 0ustar00biocbuildbiocbuildtest_RangesList_construction <- function() { empty <- RangesList() checkTrue(validObject(empty)) checkIdentical(length(empty), 0L) range1 <- IRanges(start=c(1,2,3), end=c(5,2,8)) range2 <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5)) named <- RangesList(one = range1, two = range2) checkTrue(validObject(named)) checkIdentical(length(named), 2L) checkIdentical(start(named), IntegerList(one = start(range1), two = start(range2), compress=FALSE)) checkIdentical(end(named), IntegerList(one = end(range1), two = end(range2), compress=FALSE)) checkIdentical(width(named), IntegerList(one = width(range1), two = width(range2), compress=FALSE)) checkIdentical(names(named), c("one", "two")) checkIdentical(range1, named[[1]]) unnamed <- RangesList(range1, range2) checkTrue(validObject(unnamed)) checkIdentical(length(unnamed), 2L) checkIdentical(range2, unnamed[[2]]) checkIdentical(names(unnamed), NULL) } test_RangesList_subset <- function() { ## by RangesList range1 <- IRanges(start=c(1,2,3), end=c(5,2,8)) range2 <- IRanges(start=c(1,15,20,45), end=c(5,15,100,80)) collection <- RangesList(one = range1, range2) checkIdentical(subsetByOverlaps(collection, RangesList()), RangesList(one=IRanges(), IRanges())) checkIdentical(subsetByOverlaps(collection, RangesList(IRanges(4, 6), IRanges(50, 70))), RangesList(one=IRanges(c(1,3),c(5,8)), IRanges(c(20,45),c(100,80)))) checkIdentical(subsetByOverlaps(collection, RangesList(IRanges(50, 70), one=IRanges(4, 6))), RangesList(one=IRanges(c(1,3),c(5,8)), IRanges())) } test_RangesList_as_list <- function() { range1 <- IRanges(start=c(1,2,3), end=c(5,2,8)) range2 <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5)) checkIdentical(list(range1, range2), as.list(RangesList(range1, range2))) checkIdentical(list(a=range1, b=range2), as.list(RangesList(a=range1, b=range2))) } test_RangesList_as_data_frame <- function() { range1 <- IRanges(start=c(1,2,3), end=c(5,2,8)) range2 <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5)) rl <- RangesList(range1, range2) df <- data.frame(group=togroup(PartitioningByWidth(rl)), group_name=NA_character_, as.data.frame(c(range1,range2)), stringsAsFactors=FALSE) checkIdentical(df, as.data.frame(rl)) names(rl) <- c("a", "b") df$group_name <- c("a", "b")[togroup(PartitioningByWidth(rl))] checkIdentical(df, as.data.frame(rl)) } test_IRangesList_construction <- function() { range1 <- IRanges(start=c(1,2,3), end=c(5,2,8)) range2 <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5)) for (compress in c(TRUE, FALSE)) { named <- IRangesList(one = range1, two = range2, compress = compress) checkIdentical(length(named), 2L) checkIdentical(names(named), c("one", "two")) checkIdentical(range1, named[[1]]) unnamed <- IRangesList(range1, range2) checkTrue(validObject(unnamed)) checkIdentical(length(unnamed), 2L) checkIdentical(range2, unnamed[[2]]) checkIdentical(names(unnamed), NULL) } } test_IRangesList_annotation <- function() { range1 <- IRanges(start=c(1,2,3), end=c(5,2,8)) range2 <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5)) for (compress in c(TRUE, FALSE)) { rl <- IRangesList(range1, range2, compress = compress) mcols(rl) <- DataFrame(a = 1:2) checkIdentical(mcols(rl)[,1], 1:2) checkIdentical(mcols(rl[2:1])[,1], 2:1) checkIdentical(mcols(c(rl,rl))[,1], rep(1:2,2)) checkIdentical(mcols(append(rl,rl))[,1], rep(1:2,2)) } } ## test_RangesList_overlap <- function() { ## rl1 <- RangesList(a = IRanges(c(1,2),c(4,3)), b = IRanges(c(4,6),c(10,7))) ## rl2 <- RangesList(b = IRanges(c(0,2),c(4,5)), a = IRanges(c(4,5),c(6,7))) ## overlap(rl1, rl2) ## overlap(rl1, rl2, select = "first") ## overlap(rl1, rl2, select = "first", drop = TRUE) ## names(rl2)[1] <- "c" ## overlap(rl1, rl2) ## overlap(rl1, rl2, select = "first") ## overlap(rl1, rl2, select = "first", drop = TRUE) ## names(rl2) <- NULL ## overlap(rl1, rl2) ## overlap(rl1, rl2, select = "first") ## overlap(rl1, rl2, select = "first", drop = TRUE) ## overlap(rl1, rl2[1]) ## overlap(rl1, rl2[1], select = "first") ## overlap(rl1, rl2[1], select = "first", drop = TRUE) ## overlap(rl1[1], rl2) ## overlap(rl1[1], rl2, select = "first") ## overlap(rl1[1], rl2, select = "first", drop = TRUE) ## } IRanges/inst/unitTests/test_RleViews.R0000644000175400017540000001615613175713360021050 0ustar00biocbuildbiocbuildtest_RleViews <- function() { empty <- Views(Rle(), IRanges()) checkIdentical(empty, new("RleViews")) checkIdentical(list(), viewApply(empty, min)) checkIdentical(integer(0), viewMins(empty)) checkIdentical(integer(0), viewMaxs(empty)) checkIdentical(integer(0), viewSums(empty)) checkIdentical(numeric(0), viewMeans(empty)) checkIdentical(integer(0), viewWhichMins(empty)) checkIdentical(integer(0), viewWhichMaxs(empty)) checkIdentical(IRanges(), viewRangeMins(empty)) checkIdentical(IRanges(), viewRangeMaxs(empty)) x <- rep(c(1L, 3L, NA, 7L, 9L), 1:5) xRle <- Rle(x) xRleViewsUntrimmed <- Views(xRle, IRanges(start = c(1,1), width = c(0,20))) checkIdentical(c(Inf, 1), suppressWarnings(viewApply(xRleViewsUntrimmed, min, na.rm = TRUE))) checkIdentical(c(2147483647L, 1L), viewMins(xRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(viewMins(xRleViewsUntrimmed, na.rm = TRUE), min(xRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(c(-2147483647L, 9L), viewMaxs(xRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(viewMaxs(xRleViewsUntrimmed, na.rm = TRUE), max(xRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(c(0L, 80L), viewSums(xRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(viewSums(xRleViewsUntrimmed, na.rm = TRUE), sum(xRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(c(NaN, 20/3), viewMeans(xRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(viewMeans(xRleViewsUntrimmed, na.rm = TRUE), mean(xRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(c(NA_integer_, 1L), viewWhichMins(xRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(viewWhichMins(xRleViewsUntrimmed, na.rm = TRUE), which.min(xRleViewsUntrimmed)) checkIdentical(c(NA_integer_, 11L), viewWhichMaxs(xRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(viewWhichMaxs(xRleViewsUntrimmed, na.rm = TRUE), which.max(xRleViewsUntrimmed)) checkException(max(xRleViewsUntrimmed, xRleViewsUntrimmed, na.rm = TRUE), silent = TRUE) xRleViews <- Views(xRle, start = c(1, 3, 5, 7, 9), end = c(1, 13, 11, 10, 9), names = letters[1:5]) xList <- lapply(structure(seq_len(length(xRleViews)), names = letters[1:5]), function(i) window(x, start = start(xRleViews)[i], end = end(xRleViews)[i])) checkIdentical(letters[1:5], names(viewApply(xRleViews, min))) checkIdentical(letters[1:5], names(viewMins(xRleViews))) checkIdentical(letters[1:5], names(viewMaxs(xRleViews))) checkIdentical(letters[1:5], names(viewSums(xRleViews))) checkIdentical(letters[1:5], names(viewMeans(xRleViews))) checkIdentical(letters[1:5], names(viewWhichMins(xRleViews))) checkIdentical(letters[1:5], names(viewWhichMaxs(xRleViews))) checkIdentical(letters[1:5], names(viewRangeMins(xRleViews, na.rm = TRUE))) checkIdentical(letters[1:5], names(viewRangeMaxs(xRleViews, na.rm = TRUE))) checkEqualsNumeric(sapply(xList, min), viewMins(xRleViews)) checkEqualsNumeric(sapply(xList, min), viewApply(xRleViews, min)) checkEqualsNumeric(sapply(xList, min, na.rm = TRUE), viewMins(xRleViews, na.rm = TRUE)) checkEqualsNumeric(sapply(xList, min, na.rm = TRUE), viewApply(xRleViews, min, na.rm = TRUE)) checkEqualsNumeric(sapply(xList, max), viewMaxs(xRleViews)) checkEqualsNumeric(sapply(xList, max), viewApply(xRleViews, max)) checkEqualsNumeric(sapply(xList, max, na.rm = TRUE), viewMaxs(xRleViews, na.rm = TRUE)) checkEqualsNumeric(sapply(xList, max, na.rm = TRUE), viewApply(xRleViews, max, na.rm = TRUE)) checkEqualsNumeric(sapply(xList, sum), viewSums(xRleViews)) checkEqualsNumeric(sapply(xList, mean), viewMeans(xRleViews)) checkEqualsNumeric(sapply(xList, sum), viewApply(xRleViews, sum)) checkEqualsNumeric(sapply(xList, sum, na.rm = TRUE), viewSums(xRleViews, na.rm = TRUE)) checkEqualsNumeric(sapply(xList, mean, na.rm = TRUE), viewMeans(xRleViews, na.rm = TRUE)) checkEqualsNumeric(sapply(xList, sum, na.rm = TRUE), viewApply(xRleViews, sum, na.rm = TRUE)) y <- rep(c(1.2, 3.4, NA, 7.8, 9.0), 1:5) yRle <- Rle(y) yRleViewsUntrimmed <- Views(yRle, IRanges(start = c(1,1), width = c(0,20))) checkIdentical(c(Inf, 1.2), suppressWarnings(viewApply(yRleViewsUntrimmed, min, na.rm = TRUE))) checkIdentical(c(Inf, 1.2), viewMins(yRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(c(-Inf, 9), viewMaxs(yRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(c(0, 84.2), viewSums(yRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(c(NaN, 84.2/12), viewMeans(yRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(c(NA_integer_, 1L), viewWhichMins(yRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(c(NA_integer_, 11L), viewWhichMaxs(yRleViewsUntrimmed, na.rm = TRUE)) yRleViews <- Views(yRle, start = c(1, 3, 5, 7, 9), end = c(1, 13, 11, 10, 9)) yList <- lapply(seq_len(length(yRleViews)), function(i) window(y, start = start(yRleViews)[i], end = end(yRleViews)[i])) checkEqualsNumeric(sapply(yList, min), viewMins(yRleViews)) checkEqualsNumeric(sapply(yList, min), viewApply(yRleViews, min)) checkEqualsNumeric(sapply(yList, min, na.rm = TRUE), viewMins(yRleViews, na.rm = TRUE)) checkEqualsNumeric(sapply(yList, min, na.rm = TRUE), viewApply(yRleViews, min, na.rm = TRUE)) checkEqualsNumeric(sapply(yList, max), viewMaxs(yRleViews)) checkEqualsNumeric(sapply(yList, max), viewApply(yRleViews, max)) checkEqualsNumeric(sapply(yList, max, na.rm = TRUE), viewMaxs(yRleViews, na.rm = TRUE)) checkEqualsNumeric(sapply(yList, max, na.rm = TRUE), viewApply(yRleViews, max, na.rm = TRUE)) checkEqualsNumeric(sapply(yList, sum), viewSums(yRleViews)) checkEqualsNumeric(sapply(yList, mean), viewMeans(yRleViews)) checkEqualsNumeric(sapply(yList, sum), viewApply(yRleViews, sum)) checkEqualsNumeric(sapply(yList, sum, na.rm = TRUE), viewSums(yRleViews, na.rm = TRUE)) checkEqualsNumeric(sapply(yList, mean, na.rm = TRUE), viewMeans(yRleViews, na.rm = TRUE)) checkEqualsNumeric(sapply(yList, sum, na.rm = TRUE), viewApply(yRleViews, sum, na.rm = TRUE)) z <- rep(c(1+1i, 3.4-1i, NA, 7.8+3i, 9.0-2i), 1:5) zRle <- Rle(z) zRleViews <- Views(zRle, start = c(1, 3, 5, 7, 9), end = c(1, 13, 11, 10, 9)) zList <- lapply(seq_len(length(zRleViews)), function(i) window(z, start = start(zRleViews)[i], end = end(zRleViews)[i])) checkEqualsNumeric(sapply(zList, sum), viewSums(zRleViews)) checkEqualsNumeric(sapply(zList, mean), viewMeans(zRleViews)) checkEqualsNumeric(sapply(zList, sum), viewApply(zRleViews, sum)) checkEqualsNumeric(sapply(zList, sum, na.rm = TRUE), viewSums(zRleViews, na.rm = TRUE)) checkEqualsNumeric(sapply(zList, mean, na.rm = TRUE), viewMeans(zRleViews, na.rm = TRUE)) checkEqualsNumeric(sapply(zList, sum, na.rm = TRUE), viewApply(zRleViews, sum, na.rm = TRUE)) } IRanges/inst/unitTests/test_RleViewsList.R0000644000175400017540000001302713175713360021676 0ustar00biocbuildbiocbuildtest_RleViewsList <- function() { x1 <- rep(c(1L, 3L, NA, 7L, 9L), 1:5) x1Rle <- Rle(x1) x1Ranges <- IRanges(start = c(1, 3, 5, 7, 9), end = c(1, 13, 11, 10, 9)) x2 <- rev(x1) x2Rle <- Rle(x2) x2Ranges <- IRanges(start = c(2, 4, 6, 8, 10), end = c(3, 9, 11, 13, 15)) checkIdentical(RleViewsList(Views(x1Rle, x1Ranges), Views(x2Rle, x2Ranges)), RleViewsList(rleList = RleList(x1Rle, x2Rle), rangesList = IRangesList(x1Ranges, x2Ranges))) xRleViewsList <- RleViewsList(a = Views(x1Rle, x1Ranges), b = Views(x2Rle, x2Ranges)) xList <- list(a = lapply(seq_len(length(xRleViewsList[[1]])), function(i) window(x1, start = start(x1Ranges)[i], end = end(x1Ranges)[i])), b = lapply(seq_len(length(xRleViewsList[[2]])), function(i) window(x2, start = start(x2Ranges)[i], end = end(x2Ranges)[i]))) checkIdentical(c("a", "b"), names(viewApply(xRleViewsList, min))) checkIdentical(c("a", "b"), names(viewMins(xRleViewsList))) checkIdentical(c("a", "b"), names(viewMaxs(xRleViewsList))) checkIdentical(c("a", "b"), names(viewSums(xRleViewsList))) checkIdentical(c("a", "b"), names(viewMeans(xRleViewsList))) checkIdentical(c("a", "b"), names(viewWhichMins(xRleViewsList))) checkIdentical(c("a", "b"), names(viewWhichMaxs(xRleViewsList))) checkIdentical(c("a", "b"), names(viewRangeMins(xRleViewsList, na.rm = TRUE))) checkIdentical(c("a", "b"), names(viewRangeMaxs(xRleViewsList, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(xList, lapply, min)), unlist(viewMins(xRleViewsList))) checkEqualsNumeric(unlist(lapply(xList, lapply, min)), unlist(viewApply(xRleViewsList, min))) checkEqualsNumeric(unlist(lapply(xList, lapply, min, na.rm = TRUE)), unlist(viewMins(xRleViewsList, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(xList, lapply, min, na.rm = TRUE)), unlist(viewApply(xRleViewsList, min, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(xList, lapply, max)), unlist(viewMaxs(xRleViewsList))) checkEqualsNumeric(unlist(lapply(xList, lapply, max)), unlist(viewApply(xRleViewsList, max))) checkEqualsNumeric(unlist(lapply(xList, lapply, max, na.rm = TRUE)), unlist(viewMaxs(xRleViewsList, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(xList, lapply, max, na.rm = TRUE)), unlist(viewApply(xRleViewsList, max, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(xList, lapply, sum)), unlist(viewSums(xRleViewsList))) checkEqualsNumeric(unlist(lapply(xList, lapply, mean)), unlist(viewMeans(xRleViewsList))) checkEqualsNumeric(unlist(lapply(xList, lapply, sum)), unlist(viewApply(xRleViewsList, sum))) checkEqualsNumeric(unlist(lapply(xList, lapply, sum, na.rm = TRUE)), unlist(viewSums(xRleViewsList, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(xList, lapply, mean, na.rm = TRUE)), unlist(viewMeans(xRleViewsList, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(xList, lapply, sum, na.rm = TRUE)), unlist(viewApply(xRleViewsList, sum, na.rm = TRUE))) y1 <- rep(c(1.2, 3.4, NA, 7.8, 9.0), 1:5) y1Ranges <- IRanges(start = c(1, 3, 5, 7, 9), end = c(1, 13, 11, 10, 9)) y1Rle <- Rle(y1) y2 <- rev(y1) y2Rle <- Rle(y2) y2Ranges <- IRanges(start = c(2, 4, 6, 8, 10), end = c(3, 9, 11, 13, 15)) checkIdentical(RleViewsList(Views(y1Rle, y1Ranges), Views(y2Rle, y2Ranges)), RleViewsList(rleList = RleList(y1Rle, y2Rle), rangesList = IRangesList(y1Ranges, y2Ranges))) yRleViewsList <- RleViewsList(Views(y1Rle, y1Ranges), Views(y2Rle, y2Ranges)) yList <- list(lapply(seq_len(length(yRleViewsList[[1]])), function(i) window(y1, start = start(y1Ranges)[i], end = end(y1Ranges)[i])), lapply(seq_len(length(yRleViewsList[[2]])), function(i) window(y2, start = start(y2Ranges)[i], end = end(y2Ranges)[i]))) checkEqualsNumeric(unlist(lapply(yList, lapply, min)), unlist(viewMins(yRleViewsList))) checkEqualsNumeric(unlist(lapply(yList, lapply, min)), unlist(viewApply(yRleViewsList, min))) checkEqualsNumeric(unlist(lapply(yList, lapply, min, na.rm = TRUE)), unlist(viewMins(yRleViewsList, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(yList, lapply, min, na.rm = TRUE)), unlist(viewApply(yRleViewsList, min, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(yList, lapply, max)), unlist(viewMaxs(yRleViewsList))) checkEqualsNumeric(unlist(lapply(yList, lapply, max)), unlist(viewApply(yRleViewsList, max))) checkEqualsNumeric(unlist(lapply(yList, lapply, max, na.rm = TRUE)), unlist(viewMaxs(yRleViewsList, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(yList, lapply, max, na.rm = TRUE)), unlist(viewApply(yRleViewsList, max, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(yList, lapply, sum)), unlist(viewSums(yRleViewsList))) checkEqualsNumeric(unlist(lapply(yList, lapply, mean)), unlist(viewMeans(yRleViewsList))) checkEqualsNumeric(unlist(lapply(yList, lapply, sum)), unlist(viewApply(yRleViewsList, sum))) checkEqualsNumeric(unlist(lapply(yList, lapply, sum, na.rm = TRUE)), unlist(viewSums(yRleViewsList, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(yList, lapply, mean, na.rm = TRUE)), unlist(viewMeans(yRleViewsList, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(yList, lapply, sum, na.rm = TRUE)), unlist(viewApply(yRleViewsList, sum, na.rm = TRUE))) } IRanges/inst/unitTests/test_coverage-methods.R0000644000175400017540000000155713175713360022543 0ustar00biocbuildbiocbuildtest_IRanges_coverage <- function() { ir <- IRanges(c(1, 8, 14, 15, 19, 34, 40), width = c(12, 6, 6, 15, 6, 2, 7)) checkIdentical(as.vector(coverage(ir)), rep(c(1L, 2L, 1L, 2L, 3L, 2L, 1L, 0L, 1L, 0L, 1L), c(7, 5, 2, 4, 1, 5, 5, 4, 2, 4, 7))) ir <- IRanges(start=c(-2L, 6L, 9L, -4L, 1L, 0L, -6L, 10L), width=c( 5L, 0L, 6L, 1L, 4L, 3L, 2L, 3L)) checkIdentical(as.vector(coverage(ir)), rep(c(3L, 1L, 0L, 1L, 2L, 1L), c(2, 2, 4, 1, 3, 2))) checkIdentical(as.vector(coverage(ir, shift=7)), rep(c(1L, 0L, 1L, 2L, 3L, 1L, 0L, 1L, 2L, 1L), c(3, 1, 2, 1, 2, 2, 4, 1, 3, 2))) checkIdentical(as.vector(coverage(ir, shift=7, width=27)), rep(c(1L, 0L, 1L, 2L, 3L, 1L, 0L, 1L, 2L, 1L, 0L), c(3, 1, 2, 1, 2, 2, 4, 1, 3, 2, 6))) } IRanges/inst/unitTests/test_extractList.R0000644000175400017540000001014313175713360021604 0ustar00biocbuildbiocbuild### test_relistToClass <- function() { ## TODO } test_relist <- function() { ## TODO } test_splitAsList <- function() { ## TODO } test_extractList <- function() { ## TODO } test_regroupBySupergroup <- function() { regroupBySupergroup <- IRanges:::regroupBySupergroup .do_checks <- function(x, breakpoints, target) { supergroups <- PartitioningByEnd(breakpoints) current <- regroupBySupergroup(x, supergroups) checkIdentical(target, current) checkIdentical(target, regroupBySupergroup(x, breakpoints)) x_partitioning <- PartitioningByEnd(x) current2 <- regroupBySupergroup(x_partitioning, supergroups) checkIdentical(PartitioningByEnd(target), current2) } x <- CharacterList( x1=NULL, x2=LETTERS[1:3], x3=LETTERS[4:5], x4=letters[1:5], x5=NULL, x6=letters[6:7] ) breakpoints <- c(SG1=3, SG2=6) target <- CharacterList(SG1=LETTERS[1:5], SG2=letters[1:7], compress=TRUE) .do_checks(as(x, "CompressedList"), breakpoints, target) .do_checks(as(x, "SimpleList"), breakpoints, target) breakpoints <- c(SG1=2, SG2=5, SG3=6) target <- CharacterList(SG1=LETTERS[1:3], SG2=c(LETTERS[4:5], letters[1:5]), SG3=letters[6:7], compress=TRUE) .do_checks(as(x, "CompressedList"), breakpoints, target) .do_checks(as(x, "SimpleList"), breakpoints, target) breakpoints <- c(SG1=2, 2, SG2=5, SG3=6) target <- CharacterList(SG1=LETTERS[1:3], NULL, SG2=c(LETTERS[4:5], letters[1:5]), SG3=letters[6:7], compress=TRUE) .do_checks(as(x, "CompressedList"), breakpoints, target) .do_checks(as(x, "SimpleList"), breakpoints, target) breakpoints <- 6 target <- CharacterList(unlist(x, use.names=FALSE), compress=TRUE) .do_checks(as(x, "CompressedList"), breakpoints, target) .do_checks(as(x, "SimpleList"), breakpoints, target) breakpoints <- c(SG1=6, SG2=6, SG3=6) target <- CharacterList(SG1=unlist(x, use.names=FALSE), SG2=NULL, SG3=NULL, compress=TRUE) .do_checks(as(x, "CompressedList"), breakpoints, target) .do_checks(as(x, "SimpleList"), breakpoints, target) breakpoints <- c(0, 0, 0, 6, 6) target <- CharacterList(NULL, NULL, NULL, unlist(x, use.names=FALSE), NULL, compress=TRUE) .do_checks(as(x, "CompressedList"), breakpoints, target) .do_checks(as(x, "SimpleList"), breakpoints, target) breakpoints <- seq_along(x) target <- unname(as(x, "CompressedList")) .do_checks(as(x, "CompressedList"), breakpoints, target) .do_checks(as(x, "SimpleList"), breakpoints, target) names(breakpoints) <- names(x) target <- as(x, "CompressedList") .do_checks(as(x, "CompressedList"), breakpoints, target) # no-op .do_checks(as(x, "SimpleList"), breakpoints, target) x0 <- CharacterList() breakpoints <- setNames(integer(0), character(0)) target <- setNames(CharacterList(compress=TRUE), character(0)) .do_checks(as(x0, "CompressedList"), breakpoints, target) # Fails at the moment because unlist() is doesn't work properly on # SimpleCharacterList #.do_checks(as(x0, "SimpleList"), breakpoints, target) x2 <- RleList(Rle(44:45, 2:1), Rle(45), Rle(-2, 3)) breakpoints <- c(SG1=2, SG2=3) target <- RleList(SG1=Rle(44:45, c(2,2)), SG2=Rle(-2, 3), compress=TRUE) .do_checks(as(x2, "CompressedList"), breakpoints, target) .do_checks(as(x2, "SimpleList"), breakpoints, target) x3 <- Views(unlist(x2, use.names=FALSE), start=c(3, 1, 1), end=c(6, 1, 3)) breakpoints <- c(SG1=2, SG2=3) target <- RleList(SG1=Rle(c(45,-2,44), c(2,2,1)), SG2=Rle(44:45, 2:1), compress=TRUE) .do_checks(x3, breakpoints, target) } IRanges/inst/unitTests/test_findOverlaps-methods.R0000644000175400017540000001303113175713360023372 0ustar00biocbuildbiocbuild### test_findOverlaps_Ranges <- function() { ## ..... ## .... ## .. ## x ## xx ## xxx query <- IRanges(c(1, 4, 9), c(5, 7, 10)) subject <- IRanges(c(2, 2, 10), c(2, 3, 12)) result <- findOverlaps(query, subject, select = "first") checkIdentical(result, c(1L, NA, 3L)) result <- findOverlaps(query, subject, select = "last") checkIdentical(result, c(2L, NA, 3L)) result <- findOverlaps(query, subject, select = "arbitrary") checkIdentical(result, c(2L, NA, 3L)) checkOverlap <- function(a, q, s, r, c) { target <- Hits(q, s, r, c, sort.by.query=TRUE) checkIdentical(t(a), t(target)) } result <- findOverlaps(query, subject) checkOverlap(result, c(1, 1, 3), c(1, 2, 3), 3, 3) ## with 'maxgap' result <- findOverlaps(query, subject, maxgap = 0L) checkOverlap(result, c(1, 1, 2, 3), c(2, 1, 2, 3), 3, 3) ## with 'minoverlap' result <- findOverlaps(query, subject, minoverlap = 3L) checkOverlap(result, integer(0), integer(0), 3, 3) result <- findOverlaps(query, subject, minoverlap = 2L) checkOverlap(result, 1, 2, 3, 3) result <- findOverlaps(query, subject, minoverlap = 2L, select = "first") checkIdentical(result, c(2L, NA, NA)) result <- findOverlaps(query, subject, minoverlap = 2L, select = "last") checkIdentical(result, c(2L, NA, NA)) result <- findOverlaps(query, subject, minoverlap = 2L, select = "arbitrary") checkIdentical(result, c(2L, NA, NA)) ## zero-width ranges query <- IRanges(9:14, 8:13) result <- findOverlaps(query, subject, minoverlap = 1L) checkOverlap(result, integer(0), integer(0), 6, 3) result <- findOverlaps(query, subject) checkOverlap(result, c(3, 4), c(3, 3), 6, 3) result <- findOverlaps(query, subject, maxgap = 0L) checkOverlap(result, 2:5, c(3, 3, 3, 3), 6, 3) result <- findOverlaps(query, subject, maxgap = 1L) checkOverlap(result, 1:6, c(3, 3, 3, 3, 3, 3), 6, 3) result <- findOverlaps(subject, query, minoverlap = 1L) checkOverlap(result, integer(0), integer(0), 3, 6) result <- findOverlaps(subject, query) checkOverlap(result, c(3, 3), c(3, 4), 3, 6) result <- findOverlaps(subject, query, maxgap = 0L) checkOverlap(result, c(3, 3, 3, 3), 2:5, 3, 6) result <- findOverlaps(subject, query, maxgap = 1L) checkOverlap(result, c(3, 3, 3, 3, 3, 3), 1:6, 3, 6) ## ..... ## .... ## .. ## xxxx ## xxx query <- IRanges(c(1, 4, 9), c(5, 7, 10)) subject <- IRanges(c(2, 2), c(5, 4)) result <- findOverlaps(query, subject) checkOverlap(result, c(1, 1, 2, 2), c(1, 2, 1, 2), 3, 2) result <- findOverlaps(subject, query) checkOverlap(result, c(1, 1, 2, 2), c(1, 2, 1, 2), 2, 3) query <- IRanges(c(1, 4, 9, 11), c(5, 7, 10, 11)) result <- findOverlaps(query) checkOverlap(result, c(1, 1, 2, 2, 3, 4), c(1, 2, 1, 2, 3, 4), 4, 4) ## check case of identical subjects ## ..... ## ..... ## .. ## xxxx ## xxxx ## xx ## xxx ## xx query <- IRanges(c(1, 4, 9), c(5, 7, 10)) subject <- IRanges(c(2, 2, 6, 6, 6), c(5, 5, 7, 8, 7)) result <- findOverlaps(query, subject) checkOverlap(result, c(1, 1, 2, 2, 2, 2, 2), c(1, 2, 1, 2, 3, 4, 5), 3, 5) subject <- IRanges(c(1, 6, 13), c(4, 9, 14)) # single points checkIdentical(findOverlaps(c(3L, 7L, 10L), subject, select = "first"), c(1L, 2L, NA)) checkIdentical(findOverlaps(c(3L, 7L, 10L), subject, select = "last"), c(1L, 2L, NA)) checkIdentical(findOverlaps(c(3L, 7L, 10L), subject, select = "arbitrary"), c(1L, 2L, NA)) checkIdentical(findOverlaps(IRanges(c(2,1),c(3,4)), subject), Hits(1:2, c(1, 1), 2, 3, sort.by.query=TRUE)) ## check other types of matching ## .. ## .. ## .... ## ...... ## xxxx ## xxxx ## xxxxx ## xxxx query <- IRanges(c(1, 5, 3, 4), width=c(2, 2, 4, 6)) subject <- IRanges(c(1, 3, 5, 6), width=c(4, 4, 5, 4)) ## 'start' result <- findOverlaps(query, subject, type = "start") checkOverlap(result, c(1, 2, 3), c(1, 3, 2), 4, 4) ## minoverlap > 1L result <- findOverlaps(query, subject, type = "start", minoverlap = 3L) checkOverlap(result, 3, 2, 4, 4) ## 'end' result <- findOverlaps(query, subject, type = "end") checkOverlap(result, c(2, 3, 4, 4), c(2, 2, 3, 4), 4, 4) result <- findOverlaps(subject, query, type = "end") checkOverlap(result, c(2, 2, 3, 4), c(2, 3, 4, 4), 4, 4) ## select = "first" result <- findOverlaps(query, subject, type = "end", select = "first") checkIdentical(result, c(NA, 2L, 2L, 3L)) ## 'within' result <- findOverlaps(query, subject, type = "within") checkOverlap(result, c(1, 2, 2, 3), c(1, 2, 3, 2), 4, 4) ## 'equal' result <- findOverlaps(query, subject, type = "equal") checkOverlap(result, 3, 2, 4, 4) checkException(findOverlaps(query, NULL), silent = TRUE) checkException(findOverlaps(NULL, query), silent = TRUE) } test_subsetByOverlaps_Ranges <- function() { x <- IRanges(9:12, 15) ranges <- IRanges(1, 10) checkIdentical(x[1:2], subsetByOverlaps(x, ranges)) checkIdentical(x[3:4], subsetByOverlaps(x, ranges, invert=TRUE)) checkIdentical(x[1:3], subsetByOverlaps(x, ranges, maxgap=0)) checkIdentical(x[4], subsetByOverlaps(x, ranges, maxgap=0, invert=TRUE)) x <- IRanges(c(1, 4, 9), c(5, 7, 10)) ranges <- IRanges(c(6, 8, 10), c(7, 12, 14)) checkIdentical(x[2:3], subsetByOverlaps(x, ranges)) checkIdentical(x[1], subsetByOverlaps(x, ranges, invert=TRUE)) checkIdentical(x, subsetByOverlaps(x, ranges, maxgap=0)) checkIdentical(x[0], subsetByOverlaps(x, ranges, maxgap=0, invert=TRUE)) } IRanges/inst/unitTests/test_inter-range-methods.R0000644000175400017540000003111413175713360023153 0ustar00biocbuildbiocbuildtest_range_Ranges <- function() { ir1 <- IRanges(c(2,5,1), c(3,7,3)) ir2 <- IRanges(c(5,2,0), c(6,3,1)) checkIdentical(range(ir1), IRanges(1, 7)) checkIdentical(range(ir1, ir2), IRanges(0, 7)) checkIdentical(range(IRanges()), IRanges()) checkException(range(ir1, c(2,3)), silent = TRUE) # check with.revmap rng1 <- range(ir1, with.revmap=TRUE) rng2 <- range(ir2, with.revmap=TRUE) rng3 <- range(ir1,ir2, with.revmap=TRUE) checkIdentical(mcols(rng1)$revmap, IntegerList(seq(3))) checkIdentical(mcols(rng2)$revmap, IntegerList(seq(3))) checkIdentical(mcols(rng3)$revmap, IntegerList(seq(6))) ir3 <- IRanges() checkIdentical(mcols(range(ir3, with.revmap=TRUE))$revmap, IntegerList()) } test_range_RangesList <- function() { for (compress in c(TRUE, FALSE)) { rl1 <- IRangesList(a = IRanges(c(1,2),c(4,3)), b = IRanges(c(4,6),c(10,7)), compress = compress) rl2 <- IRangesList(c = IRanges(c(0,2),c(4,5)), a = IRanges(c(4,5),c(6,7)), compress = compress) ans <- IRangesList(a = IRanges(1,7), b = IRanges(4,10), c = IRanges(0,5), compress = compress) checkIdentical(range(rl1, rl2), ans) names(rl2) <- NULL ans <- IRangesList(IRanges(0,5), IRanges(4,10), compress = compress) checkIdentical(range(rl1, rl2), ans) ## must be same length checkException(range(rl2, rep.int(rl2, 2L)), silent=TRUE) } # check with.revmap revmap1 <- mcols(range(rl1,rl2, with.revmap=TRUE)[[1]])$revmap revmap2 <- mcols(range(rl1,rl2, with.revmap=TRUE)[[2]])$revmap ans <- IntegerList(seq(4)) checkIdentical(revmap1, ans) checkIdentical(revmap2, ans) range1 <- IRanges(start=c(1, 2, 3), end=c(5, 2, 8)) range2 <- IRanges(start=c(15, 45, 20, 1), end=c(15, 100, 80, 5)) range3 <- IRanges() range4 <- IRanges(start=c(-2, 6, 7), width=c(8, 0, 0)) collection <- IRangesList(range1, range2, range3, range4) rng <- range(collection, with.revmap=TRUE) checkIdentical(mcols(rng[[1]])$revmap, IntegerList(1:3)) checkIdentical(mcols(rng[[2]])$revmap, IntegerList(1:4)) checkIdentical(mcols(rng[[3]])$revmap, IntegerList()) checkIdentical(mcols(rng[[4]])$revmap, IntegerList(1:3)) rng <- range(IRangesList(IRanges(), IRanges()), with.revmap=TRUE) checkIdentical(mcols(rng[[1]])$revmap, IntegerList()) checkIdentical(mcols(rng[[2]])$revmap, IntegerList()) } test_reduce_Ranges <- function() { x <- IRanges() current <- reduce(x) checkIdentical(x, current) x <- IRanges(1:3, width=0) current <- reduce(x, with.revmap=TRUE) target <- x mcols(target) <- DataFrame(revmap=as(seq_along(target), "IntegerList")) checkIdentical(target, current) current <- reduce(x, drop.empty.ranges=TRUE, with.revmap=TRUE) target <- IRanges() mcols(target) <- DataFrame(revmap=IntegerList(seq_along(target))) checkIdentical(target, current) x <- IRanges(c(1:4, 10:11, 11), width=c(0,1,1,0,0,0,1)) current <- reduce(x, with.revmap=TRUE) target <- IRanges(c(1:2, 10:11), width=c(0,2,0,1)) mcols(target) <- DataFrame(revmap=IntegerList(1,2:4,5,6:7)) checkIdentical(target, current) current <- reduce(x, drop.empty.ranges=TRUE, with.revmap=TRUE) target <- IRanges(c(2, 11), width=c(2,1)) mcols(target) <- DataFrame(revmap=IntegerList(2:3,7)) checkIdentical(target, current) x <- IRanges(start=c(1,2,3), end=c(5,2,8)) y <- reduce(x, with.revmap=TRUE) target <- IRanges(start=1, end=8) mcols(target) <- DataFrame(revmap=IntegerList(1:3)) checkIdentical(target, y) mcols(target)$revmap <- as(seq_along(target), "IntegerList") checkIdentical(target, reduce(y, with.revmap=TRUE)) x <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5)) y <- reduce(x, with.revmap=TRUE) target <- IRanges(start=c(1,15,20), end=c(5,15,100)) mcols(target) <- DataFrame(revmap=IntegerList(4, 1, 3:2)) checkIdentical(target, y) mcols(target)$revmap <- as(seq_along(target), "IntegerList") checkIdentical(target, reduce(y, with.revmap=TRUE)) x <- IRanges(start=c(7,3,-2,6,7,-10,-2,3), width=c(3,1,0,0,0,0,8,0)) ## Before reduction: ## start end width ==-10===-5====0===+5==+10=== ## [1] 7 9 3 ....:....:....:....:.xxx:... ## [2] 3 3 1 ....:....:....:..x.:....:... ## [3] -2 -3 0 ....:....:..[.:....:....:... ## [4] 6 5 0 ....:....:....:....:[...:... ## [5] 7 6 0 ....:....:....:....:.[..:... ## [6] -10 -11 0 ....[....:....:....:....:... ## [7] -2 5 8 ....:....:..xxxxxxxx....:... ## [8] 3 2 0 ....:....:....:..[.:....:... ## ---------------------==-10===-5====0===+5==+10=== ## After reduction: ## y1: ....[....:..xxxxxxxx.xxx:... ## y3: ....:....:..xxxxxxxx....:... y1 <- reduce(x) checkIdentical(y1, IRanges(start=c(-10,-2,7), end=c(-11,5,9))) checkIdentical(reduce(y1), y1) y2 <- reduce(x, with.inframe.attrib=TRUE) checkIdentical(start(attr(y2, "inframe")), c(9L,6L,1L,9L,9L,1L,1L,6L)) checkIdentical(width(attr(y2, "inframe")), width(x)) y3 <- reduce(x, drop.empty.ranges=TRUE) checkIdentical(y3, y1[width(y1) != 0L]) checkIdentical(reduce(y3), y3) y4 <- reduce(x, drop.empty.ranges=TRUE, with.inframe.attrib=TRUE) checkIdentical(attr(y4, "inframe"), attr(y2, "inframe")) y5 <- reduce(x, min.gapwidth=0) checkIdentical(y5, IRanges(start=c(-10,-2,-2,6,7,7), end=c(-11,-3,5,5,6,9))) y6 <- reduce(x, drop.empty.ranges=TRUE, min.gapwidth=0) checkIdentical(y6, y5[width(y5) != 0L]) y7 <- reduce(x, min.gapwidth=2) checkIdentical(y7, IRanges(start=c(-10,-2), end=c(-11,9))) y8 <- reduce(x, min.gapwidth=8) checkIdentical(y8, y7) y9 <- reduce(x, min.gapwidth=9) checkIdentical(y9, IRanges(start=-10, end=9)) } test_reduce_RangesList <- function() { range1 <- IRanges(start=c(1,2,3), end=c(5,2,8)) range2 <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5)) range3 <- IRanges(start=c(3,-2,6,7,-10,-2,3), width=c(1,0,0,0,0,8,0)) range4 <- IRanges() for (compress in c(TRUE, FALSE)) { collection <- IRangesList(one=range1, range2, range3, range4, compress=compress) for (with.revmap in c(FALSE, TRUE)) { for (drop.empty.ranges in c(FALSE, TRUE)) { current <- reduce(collection, drop.empty.ranges=drop.empty.ranges, with.revmap=with.revmap) target <- IRangesList(one=reduce(range1, drop.empty.ranges=drop.empty.ranges, with.revmap=with.revmap), reduce(range2, drop.empty.ranges=drop.empty.ranges, with.revmap=with.revmap), reduce(range3, drop.empty.ranges=drop.empty.ranges, with.revmap=with.revmap), reduce(range4, drop.empty.ranges=drop.empty.ranges, with.revmap=with.revmap), compress=compress) checkIdentical(target, current) } } } } test_gaps_Ranges <- function() { checkIdentical(gaps(IRanges()), IRanges()) checkIdentical(gaps(IRanges(), start=1, end=4), IRanges(start=1, end=4)) x <- IRanges(start=2, end=3) checkIdentical(gaps(x), IRanges()) checkIdentical(gaps(x, start=2), IRanges()) checkIdentical(gaps(x, start=4), IRanges()) checkIdentical(gaps(x, start=0), IRanges(start=0, end=1)) checkIdentical(gaps(x, end=3), IRanges()) checkIdentical(gaps(x, end=1), IRanges()) checkIdentical(gaps(x, end=5), IRanges(start=4, end=5)) checkIdentical(gaps(x, start=0, end=5), IRanges(start=c(0,4), end=c(1,5))) } test_gaps_RangesList <- function() { range1 <- IRanges(start=c(1,2,3), end=c(5,2,8)) range2 <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5)) for (compress in c(TRUE, FALSE)) { collection <- IRangesList(one = range1, range2, compress = compress) checkIdentical(gaps(collection), IRangesList(one = gaps(range1), gaps(range2), compress = compress)) } } test_disjoin_Ranges <- function() { checkIdentical(disjoin(IRanges()), IRanges()) ir <- IRanges(c(1, 21, 10, 1, 15, 5, 20, 20), c(6, 20, 9, 3, 14, 11, 20, 19)) current <- disjoin(ir) checkTrue(validObject(current, complete=TRUE)) ## The result of disjoin(x) must verify the following properties: check_disjoin_general_properties <- function(y, x) { checkTrue(isDisjoint(y)) checkTrue(isStrictlySorted(y)) checkIdentical(reduce(x, drop.empty.ranges=TRUE), reduce(y)) checkTrue(all(start(y) %in% c(start(x), end(x) + 1L))) checkTrue(all(end(y) %in% c(end(x), start(x) - 1L))) } check_disjoin_general_properties(current, ir) target <- IRanges(c(1, 4, 5, 7, 10, 20), c(3, 4, 6, 9, 11, 20)) checkIdentical(target, current) ## Check 'revmap'. mcols(ir)$label <- LETTERS[seq_along(ir)] current <- disjoin(ir, with.revmap=TRUE) revmap <- IntegerList(c(1, 4), 1, c(1, 6), 6, 6, 7) mcols(target)$revmap <- revmap checkIdentical(target, current) ## With many randomly generated ranges. set.seed(2009L) ir <- IRanges(start=sample(580L, 500L, replace=TRUE), width=sample(10L, 500L, replace=TRUE) - 1L) check_disjoin_general_properties(disjoin(ir), ir) ir <- IRanges(start=sample(4900L, 500L, replace=TRUE), width=sample(35L, 500L, replace=TRUE) - 1L) check_disjoin_general_properties(disjoin(ir), ir) } test_disjoin_RangesList <- function() { ir0 <- IRanges(10, 20) checkTrue(validObject(disjoin(IRangesList()))) ## unnamed; incl. 0-length irl <- IRangesList(IRanges()) checkIdentical(irl, disjoin(irl)) irl <- IRangesList(ir0, IRanges(), ir0) checkIdentical(irl, disjoin(irl)) irl <- IRangesList(ir0, IRanges(), IRanges(), ir0) checkIdentical(irl, disjoin(irl)) ## named; incl. 0-length irl <- IRangesList(a=IRanges()) checkIdentical(irl, disjoin(irl)) irl <- IRangesList(a=ir0, b=IRanges(), c=ir0) checkIdentical(irl, disjoin(irl)) irl <- IRangesList(a=ir0, b=IRanges(), c=IRanges(), d=ir0) checkIdentical(irl, disjoin(irl)) ## no interference between separate elements ir0 <- IRanges(10, c(15, 20)) dr0 <- disjoin(ir0) irl <- IRangesList(ir0, ir0) checkIdentical(IRangesList(dr0, dr0), disjoin(irl)) irl <- IRangesList(ir0, IRanges(), ir0) checkIdentical(IRangesList(dr0, IRanges(), dr0), disjoin(irl)) ## 0-width ## 1-width ir0 <- IRanges(c(1, 10), 10) irl <- IRangesList(ir0, IRanges()) checkIdentical(disjoin(ir0), disjoin(irl)[[1]]) irl <- IRangesList(IRanges(), ir0) checkIdentical(disjoin(ir0), disjoin(irl)[[2]]) ## check don't collapse levels irl <- IRangesList(IRanges(1, 5), IRanges(3, 7)) names(irl) <- character(2) checkIdentical(irl, disjoin(irl)) ## check 'revmap' on many randomly generated ranges set.seed(2009L) ir1 <- IRanges(start=sample(580L, 500L, replace=TRUE), width=sample(10L, 500L, replace=TRUE) - 1L) ir2 <- IRanges(start=sample(4900L, 500L, replace=TRUE), width=sample(35L, 500L, replace=TRUE) - 1L) for (compress in c(TRUE, FALSE)) { collection <- IRangesList(one=ir1, IRanges(), ir0, ir0, ir2, IRanges(), compress=compress) for (with.revmap in c(FALSE, TRUE)) { current <- disjoin(collection, with.revmap=with.revmap) target <- IRangesList(one=disjoin(ir1, with.revmap=with.revmap), disjoin(IRanges(), with.revmap=with.revmap), disjoin(ir0, with.revmap=with.revmap), disjoin(ir0, with.revmap=with.revmap), disjoin(ir2, with.revmap=with.revmap), disjoin(IRanges(), with.revmap=with.revmap), compress=compress) checkIdentical(target, current) } } } test_disjointBins_Ranges <- function() { checkIdentical(disjointBins(IRanges()), integer()) checkIdentical(disjointBins(IRanges(1, 5)), 1L) checkIdentical(disjointBins(IRanges(c(1, 3), c(5, 12))), c(1L, 2L)) checkIdentical(disjointBins(IRanges(c(1, 3, 10), c(5, 12, 13))), c(1L, 2L, 1L)) checkIdentical(disjointBins(IRanges(c(3, 1, 10), c(5, 12, 13))), c(2L, 1L, 2L)) } IRanges/inst/unitTests/test_intra-range-methods.R0000644000175400017540000002222013175713360023145 0ustar00biocbuildbiocbuildtest_shift_Ranges <- function() { ir0 <- IRanges(0, 0) ir1 <- shift(ir0, .Machine$integer.max) checkTrue(validObject(ir1)) checkIdentical(.Machine$integer.max, start(ir1)) checkIdentical(.Machine$integer.max, end(ir1)) checkIdentical(1L, width(ir1)) checkIdentical(ir1, shift(ir1)) checkIdentical(ir1, shift(shift(ir1, -10), 10)) ir2 <- shift(ir0, -.Machine$integer.max) checkTrue(validObject(ir2)) checkIdentical(-.Machine$integer.max, start(ir2)) checkIdentical(-.Machine$integer.max, end(ir2)) checkIdentical(1L, width(ir2)) checkIdentical(ir2, shift(ir2)) checkIdentical(ir2, shift(shift(ir2, 10), -10)) ## shift() would produce an object with ranges that are not within the ## [-.Machine$integer.max, .Machine$integer.max] range. checkException(suppressWarnings(shift(ir1, 1)), silent=TRUE) checkException(suppressWarnings(shift(ir2, -1)), silent=TRUE) ir3 <- IRanges(1999222000, width=1000) checkException(suppressWarnings(shift(ir3, 188222000)), silent=TRUE) ir4 <- IRanges(1:20, width=222000000) checkException(suppressWarnings(shift(ir4, 1:20 * 99000000L)), silent=TRUE) } test_narrow_Ranges <- function() { ir1 <- IRanges(c(2,5,1), c(3,7,3)) checkIdentical(narrow(ir1, start=1, end=2), IRanges(c(2, 5, 1), c(3, 6, 2))) checkException(narrow(ir1, start=10, end=20), silent = TRUE) } test_narrow_RangesList <- function() { range1 <- IRanges(start=c(2,5), end=c(3,7)) range2 <- IRanges(start=1, end=3) for (compress in c(TRUE, FALSE)) { collection <- IRangesList(range1, range2, compress = compress) checkIdentical(narrow(collection, start=1, end=2), IRangesList(IRanges(c(2, 5), c(3, 6)), IRanges(1, 2), compress = compress)) checkException(narrow(collection, start=10, end=20), silent = TRUE) } } test_resize_Ranges <- function() { ir1 <- IRanges(c(2,5,1), c(3,7,3)) checkIdentical(resize(ir1, width=10), IRanges(c(2, 5, 1), width=10)) checkIdentical(resize(ir1, width=10, fix="end"), IRanges(c(-6, -2, -6), width=10)) checkIdentical(resize(ir1, width=10, fix="center"), IRanges(c(-2, 1, -3), width=10)) checkIdentical(resize(ir1, width=10, fix=c("start", "end", "center")), IRanges(c(2, -2, -3), width=10)) checkException(resize(ir1, -1), silent = TRUE) } test_resize_RangesList <- function() { range1 <- IRanges(start=c(2,5), end=c(3,7)) range2 <- IRanges(start=1, end=3) for (compress in c(TRUE, FALSE)) { collection <- IRangesList(range1, range2, compress = compress) checkIdentical(resize(collection, width=10), IRangesList(IRanges(c(2, 5), width=10), IRanges(1, width=10), compress = compress)) checkIdentical(resize(collection, width=10, fix="end"), IRangesList(IRanges(c(-6, -2), width=10), IRanges(-6, width=10), compress = compress)) checkIdentical(resize(collection, width=10, fix="center"), IRangesList(IRanges(c(-2, 1), width=10), IRanges(-3, width=10), compress = compress)) checkIdentical(resize(collection, width=10, fix=CharacterList(c("start", "end"), "center")), IRangesList(IRanges(c(2, -2), width=10), IRanges(-3, width=10), compress = compress)) checkException(resize(collection, -1), silent = TRUE) } } test_flank_Ranges <- function() { checkIdentical(flank(IRanges(), 2), IRanges()) ir1 <- IRanges(c(2, 5, 1), c(3, 7, 3)) checkIdentical(flank(ir1, 2), IRanges(c(0, 3, -1), c(1, 4, 0))) checkIdentical(flank(ir1, 2, FALSE), IRanges(c(4, 8, 4), c(5, 9, 5))) checkIdentical(flank(ir1, 2, c(FALSE, TRUE, FALSE)), IRanges(c(4, 3, 4), c(5, 4, 5))) checkIdentical(flank(ir1, c(2, -2, 2)), IRanges(c(0, 5, -1), c(1, 6, 0))) checkIdentical(flank(ir1, 2, both = TRUE), IRanges(c(0, 3, -1), c(3, 6, 2))) checkIdentical(flank(ir1, 2, FALSE, TRUE), IRanges(c(2, 6, 2), c(5, 9, 5))) checkIdentical(flank(ir1, -2, FALSE, TRUE), IRanges(c(2, 6, 2), c(5, 9, 5))) checkException(flank(ir1, 2, both = c(TRUE, FALSE, TRUE)), silent = TRUE) # not vectorized checkException(flank(ir1, 2, c(FALSE, TRUE, NA)), silent = TRUE) checkException(flank(ir1, NA), silent = TRUE) } test_flank_RangesList <- function() { range1 <- IRanges(start=c(2,5), end=c(3,7)) range2 <- IRanges(start=1, end=3) for (compress in c(TRUE, FALSE)) { collection <- IRangesList(range1, range2, compress = compress) checkIdentical(flank(collection, 2), IRangesList(IRanges(c(0, 3), c(1, 4)), IRanges(-1, 0), compress = compress)) checkIdentical(flank(collection, 2, FALSE), IRangesList(IRanges(c(4, 8), c(5, 9)), IRanges(4, 5), compress = compress)) checkIdentical(flank(collection, 2, LogicalList(c(FALSE, TRUE), FALSE)), IRangesList(IRanges(c(4, 3), c(5, 4)), IRanges(4, 5), compress = compress)) checkIdentical(flank(collection, IntegerList(c(2, -2), 2)), IRangesList(IRanges(c(0, 5), c(1, 6)), IRanges(-1, 0), compress = compress)) checkIdentical(flank(collection, 2, both = TRUE), IRangesList(IRanges(c(0, 3), c(3, 6)), IRanges(-1, 2), compress = compress)) checkIdentical(flank(collection, 2, FALSE, TRUE), IRangesList(IRanges(c(2, 6), c(5, 9)), IRanges(2, 5), compress = compress)) checkIdentical(flank(collection, -2, FALSE, TRUE), IRangesList(IRanges(c(2, 6), c(5, 9)), IRanges(2, 5), compress = compress)) checkException(flank(collection, 2, both = c(TRUE, FALSE, TRUE)), silent = TRUE) # not vectorized checkException(flank(collection, 2, LogicalList(c(FALSE, TRUE), NA)), silent = TRUE) checkException(flank(collection, NA), silent = TRUE) } } test_promoters <- function() { ir <- IRanges(c(10, 10), width=c(0, 1)) checkIdentical(width(promoters(ir, 0, 0)), c(0L, 0L)) checkIdentical(width(promoters(ir, 1, 0)), c(1L, 1L)) checkIdentical(start(promoters(ir, 1, 0)), c(9L, 9L)) checkIdentical(width(promoters(ir, 0, 1)), c(1L, 1L)) checkIdentical(start(promoters(ir, 0, 1)), c(10L, 10L)) ir <- IRanges(c(5, 2, 20), width=1) checkIdentical(start(promoters(ir, 5, 2)), c(0L, -3L, 15L)) rl <- RangesList("A"=IRanges(5:7, width=1), "B"=IRanges(10:12, width=5)) current <- promoters(rl, 0, 0) checkIdentical(names(current), names(rl)) checkIdentical(start(current), start(rl)) current <- promoters(rl, 2, 0) checkIdentical(unique(unlist(width(current))), 2L) library(XVector) subject <- XInteger(10, 3:-6) view <- Views(subject, start=4:2, end=4:6) current <- promoters(view, 0, 0) checkIdentical(start(current), start(view)) current <- promoters(view, 2, 0) checkIdentical(unique(width(current)), 2L) cmp <- IRangesList("A"=IRanges(5:7, width=1), "B"=IRanges(10:12, width=5)) current <- promoters(rl, 0, 0) checkIdentical(names(current), names(rl)) checkIdentical(start(current), start(rl)) current <- promoters(rl, 2, 0) checkIdentical(unique(unlist(width(current))), 2L) } test_reflect_Ranges <- function() { ir1 <- IRanges(c(2,5,1), c(3,7,3)) bounds <- IRanges(c(0, 5, 3), c(10, 6, 9)) checkIdentical(reflect(ir1, bounds), IRanges(c(7, 4, 9), c(8, 6, 11))) checkException(reflect(ir1, IRanges()), silent = TRUE) } test_restrict_Ranges <- function() { ir1 <- IRanges(c(2,5,1), c(3,7,3)) checkIdentical(restrict(ir1, start=2, end=5), IRanges(c(2, 5, 2), c(3, 5, 3))) checkIdentical(restrict(ir1, start=1, end=2), IRanges(c(2, 1), c(2, 2))) checkIdentical(restrict(ir1, start=1, end=2, keep.all.ranges=TRUE), IRanges(c(2, 3, 1), c(2, 2, 2))) } test_restrict_RangesList <- function() { range1 <- IRanges(start=c(2,5), end=c(3,7)) range2 <- IRanges(start=1, end=3) for (compress in c(TRUE, FALSE)) { collection <- IRangesList(range1, range2, compress = compress) checkIdentical(restrict(collection, start=2, end=5), IRangesList(IRanges(c(2, 5), c(3, 5)), IRanges(2, 3), compress = compress)) checkIdentical(restrict(collection, start=1, end=2), IRangesList(IRanges(2, 2), IRanges(1, 2), compress = compress)) checkIdentical(restrict(collection, start=1, end=2, keep.all.ranges=TRUE), IRangesList(IRanges(c(2, 3), c(2, 2)), IRanges(1, 2), compress = compress)) } } test_zoom_Ranges <- function() { ir <- IRanges(c(1,5), c(3,10)) checkIdentical(ir*1, ir) checkIdentical(ir*c(1,2), IRanges(c(1,6), c(3, 8))) checkIdentical(ir*-2, IRanges(c(-1,2), c(4, 13))) checkException(ir*NA_integer_, silent = TRUE) checkException(ir*numeric(), silent = TRUE) checkException(ir*c(1,2,1), silent = TRUE) checkException(ir[rep(1,3)]*c(1,2), silent = TRUE) } IRanges/inst/unitTests/test_nearest-methods.R0000644000175400017540000001161013175713360022400 0ustar00biocbuildbiocbuildcheckMatching <- function(a, q, s, r, c) { mat <- cbind(queryHits = as.integer(q), subjectHits = as.integer(s)) checkIdentical(as.matrix(a), mat) checkIdentical(c(queryLength(a), subjectLength(a)), as.integer(c(r, c))) } test_Ranges_adjacency <- function() { query <- IRanges(c(1, 3, 9), c(3, 7, 10)) subject <- IRanges(c(3, 10, 2), c(3, 12, 5)) checkIdentical(precede(query, subject), c(2L, 2L, NA)) checkIdentical(precede(IRanges(), subject), integer()) checkIdentical(precede(query, IRanges()), rep(NA_integer_, 3)) checkIdentical(precede(query), c(3L, 3L, NA)) checkIdentical(follow(query, subject), c(NA, NA, 3L)) checkIdentical(follow(IRanges(), subject), integer()) checkIdentical(follow(query, IRanges()), rep(NA_integer_, 3)) checkIdentical(follow(query), c(NA, NA, 2L)) checkMatching(precede(query, subject, select="all"), c(1, 2), c(2, 2), 3, 3) ## xxxx ## xxx ## xx ## xx ## xxx ## .. ## .. ## .. ## .. ## .. subject <- IRanges(c(1, 2, 9, 15, 15), width=c(4, 3, 2, 2, 3)) query <- IRanges(c(6, 11, 1, 13, 18), width=c(2, 2, 2, 2, 2)) checkMatching(precede(query, subject, select="all"), c(1, 2, 2, 3, 4, 4), c(3, 4, 5, 3, 4, 5), 5, 5) checkMatching(precede(subject, query, select="all"), c(1, 2, 3, 4, 5), c(1, 1, 2, 5, 5), 5, 5) checkMatching(follow(query, subject, select="all"), c(1, 1, 2, 4, 5), c(1, 2, 3, 3, 5), 5, 5) checkMatching(follow(subject, query, select="all"), c(3, 4, 5), c(1, 4, 4), 5, 5) checkMatching(precede(query, select="all"), c(1, 2, 3, 4), c(2, 4, 1, 5), 5, 5) checkMatching(precede(subject, select="all"), c(1, 2, 3, 3), c(3, 3, 4, 5), 5, 5) checkMatching(follow(query, select="all"), c(1, 2, 4, 5), c(3, 1, 2, 4), 5, 5) checkMatching(follow(subject, select="all"), c(3, 3, 4, 5), c(1, 2, 3, 3), 5, 5) } test_Ranges_nearest <- function() { query <- IRanges(c(1, 3, 9), c(2, 7, 10)) subject <- IRanges(c(3, 5, 12), c(3, 6, 12)) ## 2 possible results current <- nearest(query, subject) target1 <- c(1L, 1L, 3L) target2 <- c(1L, 2L, 3L) checkTrue(identical(target1, current) || identical(target2, current)) checkIdentical(nearest(query), c(2L, 1L, 2L)) checkIdentical(nearest(query, subject[c(2,3,1)]), c(3L, 3L, 2L)) ## xxxx ## xxx ## xx ## xx ## xxx ## .. ## .. ## .. ## .. ## .. subject <- IRanges(c(1, 2, 9, 15, 15), width=c(4, 3, 2, 2, 3)) query <- IRanges(c(6, 11, 1, 13, 18), width=c(2, 2, 2, 2, 2)) checkMatching(nearest(query, subject, select = "all"), c(1, 1, 1, 2, 3, 3, 4, 4, 5), c(1, 2, 3, 3, 1, 2, 4, 5, 5), 5, 5) checkMatching(nearest(subject, query, select = "all"), c(1, 2, 3, 4, 5, 5), c(3, 3, 2, 4, 4, 5), 5, 5) checkMatching(nearest(subject, select="all"), c(1, 2, 3, 3, 3, 3, 4, 5), c(2, 1, 1, 2, 4, 5, 5, 4), 5, 5) checkMatching(nearest(query, select="all"), c(1, 1, 2, 3, 4, 5), c(2, 3, 4, 1, 2, 4), 5, 5) } quiet <- suppressWarnings test_Ranges_distance <- function() { checkIdentical(quiet(distance(IRanges(), IRanges())), integer()) ## adjacent, overlap, separated by 1 query <- IRanges(c(1, 3, 9), c(2, 7, 10)) subject <- IRanges(c(3, 5, 12), c(3, 6, 12)) checkIdentical(quiet(distance(query, subject)), c(0L, 0L, 1L)) ## recycling checkIdentical(quiet(distance(query[1:2], subject)), c(0L, 0L, 9L)) ## zero-width target <- abs(-3:3) current <- sapply(-3:3, function(i) quiet(distance(shift(IRanges(4,3), i), IRanges(4,3)))) checkIdentical(target, current) checkIdentical(quiet(distance(IRanges(4,3), IRanges(3,4))), 0L) } test_Ranges_distanceToNearest <- function() { target <- Hits(sort.by.query=TRUE) current <- distanceToNearest(IRanges(), IRanges()) checkIdentical(queryHits(current), queryHits(target)) checkIdentical(subjectHits(current), subjectHits(target)) checkIdentical(queryLength(current), queryLength(target)) x <- IRanges(5, 10) subject <- IRanges(c(1, 1, 1), c(4, 5, 6)) current <- distanceToNearest(x, subject, select="all") checkIdentical(subjectHits(current), 1:3) current <- distanceToNearest(x, rev(subject), select="all") checkIdentical(subjectHits(current), 1:3) current <- distanceToNearest(x, IRanges()) checkIdentical(length(current), 0L) checkIdentical(queryLength(current), 1L) checkIdentical(subjectLength(current), 0L) x <- IRanges(c(2, 4, 12, 15), c(2, 3, 13, 14)) subject <- IRanges(1, 10) current <- distanceToNearest(x, subject) checkIdentical(queryHits(current), 1:4) checkIdentical(mcols(current)$distance, c(0L, 0L, 1L, 4L)) } IRanges/inst/unitTests/test_seqapply.R0000644000175400017540000000062613175713360021141 0ustar00biocbuildbiocbuildtest_unsplit <- function() { ir <- IRanges(1:5, 11:15) f <- factor(c("a", "b", "a", "b", "b"), c("b", "a", "c")) rl <- split(ir, f) checkIdentical(unsplit(rl, f), ir) rl <- split(ir, f, drop=TRUE) checkIdentical(unsplit(rl, Rle(f), drop=TRUE), ir) checkException(unsplit(rl, f, drop=FALSE), silent=TRUE) v <- 1:5 l <- splitAsList(v, f) checkIdentical(unsplit(l, Rle(f)), v) } IRanges/inst/unitTests/test_setops-methods.R0000644000175400017540000000706613175713360022266 0ustar00biocbuildbiocbuildtest_IRanges_union <- function() { x <- IRanges(c(1, 4, 9), c(5, 7, 10)) y <- IRanges(c(2, 2, 10), c(2, 3, 12)) ans <- union(x, y) ans0 <- IRanges(c(1, 9), c(7, 12)) checkIdentical(ans, ans0) } test_IRanges_intersect <- function() { x <- IRanges(c(1, 4, 9), c(5, 7, 10)) y <- IRanges(c(2, 2, 10), c(2, 3, 12)) ans <- intersect(x, y) ans0 <- IRanges(c(2,10),c(3,10)) checkIdentical(ans, ans0) } test_IRanges_setdiff <- function() { x <- IRanges(c(1, 4, 9), c(5, 7, 10)) y <- IRanges(c(2, 2, 10), c(2, 3, 12)) ans <- setdiff(x, y) ans0 <- IRanges(c(1,4,9), c(1,7,9)) checkIdentical(ans, ans0) ans <- setdiff(y, x) ans0 <- IRanges(c(11), c(12)) checkIdentical(ans, ans0) } test_IRanges_punion <- function() { x <- IRanges(start=c(1,11,21,31,41,51,61,71), end=c(5,10,25,35,40,55,65,75)) y <- IRanges(start=c(1, 8,18,35,43,48,63,78), end=c(4,15,22,36,45,50,62,79)) ans0 <- IRanges(start=c(1,8,18,31,41,48,61,71), end=c(5,15,25,36,45,55,65,79)) checkIdentical(punion(x, y, fill.gap=TRUE), ans0) checkIdentical(punion(y, x, fill.gap=TRUE), ans0) } test_IRanges_pintersect <- function() { x <- IRanges(start=c(22,22,22,22,22,22), end=c(28,28,28,28,21,21)) y <- IRanges(start=c(25,30,29,25,22,22), end=c(30,40,40,24,21,29)) ansMaxStart <- IRanges(start=c(25,30,29,25,22,22), end=c(28,29,28,24,21,21)) ansStartX <- IRanges(start=c(25,22,29,25,22,22), end=c(28,21,28,24,21,21)) ansStartY <- IRanges(start=c(25,30,29,25,22,22), end=c(28,29,28,24,21,21)) checkException(pintersect(x, y), silent = TRUE) checkException(pintersect(y, x), silent = TRUE) for (resolve.empty in c("none", "max.start", "start.x")) { checkIdentical(x, pintersect(x, x, resolve.empty = resolve.empty)) checkIdentical(y, pintersect(y, y, resolve.empty = resolve.empty)) } checkIdentical(pintersect(x[-c(2,3)], y[-c(2,3)]), ansMaxStart[-c(2,3)]) checkIdentical(pintersect(y[-c(2,3)], x[-c(2,3)]), ansMaxStart[-c(2,3)]) checkIdentical(pintersect(x, y, resolve.empty = "max.start"), ansMaxStart) checkIdentical(pintersect(y, x, resolve.empty = "max.start"), ansMaxStart) checkIdentical(pintersect(x, y, resolve.empty = "start.x"), ansStartX) checkIdentical(pintersect(y, x, resolve.empty = "start.x"), ansStartY) } test_IRanges_psetdiff <- function() { x <- IRanges(start=c(1,11,21,31,41,51,61,71), end=c(5,10,25,35,40,55,65,75)) y <- IRanges(start=c(1, 8,18,35,43,48,63,78), end=c(4,15,22,36,45,50,62,79)) ans <- psetdiff(x[-7], y[-7]) ans0 <- IRanges(start=c(5,11,23,31,41,51,71), end=c(5,10,25,34,40,55,75)) checkIdentical(ans, ans0) ans <- psetdiff(y[-2], x[-2]) ans0 <- IRanges(start=c(1,18,36,43,48,63,78), end=c(0,20,36,45,50,62,79)) checkIdentical(ans, ans0) } test_IRanges_pgap <- function() { x <- IRanges(start=c(1,11,21,31,41,51,61,71), end=c(5,10,25,35,40,55,65,75)) y <- IRanges(start=c(1, 8,18,35,43,48,63,78), end=c(4,15,22,36,45,50,62,79)) ans <- pgap(x, y) checkIdentical(width(ans), c(0L, 0L, 0L, 0L, 2L, 0L, 0L, 2L)) checkIdentical(start(ans)[width(ans) != 0L], c(41L, 76L)) } test_RangesList_setops <- function() { rl1 <- RangesList(IRanges(c(1,2),c(4,3)), IRanges(c(4,6),c(10,7))) rl2 <- RangesList(IRanges(c(0,2),c(4,5)), IRanges(c(4,5),c(6,7))) checkIdentical(union(rl1, rl2), RangesList(union(rl1[[1]], rl2[[1]]), union(rl1[[2]], rl2[[2]]))) checkIdentical(intersect(rl1, rl2), RangesList(intersect(rl1[[1]], rl2[[1]]), intersect(rl1[[2]], rl2[[2]]))) checkIdentical(setdiff(rl1, rl2), RangesList(setdiff(rl1[[1]], rl2[[1]]), setdiff(rl1[[2]], rl2[[2]]))) } IRanges/inst/unitTests/test_split.R0000644000175400017540000000024013175713360020426 0ustar00biocbuildbiocbuildtest_splitAsList <- function() { ir <- IRanges(sample(100),sample(100)+100) ir2 <- unlist(split(ir, ceiling(1:100 / 10))) checkTrue(all(ir==ir2)) } IRanges/inst/unitTests/test_tile-methods.R0000644000175400017540000000313413175713360021676 0ustar00biocbuildbiocbuildtest_tile <- function() { ir <- IRanges() checkIdentical(tile(ir, n=3), IRangesList()) checkIdentical(tile(ir, width=2), IRangesList()) checkIdentical(tile(ir, n=0), IRangesList()) ir <- IRanges(1, 4) checkIdentical(tile(ir, n=2), IRangesList(IRanges(c(1, 3), c(2, 4)))) checkIdentical(tile(ir, n=2), tile(ir, width=2)) ir <- IRanges(1, 5) checkIdentical(tile(ir, n=3), IRangesList(IRanges(c(1, 2, 4), c(1, 3, 5)))) checkIdentical(tile(ir, n=3), tile(ir, width=2)) ir <- IRanges(1, 4) checkIdentical(tile(ir, n=3), IRangesList(IRanges(1:3, c(1, 2, 4)))) ir <- IRanges(1:3, width=5:3) checkIdentical(tile(ir, n=3), IRangesList(IRanges(c(1, 2, 4), c(1, 3, 5)), IRanges(c(2, 3, 4), c(2, 3, 5)), IRanges(c(3, 4, 5), c(3, 4, 5)))) checkIdentical(tile(ir, width=2), IRangesList(IRanges(c(1, 2, 4), c(1, 3, 5)), IRanges(c(2, 4), c(3, 5)), IRanges(c(3, 4), c(3, 5)))) checkIdentical(elementNROWS(tile(ir, width=4)), c(2L, 1L, 1L)) checkException(tile(ir, n=4), silent=TRUE) checkException(tile(ir, width=-1), silent=TRUE) checkException(tile(ir, n=-1), silent=TRUE) } test_slidingWindows <- function() { ir <- IRanges() checkIdentical(slidingWindows(ir, width=3), IRangesList()) ir <- IRanges(1:3, width=5:3) checkIdentical(slidingWindows(ir, width=3, step=2), IRangesList(IRanges(c(1, 3), c(3, 5)), IRanges(c(2, 4), c(4, 5)), IRanges(3, 5))) } IRanges/man/0000755000175400017540000000000013175713360013711 5ustar00biocbuildbiocbuildIRanges/man/AtomicList-class.Rd0000644000175400017540000002644513175713360017366 0ustar00biocbuildbiocbuild\name{AtomicList} \docType{class} % AtomicList classes \alias{class:AtomicList} \alias{AtomicList-class} \alias{AtomicList} \alias{class:CompressedAtomicList} \alias{CompressedAtomicList-class} \alias{CompressedAtomicList} \alias{class:SimpleAtomicList} \alias{SimpleAtomicList-class} \alias{SimpleAtomicList} \alias{class:LogicalList} \alias{LogicalList-class} \alias{LogicalList} \alias{class:CompressedLogicalList} \alias{CompressedLogicalList-class} \alias{CompressedLogicalList} \alias{class:SimpleLogicalList} \alias{SimpleLogicalList-class} \alias{SimpleLogicalList} \alias{class:IntegerList} \alias{IntegerList-class} \alias{IntegerList} \alias{class:CompressedIntegerList} \alias{CompressedIntegerList-class} \alias{CompressedIntegerList} \alias{class:SimpleIntegerList} \alias{SimpleIntegerList-class} \alias{SimpleIntegerList} \alias{class:NumericList} \alias{NumericList-class} \alias{NumericList} \alias{class:CompressedNumericList} \alias{CompressedNumericList-class} \alias{CompressedNumericList} \alias{class:SimpleNumericList} \alias{SimpleNumericList-class} \alias{SimpleNumericList} \alias{class:ComplexList} \alias{ComplexList-class} \alias{ComplexList} \alias{class:CompressedComplexList} \alias{CompressedComplexList-class} \alias{CompressedComplexList} \alias{class:SimpleComplexList} \alias{SimpleComplexList-class} \alias{SimpleComplexList} \alias{class:CharacterList} \alias{CharacterList-class} \alias{CharacterList} \alias{class:CompressedCharacterList} \alias{CompressedCharacterList-class} \alias{CompressedCharacterList} \alias{class:SimpleCharacterList} \alias{SimpleCharacterList-class} \alias{SimpleCharacterList} \alias{class:RawList} \alias{RawList-class} \alias{RawList} \alias{class:CompressedRawList} \alias{CompressedRawList-class} \alias{CompressedRawList} \alias{class:SimpleRawList} \alias{SimpleRawList-class} \alias{SimpleRawList} \alias{class:RleList} \alias{RleList-class} \alias{RleList} \alias{class:CompressedRleList} \alias{CompressedRleList-class} \alias{CompressedRleList} \alias{class:SimpleRleList} \alias{SimpleRleList-class} \alias{SimpleRleList} \alias{class:FactorList} \alias{FactorList-class} \alias{FactorList} \alias{class:CompressedFactorList} \alias{CompressedFactorList-class} \alias{CompressedFactorList} \alias{class:SimpleFactorList} \alias{SimpleFactorList-class} \alias{SimpleFactorList} % coercion \alias{as.list,CompressedAtomicList-method} \alias{coerce,CompressedAtomicList,list-method} \alias{as.vector,AtomicList-method} \alias{coerce,vector,AtomicList-method} \alias{as.matrix,AtomicList-method} \alias{lapply,CompressedAtomicList-method} \alias{coerce,vector,CompressedLogicalList-method} \alias{coerce,vector,SimpleLogicalList-method} \alias{coerce,vector,CompressedIntegerList-method} \alias{coerce,vector,SimpleIntegerList-method} \alias{coerce,vector,CompressedNumericList-method} \alias{coerce,vector,SimpleNumericList-method} \alias{coerce,vector,CompressedComplexList-method} \alias{coerce,vector,SimpleComplexList-method} \alias{coerce,vector,CompressedCharacterList-method} \alias{coerce,vector,SimpleCharacterList-method} \alias{coerce,vector,CompressedRawList-method} \alias{coerce,vector,SimpleRawList-method} \alias{coerce,vector,CompressedRleList-method} \alias{coerce,vector,SimpleRleList-method} \alias{coerce,AtomicList,LogicalList-method} \alias{coerce,AtomicList,IntegerList-method} \alias{coerce,AtomicList,NumericList-method} \alias{coerce,AtomicList,ComplexList-method} \alias{coerce,AtomicList,CharacterList-method} \alias{coerce,AtomicList,RawList-method} \alias{coerce,AtomicList,RleList-method} \alias{RleList,AtomicList,RleList-method} \alias{coerce,AtomicList,RleViews} \alias{unlist,SimpleFactorList-method} \alias{unlist,SimpleRleList-method} \alias{unique,RleList-method} \alias{unique,CompressedList-method} \alias{table,AtomicList-method} \alias{table,SimpleAtomicList-method} \alias{drop,AtomicList-method} \alias{duplicated,CompressedList-method} \alias{duplicated,CompressedAtomicList-method} \alias{sort,List-method} \alias{order,List-method} \alias{rank,List-method} \alias{runLength,RleList-method} \alias{runValue,RleList-method} \alias{runLength,CompressedRleList-method} \alias{runValue,CompressedRleList-method} \alias{runValue<-,CompressedRleList-method} \alias{runValue<-,SimpleRleList-method} \alias{ranges,RleList-method} \alias{ranges,CompressedRleList-method} \alias{show,AtomicList-method} \alias{show,RleList-method} \title{Lists of Atomic Vectors in Natural and Rle Form} \description{An extension of \code{\linkS4class{List}} that holds only atomic vectors in either a natural or run-length encoded form.} \details{ The lists of atomic vectors are \code{LogicalList}, \code{IntegerList}, \code{NumericList}, \code{ComplexList}, \code{CharacterList}, and \code{RawList}. There is also an \code{RleList} class for run-length encoded versions of these atomic vector types. Each of the above mentioned classes is virtual with Compressed* and Simple* non-virtual representations. } \section{Constructors}{ \describe{ \item{}{\code{LogicalList(..., compress = TRUE)}: Concatenates the \code{logical} vectors in \code{...} into a new \code{LogicalList}. If \code{compress}, the internal storage of the data is compressed.} \item{}{\code{IntegerList(..., compress = TRUE)}: Concatenates the \code{integer} vectors in \code{...} into a new \code{IntegerList}. If \code{compress}, the internal storage of the data is compressed.} \item{}{\code{NumericList(..., compress = TRUE)}: Concatenates the \code{numeric} vectors in \code{...} into a new \code{NumericList}. If \code{compress}, the internal storage of the data is compressed.} \item{}{\code{ComplexList(..., compress = TRUE)}: Concatenates the \code{complex} vectors in \code{...} into a new \code{ComplexList}. If \code{compress}, the internal storage of the data is compressed.} \item{}{\code{CharacterList(..., compress = TRUE)}: Concatenates the \code{character} vectors in \code{...} into a new \code{CharacterList}. If \code{compress}, the internal storage of the data is compressed.} \item{}{\code{RawList(..., compress = TRUE)}: Concatenates the \code{raw} vectors in \code{...} into a new \code{RawList}. If \code{compress}, the internal storage of the data is compressed.} \item{}{\code{RleList(..., compress = TRUE)}: Concatenates the run-length encoded atomic vectors in \code{...} into a new \code{RleList}. If \code{compress}, the internal storage of the data is compressed.} \item{}{\code{FactorList(..., compress = TRUE)}: Concatenates the \code{factor} objects in \code{...} into a new \code{FactorList}. If \code{compress}, the internal storage of the data is compressed.} } } \section{Coercion}{ \describe{ \item{}{ \code{as(from, "CompressedSplitDataFrameList")}, \code{as(from, "SimpleSplitDataFrameList")}: Creates a \linkS4class{CompressedSplitDataFrameList}/\linkS4class{SimpleSplitDataFrameList} instance from an AtomicList instance. } \item{}{ \code{as(from, "IRangesList")}, \code{as(from, "CompressedIRangesList")}, \code{as(from, "SimpleIRangesList")}: Creates a \linkS4class{CompressedIRangesList}/\linkS4class{SimpleIRangesList} instance from a LogicalList or logical RleList instance. Note that the elements of this instance are guaranteed to be normal. } \item{}{ \code{as(from, "NormalIRangesList")}, \code{as(from, "CompressedNormalIRangesList")}, \code{as(from, "SimpleNormalIRangesList")}: Creates a \linkS4class{CompressedNormalIRangesList}/\linkS4class{SimpleNormalIRangesList} instance from a LogicalList or logical RleList instance. } \item{}{\code{as(from, "CharacterList")}, \code{as(from, "ComplexList")}, \code{as(from, "IntegerList")}, \code{as(from, "LogicalList")}, \code{as(from, "NumericList")}, \code{as(from, "RawList")}, \code{as(from, "RleList")}: Coerces an \code{AtomicList} \code{from} to another derivative of \code{AtomicList}. } \item{}{\code{as(from, "AtomicList")}: If \code{from} is a vector, converts it to an \code{AtomicList} of the appropriate type. } \item{}{\code{drop(x)}: Checks if every element of \code{x} is of length one, and, if so, unlists \code{x}. Otherwise, an error is thrown. } \item{}{\code{as(from, "RleViews")}: Creates an RleViews where each view corresponds to an element of \code{from}. The subject is \code{unlist(from)}. } \item{}{\code{as.matrix(x, col.names=NULL)}: Maps the elements of the list to rows of a matrix. The column mapping depends on whether there are inner names (either on the object or provided via \code{col.names} as a List object). If there are no inner names, each row is padded with NAs to reach the length of the longest element. If there are inner names, there is a column for each unique name and the mapping is by name. To provide inner names, the \code{col.names} argument should be a List, usually a CharacterList or FactorList (which is particularly efficient). If \code{col.names} is a character vector, it names the columns of the result, but does not imply inner names. } } } \section{Compare, Order, Tabulate}{ The following methods are provided for element-wise comparison of 2 AtomicList objects, and ordering or tabulating of each list element of an AtomicList object: \code{is.na}, \code{duplicated}, \code{unique}, \code{match}, \code{\%in\%}, \code{table}, \code{order}, \code{sort}. } \section{RleList Methods}{ RleList has a number of methods that are not shared by other AtomicList derivatives. \describe{ \item{}{\code{runLength(x)}: Gets the run lengths of each element of the list, as an IntegerList. } \item{}{\code{runValue(x)}, \code{runValue(x) <- value}: Gets or sets the run values of each element of the list, as an AtomicList. } \item{}{\code{ranges(x)}: Gets the run ranges as a \code{RangesList}. } } } \author{P. Aboyoun} \seealso{ \itemize{ \item \link{AtomicList-utils} for common operations on AtomicList objects. \item \link[S4Vectors]{List} objects in the \pkg{S4Vectors} package for the parent class. } } \examples{ int1 <- c(1L,2L,3L,5L,2L,8L) int2 <- c(15L,45L,20L,1L,15L,100L,80L,5L) collection <- IntegerList(int1, int2) ## names names(collection) <- c("one", "two") names(collection) names(collection) <- NULL # clear names names(collection) names(collection) <- "one" names(collection) # c("one", NA) ## extraction collection[[1]] # range1 collection[["1"]] # NULL, does not exist collection[["one"]] # range1 collection[[NA_integer_]] # NULL ## subsetting collection[numeric()] # empty collection[NULL] # empty collection[] # identity collection[c(TRUE, FALSE)] # first element collection[2] # second element collection[c(2,1)] # reversed collection[-1] # drop first collection$one ## replacement collection$one <- int2 collection[[2]] <- int1 ## combining col1 <- IntegerList(one = int1, int2) col2 <- IntegerList(two = int2, one = int1) col3 <- IntegerList(int2) append(col1, col2) append(col1, col2, 0) col123 <- c(col1, col2, col3) col123 ## revElements revElements(col123) revElements(col123, 4:5) } \keyword{methods} \keyword{classes} IRanges/man/AtomicList-utils.Rd0000644000175400017540000002140413175713360017407 0ustar00biocbuildbiocbuild\name{AtomicList-utils} \alias{AtomicList-utils} \alias{Ops,CompressedAtomicList,CompressedAtomicList-method} \alias{Ops,SimpleAtomicList,SimpleAtomicList-method} \alias{Ops,SimpleAtomicList,CompressedAtomicList-method} \alias{Ops,CompressedAtomicList,SimpleAtomicList-method} \alias{Ops,AtomicList,atomic-method} \alias{Ops,atomic,AtomicList-method} \alias{Ops,CompressedAtomicList,atomic-method} \alias{Ops,atomic,CompressedAtomicList-method} \alias{Ops,SimpleAtomicList,atomic-method} \alias{Ops,atomic,SimpleAtomicList-method} \alias{Math,CompressedAtomicList-method} \alias{Math,SimpleAtomicList-method} \alias{Math2,CompressedAtomicList-method} \alias{Math2,SimpleAtomicList-method} \alias{Summary,AtomicList-method} \alias{Summary,CompressedRleList-method} \alias{Complex,CompressedAtomicList-method} \alias{Complex,SimpleAtomicList-method} \alias{sum,CompressedIntegerList-method} \alias{sum,CompressedLogicalList-method} \alias{sum,CompressedNumericList-method} \alias{which,CompressedLogicalList-method} \alias{which,SimpleLogicalList-method} \alias{which,CompressedRleList-method} \alias{which,SimpleRleList-method} \alias{which.max,CompressedRleList-method} \alias{which.max,RleList-method} \alias{which.max,IntegerList-method} \alias{which.max,NumericList-method} \alias{which.min,CompressedRleList-method} \alias{which.min,RleList-method} \alias{which.min,IntegerList-method} \alias{which.min,NumericList-method} \alias{all,CompressedRleList-method} \alias{all,CompressedAtomicList-method} \alias{any,CompressedAtomicList-method} \alias{anyNA,CompressedAtomicList-method} \alias{diff.AtomicList} \alias{diff,CompressedAtomicList-method} \alias{pmax,IntegerList-method} \alias{pmax,NumericList-method} \alias{pmax,RleList-method} \alias{pmin,IntegerList-method} \alias{pmin,NumericList-method} \alias{pmin,RleList-method} \alias{pmax.int,IntegerList-method} \alias{pmax.int,NumericList-method} \alias{pmax.int,RleList-method} \alias{pmin.int,IntegerList-method} \alias{pmin.int,NumericList-method} \alias{pmin.int,RleList-method} \alias{mean,AtomicList-method} \alias{mean,CompressedLogicalList-method} \alias{mean,CompressedIntegerList-method} \alias{mean,CompressedNumericList-method} \alias{mean,CompressedRleList-method} \alias{var,AtomicList,missing-method} \alias{var,AtomicList,AtomicList-method} \alias{cov,AtomicList,AtomicList-method} \alias{cor,AtomicList,AtomicList-method} \alias{sd,AtomicList-method} \alias{median,AtomicList-method} \alias{median,CompressedAtomicList-method} \alias{quantile,AtomicList-method} \alias{mad,AtomicList-method} \alias{IQR,AtomicList-method} \alias{cumsum,CompressedAtomicList-method} \alias{cumprod,CompressedAtomicList-method} \alias{cummin,CompressedAtomicList-method} \alias{cummax,CompressedAtomicList-method} \alias{range,CompressedIntegerList-method} \alias{range,CompressedNumericList-method} \alias{range,CompressedLogicalList-method} \alias{smoothEnds,CompressedIntegerList-method} \alias{smoothEnds,SimpleIntegerList-method} \alias{smoothEnds,NumericList-method} \alias{smoothEnds,RleList-method} \alias{runmed,CompressedIntegerList-method} \alias{runmed,SimpleIntegerList-method} \alias{runmed,NumericList-method} \alias{runmed,RleList-method} \alias{runmean,RleList-method} \alias{runsum,RleList-method} \alias{runwtsum,RleList-method} \alias{runq,RleList-method} \alias{nchar,CompressedCharacterList-method} \alias{nchar,SimpleCharacterList-method} \alias{nchar,CompressedRleList-method} \alias{nchar,SimpleRleList-method} \alias{chartr,ANY,ANY,CompressedCharacterList-method} \alias{chartr,ANY,ANY,SimpleCharacterList-method} \alias{chartr,ANY,ANY,CompressedRleList-method} \alias{chartr,ANY,ANY,SimpleRleList-method} \alias{tolower,CompressedCharacterList-method} \alias{tolower,SimpleCharacterList-method} \alias{tolower,CompressedRleList-method} \alias{tolower,SimpleRleList-method} \alias{toupper,CompressedCharacterList-method} \alias{toupper,SimpleCharacterList-method} \alias{toupper,CompressedRleList-method} \alias{toupper,SimpleRleList-method} \alias{sub,ANY,ANY,CompressedCharacterList-method} \alias{sub,ANY,ANY,SimpleCharacterList-method} \alias{sub,ANY,ANY,CompressedRleList-method} \alias{sub,ANY,ANY,SimpleRleList-method} \alias{gsub,ANY,ANY,CompressedCharacterList-method} \alias{gsub,ANY,ANY,SimpleCharacterList-method} \alias{gsub,ANY,ANY,CompressedRleList-method} \alias{gsub,ANY,ANY,SimpleRleList-method} \alias{unstrsplit,CharacterList-method} \alias{unstrsplit,RleList-method} \alias{paste,CompressedAtomicList-method} \alias{rank,CompressedAtomicList-method} \alias{order,CompressedAtomicList-method} \alias{selfmatch,CompressedAtomicList-method} \alias{intersect,CompressedAtomicList,CompressedAtomicList-method} \alias{ifelse2} \alias{ifelse2,ANY,ANY,List-method} \alias{ifelse2,ANY,List,ANY-method} \alias{ifelse2,List,ANY,ANY-method} \alias{ifelse2,CompressedLogicalList,ANY,ANY-method} \alias{ifelse2,CompressedLogicalList,List,ANY-method} \alias{ifelse2,CompressedLogicalList,ANY,List-method} \alias{ifelse2,CompressedLogicalList,List,ANY-method} \alias{ifelse2,CompressedLogicalList,List,List-method} \alias{ifelse2,SimpleLogicalList,ANY,ANY-method} \alias{ifelse2,SimpleLogicalList,List,ANY-method} \alias{ifelse2,SimpleLogicalList,ANY,List-method} \alias{ifelse2,SimpleLogicalList,List,ANY-method} \alias{ifelse2,SimpleLogicalList,List,List-method} \title{Common operations on AtomicList objects} \description{ Common operations on \link{AtomicList} objects. } \section{Group Generics}{ AtomicList objects have support for S4 group generic functionality to operate within elements across objects: \describe{ \item{\code{Arith}}{\code{"+"}, \code{"-"}, \code{"*"}, \code{"^"}, \code{"\%\%"}, \code{"\%/\%"}, \code{"/"}} \item{\code{Compare}}{\code{"=="}, \code{">"}, \code{"<"}, \code{"!="}, \code{"<="}, \code{">="}} \item{\code{Logic}}{\code{"&"}, \code{"|"}} \item{\code{Ops}}{\code{"Arith"}, \code{"Compare"}, \code{"Logic"}} \item{\code{Math}}{\code{"abs"}, \code{"sign"}, \code{"sqrt"}, \code{"ceiling"}, \code{"floor"}, \code{"trunc"}, \code{"cummax"}, \code{"cummin"}, \code{"cumprod"}, \code{"cumsum"}, \code{"log"}, \code{"log10"}, \code{"log2"}, \code{"log1p"}, \code{"acos"}, \code{"acosh"}, \code{"asin"}, \code{"asinh"}, \code{"atan"}, \code{"atanh"}, \code{"exp"}, \code{"expm1"}, \code{"cos"}, \code{"cosh"}, \code{"sin"}, \code{"sinh"}, \code{"tan"}, \code{"tanh"}, \code{"gamma"}, \code{"lgamma"}, \code{"digamma"}, \code{"trigamma"}} \item{\code{Math2}}{\code{"round"}, \code{"signif"}} \item{\code{Summary}}{\code{"max"}, \code{"min"}, \code{"range"}, \code{"prod"}, \code{"sum"}, \code{"any"}, \code{"all"}} \item{\code{Complex}}{\code{"Arg"}, \code{"Conj"}, \code{"Im"}, \code{"Mod"}, \code{"Re"}} } See \link[methods]{S4groupGeneric} for more details. } \section{Other Methods}{ The AtomicList objects also support a large number of basic methods. Like the group generics above, these methods perform the corresponding operation on each element of the list separately. The methods are: \describe{ \item{Logical}{\code{!}, \code{which}, \code{which.max}, \code{which.min}} \item{Numeric}{\code{diff}, \code{pmax}, \code{pmax.int}, \code{pmin}, \code{pmin.int}, \code{mean}, \code{var}, \code{cov}, \code{cor}, \code{sd}, \code{median}, \code{quantile}, \code{mad}, \code{IQR}} \item{Running Window}{\code{smoothEnds}, \code{runmed}. \code{runmean}, \code{runsum}, \code{runwtsum}, \code{runq}} \item{Character}{\code{nchar}, \code{chartr}, \code{tolower}, \code{toupper}, \code{sub}, \code{gsub}} } The \code{which.min} and \code{which.max} functions have an extra argument, \code{global=FALSE}, which controls whether the returned subscripts are global (compatible with the unlisted form of the input) or local (compatible with the corresponding list element). The \code{rank} method only supports tie methods \dQuote{average}, \dQuote{first}, \dQuote{min} and \dQuote{max}. Since \code{\link{ifelse}} relies on non-standard evaluation for arguments that need to be in the generic signature, we provide \code{ifelse2}, which has eager but otherwise equivalent semantics. } \section{Specialized Methods}{ \describe{ \item{}{\code{unstrsplit(x, sep="")}}: A fast \code{sapply(x, paste0, collapse=sep)}. See \code{?\link{unstrsplit}} for the details. } } \author{P. Aboyoun} \seealso{ \itemize{ \item \link{AtomicList} objects. } } \examples{ ## group generics int1 <- c(1L,2L,3L,5L,2L,8L) int2 <- c(15L,45L,20L,1L,15L,100L,80L,5L) col1 <- IntegerList(one = int1, int2) 2 * col1 col1 + col1 col1 > 2 sum(col1) # equivalent to (but faster than) 'sapply(col1, sum)' mean(col1) # equivalent to 'sapply(col1, mean)' } \keyword{methods} IRanges/man/CompressedHitsList-class.Rd0000644000175400017540000000146013175713360021074 0ustar00biocbuildbiocbuild\name{CompressedHitsList-class} \docType{class} \alias{class:CompressedHitsList} \alias{CompressedHitsList-class} \alias{CompressedHitsList} % coercion \alias{as.matrix,CompressedHitsList-method} % accessors \alias{space,CompressedHitsList-method} \alias{from,CompressedHitsList-method} \alias{to,CompressedHitsList-method} \alias{nLnode,CompressedHitsList-method} \alias{nRnode,CompressedHitsList-method} \title{CompressedHitsList objects} \description{ An efficient representation of \link{HitsList} objects. See \code{?\link{HitsList}} for more information about \link{HitsList} objects. } \note{ This class is highly experimental. It has not been well tested and may disappear at any time. } \author{Michael Lawrence} \seealso{ \link{HitsList} objects. } \keyword{methods} \keyword{classes} IRanges/man/CompressedList-class.Rd0000644000175400017540000001247613175713360020255 0ustar00biocbuildbiocbuild\name{CompressedList-class} \docType{class} \alias{class:CompressedList} \alias{CompressedList} \alias{CompressedList-class} % accessors \alias{length,CompressedList-method} \alias{names,CompressedList-method} \alias{names<-,CompressedList-method} \alias{elementNROWS,CompressedList-method} \alias{$<-,CompressedList-method} \alias{[[<-,CompressedList-method} \alias{is.na,CompressedList-method} % coercion \alias{unlist,CompressedList-method} \alias{coerce,ANY,CompressedList-method} % combining \alias{c,CompressedList-method} % looping \alias{lapply,CompressedList-method} \alias{endoapply,CompressedList-method} \alias{mendoapply,CompressedList-method} \alias{revElements,CompressedList-method} % displaying \alias{classNameForDisplay,CompressedList-method} % ops \alias{!,CompressedList-method} % setops \alias{match,CompressedList,vector-method} \title{CompressedList objects} \description{ Like the \link[S4Vectors]{SimpleList} class defined in the \pkg{S4Vectors} package, the CompressedList class extends the \link[S4Vectors]{List} virtual class. } \details{ Unlike the \link[S4Vectors]{SimpleList} class, CompressedList is virtual, that is, it cannot be instantiated. Many concrete (i.e. non-virtual) CompressedList subclasses are defined and documented in this package (e.g. \link{CompressedIntegerList}, \link{CompressedCharacterList}, \link{CompressedRleList}, etc...), as well as in other packages (e.g. \link[GenomicRanges]{GRangesList} in the \pkg{GenomicRanges} package, \link[GenomicAlignments]{GAlignmentsList} in the \pkg{GenomicAlignments} package, etc...). It's easy for developers to extend CompressedList to create a new CompressedList subclass and there is generally very little work involved to make this new subclass fully operational. In a CompressedList object the list elements are concatenated together in a single vector-like object. The \emph{partitioning} of this single vector-like object (i.e. the information about where each original list element starts and ends) is also kept in the CompressedList object. This internal representation is generally more memory efficient than \link[S4Vectors]{SimpleList}, especially if the object has many list elements (e.g. thousands or millions). Also it makes it possible to implement many basic list operations very efficiently. Many objects like \link{LogicalList}, \link{IntegerList}, \link{CharacterList}, \link{RleList}, etc... exist in 2 flavors: CompressedList and \link[S4Vectors]{SimpleList}. Each flavor is incarnated by a concrete subclass: \link{CompressedLogicalList} and \link{SimpleLogicalList} for virtual class \link{LogicalList}, \link{CompressedIntegerList} and \link{SimpleIntegerList} for virtual class \link{IntegerList}, etc... It's easy to switch from one representation to the other with \code{as(x, "CompressedList")} and \code{as(x, "SimpleList")}. Also the constructor function for those virtual classes have a switch that lets the user choose the representation at construction time e.g. \code{CharacterList(..., compress=TRUE)} or \code{CharacterList(..., compress=FALSE)}. See below for more information. } \section{Constructor}{ See the \link[S4Vectors]{List} man page in the \pkg{S4Vectors} package for a quick overview of how to construct \link{List} objects in general. Unlike for \link[S4Vectors]{SimpleList} objects, there is no \code{CompressedList} constructor function. However, many constructor functions for \link[S4Vectors]{List} objects have a switch that lets the user choose between the CompressedList and \link[S4Vectors]{SimpleList} representation at construction time. For example, a \link{CompressedCharacterList} object can be constructed with \code{CharacterList(..., compress=TRUE)}. } \section{Accessors}{ Same as for \link[S4Vectors]{List} objects. See the \link[S4Vectors]{List} man page in the \pkg{S4Vectors} package for more information. } \section{Coercion}{ All the coercions documented in the \link[S4Vectors]{List} man page apply to CompressedList objects. } \section{Subsetting}{ Same as for \link[S4Vectors]{List} objects. See the \link[S4Vectors]{List} man page for more information. } \section{Looping and functional programming}{ Same as for \link[S4Vectors]{List} objects. See \code{?`\link[S4Vectors]{List-utils}`} in the \pkg{S4Vectors} package for more information. } \section{Displaying}{ When a CompressedList object is displayed, the "Compressed" prefix is removed from the real class name of the object. See \code{\link[S4Vectors]{classNameForDisplay}} in the \pkg{S4Vectors} package for more information about this. } \seealso{ \itemize{ \item The \link[S4Vectors]{List} class defined and documented in the \pkg{S4Vectors} package for the parent class. \item The \link[S4Vectors]{SimpleList} class defined and documented in the \pkg{S4Vectors} package for an alternative to CompressedList. \item The \link{CompressedIntegerList} class for a CompressedList subclass example. } } \examples{ ## Displaying a CompressedList object: x <- IntegerList(11:12, integer(0), 3:-2, compress=TRUE) class(x) ## The "Simple" prefix is removed from the real class name of the ## object: x ## This is controlled by internal helper classNameForDisplay(): classNameForDisplay(x) } \keyword{methods} \keyword{classes} IRanges/man/DataFrame-utils.Rd0000644000175400017540000000221313175713360017160 0ustar00biocbuildbiocbuild\name{DataFrame-utils} \alias{DataFrame-utils} \alias{relistToClass,DataFrame-method} \alias{relistToClass,data.frame-method} % splitting \alias{mstack,DataFrame-method} \title{Common operations on DataFrame objects} \description{ Common operations on \link{DataFrame} objects. } \section{Splitting}{ In the following code snippets, \code{x} is a \code{DataFrame}. \describe{ \item{}{\code{split(x, f, drop = FALSE)}: Splits \code{x} into a \code{\linkS4class{CompressedSplitDataFrameList}}, according to \code{f}, dropping elements corresponding to unrepresented levels if \code{drop} is \code{TRUE}. } \item{}{ \code{mstack(..., .index.var = "name")}: Stacks the data frames passed as through \dots, using \code{.index.var} as the index column name. See \code{\link{stack}}. } } } \author{ Michael Lawrence } \seealso{ \code{\linkS4class{DataTable}} and \code{\linkS4class{Vector}} } \examples{ ## split sw <- DataFrame(swiss) swsplit <- split(sw, sw[["Education"]]) ## rbind do.call(rbind, as.list(swsplit)) ## cbind cbind(DataFrame(score), DataFrame(counts)) } \keyword{methods} IRanges/man/DataFrameList-class.Rd0000644000175400017540000001664613175713360020000 0ustar00biocbuildbiocbuild\name{DataFrameList-class} \docType{class} \alias{DataFrameList-class} \alias{SimpleDataFrameList-class} \alias{SplitDataFrameList-class} \alias{CompressedSplitDataFrameList-class} \alias{SimpleSplitDataFrameList-class} % accessors \alias{nrow,DataFrameList-method} \alias{ncol,DataFrameList-method} \alias{ncol,CompressedSplitDataFrameList-method} \alias{ncol,SimpleSplitDataFrameList-method} \alias{dim,DataFrameList-method} \alias{rownames,DataFrameList-method} \alias{colnames,DataFrameList-method} \alias{colnames,CompressedSplitDataFrameList-method} \alias{colnames,SimpleSplitDataFrameList-method} \alias{dimnames,DataFrameList-method} \alias{rownames<-,CompressedSplitDataFrameList-method} \alias{rownames<-,SimpleDataFrameList-method} \alias{colnames<-,CompressedSplitDataFrameList-method} \alias{colnames<-,SimpleDataFrameList-method} \alias{dimnames<-,DataFrameList-method} \alias{columnMetadata} \alias{columnMetadata<-} \alias{columnMetadata,SimpleSplitDataFrameList-method} \alias{columnMetadata<-,SimpleSplitDataFrameList-method} \alias{columnMetadata,CompressedSplitDataFrameList-method} \alias{columnMetadata<-,CompressedSplitDataFrameList-method} % constructor \alias{DataFrameList} \alias{SplitDataFrameList} % subsetting \alias{[,SimpleSplitDataFrameList-method} \alias{[,CompressedSplitDataFrameList-method} \alias{[<-,SplitDataFrameList-method} % transformation \alias{transform,SplitDataFrameList-method} % coercion \alias{coerce,DataFrameList,DataFrame-method} \alias{coerce,SplitDataFrameList,DataFrame-method} \alias{coerce,ANY,CompressedSplitDataFrameList-method} \alias{coerce,ANY,SimpleSplitDataFrameList-method} \alias{coerce,List,CompressedSplitDataFrameList-method} \alias{coerce,list,SplitDataFrameList-method} \alias{coerce,List,SimpleSplitDataFrameList-method} \alias{coerce,ANY,SplitDataFrameList-method} \alias{coerce,SimpleList,SplitDataFrameList-method} \alias{coerce,list,SplitDataFrameList-method} \alias{coerce,DataFrame,SplitDataFrameList-method} \alias{stack,DataFrameList-method} % splitting and combining \alias{cbind,DataFrameList-method} \alias{rbind,DataFrameList-method} % show \alias{show,SplitDataFrameList-method} % SDFLWrapperForTransform (internal) \alias{[[,SDFLWrapperForTransform-method} \alias{[[<-,SDFLWrapperForTransform-method} \alias{as.env,SDFLWrapperForTransform-method} \title{List of DataFrames} \description{Represents a list of \code{\linkS4class{DataFrame}} objects. The \code{SplitDataFrameList} class contains the additional restriction that all the columns be of the same name and type. Internally it is stored as a list of \code{DataFrame} objects and extends \code{\linkS4class{List}}.} \section{Accessors}{ In the following code snippets, \code{x} is a \code{DataFrameList}. \describe{ \item{}{\code{dim(x)}: Get the two element integer vector indicating the number of rows and columns over the entire dataset.} \item{}{\code{dimnames(x)}: Get the list of two character vectors, the first holding the rownames (possibly \code{NULL}) and the second the column names. } \item{}{\code{columnMetadata(x)}: Get the \code{DataFrame} of metadata along the columns, i.e., where each column in \code{x} is represented by a row in the metadata. The metadata is common across all elements of \code{x}. Note that calling \code{mcols(x)} returns the metadata on the \code{DataFrame} elements of \code{x}. } \item{}{\code{columnMetadata(x) <- value}: Set the \code{DataFrame} of metadata for the columns. } } } \section{Subsetting}{ In the following code snippets, \code{x} is a \code{SplitDataFrameList}. In general \code{x} follows the conventions of \code{SimpleList}/\code{CompressedList} with the following addition: \describe{ \item{}{ \code{x[i,j,drop]}: If matrix subsetting is used, \code{i} selects either the list elements or the rows within the list elements as determined by the \code{[} method for \code{SimpleList}/\code{CompressedList}, \code{j} selects the columns, and \code{drop} is used when one column is selected and output can be coerced into an \code{AtomicList} or \code{RangesList} subclass. } \item{}{\code{x[i,j] <- value}: If matrix subsetting is used, \code{i} selects either the list elements or the rows within the list elements as determined by the \code{[<-} method for \code{SimpleList}/\code{CompressedList}, \code{j} selects the columns and \code{value} is the replacement value for the selected region. } } } \section{Constructor}{ \describe{ \item{}{\code{DataFrameList(...)}: Concatenates the \code{DataFrame} objects in \code{...} into a new \code{DataFrameList}.} \item{}{\code{SplitDataFrameList(..., compress = TRUE, cbindArgs = FALSE)}: If \code{cbindArgs} is \code{FALSE}, the \code{...} arguments are coerced to \code{DataFrame} objects and concatenated to form the result. The arguments must have the same number and names of columns. If \code{cbindArgs} is \code{TRUE}, the arguments are combined as columns. The arguments must then be the same length, with each element of an argument mapping to an element in the result. If \code{compress = TRUE}, returns a \code{CompressedSplitDataFrameList}; else returns a \code{SimpleSplitDataFrameList}.} } } \section{Combining}{ In the following code snippets, objects in \code{...} are of class \code{DataFrameList}. \describe{ \item{}{ \code{rbind(...)}: Creates a new \code{DataFrameList} containing the element-by-element row concatenation of the objects in \code{...}. } \item{}{ \code{cbind(...)}: Creates a new \code{DataFrameList} containing the element-by-element column concatenation of the objects in \code{...}. } } } \section{Transformation}{ \describe{ \item{}{\code{transform(`_data`, ...)}: Transforms a \code{SplitDataFrame} in a manner analogous to the base \code{\link{transform}}, where the columns are \code{List} objects adhering to the structure of \code{_data}. } } } \section{Coercion}{ In the following code snippets, \code{x} is a \code{DataFrameList}. \describe{ \item{}{\code{as(from, "DataFrame")}: Coerces a \code{SplitDataFrameList} to a \code{DataFrame}, which has a column for every column in \code{from}, except each column is a \code{List} with the same structure as \code{from}. } \item{}{\code{as(from, "SplitDataFrameList")}: By default, simply calls the \code{SplitDataFrameList} constructor on \code{from}. If \code{from} is a \code{List}, each element of \code{from} is passed as an argument to \code{SplitDataFrameList}, like calling \code{as.list} on a vector. If \code{from} is a \code{DataFrame}, each row becomes an element in the list.} \item{}{\code{stack(x, index.var = "name")}: Unlists \code{x} and adds a column named \code{index.var} to the result, indicating the element of \code{x} from which each row was obtained. } \item{}{ \code{as.data.frame(x, row.names = NULL, optional = FALSE, ..., value.name = "value", use.outer.mcols = FALSE, group_name.as.factor = FALSE)}: Coerces \code{x} to a \code{data.frame}. See as.data.frame on the \code{List} man page for details (?\code{List}). } } } \author{ Michael Lawrence } \seealso{ \code{\linkS4class{DataFrame}} } \keyword{methods} \keyword{classes} IRanges/man/Grouping-class.Rd0000644000175400017540000004740113175713360017103 0ustar00biocbuildbiocbuild\name{Grouping-class} \docType{class} % Grouping objects \alias{class:Grouping} \alias{Grouping-class} \alias{Grouping} \alias{nobj} \alias{grouplengths} \alias{grouplengths,Grouping-method} \alias{grouplengths,CompressedGrouping-method} \alias{show,Grouping-method} % ManyToOneGrouping objects \alias{class:ManyToOneGrouping} \alias{ManyToOneGrouping-class} \alias{ManyToOneGrouping} \alias{nobj,ManyToOneGrouping-method} \alias{nobj,CompressedManyToOneGrouping-method} \alias{members} \alias{members,ManyToOneGrouping-method} \alias{vmembers} \alias{vmembers,ManyToOneGrouping-method} \alias{togroup} \alias{togroup,ManyToOneGrouping-method} \alias{togrouplength} \alias{togrouplength,ManyToOneGrouping-method} \alias{coerce,grouping,Grouping-method} \alias{coerce,grouping,ManyToOneGrouping-method} \alias{coerce,vector,Grouping-method} \alias{coerce,vector,ManyToOneGrouping-method} \alias{coerce,ManyToOneGrouping,factor-method} \alias{coerce,DataFrame,Grouping-method} \alias{coerce,FactorList,Grouping-method} \alias{coerce,Hits,Grouping-method} % ManyToManyGrouping objects \alias{nobj,BaseManyToManyGrouping-method} \alias{coerce,vector,ManyToManyGrouping-method} % H2LGrouping and Dups objects \alias{class:H2LGrouping} \alias{H2LGrouping-class} \alias{H2LGrouping} \alias{high2low} \alias{high2low,H2LGrouping-method} \alias{high2low,ANY-method} \alias{low2high} \alias{low2high,H2LGrouping-method} \alias{length,H2LGrouping-method} \alias{nobj,H2LGrouping-method} \alias{grouplengths,H2LGrouping-method} \alias{members,H2LGrouping-method} \alias{vmembers,H2LGrouping-method} \alias{togroup,H2LGrouping-method} \alias{grouprank} \alias{grouprank,H2LGrouping-method} \alias{togrouprank} \alias{togrouprank,H2LGrouping-method} \alias{length<-,H2LGrouping-method} \alias{class:Dups} \alias{Dups-class} \alias{Dups} \alias{duplicated,Dups-method} \alias{show,Dups-method} % ManyToManyGrouping objects \alias{class:ManyToManyGrouping} \alias{ManyToManyGrouping-class} \alias{ManyToManyGrouping} \alias{nobj,ManyToManyGrouping-method} % GroupingRanges objects \alias{class:GroupingRanges} \alias{GroupingRanges-class} \alias{GroupingRanges} \alias{grouplengths,GroupingRanges-method} \alias{class:GroupingIRanges} \alias{GroupingIRanges-class} \alias{GroupingIRanges} % Partitioning objects \alias{class:Partitioning} \alias{Partitioning-class} \alias{Partitioning} \alias{togroup,Partitioning-method} \alias{names,Partitioning-method} \alias{names<-,Partitioning-method} \alias{class:PartitioningByEnd} \alias{PartitioningByEnd-class} \alias{PartitioningByEnd} \alias{end,PartitioningByEnd-method} \alias{length,PartitioningByEnd-method} \alias{nobj,PartitioningByEnd-method} \alias{start,PartitioningByEnd-method} \alias{width,PartitioningByEnd-method} \alias{coerce,Ranges,PartitioningByEnd-method} \alias{class:PartitioningByWidth} \alias{PartitioningByWidth-class} \alias{PartitioningByWidth} \alias{width,PartitioningByWidth-method} \alias{end,PartitioningByWidth-method} \alias{start,PartitioningByWidth-method} \alias{coerce,Ranges,PartitioningByWidth-method} % PartitioningMap objects \alias{class:PartitioningMap} \alias{PartitioningMap-class} \alias{PartitioningMap} \alias{mapOrder} \alias{mapOrder,PartitioningMap-method} \alias{show,PartitioningMap-method} % SimpleGrouping & CompressedGrouping objects \alias{class:SimpleGrouping} \alias{SimpleGrouping-class} \alias{class:CompressedGrouping} \alias{CompressedGrouping-class} \alias{class:SimpleManyToOneGrouping} \alias{SimpleManyToOneGrouping-class} \alias{class:CompressedManyToOneGrouping} \alias{CompressedManyToOneGrouping-class} \alias{class:SimpleManyToManyGrouping} \alias{SimpleManyToManyGrouping-class} \alias{class:CompressedManyToManyGrouping} \alias{CompressedManyToManyGrouping-class} % old stuff (deprecated & defunct) \alias{togroup,ANY-method} \title{Grouping objects} \description{ We call \emph{grouping} an arbitrary mapping from a collection of NO objects to a collection of NG groups, or, more formally, a bipartite graph between integer sets [1, NO] and [1, NG]. Objects mapped to a given group are said to belong to, or to be assigned to, or to be in that group. Additionally, the objects in each group are ordered. So for example the 2 following groupings are considered different: \preformatted{ Grouping 1: NG = 3, NO = 5 group objects 1 : 4, 2 2 : 3 : 4 Grouping 2: NG = 3, NO = 5 group objects 1 : 2, 4 2 : 3 : 4 } There are no restriction on the mapping e.g. any object can be mapped to 0, 1, or more groups, and can be mapped twice to the same group. Also some or all the groups can be empty. The Grouping class is a virtual class that formalizes the most general kind of grouping. More specific groupings (e.g. \emph{many-to-one groupings} or \emph{block-groupings}) are formalized via specific Grouping subclasses. This man page documents the core Grouping API, and 3 important Grouping subclasses: ManyToOneGrouping, GroupingRanges, and Partitioning (the last one deriving from the 2 first). } \section{The core Grouping API}{ Let's give a formal description of the core Grouping API: Groups G_i are indexed from 1 to NG (1 <= i <= NG). Objects O_j are indexed from 1 to NO (1 <= j <= NO). Given that empty groups are allowed, NG can be greater than NO. If \code{x} is a Grouping object: \describe{ \item{}{ \code{length(x)}: Returns the number of groups (NG). } \item{}{ \code{names(x)}: Returns the names of the groups. } \item{}{ \code{nobj(x)}: Returns the number of objects (NO). } } Going from groups to objects: \describe{ \item{}{ \code{x[[i]]}: Returns the indices of the objects (the j's) that belong to G_i. This provides the mapping from groups to objects. } \item{}{ \code{grouplengths(x, i=NULL)}: Returns the number of objects in G_i. Works in a vectorized fashion (unlike \code{x[[i]]}). \code{grouplengths(x)} is equivalent to \code{grouplengths(x, seq_len(length(x)))}. If \code{i} is not NULL, \code{grouplengths(x, i)} is equivalent to \code{sapply(i, function(ii) length(x[[ii]]))}. } } Note to developers: Given that \code{length}, \code{names} and \code{[[} are expected to work on any Grouping object, those objects can be seen as \link{List} objects. More precisely, the Grouping class actually extends the \link{IntegerList} class. In particular, many other "list" operations like \code{as.list}, \code{elementNROWS}, and \code{unlist}, etc... should work out-of-the-box on any Grouping object. } \section{ManyToOneGrouping objects}{ The ManyToOneGrouping class is a virtual subclass of Grouping for representing \emph{many-to-one groupings}, that is, groupings where each object in the original collection of objects belongs to exactly one group. The grouping of an empty collection of objects in an arbitrary number of (necessarily empty) groups is a valid ManyToOneGrouping object. Note that, for a ManyToOneGrouping object, if NG is 0 then NO must also be 0. The ManyToOneGrouping API extends the core Grouping API by adding a couple more operations for going from groups to objects: \describe{ \item{}{ \code{members(x, i)}: Equivalent to \code{x[[i]]} if \code{i} is a single integer. Otherwise, if \code{i} is an integer vector of arbitrary length, it's equivalent to \code{sort(unlist(sapply(i, function(ii) x[[ii]])))}. } \item{}{ \code{vmembers(x, L)}: A version of \code{members} that works in a vectorized fashion with respect to the \code{L} argument (\code{L} must be a list of integer vectors). Returns \code{lapply(L, function(i) members(x, i))}. } } And also by adding operations for going from objects to groups: \describe{ \item{}{ \code{togroup(x, j=NULL)}: Returns the index i of the group that O_j belongs to. This provides the mapping from objects to groups (many-to-one mapping). Works in a vectorized fashion. \code{togroup(x)} is equivalent to \code{togroup(x, seq_len(nobj(x)))}: both return the entire mapping in an integer vector of length NO. If \code{j} is not NULL, \code{togroup(x, j)} is equivalent to \code{y <- togroup(x); y[j]}. } \item{}{ \code{togrouplength(x, j=NULL)}: Returns the number of objects that belong to the same group as O_j (including O_j itself). Equivalent to \code{grouplengths(x, togroup(x, j))}. } } One important property of any ManyToOneGrouping object \code{x} is that \code{unlist(as.list(x))} is always a permutation of \code{seq_len(nobj(x))}. This is a direct consequence of the fact that every object in the grouping belongs to one group and only one. } \section{2 ManyToOneGrouping concrete subclasses: H2LGrouping, Dups and SimpleManyToOneGrouping}{ [DOCUMENT ME] Constructors: \describe{ \item{}{ \code{H2LGrouping(high2low=integer())}: [DOCUMENT ME] } \item{}{ \code{Dups(high2low=integer())}: [DOCUMENT ME] } \item{}{ \code{ManyToOneGrouping(..., compress=TRUE)}: Collect \code{\dots} into a \code{ManyToOneGrouping}. The arguments will be coerced to integer vectors and combined into a list, unless there is a single list argument, which is taken to be an integer list. The resulting integer list should have a structure analogous to that of \code{Grouping} itself: each element represents a group in terms of the subscripts of the members. If \code{compress} is \code{TRUE}, the representation uses a \code{CompressedList}, otherwise a \code{SimpleList}. } } } \section{ManyToManyGrouping objects}{ The ManyToManyGrouping class is a virtual subclass of Grouping for representing \emph{many-to-many groupings}, that is, groupings where each object in the original collection of objects belongs to any number of groups. Constructors: \describe{ \item{}{ \code{ManyToManyGrouping(x, compress=TRUE)}: Collect \code{\dots} into a \code{ManyToManyGrouping}. The arguments will be coerced to integer vectors and combined into a list, unless there is a single list argument, which is taken to be an integer list. The resulting integer list should have a structure analogous to that of \code{Grouping} itself: each element represents a group in terms of the subscripts of the members. If \code{compress} is \code{TRUE}, the representation uses a \code{CompressedList}, otherwise a \code{SimpleList}. } } } \section{GroupingRanges objects}{ The GroupingRanges class is a virtual subclass of Grouping for representing \emph{block-groupings}, that is, groupings where each group is a block of adjacent elements in the original collection of objects. GroupingRanges objects support the Ranges API (e.g. \code{\link{start}}, \code{\link{end}}, \code{\link{width}}, etc...) in addition to the Grouping API. See \code{?\link{Ranges}} for a description of the \link{Ranges} API. } \section{Partitioning objects}{ The Partitioning class is a virtual subclass of GroupingRanges for representing \emph{block-groupings} where the blocks fully cover the original collection of objects and don't overlap. Since this makes them \emph{many-to-one groupings}, the Partitioning class is also a subclass of ManyToOneGrouping. An additional constraint of Partitioning objects is that the blocks must be ordered by ascending position with respect to the original collection of objects. The Partitioning virtual class itself has 3 concrete subclasses: PartitioningByEnd (only stores the end of the groups, allowing fast mapping from groups to objects), and PartitioningByWidth (only stores the width of the groups), and PartitioningMap which contains PartitioningByEnd and two additional slots to re-order and re-list the object to a related mapping. Constructors: \describe{ \item{}{ \code{PartitioningByEnd(x=integer(), NG=NULL, names=NULL)}: \code{x} must be either a list-like object or a sorted integer vector. \code{NG} must be either \code{NULL} or a single integer. \code{names} must be either \code{NULL} or a character vector of length \code{NG} (if supplied) or \code{length(x)} (if \code{NG} is not supplied). Returns the following PartitioningByEnd object \code{y}: \itemize{ \item If \code{x} is a list-like object, then the returned object \code{y} has the same length as \code{x} and is such that \code{width(y)} is identical to \code{elementNROWS(x)}. \item If \code{x} is an integer vector and \code{NG} is not supplied, then \code{x} must be sorted (checked) and contain non-NA non-negative values (NOT checked). The returned object \code{y} has the same length as \code{x} and is such that \code{end(y)} is identical to \code{x}. \item If \code{x} is an integer vector and \code{NG} is supplied, then \code{x} must be sorted (checked) and contain values >= 1 and <= \code{NG} (checked). The returned object \code{y} is of length \code{NG} and is such that \code{togroup(y)} is identical to \code{x}. } If the \code{names} argument is supplied, it is used to name the partitions. } \item{}{ \code{PartitioningByWidth(x=integer(), NG=NULL, names=NULL)}: \code{x} must be either a list-like object or an integer vector. \code{NG} must be either \code{NULL} or a single integer. \code{names} must be either \code{NULL} or a character vector of length \code{NG} (if supplied) or \code{length(x)} (if \code{NG} is not supplied). Returns the following PartitioningByWidth object \code{y}: \itemize{ \item If \code{x} is a list-like object, then the returned object \code{y} has the same length as \code{x} and is such that \code{width(y)} is identical to \code{elementNROWS(x)}. \item If \code{x} is an integer vector and \code{NG} is not supplied, then \code{x} must contain non-NA non-negative values (NOT checked). The returned object \code{y} has the same length as \code{x} and is such that \code{width(y)} is identical to \code{x}. \item If \code{x} is an integer vector and \code{NG} is supplied, then \code{x} must be sorted (checked) and contain values >= 1 and <= \code{NG} (checked). The returned object \code{y} is of length \code{NG} and is such that \code{togroup(y)} is identical to \code{x}. } If the \code{names} argument is supplied, it is used to name the partitions. } \item{}{ \code{PartitioningMap(x=integer(), mapOrder=integer())}: \code{x} is a list-like object or a sorted integer vector used to construct a PartitioningByEnd object. \code{mapOrder} numeric vector of the mapped order. Returns a PartitioningMap object. } } Note that these constructors don't recycle their \code{names} argument (to remain consistent with what \code{`names<-`} does on standard vectors). } \section{Coercions to Grouping objects}{ These types can be coerced to different derivatives of Grouping objects: \describe{ \item{factor}{ Analogous to calling \code{split} with the factor. Returns a ManyToOneGrouping if there are no NAs, otherwise a ManyToManyGrouping. If a factor is explicitly converted to a ManytoOneGrouping, then any NAs are placed in the last group. } \item{vector}{ A vector is effectively treated as a factor, but more efficiently. The order of the groups is not defined. } \item{FactorList}{ Same as the factor coercion, except using the interaction of every factor in the list. The interaction has an NA wherever any of the elements has one. Every element must have the same length. } \item{DataFrame}{ Effectively converted via a FactorList by coercing each column to a factor. } \item{grouping}{ Equivalent Grouping representation of the base R \code{\link{grouping}} object. } \item{Hits}{ Returns roughly the same object as \code{as(x, "List")}, except it is a ManyToManyGrouping, i.e., it knows the number of right nodes. } } } \author{Hervé Pagès, Michael Lawrence} \seealso{ \link{IntegerList-class}, \link{Ranges-class}, \link{IRanges-class}, \link{successiveIRanges}, \link[base]{cumsum}, \link[base]{diff} } \examples{ showClass("Grouping") # shows (some of) the known subclasses ## --------------------------------------------------------------------- ## A. H2LGrouping OBJECTS ## --------------------------------------------------------------------- high2low <- c(NA, NA, 2, 2, NA, NA, NA, 6, NA, 1, 2, NA, 6, NA, NA, 2) h2l <- H2LGrouping(high2low) h2l ## The core Grouping API: length(h2l) nobj(h2l) # same as 'length(h2l)' for H2LGrouping objects h2l[[1]] h2l[[2]] h2l[[3]] h2l[[4]] h2l[[5]] grouplengths(h2l) # same as 'unname(sapply(h2l, length))' grouplengths(h2l, 5:2) members(h2l, 5:2) # all the members are put together and sorted togroup(h2l) togroup(h2l, 5:2) togrouplength(h2l) # same as 'grouplengths(h2l, togroup(h2l))' togrouplength(h2l, 5:2) ## The List API: as.list(h2l) sapply(h2l, length) ## --------------------------------------------------------------------- ## B. Dups OBJECTS ## --------------------------------------------------------------------- dups1 <- as(h2l, "Dups") dups1 duplicated(dups1) # same as 'duplicated(togroup(dups1))' ### The purpose of a Dups object is to describe the groups of duplicated ### elements in a vector-like object: x <- c(2, 77, 4, 4, 7, 2, 8, 8, 4, 99) x_high2low <- high2low(x) x_high2low # same length as 'x' dups2 <- Dups(x_high2low) dups2 togroup(dups2) duplicated(dups2) togrouplength(dups2) # frequency for each element table(x) ## --------------------------------------------------------------------- ## C. Partitioning OBJECTS ## --------------------------------------------------------------------- pbe1 <- PartitioningByEnd(c(4, 7, 7, 8, 15), names=LETTERS[1:5]) pbe1 # the 3rd partition is empty ## The core Grouping API: length(pbe1) nobj(pbe1) pbe1[[1]] pbe1[[2]] pbe1[[3]] grouplengths(pbe1) # same as 'unname(sapply(pbe1, length))' # and 'width(pbe1)' togroup(pbe1) togrouplength(pbe1) # same as 'grouplengths(pbe1, togroup(pbe1))' names(pbe1) ## The Ranges core API: start(pbe1) end(pbe1) width(pbe1) ## The List API: as.list(pbe1) sapply(pbe1, length) ## Replacing the names: names(pbe1)[3] <- "empty partition" pbe1 ## Coercion to an IRanges object: as(pbe1, "IRanges") ## Other examples: PartitioningByEnd(c(0, 0, 19), names=LETTERS[1:3]) PartitioningByEnd() # no partition PartitioningByEnd(integer(9)) # all partitions are empty x <- c(1L, 5L, 5L, 6L, 8L) pbe2 <- PartitioningByEnd(x, NG=10L) stopifnot(identical(togroup(pbe2), x)) pbw2 <- PartitioningByWidth(x, NG=10L) stopifnot(identical(togroup(pbw2), x)) ## --------------------------------------------------------------------- ## D. RELATIONSHIP BETWEEN Partitioning OBJECTS AND successiveIRanges() ## --------------------------------------------------------------------- mywidths <- c(4, 3, 0, 1, 7) ## The 3 following calls produce the same ranges: ir <- successiveIRanges(mywidths) # IRanges instance. pbe <- PartitioningByEnd(cumsum(mywidths)) # PartitioningByEnd instance. pbw <- PartitioningByWidth(mywidths) # PartitioningByWidth instance. stopifnot(identical(as(ir, "PartitioningByEnd"), pbe)) stopifnot(identical(as(ir, "PartitioningByWidth"), pbw)) } \keyword{methods} \keyword{classes} IRanges/man/Hits-class-leftovers.Rd0000644000175400017540000000635613175713360020233 0ustar00biocbuildbiocbuild\name{Hits-class-leftovers} \docType{class} \alias{Hits-examples} % coercion \alias{coerce,Hits,DataFrame-method} \alias{as.data.frame,Hits-method} \alias{coerce,SortedByQueryHits,PartitioningByEnd-method} \alias{coerce,SortedByQueryHits,Partitioning-method} \alias{coerce,SortedByQueryHits,Ranges-method} \alias{coerce,SortedByQueryHits,IRanges-method} \alias{coerce,SortedByQueryHits,CompressedIntegerList-method} \alias{coerce,SortedByQueryHits,IntegerList-method} \alias{coerce,SortedByQueryHits,List-method} \alias{as.list,SortedByQueryHits-method} \alias{coerce,Hits,CompressedIntegerList-method} \alias{coerce,Hits,IntegerList-method} \alias{coerce,Hits,List-method} \alias{as.list,Hits-method} \alias{coerce,Hits,Grouping} \title{Examples of basic manipulation of Hits objects} \description{ IMPORTANT NOTE - 4/29/2014: This man page is being refactored. Most of the things that used to be documented here have been moved to the man page for \link[S4Vectors]{Hits} objects located in the \pkg{S4Vectors} package. } \details{ The \code{as.data.frame} method coerces a \code{Hits} object to a two column \code{data.frame} with one row for each hit, where the value in the first column is the index of an element in the query and the value in the second column is the index of an element in the subject. } \section{Coercion}{ In the code snippets below, \code{x} is a \code{Hits} object. \describe{ \item{}{ \code{as(x, "DataFrame")}: Creates a \code{DataFrame} by combining the result of \code{as.matrix(x)} with \code{mcols(from)}. } \item{}{\code{as.data.frame(x)}: Attempts to coerce the result of \code{as(from, "DataFrame")} to a \code{data.frame}. } \item{}{\code{as.list(x)}: Coerces \code{x} to a list of integers, grouping the the right node hits for each left node. } \item{}{\code{as(x, "List")}: Analogous to \code{as.list(x)}. } \item{}{\code{as(x, "Grouping")}: Returns roughly the same object as \code{as(x, "List")}, except it is a ManyToManyGrouping, i.e., it knows the number of right nodes. } } } \seealso{ The \link[S4Vectors]{Hits} class defined and documented in the \pkg{S4Vectors} package. } \examples{ query <- IRanges(c(1, 4, 9), c(5, 7, 10)) subject <- IRanges(c(2, 2, 10), c(2, 3, 12)) hits <- findOverlaps(query, subject) as.matrix(hits) as.data.frame(hits) as.table(hits) # hits per query as.table(t(hits)) # hits per subject ## Turn a Hits object into an IntegerList object with one list element ## per element in the original query. as(hits, "IntegerList") as(hits, "List") # same as as(hits, "IntegerList") ## Turn a Hits object into a PartitioningByEnd object that describes ## the grouping of hits by query. as(hits, "PartitioningByEnd") as(hits, "Partitioning") # same as as(hits, "PartitioningByEnd") ## --------------------------------------------------------------------- ## remapHits() ## --------------------------------------------------------------------- hits2 <- remapHits(hits, Rnodes.remapping=factor(c("e", "e", "d"), letters[1:5])) hits2 hits3 <- remapHits(hits, Rnodes.remapping=c(5, 5, 4), new.nRnode=5) hits3 stopifnot(identical(hits2, hits3)) } \keyword{methods} \keyword{classes} IRanges/man/IPos-class.Rd0000644000175400017540000001644513175713360016167 0ustar00biocbuildbiocbuild\name{IPos-class} \docType{class} \alias{class:IPos} \alias{IPos-class} \alias{IPos} \alias{length,IPos-method} \alias{names,IPos-method} \alias{names<-,IPos-method} \alias{pos} \alias{pos,IPos-method} \alias{start,IPos-method} \alias{end,IPos-method} \alias{width,IPos-method} \alias{coerce,Ranges,IPos-method} \alias{coerce,ANY,IPos-method} \alias{as.data.frame,IPos-method} \alias{extractROWS,IPos-method} \alias{show,IPos-method} \alias{c,IPos-method} \title{Memory-efficient representation of integer positions} \description{ The IPos class is a container for storing a set of \emph{integer positions} where most of the positions are typically (but not necessarily) adjacent. Because integer positions can be seen as integer ranges of width 1, the IPos class extends the \link{Ranges} virtual class. Note that even though an \link{IRanges} object can be used for storing integer positions, using an IPos object will be much more memory-efficient, especially when the object contains long runs of adjacent positions in \emph{ascending order}. } \usage{ IPos(pos_runs) # constructor function } \arguments{ \item{pos_runs}{ An \link{IRanges} object (or any other \link{Ranges} derivative) where each range is interpreted as a run of adjacent ascending positions. If \code{pos_runs} is not a \link{Ranges} derivative, \code{IPos()} first tries to coerce it to one with \code{as(pos_runs, "Ranges", strict=FALSE)}. } } \value{ An IPos object. } \section{Accessors}{ \subsection{Getters}{ IPos objects support the same set of getters as other \link{Ranges} derivatives (i.e. \code{start()}, \code{end()}, \code{mcols()}, etc...), plus the \code{pos()} getter which is equivalent to \code{start()} or \code{end()}. See \code{?\link{Ranges}} for the list of getters supported by \link{Ranges} derivatives. IMPORTANT NOTE: An IPos object cannot hold names i.e. \code{names()} always returns \code{NULL} on it. } \subsection{Setters}{ IPos objects support the \code{mcols()} and \code{metadata()} setters only. } } \section{Coercion}{ From \link{Ranges} to IPos: A \link{Ranges} derivative \code{x} in which all the ranges have a width of 1 can be coerced to an IPos object with \code{as(x, "IPos")}. The names on \code{x} are not propagated (a warning is issued if \code{x} has names on it). From IPos to \link{IRanges}: An IPos object \code{x} can be coerced to an \link{IRanges} object with \code{as(x, "IRanges")}. However be aware that the resulting object can use thousands times (or more) memory than \code{x}! See "MEMORY USAGE" in the Examples section below. From IPos to ordinary R objects: Like with any other \link{Ranges} derivative, \code{as.character()}, \code{as.factor()}, and \code{as.data.frame()} work on an IPos object \code{x}. Note however that \code{as.data.frame(x)} returns a data frame with a \code{pos} column (containing \code{pos(x)}) instead of the \code{start}, \code{end}, and \code{width} columns that one gets with other \link{Ranges} derivatives. } \section{Subsetting}{ An IPos object can be subsetted exactly like an \link{IRanges} object. } \section{Combining}{ IPos objects can be combined (a.k.a. appended) with \code{c()} or \code{append()}. } \section{Splitting and Relisting}{ Like with an \link{IRanges} object, \code{split()} and \code{relist()} work on an IPos object. } \note{ Like for any \link[S4Vectors]{Vector} derivative, the length of an IPos object cannot exceed \code{.Machine$integer.max} (i.e. 2^31 on most platforms). \code{IPos()} will return an error if \code{pos_runs} contains too many integer positions. } \author{ Hervé Pagès; based on ideas borrowed from Georg Stricker \email{georg.stricker@in.tum.de} and Julien Gagneur \email{gagneur@in.tum.de} } \seealso{ \itemize{ \item The \link[GenomicRanges]{GPos} class in the \pkg{GenomicRanges} package for a memory-efficient representation of \emph{genomic positions} (i.e. genomic ranges of width 1). \item \link{Ranges} and \link{IRanges} objects. \item \link{Ranges-comparison} for comparing and ordering integer ranges and/or positions. \item \link{findOverlaps-methods} for finding overlapping integer ranges and/or positions. \item \link{nearest-methods} for finding the nearest integer range and/or position. } } \examples{ ## --------------------------------------------------------------------- ## BASIC EXAMPLES ## --------------------------------------------------------------------- ## Example 1: ipos1 <- IPos(c("44-53", "5-10", "2-5")) ipos1 length(ipos1) pos(ipos1) # same as 'start(ipos1)' and 'end(ipos1)' as.character(ipos1) as.data.frame(ipos1) as(ipos1, "IRanges") as.data.frame(as(ipos1, "IRanges")) ipos1[9:17] ## Example 2: pos_runs <- IRanges(c(1, 6, 12, 17), c(5, 10, 16, 20)) ipos2 <- IPos(pos_runs) ipos2 ## Example 3: ipos3A <- ipos3B <- IPos(c("1-15000", "15400-88700")) npos <- length(ipos3A) mcols(ipos3A)$sample <- Rle("sA") sA_counts <- sample(10, npos, replace=TRUE) mcols(ipos3A)$counts <- sA_counts mcols(ipos3B)$sample <- Rle("sB") sB_counts <- sample(10, npos, replace=TRUE) mcols(ipos3B)$counts <- sB_counts ipos3 <- c(ipos3A, ipos3B) ipos3 ## --------------------------------------------------------------------- ## MEMORY USAGE ## --------------------------------------------------------------------- ## Coercion to IRanges works... ipos4 <- IPos(c("1-125000", "135000-575000")) ir4 <- as(ipos4, "IRanges") ir4 ## ... but is generally not a good idea: object.size(ipos4) object.size(ir4) # 1739 times bigger than the IPos object! ## Shuffling the order of the positions impacts memory usage: ipos4s <- sample(ipos4) object.size(ipos4s) ## AN IMPORTANT NOTE: In the worst situations, IPos still performs as ## good as an IRanges object. object.size(as(ipos4s, "IRanges")) # same size as 'ipos4s' ## Best case scenario is when the object is strictly sorted (i.e. ## positions are in strict ascending order). ## This can be checked with: is.unsorted(ipos4, strict=TRUE) # 'ipos4' is strictly sorted ## --------------------------------------------------------------------- ## USING MEMORY-EFFICIENT METADATA COLUMNS ## --------------------------------------------------------------------- ## In order to keep memory usage as low as possible, it is recommended ## to use a memory-efficient representation of the metadata columns that ## we want to set on the object. Rle's are particularly well suited for ## this, especially if the metadata columns contain long runs of ## identical values. This is the case for example if we want to use an ## IPos object to represent the coverage of sequencing reads along a ## chromosome. ## Example 5: library(pasillaBamSubset) library(Rsamtools) # for the BamFile() constructor function bamfile1 <- BamFile(untreated1_chr4()) bamfile2 <- BamFile(untreated3_chr4()) ipos5 <- IPos(IRanges(1, seqlengths(bamfile1)[["chr4"]])) library(GenomicAlignments) # for "coverage" method for BamFile objects cov1 <- coverage(bamfile1)$chr4 cov2 <- coverage(bamfile2)$chr4 mcols(ipos5) <- DataFrame(cov1, cov2) ipos5 object.size(ipos5) # lightweight ## Keep only the positions where coverage is at least 10 in one of the ## 2 samples: ipos5[mcols(ipos5)$cov1 >= 10 | mcols(ipos5)$cov2 >= 10] } \keyword{methods} \keyword{classes} IRanges/man/IRanges-class.Rd0000644000175400017540000001366113175713360016642 0ustar00biocbuildbiocbuild\name{IRanges-class} \docType{class} % IRanges objects: \alias{class:IRanges} \alias{IRanges-class} % Accessors \alias{start,IRanges-method} \alias{width,IRanges-method} \alias{names,IRanges-method} \alias{start<-,IRanges-method} \alias{width<-,IRanges-method} \alias{end<-,IRanges-method} \alias{names<-,IRanges-method} \alias{ranges,Ranges-method} \alias{isNormal,IRanges-method} \alias{update,IRanges-method} \alias{c,IRanges-method} % NormalIRanges objects: \alias{class:NormalIRanges} \alias{NormalIRanges-class} \alias{NormalIRanges} \alias{isEmpty,NormalIRanges-method} \alias{isNormal,NormalIRanges-method} \alias{max,NormalIRanges-method} \alias{min,NormalIRanges-method} % Coercion: \alias{coerce,Ranges,IRanges-method} \alias{coerce,logical,IRanges-method} \alias{coerce,logical,NormalIRanges-method} \alias{coerce,integer,IRanges-method} \alias{coerce,integer,NormalIRanges-method} \alias{coerce,numeric,IRanges-method} \alias{coerce,numeric,NormalIRanges-method} \alias{coerce,character,IRanges-method} \alias{coerce,factor,IRanges-method} \alias{coerce,ANY,Ranges-method} \title{IRanges and NormalIRanges objects} \description{ The IRanges class is a simple implementation of the \link{Ranges} container where 2 integer vectors of the same length are used to store the start and width values. See the \link{Ranges} virtual class for a formal definition of \link{Ranges} objects and for their methods (all of them should work for IRanges objects). Some subclasses of the IRanges class are: NormalIRanges, \link{Views}, etc... A NormalIRanges object is just an IRanges object that is guaranteed to be "normal". See the Normality section in the man page for \link{Ranges} objects for the definition and properties of "normal" \link{Ranges} objects. } \section{Constructor}{ See \code{?`\link{IRanges-constructor}`}. } \section{Coercion}{ \describe{ \item{}{ \code{ranges(x, use.names=FALSE, use.mcols=FALSE)}: Squeeze the ranges out of \link{Ranges} object \code{x} and return them in an IRanges object \emph{parallel} to \code{x} (i.e. same length as \code{x}). } \item{}{ \code{as(from, "IRanges")}: Creates an IRanges instance from a Ranges object, logical vector, or integer vector. When \code{from} is a logical vector, the resulting IRanges object contains the indices for the runs of \code{TRUE} values. When \code{from} is an integer vector, the elements are either singletons or "increase by 1" sequences. } \item{}{ \code{as(from, "NormalIRanges")}: Creates a NormalIRanges instance from a logical or integer vector. When \code{from} is an integer vector, the elements must be strictly increasing. } } } \section{Combining}{ \describe{ \item{}{ \code{c(x, ..., ignore.mcols=FALSE)} Combining \code{IRanges} objects is straightforward when they do not have any metadata columns. If only one of the \code{IRanges} object has metadata columns, then the corresponding metadata columns are attached to the other \code{IRanges} object and set to \code{NA}. When multiple \code{IRanges} object have their own \code{metadata columns}, the user must ensure that each such \code{linkS4class{DataFrame}} have identical layouts to each other (same columns defined), in order for the combination to be successful, otherwise an error will be thrown. The user can call \code{c(x, ..., ignore.mcols=TRUE)} in order to combine \code{IRanges} objects with differing sets of metadata columns, which will result in the combined object having NO metadata columns. } } } \section{Methods for NormalIRanges objects}{ \describe{ \item{}{ \code{max(x)}: The maximum value in the finite set of integers represented by \code{x}. } \item{}{ \code{min(x)}: The minimum value in the finite set of integers represented by \code{x}. } } } \author{Hervé Pagès} \seealso{ \link{Ranges-class}, \link{IRanges-constructor}, \link{IRanges-utils}, \link{intra-range-methods} for intra range transformations, \link{inter-range-methods} for inter range transformations, \link{setops-methods} } \examples{ showClass("IRanges") # shows (some of) the known subclasses ## --------------------------------------------------------------------- ## A. MANIPULATING IRanges OBJECTS ## --------------------------------------------------------------------- ## All the methods defined for Ranges objects work on IRanges objects. ## See ?Ranges for some examples. ## Also see ?`IRanges-utils` and ?`setops-methods` for additional ## operations on IRanges objects. ## Combining IRanges objects ir1 <- IRanges(c(1, 10, 20), width=5) mcols(ir1) <- DataFrame(score=runif(3)) ir2 <- IRanges(c(101, 110, 120), width=10) mcols(ir2) <- DataFrame(score=runif(3)) ir3 <- IRanges(c(1001, 1010, 1020), width=20) mcols(ir3) <- DataFrame(value=runif(3)) some.iranges <- c(ir1, ir2) ## all.iranges <- c(ir1, ir2, ir3) ## This will raise an error all.iranges <- c(ir1, ir2, ir3, ignore.mcols=TRUE) stopifnot(is.null(mcols(all.iranges))) ## --------------------------------------------------------------------- ## B. A NOTE ABOUT PERFORMANCE ## --------------------------------------------------------------------- ## Using an IRanges object for storing a big set of ranges is more ## efficient than using a standard R data frame: N <- 2000000L # nb of ranges W <- 180L # width of each range start <- 1L end <- 50000000L set.seed(777) range_starts <- sort(sample(end-W+1L, N)) range_widths <- rep.int(W, N) ## Instantiation is faster system.time(x <- IRanges(start=range_starts, width=range_widths)) system.time(y <- data.frame(start=range_starts, width=range_widths)) ## Subsetting is faster system.time(x16 <- x[c(TRUE, rep.int(FALSE, 15))]) system.time(y16 <- y[c(TRUE, rep.int(FALSE, 15)), ]) ## Internal representation is more compact object.size(x16) object.size(y16) } \keyword{methods} \keyword{classes} IRanges/man/IRanges-constructor.Rd0000644000175400017540000001555213175713360020123 0ustar00biocbuildbiocbuild\name{IRanges-constructor} \alias{IRanges-constructor} \alias{IRanges} \alias{solveUserSEW0} \alias{solveUserSEW} \title{The IRanges constructor and supporting functions} \description{ The \code{IRanges} function is a constructor that can be used to create IRanges instances. \code{solveUserSEW0} and \code{solveUserSEW} are utility functions that solve a set of user-supplied start/end/width values. } \usage{ ## IRanges constructor: IRanges(start=NULL, end=NULL, width=NULL, names=NULL) ## Supporting functions (not for the end user): solveUserSEW0(start=NULL, end=NULL, width=NULL) solveUserSEW(refwidths, start=NA, end=NA, width=NA, rep.refwidths=FALSE, translate.negative.coord=TRUE, allow.nonnarrowing=FALSE) } \arguments{ \item{start, end, width}{ For \code{IRanges} and \code{solveUserSEW0}: \code{NULL}, or vector of integers (eventually with NAs). For \code{solveUserSEW}: vector of integers (eventually with NAs). } \item{names}{ A character vector or \code{NULL}. } \item{refwidths}{ Vector of non-NA non-negative integers containing the reference widths. } \item{rep.refwidths}{ \code{TRUE} or \code{FALSE}. Use of \code{rep.refwidths=TRUE} is supported only when \code{refwidths} is of length 1. } \item{translate.negative.coord, allow.nonnarrowing}{ \code{TRUE} or \code{FALSE}. } } \section{IRanges constructor}{ Return the IRanges object containing the ranges specified by \code{start}, \code{end} and \code{width}. Input falls into one of two categories: \describe{ \item{Category 1}{ \code{start}, \code{end} and \code{width} are numeric vectors (or NULLs). If necessary they are recycled to the length of the longest (NULL arguments are filled with NAs). After this recycling, each row in the 3-column matrix obtained by binding those 3 vectors together is "solved" i.e. NAs are treated as unknown in the equation \code{end = start + width - 1}. Finally, the solved matrix is returned as an \link{IRanges} instance. } \item{Category 2}{ The \code{start} argument is a logical vector or logical Rle object and \code{IRanges(start)} produces the same result as \code{as(start, "IRanges")}. Note that, in that case, the returned IRanges instance is guaranteed to be normal. } } Note that the \code{names} argument is never recycled (to remain consistent with what \code{`names<-`} does on standard vectors). } \section{Supporting functions}{ \describe{ \item{}{ \code{solveUserSEW0(start=NULL, end=NULL, width=NULL)}: } \item{}{ \code{solveUserSEW(refwidths, start=NA, end=NA, width=NA, rep.refwidths=FALSE, translate.negative.coord=TRUE, allow.nonnarrowing=FALSE)}: Use of \code{rep.refwidths=TRUE} is supported only when \code{refwidths} is of length 1. If \code{rep.refwidths=FALSE} (the default) then \code{start}, \code{end} and \code{width} are recycled to the length of \code{refwidths} (it's an error if one of them is longer than \code{refwidths}, or is of zero length while \code{refwidths} is not). If \code{rep.refwidths=TRUE} then \code{refwidths} is first replicated L times where L is the length of the longest of \code{start}, \code{end} and \code{width}. After this replication, \code{start}, \code{end} and \code{width} are recycled to the new length of \code{refwidths} (L) (it's an error if one of them is of zero length while L is != 0). From now, \code{refwidths}, \code{start}, \code{end} and \code{width} are integer vectors of equal lengths. Each row in the 3-column matrix obtained by binding those 3 vectors together must contain at least one NA (otherwise an error is returned). Then each row is "solved" i.e. the 2 following transformations are performed (\code{i} is the indice of the row): (1) if \code{translate.negative.coord} is TRUE then a negative value of \code{start[i]} or \code{end[i]} is considered to be a \code{-refwidths[i]}-based coordinate so \code{refwidths[i]+1} is added to it to make it 1-based; (2) the NAs in the row are treated as unknowns which values are deduced from the known values in the row and from \code{refwidths[i]}. The exact rules for (2) are the following. Rule (2a): if the row contains at least 2 NAs, then \code{width[i]} must be one of them (otherwise an error is returned), and if \code{start[i]} is one of them it is replaced by 1, and if \code{end[i]} is one of them it is replaced by \code{refwidths[i]}, and finally \code{width[i]} is replaced by \code{end[i] - start[i] + 1}. Rule (2b): if the row contains only 1 NA, then it is replaced by the solution of the \code{width[i] == end[i] - start[i] + 1} equation. Finally, the set of solved rows is returned as an \link{IRanges} object of the same length as \code{refwidths} (after replication if \code{rep.refwidths=TRUE}). Note that an error is raised if either (1) the set of user-supplied start/end/width values is invalid or (2) \code{allow.nonnarrowing} is FALSE and the ranges represented by the solved start/end/width values are not narrowing the ranges represented by the user-supplied start/end/width values. } } } \author{Hervé Pagès} \seealso{ \link{IRanges-class}, \code{\link{narrow}} } \examples{ ## --------------------------------------------------------------------- ## A. USING THE IRanges() CONSTRUCTOR ## --------------------------------------------------------------------- IRanges(start=11, end=rep.int(20, 5)) IRanges(start=11, width=rep.int(20, 5)) IRanges(-2, 20) # only one range IRanges(start=c(2, 0, NA), end=c(NA, NA, 14), width=11:0) IRanges() # IRanges instance of length zero IRanges(names=character()) ## With logical input: x <- IRanges(c(FALSE, TRUE, TRUE, FALSE, TRUE)) # logical vector input isNormal(x) # TRUE x <- IRanges(Rle(1:30) \%\% 5 <= 2) # logical Rle input isNormal(x) # TRUE ## --------------------------------------------------------------------- ## B. USING solveUserSEW() ## --------------------------------------------------------------------- refwidths <- c(5:3, 6:7) refwidths solveUserSEW(refwidths) solveUserSEW(refwidths, start=4) solveUserSEW(refwidths, end=3, width=2) solveUserSEW(refwidths, start=-3) solveUserSEW(refwidths, start=-3, width=2) solveUserSEW(refwidths, end=-4) ## The start/end/width arguments are recycled: solveUserSEW(refwidths, start=c(3, -4, NA), end=c(-2, NA)) ## Using 'rep.refwidths=TRUE': solveUserSEW(10, start=-(1:6), rep.refwidths=TRUE) solveUserSEW(10, end=-(1:6), width=3, rep.refwidths=TRUE) } \keyword{utilities} IRanges/man/IRanges-internals.Rd0000644000175400017540000000042113175713360017522 0ustar00biocbuildbiocbuild\name{IRanges internals} \alias{coerce,ANY,vector-method} \title{IRanges internals} \description{ Objects, classes and methods defined in the \pkg{IRanges} package that are not intended to be used directly. } \keyword{internal} \keyword{classes} \keyword{methods} IRanges/man/IRanges-utils.Rd0000644000175400017540000000756113175713360016677 0ustar00biocbuildbiocbuild\name{IRanges-utils} \alias{IRanges-utils} \alias{successiveIRanges} \alias{breakInChunks} \alias{whichAsIRanges} % Coercion: \alias{asNormalIRanges} \alias{coerce,IRanges,NormalIRanges-method} \title{IRanges utility functions} \description{ Utility functions for creating or modifying \link{IRanges} objects. } \usage{ ## Create an IRanges instance: successiveIRanges(width, gapwidth=0, from=1) breakInChunks(totalsize, chunksize, nchunk) ## Turn a logical vector into a set of ranges: whichAsIRanges(x) ## Coercion: asNormalIRanges(x, force=TRUE) } \arguments{ \item{width}{ A vector of non-negative integers (with no NAs) specifying the widths of the ranges to create. } \item{gapwidth}{ A single integer or an integer vector with one less element than the \code{width} vector specifying the widths of the gaps separating one range from the next one. } \item{from}{ A single integer specifying the starting position of the first range. } \item{totalsize}{ A single non-negative integer. The total size of the object to break. } \item{chunksize}{ A single positive integer. The size of the chunks (last chunk might be smaller). } \item{nchunk}{ A single positive integer. The number of chunks. } \item{x}{ A logical vector for \code{whichAsIRanges}. An \link{IRanges} object for \code{asNormalIRanges}. } \item{force}{ \code{TRUE} or \code{FALSE}. Should \code{x} be turned into a \link{NormalIRanges} object even if \code{isNormal(x)} is \code{FALSE}? } } \details{ \code{successiveIRanges} returns an IRanges instance containing the ranges that have the widths specified in the \code{width} vector and are separated by the gaps specified in \code{gapwidth}. The first range starts at position \code{from}. When \code{gapwidth=0} and \code{from=1} (the defaults), the returned IRanges can be seen as a partitioning of the 1:sum(width) interval. See \code{?Partitioning} for more details on this. \code{whichAsIRanges} returns an \link{IRanges} instance containing all of the ranges where \code{x} is \code{TRUE}. If \code{force=TRUE} (the default), then \code{asNormalIRanges} will turn \code{x} into a \link{NormalIRanges} instance by reordering and reducing the set of ranges if necessary (i.e. only if \code{isNormal(x)} is \code{FALSE}, otherwise the set of ranges will be untouched). If \code{force=FALSE}, then \code{asNormalIRanges} will turn \code{x} into a \link{NormalIRanges} instance only if \code{isNormal(x)} is \code{TRUE}, otherwise it will raise an error. Note that when \code{force=FALSE}, the returned object is guaranteed to contain exactly the same set of ranges than \code{x}. \code{as(x, "NormalIRanges")} is equivalent to \code{asNormalIRanges(x, force=TRUE)}. } \author{Hervé Pagès} \seealso{ \link{Ranges-class}, \link{IRanges-class}, \link{intra-range-methods} for intra range transformations, \link{inter-range-methods} for inter range transformations, \link{setops-methods}, \code{\link{solveUserSEW}}, \code{\link{successiveViews}} } \examples{ vec <- as.integer(c(19, 5, 0, 8, 5)) successiveIRanges(vec) breakInChunks(600999, 50000) # 13 chunks of size 50000 (last chunk is # smaller). whichAsIRanges(vec >= 5) x <- IRanges(start=c(-2L, 6L, 9L, -4L, 1L, 0L, -6L, 10L), width=c( 5L, 0L, 6L, 1L, 4L, 3L, 2L, 3L)) asNormalIRanges(x) # 3 non-empty ranges ordered from left to right and # separated by gaps of width >= 1. ## More on normality: example(`IRanges-class`) isNormal(x16) # FALSE if (interactive()) x16 <- asNormalIRanges(x16) # Error! whichFirstNotNormal(x16) # 57 isNormal(x16[1:56]) # TRUE xx <- asNormalIRanges(x16[1:56]) class(xx) max(xx) min(xx) } \keyword{utilities} IRanges/man/IRangesList-class.Rd0000644000175400017540000001420413175713360017470 0ustar00biocbuildbiocbuild\name{IRangesList-class} \docType{class} % IRangesList objects: \alias{class:IRangesList} \alias{class:CompressedIRangesList} \alias{class:SimpleIRangesList} \alias{IRangesList-class} \alias{CompressedIRangesList-class} \alias{SimpleIRangesList-class} \alias{IRangesList} \alias{CompressedIRangesList} \alias{SimpleIRangesList} % accessors \alias{end,CompressedIRangesList-method} \alias{width,CompressedIRangesList-method} \alias{start,CompressedIRangesList-method} % coercion \alias{coerce,list,CompressedIRangesList-method} \alias{coerce,list,SimpleIRangesList-method} \alias{coerce,list,IRangesList-method} \alias{coerce,List,CompressedIRangesList-method} \alias{coerce,Ranges,CompressedIRangesList-method} \alias{coerce,List,SimpleIRangesList-method} \alias{coerce,RangesList,SimpleIRangesList-method} \alias{coerce,SimpleRangesList,SimpleIRangesList-method} \alias{coerce,List,IRangesList-method} \alias{coerce,CompressedRleList,CompressedIRangesList-method} % NormalIRangesList objects: \alias{class:NormalIRangesList} \alias{class:CompressedNormalIRangesList} \alias{class:SimpleNormalIRangesList} \alias{NormalIRangesList-class} \alias{CompressedNormalIRangesList-class} \alias{SimpleNormalIRangesList-class} \alias{NormalIRangesList} \alias{CompressedNormalIRangesList} \alias{SimpleNormalIRangesList} \alias{isNormal,SimpleIRangesList-method} \alias{isNormal,CompressedIRangesList-method} % general \alias{max,CompressedNormalIRangesList-method} \alias{max,SimpleNormalIRangesList-method} \alias{min,CompressedNormalIRangesList-method} \alias{min,SimpleNormalIRangesList-method} \alias{summary,CompressedIRangesList-method} % more coercions \alias{as.list,CompressedNormalIRangesList-method} \alias{unlist,SimpleNormalIRangesList-method} \alias{coerce,RangesList,SimpleNormalIRangesList-method} \alias{coerce,SimpleIRangesList,SimpleNormalIRangesList-method} \alias{coerce,NormalIRangesList,CompressedNormalIRangesList-method} \alias{coerce,CompressedIRangesList,CompressedNormalIRangesList-method} \alias{coerce,RangesList,CompressedNormalIRangesList-method} \alias{coerce,RangesList,NormalIRangesList-method} \alias{coerce,LogicalList,NormalIRangesList-method} \alias{coerce,LogicalList,CompressedNormalIRangesList-method} \alias{coerce,LogicalList,SimpleNormalIRangesList-method} \alias{coerce,RleList,NormalIRangesList-method} \alias{coerce,RleList,CompressedNormalIRangesList-method} \alias{coerce,RleList,SimpleNormalIRangesList-method} \title{List of IRanges and NormalIRanges} \description{\code{\linkS4class{IRangesList}} and \code{\linkS4class{NormalIRangesList}} objects for storing \code{\linkS4class{IRanges}} and \code{\linkS4class{NormalIRanges}} objects respectively.} \section{Constructor}{ \describe{ \item{}{\code{IRangesList(..., compress=TRUE)}: The \code{...} argument accepts either a comma-separated list of \code{IRanges} objects, or a single \code{LogicalList} / logical \code{RleList} object, or 2 elements named \code{start} and \code{end} each of them being either a list of integer vectors or an IntegerList object. When \code{IRanges} objects are supplied, each of them becomes an element in the new \code{IRangesList}, in the same order, which is analogous to the \code{\link{list}} constructor. If \code{compress}, the internal storage of the data is compressed. } } } \section{Coercion}{ In the code snippets below, \code{from} is a \emph{list-like} object. \describe{ \item{}{\code{as(from, "SimpleIRangesList")}: Coerces \code{from}, to a \code{\linkS4class{SimpleIRangesList}}, requiring that all \code{Ranges} elements are coerced to internal \code{IRanges} elements. This is a convenient way to ensure that all \code{Ranges} have been imported into R (and that there is no unwanted overhead when accessing them). } \item{}{\code{as(from, "CompressedIRangesList")}: Coerces \code{from}, to a \code{\linkS4class{CompressedIRangesList}}, requiring that all \code{Ranges} elements are coerced to internal \code{IRanges} elements. This is a convenient way to ensure that all \code{Ranges} have been imported into R (and that there is no unwanted overhead when accessing them). } \item{}{\code{as(from, "SimpleNormalIRangesList")}: Coerces \code{from}, to a \code{\linkS4class{SimpleNormalIRangesList}}, requiring that all \code{Ranges} elements are coerced to internal \code{NormalIRanges} elements. } \item{}{\code{as(from, "CompressedNormalIRangesList")}: Coerces \code{from}, to a \code{\linkS4class{CompressedNormalIRangesList}}, requiring that all \code{Ranges} elements are coerced to internal \code{NormalIRanges} elements. } } In the code snippet below, \code{x} is an \code{IRangesList} object. \describe{ \item{}{\code{unlist(x)}: Unlists \code{x}, an \code{IRangesList}, by concatenating all of the ranges into a single \code{IRanges} instance. If the length of \code{x} is zero, an empty \code{IRanges} is returned. } } } \section{Methods for NormalIRangesList objects}{ \describe{ \item{}{ \code{max(x)}: An integer vector containing the maximum values of each of the elements of \code{x}. } \item{}{ \code{min(x)}: An integer vector containing the minimum values of each of the elements of \code{x}. } } } \author{ Michael Lawrence } \seealso{ \code{\linkS4class{RangesList}}, the parent of this class, for more functionality. \link{intra-range-methods} and \link{inter-range-methods} for intra and inter range transformations of IRangesList objects. \link{setops-methods} for set operations on IRangesList objects. } \examples{ range1 <- IRanges(start=c(1,2,3), end=c(5,2,8)) range2 <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5)) named <- IRangesList(one = range1, two = range2) length(named) # 2 names(named) # "one" and "two" named[[1]] # range1 unnamed <- IRangesList(range1, range2) names(unnamed) # NULL x <- IRangesList(start=list(c(1,2,3), c(15,45,20,1)), end=list(c(5,2,8), c(15,100,80,5))) as.list(x) } \keyword{classes} \keyword{methods} IRanges/man/List-class-leftovers.Rd0000644000175400017540000000254113175713360020227 0ustar00biocbuildbiocbuild\name{List-class-leftovers} \docType{class} \alias{stack,List-method} \title{List objects (old man page)} \description{ IMPORTANT NOTE - 9/4/2014: This man page is being refactored. Most of the things that used to be documented here have been moved to the man page for \link[S4Vectors]{List} objects located in the \pkg{S4Vectors} package. } \details{ The only thing left here is the documentation of the \code{stack} method for \link{List} objects. In the code snippets below, \code{x} is a List object. \describe{ \item{}{ \code{stack(x, index.var = "name", value.var = "value")}: As with \code{\link[utils:stack]{stack}} on a \code{list}, constructs a \code{DataFrame} with two columns: one for the unlisted values, the other indicating the name of the element from which each value was obtained. \code{index.var} specifies the column name for the index (source name) column and \code{value.var} specifies the column name for the values. } } } \seealso{ \itemize{ \item The \link[S4Vectors]{List} class defined and documented in the \pkg{S4Vectors} package. } } \examples{ starts <- IntegerList(c(1, 5), c(2, 8)) ends <- IntegerList(c(3, 8), c(5, 9)) rgl <- IRangesList(start=starts, end=ends) rangeDataFrame <- stack(rgl, "space", "ranges") } \keyword{methods} \keyword{classes} IRanges/man/MaskCollection-class.Rd0000644000175400017540000001616013175713360020216 0ustar00biocbuildbiocbuild\name{MaskCollection-class} \docType{class} % Classes: \alias{class:MaskCollection} \alias{MaskCollection-class} \alias{MaskCollection} % Basic accessor methods: \alias{nir_list} \alias{nir_list,MaskCollection-method} \alias{length,MaskCollection-method} \alias{width,MaskCollection-method} \alias{active} \alias{active,MaskCollection-method} \alias{active<-} \alias{active<-,MaskCollection-method} \alias{names,MaskCollection-method} \alias{names<-,MaskCollection-method} \alias{desc} \alias{desc,MaskCollection-method} \alias{desc<-} \alias{desc<-,MaskCollection-method} % Constructor: \alias{Mask} % Other methods: \alias{max,MaskCollection-method} \alias{min,MaskCollection-method} \alias{maskedwidth} \alias{maskedwidth,MaskCollection-method} \alias{maskedratio} \alias{maskedratio,MaskCollection-method} % Subsetting and appending: \alias{append,MaskCollection,MaskCollection-method} % Endomorphisms: \alias{collapse} \alias{collapse,MaskCollection-method} % Coercion: \alias{coerce,MaskCollection,NormalIRanges-method} % "show" method: \alias{MaskCollection.show_frame} \alias{show,MaskCollection-method} \title{MaskCollection objects} \description{ The MaskCollection class is a container for storing a collection of masks that can be used to mask regions in a sequence. } \details{ In the context of the Biostrings package, a mask is a set of regions in a sequence that need to be excluded from some computation. For example, when calling \code{\link[Biostrings:letterFrequency]{alphabetFrequency}} or \code{\link[Biostrings]{matchPattern}} on a chromosome sequence, you might want to exclude some regions like the centromere or the repeat regions. This can be achieved by putting one or several masks on the sequence before calling \code{\link[Biostrings:letterFrequency]{alphabetFrequency}} on it. A MaskCollection object is a vector-like object that represents such set of masks. Like standard R vectors, it has a "length" which is the number of masks contained in it. But unlike standard R vectors, it also has a "width" which determines the length of the sequences it can be "put on". For example, a MaskCollection object of width 20000 can only be put on an \link[Biostrings:XString-class]{XString} object of 20000 letters. Each mask in a MaskCollection object \code{x} is just a finite set of integers that are >= 1 and <= \code{width(x)}. When "put on" a sequence, these integers indicate the positions of the letters to mask. Internally, each mask is represented by a \link{NormalIRanges} object. } \section{Basic accessor methods}{ In the code snippets below, \code{x} is a MaskCollection object. \describe{ \item{}{ \code{length(x)}: The number of masks in \code{x}. } \item{}{ \code{width(x)}: The common with of all the masks in \code{x}. This determines the length of the sequences that \code{x} can be "put on". } \item{}{ \code{active(x)}: A logical vector of the same length as \code{x} where each element indicates whether the corresponding mask is active or not. } \item{}{ \code{names(x)}: \code{NULL} or a character vector of the same length as \code{x}. } \item{}{ \code{desc(x)}: \code{NULL} or a character vector of the same length as \code{x}. } \item{}{ \code{nir_list(x)}: A list of the same length as \code{x}, where each element is a \link{NormalIRanges} object representing a mask in \code{x}. } } } \section{Constructor}{ \describe{ \item{}{ \code{Mask(mask.width, start=NULL, end=NULL, width=NULL)}: Return a single mask (i.e. a MaskCollection object of length 1) of width \code{mask.width} (a single integer >= 1) and masking the ranges of positions specified by \code{start}, \code{end} and \code{width}. See the \code{\link{IRanges}} constructor (\code{?\link{IRanges}}) for how \code{start}, \code{end} and \code{width} can be specified. Note that the returned mask is active and unnamed. } } } \section{Other methods}{ In the code snippets below, \code{x} is a MaskCollection object. \describe{ \item{}{ \code{isEmpty(x)}: Return a logical vector of the same length as \code{x}, indicating, for each mask in \code{x}, whether it's empty or not. } \item{}{ \code{max(x)}: The greatest (or last, or rightmost) masked position for each mask. This is a numeric vector of the same length as \code{x}. } \item{}{ \code{min(x)}: The smallest (or first, or leftmost) masked position for each mask. This is a numeric vector of the same length as \code{x}. } \item{}{ \code{maskedwidth(x)}: The number of masked position for each mask. This is an integer vector of the same length as \code{x} where all values are >= 0 and <= \code{width(x)}. } \item{}{ \code{maskedratio(x)}: \code{maskedwidth(x) / width(x)} } } } \section{Subsetting and appending}{ In the code snippets below, \code{x} and \code{values} are MaskCollection objects. \describe{ \item{}{ \code{x[i]}: Return a new MaskCollection object made of the selected masks. Subscript \code{i} can be a numeric, logical or character vector. } \item{}{ \code{x[[i, exact=TRUE]]}: Extract the mask selected by \code{i} as a \link{NormalIRanges} object. Subscript \code{i} can be a single integer or a character string. } \item{}{ \code{append(x, values, after=length(x))}: Add masks in \code{values} to \code{x}. } } } \section{Other methods}{ In the code snippets below, \code{x} is a MaskCollection object. \describe{ \item{}{ \code{collapse(x)}: Return a MaskCollection object of length 1 obtained by collapsing all the active masks in \code{x}. } } } \author{Hervé Pagès} \seealso{ \link{NormalIRanges-class}, \link{read.Mask}, \link[Biostrings]{MaskedXString-class}, \code{\link{reverse}}, \code{\link[Biostrings]{alphabetFrequency}}, \code{\link[Biostrings]{matchPattern}} } \examples{ ## Making a MaskCollection object: mask1 <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2)) mask2 <- Mask(mask.width=29, start=c(3, 10, 27), width=c(5, 8, 1)) mask3 <- Mask(mask.width=29, start=c(7, 12), width=c(2, 4)) mymasks <- append(append(mask1, mask2), mask3) mymasks length(mymasks) width(mymasks) collapse(mymasks) ## Names and descriptions: names(mymasks) <- c("A", "B", "C") # names should be short and unique... mymasks mymasks[c("C", "A")] # ...to make subsetting by names easier desc(mymasks) <- c("you can be", "more verbose", "here") mymasks[-2] ## Activate/deactivate masks: active(mymasks)["B"] <- FALSE mymasks collapse(mymasks) active(mymasks) <- FALSE # deactivate all masks mymasks active(mymasks)[-1] <- TRUE # reactivate all masks except mask 1 active(mymasks) <- !active(mymasks) # toggle all masks ## Other advanced operations: mymasks[[2]] length(mymasks[[2]]) mymasks[[2]][-3] append(mymasks[-2], gaps(mymasks[2])) } \keyword{methods} \keyword{classes} IRanges/man/NCList-class.Rd0000644000175400017540000001630413175713360016443 0ustar00biocbuildbiocbuild\name{NCList-class} \docType{class} % NCList objects: \alias{class:NCList} \alias{NCList-class} \alias{NCList} \alias{length,NCList-method} \alias{names,NCList-method} \alias{start,NCList-method} \alias{end,NCList-method} \alias{width,NCList-method} \alias{coerce,Ranges,NCList-method} % NCLists objects: \alias{class:NCLists} \alias{NCLists-class} \alias{NCLists} \alias{ranges,NCLists-method} \alias{length,NCLists-method} \alias{names,NCLists-method} \alias{start,NCLists-method} \alias{end,NCLists-method} \alias{width,NCLists-method} \alias{elementNROWS,NCLists-method} \alias{coerce,NCLists,CompressedIRangesList-method} \alias{coerce,NCLists,IRangesList-method} \alias{coerce,RangesList,NCLists-method} \title{Nested Containment List objects} \description{ The NCList class is a container for storing the Nested Containment List representation of a \link{Ranges} object. Preprocessing a \link{Ranges} object as a Nested Containment List allows efficient overlap-based operations like \code{\link{findOverlaps}}. The NCLists class is a container for storing a collection of NCList objects. An NCLists object is typically the result of preprocessing each list element of a \link{RangesList} object as a Nested Containment List. Like with NCList, the NCLists object can then be used for efficient overlap-based operations. To preprocess a \link{Ranges} or \link{RangesList} object, simply call the \code{NCList} or \code{NCLists} constructor function on it. } \usage{ NCList(x, circle.length=NA_integer_) NCLists(x, circle.length=NA_integer_) } \arguments{ \item{x}{ The \link{Ranges} or \link{RangesList} object to preprocess. } \item{circle.length}{ Use only if the space (or spaces if \code{x} is a \link{RangesList} object) on top of which the ranges in \code{x} are defined needs (need) to be considered circular. If that's the case, then use \code{circle.length} to specify the length(s) of the circular space(s). For \code{NCList}, \code{circle.length} must be a single positive integer (or NA if the space is linear). For \code{NCLists}, it must be an integer vector parallel to \code{x} (i.e. same length) and with positive or NA values (NAs indicate linear spaces). } } \details{ The \pkg{GenomicRanges} package also defines the \code{\link[GenomicRanges]{GNCList}} constructor and class for preprocessing and representing a vector of genomic ranges as a data structure based on Nested Containment Lists. Some important differences between the new findOverlaps/countOverlaps implementation based on Nested Containment Lists (BioC >= 3.1) and the old implementation based on Interval Trees (BioC < 3.1): \itemize{ \item With the new implementation, the hits returned by \code{\link{findOverlaps}} are not \emph{fully} ordered (i.e. ordered by queryHits and subject Hits) anymore, but only \emph{partially} ordered (i.e. ordered by queryHits only). Other than that, and except for the 2 particular situations mentioned below, the 2 implementations produce the same output. However, the new implementation is faster and more memory efficient. \item With the new implementation, either the query or the subject can be preprocessed with \code{NCList} for a \link{Ranges} object (replacement for \code{IntervalTree}), \code{NCLists} for a \link{RangesList} object (replacement for \code{IntervalForest}), and \code{\link[GenomicRanges]{GNCList}} for a \link[GenomicRanges]{GenomicRanges} object (replacement for \code{GIntervalTree}). However, for a one-time use, it is NOT advised to explicitely preprocess the input. This is because \code{\link{findOverlaps}} or \code{\link{countOverlaps}} will take care of it and do a better job at it (by preprocessing only what's needed when it's needed, and releasing memory as they go). \item With the new implementation, \code{\link{countOverlaps}} on \link{Ranges} or \link[GenomicRanges]{GenomicRanges} objects doesn't call \code{\link{findOverlaps}} in order to collect all the hits in a growing \link{Hits} object and count them only at the end. Instead, the counting happens at the C level and the hits are not kept. This reduces memory usage considerably when there is a lot of hits. \item When \code{minoverlap=0}, zero-width ranges are now interpreted as insertion points and considered to overlap with ranges that contain them. With the old alogrithm, zero-width ranges were always ignored. This is the 1st situation where the new and old implementations produce different outputs. \item When using \code{select="arbitrary"}, the new implementation will generally not select the same hits as the old implementation. This is the 2nd situation where the new and old implementations produce different outputs. \item The new implementation supports preprocessing of a \link[GenomicRanges]{GenomicRanges} object with ranges defined on circular sequences (e.g. on the mitochnodrial chromosome). See \link[GenomicRanges]{GNCList} in the \pkg{GenomicRanges} package for some examples. \item Objects preprocessed with \code{NCList}, \code{NCLists}, and \code{\link[GenomicRanges]{GNCList}} are serializable (with \code{save}) for later use. Not a typical thing to do though, because preprocessing is very cheap (i.e. very fast and memory efficient). } } \value{ An NCList object for the \code{NCList} constructor and an NCLists object for the \code{NCLists} constructor. } \author{Hervé Pagès} \references{ Alexander V. Alekseyenko and Christopher J. Lee -- Nested Containment List (NCList): a new algorithm for accelerating interval query of genome alignment and interval databases. Bioinformatics (2007) 23 (11): 1386-1393. doi: 10.1093/bioinformatics/btl647 } \seealso{ \itemize{ \item The \code{\link[GenomicRanges]{GNCList}} constructor and class defined in the \pkg{GenomicRanges} package. \item \code{\link{findOverlaps}} for finding/counting interval overlaps between two \emph{range-based} objects. \item \link{Ranges} and \link{RangesList} objects. } } \examples{ ## The example below is for illustration purpose only and does NOT ## reflect typical usage. This is because, for a one-time use, it is ## NOT advised to explicitely preprocess the input for findOverlaps() ## or countOverlaps(). These functions will take care of it and do a ## better job at it (by preprocessing only what's needed when it's ## needed, and release memory as they go). query <- IRanges(c(1, 4, 9), c(5, 7, 10)) subject <- IRanges(c(2, 2, 10), c(2, 3, 12)) ## Either the query or the subject of findOverlaps() can be preprocessed: ppsubject <- NCList(subject) hits1 <- findOverlaps(query, ppsubject) hits1 ppquery <- NCList(query) hits2 <- findOverlaps(ppquery, subject) hits2 ## Note that 'hits1' and 'hits2' contain the same hits but not in the ## same order. stopifnot(identical(sort(hits1), sort(hits2))) } \keyword{classes} \keyword{methods} IRanges/man/RangedData-class.Rd0000644000175400017540000004601513175713360017303 0ustar00biocbuildbiocbuild\name{RangedData-class} \docType{class} \alias{class:RangedData} \alias{RangedData-class} % Accessors: \alias{nrow,RangedData-method} \alias{ncol,RangedData-method} \alias{rownames,RangedData-method} \alias{colnames,RangedData-method} \alias{rownames<-,RangedData-method} \alias{colnames<-,RangedData-method} \alias{elementNROWS,RangedData-method} \alias{end,RangedData-method} \alias{end<-,RangedData-method} \alias{length,RangedData-method} \alias{names,RangedData-method} \alias{names<-,RangedData-method} \alias{ranges,RangedData-method} \alias{ranges<-,RangedData-method} \alias{start,RangedData-method} \alias{start<-,RangedData-method} \alias{values,RangedData-method} \alias{values<-,RangedData-method} \alias{width,RangedData-method} \alias{width<-,RangedData-method} \alias{space,RangedData-method} \alias{universe,RangedData-method} \alias{universe<-,RangedData-method} \alias{score,RangedData-method} \alias{score<-,RangedData-method} \alias{columnMetadata,RangedData-method} \alias{columnMetadata<-,RangedData-method} % Constructor: \alias{RangedData} % Coercion: \alias{as.data.frame,RangedData-method} \alias{coerce,RangedData,DataFrame-method} \alias{coerce,Rle,RangedData-method} \alias{coerce,RleList,RangedData-method} \alias{coerce,RleViewsList,RangedData-method} \alias{coerce,Ranges,RangedData-method} \alias{coerce,RangesList,RangedData-method} \alias{coerce,RangedData,CompressedIRangesList-method} \alias{coerce,RangedData,IRangesList-method} \alias{coerce,RangedData,RangesList-method} \alias{as.env,RangedData-method} \alias{coerce,data.frame,RangedData-method} \alias{coerce,DataTable,RangedData-method} % Combining and splitting \alias{c,RangedData-method} \alias{rbind,RangedData-method} % Subsetting: \alias{[,RangedData-method} \alias{[[,RangedData-method} \alias{[[<-,RangedData-method} \alias{$<-,RangedData-method} % Utilities: \alias{within,RangedData-method} % Applying: \alias{endoapply,RangedData-method} \alias{lapply,RangedData-method} % Show: \alias{show,RangedData-method} \title{Data on ranges} \description{ IMPORTANT NOTE: \code{RangedData} objects will be deprecated in BioC 3.7! The use of \code{RangedData} objects has been discouraged in favor of \link[GenomicRanges]{GRanges} or \link[GenomicRanges]{GRangesList} objects since BioC 2.12, that is, since 2014. The \link[GenomicRanges]{GRanges} and \link[GenomicRanges]{GRangesList} classes are defined in the \pkg{GenomicRanges} package. See \code{?GRanges} and \code{?GenomicRanges} (after loading the \pkg{GenomicRanges} package) for more information about these classes. PLEASE MIGRATE YOUR CODE TO USE \link[GenomicRanges]{GRanges} OR \link[GenomicRanges]{GRangesList} OBJECTS INSTEAD OF \code{RangedData} OBJECTS AS SOON AS POSSIBLE. Don't hesitate to ask on the bioc-devel mailing list (\url{https://bioconductor.org/help/support/#bioc-devel}) if you need help with this. \code{RangedData} supports storing data, i.e. a set of variables, on a set of ranges spanning multiple spaces (e.g. chromosomes). Although the data is split across spaces, it can still be treated as one cohesive dataset when desired and extends \code{\linkS4class{DataTable}}. } \details{ A \code{RangedData} object consists of two primary components: a \code{\linkS4class{RangesList}} holding the ranges over multiple spaces and a parallel \code{\linkS4class{SplitDataFrameList}}, holding the split data. There is also an \code{universe} slot for denoting the source (e.g. the genome) of the ranges and/or data. There are two different modes of interacting with a \code{RangedData}. The first mode treats the object as a contiguous "data frame" annotated with range information. The accessors \code{start}, \code{end}, and \code{width} get the corresponding fields in the ranges as atomic integer vectors, undoing the division over the spaces. The \code{[[} and matrix-style \code{[,} extraction and subsetting functions unroll the data in the same way. \code{[[<-} does the inverse. The number of rows is defined as the total number of ranges and the number of columns is the number of variables in the data. It is often convenient and natural to treat the data this way, at least when the data is small and there is no need to distinguish the ranges by their space. The other mode is to treat the \code{RangedData} as a list, with an element (a virtual \code{\linkS4class{Ranges}}/\code{\linkS4class{DataFrame}} pair) for each space. The length of the object is defined as the number of spaces and the value returned by the \code{names} accessor gives the names of the spaces. The list-style \code{[} subset function behaves analogously. } \section{Accessor methods}{ In the code snippets below, \code{x} is a \code{RangedData} object. The following accessors treat the data as a contiguous dataset, ignoring the division into spaces: \describe{ \item{}{Array accessors: \describe{ \item{}{ \code{nrow(x)}: The number of ranges in \code{x}. } \item{}{ \code{ncol(x)}: The number of data variables in \code{x}. } \item{}{ \code{dim(x)}: An integer vector of length two, essentially \code{c(nrow(x), ncol(x))}. } \item{}{ \code{rownames(x)}, \code{rownames(x) <- value}: Gets or sets the names of the ranges in \code{x}. } \item{}{ \code{colnames(x)}, \code{colnames(x) <- value}: Gets the names of the variables in \code{x}. } \item{}{ \code{dimnames(x)}: A list with two elements, essentially \code{list(rownames(x), colnames(x))}. } \item{}{ \code{dimnames(x) <- value}: Sets the row and column names, where value is a list as described above. } \item{}{\code{columnMetadata(x)}: Get the \code{DataFrame} of metadata along the value columns, i.e., where each column in \code{x} is represented by a row in the metadata. Note that calling \code{mcols(x)} returns the metadata on each space in \code{x}. } \item{}{\code{columnMetadata(x) <- value}: Set the \code{DataFrame} of metadata for the columns. } \item{}{\code{within(data, expr, ...)}: Evaluates \code{expr} within \code{data}, a \code{RangedData}. Any values assigned in \code{expr} will be stored as value columns in \code{data}, unless they match one of the reserved names: \code{ranges}, \code{start}, \code{end}, \code{width} and \code{space}. Behavior is undefined if any of the range symbols are modified inconsistently. Modifications to \code{space} are ignored. } } } \item{}{Range accessors. The type of the return value depends on the type of \code{\linkS4class{Ranges}}. For \code{\linkS4class{IRanges}}, an integer vector. Regardless, the number of elements is always equal to \code{nrow(x)}. \describe{ \item{}{ \code{start(x), start(x) <- value}: Get or set the starts of the ranges. When setting the starts, \code{value} can be an integer vector of \code{length(sum(elementNROWS(ranges(x))))} or an IntegerList object of length \code{length(ranges(x))} and names \code{names(ranges(x))}. } \item{}{ \code{end(x), end(x) <- value}: Get or set the ends of the ranges. When setting the ends, \code{value} can be an integer vector of \code{length(sum(elementNROWS(ranges(x))))} or an IntegerList object of length \code{length(ranges(x))} and names \code{names(ranges(x))}. } \item{}{ \code{width(x), width(x) <- value}: Get or set the widths of the ranges. When setting the widths, \code{value} can be an integer vector of \code{length(sum(elementNROWS(ranges(x))))} or an IntegerList object of length \code{length(ranges(x))} and names \code{names(ranges(x))}. } } } } These accessors make the object seem like a list along the spaces: \describe{ \item{}{ \code{length(x)}: The number of spaces (e.g. chromosomes) in \code{x}. } \item{}{ \code{names(x)}, \code{names(x) <- value}: Get or set the names of the spaces (e.g. \code{"chr1"}). \code{NULL} or a character vector of the same length as \code{x}. } } Other accessors: \describe{ \item{}{ \code{universe(x)}, \code{universe(x) <- value}: Get or set the scalar string identifying the scope of the data in some way (e.g. genome, experimental platform, etc). The universe may be \code{NULL}. } \item{}{ \code{ranges(x), ranges(x) <- value}: Gets or sets the ranges in \code{x} as a \code{\linkS4class{RangesList}}. } \item{}{ \code{space(x)}: Gets the spaces from \code{ranges(x)}. } \item{}{ \code{values(x), values(x) <- value}: Gets or sets the data values in \code{x} as a \code{\linkS4class{SplitDataFrameList}}. } \item{}{ \code{score(x), score(x) <- value}: Gets or sets the column representing a "score" in \code{x}, as a vector. This is the column named \code{score}, or, if this does not exist, the first column, if it is numeric. The get method return \code{NULL} if no suitable score column is found. The set method takes a numeric vector as its value. } } } \section{Constructor}{ \describe{ \item{}{ \code{RangedData(ranges = IRanges(), ..., space = NULL, universe = NULL)}: Creates a \code{RangedData} with the ranges in \code{ranges} and variables given by the arguments in \code{...}. See the constructor \code{\linkS4class{DataFrame}} for how the \code{...} arguments are interpreted. If \code{ranges} is a \code{\linkS4class{Ranges}} object, the \code{space} argument is used to split of the data into spaces. If \code{space} is \code{NULL}, all of the ranges and values are placed into the same space, resulting in a single-space (length one) \code{RangedData} object. Otherwise, the ranges and values are split into spaces according to \code{space}, which is treated as a factor, like the \code{f} argument in \code{\link{split}}. If \code{ranges} is a \code{\linkS4class{RangesList}} object, then the supplied \code{space} argument is ignored and its value is derived from \code{ranges}. If \code{ranges} is not a \code{\linkS4class{Ranges}} or \code{\linkS4class{RangesList}} object, this function calls \code{as(ranges, "RangedData")} and returns the result if successful. The universe may be specified as a scalar string by the \code{universe} argument. } } } \section{Coercion}{ \describe{ \item{}{ \code{as.data.frame(x, row.names=NULL, optional=FALSE, ...)}: Copy the start, end, width of the ranges and all of the variables as columns in a \code{data.frame}. This is a bridge to existing functionality in R, but of course care must be taken if the data is large. Note that \code{optional} and \code{...} are ignored. } \item{}{ \code{as(from, "DataFrame")}: Like \code{as.data.frame} above, except the result is an \code{\linkS4class{DataFrame}} and it probably involves less copying, especially if there is only a single space. } \item{}{ \code{as(from, "RangedData")}: Coerce \code{from} to a \code{RangedData}, according to the type of \code{from}: \describe{ \item{\code{\linkS4class{Rle}}, \code{\linkS4class{RleList}}}{ Converts each run to a range and stores the run values in a column named "score". } \item{\code{\linkS4class{RleViewsList}}}{ Creates a \code{RangedData} using the ranges given by the runs of \code{subject(from)} in each of the windows, with a value column \code{score} taken as the corresponding subject values. } \item{\code{\linkS4class{Ranges}}}{ Creates a \code{RangedData} with only the ranges in \code{from}; no data columns. } \item{\code{\linkS4class{RangesList}}}{ Creates a \code{RangedData} with the ranges in \code{from}. Also propagates the \emph{inner} metadata columns of the \code{RangesList} (accessed with \code{mcols(unlist(from))}) to the data columns (aka values) of the \code{RangedData}. This makes it a \emph{lossless} coercion and the exact reverse of the coercion from \code{RangedData} to \code{RangesList}. } \item{\code{data.frame} or \code{DataTable}}{Constructs a \code{RangedData}, using the columns \dQuote{start}, \dQuote{end}, and, optionally, \dQuote{space} columns in \code{from}. The other columns become data columns in the result. Any \dQuote{width} column is ignored. } } } \item{}{ \code{as(from, "RangesList")}: Creates a \code{CompressedIRangesList} (a subclass of \code{RangesList}) made of the ranges in \code{from}. Also propagates the data columns (aka values) of the \code{RangedData} to the inner metadata columns of the \code{RangesList}. This makes it a \emph{lossless} coercion and the exact reverse of the coercion from \code{RangesList} to \code{RangedData}. } \item{}{\code{as.env(x, enclos = parent.frame())}: Creates an \code{environment} with a symbol for each variable in the frame, as well as a \code{ranges} symbol for the ranges. This is efficient, as no copying is performed. } } } \section{Subsetting and Replacement}{ In the code snippets below, \code{x} is a \code{RangedData} object. \describe{ \item{}{ \code{x[i]}: Subsets \code{x} by indexing into its spaces, so the result is of the same class, with a different set of spaces. \code{i} can be numerical, logical, \code{NULL} or missing. } \item{}{ \code{x[i,j]}: Subsets \code{x} by indexing into its rows and columns. The result is of the same class, with a different set of rows and columns. The row index \code{i} can either treat \code{x} as a flat table by being a character, integer, or logical vector or treat \code{x} as a partitioned table by being a \code{\linkS4class{RangesList}}, \code{\linkS4class{LogicalList}}, or \code{\linkS4class{IntegerList}} of the same length as \code{x}. } \item{}{ \code{x[[i]]}: Extracts a variable from \code{x}, where \code{i} can be a character, numeric, or logical scalar that indexes into the columns. The variable is unlisted over the spaces. For convenience, values of \code{"space"} and \code{"ranges"} are equivalent to \code{space(x)} and \code{unlist(ranges(x))} respectively. } \item{}{ \code{x$name}: similar to above, where \code{name} is taken literally as a column name in the data. } \item{}{ \code{x[[i]] <- value}: Sets value as column \code{i} in \code{x}, where \code{i} can be a character, numeric, or logical scalar that indexes into the columns. The length of \code{value} should equal \code{nrow(x)}. \code{x[[i]]} should be identical to \code{value} after this operation. For convenience, \code{i="ranges"} is equivalent to \code{ranges(x) <- value}. } \item{}{ \code{x$name <- value}: similar to above, where \code{name} is taken literally as a column name in the data. } } } \section{Splitting and Combining}{ In the code snippets below, \code{x} is a \code{RangedData} object. \describe{ \item{}{ \code{rbind(...)}: Matches the spaces from the \code{RangedData} objects in \code{...} by name and combines them row-wise. } \item{}{ \code{c(x, ..., recursive = FALSE)}: Combines \code{x} with arguments specified in \code{...}, which must all be \code{RangedData} objects. This combination acts as if \code{x} is a list of spaces, meaning that the result will contain the spaces of the first concatenated with the spaces of the second, and so on. This function is useful when creating \code{RangedData} objects on a space-by-space basis and then needing to combine them. } } } \section{Applying}{ An \code{lapply} method is provided to apply a function over the spaces of a \code{RangedData}: \describe{ \item{}{\code{lapply(X, FUN, ...)}: Applies \code{FUN} to each space in \code{X} with extra parameters in \code{...}. } } } \author{ Michael Lawrence } \seealso{ \linkS4class{DataTable}, the parent of this class, with more utilities. } \examples{ ranges <- IRanges(c(1,2,3),c(4,5,6)) filter <- c(1L, 0L, 1L) score <- c(10L, 2L, NA) ## constructing RangedData instances ## no variables rd <- RangedData() rd <- RangedData(ranges) ranges(rd) ## one variable rd <- RangedData(ranges, score) rd[["score"]] ## multiple variables rd <- RangedData(ranges, filter, vals = score) rd[["vals"]] # same as rd[["score"]] above rd$vals rd[["filter"]] rd <- RangedData(ranges, score + score) rd[["score...score"]] # names made valid ## split some data over chromosomes range2 <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5)) both <- c(ranges, range2) score <- c(score, c(0L, 3L, NA, 22L)) filter <- c(filter, c(0L, 1L, NA, 0L)) chrom <- paste("chr", rep(c(1,2), c(length(ranges), length(range2))), sep="") rd <- RangedData(both, score, filter, space = chrom) rd[["score"]] # identical to score rd[1][["score"]] # identical to score[1:3] ## subsetting ## list style: [i] rd[numeric()] # these three are all empty rd[logical()] rd[NULL] rd[] # missing, full instance returned rd[FALSE] # logical, supports recycling rd[c(FALSE, FALSE)] # same as above rd[TRUE] # like rd[] rd[c(TRUE, FALSE)] rd[1] # numeric index rd[c(1,2)] rd[-2] ## matrix style: [i,j] rd[,NULL] # no columns rd[NULL,] # no rows rd[,1] rd[,1:2] rd[,"filter"] rd[1,] # now by the rows rd[c(1,3),] rd[1:2, 1] # row and column rd[c(1:2,1,3),1] ## repeating rows ## dimnames colnames(rd)[2] <- "foo" colnames(rd) rownames(rd) <- head(letters, nrow(rd)) rownames(rd) ## space names names(rd) names(rd)[1] <- "chr1" ## variable replacement count <- c(1L, 0L, 2L) rd <- RangedData(ranges, count, space = c(1, 2, 1)) ## adding a variable score <- c(10L, 2L, NA) rd[["score"]] <- score rd[["score"]] # same as 'score' ## replacing a variable count2 <- c(1L, 1L, 0L) rd[["count"]] <- count2 ## numeric index also supported rd[[2]] <- score rd[[2]] # gets 'score' ## removing a variable rd[[2]] <- NULL ncol(rd) # is only 1 rd$score2 <- score ## combining rd <- RangedData(ranges, score, space = c(1, 2, 1)) c(rd[1], rd[2]) # equal to 'rd' rd2 <- RangedData(ranges, score) ## applying lapply(rd, `[[`, 1) # get first column in each space } \keyword{methods} \keyword{classes} IRanges/man/RangedSelection-class.Rd0000644000175400017540000000462713175713360020362 0ustar00biocbuildbiocbuild\name{RangedSelection-class} \docType{class} \alias{RangedSelection-class} % accessors \alias{ranges,RangedSelection-method} \alias{colnames,RangedSelection-method} \alias{ranges<-,RangedSelection-method} \alias{colnames<-,RangedSelection-method} % coercion \alias{coerce,RangesList,RangedSelection-method} % constructor \alias{RangedSelection} \title{Selection of ranges and columns} \description{A \code{RangedSelection} represents a query against a table of interval data in terms of ranges and column names. The ranges select any table row with an overlapping interval. Note that the intervals are always returned, even if no columns are selected.} \details{ Traditionally, tabular data structures have supported the \code{\link{subset}} function, which allows one to select a subset of the rows and columns from the table. In that case, the rows and columns are specified by two separate arguments. As querying interval data sources, especially those external to R, such as binary indexed files and databases, is increasingly common, there is a need to encapsulate the row and column specifications into a single data structure, mostly for the sake of interface cleanliness. The \code{RangedSelection} class fills that role. } \section{Constructor}{ \describe{ \item{}{\code{RangedSelection(ranges = RangesList(), colnames = character())}: Constructors a \code{RangedSelection} with the given \code{ranges} and \code{colnames}. } } } \section{Coercion}{ \describe{ \item{}{\code{as(from, "RangedSelection")}: Coerces \code{from} to a \code{RangedSelection} object. Typically, \code{from} is a \code{\linkS4class{RangesList}}, the ranges of which become the ranges in the new \code{RangedSelection}. } } } \section{Accessors}{ In the code snippets below, \code{x} is always a \code{RangedSelection}. \describe{ \item{}{\code{ranges(x), ranges(x) <- value}: Gets or sets the ranges, a \code{\linkS4class{RangesList}}, that select rows with overlapping intervals. } \item{}{\code{colnames(x), colnames(x) <- value}: Gets the names, a \code{character} vector, indicating the columns. } } } \author{ Michael Lawrence } \examples{ rl <- RangesList(chr1 = IRanges(c(1, 5), c(3, 6))) RangedSelection(rl) as(rl, "RangedSelection") # same as above RangedSelection(rl, "score") } \keyword{methods} \keyword{classes} IRanges/man/Ranges-class.Rd0000644000175400017540000003313113175713360016523 0ustar00biocbuildbiocbuild\name{Ranges-class} \docType{class} % Classes: \alias{class:Ranges} \alias{Ranges-class} \alias{Ranges} % Generics and methods: \alias{length,Ranges-method} \alias{elementNROWS,Ranges-method} \alias{width} \alias{start,Ranges-method} \alias{width,Ranges-method} \alias{end,Ranges-method} \alias{mid} \alias{mid,Ranges-method} \alias{start<-} \alias{width<-} \alias{end<-} \alias{as.character,Ranges-method} \alias{as.factor,Ranges-method} \alias{as.matrix,Ranges-method} \alias{as.data.frame,Ranges-method} \alias{as.integer,Ranges-method} \alias{unlist,Ranges-method} \alias{show,Ranges-method} \alias{showAsCell,Ranges-method} \alias{isEmpty,Ranges-method} \alias{isNormal} \alias{isNormal,Ranges-method} \alias{whichFirstNotNormal} \alias{whichFirstNotNormal,Ranges-method} \alias{update,Ranges-method} \alias{tile} \alias{tile,Ranges-method} \alias{slidingWindows} \alias{slidingWindows,Ranges-method} \title{Ranges objects} \description{ The Ranges virtual class is a general container for storing a set of integer ranges. } \details{ A Ranges object is a vector-like object where each element describes a "range of integer values". A "range of integer values" is a finite set of consecutive integer values. Each range can be fully described with exactly 2 integer values which can be arbitrarily picked up among the 3 following values: its "start" i.e. its smallest (or first, or leftmost) value; its "end" i.e. its greatest (or last, or rightmost) value; and its "width" i.e. the number of integer values in the range. For example the set of integer values that are greater than or equal to -20 and less than or equal to 400 is the range that starts at -20 and has a width of 421. In other words, a range is a closed, one-dimensional interval with integer end points and on the domain of integers. The starting point (or "start") of a range can be any integer (see \code{start} below) but its "width" must be a non-negative integer (see \code{width} below). The ending point (or "end") of a range is equal to its "start" plus its "width" minus one (see \code{end} below). An "empty" range is a range that contains no value i.e. a range that has a null width. Depending on the context, it can be interpreted either as just the empty \emph{set} of integers or, more precisely, as the position \emph{between} its "end" and its "start" (note that for an empty range, the "end" equals the "start" minus one). The length of a Ranges object is the number of ranges in it, not the number of integer values in its ranges. A Ranges object is considered empty iff all its ranges are empty. Ranges objects have a vector-like semantic i.e. they only support single subscript subsetting (unlike, for example, standard R data frames which can be subsetted by row and by column). The Ranges class itself is a virtual class. The following classes derive directly from the Ranges class: \link{IRanges}, \link{IPos}, \link{NCList}, \link{PartitioningByEnd}. } \section{Methods}{ In the code snippets below, \code{x}, \code{y} and \code{object} are Ranges objects. Not all the functions described below will necessarily work with all kinds of Ranges derivatives but they should work at least for \link{IRanges} objects. Note that many more operations on Ranges objects are described in other man pages of the \pkg{IRanges} package. See for example the man page for \emph{intra range transformations} (e.g. \code{shift()}, see \code{?`\link{intra-range-methods}`}), or the man page for inter range transformations (e.g. \code{reduce()}, see \code{?`\link{inter-range-methods}`}), or the man page for \code{findOverlaps} methods (see \code{?`\link{findOverlaps-methods}`}), or the man page for \link{RangesList} objects where the \code{split} method for Ranges objects is documented. \describe{ \item{}{ \code{length(x)}: The number of ranges in \code{x}. } \item{}{ \code{start(x)}: The start values of the ranges. This is an integer vector of the same length as \code{x}. } \item{}{ \code{width(x)}: The number of integer values in each range. This is a vector of non-negative integers of the same length as \code{x}. } \item{}{ \code{end(x)}: \code{start(x) + width(x) - 1L} } \item{}{ \code{mid(x)}: returns the midpoint of the range, \code{start(x) + floor((width(x) - 1)/2)}. } \item{}{ \code{names(x)}: \code{NULL} or a character vector of the same length as \code{x}. } \item{}{ \code{update(object, ...)}: Convenience method for combining multiple modifications of \code{object} in one single call. For example \code{object <- update(object, start=start(object)-2L, end=end(object)+2L)} is equivalent to \code{start(object) <- start(object)-2L; end(object) <- end(object)+2L}. } \item{}{ \code{tile(x, n, width, ...)}: Splits each range in \code{x} into subranges as specified by \code{n} (number of ranges) or \code{width}. Only one of \code{n} or \code{width} can be specified. The return value is a \code{IRangesList} the same length as \code{x}. Ranges with a width less than the \code{width} argument are returned unchanged. } \item{}{ \code{slidingWindows(x, width, step=1L)}: Generates sliding windows within each range of \code{x}, of width \code{width}, and starting every \code{step} positions. The return value is a \code{IRangesList} the same length as \code{x}. Ranges with a width less than the \code{width} argument are returned unchanged. If the sliding windows do not exactly cover \code{x}, the last window is partial. } \item{}{ \code{isEmpty(x)}: Return a logical value indicating whether \code{x} is empty or not. } \item{}{ \code{as.matrix(x, ...)}: Convert \code{x} into a 2-column integer matrix containing \code{start(x)} and \code{width(x)}. Extra arguments (\code{...}) are ignored. } \item{}{ \code{as.data.frame(x, row.names=NULL, optional=FALSE, ...)}: Convert \code{x} into a standard R data frame object. \code{row.names} must be \code{NULL} or a character vector giving the row names for the data frame, and \code{optional} and any additional argument (\code{...}) is ignored. See \code{?\link{as.data.frame}} for more information about these arguments. } \item{}{ \code{as.integer(x)}: Convert \code{x} into an integer vector, by converting each range into the integer sequence formed by \code{from:to} and concatenating them together. } \item{}{ \code{unlist(x, recursive = TRUE, use.names = TRUE)}: Similar to \code{as.integer(x)} except can add names to elements. } \item{}{ \code{x[[i]]}: Return integer vector \code{start(x[i]):end(x[i])} denoted by \code{i}. Subscript \code{i} can be a single integer or a character string. } \item{}{ \code{x[i]}: Return a new Ranges object (of the same type as \code{x}) made of the selected ranges. \code{i} can be a numeric vector, a logical vector, \code{NULL} or missing. If \code{x} is a \link{NormalIRanges} object and \code{i} a positive numeric subscript (i.e. a numeric vector of positive values), then \code{i} must be strictly increasing. } \item{}{ \code{rep(x, times, length.out, each)}: Repeats the values in \code{x} through one of the following conventions: \describe{ \item{\code{times}}{Vector giving the number of times to repeat each element if of length \code{length(x)}, or to repeat the Ranges elements if of length 1.} \item{\code{length.out}}{Non-negative integer. The desired length of the output vector.} \item{\code{each}}{Non-negative integer. Each element of \code{x} is repeated \code{each} times.} } } \item{}{ \code{c(x, ...)}: Combine \code{x} and the Ranges objects in \code{...} together. Any object in \code{...} must belong to the same class as \code{x}, or to one of its subclasses, or must be \code{NULL}. The result is an object of the same class as \code{x}. NOTE: Only works for \link{IRanges} (and derived) objects for now. } \item{}{ \code{x * y}: The arithmetic operation \code{x * y} is for centered zooming. It symmetrically scales the width of \code{x} by \code{1/y}, where \code{y} is a numeric vector that is recycled as necessary. For example, \code{x * 2} results in ranges with half their previous width but with approximately the same midpoint. The ranges have been \dQuote{zoomed in}. If \code{y} is negative, it is equivalent to \code{x * (1/abs(y))}. Thus, \code{x * -2} would double the widths in \code{x}. In other words, \code{x} has been \dQuote{zoomed out}. } \item{}{ \code{x + y}: Expands the ranges in \code{x} on either side by the corresponding value in the numeric vector \code{y}. } \item{}{ \code{show(x)}: By default the \code{show} method displays 5 head and 5 tail lines. The number of lines can be altered by setting the global options \code{showHeadLines} and \code{showTailLines}. If the object length is less than the sum of the options, the full object is displayed. These options affect display of \link{IRanges}, \link{IPos}, \link[S4Vectors]{Hits}, \link[GenomicRanges]{GRanges}, \link[GenomicRanges]{GPos}, \link[GenomicAlignments]{GAlignments}, \link[Biostrings]{XStringSet} objects, and more... } } } \section{Normality}{ A Ranges object \code{x} is implicitly representing an arbitrary finite set of integers (that are not necessarily consecutive). This set is the set obtained by taking the union of all the values in all the ranges in \code{x}. This representation is clearly not unique: many different Ranges objects can be used to represent the same set of integers. However one and only one of them is guaranteed to be \emph{normal}. By definition a Ranges object is said to be \emph{normal} when its ranges are: (a) not empty (i.e. they have a non-null width); (b) not overlapping; (c) ordered from left to right; (d) not even adjacent (i.e. there must be a non empty gap between 2 consecutive ranges). Here is a simple algorithm to determine whether \code{x} is \emph{normal}: (1) if \code{length(x) == 0}, then \code{x} is normal; (2) if \code{length(x) == 1}, then \code{x} is normal iff \code{width(x) >= 1}; (3) if \code{length(x) >= 2}, then \code{x} is normal iff: \preformatted{ start(x)[i] <= end(x)[i] < start(x)[i+1] <= end(x)[i+1]} for every 1 <= \code{i} < \code{length(x)}. The obvious advantage of using a \emph{normal} Ranges object to represent a given finite set of integers is that it is the smallest in terms of number of ranges and therefore in terms of storage space. Also the fact that we impose its ranges to be ordered from left to right makes it unique for this representation. A special container (\link{NormalIRanges}) is provided for holding a \emph{normal} \link{IRanges} object: a \link{NormalIRanges} object is just an \link{IRanges} object that is guaranteed to be \emph{normal}. Here are some methods related to the notion of \emph{normal} Ranges: \describe{ \item{}{ \code{isNormal(x)}: Return TRUE or FALSE indicating whether \code{x} is \emph{normal} or not. } \item{}{ \code{whichFirstNotNormal(x)}: Return \code{NA} if \code{x} is \emph{normal}, or the smallest valid indice \code{i} in \code{x} for which \code{x[1:i]} is not \emph{normal}. } } } \author{H. Pagès and M. Lawrence} \seealso{ \itemize{ \item \link{IRanges} objects (\link{NormalIRanges} objects are documented in the same man page). \item The \link{IPos} class, a memory-efficient \link{Ranges} derivative for representing \emph{integer positions} (i.e. integer ranges of width 1). \item \link{Ranges-comparison} for comparing and ordering ranges. \item \link{findOverlaps-methods} for finding/counting overlapping ranges. \item \link{intra-range-methods} and \link{inter-range-methods} for intra range and inter range transformations of a Ranges derivative. \item \link{coverage-methods} for computing the coverage of a set of ranges. \item \link{setops-methods} for set operations on ranges. \item \link{nearest-methods} for finding the nearest range neighbor. } } \examples{ ## --------------------------------------------------------------------- ## Basic manipulation ## --------------------------------------------------------------------- x <- IRanges(start=c(2:-1, 13:15), width=c(0:3, 2:0)) x length(x) start(x) width(x) end(x) isEmpty(x) as.matrix(x) as.data.frame(x) ## Subsetting: x[4:2] # 3 ranges x[-1] # 6 ranges x[FALSE] # 0 range x0 <- x[width(x) == 0] # 2 ranges isEmpty(x0) ## Use the replacement methods to resize the ranges: width(x) <- width(x) * 2 + 1 x end(x) <- start(x) # equivalent to width(x) <- 0 x width(x) <- c(2, 0, 4) x start(x)[3] <- end(x)[3] - 2 # resize the 3rd range x ## Name the elements: names(x) names(x) <- c("range1", "range2") x x[is.na(names(x))] # 5 ranges x[!is.na(names(x))] # 2 ranges ir <- IRanges(c(1,5), c(3,10)) ir*1 # no change ir*c(1,2) # zoom second range by 2X ir*-2 # zoom out 2X } \keyword{methods} \keyword{classes} IRanges/man/Ranges-comparison.Rd0000644000175400017540000002677213175713360017605 0ustar00biocbuildbiocbuild\name{Ranges-comparison} \alias{Ranges-comparison} \alias{pcompare} \alias{pcompare,Ranges,Ranges-method} \alias{rangeComparisonCodeToLetter} \alias{match,Ranges,Ranges-method} \alias{selfmatch,Ranges-method} \alias{is.unsorted,Ranges-method} \alias{order,Ranges-method} \title{Comparing and ordering ranges} \description{ Methods for comparing and/or ordering \link{Ranges} objects. } \usage{ ## match() & selfmatch() ## --------------------- \S4method{match}{Ranges,Ranges}(x, table, nomatch=NA_integer_, incomparables=NULL, method=c("auto", "quick", "hash")) \S4method{selfmatch}{Ranges}(x, method=c("auto", "quick", "hash")) ## order() and related methods ## ---------------------------- \S4method{is.unsorted}{Ranges}(x, na.rm=FALSE, strictly=FALSE) \S4method{order}{Ranges}(..., na.last=TRUE, decreasing=FALSE, method=c("auto", "shell", "radix")) ## Generalized parallel comparison of 2 Ranges objects ## --------------------------------------------------- \S4method{pcompare}{Ranges,Ranges}(x, y) rangeComparisonCodeToLetter(code) } \arguments{ \item{x, table, y}{ \link{Ranges} objects. } \item{nomatch}{ The value to be returned in the case when no match is found. It is coerced to an \code{integer}. } \item{incomparables}{ Not supported. } \item{method}{ For \code{match} and \code{selfmatch}: Use a Quicksort-based (\code{method="quick"}) or a hash-based (\code{method="hash"}) algorithm. The latter tends to give better performance, except maybe for some pathological input that we've not encountered so far. When \code{method="auto"} is specified, the most efficient algorithm will be used, that is, the hash-based algorithm if \code{length(x) <= 2^29}, otherwise the Quicksort-based algorithm. For \code{order}: The \code{method} argument is ignored. } \item{na.rm}{ Ignored. } \item{strictly}{ Logical indicating if the check should be for \emph{strictly} increasing values. } \item{...}{ One or more \link{Ranges} objects. The \link{Ranges} objects after the first one are used to break ties. } \item{na.last}{ Ignored. } \item{decreasing}{ \code{TRUE} or \code{FALSE}. } \item{code}{ A vector of codes as returned by \code{pcompare}. } } \details{ Two elements of a \link{Ranges} derivative (i.e. two integer ranges) are considered equal iff they share the same start and width. \code{duplicated()} and \code{unique()} on a \link{Ranges} derivative are conforming to this. Note that with this definition, 2 empty ranges are generally not equal (they need to share the same start to be considered equal). This means that, when it comes to comparing ranges, an empty range is interpreted as a position between its end and start. For example, a typical usecase is comparison of insertion points defined along a string (like a DNA sequence) and represented as empty ranges. The "natural order" for the elements of a \link{Ranges} derivative is to order them (a) first by start and (b) then by width. This way, the space of integer ranges is totally ordered. \code{pcompare()}, \code{==}, \code{!=}, \code{<=}, \code{>=}, \code{<} and \code{>} on \link{Ranges} derivatives behave accordingly to this "natural order". \code{is.unsorted()}, \code{order()}, \code{sort()}, \code{rank()} on \link{Ranges} derivatives also behave accordingly to this "natural order". Finally, note that some \emph{inter range transformations} like \code{\link{reduce}} or \code{\link{disjoin}} also use this "natural order" implicitly when operating on \link{Ranges} derivatives. \describe{ \item{}{ \code{pcompare(x, y)}: Performs element-wise (aka "parallel") comparison of 2 \link{Ranges} objects of \code{x} and \code{y}, that is, returns an integer vector where the i-th element is a code describing how \code{x[i]} is qualitatively positioned with respect to \code{y[i]}. Here is a summary of the 13 predefined codes (and their letter equivalents) and their meanings: \preformatted{ -6 a: x[i]: .oooo....... 6 m: x[i]: .......oooo. y[i]: .......oooo. y[i]: .oooo....... -5 b: x[i]: ..oooo...... 5 l: x[i]: ......oooo.. y[i]: ......oooo.. y[i]: ..oooo...... -4 c: x[i]: ...oooo..... 4 k: x[i]: .....oooo... y[i]: .....oooo... y[i]: ...oooo..... -3 d: x[i]: ...oooooo... 3 j: x[i]: .....oooo... y[i]: .....oooo... y[i]: ...oooooo... -2 e: x[i]: ..oooooooo.. 2 i: x[i]: ....oooo.... y[i]: ....oooo.... y[i]: ..oooooooo.. -1 f: x[i]: ...oooo..... 1 h: x[i]: ...oooooo... y[i]: ...oooooo... y[i]: ...oooo..... 0 g: x[i]: ...oooooo... y[i]: ...oooooo... } Note that this way of comparing ranges is a refinement over the standard ranges comparison defined by the \code{==}, \code{!=}, \code{<=}, \code{>=}, \code{<} and \code{>} operators. In particular a code that is \code{< 0}, \code{= 0}, or \code{> 0}, corresponds to \code{x[i] < y[i]}, \code{x[i] == y[i]}, or \code{x[i] > y[i]}, respectively. The \code{pcompare} method for \link{Ranges} objects is guaranteed to return predefined codes only but methods for other objects (e.g. for \link[GenomicRanges]{GenomicRanges} objects) can return non-predefined codes. Like for the predefined codes, the sign of any non-predefined code must tell whether \code{x[i]} is less than, or greater than \code{y[i]}. } \item{}{ \code{rangeComparisonCodeToLetter(x)}: Translate the codes returned by \code{pcompare}. The 13 predefined codes are translated as follow: -6 -> a; -5 -> b; -4 -> c; -3 -> d; -2 -> e; -1 -> f; 0 -> g; 1 -> h; 2 -> i; 3 -> j; 4 -> k; 5-> l; 6 -> m. Any non-predefined code is translated to X. The translated codes are returned in a factor with 14 levels: a, b, ..., l, m, X. } \item{}{ \code{match(x, table, nomatch=NA_integer_, method=c("auto", "quick", "hash"))}: Returns an integer vector of the length of \code{x}, containing the index of the first matching range in \code{table} (or \code{nomatch} if there is no matching range) for each range in \code{x}. } \item{}{ \code{selfmatch(x, method=c("auto", "quick", "hash"))}: Equivalent to, but more efficient than, \code{match(x, x, method=method)}. } \item{}{ \code{duplicated(x, fromLast=FALSE, method=c("auto", "quick", "hash"))}: Determines which elements of \code{x} are equal to elements with smaller subscripts, and returns a logical vector indicating which elements are duplicates. \code{duplicated(x)} is equivalent to, but more efficient than, \code{duplicated(as.data.frame(x))} on a \link{Ranges} object. See \code{\link[base]{duplicated}} in the \pkg{base} package for more details. } \item{}{ \code{unique(x, fromLast=FALSE, method=c("auto", "quick", "hash"))}: Removes duplicate ranges from \code{x}. \code{unique(x)} is equivalent to, but more efficient than, \code{unique(as.data.frame(x))} on a \link{Ranges} object. See \code{\link[base]{unique}} in the \pkg{base} package for more details. } \item{}{ \code{x \%in\% table}: A shortcut for finding the ranges in \code{x} that match any of the ranges in \code{table}. Returns a logical vector of length equal to the number of ranges in \code{x}. } \item{}{ \code{findMatches(x, table, method=c("auto", "quick", "hash"))}: An enhanced version of \code{match} that returns all the matches in a \link{Hits} object. } \item{}{ \code{countMatches(x, table, method=c("auto", "quick", "hash"))}: Returns an integer vector of the length of \code{x} containing the number of matches in \code{table} for each element in \code{x}. } \item{}{ \code{order(...)}: Returns a permutation which rearranges its first argument (a \link{Ranges} object) into ascending order, breaking ties by further arguments (also \link{Ranges} objects). } \item{}{ \code{sort(x)}: Sorts \code{x}. See \code{\link[base]{sort}} in the \pkg{base} package for more details. } \item{}{ \code{rank(x, na.last=TRUE, ties.method=c("average", "first", "random", "max", "min"))}: Returns the sample ranks of the ranges in \code{x}. See \code{\link[base]{rank}} in the \pkg{base} package for more details. } } } \author{Hervé Pagès} \seealso{ \itemize{ \item The \link{Ranges} class. \item \link[S4Vectors]{Vector-comparison} in the \pkg{S4Vectors} package for general information about comparing, ordering, and tabulating vector-like objects. \item \link[GenomicRanges]{GenomicRanges-comparison} in the \pkg{GenomicRanges} package for comparing and ordering genomic ranges. \item \code{\link{findOverlaps}} for finding overlapping ranges. \item \link{intra-range-methods} and \link{inter-range-methods} for intra and inter range transformations. \item \link{setops-methods} for set operations on \link{IRanges} objects. } } \examples{ ## --------------------------------------------------------------------- ## A. ELEMENT-WISE (AKA "PARALLEL") COMPARISON OF 2 Ranges OBJECTS ## --------------------------------------------------------------------- x0 <- IRanges(1:11, width=4) x0 y0 <- IRanges(6, 9) pcompare(x0, y0) pcompare(IRanges(4:6, width=6), y0) pcompare(IRanges(6:8, width=2), y0) pcompare(x0, y0) < 0 # equivalent to 'x0 < y0' pcompare(x0, y0) == 0 # equivalent to 'x0 == y0' pcompare(x0, y0) > 0 # equivalent to 'x0 > y0' rangeComparisonCodeToLetter(-10:10) rangeComparisonCodeToLetter(pcompare(x0, y0)) ## Handling of zero-width ranges (a.k.a. empty ranges): x1 <- IRanges(11:17, width=0) x1 pcompare(x1, x1[4]) pcompare(x1, IRanges(12, 15)) ## Note that x1[2] and x1[6] are empty ranges on the edge of non-empty ## range IRanges(12, 15). Even though -1 and 3 could also be considered ## valid codes for describing these configurations, pcompare() ## considers x1[2] and x1[6] to be *adjacent* to IRanges(12, 15), and ## thus returns codes -5 and 5: pcompare(x1[2], IRanges(12, 15)) # -5 pcompare(x1[6], IRanges(12, 15)) # 5 x2 <- IRanges(start=c(20L, 8L, 20L, 22L, 25L, 20L, 22L, 22L), width=c( 4L, 0L, 11L, 5L, 0L, 9L, 5L, 0L)) x2 which(width(x2) == 0) # 3 empty ranges x2[2] == x2[2] # TRUE x2[2] == x2[5] # FALSE x2 == x2[4] x2 >= x2[3] ## --------------------------------------------------------------------- ## B. match(), selfmatch(), %in%, duplicated(), unique() ## --------------------------------------------------------------------- table <- x2[c(2:4, 7:8)] match(x2, table) x2 \%in\% table duplicated(x2) unique(x2) ## --------------------------------------------------------------------- ## C. findMatches(), countMatches() ## --------------------------------------------------------------------- findMatches(x2, table) countMatches(x2, table) x2_levels <- unique(x2) countMatches(x2_levels, x2) ## --------------------------------------------------------------------- ## D. order() AND RELATED METHODS ## --------------------------------------------------------------------- is.unsorted(x2) order(x2) sort(x2) rank(x2, ties.method="first") } \keyword{methods} IRanges/man/RangesList-class.Rd0000644000175400017540000001126713175713360017365 0ustar00biocbuildbiocbuild\name{RangesList-class} \docType{class} \alias{class:RangesList-class} \alias{class:SimpleRangesList-class} \alias{RangesList-class} \alias{SimpleRangesList-class} \alias{RangesList} \alias{SimpleRangesList} % accessors \alias{end,RangesList-method} \alias{end<-,RangesList-method} \alias{width,RangesList-method} \alias{width<-,RangesList-method} \alias{start,RangesList-method} \alias{start<-,RangesList-method} \alias{space} \alias{space,RangesList-method} \alias{universe,RangesList-method} \alias{universe<-,RangesList-method} \alias{universe} \alias{universe<-} \alias{isNormal,RangesList-method} \alias{whichFirstNotNormal,RangesList-method} \alias{coerce,list,RangesList-method} \alias{coerce,Ranges,RangesList-method} \alias{coerce,RangesList,SimpleRangesList-method} % show \alias{show,RangesList-method} \alias{showAsCell,RangesList-method} % merge \alias{merge,RangesList,missing-method} \alias{merge,missing,RangesList-method} \alias{merge,RangesList,RangesList-method} \title{List of Ranges} \description{An extension of \linkS4class{List} that holds only \linkS4class{Ranges} objects. Useful for storing ranges over a set of spaces (e.g. chromosomes), each of which requires a separate \code{Ranges} object. } \section{Accessors}{ In the code snippets below, \code{x} is a \code{RangesList} object. All of these accessors collapse over the spaces: \describe{ \item{}{\code{start(x), start(x) <- value}: Get or set the starts of the ranges. When setting the starts, \code{value} can be an integer vector of length \code{sum(elementNROWS(x))} or an IntegerList object of length \code{length(x)} and names \code{names(x)}.} \item{}{\code{end(x), end(x) <- value}: Get or set the ends of the ranges. When setting the ends, \code{value} can be an integer vector of length \code{sum(elementNROWS(x))} or an IntegerList object of length \code{length(x)} and names \code{names(x)}.} \item{}{\code{width(x), width(x) <- value}: Get or set the widths of the ranges. When setting the widths, \code{value} can be an integer vector of length \code{sum(elementNROWS(x))} or an IntegerList object of length \code{length(x)} and names \code{names(x)}.} \item{}{\code{space(x)}: Gets the spaces of the ranges as a character vector. This is equivalent to \code{names(x)}, except each name is repeated according to the length of its element. } } } \section{Constructor}{ \describe{ \item{}{\code{RangesList(...)}: Each \code{Ranges} in \code{...} becomes an element in the new \code{RangesList}, in the same order. This is analogous to the \code{\link{list}} constructor, except every argument in \code{...} must be derived from \code{Ranges}. } } } \section{Coercion}{ In the code snippet below, \code{x} is a \code{RangesList} object. \describe{ \item{}{ \code{as.data.frame(x, row.names = NULL, optional = FALSE, ..., value.name = "value", use.outer.mcols = FALSE, group_name.as.factor = FALSE)}: Coerces \code{x} to a \code{data.frame}. See as.data.frame on the \code{List} man page for details (?\code{List}). } } In the following code snippet, \code{from} is something other than a \code{RangesList}: \describe{ \item{}{ \code{as(from, "RangesList")}: When \code{from} is a \code{Ranges}, analogous to \code{as.list} on a vector. } } } \section{Arithmetic Operations}{ Any arithmetic operation, such as \code{x + y}, \code{x * y}, etc, where \code{x} is a \code{RangesList}, is performed identically on each element. Currently, \code{Ranges} supports only the \code{*} operator, which zooms the ranges by a numeric factor. } \author{ Michael Lawrence } \seealso{ \code{\linkS4class{List}}, the parent of this class, for more functionality. } \examples{ ## --------------------------------------------------------------------- ## Basic manipulation ## --------------------------------------------------------------------- range1 <- IRanges(start=c(1, 2, 3), end=c(5, 2, 8)) range2 <- IRanges(start=c(15, 45, 20, 1), end=c(15, 100, 80, 5)) named <- RangesList(one = range1, two = range2) length(named) # 2 start(named) # same as start(c(range1, range2)) names(named) # "one" and "two" named[[1]] # range1 unnamed <- RangesList(range1, range2) names(unnamed) # NULL # edit the width of the ranges in the list edited <- named width(edited) <- rep(c(3,2), elementNROWS(named)) edited # same as list(range1, range2) as.list(RangesList(range1, range2)) # coerce to data.frame as.data.frame(named) RangesList(range1, range2) ## zoom in 2X collection <- RangesList(one = range1, range2) collection * 2 } \keyword{methods} \keyword{classes} IRanges/man/Rle-class-leftovers.Rd0000644000175400017540000000423613175713360020041 0ustar00biocbuildbiocbuild\name{Rle-class-leftovers} \docType{class} \alias{ranges,Rle-method} \alias{coerce,Rle,IRanges-method} \alias{coerce,Rle,NormalIRanges-method} \alias{findRange} \alias{findRange,Rle-method} \alias{splitRanges} \alias{splitRanges,Rle-method} \alias{splitRanges,vector_OR_factor-method} \title{Rle objects (old man page)} \description{ IMPORTANT NOTE - 7/3/2014: This man page is being refactored. Most of the things that used to be documented here have been moved to the man page for \link[S4Vectors]{Rle} objects located in the \pkg{S4Vectors} package. } \section{Coercion}{ In the code snippets below, \code{from} is an Rle object: \describe{ \item{}{ \code{as(from, "IRanges")}: Creates an \link{IRanges} instance from a logical Rle. Note that this instance is guaranteed to be normal. } \item{}{ \code{as(from, "NormalIRanges")}: Creates a \link{NormalIRanges} instance from a logical Rle. } } } \section{General Methods}{ In the code snippets below, \code{x} is an Rle object: \describe{ \item{}{ \code{split(x, f, drop=FALSE)}: Splits \code{x} according to \code{f} to create a \link{CompressedRleList} object. If \code{f} is a list-like object then \code{drop} is ignored and \code{f} is treated as if it was \code{rep(seq_len(length(f)), sapply(f, length))}, so the returned object has the same shape as \code{f} (it also receives the names of \code{f}). Otherwise, if \code{f} is not a list-like object, empty list elements are removed from the returned object if \code{drop} is \code{TRUE}. } \item{}{ \code{findRange(x, vec)}: Returns an \link{IRanges} object representing the ranges in Rle \code{vec} that are referenced by the indices in the integer vector \code{x}. } \item{}{ \code{splitRanges(x)}: Returns a \linkS4class{CompressedIRangesList} object that contains the ranges for each of the unique run values. } } } \seealso{ The \link[S4Vectors]{Rle} class defined and documented in the \pkg{S4Vectors} package. } \examples{ x <- Rle(10:1, 1:10) x } \keyword{methods} \keyword{classes} IRanges/man/RleViews-class.Rd0000644000175400017540000000270413175713360017046 0ustar00biocbuildbiocbuild\name{RleViews-class} \docType{class} % Classes: \alias{class:RleViews} \alias{RleViews-class} \alias{RleViews} % Constructors: \alias{Views,Rle-method} % Methods: \alias{show,RleViews-method} \title{The RleViews class} \description{ The RleViews class is the basic container for storing a set of views (start/end locations) on the same Rle object. } \details{ An RleViews object contains a set of views (start/end locations) on the same \link{Rle} object called "the subject vector" or simply "the subject". Each view is defined by its start and end locations: both are integers such that start <= end. An RleViews object is in fact a particular case of a \link{Views} object (the RleViews class contains the \link{Views} class) so it can be manipulated in a similar manner: see \code{?\link{Views}} for more information. Note that two views can overlap and that a view can be "out of limits" i.e. it can start before the first element of the subject or/and end after its last element. } \author{P. Aboyoun} \seealso{ \link{Views-class}, \link{Rle-class}, \link{view-summarization-methods} } \examples{ subject <- Rle(rep(c(3L, 2L, 18L, 0L), c(3,2,1,5))) myViews <- Views(subject, 3:0, 5:8) myViews subject(myViews) length(myViews) start(myViews) end(myViews) width(myViews) myViews[[2]] set.seed(0) vec <- Rle(sample(0:2, 20, replace = TRUE)) vec Views(vec, vec > 0) } \keyword{methods} \keyword{classes} IRanges/man/RleViewsList-class.Rd0000644000175400017540000000537513175713360017711 0ustar00biocbuildbiocbuild\name{RleViewsList-class} \docType{class} \alias{RleViewsList-class} \alias{SimpleRleViewsList-class} % accessor \alias{subject,SimpleRleViewsList-method} % constructor \alias{Views,RleList-method} \alias{RleViewsList} % coercion \alias{coerce,RleViewsList,IRangesList-method} \alias{coerce,RleViewsList,CompressedIRangesList-method} \alias{coerce,RleViewsList,SimpleIRangesList-method} \title{List of RleViews} \description{An extension of \linkS4class{ViewsList} that holds only \linkS4class{RleViews} objects. Useful for storing coverage vectors over a set of spaces (e.g. chromosomes), each of which requires a separate \linkS4class{RleViews} object. } \details{ For more information on methods available for RleViewsList objects consult the man pages for \link{ViewsList-class} and \link{view-summarization-methods}. } \section{Constructor}{ \describe{ \item{}{\code{RleViewsList(..., rleList, rangesList)}: Either \code{...} or the \code{rleList}/\code{rangesList} couplet provide the RleViews for the list. If \code{...} is provided, each of these arguments must be RleViews objects. Alternatively, \code{rleList} and \code{rangesList} accept Rle and Ranges objects respectively that are meshed together for form the RleViewsList. } \item{}{\code{Views(subject, start=NULL, end=NULL, width=NULL, names=NULL)}: Same as \code{RleViewsList(rleList = subject, rangesList = start)}. } } } \section{Coercion}{ In the code snippets below, \code{from} is an RleViewsList object: \describe{ \item{}{ \code{as(from, "IRangesList")}: Creates a \code{CompressedIRangesList} object containing the view locations in \code{from}. } \item{}{ \code{as(from, "CompressedIRangesList")}: Creates a \code{CompressedIRangesList} object containing the view locations in \code{from}. } \item{}{ \code{as(from, "SimpleIRangesList")}: Creates a \code{SimpleIRangesList} object containing the view locations in \code{from}. } } } \author{P. Aboyoun} \seealso{ \link{ViewsList-class}, \link{view-summarization-methods} } \examples{ ## Rle objects subject1 <- Rle(c(3L,2L,18L,0L), c(3,2,1,5)) set.seed(0) subject2 <- Rle(c(0L,5L,2L,0L,3L), c(8,5,2,7,4)) ## Views rleViews1 <- Views(subject1, 3:0, 5:8) rleViews2 <- Views(subject2, subject2 > 0) ## RleList and RangesList objects rleList <- RleList(subject1, subject2) rangesList <- IRangesList(IRanges(3:0, 5:8), IRanges(subject2 > 0)) ## methods for construction method1 <- RleViewsList(rleViews1, rleViews2) method2 <- RleViewsList(rleList = rleList, rangesList = rangesList) identical(method1, method2) ## calculation over the views viewSums(method1) } \keyword{methods} \keyword{classes} IRanges/man/Vector-class-leftovers.Rd0000644000175400017540000001227313175713360020561 0ustar00biocbuildbiocbuild\name{Vector-class-leftovers} \docType{class} \alias{window<-,Vector-method} \alias{window<-.Vector} \alias{window<-,vector-method} \alias{window<-.vector} \alias{window<-,factor-method} \alias{window<-.factor} \alias{rev,Vector-method} \alias{rep,Vector-method} \alias{rep.int,Vector-method} \alias{subset,Vector-method} \alias{mstack} \alias{mstack,Vector-method} \alias{mstack,vector-method} \alias{tapply,ANY,Vector-method} \alias{tapply,Vector,ANY-method} \alias{tapply,Vector,Vector-method} \alias{with,Vector-method} \alias{eval} \alias{eval,expression,Vector-method} \alias{eval,language,Vector-method} \title{Vector objects (old man page)} \description{ IMPORTANT NOTE - 4/29/2014: This man page is being refactored. Most of the things that used to be documented here have been moved to the man page for \link[S4Vectors]{Vector} objects located in the \pkg{S4Vectors} package. } \section{Evaluation}{ In the following code snippets, \code{x} is a Vector object. \describe{ \item{}{ \code{with(x, expr)}: Evaluates \code{expr} within \code{as.env(x)} via \code{eval(x)}. } \item{}{ \code{eval(expr, envir, enclos=parent.frame())}: Evaluates \code{expr} within \code{envir}, where \code{envir} is coerced to an environment with \code{as.env(envir, enclos)}. The \code{expr} is first processed with \code{\link{bquote}}, such that any escaped symbols are directly resolved in the calling frame. } } } \section{Convenience wrappers for common subsetting operations}{ In the code snippets below, \code{x} is a Vector object or regular R vector object. The R vector object methods for \code{window} are defined in this package and the remaining methods are defined in base R. \describe{ \item{}{ \code{window(x, start=NA, end=NA, width=NA)}: Extract the subsequence window from the Vector object using: \describe{ \item{\code{start}, \code{end}, \code{width}}{The start, end, or width of the window. Two of the three are required.} } } \item{}{ \code{window(x, start=NA, end=NA, width=NA) <- value}: Replace the subsequence window specified on the left (i.e. the subsequence in \code{x} specified by \code{start}, \code{end} and \code{width}) by \code{value}. \code{value} must either be of class \code{class(x)}, belong to a subclass of \code{class(x)}, or be coercible to \code{class(x)} or a subclass of \code{class(x)}. The elements of \code{value} are repeated to create a Vector with the same number of elements as the width of the subsequence window it is replacing. } \item{}{ \code{head(x, n = 6L)}: If \code{n} is non-negative, returns the first n elements of the Vector object. If \code{n} is negative, returns all but the last \code{abs(n)} elements of the Vector object. } \item{}{ \code{tail(x, n = 6L)}: If \code{n} is non-negative, returns the last n elements of the Vector object. If \code{n} is negative, returns all but the first \code{abs(n)} elements of the Vector object. } \item{}{ \code{rev(x)}: Return a new Vector object made of the original elements in the reverse order. } \item{}{ \code{rep(x, times, length.out, each)}, \code{rep.int(x, times)}: Repeats the values in \code{x} through one of the following conventions: \describe{ \item{\code{times}}{Vector giving the number of times to repeat each element if of length \code{length(x)}, or to repeat the whole vector if of length 1.} \item{\code{length.out}}{Non-negative integer. The desired length of the output vector.} \item{\code{each}}{Non-negative integer. Each element of \code{x} is repeated \code{each} times.} } } \item{}{ \code{subset(x, subset)}: Return a new Vector object made of the subset using logical vector \code{subset}, where missing values are taken as FALSE. } } } \section{Combining}{ In the code snippets below, \code{x} is a Vector object. \describe{ \item{}{\code{mstack(..., .index.var = "name")}: A variant of \code{\link{stack}}, where the list is taken as the list of arguments in \code{...}, each of which should be a \code{Vector} or \code{vector} (mixing the two will not work). } } } \section{Looping}{ In the code snippets below, \code{x} is a Vector object. \describe{ \item{}{ \code{tapply(X, INDEX, FUN = NULL, ..., simplify = TRUE)}: Like the standard \code{\link[base]{tapply}} function defined in the base package, the \code{tapply} method for Vector objects applies a function to each cell of a ragged array, that is to each (non-empty) group of values given by a unique combination of the levels of certain factors. } } } \section{Coercion}{ \describe{ \item{}{ \code{as.list(x)}: coerce a Vector to a list, where the \code{i}th element of the result corresponds to \code{x[i]}. } } } \seealso{ The \link[S4Vectors]{Vector} class defined and documented in the \pkg{S4Vectors} package. } \keyword{methods} \keyword{classes} IRanges/man/Views-class.Rd0000644000175400017540000001246513175713360016410 0ustar00biocbuildbiocbuild\name{Views-class} \docType{class} \alias{class:Views} \alias{Views-class} \alias{subject} \alias{subject,Views-method} \alias{ranges,Views-method} \alias{ranges<-} \alias{ranges<-,Views-method} \alias{length,Views-method} \alias{start,Views-method} \alias{end,Views-method} \alias{width,Views-method} \alias{names,Views-method} \alias{start<-,Views-method} \alias{end<-,Views-method} \alias{width<-,Views-method} \alias{names<-,Views-method} \alias{elementNROWS,Views-method} \alias{Views} \alias{coerce,Vector,Views-method} \alias{coerce,Views,Ranges-method} \alias{coerce,Views,IRanges-method} \alias{coerce,Views,NormalIRanges-method} \alias{as.matrix,Views-method} \alias{c,Views-method} \alias{trim} \alias{trim,Views-method} \alias{subviews} \alias{subviews,Views-method} \alias{successiveViews} \title{Views objects} \description{ The Views virtual class is a general container for storing a set of views on an arbitrary \link{Vector} object, called the "subject". Its primary purpose is to introduce concepts and provide some facilities that can be shared by the concrete classes that derive from it. Some direct subclasses of the Views class are: \link{RleViews}, \link[XVector]{XIntegerViews} (defined in the XVector package), \link[Biostrings]{XStringViews} (defined in the Biostrings package), etc... } \section{Constructor}{ \describe{ \item{}{ \code{Views(subject, start=NULL, end=NULL, width=NULL, names=NULL)}: This constructor is a generic function with dispatch on argument \code{subject}. Specific methods must be defined for the subclasses of the Views class. For example a method for \link[Biostrings:XString-class]{XString} subjects is defined in the Biostrings package that returns an \link[Biostrings:XStringViews-class]{XStringViews} object. There is no default method. The treatment of the \code{start}, \code{end} and \code{width} arguments is the same as with the \code{\link{IRanges}} constructor, except that, in addition, \code{Views} allows \code{start} to be a \link{Ranges} object. With this feature, \code{Views(subject, IRanges(my_starts, my_ends, my_widths, my_names))} and \code{Views(subject, my_starts, my_ends, my_widths, my_names)} are equivalent (except when \code{my_starts} is itself a \link{Ranges} object). } } } \section{Coercion}{ In the code snippets below, \code{from} is a Views object: \describe{ \item{}{ \code{as(from, "IRanges")}: Creates an \code{IRanges} object containing the view locations in \code{from}. } } } \section{Accessor-like methods}{ All the accessor-like methods defined for \code{IRanges} objects work on Views objects. In addition, the following accessors are defined for Views objects: \describe{ \item{}{ \code{subject(x)}: Return the subject of the views. } } } \section{Subsetting}{ \describe{ \item{}{ \code{x[i]}: Select the views specified by \code{i}. } \item{}{ \code{x[[i]]}: Extracts the view selected by \code{i} as an object of the same class as \code{subject(x)}. Subscript \code{i} can be a single integer or a character string. The result is the subsequence of \code{subject(x)} defined by \code{window(subject(x), start=start(x)[i], end=end(x)[i])} or an error if the view is "out of limits" (i.e. \code{start(x)[i] < 1} or \code{end(x)[i] > length(subject(x))}). } } } \section{Combining}{ \describe{ \item{}{ \code{c(x, ..., ignore.mcols=FALSE)}: Combine \code{Views} objects. They must have the same subject. } } } \section{Other methods}{ \describe{ \item{}{ \code{trim(x, use.names=TRUE)}: Equivalent to \code{restrict(x, start=1L, end=length(subject(x)), keep.all.ranges=TRUE, use.names=use.names)}. } \item{}{ \code{subviews(x, start=NA, end=NA, width=NA, use.names=TRUE)}: \code{start}, \code{end}, and \code{width} arguments must be vectors of integers, eventually with NAs, that contain coordinates relative to the current ranges. Equivalent to \code{trim(narrow(x, start=start, end=end, width=width, use.names=use.names))}. } \item{}{ \code{successiveViews(subject, width, gapwidth=0, from=1)}: Equivalent to \code{Views(subject, successiveIRanges(width, gapwidth, from))}. See \code{?successiveIRanges} for a description of the \code{width}, \code{gapwidth} and \code{from} arguments. } } } \author{Hervé Pagès} \seealso{ \link{IRanges-class}, \link{Vector-class}, \link{IRanges-utils}, \link[XVector]{XVector}. Some direct subclasses of the Views class: \link{RleViews-class}, \link[XVector]{XIntegerViews-class}, \link[XVector]{XDoubleViews-class}, \link[Biostrings]{XStringViews-class}. \code{\link{findOverlaps}}. } \examples{ showClass("Views") # shows (some of) the known subclasses ## Create a set of 4 views on an XInteger subject of length 10: subject <- Rle(3:-6) v1 <- Views(subject, start=4:1, end=4:7) ## Extract the 2nd view: v1[[2]] ## Some views can be "out of limits" v2 <- Views(subject, start=4:-1, end=6) trim(v2) subviews(v2, end=-2) ## See ?`XIntegerViews-class` in the XVector package for more examples. } \keyword{methods} \keyword{classes} IRanges/man/ViewsList-class.Rd0000644000175400017540000000245313175713360017240 0ustar00biocbuildbiocbuild\name{ViewsList-class} \docType{class} \alias{class:ViewsList} \alias{ViewsList-class} \alias{ViewsList} \alias{class:SimpleViewsList} \alias{SimpleViewsList-class} \alias{SimpleViewsList} % accessors \alias{ranges,SimpleViewsList-method} \alias{start,SimpleViewsList-method} \alias{end,SimpleViewsList-method} \alias{width,SimpleViewsList-method} \alias{universe,ViewsList-method} \alias{universe<-,ViewsList-method} % coercion \alias{as.matrix,ViewsList-method} \title{List of Views} \description{An extension of \linkS4class{List} that holds only \linkS4class{Views} objects. } \details{ ViewsList is a virtual class. Specialized subclasses like e.g. \linkS4class{RleViewsList} are useful for storing coverage vectors over a set of spaces (e.g. chromosomes), each of which requires a separate \linkS4class{RleViews} object. As a \linkS4class{List} subclass, ViewsList inherits all the methods available for \linkS4class{List} objects. It also presents an API that is very similar to that of \linkS4class{Views}, where operations are vectorized over the elements and generally return lists. } \author{P. Aboyoun and H. Pagès} \seealso{ \link{List-class}, \link{RleViewsList-class}. \code{\link{findOverlaps}}. } \examples{ showClass("ViewsList") } \keyword{methods} \keyword{classes} IRanges/man/coverage-methods.Rd0000644000175400017540000003162313175713360017441 0ustar00biocbuildbiocbuild\name{coverage-methods} \alias{coverage-methods} \alias{coverage} \alias{coverage,Ranges-method} \alias{coverage,Views-method} \alias{coverage,RangesList-method} \alias{coverage,RangedData-method} \title{Coverage of a set of ranges} \description{ For each position in the space underlying a set of ranges, counts the number of ranges that cover it. } \usage{ coverage(x, shift=0L, width=NULL, weight=1L, ...) \S4method{coverage}{Ranges}(x, shift=0L, width=NULL, weight=1L, method=c("auto", "sort", "hash")) \S4method{coverage}{RangesList}(x, shift=0L, width=NULL, weight=1L, method=c("auto", "sort", "hash")) } \arguments{ \item{x}{ A \link{Ranges}, \link{Views}, or \link{RangesList} object. See \code{?`\link[GenomicRanges]{coverage-methods}`} in the \pkg{GenomicRanges} package for \code{coverage} methods for other objects. } \item{shift}{ Specifies how much each range in \code{x} should be shifted before the coverage is computed. \itemize{ \item If \code{x} is a \link{Ranges} or \link{Views} object: \code{shift} must be an integer or numeric vector parallel to \code{x} (will get recycled if necessary) and with no NAs. \item If \code{x} is a \link{RangesList} object: \code{shift} must be a numeric vector or list-like object of the same length as \code{x} (will get recycled if necessary). If it's a numeric vector, it's first turned into a list with \code{as.list}. After recycling, each list element \code{shift[[i]]} must be an integer or numeric vector parallel to \code{x[[i]]} (will get recycled if necessary) and with no NAs. } A positive shift value will shift the corresponding range in \code{x} to the right, and a negative value to the left. } \item{width}{ Specifies the length of the returned coverage vector(s). \itemize{ \item If \code{x} is a \link{Ranges} object: \code{width} must be \code{NULL} (the default), an NA, or a single non-negative integer. After being shifted, the ranges in \code{x} are always clipped on the left to keep only their positive portion i.e. their intersection with the [1, +inf) interval. If \code{width} is a single non-negative integer, then they're also clipped on the right to keep only their intersection with the [1, width] interval. In that case \code{coverage} returns a vector of length \code{width}. Otherwise, it returns a vector that extends to the last position in the underlying space covered by the shifted ranges. \item If \code{x} is a \link{Views} object: Same as for a \link{Ranges} object, except that, if \code{width} is \code{NULL} then it's treated as if it was \code{length(subject(x))}. \item If \code{x} is a \link{RangesList} object: \code{width} must be \code{NULL} or an integer vector parallel to \code{x} (i.e. with one element per list element in \code{x}). If not \code{NULL}, the vector must contain NAs or non-negative integers and it will get recycled to the length of \code{x} if necessary. If \code{NULL}, it is replaced with \code{NA} and recycled to the length of \code{x}. Finally \code{width[i]} is used to compute the coverage vector for \code{x[[i]]} and is therefore treated like explained above (when \code{x} is a \link{Ranges} object). } } \item{weight}{ Assigns a weight to each range in \code{x}. \itemize{ \item If \code{x} is a \link{Ranges} or \link{Views} object: \code{weight} must be an integer or numeric vector parallel to \code{x} (will get recycled if necessary). \item If \code{x} is a \link{RangesList} object: \code{weight} must be a numeric vector or list-like object of the same length as \code{x} (will get recycled if necessary). If it's a numeric vector, it's first turned into a list with \code{as.list}. After recycling, each list element \code{weight[[i]]} must be an integer or numeric vector parallel to \code{x[[i]]} (will get recycled if necessary). } If \code{weight} is an integer vector or list-like object of integer vectors, the coverage vector(s) will be returned as integer-\link{Rle} object(s). If it's a numeric vector or list-like object of numeric vectors, the coverage vector(s) will be returned as numeric-\link{Rle} object(s). Alternatively, \code{weight} can also be specified as a single string naming a metadata column in \code{x} (i.e. a column in \code{mcols(x)}) to be used as the \code{weight} vector. } \item{method}{ If \code{method} is set to \code{"sort"}, then \code{x} is sorted previous to the calculation of the coverage. If \code{method} is set to \code{hash}, then \code{x} is hashed directly to a vector of length \code{width} without previous sorting. The \code{"hash"} method is faster than the \code{"sort"} method when \code{x} is large (i.e. contains a lot of ranges). When \code{x} is small and \code{width} is big (e.g. \code{x} represents a small set of reads aligned to a big chromosome), then \code{method="sort"} is faster and uses less memory than \code{method="hash"}. Using \code{method="auto"} selects the best method based on \code{length(x)} and \code{width}. } \item{...}{ Further arguments to be passed to or from other methods. } } \value{ If \code{x} is a \link{Ranges} or \link{Views} object: An integer- or numeric-\link{Rle} object depending on whether \code{weight} is an integer or numeric vector. If \code{x} is a \link{RangesList} object: An \link{RleList} object with one coverage vector per list element in \code{x}, and with \code{x} names propagated to it. The i-th coverage vector can be either an integer- or numeric-\link{Rle} object, depending on the type of \code{weight[[i]]} (after \code{weight} has gone thru \code{as.list} and recycling, like described previously). } \author{H. Pagès and P. Aboyoun} \seealso{ \itemize{ \item \link[GenomicRanges]{coverage-methods} in the \pkg{GenomicRanges} package for more \code{coverage} methods. \item The \code{\link{slice}} function for slicing the \link{Rle} or \link{RleList} object returned by \code{coverage}. \item The \link{Ranges}, \link{RangesList}, \link{Rle}, and \link{RleList} classes. } } \examples{ ## --------------------------------------------------------------------- ## A. COVERAGE OF AN IRanges OBJECT ## --------------------------------------------------------------------- x <- IRanges(start=c(-2L, 6L, 9L, -4L, 1L, 0L, -6L, 10L), width=c( 5L, 0L, 6L, 1L, 4L, 3L, 2L, 3L)) coverage(x) coverage(x, shift=7) coverage(x, shift=7, width=27) coverage(x, shift=c(-4, 2)) # 'shift' gets recycled coverage(x, shift=c(-4, 2), width=12) coverage(x, shift=-max(end(x))) coverage(restrict(x, 1, 10)) coverage(reduce(x), shift=7) coverage(gaps(shift(x, 7), start=1, end=27)) ## With weights: coverage(x, weight=as.integer(10^(0:7))) # integer-Rle coverage(x, weight=c(2.8, -10)) # numeric-Rle, 'shift' gets recycled ## --------------------------------------------------------------------- ## B. SOME MATHEMATICAL PROPERTIES OF THE coverage() FUNCTION ## --------------------------------------------------------------------- ## PROPERTY 1: The coverage vector is not affected by reordering the ## input ranges: set.seed(24) x <- IRanges(sample(1000, 40, replace=TRUE), width=17:10) cvg0 <- coverage(x) stopifnot(identical(coverage(sample(x)), cvg0)) ## Of course, if the ranges are shifted and/or assigned weights, then ## this doesn't hold anymore, unless the 'shift' and/or 'weight' ## arguments are reordered accordingly. ## PROPERTY 2: The coverage of the concatenation of 2 Ranges objects 'x' ## and 'y' is the sum of the 2 individual coverage vectors: y <- IRanges(sample(-20:280, 36, replace=TRUE), width=28) stopifnot(identical(coverage(c(x, y), width=100), coverage(x, width=100) + coverage(y, width=100))) ## Note that, because adding 2 vectors in R recycles the shortest to ## the length of the longest, the following is generally FALSE: identical(coverage(c(x, y)), coverage(x) + coverage(y)) # FALSE ## It would only be TRUE if the 2 coverage vectors we add had the same ## length, which would only happen by chance. By using the same 'width' ## value when we computed the 2 coverages previously, we made sure they ## had the same length. ## Because of properties 1 & 2, we have: x1 <- x[c(TRUE, FALSE)] # pick up 1st, 3rd, 5th, etc... ranges x2 <- x[c(FALSE, TRUE)] # pick up 2nd, 4th, 6th, etc... ranges cvg1 <- coverage(x1, width=100) cvg2 <- coverage(x2, width=100) stopifnot(identical(coverage(x, width=100), cvg1 + cvg2)) ## PROPERTY 3: Multiplying the weights by a scalar has the effect of ## multiplying the coverage vector by the same scalar: weight <- runif(40) cvg3 <- coverage(x, weight=weight) stopifnot(all.equal(coverage(x, weight=-2.68 * weight), -2.68 * cvg3)) ## Because of properties 1 & 2 & 3, we have: stopifnot(identical(coverage(x, width=100, weight=c(5L, -11L)), 5L * cvg1 - 11L * cvg2)) ## PROPERTY 4: Using the sum of 2 weight vectors produces the same ## result as using the 2 weight vectors separately and summing the ## 2 results: weight2 <- 10 * runif(40) + 3.7 stopifnot(all.equal(coverage(x, weight=weight + weight2), cvg3 + coverage(x, weight=weight2))) ## PROPERTY 5: Repeating any input range N number of times is ## equivalent to multiplying its assigned weight by N: times <- sample(0:10L, length(x), replace=TRUE) stopifnot(all.equal(coverage(rep(x, times), weight=rep(weight, times)), coverage(x, weight=weight * times))) ## In particular, if 'weight' is not supplied: stopifnot(identical(coverage(rep(x, times)), coverage(x, weight=times))) ## PROPERTY 6: If none of the input range actually gets clipped during ## the "shift and clip" process, then: ## ## sum(cvg) = sum(width(x) * weight) ## stopifnot(sum(cvg3) == sum(width(x) * weight)) ## In particular, if 'weight' is not supplied: stopifnot(sum(cvg0) == sum(width(x))) ## Note that this property is sometimes used in the context of a ## ChIP-Seq analysis to estimate "the number of reads in a peak", that ## is, the number of short reads that belong to a peak in the coverage ## vector computed from the genomic locations (a.k.a. genomic ranges) ## of the aligned reads. Because of property 6, the number of reads in ## a peak is approximately the area under the peak divided by the short ## read length. ## PROPERTY 7: If 'weight' is not supplied, then disjoining or reducing ## the ranges before calling coverage() has the effect of "shaving" the ## coverage vector at elevation 1: table(cvg0) shaved_cvg0 <- cvg0 runValue(shaved_cvg0) <- pmin(runValue(cvg0), 1L) table(shaved_cvg0) stopifnot(identical(coverage(disjoin(x)), shaved_cvg0)) stopifnot(identical(coverage(reduce(x)), shaved_cvg0)) ## --------------------------------------------------------------------- ## C. SOME SANITY CHECKS ## --------------------------------------------------------------------- dummy.coverage <- function(x, shift=0L, width=NULL) { y <- unlist(shift(x, shift)) if (is.null(width)) width <- max(c(0L, y)) Rle(tabulate(y, nbins=width)) } check_real_vs_dummy <- function(x, shift=0L, width=NULL) { res1 <- coverage(x, shift=shift, width=width) res2 <- dummy.coverage(x, shift=shift, width=width) stopifnot(identical(res1, res2)) } check_real_vs_dummy(x) check_real_vs_dummy(x, shift=7) check_real_vs_dummy(x, shift=7, width=27) check_real_vs_dummy(x, shift=c(-4, 2)) check_real_vs_dummy(x, shift=c(-4, 2), width=12) check_real_vs_dummy(x, shift=-max(end(x))) ## With a set of distinct single positions: x3 <- IRanges(sample(50000, 20000), width=1) stopifnot(identical(sort(start(x3)), which(coverage(x3) != 0L))) ## --------------------------------------------------------------------- ## D. COVERAGE OF AN IRangesList OBJECT ## --------------------------------------------------------------------- x <- IRangesList(A=IRanges(3*(4:-1), width=1:3), B=IRanges(2:10, width=5)) cvg <- coverage(x) cvg stopifnot(identical(cvg[[1]], coverage(x[[1]]))) stopifnot(identical(cvg[[2]], coverage(x[[2]]))) coverage(x, width=c(50, 9)) coverage(x, width=c(NA, 9)) coverage(x, width=9) # 'width' gets recycled ## Each list element in 'shift' and 'weight' gets recycled to the length ## of the corresponding element in 'x'. weight <- list(as.integer(10^(0:5)), -0.77) cvg2 <- coverage(x, weight=weight) cvg2 # 1st coverage vector is an integer-Rle, 2nd is a numeric-Rle identical(mapply(coverage, x=x, weight=weight), as.list(cvg2)) } \keyword{methods} \keyword{utilities} IRanges/man/extractList.Rd0000644000175400017540000001426113175713360016512 0ustar00biocbuildbiocbuild\name{extractList} \alias{relist,ANY,PartitioningByEnd-method} \alias{relist,ANY,List-method} \alias{relist,Vector,list-method} \alias{relist,ANY,PartitioningByEnd-method} \alias{splitAsList} \alias{splitAsList,ANY,ANY} \alias{extractList} \alias{extractList,ANY,ANY-method} \alias{extractList,ANY-method} \alias{regroup} \title{Group elements of a vector-like object into a list-like object} \description{ \code{relist} and \code{split} are 2 common ways of grouping the elements of a vector-like object into a list-like object. The \pkg{IRanges} and \pkg{S4Vectors} packages define \code{relist} and \code{split} methods that operate on a \link{Vector} object and return a \link{List} object. Note that the \code{\link[S4Vectors]{split}} methods defined in \pkg{S4Vectors} delegate to the \code{splitAsList} function defined in \pkg{IRanges} and documented below. Because \code{relist} and \code{split} both impose restrictions on the kind of grouping that they support (e.g. every element in the input object needs to go in a group and can only go in one group), the \pkg{IRanges} package introduces the \code{extractList} generic function for performing \emph{arbitrary} groupings. } \usage{ ## relist() ## -------- \S4method{relist}{ANY,List}(flesh, skeleton) \S4method{relist}{Vector,list}(flesh, skeleton) ## splitAsList() ## ------------- splitAsList(x, f, drop=FALSE, ...) ## extractList() ## ------------- extractList(x, i) ## regroup() ## --------- regroup(x, g) } \arguments{ \item{flesh, x}{ A vector-like object. } \item{skeleton}{ A list-like object. Only the "shape" (i.e. element lengths) of \code{skeleton} matters. Its exact content is ignored. } \item{f}{ An atomic vector or a factor (possibly in \link{Rle} form). } \item{drop}{ Logical indicating if levels that do not occur should be dropped (if \code{f} is a factor). } \item{i}{ A list-like object. Unlike for \code{skeleton}, the content here matters (see Details section below). Note that \code{i} can be a \link{Ranges} object (a particular type of list-like object), and, in that case, \code{extractList} is particularly fast (this is a common use case). } \item{g}{ A \linkS4class{Grouping} or an object coercible to one. For \code{regroup}, \code{g} groups the elements of \code{x}. } \item{...}{ Arguments to pass to methods. } } \details{ \code{relist}, \code{split}, and \code{extractList} have in common that they return a list-like object where each list element has the same class as the original vector-like object. Thus they need to be able to select the appropriate \link{List} concrete subclass to use for this returned value. This selection is performed by \code{\link[S4Vectors]{relistToClass}} and is based only on the class of the original object. By default, \code{extractList(x, i)} is equivalent to: \preformatted{ relist(x[unlist(i)], i) } An exception is made when \code{x} is a data-frame-like object. In that case \code{x} is subsetted along the rows, that is, \code{extractList(x, i)} is equivalent to: \preformatted{ relist(x[unlist(i), ], i) } This is more or less how the default method is implemented, except for some optimizations when \code{i} is a \link{Ranges} object. \code{relist} and \code{split} (or \code{splitAsList}) can be seen as special cases of \code{extractList}: \preformatted{ relist(flesh, skeleton) is equivalent to extractList(flesh, PartitioningByEnd(skeleton)) split(x, f) is equivalent to extractList(x, split(seq_along(f), f)) } It is good practise to use \code{extractList} only for cases not covered by \code{relist} or \code{split}. Whenever possible, using \code{relist} or \code{split} is preferred as they will always perform more efficiently. In addition their names carry meaning and are familiar to most R users/developers so they'll make your code easier to read/understand. Note that the transformation performed by \code{relist} or \code{split} is always reversible (via \code{unlist} and \code{unsplit}, respectively), but not the transformation performed by \code{extractList} (in general). The \code{regroup} function splits the elements of \code{unlist(x)} into a list according to the grouping \code{g}. Each element of \code{unlist(x)} inherits its group from its parent element of \code{x}. \code{regroup} is different from \code{relist} and \code{split}, because \code{x} is already grouped, and the goal is to combine groups. } \value{ The \code{relist} methods behave like \code{utils::relist} except that they return a \link{List} object. If \code{skeleton} has names, then they are propagated to the returned value. \code{splitAsList} behaves like \code{base::split} except that the former returns a \link{List} object instead of an ordinary list. \code{extractList} returns a list-like object parallel to \code{i} and with the same "shape" as \code{i} (i.e. same element lengths). If \code{i} has names, then they are propagated to the returned value. All these functions return a list-like object where the list elements have the same class as \code{x}. \code{\link[S4Vectors]{relistToClass}} gives the exact class of the returned object. } \author{Hervé Pagès} \seealso{ \itemize{ \item The \code{\link[base]{unlist}} and \code{\link[utils]{relist}} functions in the \pkg{base} and \pkg{utils} packages, respectively. \item The \code{\link[base]{split}} and \code{\link[base]{unsplit}} functions in the \pkg{base} package. \item The \code{\link[S4Vectors]{split}} methods defined in the \pkg{S4Vectors} package. \item \link[S4Vectors]{Vector}, \link[S4Vectors]{List}, \link[S4Vectors]{Rle}, and \link[S4Vectors]{DataFrame} objects in the \pkg{S4Vectors} package. \code{\link[S4Vectors]{relistToClass}} is documented in the man page for \link[S4Vectors]{List} objects. \item \link{Ranges} objects. } } \examples{ ## On an Rle object: x <- Rle(101:105, 6:2) i <- IRanges(6:10, 16:12, names=letters[1:5]) extractList(x, i) ## On a DataFrame object: df <- DataFrame(X=x, Y=LETTERS[1:20]) extractList(df, i) } \keyword{manip} IRanges/man/findOverlaps-methods.Rd0000644000175400017540000004045313175713360020303 0ustar00biocbuildbiocbuild\name{findOverlaps-methods} \alias{findOverlaps-methods} \alias{findOverlaps} \alias{findOverlaps,Ranges,Ranges-method} \alias{findOverlaps,Vector,missing-method} \alias{findOverlaps,integer,Ranges-method} \alias{findOverlaps,Views,Views-method} \alias{findOverlaps,Views,Vector-method} \alias{findOverlaps,Vector,Views-method} \alias{findOverlaps,RangesList,RangesList-method} \alias{findOverlaps,ViewsList,ViewsList-method} \alias{findOverlaps,ViewsList,Vector-method} \alias{findOverlaps,Vector,ViewsList-method} \alias{findOverlaps,RangedData,RangedData-method} \alias{findOverlaps,RangedData,RangesList-method} \alias{findOverlaps,RangesList,RangedData-method} \alias{findOverlaps,Pairs,missing-method} \alias{findOverlaps,Pairs,ANY-method} \alias{findOverlaps,ANY,Pairs-method} \alias{findOverlaps,Pairs,Pairs-method} \alias{countOverlaps} \alias{countOverlaps,Vector,Vector-method} \alias{countOverlaps,Vector,missing-method} \alias{countOverlaps,Ranges,Ranges-method} \alias{countOverlaps,RangesList,RangesList-method} \alias{countOverlaps,ViewsList,ViewsList-method} \alias{countOverlaps,ViewsList,Vector-method} \alias{countOverlaps,Vector,ViewsList-method} \alias{countOverlaps,RangedData,RangedData-method} \alias{countOverlaps,RangedData,RangesList-method} \alias{countOverlaps,RangesList,RangedData-method} \alias{overlapsAny} \alias{overlapsAny,Vector,Vector-method} \alias{overlapsAny,Vector,missing-method} \alias{overlapsAny,RangesList,RangesList-method} \alias{overlapsAny,ViewsList,ViewsList-method} \alias{overlapsAny,ViewsList,Vector-method} \alias{overlapsAny,Vector,ViewsList-method} \alias{overlapsAny,RangedData,RangedData-method} \alias{overlapsAny,RangedData,RangesList-method} \alias{overlapsAny,RangesList,RangedData-method} \alias{\%over\%} \alias{\%within\%} \alias{\%outside\%} \alias{subsetByOverlaps} \alias{subsetByOverlaps,Vector,Vector-method} \alias{subsetByOverlaps,RangedData,RangedData-method} \alias{subsetByOverlaps,RangedData,RangesList-method} \alias{subsetByOverlaps,RangesList,RangedData-method} \alias{overlapsRanges} \alias{overlapsRanges,Ranges,Ranges-method} \alias{overlapsRanges,RangesList,RangesList-method} \alias{poverlaps} \alias{poverlaps,Ranges,Ranges-method} \alias{poverlaps,Ranges,integer-method} \alias{poverlaps,integer,Ranges-method} \alias{mergeByOverlaps} \alias{findOverlapPairs} % deprecated \alias{ranges,Hits-method} \alias{ranges,HitsList-method} \title{Finding overlapping ranges} \description{ Various methods for finding/counting interval overlaps between two "range-based" objects: a query and a subject. NOTE: This man page describes the methods that operate on \link{Ranges}, \link{Views}, \link{RangesList}, or \link{ViewsList} objects. See \code{?`\link[GenomicRanges]{findOverlaps,GenomicRanges,GenomicRanges-method}`} in the \pkg{GenomicRanges} package for methods that operate on \link[GenomicRanges]{GenomicRanges} or \link[GenomicRanges]{GRangesList} objects. } \usage{ findOverlaps(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), select=c("all", "first", "last", "arbitrary"), ...) countOverlaps(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), ...) overlapsAny(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), ...) query \%over\% subject query \%within\% subject query \%outside\% subject subsetByOverlaps(x, ranges, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), invert=FALSE, ...) overlapsRanges(query, subject, hits=NULL, ...) poverlaps(query, subject, maxgap = 0L, minoverlap = 1L, type = c("any", "start", "end", "within", "equal"), ...) mergeByOverlaps(query, subject, ...) findOverlapPairs(query, subject, ...) } \arguments{ \item{query, subject, x, ranges}{ Each of them can be a \link{Ranges}, \link{Views}, \link{RangesList}, or \link{ViewsList} object. In addition, if \code{subject} or \code{ranges} is a \link{Ranges} object, \code{query} or \code{x} can be an integer vector to be converted to length-one ranges. If \code{query} (or \code{x}) is a \link{RangesList} object, then \code{subject} (or \code{ranges}) must also be a \link{RangesList} object. If both arguments are list-like objects with names, each list element from the 2nd argument is paired with the list element from the 1st argument with the matching name, if any. Otherwise, list elements are paired by position. The overlap is then computed between the pairs as described below. If \code{subject} is omitted, \code{query} is queried against itself. In this case, and only this case, the \code{drop.self} and \code{drop.redundant} arguments are allowed. By default, the result will contain hits for each range against itself, and if there is a hit from A to B, there is also a hit for B to A. If \code{drop.self} is \code{TRUE}, all self matches are dropped. If \code{drop.redundant} is \code{TRUE}, only one of A->B and B->A is returned. } \item{maxgap}{ A single integer >= -1. If \code{type} is set to \code{"any"}, \code{maxgap} is interpreted as the maximum \emph{gap} that is allowed between 2 ranges for the ranges to be considered as overlapping. The \emph{gap} between 2 ranges is the number of positions that separate them. The \emph{gap} between 2 adjacent ranges is 0. By convention when one range has its start or end strictly inside the other (i.e. non-disjoint ranges), the \emph{gap} is considered to be -1. If \code{type} is set to anything else, \code{maxgap} has a special meaning that depends on the particular \code{type}. See \code{type} below for more information. } \item{minoverlap}{ A single non-negative integer. Only ranges with a minimum of \code{minoverlap} overlapping positions are considered to be overlapping. When \code{type} is \code{"any"}, at least one of \code{maxgap} and \code{minoverlap} must be set to its default value. } \item{type}{ By default, any overlap is accepted. By specifying the \code{type} parameter, one can select for specific types of overlap. The types correspond to operations in Allen's Interval Algebra (see references). If \code{type} is \code{start} or \code{end}, the intervals are required to have matching starts or ends, respectively. Specifying \code{equal} as the type returns the intersection of the \code{start} and \code{end} matches. If \code{type} is \code{within}, the query interval must be wholly contained within the subject interval. Note that all matches must additionally satisfy the \code{minoverlap} constraint described above. The \code{maxgap} parameter has special meaning with the special overlap types. For \code{start}, \code{end}, and \code{equal}, it specifies the maximum difference in the starts, ends or both, respectively. For \code{within}, it is the maximum amount by which the subject may be wider than the query. If \code{maxgap} is set to -1 (the default), it's replaced internally by 0. } \item{select}{ If \code{query} is a \link{Ranges} or \link{Views} object: When \code{select} is \code{"all"} (the default), the results are returned as a \link[S4Vectors]{Hits} object. Otherwise the returned value is an integer vector \emph{parallel} to \code{query} (i.e. same length) containing the first, last, or arbitrary overlapping interval in \code{subject}, with \code{NA} indicating intervals that did not overlap any intervals in \code{subject}. If \code{query} is a \link{RangesList} or \link{ViewsList} object: When \code{select} is \code{"all"} (the default), the results are returned as a \link[S4Vectors]{HitsList} object. Otherwise the returned value depends on the \code{drop} argument. When \code{select != "all" && !drop}, an \link{IntegerList} is returned, where each element of the result corresponds to a space in \code{query}. When \code{select != "all" && drop}, an integer vector is returned containing indices that are offset to align with the unlisted \code{query}. } \item{invert}{ If \code{TRUE}, keep only the ranges in \code{x} that do \emph{not} overlap \code{ranges}. } \item{hits}{ The \link[S4Vectors]{Hits} or \link[S4Vectors]{HitsList} object returned by \code{findOverlaps}, or \code{NULL}. If \code{NULL} then \code{hits} is computed by calling \code{findOverlaps(query, subject, ...)} internally (the extra arguments passed to \code{overlapsRanges} are passed to \code{findOverlaps}). } \item{...}{ Further arguments to be passed to or from other methods: \itemize{ \item \code{drop}: Supported only when \code{query} is a \link{RangesList} or \link{ViewsList} object. \code{FALSE} by default. See \code{select} argument above for the details. \item \code{drop.self}, \code{drop.redundant}: When \code{subject} is omitted, the \code{drop.self} and \code{drop.redundant} arguments (both \code{FALSE} by default) are allowed. See \code{query} and \code{subject} arguments above for the details. } } } \details{ A common type of query that arises when working with intervals is finding which intervals in one set overlap those in another. The simplest approach is to call the \code{findOverlaps} function on a \link{Ranges} or other object with range information (aka "range-based object"). } \value{ For \code{findOverlaps}: see \code{select} argument above. For \code{countOverlaps}: the overlap hit count for each range in \code{query} using the specified \code{findOverlaps} parameters. For \link{RangesList} objects, it returns an \link{IntegerList} object. \code{overlapsAny} finds the ranges in \code{query} that overlap any of the ranges in \code{subject}. For \link{Ranges} or \link{Views} objects, it returns a logical vector of length equal to the number of ranges in \code{query}. For \link{RangesList} or \link{ViewsList} objects, it returns a \link{LogicalList} object where each element of the result corresponds to a space in \code{query}. \code{\%over\%} and \code{\%within\%} are convenience wrappers for the 2 most common use cases. Currently defined as \code{`\%over\%` <- function(query, subject) overlapsAny(query, subject)} and \code{`\%within\%` <- function(query, subject) overlapsAny(query, subject, type="within")}. \code{\%outside\%} is simply the inverse of \code{\%over\%}. \code{subsetByOverlaps} returns the subset of \code{x} that has an overlap hit with a range in \code{ranges} using the specified \code{findOverlaps} parameters. When \code{hits} is a \link[S4Vectors]{Hits} (or \link[S4Vectors]{HitsList}) object, \code{overlapsRanges(query, subject, hits)} returns a \link{Ranges} (or \link{RangesList}) object of the \emph{same shape} as \code{hits} holding the regions of intersection between the overlapping ranges in objects \code{query} and \code{subject}, which should be the same query and subject used in the call to \code{findOverlaps} that generated \code{hits}. \emph{Same shape} means same length when \code{hits} is a \link[S4Vectors]{Hits} object, and same length and same elementNROWS when \code{hits} is a \link[S4Vectors]{HitsList} object. \code{poverlaps} compares \code{query} and \code{subject} in parallel (like e.g., \code{pmin}) and returns a logical vector indicating whether each pair of ranges overlaps. Integer vectors are treated as width-one ranges. \code{mergeByOverlaps} computes the overlap between query and subject according to the arguments in \code{\dots}. It then extracts the corresponding hits from each object and returns a \code{DataFrame} containing one column for the query and one for the subject, as well as any \code{mcols} that were present on either object. The query and subject columns are named by quoting and deparsing the corresponding argument. \code{findOverlapPairs} is like \code{mergeByOverlaps}, except it returns a formal \code{\link[S4Vectors:Pairs-class]{Pairs}} object that provides useful downstream conveniences, such as finding the intersection of the overlapping ranges with \code{\link{pintersect}}. } \references{ Allen's Interval Algebra: James F. Allen: Maintaining knowledge about temporal intervals. In: Communications of the ACM. 26/11/1983. ACM Press. S. 832-843, ISSN 0001-0782 } \author{Michael Lawrence and Hervé Pagès} \seealso{ \itemize{ \item \link[S4Vectors]{Hits} and \link[S4Vectors]{HitsList} objects in the \pkg{S4Vectors} package for representing a set of hits between 2 vector-like or list-like objects. \item \link[GenomicRanges]{findOverlaps,GenomicRanges,GenomicRanges-method} in the \pkg{GenomicRanges} package for methods that operate on \link[GenomicRanges]{GRanges} or \link[GenomicRanges]{GRangesList} objects. \item The \link{NCList} class and constructor. \item The \link{Ranges}, \link{Views}, \link{RangesList}, and \link{ViewsList} classes. \item The \link{IntegerList} and \link{LogicalList} classes. } } \examples{ ## --------------------------------------------------------------------- ## findOverlaps() ## --------------------------------------------------------------------- query <- IRanges(c(1, 4, 9), c(5, 7, 10)) subject <- IRanges(c(2, 2, 10), c(2, 3, 12)) findOverlaps(query, subject) ## at most one hit per query findOverlaps(query, subject, select="first") findOverlaps(query, subject, select="last") findOverlaps(query, subject, select="arbitrary") ## including adjacent ranges in the result findOverlaps(query, subject, maxgap=0L) query <- IRanges(c(1, 4, 9), c(5, 7, 10)) subject <- IRanges(c(2, 2), c(5, 4)) ## one IRanges object with itself findOverlaps(query) ## single points as query subject <- IRanges(c(1, 6, 13), c(4, 9, 14)) findOverlaps(c(3L, 7L, 10L), subject, select="first") ## special overlap types query <- IRanges(c(1, 5, 3, 4), width=c(2, 2, 4, 6)) subject <- IRanges(c(1, 3, 5, 6), width=c(4, 4, 5, 4)) findOverlaps(query, subject, type="start") findOverlaps(query, subject, type="start", maxgap=1L) findOverlaps(query, subject, type="end", select="first") ov <- findOverlaps(query, subject, type="within", maxgap=1L) ov ## Using pairs to find intersection of overlapping ranges hits <- findOverlaps(query, subject) p <- Pairs(query, subject, hits=hits) pintersect(p) ## Shortcut p <- findOverlapPairs(query, subject) pintersect(p) ## --------------------------------------------------------------------- ## overlapsAny() ## --------------------------------------------------------------------- overlapsAny(query, subject, type="start") overlapsAny(query, subject, type="end") query \%over\% subject # same as overlapsAny(query, subject) query \%within\% subject # same as overlapsAny(query, subject, # type="within") ## --------------------------------------------------------------------- ## overlapsRanges() ## --------------------------------------------------------------------- ## Extract the regions of intersection between the overlapping ranges: overlapsRanges(query, subject, ov) ## --------------------------------------------------------------------- ## Using RangesList objects ## --------------------------------------------------------------------- query <- IRanges(c(1, 4, 9), c(5, 7, 10)) qpartition <- factor(c("a","a","b")) qlist <- split(query, qpartition) subject <- IRanges(c(2, 2, 10), c(2, 3, 12)) spartition <- factor(c("a","a","b")) slist <- split(subject, spartition) ## at most one hit per query findOverlaps(qlist, slist, select="first") findOverlaps(qlist, slist, select="last") findOverlaps(qlist, slist, select="arbitrary") query <- IRanges(c(1, 5, 3, 4), width=c(2, 2, 4, 6)) qpartition <- factor(c("a","a","b","b")) qlist <- split(query, qpartition) subject <- IRanges(c(1, 3, 5, 6), width=c(4, 4, 5, 4)) spartition <- factor(c("a","a","b","b")) slist <- split(subject, spartition) overlapsAny(qlist, slist, type="start") overlapsAny(qlist, slist, type="end") qlist %over% slist subsetByOverlaps(qlist, slist) countOverlaps(qlist, slist) } \keyword{methods} IRanges/man/inter-range-methods.Rd0000644000175400017540000003332713175713360020064 0ustar00biocbuildbiocbuild\name{inter-range-methods} \alias{inter-range-methods} \alias{range} \alias{range,Ranges-method} \alias{range,IPos-method} \alias{range,RangesList-method} \alias{range,CompressedIRangesList-method} \alias{range,RangedData-method} \alias{reduce} \alias{reduce,Ranges-method} \alias{reduce,Views-method} \alias{reduce,RangesList-method} \alias{reduce,CompressedIRangesList-method} \alias{gaps} \alias{gaps,Ranges-method} \alias{gaps,Views-method} \alias{gaps,RangesList-method} \alias{gaps,CompressedIRangesList-method} \alias{gaps,MaskCollection-method} \alias{disjoin} \alias{disjoin,Ranges-method} \alias{disjoin,NormalIRanges-method} \alias{disjoin,RangesList-method} \alias{disjoin,CompressedIRangesList-method} \alias{isDisjoint} \alias{isDisjoint,Ranges-method} \alias{isDisjoint,IPos-method} \alias{isDisjoint,NormalIRanges-method} \alias{isDisjoint,RangesList-method} \alias{disjointBins} \alias{disjointBins,Ranges-method} \alias{disjointBins,NormalIRanges-method} \alias{disjointBins,RangesList-method} \title{Inter range transformations of a Ranges, Views, RangesList, or MaskCollection object} \description{ Range-based transformations are grouped in 2 categories: \enumerate{ \item \emph{Intra range transformations} (e.g. \code{\link{shift}()}) transform each range individually (and independently of the other ranges). They return an object \emph{parallel} to the input object, that is, where the i-th range corresponds to the i-th range in the input. Those transformations are described in the \link{intra-range-methods} man page (see \code{?`\link{intra-range-methods}`}). \item \emph{Inter range transformations} (e.g. \code{reduce()}) transform all the ranges together as a set to produce a new set of ranges. They return an object that is generally \emph{NOT} parallel to the input object. Those transformations are described below. } } \usage{ ## range() ## ------- \S4method{range}{Ranges}(x, ..., with.revmap=FALSE, na.rm=FALSE) \S4method{range}{RangesList}(x, ..., with.revmap=FALSE, na.rm=FALSE) ## reduce() ## -------- reduce(x, drop.empty.ranges=FALSE, ...) \S4method{reduce}{Ranges}(x, drop.empty.ranges=FALSE, min.gapwidth=1L, with.revmap=FALSE, with.inframe.attrib=FALSE) \S4method{reduce}{Views}(x, drop.empty.ranges=FALSE, min.gapwidth=1L, with.revmap=FALSE, with.inframe.attrib=FALSE) \S4method{reduce}{RangesList}(x, drop.empty.ranges=FALSE, min.gapwidth=1L, with.revmap=FALSE, with.inframe.attrib=FALSE) ## gaps() ## ------ gaps(x, start=NA, end=NA) ## disjoin(), isDisjoint(), and disjointBins() ## ------------------------------------------- disjoin(x, ...) \S4method{disjoin}{Ranges}(x, with.revmap=FALSE) \S4method{disjoin}{RangesList}(x, with.revmap=FALSE) isDisjoint(x, ...) disjointBins(x, ...) } \arguments{ \item{x}{ A \link{Ranges} or \link{RangesList} object for \code{range}, \code{disjoin}, \code{isDisjoint}, and \code{disjointBins}. A \link{Ranges}, \link{Views}, or \link{RangesList} object for \code{reduce} and \code{gaps}. } \item{...}{ For \code{range}, additional \link{Ranges} or \link{RangesList} object to consider. } \item{na.rm}{ Ignored. } \item{drop.empty.ranges}{ \code{TRUE} or \code{FALSE}. Should empty ranges be dropped? } \item{min.gapwidth}{ Ranges separated by a gap of at least \code{min.gapwidth} positions are not merged. } \item{with.revmap}{ \code{TRUE} or \code{FALSE}. Should the mapping from output to input ranges be stored in the returned object? If yes, then it is stored as metadata column \code{revmap} of type \link{IntegerList}. } \item{with.inframe.attrib}{ \code{TRUE} or \code{FALSE}. For internal use. } \item{start, end}{ \itemize{ \item If \code{x} is a \link{Ranges} or \link{Views} object: A single integer or \code{NA}. Use these arguments to specify the interval of reference i.e. which interval the returned gaps should be relative to. \item If \code{x} is a \link{RangesList} object: Integer vectors containing the coordinate bounds for each \link{RangesList} top-level element. } } } \details{ Unless specified otherwise, when \code{x} is a \link{RangesList} object, any transformation described here is equivalent to applying the transformation to each \link{RangesList} top-level element separately. \subsection{reduce}{ \code{reduce} first orders the ranges in \code{x} from left to right, then merges the overlapping or adjacent ones. }\subsection{range}{ \code{range} first combines \code{x} and the arguments in \code{...}. If the combined \link{IRanges} object contains at least 1 range, then \code{range} returns an \link{IRanges} instance with a single range, from the minimum start to the maximum end of the combined object. Otherwise (i.e. if the combined object contains no range), \code{IRanges()} is returned (i.e. an \link{IRanges} instance of length 0). When passing more than 1 \link{RangesList} object to \code{range()}, they are first merged into a single \link{RangesList} object: by name if all objects have names, otherwise, if they are all of the same length, by position. Else, an exception is thrown. }\subsection{gaps}{ \code{gaps} returns the "normal" \link{Ranges} object representing the set of integers that remain after the set of integers represented by \code{x} has been removed from the interval specified by the \code{start} and \code{end} arguments. If \code{x} is a \link{Views} object, then \code{start=NA} and \code{end=NA} are interpreted as \code{start=1} and \code{end=length(subject(x))}, respectively, so, if \code{start} and \code{end} are not specified, then gaps are extracted with respect to the entire subject. }\subsection{isDisjoint}{ A \link{Ranges} object \code{x} is considered to be "disjoint" if its ranges are non-overlapping. \code{isDisjoint} tests whether the object is "disjoint" or not. Note that a "normal" \link{Ranges} object is always "disjoint" but the opposite is not true. See \code{?isNormal} for more information about normal \link{Ranges} objects. About empty ranges. \code{isDisjoint} handles empty ranges (a.k.a. zero-width ranges) as follow: single empty range A is considered to overlap with single range B iff it's contained in B without being on the edge of B (in which case it would be ambiguous whether A is contained in or adjacent to B). More precisely, single empty range A is considered to overlap with single range B iff \preformatted{ start(B) < start(A) and end(A) < end(B)} Because A is an empty range it verifies \code{end(A) = start(A) - 1} so the above is equivalent to: \preformatted{ start(B) < start(A) <= end(B)} and also equivalent to: \preformatted{ start(B) <= end(A) < end(B)} Finally, it is also equivalent to: \preformatted{ pcompare(A, B) == 2} See \code{?`\link{Ranges-comparison}`} for the meaning of the codes returned by the \code{\link{pcompare}} function. }\subsection{disjoin}{ \code{disjoin} returns a disjoint object, by finding the union of the end points in \code{x}. In other words, the result consists of a range for every interval, of maximal length, over which the set of overlapping ranges in \code{x} is the same and at least of size 1. }\subsection{disjointBins}{ \code{disjointBins} segregates \code{x} into a set of bins so that the ranges in each bin are disjoint. Lower-indexed bins are filled first. The method returns an integer vector indicating the bin index for each range. } } \value{ If \code{x} is a \link{Ranges} object: \itemize{ \item \code{range}, \code{reduce}, \code{gaps}, and \code{disjoin} return an \link{IRanges} instance. \item \code{isDisjoint} returns \code{TRUE} or \code{FALSE}. \item \code{disjointBins} returns an integer vector \emph{parallel} to \code{x}, that is, where the i-th element corresponds to the i-th element in \code{x}. } If \code{x} is a \link{Views} object: \code{reduce} and \code{gaps} return a \link{Views} object on the same subject as \code{x} but with modified views. If \code{x} is a \link{RangesList} object: \itemize{ \item \code{range}, \code{reduce}, \code{gaps}, and \code{disjoin} return a \link{RangesList} object \emph{parallel} to \code{x}. \item \code{isDisjoint} returns a logical vector \emph{parallel} to \code{x}. \item \code{disjointBins} returns an \link{IntegerList} object \emph{parallel} to \code{x}. } } \author{H. Pagès, M. Lawrence, and P. Aboyoun} \seealso{ \itemize{ \item \link{intra-range-methods} for intra range transformations. \item The \link{Ranges}, \link{Views}, \link{RangesList}, and \link{MaskCollection} classes. \item The \link[GenomicRanges]{inter-range-methods} man page in the \pkg{GenomicRanges} package for \emph{inter range transformations} of genomic ranges. \item \link{setops-methods} for set operations on \link{IRanges} objects. \item \code{\link[S4Vectors]{endoapply}} in the \pkg{S4Vectors} package. } } \examples{ ## --------------------------------------------------------------------- ## range() ## --------------------------------------------------------------------- ## On a Ranges object: x <- IRanges(start=c(-2, 6, 9, -4, 1, 0, -6, 3, 10), width=c( 5, 0, 6, 1, 4, 3, 2, 0, 3)) range(x) ## On a RangesList object (XVector package required): range1 <- IRanges(start=c(1, 2, 3), end=c(5, 2, 8)) range2 <- IRanges(start=c(15, 45, 20, 1), end=c(15, 100, 80, 5)) range3 <- IRanges(start=c(-2, 6, 7), width=c(8, 0, 0)) # with empty ranges collection <- IRangesList(one=range1, range2, range3) if (require(XVector)) { range(collection) } irl1 <- IRangesList(a=IRanges(c(1, 2),c(4, 3)), b=IRanges(c(4, 6),c(10, 7))) irl2 <- IRangesList(c=IRanges(c(0, 2),c(4, 5)), a=IRanges(c(4, 5),c(6, 7))) range(irl1, irl2) # matched by names names(irl2) <- NULL range(irl1, irl2) # now by position ## --------------------------------------------------------------------- ## reduce() ## --------------------------------------------------------------------- ## On a Ranges object: reduce(x) y <- reduce(x, with.revmap=TRUE) mcols(y)$revmap # an IntegerList reduce(x, drop.empty.ranges=TRUE) y <- reduce(x, drop.empty.ranges=TRUE, with.revmap=TRUE) mcols(y)$revmap ## Use the mapping from reduced to original ranges to split the DataFrame ## of original metadata columns by reduced range: ir0 <- IRanges(c(11:13, 2, 7:6), width=3) mcols(ir0) <- DataFrame(id=letters[1:6], score=1:6) ir <- reduce(ir0, with.revmap=TRUE) ir revmap <- mcols(ir)$revmap revmap relist(mcols(ir0)[unlist(revmap), ], revmap) # a SplitDataFrameList ## On a RangesList object. These 4 are the same: res1 <- reduce(collection) res2 <- IRangesList(one=reduce(range1), reduce(range2), reduce(range3)) res3 <- do.call(IRangesList, lapply(collection, reduce)) res4 <- endoapply(collection, reduce) stopifnot(identical(res2, res1)) stopifnot(identical(res3, res1)) stopifnot(identical(res4, res1)) reduce(collection, drop.empty.ranges=TRUE) ## --------------------------------------------------------------------- ## gaps() ## --------------------------------------------------------------------- ## On a Ranges object: x0 <- IRanges(start=c(-2, 6, 9, -4, 1, 0, -6, 10), width=c( 5, 0, 6, 1, 4, 3, 2, 3)) gaps(x0) gaps(x0, start=-6, end=20) ## On a Views object: subject <- Rle(1:-3, 6:2) v <- Views(subject, start=c(8, 3), end=c(14, 4)) gaps(v) ## On a RangesList object. These 4 are the same: res1 <- gaps(collection) res2 <- IRangesList(one=gaps(range1), gaps(range2), gaps(range3)) res3 <- do.call(IRangesList, lapply(collection, gaps)) res4 <- endoapply(collection, gaps) stopifnot(identical(res2, res1)) stopifnot(identical(res3, res1)) stopifnot(identical(res4, res1)) ## On a MaskCollection object: mask1 <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2)) mask2 <- Mask(mask.width=29, start=c(3, 10, 27), width=c(5, 8, 1)) mask3 <- Mask(mask.width=29, start=c(7, 12), width=c(2, 4)) mymasks <- append(append(mask1, mask2), mask3) mymasks gaps(mymasks) ## --------------------------------------------------------------------- ## disjoin() ## --------------------------------------------------------------------- ## On a Ranges object: ir <- IRanges(c(1, 1, 4, 10), c(6, 3, 8, 10)) disjoin(ir) # IRanges(c(1, 4, 7, 10), c(3, 6, 8, 10)) disjoin(ir, with.revmap=TRUE) ## On a RangesList object: disjoin(collection) disjoin(collection, with.revmap=TRUE) ## --------------------------------------------------------------------- ## isDisjoint() ## --------------------------------------------------------------------- ## On a Ranges object: isDisjoint(IRanges(c(2,5,1), c(3,7,3))) # FALSE isDisjoint(IRanges(c(2,9,5), c(3,9,6))) # TRUE isDisjoint(IRanges(1, 5)) # TRUE ## Handling of empty ranges: x <- IRanges(c(11, 16, 11, -2, 11), c(15, 29, 10, 10, 10)) stopifnot(isDisjoint(x)) ## Sliding an empty range along a non-empty range: sapply(11:17, function(i) pcompare(IRanges(i, width=0), IRanges(12, 15))) sapply(11:17, function(i) isDisjoint(c(IRanges(i, width=0), IRanges(12, 15)))) ## On a RangesList object: isDisjoint(collection) ## --------------------------------------------------------------------- ## disjointBins() ## --------------------------------------------------------------------- ## On a Ranges object: disjointBins(IRanges(1, 5)) # 1L disjointBins(IRanges(c(3, 1, 10), c(5, 12, 13))) # c(2L, 1L, 2L) ## On a RangesList object: disjointBins(collection) } \keyword{utilities} IRanges/man/intra-range-methods.Rd0000644000175400017540000004251013175713360020052 0ustar00biocbuildbiocbuild\name{intra-range-methods} \alias{intra-range-methods} \alias{shift} \alias{shift,Ranges-method} \alias{shift,IPos-method} \alias{shift,Views-method} \alias{shift,RangesList-method} \alias{shift,CompressedIRangesList-method} \alias{narrow} \alias{narrow,Ranges-method} \alias{narrow,Views-method} \alias{narrow,RangesList-method} \alias{narrow,CompressedIRangesList-method} \alias{narrow,MaskCollection-method} \alias{resize} \alias{resize,Ranges-method} \alias{resize,RangesList-method} \alias{resize,IntervalList-method} \alias{resize,CompressedIRangesList-method} \alias{flank} \alias{flank,Ranges-method} \alias{flank,RangesList-method} \alias{flank,CompressedIRangesList-method} \alias{promoters} \alias{promoters,Ranges-method} \alias{promoters,Views-method} \alias{promoters,RangesList-method} \alias{promoters,CompressedIRangesList-method} \alias{reflect} \alias{reflect,Ranges-method} \alias{restrict} \alias{restrict,Ranges-method} \alias{restrict,RangesList-method} \alias{restrict,CompressedIRangesList-method} \alias{threebands} \alias{threebands,IRanges-method} \alias{Ops,Ranges,numeric-method} \alias{Ops,Ranges,ANY-method} \alias{Ops,CompressedIRangesList,numeric-method} \alias{Ops,RangesList,numeric-method} \title{Intra range transformations of a Ranges, Views, RangesList, or MaskCollection object} \description{ Range-based transformations are grouped in 2 categories: \enumerate{ \item \emph{Intra range transformations} (e.g. \code{shift()}) transform each range individually (and independently of the other ranges). They return an object \emph{parallel} to the input object, that is, where the i-th range corresponds to the i-th range in the input. Those transformations are described below. \item \emph{Inter range transformations} (e.g. \code{\link{reduce}()}) transform all the ranges together as a set to produce a new set of ranges. They return an object that is generally \emph{NOT} parallel to the input object. Those transformations are described in the \link{inter-range-methods} man page (see \code{?`\link{inter-range-methods}`}). } Except for \code{threebands()}, all the transformations described in this man page are \emph{endomorphisms} that operate on a single "range-based" object, that is, they transform the ranges contained in the input object and return them in an object of the \emph{same class} as the input object. } \usage{ ## shift() shift(x, shift=0L, use.names=TRUE) ## narrow() narrow(x, start=NA, end=NA, width=NA, use.names=TRUE) ## resize() resize(x, width, fix="start", use.names=TRUE, ...) ## flank() flank(x, width, start=TRUE, both=FALSE, use.names=TRUE, ...) ## promoters() promoters(x, upstream=2000, downstream=200, ...) ## reflect() reflect(x, bounds, use.names=TRUE) ## restrict() restrict(x, start=NA, end=NA, keep.all.ranges=FALSE, use.names=TRUE) ## threebands() threebands(x, start=NA, end=NA, width=NA) } \arguments{ \item{x}{ A \link{Ranges}, \link{Views}, \link{RangesList}, or \link{MaskCollection} object. } \item{shift}{ An integer vector containing the shift information. Recycled as necessary so that each element corresponds to a range in \code{x}. It can also be an \link{IntegerList} object if \code{x} is a \link{RangesList} object. } \item{use.names}{ \code{TRUE} or \code{FALSE}. Should names be preserved? } \item{start, end}{ \itemize{ \item If \code{x} is a \link{Ranges} or \link{Views} object: A vector of integers for all functions except for \code{flank}. For \code{restrict}, the supplied \code{start} and \code{end} arguments must be vectors of integers, eventually with NAs, that specify the restriction interval(s). Recycled as necessary so that each element corresponds to a range in \code{x}. Same thing for \code{narrow} and \code{threebands}, except that here \code{start} and \code{end} must contain coordinates relative to the ranges in \code{x}. See the Details section below. For \code{flank}, \code{start} is a logical indicating whether \code{x} should be flanked at the start (\code{TRUE}) or the end (\code{FALSE}). Recycled as necessary so that each element corresponds to a range in \code{x}. \item If \code{x} is a \link{RangesList} object: For \code{flank}, \code{start} must be either a logical vector or a \link{LogicalList} object indicating whether \code{x} should be flanked at the start (\code{TRUE}) or the end (\code{FALSE}). Recycled as necessary so that each element corresponds to a range in \code{x}. For \code{narrow}, \code{start} and \code{end} must be either an integer vector or an \link{IntegerList} object containing coordinates relative to the current ranges. For \code{restrict}, \code{start} and \code{end} must be either an integer vector or an \link{IntegerList} object (possibly containing NA's). } } \item{width}{ \itemize{ \item If \code{x} is a \link{Ranges} or \link{Views} object: For \code{narrow} and \code{threebands}, a vector of integers, eventually with NAs. See the SEW (Start/End/Width) interface for the details (\code{?solveUserSEW}). For \code{resize} and \code{flank}, the width of the resized or flanking regions. Note that if \code{both} is \code{TRUE}, this is effectively doubled. Recycled as necessary so that each element corresponds to a range in \code{x}. \item If \code{x} is a \link{RangesList} object: For \code{resize} and \code{flank}, either an integer vector or an \link{IntegerList} object containing the width of the flanking or resized regions. Recycled as necessary so that each element corresponds to a range in \code{x}. (Note for \code{flank}: if \code{both} is \code{TRUE}, this is effectively doubled.) For \code{narrow}, either an integer vector or a \link{IntegerList} object containing the widths to narrow to. See the SEW (Start/End/Width) interface for the details (\code{?solveUserSEW}). } } \item{fix}{ \itemize{ \item If \code{x} is a \link{Ranges} or \link{Views} object: A character vector or character-Rle of length 1 or \code{length(x)} containing the values \code{"start"}, \code{"end"}, and \code{"center"} denoting what to use as an anchor for each element in \code{x}. \item If \code{x} is a \link{RangesList} object: A character vector of length 1, a \link{CharacterList} object, or a character-RleList object containing the values \code{"start"}, \code{"end"}, and \code{"center"} denoting what to use as an anchor for each element in \code{x}. } } \item{...}{ Additional arguments for methods. } \item{both}{ If \code{TRUE}, extends the flanking region \code{width} positions \emph{into} the range. The resulting range thus straddles the end point, with \code{width} positions on either side. } \item{upstream, downstream}{ Single \code{integer} values >= 0L. \code{upstream} defines the number of nucleotides toward the 5' end and \code{downstream} defines the number toward the 3' end, relative to the transcription start site. Promoter regions are formed by merging the upstream and downstream ranges. Default values for \code{upstream} and \code{downstream} were chosen based on our current understanding of gene regulation. On average, promoter regions in the mammalian genome are 5000 bp upstream and downstream of the transcription start site. } \item{bounds}{ An \link{IRanges} object to serve as the reference bounds for the reflection, see below. } \item{keep.all.ranges}{ \code{TRUE} or \code{FALSE}. Should ranges that don't overlap with the restriction interval(s) be kept? Note that "don't overlap" means that they end strictly before \code{start - 1} or start strictly after \code{end + 1}. Ranges that end at \code{start - 1} or start at \code{end + 1} are always kept and their width is set to zero in the returned \link{IRanges} object. } } \details{ Unless specified otherwise, when \code{x} is a \link{RangesList} object, any transformation described here is equivalent to applying the transformation to each \link{RangesList} top-level element separately. \subsection{shift}{ \code{shift} shifts all the ranges in \code{x} by the amount specified by the \code{shift} argument. }\subsection{narrow}{ \code{narrow} narrows the ranges in \code{x} i.e. each range in the returned \link{Ranges} object is a subrange of the corresponding range in \code{x}. The supplied start/end/width values are solved by a call to \code{solveUserSEW(width(x), start=start, end=end, width=width)} and therefore must be compliant with the rules of the SEW (Start/End/Width) interface (see \code{?\link{solveUserSEW}} for the details). Then each subrange is derived from the original range according to the solved start/end/width values for this range. Note that those solved values are interpreted relatively to the original range. }\subsection{resize}{ \code{resize} resizes the ranges to the specified width where either the start, end, or center is used as an anchor. }\subsection{flank}{ \code{flank} generates flanking ranges for each range in \code{x}. If \code{start} is \code{TRUE} for a given range, the flanking occurs at the start, otherwise the end. The widths of the flanks are given by the \code{width} parameter. The widths can be negative, in which case the flanking region is reversed so that it represents a prefix or suffix of the range in \code{x}. The \code{flank} operation is illustrated below for a call of the form \code{flank(x, 3, TRUE)}, where \code{x} indicates a range in \code{x} and \code{-} indicates the resulting flanking region: \preformatted{ ---xxxxxxx} If \code{start} were \code{FALSE}: \preformatted{ xxxxxxx---} For negative width, i.e. \code{flank(x, -3, FALSE)}, where \code{*} indicates the overlap between \code{x} and the result: \preformatted{ xxxx***} If \code{both} is \code{TRUE}, then, for all ranges in \code{x}, the flanking regions are extended \emph{into} (or out of, if width is negative) the range, so that the result straddles the given endpoint and has twice the width given by \code{width}. This is illustrated below for \code{flank(x, 3, both=TRUE)}: \preformatted{ ---***xxxx} }\subsection{promoters}{ \code{promoters} generates promoter ranges for each range in \code{x} relative to the transcription start site (TSS), where TSS is \code{start(x)}. The promoter range is expanded around the TSS according to the \code{upsteam} and \code{downstream} arguments. \code{upstream} represents the number of nucleotides in the 5' direction and \code{downstream} the number in the 3' direction. The full range is defined as, (start(x) - upstream) to (start(x) + downstream - 1). For documentation for using \code{promoters} on a \link[GenomicRanges]{GRanges} object see \code{?`\link[GenomicRanges]{promoters,GenomicRanges-method}`} in the \pkg{GenomicRanges} package. }\subsection{reflect}{ \code{reflect} "reflects" or reverses each range in \code{x} relative to the corresponding range in \code{bounds}, which is recycled as necessary. Reflection preserves the width of a range, but shifts it such the distance from the left bound to the start of the range becomes the distance from the end of the range to the right bound. This is illustrated below, where \code{x} represents a range in \code{x} and \code{[} and \code{]} indicate the bounds: \preformatted{ [..xxx.....] becomes [.....xxx..]} }\subsection{restrict}{ \code{restrict} restricts the ranges in \code{x} to the interval(s) specified by the \code{start} and \code{end} arguments. }\subsection{threebands}{ \code{threebands} extends the capability of \code{narrow} by returning the 3 ranges objects associated to the narrowing operation. The returned value \code{y} is a list of 3 ranges objects named \code{"left"}, \code{"middle"} and \code{"right"}. The middle component is obtained by calling \code{narrow} with the same arguments (except that names are dropped). The left and right components are also instances of the same class as \code{x} and they contain what has been removed on the left and right sides (respectively) of the original ranges during the narrowing. Note that original object \code{x} can be reconstructed from the left and right bands with \code{punion(y$left, y$right, fill.gap=TRUE)}. } } \author{H. Pagès, M. Lawrence, and P. Aboyoun} \seealso{ \itemize{ \item \link{inter-range-methods} for inter range transformations. \item The \link{Ranges}, \link{Views}, \link{RangesList}, and \link{MaskCollection} classes. \item The \link[GenomicRanges]{intra-range-methods} man page in the \pkg{GenomicRanges} package for \emph{intra range transformations} of genomic ranges. \item \link{setops-methods} for set operations on \link{IRanges} objects. \item \code{\link[S4Vectors]{endoapply}} in the \pkg{S4Vectors} package. } } \examples{ ## --------------------------------------------------------------------- ## shift() ## --------------------------------------------------------------------- ## On a Ranges object: ir1 <- successiveIRanges(c(19, 5, 0, 8, 5)) ir1 shift(ir1, shift=-3) ## On a RangesList object: range1 <- IRanges(start=c(1, 2, 3), end=c(5, 2, 8)) range2 <- IRanges(start=c(15, 45, 20, 1), end=c(15, 100, 80, 5)) range3 <- IRanges(start=c(-2, 6, 7), width=c(8, 0, 0)) # with empty ranges collection <- IRangesList(one=range1, range2, range3) shift(collection, shift=5) # same as endoapply(collection, shift, shift=5) ## Sanity check: res1 <- shift(collection, shift=5) res2 <- endoapply(collection, shift, shift=5) stopifnot(identical(res1, res2)) ## --------------------------------------------------------------------- ## narrow() ## --------------------------------------------------------------------- ## On a Ranges object: ir2 <- ir1[width(ir1) != 0] narrow(ir2, start=4, end=-2) narrow(ir2, start=-4, end=-2) narrow(ir2, end=5, width=3) narrow(ir2, start=c(3, 4, 2, 3), end=c(12, 5, 7, 4)) ## On a RangesList object: narrow(collection[-3], start=2) narrow(collection[-3], end=-2) ## On a MaskCollection object: mask1 <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2)) mask2 <- Mask(mask.width=29, start=c(3, 10, 27), width=c(5, 8, 1)) mask3 <- Mask(mask.width=29, start=c(7, 12), width=c(2, 4)) mymasks <- append(append(mask1, mask2), mask3) mymasks narrow(mymasks, start=8) ## --------------------------------------------------------------------- ## resize() ## --------------------------------------------------------------------- ## On a Ranges object: resize(ir2, 200) resize(ir2, 2, fix="end") ## On a RangesList object: resize(collection, width=200) ## --------------------------------------------------------------------- ## flank() ## --------------------------------------------------------------------- ## On a Ranges object: ir3 <- IRanges(c(2,5,1), c(3,7,3)) flank(ir3, 2) flank(ir3, 2, start=FALSE) flank(ir3, 2, start=c(FALSE, TRUE, FALSE)) flank(ir3, c(2, -2, 2)) flank(ir3, 2, both = TRUE) flank(ir3, 2, start=FALSE, both=TRUE) flank(ir3, -2, start=FALSE, both=TRUE) ## On a RangesList object: flank(collection, width=10) ## --------------------------------------------------------------------- ## promoters() ## --------------------------------------------------------------------- ## On a Ranges object: ir4 <- IRanges(20:23, width=3) promoters(ir4, upstream=0, downstream=0) ## no change promoters(ir4, upstream=0, downstream=1) ## start value only promoters(ir4, upstream=1, downstream=0) ## single upstream nucleotide ## On a RangesList object: promoters(collection, upstream=5, downstream=2) ## --------------------------------------------------------------------- ## reflect() ## --------------------------------------------------------------------- ## On a Ranges object: bounds <- IRanges(c(0, 5, 3), c(10, 6, 9)) reflect(ir3, bounds) ## reflect() does not yet support RangesList objects! ## --------------------------------------------------------------------- ## restrict() ## --------------------------------------------------------------------- ## On a Ranges object: restrict(ir1, start=12, end=34) restrict(ir1, start=20) restrict(ir1, start=21) restrict(ir1, start=21, keep.all.ranges=TRUE) ## On a RangesList object: restrict(collection, start=2, end=8) ## --------------------------------------------------------------------- ## threebands() ## --------------------------------------------------------------------- ## On a Ranges object: z <- threebands(ir2, start=4, end=-2) ir2b <- punion(z$left, z$right, fill.gap=TRUE) stopifnot(identical(ir2, ir2b)) threebands(ir2, start=-5) ## threebands() does not support RangesList objects. } \keyword{utilities} IRanges/man/multisplit.Rd0000644000175400017540000000160013175713360016403 0ustar00biocbuildbiocbuild\name{multisplit} \alias{multisplit} \title{ Split elements belonging to multiple groups } \description{ This is like \code{\link{split}}, except elements can belong to multiple groups, in which case they are repeated to appear in multiple elements of the return value. } \usage{ multisplit(x, f) } \arguments{ \item{x}{ The object to split, like a vector. } \item{f}{ A list-like object of vectors, the same length as \code{x}, where each element indicates the groups to which each element of \code{x} belongs. } } \value{ A list-like object, with an element for each unique value in the unlisted \code{f}, containing the elements in \code{x} where the corresponding element in \code{f} contained that value. Just try it. } \author{ Michael Lawrence } \examples{ multisplit(1:3, list(letters[1:2], letters[2:3], letters[2:4])) } \keyword{ manip } IRanges/man/nearest-methods.Rd0000644000175400017540000001776413175713360017321 0ustar00biocbuildbiocbuild\name{nearest-methods} \alias{nearest-methods} \alias{class:RangesORmissing} \alias{RangesORmissing-class} \alias{RangesORmissing} \alias{nearest} \alias{precede} \alias{follow} \alias{distance} \alias{distanceToNearest} \alias{selectNearest} \alias{nearest,Ranges,RangesORmissing-method} \alias{precede,Ranges,RangesORmissing-method} \alias{follow,Ranges,RangesORmissing-method} \alias{distance,Ranges,Ranges-method} \alias{distance,Pairs,missing-method} \alias{distanceToNearest,Ranges,RangesORmissing-method} \title{Finding the nearest range neighbor} \description{ The \code{nearest}, \code{precede}, \code{follow}, \code{distance} and \code{distanceToNearest} methods for \code{\linkS4class{Ranges}} objects and subclasses. } \usage{ \S4method{nearest}{Ranges,RangesORmissing}(x, subject, select = c("arbitrary", "all")) \S4method{precede}{Ranges,RangesORmissing}(x, subject, select = c("first", "all")) \S4method{follow}{Ranges,RangesORmissing}(x, subject, select = c("last", "all")) \S4method{distanceToNearest}{Ranges,RangesORmissing}(x, subject, select = c("arbitrary", "all")) \S4method{distance}{Ranges,Ranges}(x, y) \S4method{distance}{Pairs,missing}(x, y) } \arguments{ \item{x}{The query \code{\linkS4class{Ranges}} instance, or (for \code{distance()}) a \code{\linkS4class{Pairs}} containing both the query (first) and subject (second). } \item{subject}{The subject \code{Ranges} instance, within which the nearest neighbors are found. Can be missing, in which case \code{x} is also the subject. } \item{select}{Logic for handling ties. By default, all the methods select a single interval (arbitrary for \code{nearest},the first by order in \code{subject} for \code{precede}, and the last for \code{follow}). To get all matchings, as a \code{Hits} object, use \dQuote{all}. } \item{y}{For the \code{distance} method, a \code{Ranges} instance. Cannot be missing. If \code{x} and \code{y} are not the same length, the shortest will be recycled to match the length of the longest. } \item{hits}{The hits between \code{x} and \code{subject}} \item{...}{Additional arguments for methods} } \details{ \itemize{ \item{nearest: }{ The conventional nearest neighbor finder. Returns a integer vector containing the index of the nearest neighbor range in \code{subject} for each range in \code{x}. If there is no nearest neighbor (if \code{subject} is empty), NA's are returned. Here is roughly how it proceeds, for a range \code{xi} in \code{x}: \enumerate{ \item Find the ranges in \code{subject} that overlap \code{xi}. If a single range \code{si} in \code{subject} overlaps \code{xi}, \code{si} is returned as the nearest neighbor of \code{xi}. If there are multiple overlaps, one of the overlapping ranges is chosen arbitrarily. \item If no ranges in \code{subject} overlap with \code{xi}, then the range in \code{subject} with the shortest distance from its end to the start \code{xi} or its start to the end of \code{xi} is returned. } } \item{precede: }{ For each range in \code{x}, \code{precede} returns the index of the interval in \code{subject} that is directly preceded by the query range. Overlapping ranges are excluded. \code{NA} is returned when there are no qualifying ranges in \code{subject}. } \item{follow: }{ The opposite of \code{precede}, this function returns the index of the range in \code{subject} that a query range in \code{x} directly follows. Overlapping ranges are excluded. \code{NA} is returned when there are no qualifying ranges in \code{subject}. } \item{distanceToNearest: }{ Returns the distance for each range in \code{x} to its nearest neighbor in \code{subject}. } \item{distance: }{ Returns the distance for each range in \code{x} to the range in \code{y}. The \code{distance} method differs from others documented on this page in that it is symmetric; \code{y} cannot be missing. If \code{x} and \code{y} are not the same length, the shortest will be recycled to match the length of the longest. The \code{select} argument is not available for \code{distance} because comparisons are made in a pair-wise fashion. The return value is the length of the longest of \code{x} and \code{y}. The \code{distance} calculation changed in BioC 2.12 to accommodate zero-width ranges in a consistent and intuitive manner. The new distance can be explained by a \emph{block} model where a range is represented by a series of blocks of size 1. Blocks are adjacent to each other and there is no gap between them. A visual representation of IRanges(4,7) would be \preformatted{ +-----+-----+-----+-----+ 4 5 6 7 } The distance between two consecutive blocks is 0L (prior to Bioconductor 2.12 it was 1L). The new distance calculation now returns the size of the gap between two ranges. This change to distance affects the notion of overlaps in that we no longer say: x and y overlap <=> distance(x, y) == 0 Instead we say x and y overlap => distance(x, y) == 0 or x and y overlap or are adjacent <=> distance(x, y) == 0 } \item{selectNearest: }{ Selects the hits that have the minimum distance within those for each query range. Ties are possible and can be broken with \code{\link[S4Vectors]{breakTies}}. } } } \value{ For \code{nearest}, \code{precede} and \code{follow}, an integer vector of indices in \code{subject}, or a \code{\linkS4class{Hits}} if \code{select="all"}. For \code{distanceToNearest}, a \code{Hits} object with an elementMetadata column of the \code{distance} between the pair. Access \code{distance} with \code{mcols} accessor. For \code{distance}, an integer vector of distances between the ranges in \code{x} and \code{y}. For \code{selectNearest}, a \code{\linkS4class{Hits}} object, sorted by query. } \author{M. Lawrence} \seealso{ \itemize{ \item The \link{Ranges} and \link{Hits} classes. \item The \link[GenomicRanges]{GenomicRanges} and \link[GenomicRanges]{GRanges} classes in the GenomicRanges package. \item \code{\link{findOverlaps}} for finding just the overlapping ranges. \item{}{ GenomicRanges methods for \itemize{ \item \code{precede} \item \code{follow} \item \code{nearest} \item \code{distance} \item \code{distanceToNearest} } are documented at ?\code{\link[GenomicRanges]{nearest-methods}} or ?\code{\link[GenomicRanges]{precede,GenomicRanges,GenomicRanges-method}} } } } \examples{ ## ------------------------------------------ ## precede() and follow() ## ------------------------------------------ query <- IRanges(c(1, 3, 9), c(3, 7, 10)) subject <- IRanges(c(3, 2, 10), c(3, 13, 12)) precede(query, subject) # c(3L, 3L, NA) precede(IRanges(), subject) # integer() precede(query, IRanges()) # rep(NA_integer_, 3) precede(query) # c(3L, 3L, NA) follow(query, subject) # c(NA, NA, 1L) follow(IRanges(), subject) # integer() follow(query, IRanges()) # rep(NA_integer_, 3) follow(query) # c(NA, NA, 2L) ## ------------------------------------------ ## nearest() ## ------------------------------------------ query <- IRanges(c(1, 3, 9), c(2, 7, 10)) subject <- IRanges(c(3, 5, 12), c(3, 6, 12)) nearest(query, subject) # c(1L, 1L, 3L) nearest(query) # c(2L, 1L, 2L) ## ------------------------------------------ ## distance() ## ------------------------------------------ ## adjacent distance(IRanges(1,5), IRanges(6,10)) # 0L ## overlap distance(IRanges(1,5), IRanges(3,7)) # 0L ## zero-width sapply(-3:3, function(i) distance(shift(IRanges(4,3), i), IRanges(4,3))) } \keyword{utilities} IRanges/man/range-squeezers.Rd0000644000175400017540000000701613175713360017324 0ustar00biocbuildbiocbuild\name{range-squeezers} \alias{range-squeezers} \alias{ranges} \alias{rglist} \alias{rglist,Pairs-method} \title{Squeeze the ranges out of a range-based object} \description{ S4 generic functions for squeezing the ranges out of a range-based object. These are analog to range squeezers \code{\link[GenomicRanges]{granges}} and \code{\link[GenomicRanges]{grglist}} defined in the \pkg{GenomicRanges} package, except that \code{ranges} returns the ranges in an \link{IRanges} object (instead of a \link[GenomicRanges]{GRanges} object for \code{\link[GenomicRanges]{granges}}), and \code{rglist} returns them in an \link{IRangesList} object (instead of a \link[GenomicRanges]{GRangesList} object for \code{\link[GenomicRanges]{grglist}}). } \usage{ ranges(x, use.names=TRUE, use.mcols=FALSE, ...) rglist(x, use.names=TRUE, use.mcols=FALSE, ...) } \arguments{ \item{x}{ An object containing ranges e.g. a \link{Ranges}, \link[GenomicRanges]{GenomicRanges}, \link[SummarizedExperiment]{RangedSummarizedExperiment}, \link[GenomicAlignments]{GAlignments}, \link[GenomicAlignments]{GAlignmentPairs}, or \link[GenomicAlignments]{GAlignmentsList} object, or a \link[S4Vectors]{Pairs} object containing ranges. } \item{use.names}{ \code{TRUE} (the default) or \code{FALSE}. Whether or not the names on \code{x} (accessible with \code{names(x)}) should be propagated to the returned object. } \item{use.mcols}{ \code{TRUE} or \code{FALSE} (the default). Whether or not the metadata columns on \code{x} (accessible with \code{mcols(x)}) should be propagated to the returned object. } \item{...}{ Additional arguments, for use in specific methods. } } \details{ Various packages (e.g. \pkg{IRanges}, \pkg{GenomicRanges}, \pkg{SummarizedExperiment}, \pkg{GenomicAlignments}, etc...) define and document various range squeezing methods for various types of objects. Note that these functions can be seen as \emph{object getters} or as functions performing coercion. For some objects (e.g. \link[GenomicAlignments]{GAlignments} and \link[GenomicAlignments]{GAlignmentPairs} objects defined in the \pkg{GenomicAlignments} package), \code{as(x, "IRanges")} and \code{as(x, "IRangesList")}, are equivalent to \code{ranges(x, use.names=TRUE, use.mcols=TRUE)} and \code{rglist(x, use.names=TRUE, use.mcols=TRUE)}, respectively. } \value{ An \link{IRanges} object for \code{ranges}. An \link{IRangesList} object for \code{rglist}. If \code{x} is a vector-like object (e.g. \link[GenomicAlignments]{GAlignments}), the returned object is expected to be \emph{parallel} to \code{x}, that is, the i-th element in the output corresponds to the i-th element in the input. If \code{use.names} is TRUE, then the names on \code{x} (if any) are propagated to the returned object. If \code{use.mcols} is TRUE, then the metadata columns on \code{x} (if any) are propagated to the returned object. } \author{H. Pagès} \seealso{ \itemize{ \item \link{IRanges} and \link{IRangesList} objects. \item \link[SummarizedExperiment]{RangedSummarizedExperiment} objects in the \pkg{SummarizedExperiment} packages. \item \link[GenomicAlignments]{GAlignments}, \link[GenomicAlignments]{GAlignmentPairs}, and \link[GenomicAlignments]{GAlignmentsList} objects in the \pkg{GenomicAlignments} package. } } \examples{ ## See ?GAlignments in the GenomicAlignments package for examples of ## "ranges" and "rglist" methods. } \keyword{methods} IRanges/man/read.Mask.Rd0000644000175400017540000001450313175713360016010 0ustar00biocbuildbiocbuild\name{read.Mask} \alias{read.Mask} \alias{read.agpMask} \alias{read.gapMask} \alias{read.liftMask} \alias{read.rmMask} \alias{read.trfMask} \title{Read a mask from a file} \description{ \code{read.agpMask} and \code{read.gapMask} extract the AGAPS mask from an NCBI "agp" file or a UCSC "gap" file, respectively. \code{read.liftMask} extracts the AGAPS mask from a UCSC "lift" file (i.e. a file containing offsets of contigs within sequences). \code{read.rmMask} extracts the RM mask from a RepeatMasker .out file. \code{read.trfMask} extracts the TRF mask from a Tandem Repeats Finder .bed file. } \usage{ read.agpMask(file, seqname="?", mask.width=NA, gap.types=NULL, use.gap.types=FALSE) read.gapMask(file, seqname="?", mask.width=NA, gap.types=NULL, use.gap.types=FALSE) read.liftMask(file, seqname="?", mask.width=NA) read.rmMask(file, seqname="?", mask.width=NA, use.IDs=FALSE) read.trfMask(file, seqname="?", mask.width=NA) } \arguments{ \item{file}{ Either a character string naming a file or a connection open for reading. } \item{seqname}{ The name of the sequence for which the mask must be extracted. If no sequence is specified (i.e. \code{seqname="?"}) then an error is raised and the sequence names found in the file are displayed. If the file doesn't contain any information for the specified sequence, then a warning is issued and an empty mask of width \code{mask.width} is returned. } \item{mask.width}{ The width of the mask to return i.e. the length of the sequence this mask will be put on. See \code{?`\link{MaskCollection-class}`} for more information about the width of a \link{MaskCollection} object. } \item{gap.types}{ \code{NULL} or a character vector containing gap types. Use this argument to filter the assembly gaps that are to be extracted from the "agp" or "gap" file based on their type. Most common gap types are \code{"contig"}, \code{"clone"}, \code{"centromere"}, \code{"telomere"}, \code{"heterochromatin"}, \code{"short_arm"} and \code{"fragment"}. With \code{gap.types=NULL}, all the assembly gaps described in the file are extracted. With \code{gap.types="?"}, an error is raised and the gap types found in the file for the specified sequence are displayed. } \item{use.gap.types}{ Whether or not the gap types provided in the "agp" or "gap" file should be used to name the ranges constituing the returned mask. See \code{?`\link{IRanges-class}`} for more information about the names of an \link{IRanges} object. } \item{use.IDs}{ Whether or not the repeat IDs provided in the RepeatMasker .out file should be used to name the ranges constituing the returned mask. See \code{?`\link{IRanges-class}`} for more information about the names of an \link{IRanges} object. } } \seealso{ \link{MaskCollection-class}, \link{IRanges-class} } \examples{ ## --------------------------------------------------------------------- ## A. Extract a mask of assembly gaps ("AGAPS" mask) with read.agpMask() ## --------------------------------------------------------------------- ## Note: The hs_b36v3_chrY.agp file was obtained by downloading, ## extracting and renaming the hs_ref_chrY.agp.gz file from ## ## ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/Assembled_chromosomes/ ## hs_ref_chrY.agp.gz 5 KB 24/03/08 04:33:00 PM ## ## on May 9, 2008. chrY_length <- 57772954 file1 <- system.file("extdata", "hs_b36v3_chrY.agp", package="IRanges") mask1 <- read.agpMask(file1, seqname="chrY", mask.width=chrY_length, use.gap.types=TRUE) mask1 mask1[[1]] mask11 <- read.agpMask(file1, seqname="chrY", mask.width=chrY_length, gap.types=c("centromere", "heterochromatin")) mask11[[1]] ## --------------------------------------------------------------------- ## B. Extract a mask of assembly gaps ("AGAPS" mask) with read.liftMask() ## --------------------------------------------------------------------- ## Note: The hg18liftAll.lft file was obtained by downloading, ## extracting and renaming the liftAll.zip file from ## ## http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/ ## liftAll.zip 03-Feb-2006 11:35 5.5K ## ## on May 8, 2008. file2 <- system.file("extdata", "hg18liftAll.lft", package="IRanges") mask2 <- read.liftMask(file2, seqname="chr1") mask2 if (interactive()) { ## contigs 7 and 8 for chrY are adjacent read.liftMask(file2, seqname="chrY") ## displays the sequence names found in the file read.liftMask(file2) ## specify an unknown sequence name read.liftMask(file2, seqname="chrZ", mask.width=300) } ## --------------------------------------------------------------------- ## C. Extract a RepeatMasker ("RM") or Tandem Repeats Finder ("TRF") ## mask with read.rmMask() or read.trfMask() ## --------------------------------------------------------------------- ## Note: The ce2chrM.fa.out and ce2chrM.bed files were obtained by ## downloading, extracting and renaming the chromOut.zip and ## chromTrf.zip files from ## ## http://hgdownload.cse.ucsc.edu/goldenPath/ce2/bigZips/ ## chromOut.zip 21-Apr-2004 09:05 2.6M ## chromTrf.zip 21-Apr-2004 09:07 182K ## ## on May 7, 2008. ## Before you can extract a mask with read.rmMask() or read.trfMask(), you ## need to know the length of the sequence that you're going to put the ## mask on: if (interactive()) { library(BSgenome.Celegans.UCSC.ce2) chrM_length <- seqlengths(Celegans)[["chrM"]] ## Read the RepeatMasker .out file for chrM in ce2: file3 <- system.file("extdata", "ce2chrM.fa.out", package="IRanges") RMmask <- read.rmMask(file3, seqname="chrM", mask.width=chrM_length) RMmask ## Read the Tandem Repeats Finder .bed file for chrM in ce2: file4 <- system.file("extdata", "ce2chrM.bed", package="IRanges") TRFmask <- read.trfMask(file4, seqname="chrM", mask.width=chrM_length) TRFmask desc(TRFmask) <- paste(desc(TRFmask), "[period<=12]") TRFmask ## Put the 2 masks on chrM: chrM <- Celegans$chrM masks(chrM) <- RMmask # this would drop all current masks, if any masks(chrM) <- append(masks(chrM), TRFmask) chrM } } \keyword{manip} IRanges/man/reverse-methods.Rd0000644000175400017540000000324613175713360017321 0ustar00biocbuildbiocbuild\name{reverse} \alias{reverse} \alias{reverse,character-method} \alias{reverse,IRanges-method} \alias{reverse,NormalIRanges-method} \alias{reverse,Views-method} \alias{reverse,MaskCollection-method} \title{reverse} \description{ A generic function for reversing vector-like or list-like objects. This man page describes methods for reversing a character vector, a \link{Views} object, or a \link{MaskCollection} object. Note that \code{reverse} is similar to but not the same as \code{\link[base]{rev}}. } \usage{ reverse(x, ...) } \arguments{ \item{x}{ A vector-like or list-like object. } \item{...}{ Additional arguments to be passed to or from methods. } } \details{ On a character vector or a \link{Views} object, \code{reverse} reverses each element individually, without modifying the top-level order of the elements. More precisely, each individual string of a character vector is reversed. } \value{ An object of the same class and length as the original object. } \seealso{ \link[XVector]{reverse-methods}, \link{Views-class}, \link{MaskCollection-class}, \code{\link{endoapply}}, \code{\link[base]{rev}} } \examples{ ## On a character vector: reverse(c("Hi!", "How are you?")) rev(c("Hi!", "How are you?")) ## On a Views object: v <- successiveViews(Rle(c(-0.5, 12.3, 4.88), 4:2), 1:4) v reverse(v) rev(v) ## On a MaskCollection object: mask1 <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2)) mask2 <- Mask(mask.width=29, start=c(3, 10, 27), width=c(5, 8, 1)) mask3 <- Mask(mask.width=29, start=c(7, 12), width=c(2, 4)) mymasks <- append(append(mask1, mask2), mask3) reverse(mymasks) } \keyword{methods} \keyword{manip} IRanges/man/seqapply.Rd0000644000175400017540000000236013175713360016037 0ustar00biocbuildbiocbuild\name{seqapply} \alias{unsplit,List-method} \alias{split<-,Vector-method} \title{ 2 methods that should be documented somewhere else } \description{ \code{unsplit} method for \link{List} object and \code{split<-} method for \link{Vector} object. } \usage{ \S4method{unsplit}{List}(value, f, drop = FALSE) \S4method{split}{Vector}(x, f, drop = FALSE, ...) <- value } \arguments{ \item{value}{ The \link{List} object to unsplit. } \item{f}{ A \code{factor} or \code{list} of factors } \item{drop}{ Whether to drop empty elements from the returned list } \item{x}{ Like \code{X} } \item{\dots}{ Extra arguments to pass to \code{FUN} } } \details{ \code{unsplit} unlists \code{value}, where the order of the returned vector is as if \code{value} were originally created by splitting that vector on the factor \code{f}. \code{split(x, f, drop = FALSE) <- value}: Virtually splits \code{x} by the factor \code{f}, replaces the elements of the resulting list with the elements from the list \code{value}, and restores \code{x} to its original form. Note that this works for any \code{Vector}, even though \code{split} itself is not universally supported. } \author{ Michael Lawrence } \keyword{manip} IRanges/man/setops-methods.Rd0000644000175400017540000001576713175713360017176 0ustar00biocbuildbiocbuild\name{setops-methods} \alias{setops-methods} \alias{union} \alias{union,Ranges,Ranges-method} \alias{union,RangesList,RangesList-method} \alias{union,CompressedIRangesList,CompressedIRangesList-method} \alias{union,Pairs,missing-method} \alias{intersect} \alias{intersect,Ranges,Ranges-method} \alias{intersect,RangesList,RangesList-method} \alias{intersect,CompressedIRangesList,CompressedIRangesList-method} \alias{intersect,Pairs,missing-method} \alias{setdiff} \alias{setdiff,Ranges,Ranges-method} \alias{setdiff,RangesList,RangesList-method} \alias{setdiff,CompressedIRangesList,CompressedIRangesList-method} \alias{setdiff,Pairs,missing-method} \alias{punion} \alias{punion,Ranges,Ranges-method} \alias{punion,Pairs,missing-method} \alias{pintersect} \alias{pintersect,Ranges,Ranges-method} \alias{pintersect,Pairs,missing-method} \alias{psetdiff} \alias{psetdiff,Ranges,Ranges-method} \alias{psetdiff,Pairs,missing-method} \alias{pgap} \alias{pgap,Ranges,Ranges-method} \title{Set operations on Ranges and RangesList objects} \description{ Performs set operations on \link{Ranges} and \link{RangesList} objects. } \usage{ ## Vector-wise set operations ## -------------------------- \S4method{union}{Ranges,Ranges}(x, y) \S4method{union}{Pairs,missing}(x, y, ...) \S4method{intersect}{Ranges,Ranges}(x, y) \S4method{intersect}{Pairs,missing}(x, y, ...) \S4method{setdiff}{Ranges,Ranges}(x, y) \S4method{setdiff}{Pairs,missing}(x, y, ...) ## Element-wise (aka "parallel") set operations ## -------------------------------------------- \S4method{punion}{Ranges,Ranges}(x, y, fill.gap=FALSE) \S4method{punion}{Pairs,missing}(x, y, ...) \S4method{pintersect}{Ranges,Ranges}(x, y, resolve.empty=c("none", "max.start", "start.x")) \S4method{pintersect}{Pairs,missing}(x, y, ...) \S4method{psetdiff}{Ranges,Ranges}(x, y) \S4method{psetdiff}{Pairs,missing}(x, y, ...) \S4method{pgap}{Ranges,Ranges}(x, y) } \arguments{ \item{x, y}{ Objects representing ranges. } \item{fill.gap}{ Logical indicating whether or not to force a union by using the rule \code{start = min(start(x), start(y)), end = max(end(x), end(y))}. } \item{resolve.empty}{ One of \code{"none"}, \code{"max.start"}, or \code{"start.x"} denoting how to handle ambiguous empty ranges formed by intersections. \code{"none"} - throw an error if an ambiguous empty range is formed, \code{"max.start"} - associate the maximum start value with any ambiguous empty range, and \code{"start.x"} - associate the start value of \code{x} with any ambiguous empty range. (See Details section below for the definition of an ambiguous range.) } \item{...}{ The methods for \link[S4Vectors]{Pairs} objects pass any extra argument to the internal call to \code{punion(first(x), last(x), ...)}, \code{pintersect(first(x), last(x), ...)}, etc... } } \details{ The \code{union}, \code{intersect} and \code{setdiff} methods for \link{Ranges} objects return a "normal" \link{Ranges} object representing the union, intersection and (asymmetric!) difference of the sets of integers represented by \code{x} and \code{y}. \code{punion}, \code{pintersect}, \code{psetdiff} and \code{pgap} are generic functions that compute the element-wise (aka "parallel") union, intersection, (asymmetric!) difference and gap between each element in \code{x} and its corresponding element in \code{y}. Methods for \link{Ranges} objects are defined. For these methods, \code{x} and \code{y} must have the same length (i.e. same number of ranges). They return a \link{Ranges} object \emph{parallel} to \code{x} and \code{y} i.e. where the i-th range corresponds to the i-th range in \code{x} and in\code{y}) and represents the union/intersection/difference/gap of/between the corresponding \code{x[i]} and \code{y[i]}. If \code{x} is a \code{\link[S4Vectors]{Pairs}} object, then \code{y} should be missing, and the operation is performed between the members of each pair. By default, \code{pintersect} will throw an error when an "ambiguous empty range" is formed. An ambiguous empty range can occur three different ways: 1) when corresponding non-empty ranges elements \code{x} and \code{y} have an empty intersection, 2) if the position of an empty range element does not fall within the corresponding limits of a non-empty range element, or 3) if two corresponding empty range elements do not have the same position. For example if empty range element [22,21] is intersected with non-empty range element [1,10], an error will be produced; but if it is intersected with the range [22,28], it will produce [22,21]. As mentioned in the Arguments section above, this behavior can be changed using the \code{resolve.empty} argument. } \value{ On \link{Ranges} objects, \code{union}, \code{intersect}, and \code{setdiff} return an \link{IRanges} \emph{instance} that is guaranteed to be \emph{normal} (see \code{\link{isNormal}}) but is NOT promoted to \link{NormalIRanges}. On \link{Ranges} objects, \code{punion}, \code{pintersect}, \code{psetdiff}, and \code{pgap} return an object of the same class and length as their first argument. } \author{H. Pagès and M. Lawrence} \seealso{ \itemize{ \item \code{pintersect} is similar to \code{\link{narrow}}, except the end points are absolute, not relative. \code{pintersect} is also similar to \code{\link{restrict}}, except ranges outside of the restriction become empty and are not discarded. \item \link[GenomicRanges]{setops-methods} in the \pkg{GenomicRanges} package for set operations on genomic ranges. \item \link{findOverlaps-methods} for finding/counting overlapping ranges. \item \link{intra-range-methods} and \link{inter-range-methods} for intra range and inter range transformations. \item \link{Ranges} and \link{RangesList} objects. In particular, \emph{normality} of a \link{Ranges} object is discussed in the man page for \link{Ranges} objects. \item \code{\link[S4Vectors]{mendoapply}} in the \pkg{S4Vectors} package. } } \examples{ x <- IRanges(c(1, 5, -2, 0, 14), c(10, 9, 3, 11, 17)) subject <- Rle(1:-3, 6:2) y <- Views(subject, start=c(14, 0, -5, 6, 18), end=c(20, 2, 2, 8, 20)) ## Vector-wise operations: union(x, ranges(y)) union(ranges(y), x) intersect(x, ranges(y)) intersect(ranges(y), x) setdiff(x, ranges(y)) setdiff(ranges(y), x) ## Element-wise (aka "parallel") operations: try(punion(x, ranges(y))) punion(x[3:5], ranges(y)[3:5]) punion(x, ranges(y), fill.gap=TRUE) try(pintersect(x, ranges(y))) pintersect(x[3:4], ranges(y)[3:4]) pintersect(x, ranges(y), resolve.empty="max.start") psetdiff(ranges(y), x) try(psetdiff(x, ranges(y))) start(x)[4] <- -99 end(y)[4] <- 99 psetdiff(x, ranges(y)) pgap(x, ranges(y)) ## On RangesList objects: irl1 <- IRangesList(a=IRanges(c(1,2),c(4,3)), b=IRanges(c(4,6),c(10,7))) irl2 <- IRangesList(c=IRanges(c(0,2),c(4,5)), a=IRanges(c(4,5),c(6,7))) union(irl1, irl2) intersect(irl1, irl2) setdiff(irl1, irl2) } \keyword{utilities} IRanges/man/slice-methods.Rd0000644000175400017540000000464213175713360016746 0ustar00biocbuildbiocbuild\name{slice-methods} \alias{slice-methods} \alias{slice} \alias{slice,ANY-method} \alias{slice,Rle-method} \alias{slice,RleList-method} \title{Slice a vector-like or list-like object} \description{ \code{slice} is a generic function that creates views on a vector-like or list-like object that contain the elements that are within the specified bounds. } \usage{ slice(x, lower=-Inf, upper=Inf, ...) \S4method{slice}{Rle}(x, lower=-Inf, upper=Inf, includeLower=TRUE, includeUpper=TRUE, rangesOnly=FALSE) \S4method{slice}{RleList}(x, lower=-Inf, upper=Inf, includeLower=TRUE, includeUpper=TRUE, rangesOnly=FALSE) } \arguments{ \item{x}{ An \link{Rle} or \link{RleList} object, or any object coercible to an Rle object. } \item{lower, upper}{ The lower and upper bounds for the slice. } \item{includeLower, includeUpper}{ Logical indicating whether or not the specified boundary is open or closed. } \item{rangesOnly}{ A logical indicating whether or not to drop the original data from the output. } \item{...}{ Additional arguments to be passed to specific methods. } } \details{ \code{slice} is useful for finding areas of absolute maxima (peaks), absolute minima (troughs), or fluctuations within specified limits. One or more view summarization methods can be used on the result of \code{slice}. See \code{?`link{view-summarization-methods}`} } \value{ The method for \link{Rle} objects returns an \link{RleViews} object if \code{rangesOnly=FALSE} or an \link{IRanges} object if \code{rangesOnly=TRUE}. The method for \link{RleList} objects returns an \link{RleViewsList} object if \code{rangesOnly=FALSE} or an \link{IRangesList} object if \code{rangesOnly=TRUE}. } \author{P. Aboyoun} \seealso{ \itemize{ \item \link{view-summarization-methods} for summarizing the views returned by \code{slice}. \item \link[XVector]{slice-methods} in the \pkg{XVector} package for more \code{slice} methods. \item \code{\link{coverage}} for computing the coverage across a set of ranges. \item The \link{Rle}, \link{RleList}, \link{RleViews}, and \link{RleViewsList} classes. } } \examples{ ## Views derived from coverage x <- IRanges(start=c(1L, 9L, 4L, 1L, 5L, 10L), width=c(5L, 6L, 3L, 4L, 3L, 3L)) cvg <- coverage(x) slice(cvg, lower=2) slice(cvg, lower=2, rangesOnly=TRUE) } \keyword{methods} IRanges/man/view-summarization-methods.Rd0000644000175400017540000001067013175713360021517 0ustar00biocbuildbiocbuild\name{view-summarization-methods} \alias{view-summarization-methods} \alias{viewApply} \alias{viewApply,Views-method} \alias{viewApply,RleViews-method} \alias{viewApply,RleViewsList-method} \alias{viewMins} \alias{viewMins,RleViews-method} \alias{viewMins,RleViewsList-method} \alias{viewMaxs} \alias{viewMaxs,RleViews-method} \alias{viewMaxs,RleViewsList-method} \alias{viewSums} \alias{viewSums,RleViews-method} \alias{viewSums,RleViewsList-method} \alias{viewMeans} \alias{viewMeans,RleViews-method} \alias{viewMeans,RleViewsList-method} \alias{viewWhichMins} \alias{viewWhichMins,RleViews-method} \alias{viewWhichMins,RleViewsList-method} \alias{viewWhichMaxs} \alias{viewWhichMaxs,RleViews-method} \alias{viewWhichMaxs,RleViewsList-method} \alias{viewRangeMins} \alias{viewRangeMins,RleViews-method} \alias{viewRangeMins,RleViewsList-method} \alias{viewRangeMaxs} \alias{viewRangeMaxs,RleViews-method} \alias{viewRangeMaxs,RleViewsList-method} \alias{Summary,Views-method} \alias{mean,Views-method} \alias{max,Views-method} \alias{min,Views-method} \alias{sum,Views-method} \alias{which.min,Views-method} \alias{which.max,Views-method} \title{Summarize views on a vector-like object with numeric values} \description{ \code{viewApply} applies a function on each view of a \link{Views} or \link{ViewsList} object. \code{viewMins}, \code{viewMaxs}, \code{viewSums}, \code{viewMeans} calculate respectively the minima, maxima, sums, and means of the views in a \link{Views} or \link{ViewsList} object. } \usage{ viewApply(X, FUN, ..., simplify = TRUE) viewMins(x, na.rm=FALSE) \S4method{min}{Views}(x, ..., na.rm = FALSE) viewMaxs(x, na.rm=FALSE) \S4method{max}{Views}(x, ..., na.rm = FALSE) viewSums(x, na.rm=FALSE) \S4method{sum}{Views}(x, ..., na.rm = FALSE) viewMeans(x, na.rm=FALSE) \S4method{mean}{Views}(x, ...) viewWhichMins(x, na.rm=FALSE) \S4method{which.min}{Views}(x) viewWhichMaxs(x, na.rm=FALSE) \S4method{which.max}{Views}(x) viewRangeMins(x, na.rm=FALSE) viewRangeMaxs(x, na.rm=FALSE) } \arguments{ \item{X}{ A Views object. } \item{FUN}{ The function to be applied to each view in \code{X}. } \item{...}{ Additional arguments to be passed on. } \item{simplify}{ A logical value specifying whether or not the result should be simplified to a vector or matrix if possible. } \item{x}{ An \link{RleViews} or \link{RleViewsList} object. } \item{na.rm}{ Logical indicating whether or not to include missing values in the results. } } \details{ The \code{viewMins}, \code{viewMaxs}, \code{viewSums}, and \code{viewMeans} functions provide efficient methods for calculating the specified numeric summary by performing the looping in compiled code. The \code{viewWhichMins}, \code{viewWhichMaxs}, \code{viewRangeMins}, and \code{viewRangeMaxs} functions provide efficient methods for finding the locations of the minima and maxima. } \value{ For all the functions in this man page (except \code{viewRangeMins} and \code{viewRangeMaxs}): A numeric vector of the length of \code{x} if \code{x} is an \link{RleViews} object, or a \link{List} object of the length of \code{x} if it's an \link{RleViewsList} object. For \code{viewRangeMins} and \code{viewRangeMaxs}: An \link{IRanges} object if \code{x} is an \link{RleViews} object, or an \link{IRangesList} object if it's an \link{RleViewsList} object. } \note{ For convenience, methods for \code{min}, \code{max}, \code{sum}, \code{mean}, \code{which.min} and \code{which.max} are provided as wrappers around the corresponding \code{view*} functions (which might be deprecated at some point). } \author{P. Aboyoun} \seealso{ \itemize{ \item The \code{\link{slice}} function for slicing an \link{Rle} or \link{RleList} object. \item \link[XVector]{view-summarization-methods} in the \pkg{XVector} package for more view summarization methods. \item The \link{RleViews} and \link{RleViewsList} classes. \item The \code{\link{which.min}} and \code{\link{colSums}} functions. } } \examples{ ## Views derived from coverage x <- IRanges(start=c(1L, 9L, 4L, 1L, 5L, 10L), width=c(5L, 6L, 3L, 4L, 3L, 3L)) cvg <- coverage(x) cvg_views <- slice(cvg, lower=2) viewApply(cvg_views, diff) viewMins(cvg_views) viewMaxs(cvg_views) viewSums(cvg_views) viewMeans(cvg_views) viewWhichMins(cvg_views) viewWhichMaxs(cvg_views) viewRangeMins(cvg_views) viewRangeMaxs(cvg_views) } \keyword{methods} \keyword{arith} IRanges/src/0000755000175400017540000000000013175724757013741 5ustar00biocbuildbiocbuildIRanges/src/CompressedAtomicList_utils.c0000644000175400017540000002261013175724757021423 0ustar00biocbuildbiocbuild/**************************************************************************** * Utilities for CompressedAtomicList objects * ****************************************************************************/ #include "IRanges.h" #define R_INT_MIN (1+INT_MIN) #define PARTITIONED_SUM(C_TYPE, ACCESSOR, ANS_TYPE, ANS_ACCESSOR, NA_CHECK) { \ PARTITIONED_AGG(C_TYPE, ACCESSOR, ANS_TYPE, ANS_ACCESSOR, NA_CHECK, 0, \ summary += val,); \ } #define PARTITIONED_PROD(ACCESSOR, NA_CHECK) { \ PARTITIONED_AGG(double, ACCESSOR, REALSXP, REAL, NA_CHECK, 1, \ summary *= val,); \ } #define PARTITIONED_EX(C_TYPE, ACCESSOR, ANS_TYPE, NA_CHECK, INIT, RELOP) { \ PARTITIONED_AGG(C_TYPE, ACCESSOR, ANS_TYPE, ACCESSOR, NA_CHECK, INIT, \ if (val RELOP summary) summary = val,); \ } #define PARTITIONED_MIN(C_TYPE, ACCESSOR, ANS_TYPE, NA_CHECK, INIT) { \ PARTITIONED_EX(C_TYPE, ACCESSOR, ANS_TYPE, NA_CHECK, INIT, <); \ } #define PARTITIONED_MAX(C_TYPE, ACCESSOR, ANS_TYPE, NA_CHECK, INIT) { \ PARTITIONED_EX(C_TYPE, ACCESSOR, ANS_TYPE, NA_CHECK, INIT, >); \ } #define PARTITIONED_WHICH_AGG(C_TYPE, ACCESSOR, NA_CHECK, INIT, RELOP) { \ SEXP na_rm = ScalarLogical(TRUE); \ PARTITIONED_AGG(int, ACCESSOR, INTSXP, INTEGER, NA_CHECK, NA_INTEGER, \ if (val RELOP summary_val) \ (summary_val = val, summary = j - prev_end + 1), \ C_TYPE summary_val = INIT) \ } #define PARTITIONED_WHICH_MAX(C_TYPE, ACCESSOR, NA_CHECK, INIT) { \ PARTITIONED_WHICH_AGG(C_TYPE, ACCESSOR, NA_CHECK, INIT, >) \ } #define PARTITIONED_WHICH_MIN(C_TYPE, ACCESSOR, NA_CHECK, INIT) { \ PARTITIONED_WHICH_AGG(C_TYPE, ACCESSOR, NA_CHECK, INIT, <) \ } #define PARTITIONED_AGG(C_TYPE, ACCESSOR, ANS_TYPE, ANS_ACCESSOR, NA_CHECK, \ INIT, UPDATE, EXTRA_INIT) { \ SEXP unlistData = _get_CompressedList_unlistData(x); \ SEXP ends = \ _get_PartitioningByEnd_end(_get_CompressedList_partitioning(x)); \ Rboolean _na_rm = asLogical(na_rm); \ int prev_end = 0; \ SEXP ans = allocVector(ANS_TYPE, length(ends)); \ for (int i = 0; i < length(ends); i++) { \ int end = INTEGER(ends)[i]; \ C_TYPE summary = INIT; \ EXTRA_INIT; \ for (int j = prev_end; j < end; j++) { \ C_TYPE val = ACCESSOR(unlistData)[j]; \ if (NA_CHECK) { \ if (_na_rm) { \ continue; \ } else { \ summary = NA_ ## ANS_ACCESSOR; \ break; \ } \ } \ UPDATE; \ } \ ANS_ACCESSOR(ans)[i] = summary; \ prev_end = end; \ } \ setAttrib(ans, R_NamesSymbol, _get_CompressedList_names(x)); \ return ans; \ } /* * --- .Call ENTRY POINT --- */ SEXP CompressedLogicalList_sum(SEXP x, SEXP na_rm) { PARTITIONED_SUM(Rboolean, LOGICAL, INTSXP, INTEGER, val == NA_LOGICAL); } /* * --- .Call ENTRY POINT --- */ SEXP CompressedIntegerList_sum(SEXP x, SEXP na_rm) { PARTITIONED_SUM(int, INTEGER, INTSXP, INTEGER, val == NA_INTEGER); } /* * --- .Call ENTRY POINT --- */ SEXP CompressedNumericList_sum(SEXP x, SEXP na_rm) { PARTITIONED_SUM(double, REAL, REALSXP, REAL, ISNA(val)); } /* * --- .Call ENTRY POINT --- */ SEXP CompressedLogicalList_prod(SEXP x, SEXP na_rm) { PARTITIONED_PROD(LOGICAL, val == NA_LOGICAL); } /* * --- .Call ENTRY POINT --- */ SEXP CompressedIntegerList_prod(SEXP x, SEXP na_rm) { PARTITIONED_PROD(INTEGER, val == NA_INTEGER); } /* * --- .Call ENTRY POINT --- */ SEXP CompressedNumericList_prod(SEXP x, SEXP na_rm) { PARTITIONED_PROD(REAL, ISNA(val)); } /* * --- .Call ENTRY POINT --- */ SEXP CompressedLogicalList_min(SEXP x, SEXP na_rm) { PARTITIONED_MIN(Rboolean, LOGICAL, LGLSXP, val == NA_LOGICAL, TRUE); } /* * --- .Call ENTRY POINT --- */ SEXP CompressedLogicalList_which_min(SEXP x) { PARTITIONED_WHICH_MIN(Rboolean, LOGICAL, val == NA_LOGICAL, TRUE); } /* * --- .Call ENTRY POINT --- */ SEXP CompressedIntegerList_min(SEXP x, SEXP na_rm) { PARTITIONED_MIN(int, INTEGER, INTSXP, val == NA_INTEGER, INT_MAX); } /* * --- .Call ENTRY POINT --- */ SEXP CompressedIntegerList_which_min(SEXP x) { PARTITIONED_WHICH_MIN(int, INTEGER, val == NA_INTEGER, INT_MAX); } /* * --- .Call ENTRY POINT --- */ SEXP CompressedNumericList_min(SEXP x, SEXP na_rm) { PARTITIONED_MIN(double, REAL, REALSXP, ISNA(val), R_PosInf); } /* * --- .Call ENTRY POINT --- */ SEXP CompressedNumericList_which_min(SEXP x) { PARTITIONED_WHICH_MIN(double, REAL, ISNA(val), R_PosInf); } /* * --- .Call ENTRY POINT --- */ SEXP CompressedLogicalList_max(SEXP x, SEXP na_rm) { PARTITIONED_MAX(Rboolean, LOGICAL, LGLSXP, val == NA_LOGICAL, TRUE); } /* * --- .Call ENTRY POINT --- */ SEXP CompressedLogicalList_which_max(SEXP x) { PARTITIONED_WHICH_MAX(Rboolean, LOGICAL, val == NA_LOGICAL, TRUE); } /* * --- .Call ENTRY POINT --- */ SEXP CompressedIntegerList_max(SEXP x, SEXP na_rm) { PARTITIONED_MAX(int, INTEGER, INTSXP, val == NA_INTEGER, R_INT_MIN); } /* * --- .Call ENTRY POINT --- */ SEXP CompressedIntegerList_which_max(SEXP x) { PARTITIONED_WHICH_MAX(int, INTEGER, val == NA_INTEGER, R_INT_MIN); } /* * --- .Call ENTRY POINT --- */ SEXP CompressedNumericList_max(SEXP x, SEXP na_rm) { PARTITIONED_MAX(double, REAL, REALSXP, ISNA(val), R_NegInf); } /* * --- .Call ENTRY POINT --- */ SEXP CompressedNumericList_which_max(SEXP x) { PARTITIONED_WHICH_MAX(double, REAL, ISNA(val), R_NegInf); } #define PARTITIONED_IS_UNSORTED(C_TYPE, ACCESSOR, NA_CHECK) { \ if (asLogical(strictly)) { \ PARTITIONED_BREAK(C_TYPE, ACCESSOR, NA_CHECK, \ val <= ACCESSOR(unlistData)[j-1], 1); \ } else { \ PARTITIONED_BREAK(C_TYPE, ACCESSOR, NA_CHECK, \ val < ACCESSOR(unlistData)[j-1], 1); \ } \ } #define PARTITIONED_BREAK(C_TYPE, ACCESSOR, NA_CHECK, BREAK_CHECK, OFFSET) { \ SEXP unlistData = _get_CompressedList_unlistData(x); \ SEXP ends = \ _get_PartitioningByEnd_end(_get_CompressedList_partitioning(x)); \ Rboolean _na_rm = asLogical(na_rm); \ int prev_end = 0; \ SEXP ans = allocVector(LGLSXP, length(ends)); \ for (int i = 0; i < length(ends); i++) { \ int end = INTEGER(ends)[i]; \ Rboolean summary = FALSE; \ for (int j = prev_end + OFFSET; j < end; j++) { \ C_TYPE val = ACCESSOR(unlistData)[j]; \ if (NA_CHECK) { \ if (_na_rm) { \ continue; \ } else { \ summary = NA_LOGICAL; \ break; \ } \ } \ if (BREAK_CHECK) { \ summary = TRUE; \ break; \ } \ } \ LOGICAL(ans)[i] = summary; \ prev_end = end; \ } \ setAttrib(ans, R_NamesSymbol, _get_CompressedList_names(x)); \ return ans; \ } /* * --- .Call ENTRY POINT --- */ SEXP CompressedLogicalList_is_unsorted(SEXP x, SEXP na_rm, SEXP strictly) { PARTITIONED_IS_UNSORTED(Rboolean, LOGICAL, val == NA_LOGICAL); } /* * --- .Call ENTRY POINT --- */ SEXP CompressedIntegerList_is_unsorted(SEXP x, SEXP na_rm, SEXP strictly) { PARTITIONED_IS_UNSORTED(int, INTEGER, val == NA_INTEGER); } /* * --- .Call ENTRY POINT --- */ SEXP CompressedNumericList_is_unsorted(SEXP x, SEXP na_rm, SEXP strictly) { PARTITIONED_IS_UNSORTED(double, REAL, ISNA(val)); } IRanges/src/CompressedIRangesList_class.c0000644000175400017540000001307413175724757021510 0ustar00biocbuildbiocbuild/**************************************************************************** * Low-level manipulation of CompressedIRangesList objects * ****************************************************************************/ #include "IRanges.h" #include "S4Vectors_interface.h" #include #define R_INT_MIN (1+INT_MIN) /**************************************************************************** * C-level abstract getters. */ CompressedIRangesList_holder _hold_CompressedIRangesList(SEXP x) { CompressedIRangesList_holder x_holder; SEXP x_end; x_holder.classname = get_classname(x); x_end = _get_PartitioningByEnd_end( _get_CompressedList_partitioning(x)); x_holder.length = LENGTH(x_end); x_holder.end = INTEGER(x_end); x_holder.unlistData_holder = _hold_IRanges( _get_CompressedList_unlistData(x)); return x_holder; } int _get_length_from_CompressedIRangesList_holder( const CompressedIRangesList_holder *x_holder) { return x_holder->length; } IRanges_holder _get_elt_from_CompressedIRangesList_holder( const CompressedIRangesList_holder *x_holder, int i) { int offset, length; offset = i == 0 ? 0 : x_holder->end[i - 1]; length = x_holder->end[i] - offset; return _get_linear_subset_from_IRanges_holder( &(x_holder->unlistData_holder), offset, length); } int _get_eltNROWS_from_CompressedIRangesList_holder( const CompressedIRangesList_holder *x_holder, int i) { /* IRanges_holder ir_holder; ir_holder = _get_elt_from_CompressedIRangesList_holder(x_holder, i); return _get_length_from_IRanges_holder(&ir_holder); */ int offset; offset = i == 0 ? 0 : x_holder->end[i - 1]; return x_holder->end[i] - offset; /* faster than the above */ } /**************************************************************************** * CompressedIRangesList methods. */ /* --- .Call ENTRY POINT --- */ SEXP CompressedIRangesList_isNormal(SEXP x, SEXP use_names) { SEXP ans, ans_names; CompressedIRangesList_holder x_holder; IRanges_holder ir_holder; int x_len, i; x_holder = _hold_CompressedIRangesList(x); x_len = _get_length_from_CompressedIRangesList_holder(&x_holder); PROTECT(ans = NEW_LOGICAL(x_len)); for (i = 0; i < x_len; i++) { ir_holder = _get_elt_from_CompressedIRangesList_holder( &x_holder, i); LOGICAL(ans)[i] = _is_normal_IRanges_holder(&ir_holder); } if (LOGICAL(use_names)[0]) { PROTECT(ans_names = duplicate(_get_CompressedList_names(x))); SET_NAMES(ans, ans_names); UNPROTECT(1); } UNPROTECT(1); return ans; } /* --- .Call ENTRY POINT --- */ SEXP CompressedIRangesList_summary(SEXP object) { int ans_len; SEXP part_end; SEXP ans, ans_names, col_names; part_end = _get_PartitioningByEnd_end( _get_CompressedList_partitioning(object)); ans_len = LENGTH(part_end); PROTECT(ans = allocMatrix(INTSXP, ans_len, 2)); memset(INTEGER(ans), 0, 2 * ans_len * sizeof(int)); if (ans_len > 0) { int i, j, prev_end = 0; int *ans1_elt, *ans2_elt; const int *part_end_elt, *ranges_width; SEXP unlistData = _get_CompressedList_unlistData(object); ranges_width = INTEGER(_get_IRanges_width(unlistData)); for (i = 0, ans1_elt = INTEGER(ans), ans2_elt = (INTEGER(ans) + ans_len), part_end_elt = INTEGER(part_end); i < ans_len; i++, ans1_elt++, ans2_elt++, part_end_elt++) { *ans1_elt = *part_end_elt - prev_end; for (j = 0; j < *ans1_elt; j++) { *ans2_elt += *ranges_width; ranges_width++; } prev_end = *part_end_elt; } } PROTECT(ans_names = NEW_LIST(2)); PROTECT(col_names = NEW_CHARACTER(2)); SET_STRING_ELT(col_names, 0, mkChar("Length")); SET_STRING_ELT(col_names, 1, mkChar("WidthSum")); SET_VECTOR_ELT(ans_names, 0, duplicate(_get_CompressedList_names(object))); SET_VECTOR_ELT(ans_names, 1, col_names); SET_DIMNAMES(ans, ans_names); UNPROTECT(3); return ans; } /**************************************************************************** * CompressedNormalIRangesList methods. */ /* --- .Call ENTRY POINT --- */ SEXP CompressedNormalIRangesList_min(SEXP x, SEXP use_names) { SEXP ans, ans_names; CompressedIRangesList_holder x_holder; IRanges_holder ir_holder; int x_len, ir_len, i; int *ans_elt; x_holder = _hold_CompressedIRangesList(x); x_len = _get_length_from_CompressedIRangesList_holder(&x_holder); PROTECT(ans = NEW_INTEGER(x_len)); for (i = 0, ans_elt = INTEGER(ans); i < x_len; i++, ans_elt++) { ir_holder = _get_elt_from_CompressedIRangesList_holder(&x_holder, i); ir_len = _get_length_from_IRanges_holder(&ir_holder); if (ir_len == 0) { *ans_elt = INT_MAX; } else { *ans_elt = _get_start_elt_from_IRanges_holder(&ir_holder, 0); } } if (LOGICAL(use_names)[0]) { PROTECT(ans_names = duplicate(_get_CompressedList_names(x))); SET_NAMES(ans, ans_names); UNPROTECT(1); } UNPROTECT(1); return ans; } /* --- .Call ENTRY POINT --- */ SEXP CompressedNormalIRangesList_max(SEXP x, SEXP use_names) { SEXP ans, ans_names; CompressedIRangesList_holder x_holder; IRanges_holder ir_holder; int x_len, ir_len, i; int *ans_elt; x_holder = _hold_CompressedIRangesList(x); x_len = _get_length_from_CompressedIRangesList_holder(&x_holder); PROTECT(ans = NEW_INTEGER(x_len)); for (i = 0, ans_elt = INTEGER(ans); i < x_len; i++, ans_elt++) { ir_holder = _get_elt_from_CompressedIRangesList_holder(&x_holder, i); ir_len = _get_length_from_IRanges_holder(&ir_holder); if (ir_len == 0) { *ans_elt = R_INT_MIN; } else { *ans_elt = _get_end_elt_from_IRanges_holder(&ir_holder, ir_len - 1); } } if (LOGICAL(use_names)[0]) { PROTECT(ans_names = duplicate(_get_CompressedList_names(x))); SET_NAMES(ans, ans_names); UNPROTECT(1); } UNPROTECT(1); return ans; } IRanges/src/CompressedList_class.c0000644000175400017540000000625613175724757020243 0ustar00biocbuildbiocbuild/**************************************************************************** * Low-level manipulation of CompressedList objects * ****************************************************************************/ #include "IRanges.h" /**************************************************************************** * C-level slot getters. * * Be careful that these functions do NOT duplicate the returned slot. * Thus they cannot be made .Call entry points! */ static SEXP unlistData_symbol = NULL, partitioning_symbol = NULL; SEXP _get_CompressedList_unlistData(SEXP x) { INIT_STATIC_SYMBOL(unlistData) return GET_SLOT(x, unlistData_symbol); } SEXP _get_CompressedList_partitioning(SEXP x) { INIT_STATIC_SYMBOL(partitioning) return GET_SLOT(x, partitioning_symbol); } /* Not strict "slot getters" but very much like. */ int _get_CompressedList_length(SEXP x) { return LENGTH(_get_PartitioningByEnd_end( _get_CompressedList_partitioning(x))); } SEXP _get_CompressedList_names(SEXP x) { return _get_Partitioning_names( _get_CompressedList_partitioning(x)); } /**************************************************************************** * C-level slot setters. * * Be careful that these functions do NOT duplicate the assigned value! */ static void set_CompressedList_unlistData(SEXP x, SEXP value) { INIT_STATIC_SYMBOL(unlistData) SET_SLOT(x, unlistData_symbol, value); return; } static void set_CompressedList_partitioning(SEXP x, SEXP value) { INIT_STATIC_SYMBOL(partitioning) SET_SLOT(x, partitioning_symbol, value); return; } /**************************************************************************** * C-level constructor. */ /* Be careful that this constructor does NOT duplicate its arguments before putting them in the slots of the returned object. So don't try to make it a .Call entry point! */ SEXP _new_CompressedList(const char *classname, SEXP unlistData, SEXP partitioning) { SEXP classdef, ans; PROTECT(classdef = MAKE_CLASS(classname)); PROTECT(ans = NEW_OBJECT(classdef)); set_CompressedList_unlistData(ans, unlistData); set_CompressedList_partitioning(ans, partitioning); UNPROTECT(2); return ans; } /**************************************************************************** * C-level abstract getters for CompressedIntegerList objects. */ CompressedIntsList_holder _hold_CompressedIntegerList(SEXP x) { SEXP partitioning_end; CompressedIntsList_holder x_holder; partitioning_end = _get_PartitioningByEnd_end( _get_CompressedList_partitioning(x)); x_holder.length = LENGTH(partitioning_end); x_holder.breakpoints = INTEGER(partitioning_end); x_holder.unlisted = INTEGER(_get_CompressedList_unlistData(x)); return x_holder; } int _get_length_from_CompressedIntsList_holder( const CompressedIntsList_holder *x_holder) { return x_holder->length; } Ints_holder _get_elt_from_CompressedIntsList_holder( const CompressedIntsList_holder *x_holder, int i) { Ints_holder x_elt_holder; int offset; if (i == 0) { offset = 0; } else { offset = x_holder->breakpoints[i - 1]; } x_elt_holder.ptr = x_holder->unlisted + offset; x_elt_holder.length = x_holder->breakpoints[i] - offset; return x_elt_holder; } IRanges/src/Grouping_class.c0000644000175400017540000001040113175724757017060 0ustar00biocbuildbiocbuild/**************************************************************************** * Low-level manipulation of Grouping objects * * Author: H. Pag\`es * ****************************************************************************/ #include "IRanges.h" #include "S4Vectors_interface.h" /**************************************************************************** * C-level slot getters. * * Be careful that these functions do NOT duplicate the returned slot. * Thus they cannot be made .Call entry points! */ static SEXP high2low_symbol = NULL, low2high_symbol = NULL, end_symbol = NULL, NAMES_symbol = NULL; SEXP _get_H2LGrouping_high2low(SEXP x) { INIT_STATIC_SYMBOL(high2low) return GET_SLOT(x, high2low_symbol); } SEXP _get_H2LGrouping_low2high(SEXP x) { INIT_STATIC_SYMBOL(low2high) return GET_SLOT(x, low2high_symbol); } SEXP _get_Partitioning_names(SEXP x) { INIT_STATIC_SYMBOL(NAMES) return GET_SLOT(x, NAMES_symbol); } SEXP _get_PartitioningByEnd_end(SEXP x) { INIT_STATIC_SYMBOL(end) return GET_SLOT(x, end_symbol); } /**************************************************************************** * C-level slot setters. * * Be careful that these functions do NOT duplicate the assigned value! */ static void set_Partitioning_names(SEXP x, SEXP value) { INIT_STATIC_SYMBOL(NAMES) SET_SLOT(x, NAMES_symbol, value); return; } static void set_PartitioningByEnd_end(SEXP x, SEXP value) { INIT_STATIC_SYMBOL(end) SET_SLOT(x, end_symbol, value); return; } /**************************************************************************** * C-level constructor. */ /* Be careful that this constructor does NOT duplicate its arguments before putting them in the slots of the returned object. So don't try to make it a .Call entry point! */ SEXP _new_PartitioningByEnd(const char *classname, SEXP end, SEXP names) { SEXP classdef, ans; PROTECT(classdef = MAKE_CLASS(classname)); PROTECT(ans = NEW_OBJECT(classdef)); set_PartitioningByEnd_end(ans, end); if (names == NULL) names = R_NilValue; set_Partitioning_names(ans, names); UNPROTECT(2); return ans; } /**************************************************************************** * --- .Call ENTRY POINTS --- * ****************************************************************************/ SEXP H2LGrouping_members(SEXP x, SEXP group_ids) { SEXP ans, x_high2low, x_low2high, x_low2high_elt; int x_length, nids, ans_length, i, j, group_id, *ans_elt; if (TYPEOF(group_ids) != INTSXP) error("the group ids must be integers"); x_high2low = _get_H2LGrouping_high2low(x); x_low2high = _get_H2LGrouping_low2high(x); x_length = LENGTH(x_low2high); /* same as LENGTH(x_high2low) */ nids = LENGTH(group_ids); /* 1st pass: determine 'ans_length' */ ans_length = 0; for (j = 0; j < nids; j++) { group_id = INTEGER(group_ids)[j]; if (group_id == NA_INTEGER) error("some group ids are NAs"); i = group_id - 1; if (i < 0 || i >= x_length) error("subscript out of bounds"); if (INTEGER(x_high2low)[i] != NA_INTEGER) continue; ans_length++; x_low2high_elt = VECTOR_ELT(x_low2high, i); if (x_low2high_elt == R_NilValue) continue; ans_length += LENGTH(x_low2high_elt); } PROTECT(ans = NEW_INTEGER(ans_length)); /* 2nd pass: fill 'ans' */ ans_elt = INTEGER(ans); for (j = 0; j < nids; j++) { group_id = INTEGER(group_ids)[j]; i = group_id - 1; if (INTEGER(x_high2low)[i] != NA_INTEGER) continue; *(ans_elt++) = i + 1; x_low2high_elt = VECTOR_ELT(x_low2high, i); if (x_low2high_elt == R_NilValue) continue; memcpy(ans_elt, INTEGER(x_low2high_elt), sizeof(int) * LENGTH(x_low2high_elt)); ans_elt += LENGTH(x_low2high_elt); } sort_int_array(INTEGER(ans), ans_length, 0); UNPROTECT(1); return ans; } SEXP H2LGrouping_vmembers(SEXP x, SEXP group_ids_list) { SEXP ans, group_ids; int ans_length, i; ans_length = LENGTH(group_ids_list); PROTECT(ans = NEW_LIST(ans_length)); for (i = 0; i < ans_length; i++) { group_ids = VECTOR_ELT(group_ids_list, i); if (TYPEOF(group_ids) != INTSXP) error("'L' must be a list of integer vectors"); SET_VECTOR_ELT(ans, i, H2LGrouping_members(x, group_ids)); } UNPROTECT(1); return ans; } IRanges/src/IRanges.h0000644000175400017540000001555713175724757015457 0ustar00biocbuildbiocbuild#include "../inst/include/IRanges_defines.h" #include #define INIT_STATIC_SYMBOL(NAME) \ { \ if (NAME ## _symbol == NULL) \ NAME ## _symbol = install(# NAME); \ } /* Ranges_class.c */ SEXP valid_Ranges( SEXP x_start, SEXP x_end, SEXP x_width ); /* Ranges_comparison.c */ int _overlap_code( int x_start, int x_width, int y_start, int y_width ); int _invert_overlap_code( int code ); SEXP Ranges_pcompare( SEXP x_start, SEXP x_width, SEXP y_start, SEXP y_width ); /* IRanges_class.c */ SEXP _get_IRanges_start(SEXP x); SEXP _get_IRanges_width(SEXP x); SEXP _get_IRanges_names(SEXP x); int _get_IRanges_length(SEXP x); IRanges_holder _hold_IRanges(SEXP x); int _get_length_from_IRanges_holder(const IRanges_holder *x_holder); int _get_width_elt_from_IRanges_holder( const IRanges_holder *x_holder, int i ); int _get_start_elt_from_IRanges_holder( const IRanges_holder *x_holder, int i ); int _get_end_elt_from_IRanges_holder( const IRanges_holder *x_holder, int i ); SEXP _get_names_elt_from_IRanges_holder( const IRanges_holder *x_holder, int i ); IRanges_holder _get_linear_subset_from_IRanges_holder( const IRanges_holder *x_holder, int offset, int length ); void _set_IRanges_names( SEXP x, SEXP names ); void _copy_IRanges_slots( SEXP x, SEXP x0 ); SEXP _new_IRanges( const char *classname, SEXP start, SEXP width, SEXP names ); SEXP _new_IRanges_from_IntPairAE( const char *classname, const IntPairAE *intpair_ae ); SEXP _new_list_of_IRanges_from_IntPairAEAE( const char *element_type, const IntPairAEAE *intpair_aeae ); SEXP _alloc_IRanges( const char *classname, int length ); int _is_normal_IRanges_holder(const IRanges_holder *x_holder); SEXP IRanges_isNormal(SEXP x); SEXP IRanges_from_integer(SEXP x); SEXP NormalIRanges_from_logical(SEXP x); /* IRanges_constructor.c */ SEXP solve_user_SEW0( SEXP start, SEXP end, SEXP width ); SEXP solve_user_SEW( SEXP refwidths, SEXP start, SEXP end, SEXP width, SEXP translate_negative_coord, SEXP allow_nonnarrowing ); /* Grouping_class.c */ SEXP _get_H2LGrouping_high2low(SEXP x); SEXP _get_H2LGrouping_low2high(SEXP x); SEXP _get_Partitioning_names(SEXP x); SEXP _get_PartitioningByEnd_end(SEXP x); SEXP _new_PartitioningByEnd( const char *classname, SEXP end, SEXP names ); SEXP H2LGrouping_members( SEXP x, SEXP group_ids ); SEXP H2LGrouping_vmembers( SEXP x, SEXP group_ids_list ); /* SimpleIRangesList_class.c */ SEXP SimpleIRangesList_isNormal(SEXP x, SEXP use_names); SEXP SimpleNormalIRangesList_min(SEXP x); SEXP SimpleNormalIRangesList_max(SEXP x); /* CompressedList_class.c */ SEXP _get_CompressedList_unlistData(SEXP x); SEXP _get_CompressedList_partitioning(SEXP x); int _get_CompressedList_length(SEXP x); SEXP _get_CompressedList_names(SEXP x); SEXP _new_CompressedList( const char *classname, SEXP unlistData, SEXP partitioning ); CompressedIntsList_holder _hold_CompressedIntegerList( SEXP x ); int _get_length_from_CompressedIntsList_holder( const CompressedIntsList_holder *x_holder ); Ints_holder _get_elt_from_CompressedIntsList_holder( const CompressedIntsList_holder *x_holder, int i ); /* RleViews_utils.c */ SEXP RleViews_viewMins( SEXP x, SEXP na_rm ); SEXP RleViews_viewMaxs( SEXP x, SEXP na_rm ); SEXP RleViews_viewSums( SEXP x, SEXP na_rm ); SEXP RleViews_viewMeans( SEXP x, SEXP na_rm ); SEXP RleViews_viewWhichMins( SEXP x, SEXP na_rm ); SEXP RleViews_viewWhichMaxs( SEXP x, SEXP na_rm ); /* CompressedIRangesList_class.c */ CompressedIRangesList_holder _hold_CompressedIRangesList(SEXP x); int _get_length_from_CompressedIRangesList_holder( const CompressedIRangesList_holder *x_holder ); IRanges_holder _get_elt_from_CompressedIRangesList_holder( const CompressedIRangesList_holder *x_holder, int i ); int _get_eltNROWS_from_CompressedIRangesList_holder( const CompressedIRangesList_holder *x_holder, int i ); SEXP CompressedIRangesList_isNormal( SEXP x, SEXP use_names ); SEXP CompressedIRangesList_summary( SEXP object ); SEXP CompressedNormalIRangesList_min( SEXP x, SEXP use_names ); SEXP CompressedNormalIRangesList_max( SEXP x, SEXP use_names ); /* inter_range_methods.c */ SEXP IRanges_range(SEXP x); SEXP Ranges_reduce( SEXP x_start, SEXP x_width, SEXP drop_empty_ranges, SEXP min_gapwidth, SEXP with_revmap, SEXP with_inframe_start ); SEXP CompressedIRangesList_reduce( SEXP x, SEXP drop_empty_ranges, SEXP min_gapwidth, SEXP with_revmap ); SEXP IRanges_gaps( SEXP x_start, SEXP x_width, SEXP start, SEXP end ); SEXP CompressedIRangesList_gaps( SEXP x, SEXP start, SEXP end ); SEXP Ranges_disjointBins( SEXP x_start, SEXP x_width ); /* coverage_methods.c */ SEXP IRanges_coverage( SEXP x, SEXP shift, SEXP width, SEXP weight, SEXP circle_len, SEXP method ); SEXP CompressedIRangesList_coverage( SEXP x, SEXP shift, SEXP width, SEXP weight, SEXP circle_lens, SEXP method ); /* NCList.c */ SEXP NCList_new(); SEXP NCList_free(SEXP nclist_xp); SEXP NCList_build( SEXP nclist_xp, SEXP x_start, SEXP x_end, SEXP x_subset ); SEXP new_NCListAsINTSXP_from_NCList( SEXP nclist_xp ); SEXP NCListAsINTSXP_print( SEXP x_nclist, SEXP x_start, SEXP x_end ); SEXP NCList_find_overlaps( SEXP q_start, SEXP q_end, SEXP s_start, SEXP s_end, SEXP nclist, SEXP nclist_is_q, SEXP maxgap, SEXP minoverlap, SEXP type, SEXP select, SEXP circle_length ); SEXP NCList_find_overlaps_in_groups( SEXP q_start, SEXP q_end, SEXP q_space, SEXP q_groups, SEXP s_start, SEXP s_end, SEXP s_space, SEXP s_groups, SEXP nclists, SEXP nclist_is_q, SEXP maxgap, SEXP minoverlap, SEXP type, SEXP select, SEXP circle_length ); /* CompressedAtomicList_utils.c */ SEXP CompressedLogicalList_sum( SEXP x, SEXP na_rm); SEXP CompressedIntegerList_sum( SEXP x, SEXP na_rm); SEXP CompressedNumericList_sum( SEXP x, SEXP na_rm); SEXP CompressedLogicalList_prod( SEXP x, SEXP na_rm); SEXP CompressedIntegerList_prod( SEXP x, SEXP na_rm); SEXP CompressedNumericList_prod( SEXP x, SEXP na_rm); SEXP CompressedLogicalList_min( SEXP x, SEXP na_rm); SEXP CompressedLogicalList_which_min( SEXP x); SEXP CompressedIntegerList_min( SEXP x, SEXP na_rm); SEXP CompressedIntegerList_which_min( SEXP x); SEXP CompressedNumericList_min( SEXP x, SEXP na_rm); SEXP CompressedNumericList_which_min( SEXP x); SEXP CompressedLogicalList_max( SEXP x, SEXP na_rm); SEXP CompressedLogicalList_which_max( SEXP x); SEXP CompressedIntegerList_max( SEXP x, SEXP na_rm); SEXP CompressedIntegerList_which_max( SEXP x); SEXP CompressedNumericList_max( SEXP x, SEXP na_rm); SEXP CompressedNumericList_which_max( SEXP x); SEXP CompressedLogicalList_is_unsorted( SEXP x, SEXP na_rm, SEXP strictly); SEXP CompressedIntegerList_is_unsorted( SEXP x, SEXP na_rm, SEXP strictly); SEXP CompressedNumericList_is_unsorted( SEXP x, SEXP na_rm, SEXP strictly); IRanges/src/IRanges_class.c0000644000175400017540000002374013175724757016630 0ustar00biocbuildbiocbuild/**************************************************************************** * Low-level manipulation of IRanges objects * * Author: H. Pag\`es * ****************************************************************************/ #include "IRanges.h" #include "S4Vectors_interface.h" /**************************************************************************** * C-level slot getters. * * Be careful that these functions do NOT duplicate the returned slot. * Thus they cannot be made .Call entry points! */ static SEXP start_symbol = NULL, width_symbol = NULL, NAMES_symbol = NULL; SEXP _get_IRanges_start(SEXP x) { INIT_STATIC_SYMBOL(start) return GET_SLOT(x, start_symbol); } SEXP _get_IRanges_width(SEXP x) { INIT_STATIC_SYMBOL(width) return GET_SLOT(x, width_symbol); } SEXP _get_IRanges_names(SEXP x) { INIT_STATIC_SYMBOL(NAMES) return GET_SLOT(x, NAMES_symbol); } /* Not a strict "slot getter" but very much like. */ int _get_IRanges_length(SEXP x) { return LENGTH(_get_IRanges_start(x)); } /**************************************************************************** * C-level abstract getters. */ IRanges_holder _hold_IRanges(SEXP x) { IRanges_holder x_holder; x_holder.classname = get_classname(x); x_holder.is_constant_width = 0; x_holder.length = _get_IRanges_length(x); x_holder.width = INTEGER(_get_IRanges_width(x)); x_holder.start = INTEGER(_get_IRanges_start(x)); x_holder.end = NULL; x_holder.SEXP_offset = 0; x_holder.names = _get_IRanges_names(x); return x_holder; } int _get_length_from_IRanges_holder(const IRanges_holder *x_holder) { return x_holder->length; } int _get_width_elt_from_IRanges_holder(const IRanges_holder *x_holder, int i) { return x_holder->is_constant_width ? x_holder->width[0] : x_holder->width[i]; } int _get_start_elt_from_IRanges_holder(const IRanges_holder *x_holder, int i) { if (x_holder->start) return x_holder->start[i]; return x_holder->end[i] - _get_width_elt_from_IRanges_holder(x_holder, i) + 1; } int _get_end_elt_from_IRanges_holder(const IRanges_holder *x_holder, int i) { if (x_holder->end) return x_holder->end[i]; return x_holder->start[i] + _get_width_elt_from_IRanges_holder(x_holder, i) - 1; } SEXP _get_names_elt_from_IRanges_holder(const IRanges_holder *x_holder, int i) { return STRING_ELT(x_holder->names, x_holder->SEXP_offset + i); } IRanges_holder _get_linear_subset_from_IRanges_holder( const IRanges_holder *x_holder, int offset, int length) { IRanges_holder y_holder; y_holder = *x_holder; y_holder.length = length; y_holder.start += offset; if (!y_holder.is_constant_width) y_holder.width += offset; y_holder.SEXP_offset += offset; return y_holder; } /**************************************************************************** * C-level slot setters. * * Be careful that these functions do NOT duplicate the assigned value! */ static void set_IRanges_start(SEXP x, SEXP value) { INIT_STATIC_SYMBOL(start) SET_SLOT(x, start_symbol, value); return; } static void set_IRanges_width(SEXP x, SEXP value) { INIT_STATIC_SYMBOL(width) SET_SLOT(x, width_symbol, value); /* Rprintf("set_IRanges_width(): value=%p _get_IRanges_width(x)=%p\n", value, _get_IRanges_width(x)); */ return; } static void set_IRanges_names(SEXP x, SEXP value) { INIT_STATIC_SYMBOL(NAMES) SET_SLOT(x, NAMES_symbol, value); return; } /* WARNING: Use only AFTER 'x@start' has been set! Because this setter is trying to figure out what the length of 'x' is. */ void _set_IRanges_names(SEXP x, SEXP names) { if (names == NULL) names = R_NilValue; else if (names != R_NilValue && LENGTH(names) != _get_IRanges_length(x)) error("_set_IRanges_names(): " "number of names and number of elements differ"); set_IRanges_names(x, names); return; } /* Note that 'start' and 'width' must NOT contain NAs. set_IRanges_slots() trusts the caller and does NOT check this! */ static void set_IRanges_slots(SEXP x, SEXP start, SEXP width, SEXP names) { if (LENGTH(width) != LENGTH(start)) error("set_IRanges_slots(): " "number of starts and number of widths differ"); set_IRanges_start(x, start); set_IRanges_width(x, width); _set_IRanges_names(x, names); return; } void _copy_IRanges_slots(SEXP x, SEXP x0) { SEXP slot; PROTECT(slot = duplicate(_get_IRanges_start(x0))); set_IRanges_start(x, slot); UNPROTECT(1); PROTECT(slot = duplicate(_get_IRanges_width(x0))); set_IRanges_width(x, slot); UNPROTECT(1); PROTECT(slot = duplicate(_get_IRanges_names(x0))); set_IRanges_names(x, slot); UNPROTECT(1); return; } /**************************************************************************** * C-level constructors. */ /* Be careful that this constructor does NOT duplicate its arguments before putting them in the slots of the returned object. So don't try to make it a .Call entry point! */ SEXP _new_IRanges(const char *classname, SEXP start, SEXP width, SEXP names) { SEXP classdef, ans; PROTECT(classdef = MAKE_CLASS(classname)); PROTECT(ans = NEW_OBJECT(classdef)); set_IRanges_slots(ans, start, width, names); UNPROTECT(2); return ans; } SEXP _new_IRanges_from_IntPairAE(const char *classname, const IntPairAE *intpair_ae) { SEXP ans, start, width; PROTECT(start = new_INTEGER_from_IntAE(intpair_ae->a)); PROTECT(width = new_INTEGER_from_IntAE(intpair_ae->b)); PROTECT(ans = _new_IRanges(classname, start, width, R_NilValue)); UNPROTECT(3); return ans; } /* TODO: Try to make this faster by making only 1 call to _new_IRanges() (or _alloc_IRanges()) and cloning and modifying this initial object inside the for loop. */ SEXP _new_list_of_IRanges_from_IntPairAEAE(const char *element_type, const IntPairAEAE *intpair_aeae) { SEXP ans, ans_elt; int nelt, i; const IntPairAE *ae; nelt = IntPairAEAE_get_nelt(intpair_aeae); PROTECT(ans = NEW_LIST(nelt)); for (i = 0; i < nelt; i++) { ae = intpair_aeae->elts[i]; PROTECT(ans_elt = _new_IRanges_from_IntPairAE(element_type, ae)); SET_VECTOR_ELT(ans, i, ans_elt); UNPROTECT(1); } UNPROTECT(1); return ans; } /* Allocation WITHOUT initialization. The 'start' and 'width' slots are not initialized (they contain junk). */ SEXP _alloc_IRanges(const char *classname, int length) { SEXP start, width, ans; PROTECT(start = NEW_INTEGER(length)); PROTECT(width = NEW_INTEGER(length)); PROTECT(ans = _new_IRanges(classname, start, width, R_NilValue)); UNPROTECT(3); return ans; } /**************************************************************************** * Validity functions. */ int _is_normal_IRanges_holder(const IRanges_holder *x_holder) { int x_len, i; x_len = _get_length_from_IRanges_holder(x_holder); if (x_len == 0) return 1; if (_get_width_elt_from_IRanges_holder(x_holder, 0) <= 0) return 0; for (i = 1; i < x_len; i++) { if (_get_width_elt_from_IRanges_holder(x_holder, i) <= 0) return 0; if (_get_start_elt_from_IRanges_holder(x_holder, i) <= _get_end_elt_from_IRanges_holder(x_holder, i - 1) + 1) return 0; } return 1; } /* --- .Call ENTRY POINT --- */ SEXP IRanges_isNormal(SEXP x) { IRanges_holder ir_holder; ir_holder = _hold_IRanges(x); return ScalarLogical(_is_normal_IRanges_holder(&ir_holder)); } /**************************************************************************** * Coercion functions. */ /* --- .Call ENTRY POINT --- */ SEXP IRanges_from_integer(SEXP x) { SEXP ans, ans_start, ans_width; int i, x_length, ans_length; int *start_buf, *width_buf; int *x_elt, *start_elt, *width_elt, prev_elt_plus1; x_length = LENGTH(x); if (x_length == 0) { PROTECT(ans_start = NEW_INTEGER(0)); PROTECT(ans_width = NEW_INTEGER(0)); } else { ans_length = 1; start_buf = (int *) R_alloc((long) x_length, sizeof(int)); width_buf = (int *) R_alloc((long) x_length, sizeof(int)); start_buf[0] = INTEGER(x)[0]; width_buf[0] = 1; prev_elt_plus1 = start_buf[0] + 1; start_elt = start_buf; width_elt = width_buf; for (i = 1, x_elt = (INTEGER(x)+1); i < x_length; i++, x_elt++) { if (*x_elt == NA_INTEGER) error("cannot create an IRanges object from an integer vector with missing values"); if (*x_elt == prev_elt_plus1) { *width_elt += 1; } else { ans_length++; start_elt++; width_elt++; *start_elt = *x_elt; *width_elt = 1; prev_elt_plus1 = *x_elt; } prev_elt_plus1++; } PROTECT(ans_start = NEW_INTEGER(ans_length)); PROTECT(ans_width = NEW_INTEGER(ans_length)); memcpy(INTEGER(ans_start), start_buf, sizeof(int) * ans_length); memcpy(INTEGER(ans_width), width_buf, sizeof(int) * ans_length); } PROTECT(ans = _new_IRanges("IRanges", ans_start, ans_width, R_NilValue)); UNPROTECT(3); return ans; } /* --- .Call ENTRY POINT --- */ SEXP NormalIRanges_from_logical(SEXP x) { SEXP ans, ans_start, ans_width; int i, x_length, buf_length, ans_length; int *start_buf, *width_buf; int *x_elt, *start_elt, *width_elt, prev_elt; x_length = LENGTH(x); if (x_length == 0) { PROTECT(ans_start = NEW_INTEGER(0)); PROTECT(ans_width = NEW_INTEGER(0)); } else { buf_length = x_length / 2 + 1; ans_length = 0; start_buf = (int *) R_alloc((long) buf_length, sizeof(int)); width_buf = (int *) R_alloc((long) buf_length, sizeof(int)); prev_elt = 0; start_elt = start_buf - 1; width_elt = width_buf - 1; for (i = 1, x_elt = LOGICAL(x); i <= x_length; i++, x_elt++) { if (*x_elt == NA_LOGICAL) error("cannot create an IRanges object from a logical vector with missing values"); if (*x_elt == 1) { if (prev_elt) { *width_elt += 1; } else { ans_length++; start_elt++; width_elt++; *start_elt = i; *width_elt = 1; } } prev_elt = *x_elt; } PROTECT(ans_start = NEW_INTEGER(ans_length)); PROTECT(ans_width = NEW_INTEGER(ans_length)); memcpy(INTEGER(ans_start), start_buf, sizeof(int) * ans_length); memcpy(INTEGER(ans_width), width_buf, sizeof(int) * ans_length); } PROTECT(ans = _new_IRanges("NormalIRanges", ans_start, ans_width, R_NilValue)); UNPROTECT(3); return ans; } IRanges/src/IRanges_constructor.c0000644000175400017540000001326613175724757020112 0ustar00biocbuildbiocbuild/**************************************************************************** * Support functions for the IRanges constructor * ****************************************************************************/ #include "IRanges.h" static char errmsg_buf[200]; /**************************************************************************** * solve_user_SEW0() */ static int solve_user_SEW0_row(int start, int end, int width, int *solved_start, int *solved_width) { int nb_of_unknowns; nb_of_unknowns = (start == NA_INTEGER) + (end == NA_INTEGER) + (width == NA_INTEGER); if (nb_of_unknowns >= 2) { snprintf(errmsg_buf, sizeof(errmsg_buf), "range cannot be determined from the supplied arguments (too many NAs)"); return -1; } if (start == NA_INTEGER) { start = end - width + 1; } else if (width == NA_INTEGER) { width = end - start + 1; } else if (end != NA_INTEGER && end != start + width - 1) { snprintf(errmsg_buf, sizeof(errmsg_buf), "supplied arguments are incompatible"); return -1; } if (width < 0) { snprintf(errmsg_buf, sizeof(errmsg_buf), "negative widths are not allowed"); return -1; } *solved_start = start; *solved_width = width; return 0; } SEXP solve_user_SEW0(SEXP start, SEXP end, SEXP width) { SEXP ans, ans_start, ans_width; int ans_length, i; ans_length = LENGTH(start); PROTECT(ans_start = NEW_INTEGER(ans_length)); PROTECT(ans_width = NEW_INTEGER(ans_length)); for (i = 0; i < ans_length; i++) { if (solve_user_SEW0_row(INTEGER(start)[i], INTEGER(end)[i], INTEGER(width)[i], INTEGER(ans_start) + i, INTEGER(ans_width) + i) != 0) { UNPROTECT(2); error("solving row %d: %s", i + 1, errmsg_buf); } } PROTECT(ans = _new_IRanges("IRanges", ans_start, ans_width, R_NilValue)); UNPROTECT(3); return ans; } /**************************************************************************** * solve_user_SEW() */ static int translate_negative_coord0; static int nonnarrowing_is_OK; static int translate_negative_startorend(int refwidth, int startorend) { if (startorend < 0) startorend += refwidth + 1; return startorend; } static int check_start(int refwidth, int start, const char *what) { if (nonnarrowing_is_OK) return 0; if (start < 1) { snprintf(errmsg_buf, sizeof(errmsg_buf), "'allow.nonnarrowing' is FALSE and the %s start " "(%d) is < 1", what, start); return -1; } if (start > refwidth + 1) { snprintf(errmsg_buf, sizeof(errmsg_buf), "'allow.nonnarrowing' is FALSE and the %s start " "(%d) is > refwidth + 1", what, start); return -1; } return 0; } static int check_end(int refwidth, int end, const char *what) { if (nonnarrowing_is_OK) return 0; if (end < 0) { snprintf(errmsg_buf, sizeof(errmsg_buf), "'allow.nonnarrowing' is FALSE and the %s end " "(%d) is < 0", what, end); return -1; } if (end > refwidth) { snprintf(errmsg_buf, sizeof(errmsg_buf), "'allow.nonnarrowing' is FALSE and the %s end " "(%d) is > refwidth", what, end); return -1; } return 0; } static int solve_user_SEW_row(int refwidth, int start, int end, int width, int *solved_start, int *solved_width) { if (refwidth == NA_INTEGER || refwidth < 0) { snprintf(errmsg_buf, sizeof(errmsg_buf), "negative values or NAs are not allowed in 'refwidths'"); return -1; } if (start != NA_INTEGER) { if (translate_negative_coord0) start = translate_negative_startorend(refwidth, start); if (check_start(refwidth, start, "supplied") != 0) return -1; } if (end != NA_INTEGER) { if (translate_negative_coord0) end = translate_negative_startorend(refwidth, end); if (check_end(refwidth, end, "supplied") != 0) return -1; } if (width == NA_INTEGER) { if (start == NA_INTEGER) start = 1; if (end == NA_INTEGER) end = refwidth; width = end - start + 1; if (width < 0) { snprintf(errmsg_buf, sizeof(errmsg_buf), "the supplied start/end lead to a " "negative width"); return -1; } } else if (width < 0) { snprintf(errmsg_buf, sizeof(errmsg_buf), "negative values are not allowed in 'width'"); return -1; } else if ((start == NA_INTEGER) == (end == NA_INTEGER)) { snprintf(errmsg_buf, sizeof(errmsg_buf), "either the supplied start or the supplied end " "(but not both) must be NA when the supplied width " "is not NA"); return -1; } else { // Either 'start' or 'end' is NA if (start == NA_INTEGER) { start = end - width + 1; if (check_start(refwidth, start, "solved") != 0) return -1; } else { end = start + width - 1; if (check_end(refwidth, end, "solved") != 0) return -1; } } *solved_start = start; *solved_width = width; return 0; } /* * --- .Call ENTRY POINT --- */ SEXP solve_user_SEW(SEXP refwidths, SEXP start, SEXP end, SEXP width, SEXP translate_negative_coord, SEXP allow_nonnarrowing) { SEXP ans, ans_start, ans_width; int ans_length, i0, i1, i2, i3; translate_negative_coord0 = LOGICAL(translate_negative_coord)[0]; nonnarrowing_is_OK = LOGICAL(allow_nonnarrowing)[0]; ans_length = LENGTH(refwidths); PROTECT(ans_start = NEW_INTEGER(ans_length)); PROTECT(ans_width = NEW_INTEGER(ans_length)); for (i0 = i1 = i2 = i3 = 0; i0 < ans_length; i0++, i1++, i2++, i3++) { /* recycling */ if (i1 >= LENGTH(start)) i1 = 0; if (i2 >= LENGTH(end)) i2 = 0; if (i3 >= LENGTH(width)) i3 = 0; if (solve_user_SEW_row(INTEGER(refwidths)[i0], INTEGER(start)[i1], INTEGER(end)[i2], INTEGER(width)[i3], INTEGER(ans_start) + i0, INTEGER(ans_width) + i0) != 0) { UNPROTECT(2); error("solving row %d: %s", i0 + 1, errmsg_buf); } } PROTECT(ans = _new_IRanges("IRanges", ans_start, ans_width, R_NilValue)); UNPROTECT(3); return ans; } IRanges/src/NCList.c0000644000175400017540000013047013175724757015246 0ustar00biocbuildbiocbuild/**************************************************************************** * A Nested Containment List implementation * * Author: H. Pag\`es * ****************************************************************************/ #include "IRanges.h" #include "S4Vectors_interface.h" #include /* for malloc, realloc, free, abs */ #include /* for log10 */ /* #include static double cumulated_time = 0.0; static clock_t clock0; static void start_clock() { clock0 = clock(); } static void stop_clock() { cumulated_time += ((double) clock() - clock0) / CLOCKS_PER_SEC; } static void init_clock(const char *msg) { printf("%s", msg); cumulated_time = 0.0; clock0 = clock(); } static void print_elapsed_time() { stop_clock(); printf("%8.6f s\n", cumulated_time); } */ /**************************************************************************** * A simple wrapper to realloc() */ /* 'new_nmemb' must be > 'old_nmemb'. */ static void *realloc2(void *ptr, int new_nmemb, int old_nmemb, size_t size) { void *new_ptr; if (new_nmemb <= old_nmemb) error("IRanges internal error in realloc2(): " "'new_nmemb' <= 'old_nmemb'"); size *= new_nmemb; if (old_nmemb == 0) { new_ptr = malloc(size); } else { new_ptr = realloc(ptr, size); } if (new_ptr == NULL) error("IRanges internal error in realloc2(): " "memory (re)allocation failed"); return new_ptr; } static int get_new_maxdepth(int maxdepth) { return maxdepth == 0 ? 16384 : 4 * maxdepth; } /**************************************************************************** * NCList structure */ /* sizeof(NCList) is 24 bytes (0x18 bytes) */ typedef struct nclist_t { int buflength; /* >= 0 */ int nchildren; /* >= 0 and <= buflength */ struct nclist_t *childrenbuf; /* Of length 'buflength'. */ int *rgidbuf; /* Of length 'nchildren'. The IDs of the ranges asso- ciated with the children. The ID of a range is just its 0-based position in original Ranges object 'x'. Allows reverse mapping of the children into 'x' (e.g. to find their start, end, or width). */ } NCList; static void init_NCList(NCList *nclist) { nclist->buflength = nclist->nchildren = 0; return; } /**************************************************************************** * Utilities to walk on an NCList structure non-recursively */ typedef struct NCList_walking_stack_elt_t { const NCList *parent_nclist; int n; /* point to n-th child of 'parent_nclist' */ } NCListWalkingStackElt; #define GET_NCLIST(stack_elt) \ ((stack_elt)->parent_nclist->childrenbuf + (stack_elt)->n) #define GET_RGID(stack_elt) \ ((stack_elt)->parent_nclist->rgidbuf[(stack_elt)->n]) static NCListWalkingStackElt *NCList_walking_stack = NULL; static int NCList_walking_stack_maxdepth = 0; static int NCList_walking_stack_depth = 0; #define RESET_NCLIST_WALKING_STACK() NCList_walking_stack_depth = 0 /* Must NOT be called when 'NCList_walking_stack_depth' is 0 (i.e. when stack in empty). */ static NCListWalkingStackElt *pop_NCListWalkingStackElt() { NCList_walking_stack_depth--; return NCList_walking_stack + NCList_walking_stack_depth; } /* Must NOT be called when 'NCList_walking_stack_depth' is 0 (i.e. when stack in empty). */ static NCListWalkingStackElt *peek_NCListWalkingStackElt() { return NCList_walking_stack + NCList_walking_stack_depth - 1; } static void extend_NCList_walking_stack() { int new_maxdepth; new_maxdepth = get_new_maxdepth(NCList_walking_stack_maxdepth); NCList_walking_stack = (NCListWalkingStackElt *) realloc2(NCList_walking_stack, new_maxdepth, NCList_walking_stack_maxdepth, sizeof(NCListWalkingStackElt)); NCList_walking_stack_maxdepth = new_maxdepth; return; } /* Return a pointer to n-th child. */ static const NCList *move_to_child(const NCList *parent_nclist, int n) { NCListWalkingStackElt *stack_elt; if (NCList_walking_stack_depth == NCList_walking_stack_maxdepth) extend_NCList_walking_stack(); stack_elt = NCList_walking_stack + NCList_walking_stack_depth++; stack_elt->parent_nclist = parent_nclist; stack_elt->n = n; return GET_NCLIST(stack_elt); } /* Must NOT be called when 'NCList_walking_stack_depth' is 0 (i.e. when stack in empty). */ static const NCList *move_to_right_sibling_or_uncle(const NCList *nclist) { NCListWalkingStackElt *stack_elt; stack_elt = NCList_walking_stack + NCList_walking_stack_depth; do { stack_elt--; if (++(stack_elt->n) < stack_elt->parent_nclist->nchildren) return ++nclist; nclist = stack_elt->parent_nclist; } while (--NCList_walking_stack_depth != 0); return NULL; } /* Must NOT be called when 'NCList_walking_stack_depth' is 0 (i.e. when stack in empty). */ static const NCList *move_to_right_uncle() { const NCList *parent_nclist; parent_nclist = pop_NCListWalkingStackElt()->parent_nclist; if (NCList_walking_stack_depth == 0) return NULL; return move_to_right_sibling_or_uncle(parent_nclist); } static const NCList *move_down(const NCList *nclist) { while (nclist->nchildren != 0) nclist = move_to_child(nclist, 0); return nclist; } /* Top-down walk: parent is treated before children and children are treated from left to right. For a top-down walk that visits the entire tree (i.e. "complete walk") do: RESET_NCLIST_WALKING_STACK(); for (nclist = top_nclist; nclist != NULL; nclist = next_top_down(nclist)) { treat nclist } */ static const NCList *next_top_down(const NCList *nclist) { /* Try to move to first child, if any. */ if (nclist->nchildren != 0) return move_to_child(nclist, 0); if (NCList_walking_stack_depth == 0) return NULL; return move_to_right_sibling_or_uncle(nclist); } /* Bottom-up walk: children are treated from left to right and before parent. For a bottom-up walk that visits the entire tree (i.e. "complete walk"), do: RESET_NCLIST_WALKING_STACK(); for (nclist = move_down(top_nclist); nclist != NULL; nclist = next_bottom_up()) { treat nclist } */ static const NCList *next_bottom_up() { NCListWalkingStackElt *stack_elt; const NCList *parent_nclist; if (NCList_walking_stack_depth == 0) return NULL; stack_elt = peek_NCListWalkingStackElt(); stack_elt->n++; parent_nclist = stack_elt->parent_nclist; if (stack_elt->n < parent_nclist->nchildren) { /* Move down thru the next children. */ return move_down(GET_NCLIST(stack_elt)); } /* All children have been treated --> move 1 level up. */ NCList_walking_stack_depth--; return parent_nclist; } /**************************************************************************** * Test the top-down and bottom-up non-recursive walks on an NCList structure */ /* static void print_NCList_walking_stack() { int d; printf("NCList_walking_stack:"); for (d = 0; d < NCList_walking_stack_depth; d++) printf(" %d", NCList_walking_stack[d].n); printf("\n"); return; } static void print_NCList_node(const NCList *nclist, int depth) { int d, n; for (d = 0; d < depth; d++) printf("-"); printf(" "); printf("NCList node at address %p:\n", nclist); for (d = 0; d < depth; d++) printf("-"); printf(" "); printf(" buflength=%d; nchildren=%d\n", nclist->buflength, nclist->nchildren); for (d = 0; d < depth; d++) printf("-"); printf(" "); printf(" rgidbuf:"); for (n = 0; n < nclist->nchildren; n++) printf(" %d", nclist->rgidbuf[n]); printf("\n"); return; } static void print_NCList_rec(const NCList *nclist, int depth) { int n; print_NCList_node(nclist, depth); for (n = 0; n < nclist->nchildren; n++) print_NCList_rec(nclist->childrenbuf + n, depth + 1); return; } static void test_complete_top_down_walk(const NCList *top_nclist) { const NCList *nclist; printf("======= START complete top-down walk ========\n"); RESET_NCLIST_WALKING_STACK(); for (nclist = top_nclist; nclist != NULL; nclist = next_top_down(nclist)) { print_NCList_walking_stack(); print_NCList_node(nclist, NCList_walking_stack_depth); printf("\n"); fflush(stdout); } printf("======== END complete top-down walk =========\n"); return; } static void test_complete_bottom_up_walk(const NCList *top_nclist) { const NCList *nclist; printf("======= START complete bottom-up walk =======\n"); RESET_NCLIST_WALKING_STACK(); for (nclist = move_down(top_nclist); nclist != NULL; nclist = next_bottom_up()) { print_NCList_walking_stack(); print_NCList_node(nclist, NCList_walking_stack_depth); printf("\n"); fflush(stdout); } printf("======== END complete bottom-up walk ========\n"); return; } */ /**************************************************************************** * free_NCList() */ static void free_NCList(const NCList *top_nclist) { const NCList *nclist; /* Complete bottom-up walk. */ RESET_NCLIST_WALKING_STACK(); for (nclist = move_down(top_nclist); nclist != NULL; nclist = next_bottom_up()) { if (nclist->buflength != 0) { free(nclist->childrenbuf); free(nclist->rgidbuf); } } return; } /**************************************************************************** * NCList_new() and NCList_free() */ /* --- .Call ENTRY POINT --- */ SEXP NCList_new() { NCList *top_nclist; //init_clock("preprocessing: T1 = "); top_nclist = (NCList *) malloc(sizeof(NCList)); if (top_nclist == NULL) error("NCList_new: memory allocation failed"); init_NCList(top_nclist); return R_MakeExternalPtr(top_nclist, R_NilValue, R_NilValue); } /* --- .Call ENTRY POINT --- */ SEXP NCList_free(SEXP nclist_xp) { NCList *top_nclist; top_nclist = (NCList *) R_ExternalPtrAddr(nclist_xp); if (top_nclist == NULL) error("NCList_free: pointer to NCList struct is NULL"); free_NCList(top_nclist); free(top_nclist); R_SetExternalPtrAddr(nclist_xp, NULL); return R_NilValue; } /**************************************************************************** * NCList_build() */ static void extend_NCList(NCList *nclist) { int old_buflength, new_buflength; NCList *new_childrenbuf; int *new_rgidbuf; old_buflength = nclist->buflength; if (old_buflength == 0) { new_buflength = 1; } else { if (old_buflength < 256) new_buflength = 16 * old_buflength; else if (old_buflength < 131072) new_buflength = 8 * old_buflength; else if (old_buflength < 8388608) new_buflength = 4 * old_buflength; else if (old_buflength < 134217728) new_buflength = 2 * old_buflength; else new_buflength = old_buflength + 67108864; } new_childrenbuf = (NCList *) realloc2(nclist->childrenbuf, new_buflength, old_buflength, sizeof(NCList)); new_rgidbuf = (int *) realloc2(nclist->rgidbuf, new_buflength, old_buflength, sizeof(int)); nclist->buflength = new_buflength; nclist->childrenbuf = new_childrenbuf; nclist->rgidbuf = new_rgidbuf; return; } typedef struct NCList_building_stack_elt_t { NCList *nclist; int rgid; /* range ID */ } NCListBuildingStackElt; static NCListBuildingStackElt *NCList_building_stack = NULL; static int NCList_building_stack_maxdepth = 0; static NCListBuildingStackElt append_NCList_elt(NCList *landing_nclist, int rgid) { int nchildren; NCListBuildingStackElt stack_elt; nchildren = landing_nclist->nchildren; if (nchildren == landing_nclist->buflength) extend_NCList(landing_nclist); stack_elt.nclist = landing_nclist->childrenbuf + nchildren; stack_elt.rgid = landing_nclist->rgidbuf[nchildren] = rgid; init_NCList(stack_elt.nclist); landing_nclist->nchildren++; return stack_elt; } static void extend_NCList_building_stack() { int new_maxdepth; new_maxdepth = get_new_maxdepth(NCList_building_stack_maxdepth); NCList_building_stack = (NCListBuildingStackElt *) realloc2(NCList_building_stack, new_maxdepth, NCList_building_stack_maxdepth, sizeof(NCListBuildingStackElt)); NCList_building_stack_maxdepth = new_maxdepth; return; } static void build_NCList(NCList *top_nclist, const int *x_start_p, const int *x_end_p, const int *x_subset_p, int x_len) { int *base, rgid, retcode, i, d, current_end; NCList *landing_nclist; NCListBuildingStackElt stack_elt; /* Compute the order of 'x' (or its subset) in 'base'. The sorting is first by ascending start then by descending end. */ base = (int *) malloc(sizeof(int) * x_len); if (base == NULL) error("build_NCList: memory allocation failed"); if (x_subset_p == NULL) { for (rgid = 0; rgid < x_len; rgid++) base[rgid] = rgid; } else { memcpy(base, x_subset_p, sizeof(int) * x_len); } retcode = sort_int_pairs(base, x_len, x_start_p, x_end_p, 0, 1, 1, NULL, NULL); if (retcode != 0) { free(base); error("build_NCList: memory allocation failed"); } init_NCList(top_nclist); for (i = 0, d = -1; i < x_len; i++) { rgid = base[i]; current_end = x_end_p[rgid]; while (d >= 0 && x_end_p[NCList_building_stack[d].rgid] < current_end) d--; // unstack landing_nclist = d == -1 ? top_nclist : NCList_building_stack[d].nclist; // append 'rgid' to landing_nclist stack_elt = append_NCList_elt(landing_nclist, rgid); // put stack_elt on stack if (++d == NCList_building_stack_maxdepth) extend_NCList_building_stack(); NCList_building_stack[d] = stack_elt; } free(base); return; } /* --- .Call ENTRY POINT --- */ SEXP NCList_build(SEXP nclist_xp, SEXP x_start, SEXP x_end, SEXP x_subset) { NCList *top_nclist; int x_len; const int *x_start_p, *x_end_p, *x_subset_p; top_nclist = (NCList *) R_ExternalPtrAddr(nclist_xp); if (top_nclist == NULL) error("NCList_build: pointer to NCList struct is NULL"); x_len = check_integer_pairs(x_start, x_end, &x_start_p, &x_end_p, "start(x)", "end(x)"); if (x_subset == R_NilValue) { x_subset_p = NULL; } else { x_subset_p = INTEGER(x_subset); x_len = LENGTH(x_subset); } build_NCList(top_nclist, x_start_p, x_end_p, x_subset_p, x_len); return nclist_xp; } /**************************************************************************** * new_NCListAsINTSXP_from_NCList() */ /* * Setting an arbitrary hard limit on the max depth of NCListAsINTSXP objects * to prevent C stack overflows when walking on them recursively (e.g. with * print_NCListAsINTSXP_rec() or NCListAsINTSXP_get_y_overlaps_rec()). * A better solution would be to not use recursive code at all when traversing * an NCListAsINTSXP object. Then NCListAsINTSXP objects of arbitrary depth * could be supported and it wouldn't be necessary to set the limit below. */ #define NCListAsINTSXP_MAX_DEPTH 100000 #define NCListAsINTSXP_NCHILDREN(nclist) ((nclist)[0]) #define NCListAsINTSXP_RGIDS(nclist) ((nclist) + 1) #define NCListAsINTSXP_OFFSETS(nclist) \ ((nclist) + 1 + NCListAsINTSXP_NCHILDREN(nclist)) static int compute_NCListAsINTSXP_length(const NCList *top_nclist) { unsigned int ans_len; const NCList *nclist; int nchildren; ans_len = 0U; /* Complete bottom-up walk (top-down walk would also work). */ RESET_NCLIST_WALKING_STACK(); for (nclist = move_down(top_nclist); nclist != NULL; nclist = next_bottom_up()) { if (NCList_walking_stack_depth > NCListAsINTSXP_MAX_DEPTH) error("compute_NCListAsINTSXP_length: " "NCList object is too deep (has more " "than\n %d levels of nested ranges)", NCListAsINTSXP_MAX_DEPTH); nchildren = nclist->nchildren; if (nchildren == 0) continue; ans_len += 1U + 2U * (unsigned int) nchildren; if (ans_len > INT_MAX) error("compute_NCListAsINTSXP_length: " "NCList object is too big to fit in " "an integer vector"); } return (int) ans_len; } /* Recursive! */ static int dump_NCList_to_int_array_rec(const NCList *nclist, int *out) { int nchildren, offset, dump_len, n; const NCList *child_nclist; const int *rgid_p; nchildren = nclist->nchildren; if (nchildren == 0) return 0; offset = 1 + 2 * nchildren; NCListAsINTSXP_NCHILDREN(out) = nchildren; for (n = 0, child_nclist = nclist->childrenbuf, rgid_p = nclist->rgidbuf; n < nchildren; n++, child_nclist++, rgid_p++) { NCListAsINTSXP_RGIDS(out)[n] = *rgid_p; dump_len = dump_NCList_to_int_array_rec(child_nclist, out + offset); NCListAsINTSXP_OFFSETS(out)[n] = dump_len != 0 ? offset : -1; offset += dump_len; } return offset; } /* --- .Call ENTRY POINT --- */ SEXP new_NCListAsINTSXP_from_NCList(SEXP nclist_xp) { SEXP ans; const NCList *top_nclist; int ans_len; top_nclist = (NCList *) R_ExternalPtrAddr(nclist_xp); if (top_nclist == NULL) error("new_NCListAsINTSXP_from_NCList: " "pointer to NCList struct is NULL"); ans_len = compute_NCListAsINTSXP_length(top_nclist); PROTECT(ans = NEW_INTEGER(ans_len)); dump_NCList_to_int_array_rec(top_nclist, INTEGER(ans)); UNPROTECT(1); //print_elapsed_time(); return ans; } /**************************************************************************** * NCListAsINTSXP_print() */ /* Recursive! Print 1 line per range in 'nclist'. Return max depth. */ static int print_NCListAsINTSXP_rec(const int *nclist, const int *x_start_p, const int *x_end_p, int depth, const char *format) { int maxdepth, nchildren, n, d, rgid, offset, tmp; maxdepth = depth; nchildren = NCListAsINTSXP_NCHILDREN(nclist); for (n = 0; n < nchildren; n++) { for (d = 1; d < depth; d++) Rprintf("|"); rgid = NCListAsINTSXP_RGIDS(nclist)[n]; Rprintf(format, rgid + 1); Rprintf(": [%d, %d]\n", x_start_p[rgid], x_end_p[rgid]); offset = NCListAsINTSXP_OFFSETS(nclist)[n]; if (offset != -1) { tmp = print_NCListAsINTSXP_rec(nclist + offset, x_start_p, x_end_p, depth + 1, format); if (tmp > maxdepth) maxdepth = tmp; } } return maxdepth; } /* --- .Call ENTRY POINT --- */ SEXP NCListAsINTSXP_print(SEXP x_nclist, SEXP x_start, SEXP x_end) { const int *top_nclist; int x_len, max_digits, maxdepth; const int *x_start_p, *x_end_p; char format[10]; top_nclist = INTEGER(x_nclist); x_len = check_integer_pairs(x_start, x_end, &x_start_p, &x_end_p, "start(x)", "end(x)"); if (x_len == 0) { maxdepth = 0; } else { max_digits = (int) log10((double) x_len) + 1; sprintf(format, "%c0%d%c", '%', max_digits, 'd'); maxdepth = print_NCListAsINTSXP_rec(top_nclist, x_start_p, x_end_p, 1, format); } Rprintf("max depth = %d\n", maxdepth); return R_NilValue; } /**************************************************************************** * pp_find_overlaps() */ /* 6 supported types of overlap. */ #define TYPE_ANY 1 #define TYPE_START 2 #define TYPE_END 3 #define TYPE_WITHIN 4 #define TYPE_EXTEND 5 #define TYPE_EQUAL 6 typedef struct backpack_t { /* Members set by prepare_backpack(). */ const int *x_start_p; const int *x_end_p; const int *x_space_p; int maxgap; int minoverlap; int overlap_type; int min_overlap_score0; int (*is_hit_fun)(int rgid, const struct backpack_t *backpack); int select_mode; int circle_len; int pp_is_q; IntAE *hits; int *direct_out; /* Members set by update_backpack(). */ int y_rgid; int y_start; int y_end; int y_space; int min_x_end; int max_x_start; } Backpack; static int overlap_score0(int x_start, int x_end, int y_start, int y_end) { return (x_end <= y_end ? x_end : y_end) - (x_start >= y_start ? x_start : y_start); } static int is_TYPE_ANY_hit(int rgid, const Backpack *backpack) { int x_start, x_end; if (backpack->minoverlap == 0) return 1; /* Check the score */ x_start = backpack->x_start_p[rgid]; x_end = backpack->x_end_p[rgid]; return x_end - x_start >= backpack->min_overlap_score0; } static int is_TYPE_START_hit(int rgid, const Backpack *backpack) { int x_start, x_end, d, score0; /* Check the distance between the starts. */ x_start = backpack->x_start_p[rgid]; d = abs(backpack->y_start - x_start); if (d > backpack->maxgap) return 0; /* Check the score, but only if minoverlap != 0. */ if (backpack->minoverlap == 0) return 1; x_end = backpack->x_end_p[rgid]; score0 = overlap_score0(x_start, x_end, backpack->y_start, backpack->y_end); return score0 >= backpack->min_overlap_score0; } static int is_TYPE_END_hit(int rgid, const Backpack *backpack) { int x_start, x_end, d, score0; /* Check the distance between the ends. */ x_end = backpack->x_end_p[rgid]; d = abs(backpack->y_end - x_end); if (backpack->circle_len != NA_INTEGER) d %= backpack->circle_len; if (d > backpack->maxgap) return 0; /* Check the score, but only if minoverlap != 0. */ if (backpack->minoverlap == 0) return 1; x_start = backpack->x_start_p[rgid]; score0 = overlap_score0(x_start, x_end, backpack->y_start, backpack->y_end); return score0 >= backpack->min_overlap_score0; } static int is_TYPE_WITHIN_hit(int rgid, const Backpack *backpack) { int x_start, x_end, d; if (backpack->maxgap == 0) return 1; x_start = backpack->x_start_p[rgid]; x_end = backpack->x_end_p[rgid]; d = backpack->y_start - x_start + x_end - backpack->y_end; return d <= backpack->maxgap; } static int is_TYPE_EXTEND_hit(int rgid, const Backpack *backpack) { int x_start, x_end, d1, d2; x_start = backpack->x_start_p[rgid]; d1 = x_start - backpack->y_start; if (d1 < 0) return 0; x_end = backpack->x_end_p[rgid]; d2 = backpack->y_end - x_end; if (d2 < 0) return 0; if (x_end - x_start < backpack->min_overlap_score0) return 0; if (backpack->maxgap == 0) return 1; return d1 + d2 <= backpack->maxgap; } static int is_TYPE_EQUAL_hit(int rgid, const Backpack *backpack) { int x_start, x_end, d, score0; /* Check the distance between the starts. */ x_start = backpack->x_start_p[rgid]; d = abs(backpack->y_start - x_start); if (d > backpack->maxgap) return 0; /* Check the distance between the ends. */ x_end = backpack->x_end_p[rgid]; d = abs(backpack->y_end - x_end); if (backpack->circle_len != NA_INTEGER) d %= backpack->circle_len; if (d > backpack->maxgap) return 0; /* Check the score, but only if minoverlap != 0. */ if (backpack->minoverlap == 0) return 1; score0 = overlap_score0(x_start, x_end, backpack->y_start, backpack->y_end); return score0 >= backpack->min_overlap_score0; } static int is_hit(int rgid, const Backpack *backpack) { int x_space; /* 1st: perform checks common to all types of overlaps */ if (backpack->x_space_p != NULL && backpack->y_space != 0) { x_space = backpack->x_space_p[rgid]; if (x_space != 0 && x_space != backpack->y_space) return 0; } /* 2nd: perform checks specific to the current type of overlaps (by calling the callback function for this type) */ return backpack->is_hit_fun(rgid, backpack); } static void report_hit(int rgid, const Backpack *backpack) { int rgid1, q_rgid, s_rgid1, *selection_p; rgid1 = rgid + 1; /* 1-based */ if (backpack->select_mode == ALL_HITS) { /* Report the hit. */ IntAE_insert_at(backpack->hits, IntAE_get_nelt(backpack->hits), rgid1); return; } /* Update current selection if necessary. */ if (backpack->pp_is_q) { q_rgid = rgid; s_rgid1 = backpack->y_rgid + 1; } else { q_rgid = backpack->y_rgid; s_rgid1 = rgid1; } selection_p = backpack->direct_out + q_rgid; if (backpack->select_mode == COUNT_HITS) { (*selection_p)++; return; } if (*selection_p == NA_INTEGER || (backpack->select_mode == FIRST_HIT) == (s_rgid1 < *selection_p)) *selection_p = s_rgid1; return; } static Backpack prepare_backpack(const int *x_start_p, const int *x_end_p, const int *x_space_p, int maxgap, int minoverlap, int overlap_type, int select_mode, int circle_len, int pp_is_q, IntAE *hits, int *direct_out) { Backpack backpack; backpack.x_start_p = x_start_p; backpack.x_end_p = x_end_p; backpack.x_space_p = x_space_p; backpack.maxgap = maxgap; backpack.minoverlap = minoverlap; backpack.overlap_type = overlap_type; if (overlap_type == TYPE_ANY) backpack.min_overlap_score0 = minoverlap - maxgap - 2; else backpack.min_overlap_score0 = minoverlap - 1; /* set callback function for the current type of overlaps */ switch (overlap_type) { case TYPE_ANY: backpack.is_hit_fun = is_TYPE_ANY_hit; break; case TYPE_START: backpack.is_hit_fun = is_TYPE_START_hit; break; case TYPE_END: backpack.is_hit_fun = is_TYPE_END_hit; break; case TYPE_WITHIN: backpack.is_hit_fun = is_TYPE_WITHIN_hit; break; case TYPE_EXTEND: backpack.is_hit_fun = is_TYPE_EXTEND_hit; break; case TYPE_EQUAL: backpack.is_hit_fun = is_TYPE_EQUAL_hit; break; } backpack.select_mode = select_mode; backpack.circle_len = circle_len; backpack.pp_is_q = pp_is_q; backpack.hits = hits; backpack.direct_out = direct_out; return backpack; } static void update_backpack(Backpack *backpack, int y_rgid, int y_start, int y_end, int y_space) { int slack, min_x_end, max_x_start, min_overlap_score0; backpack->y_rgid = y_rgid; backpack->y_start = y_start; backpack->y_end = y_end; backpack->y_space = y_space; /* set 'min_x_end' and 'max_x_start' */ if (backpack->overlap_type == TYPE_ANY) { if (backpack->minoverlap == 0) { slack = backpack->maxgap + 1; } else { slack = 1 - backpack->minoverlap; } backpack->min_x_end = y_start - slack; backpack->max_x_start = y_end + slack; return; } if (backpack->overlap_type == TYPE_WITHIN) { backpack->min_x_end = backpack->y_end; backpack->max_x_start = backpack->y_start; return; } if (backpack->overlap_type == TYPE_EXTEND || backpack->minoverlap != 0 || backpack->circle_len != NA_INTEGER) { min_overlap_score0 = backpack->min_overlap_score0; backpack->min_x_end = y_start + min_overlap_score0; backpack->max_x_start = y_end - min_overlap_score0; if (backpack->overlap_type == TYPE_EXTEND) return; } /* TYPE_START, TYPE_END, or TYPE_EQUAL */ /* min_x_end */ if (backpack->overlap_type == TYPE_START) { /* TYPE_START */ if (backpack->minoverlap == 0) backpack->min_x_end = y_start - backpack->maxgap - 1; } else if (backpack->circle_len == NA_INTEGER) { /* TYPE_END or TYPE_EQUAL */ min_x_end = y_end - backpack->maxgap; if (backpack->minoverlap == 0 || min_x_end > backpack->min_x_end) backpack->min_x_end = min_x_end; } /* max_x_start */ if (backpack->overlap_type == TYPE_END) { /* TYPE_END */ if (backpack->minoverlap == 0) backpack->max_x_start = y_end + backpack->maxgap + 1; //} else if (backpack->circle_len == NA_INTEGER) { } else { /* TYPE_START or TYPE_EQUAL */ max_x_start = y_start + backpack->maxgap; if (backpack->minoverlap == 0 || max_x_start < backpack->max_x_start) backpack->max_x_start = max_x_start; } //printf("y_start=%d y_end=%d min_x_end=%d max_x_start=%d\n", // y_start, y_end, backpack->min_x_end, backpack->max_x_start); return; } static void shift_y(Backpack *backpack, int shift) { backpack->y_start += shift; backpack->y_end += shift; backpack->min_x_end += shift; backpack->max_x_start += shift; return; } typedef void (*GetYOverlapsFunType)(const void *x_nclist, const Backpack *backpack); static void pp_find_overlaps( const int *q_start_p, const int *q_end_p, const int *q_space_p, const int *q_subset_p, int q_len, const int *s_start_p, const int *s_end_p, const int *s_space_p, const int *s_subset_p, int s_len, int maxgap, int minoverlap, int overlap_type, int select_mode, int circle_len, const void *pp, int pp_is_q, GetYOverlapsFunType get_y_overlaps_fun, IntAE *qh_buf, IntAE *sh_buf, int *direct_out) { const int *x_start_p, *x_end_p, *x_space_p, *y_start_p, *y_end_p, *y_space_p, *y_subset_p; int y_len, backpack_select_mode, i, j, y_start, y_end, old_nhit, new_nhit, k; IntAE *xh_buf, *yh_buf; Backpack backpack; if (q_len == 0 || s_len == 0) return; if (pp_is_q) { x_start_p = q_start_p; x_end_p = q_end_p; x_space_p = q_space_p; xh_buf = qh_buf; y_start_p = s_start_p; y_end_p = s_end_p; y_space_p = s_space_p; y_subset_p = s_subset_p; y_len = s_len; yh_buf = sh_buf; if (overlap_type == TYPE_WITHIN) overlap_type = TYPE_EXTEND; else if (overlap_type == TYPE_EXTEND) overlap_type = TYPE_WITHIN; } else { x_start_p = s_start_p; x_end_p = s_end_p; x_space_p = s_space_p; xh_buf = sh_buf; y_start_p = q_start_p; y_end_p = q_end_p; y_space_p = q_space_p; y_subset_p = q_subset_p; y_len = q_len; yh_buf = qh_buf; } if (circle_len != NA_INTEGER && select_mode == COUNT_HITS) backpack_select_mode = ALL_HITS; else backpack_select_mode = select_mode; backpack = prepare_backpack(x_start_p, x_end_p, x_space_p, maxgap, minoverlap, overlap_type, backpack_select_mode, circle_len, pp_is_q, xh_buf, direct_out); for (i = 0; i < y_len; i++) { j = y_subset_p == NULL ? i : y_subset_p[i]; y_start = y_start_p[j]; y_end = y_end_p[j]; if (y_end - y_start < backpack.min_overlap_score0) continue; update_backpack(&backpack, j, y_start, y_end, y_space_p == NULL ? 0 : y_space_p[j]); /* pass 0 */ get_y_overlaps_fun(pp, &backpack); if (circle_len == NA_INTEGER) goto life_is_good; if (select_mode == ARBITRARY_HIT && !pp_is_q && direct_out[j] != NA_INTEGER) goto life_is_good; /* pass 1 */ shift_y(&backpack, - circle_len); get_y_overlaps_fun(pp, &backpack); if (select_mode == ARBITRARY_HIT && !pp_is_q && direct_out[j] != NA_INTEGER) goto life_is_good; /* pass 2 */ shift_y(&backpack, 2 * circle_len); get_y_overlaps_fun(pp, &backpack); life_is_good: if (backpack_select_mode != ALL_HITS) continue; old_nhit = IntAE_get_nelt(yh_buf); if (circle_len != NA_INTEGER) { /* delete duplicates */ IntAE_qsort(xh_buf, old_nhit, 0); IntAE_uniq(xh_buf, old_nhit); } new_nhit = IntAE_get_nelt(xh_buf); if (select_mode != COUNT_HITS) { j++; /* 1-based */ for (k = old_nhit; k < new_nhit; k++) IntAE_insert_at(yh_buf, k, j); continue; } if (pp_is_q) { for (k = old_nhit; k < new_nhit; k++) direct_out[xh_buf->elts[k] - 1]++; } else { direct_out[j] += new_nhit - old_nhit; } IntAE_set_nelt(xh_buf, old_nhit); } return; } /**************************************************************************** * int_bsearch() */ /* * 'subset_len' is assumed to be > 0. * Return the first index 'n' for which 'base[subset[n]] >= min', or * 'subset_len' if there is no such index. * TODO: Maybe move this to int_utils.c or sort_utils.c in S4Vectors/src/ */ static int int_bsearch(const int *subset, int subset_len, const int *base, int min) { int n1, n2, n, b; /* Check first element. */ n1 = 0; b = base[subset[n1]]; if (b >= min) return n1; /* Check last element. */ n2 = subset_len - 1; b = base[subset[n2]]; if (b < min) return subset_len; if (b == min) return n2; /* Binary search. Seems that using >> 1 instead of / 2 is faster, even when compiling with 'gcc -O2' (one would hope that the optimizer is able to do that kind of optimization). */ while ((n = (n1 + n2) >> 1) != n1) { b = base[subset[n]]; if (b == min) return n; if (b < min) n1 = n; else n2 = n; } return n2; } /**************************************************************************** * NCList_get_y_overlaps() */ /* Recursive! */ static void NCList_get_y_overlaps_rec(const NCList *x_nclist, const Backpack *backpack) { const int *rgidbuf; int nchildren, n, rgid; const NCList *child_nclist; rgidbuf = x_nclist->rgidbuf; nchildren = x_nclist->nchildren; n = int_bsearch(rgidbuf, nchildren, backpack->x_end_p, backpack->min_x_end); for (child_nclist = x_nclist->childrenbuf + n, rgidbuf = rgidbuf + n; n < nchildren; n++, child_nclist++, rgidbuf++) { rgid = *rgidbuf; if (backpack->x_start_p[rgid] > backpack->max_x_start) break; if (is_hit(rgid, backpack)) { report_hit(rgid, backpack); if (backpack->select_mode == ARBITRARY_HIT && !backpack->pp_is_q) break; } if (child_nclist->nchildren != 0) NCList_get_y_overlaps_rec(child_nclist, backpack); } return; } static int find_landing_child(const NCList *nclist, const Backpack *backpack) { int nchildren, n; nchildren = nclist->nchildren; if (nchildren == 0) return -1; n = int_bsearch(nclist->rgidbuf, nchildren, backpack->x_end_p, backpack->min_x_end); if (n >= nchildren) return -1; return n; } /* Non-recursive version of NCList_get_y_overlaps_rec(). */ static void NCList_get_y_overlaps(const NCList *top_nclist, const Backpack *backpack) { int n, rgid; const NCList *nclist; NCListWalkingStackElt *stack_elt; /* Incomplete top-down walk: only a pruned version of the full tree (i.e. a subtree starting at the same top node) will be visited. */ RESET_NCLIST_WALKING_STACK(); n = find_landing_child(top_nclist, backpack); if (n < 0) return; nclist = move_to_child(top_nclist, n); while (nclist != NULL) { stack_elt = peek_NCListWalkingStackElt(); rgid = GET_RGID(stack_elt); if (backpack->x_start_p[rgid] > backpack->max_x_start) { /* Skip all further siblings of 'nclist'. */ nclist = move_to_right_uncle(); continue; } if (is_hit(rgid, backpack)) { report_hit(rgid, backpack); if (backpack->select_mode == ARBITRARY_HIT && !backpack->pp_is_q) return; /* we're done! */ } n = find_landing_child(nclist, backpack); /* Skip first 'n' or all children of 'nclist'. */ nclist = n >= 0 ? move_to_child(nclist, n) : move_to_right_sibling_or_uncle(nclist); } return; } /**************************************************************************** * NCListAsINTSXP_get_y_overlaps() */ /* Recursive! */ static void NCListAsINTSXP_get_y_overlaps_rec(const int *x_nclist, const Backpack *backpack) { const int *rgid_p, *offset_p; int nchildren, n, rgid, offset; rgid_p = NCListAsINTSXP_RGIDS(x_nclist); nchildren = NCListAsINTSXP_NCHILDREN(x_nclist); n = int_bsearch(rgid_p, nchildren, backpack->x_end_p, backpack->min_x_end); for (rgid_p = rgid_p + n, offset_p = NCListAsINTSXP_OFFSETS(x_nclist) + n; n < nchildren; n++, rgid_p++, offset_p++) { rgid = *rgid_p; if (backpack->x_start_p[rgid] > backpack->max_x_start) break; if (is_hit(rgid, backpack)) { report_hit(rgid, backpack); if (backpack->select_mode == ARBITRARY_HIT && !backpack->pp_is_q) break; } offset = *offset_p; if (offset != -1) NCListAsINTSXP_get_y_overlaps_rec(x_nclist + offset, backpack); } return; } /**************************************************************************** * find_overlaps() */ static int find_overlaps( const int *q_start_p, const int *q_end_p, const int *q_space_p, const int *q_subset_p, int q_len, const int *s_start_p, const int *s_end_p, const int *s_space_p, const int *s_subset_p, int s_len, int maxgap, int minoverlap, int overlap_type, int select_mode, int circle_len, SEXP nclist_sxp, int pp_is_q, IntAE *qh_buf, IntAE *sh_buf, int *direct_out) { NCList nclist; const void *pp; GetYOverlapsFunType get_y_overlaps_fun; if (q_len == 0 || s_len == 0) return 0; if (nclist_sxp == R_NilValue) { /* On-the-fly preprocessing. */ pp_is_q = q_len < s_len; if (pp_is_q) build_NCList(&nclist, q_start_p, q_end_p, q_subset_p, q_len); else build_NCList(&nclist, s_start_p, s_end_p, s_subset_p, s_len); pp = &nclist; get_y_overlaps_fun = (GetYOverlapsFunType) NCList_get_y_overlaps; } else { pp = INTEGER(nclist_sxp); get_y_overlaps_fun = (GetYOverlapsFunType) NCListAsINTSXP_get_y_overlaps_rec; } pp_find_overlaps( q_start_p, q_end_p, q_space_p, q_subset_p, q_len, s_start_p, s_end_p, s_space_p, s_subset_p, s_len, maxgap, minoverlap, overlap_type, select_mode, circle_len, pp, pp_is_q, get_y_overlaps_fun, qh_buf, sh_buf, direct_out); if (nclist_sxp == R_NilValue) free_NCList(&nclist); return pp_is_q; } /**************************************************************************** * Helper functions shared by NCList_find_overlaps() and * NCList_find_overlaps_in_groups() */ static int get_overlap_type(SEXP type) { const char *type0; if (!IS_CHARACTER(type) || LENGTH(type) != 1) error("'type' must be a single string"); type = STRING_ELT(type, 0); if (type == NA_STRING) error("'type' cannot be NA"); type0 = CHAR(type); if (strcmp(type0, "any") == 0) return TYPE_ANY; if (strcmp(type0, "start") == 0) return TYPE_START; if (strcmp(type0, "end") == 0) return TYPE_END; if (strcmp(type0, "within") == 0) return TYPE_WITHIN; if (strcmp(type0, "extend") == 0) return TYPE_EXTEND; if (strcmp(type0, "equal") == 0) return TYPE_EQUAL; error("'type' must be \"any\", \"start\", \"end\", " "\"within\", \"extend\", or \"equal\""); return 0; } static int get_maxgap0(SEXP maxgap, int overlap_type) { int maxgap0; if (!IS_INTEGER(maxgap) || LENGTH(maxgap) != 1) error("'maxgap' must be a single integer"); maxgap0 = INTEGER(maxgap)[0]; if (maxgap0 == NA_INTEGER) error("'maxgap' cannot be NA"); if (maxgap0 < -1) error("'maxgap' must be >= -1"); if (maxgap0 == -1 && overlap_type != TYPE_ANY) maxgap0 = 0; return maxgap0; } static int get_minoverlap0(SEXP minoverlap, int maxgap, int overlap_type) { int minoverlap0; if (!IS_INTEGER(minoverlap) || LENGTH(minoverlap) != 1) error("'minoverlap' must be a single integer"); minoverlap0 = INTEGER(minoverlap)[0]; if (minoverlap0 == NA_INTEGER) error("'minoverlap' cannot be NA"); if (minoverlap0 < 0) error("'minoverlap' cannot be negative"); if (overlap_type == TYPE_ANY && maxgap != -1 && minoverlap0 != 0) error("when 'type' is \"any\", at least one of 'maxgap' " "and 'minoverlap' must be set to its default value"); return minoverlap0; } static int get_circle_length(SEXP circle_length) { int circle_len; if (!IS_INTEGER(circle_length) || LENGTH(circle_length) != 1) error("'circle_length' must be a single integer"); circle_len = INTEGER(circle_length)[0]; if (circle_len != NA_INTEGER && circle_len <= 0) error("'circle_length' must be a single " "positive integer or NA"); return circle_len; } static SEXP new_direct_out(int q_len, int select_mode) { SEXP ans; int init_val, i, *ans_elt; PROTECT(ans = NEW_INTEGER(q_len)); init_val = select_mode == COUNT_HITS ? 0 : NA_INTEGER; for (i = 0, ans_elt = INTEGER(ans); i < q_len; i++, ans_elt++) *ans_elt = init_val; UNPROTECT(1); return ans; } /**************************************************************************** * NCList_find_overlaps() * * --- .Call ENTRY POINT --- * Args: * q_start, q_end: Integer vectors of same length. * s_start, s_end: Integer vectors of same length. * nclist: An integer vector representing the Nested Containment * List for 'y'. * nclist_is_q: TRUE or FALSE. * maxgap: See get_maxgap0() C function. * minoverlap: See get_minoverlap0() C function. * type: See get_overlap_type() C function. * select: See _get_select_mode() C function in S4Vectors. * circle_length: A single positive integer or NA_INTEGER. */ SEXP NCList_find_overlaps( SEXP q_start, SEXP q_end, SEXP s_start, SEXP s_end, SEXP nclist, SEXP nclist_is_q, SEXP maxgap, SEXP minoverlap, SEXP type, SEXP select, SEXP circle_length) { int q_len, s_len, maxgap0, minoverlap0, overlap_type, select_mode, circle_len, *direct_out, pp_is_q; const int *q_start_p, *q_end_p, *s_start_p, *s_end_p; IntAE *qh_buf, *sh_buf; SEXP ans; q_len = check_integer_pairs(q_start, q_end, &q_start_p, &q_end_p, "start(q)", "end(q)"); s_len = check_integer_pairs(s_start, s_end, &s_start_p, &s_end_p, "start(s)", "end(s)"); overlap_type = get_overlap_type(type); maxgap0 = get_maxgap0(maxgap, overlap_type); minoverlap0 = get_minoverlap0(minoverlap, maxgap0, overlap_type); select_mode = get_select_mode(select); circle_len = get_circle_length(circle_length); qh_buf = new_IntAE(0, 0, 0); sh_buf = new_IntAE(0, 0, 0); direct_out = NULL; if (select_mode != ALL_HITS) { PROTECT(ans = new_direct_out(q_len, select_mode)); direct_out = INTEGER(ans); } //init_clock("find_overlaps: T2 = "); pp_is_q = find_overlaps( q_start_p, q_end_p, NULL, NULL, q_len, s_start_p, s_end_p, NULL, NULL, s_len, maxgap0, minoverlap0, overlap_type, select_mode, circle_len, nclist, LOGICAL(nclist_is_q)[0], qh_buf, sh_buf, direct_out); //print_elapsed_time(); if (select_mode != ALL_HITS) { UNPROTECT(1); return ans; } return new_Hits(qh_buf->elts, sh_buf->elts, IntAE_get_nelt(qh_buf), q_len, s_len, !pp_is_q); } /**************************************************************************** * NCList_find_overlaps_in_groups() * * --- .Call ENTRY POINT --- * Args: * q_start, q_end, q_space: Integer vectors of same length (or NULL for * 'q_space'). * q_groups: A CompressedIntegerList object of length NG1. Each list * element (integer vector) represents a group of 0-based * indices into 'q_start', 'q_end', and 'q_space'. * s_start, s_end, s_space: Integer vectors of same length (or NULL for * 's_space'). * s_groups: A CompressedIntegerList object of length NG2. Each list * element (integer vector) represents a group of 0-based * indices into 's_start', 's_end', and 's_space'. * nclists: A list of length >= min(NG1, NG2). Each list element must * be NULL or an integer vector representing a Nested * Containment List. * nclist_is_q: A logical vector parallel to 'nclists'. * maxgap: See get_maxgap0() C function. * minoverlap: See get_minoverlap0() C function. * type: See get_overlap_type() C function. * select: See _get_select_mode() C function in S4Vectors. * circle_length: An integer vector of length >= min(NG1, NG2) with positive * or NA values. */ SEXP NCList_find_overlaps_in_groups( SEXP q_start, SEXP q_end, SEXP q_space, SEXP q_groups, SEXP s_start, SEXP s_end, SEXP s_space, SEXP s_groups, SEXP nclists, SEXP nclist_is_q, SEXP maxgap, SEXP minoverlap, SEXP type, SEXP select, SEXP circle_length) { int q_len, s_len, NG1, NG2, maxgap0, minoverlap0, overlap_type, select_mode, NG, i, qi_len, si_len, *direct_out; const int *q_start_p, *q_end_p, *q_space_p, *s_start_p, *s_end_p, *s_space_p; CompressedIntsList_holder q_groups_holder, s_groups_holder; Ints_holder qi_group_holder, si_group_holder; IntAE *qh_buf, *sh_buf; SEXP ans; /* Check query. */ q_len = check_integer_pairs(q_start, q_end, &q_start_p, &q_end_p, "q_start", "q_end"); if (q_space == R_NilValue) { q_space_p = NULL; } else { if (LENGTH(q_space) != q_len) error("'q_space' must have the length of 'q_start'"); q_space_p = INTEGER(q_space); } q_groups_holder = _hold_CompressedIntegerList(q_groups); NG1 = _get_length_from_CompressedIntsList_holder(&q_groups_holder); /* Check subject. */ s_len = check_integer_pairs(s_start, s_end, &s_start_p, &s_end_p, "s_start", "s_end"); if (s_space == R_NilValue) { s_space_p = NULL; } else { if (LENGTH(s_space) != s_len) error("'s_space' must have the length of 's_start'"); s_space_p = INTEGER(s_space); } s_groups_holder = _hold_CompressedIntegerList(s_groups); NG2 = _get_length_from_CompressedIntsList_holder(&s_groups_holder); overlap_type = get_overlap_type(type); maxgap0 = get_maxgap0(maxgap, overlap_type); minoverlap0 = get_minoverlap0(minoverlap, maxgap0, overlap_type); select_mode = get_select_mode(select); qh_buf = new_IntAE(0, 0, 0); sh_buf = new_IntAE(0, 0, 0); direct_out = NULL; if (select_mode != ALL_HITS) { PROTECT(ans = new_direct_out(q_len, select_mode)); direct_out = INTEGER(ans); } NG = NG1 <= NG2 ? NG1 : NG2; for (i = 0; i < NG; i++) { qi_group_holder = _get_elt_from_CompressedIntsList_holder( &q_groups_holder, i); qi_len = qi_group_holder.length; si_group_holder = _get_elt_from_CompressedIntsList_holder( &s_groups_holder, i); si_len = si_group_holder.length; find_overlaps( q_start_p, q_end_p, q_space_p, qi_group_holder.ptr, qi_len, s_start_p, s_end_p, s_space_p, si_group_holder.ptr, si_len, maxgap0, minoverlap0, overlap_type, select_mode, INTEGER(circle_length)[i], VECTOR_ELT(nclists, i), LOGICAL(nclist_is_q)[i], qh_buf, sh_buf, direct_out); } if (select_mode != ALL_HITS) { UNPROTECT(1); return ans; } return new_Hits(qh_buf->elts, sh_buf->elts, IntAE_get_nelt(qh_buf), q_len, s_len, 0); } /**************************************************************************** Algorithm complexity ==================== X: length of object to preprocess Y: length of other object H: nb of hits (upper bound is X * Y) Time of preprocessing: T1 = a * X * log(X) Time of find_overlaps(..., select="all"): T2 = b * Y * log(X) + c * H Total time T is T1 + T2. ****************************************************************************/ IRanges/src/R_init_IRanges.c0000644000175400017540000001220413175724757016740 0ustar00biocbuildbiocbuild#include "IRanges.h" #define CALLMETHOD_DEF(fun, numArgs) {#fun, (DL_FUNC) &fun, numArgs} #define REGISTER_CCALLABLE(fun) \ R_RegisterCCallable("IRanges", #fun, (DL_FUNC) &fun) static const R_CallMethodDef callMethods[] = { /* Ranges_class.c */ CALLMETHOD_DEF(valid_Ranges, 3), /* Ranges_comparison.c */ CALLMETHOD_DEF(Ranges_pcompare, 4), /* IRanges_class.c */ CALLMETHOD_DEF(IRanges_isNormal, 1), CALLMETHOD_DEF(IRanges_from_integer, 1), CALLMETHOD_DEF(NormalIRanges_from_logical, 1), /* IRanges_constructor.c */ CALLMETHOD_DEF(solve_user_SEW0, 3), CALLMETHOD_DEF(solve_user_SEW, 6), /* Grouping_class.c */ CALLMETHOD_DEF(H2LGrouping_members, 2), CALLMETHOD_DEF(H2LGrouping_vmembers, 2), /* RleViews_utils.c */ CALLMETHOD_DEF(RleViews_viewMins, 2), CALLMETHOD_DEF(RleViews_viewMaxs, 2), CALLMETHOD_DEF(RleViews_viewSums, 2), CALLMETHOD_DEF(RleViews_viewMeans, 2), CALLMETHOD_DEF(RleViews_viewWhichMins, 2), CALLMETHOD_DEF(RleViews_viewWhichMaxs, 2), /* SimpleIRangesList_class.c */ CALLMETHOD_DEF(SimpleIRangesList_isNormal, 2), CALLMETHOD_DEF(SimpleNormalIRangesList_min, 1), CALLMETHOD_DEF(SimpleNormalIRangesList_max, 1), /* CompressedIRangesList_class.c */ CALLMETHOD_DEF(CompressedIRangesList_isNormal, 2), CALLMETHOD_DEF(CompressedIRangesList_summary, 1), CALLMETHOD_DEF(CompressedNormalIRangesList_min, 2), CALLMETHOD_DEF(CompressedNormalIRangesList_max, 2), /* inter_range_methods.c */ CALLMETHOD_DEF(IRanges_range, 1), CALLMETHOD_DEF(Ranges_reduce, 6), CALLMETHOD_DEF(CompressedIRangesList_reduce, 4), CALLMETHOD_DEF(IRanges_gaps, 4), CALLMETHOD_DEF(CompressedIRangesList_gaps, 3), CALLMETHOD_DEF(Ranges_disjointBins, 2), /* coverage_methods.c */ CALLMETHOD_DEF(IRanges_coverage, 6), CALLMETHOD_DEF(CompressedIRangesList_coverage, 6), /* NCList.c */ CALLMETHOD_DEF(NCList_new, 0), CALLMETHOD_DEF(NCList_free, 1), CALLMETHOD_DEF(NCList_build, 4), CALLMETHOD_DEF(new_NCListAsINTSXP_from_NCList, 1), CALLMETHOD_DEF(NCListAsINTSXP_print, 3), CALLMETHOD_DEF(NCList_find_overlaps, 11), CALLMETHOD_DEF(NCList_find_overlaps_in_groups, 15), /* CompressedAtomicList_utils.c */ CALLMETHOD_DEF(CompressedLogicalList_sum, 2), CALLMETHOD_DEF(CompressedIntegerList_sum, 2), CALLMETHOD_DEF(CompressedNumericList_sum, 2), CALLMETHOD_DEF(CompressedLogicalList_prod, 2), CALLMETHOD_DEF(CompressedIntegerList_prod, 2), CALLMETHOD_DEF(CompressedNumericList_prod, 2), CALLMETHOD_DEF(CompressedLogicalList_min, 2), CALLMETHOD_DEF(CompressedLogicalList_which_min, 1), CALLMETHOD_DEF(CompressedIntegerList_min, 2), CALLMETHOD_DEF(CompressedIntegerList_which_min, 1), CALLMETHOD_DEF(CompressedNumericList_min, 2), CALLMETHOD_DEF(CompressedNumericList_which_min, 1), CALLMETHOD_DEF(CompressedLogicalList_max, 2), CALLMETHOD_DEF(CompressedLogicalList_which_max, 1), CALLMETHOD_DEF(CompressedIntegerList_max, 2), CALLMETHOD_DEF(CompressedIntegerList_which_max, 1), CALLMETHOD_DEF(CompressedNumericList_max, 2), CALLMETHOD_DEF(CompressedNumericList_which_max, 1), CALLMETHOD_DEF(CompressedLogicalList_is_unsorted, 3), CALLMETHOD_DEF(CompressedIntegerList_is_unsorted, 3), CALLMETHOD_DEF(CompressedNumericList_is_unsorted, 3), {NULL, NULL, 0} }; void R_init_IRanges(DllInfo *info) { R_registerRoutines(info, NULL, callMethods, NULL, NULL); /* Ranges_comparison.c */ REGISTER_CCALLABLE(_overlap_code); REGISTER_CCALLABLE(_invert_overlap_code); /* IRanges_class.c */ REGISTER_CCALLABLE(_get_IRanges_start); REGISTER_CCALLABLE(_get_IRanges_width); REGISTER_CCALLABLE(_get_IRanges_names); REGISTER_CCALLABLE(_get_IRanges_length); REGISTER_CCALLABLE(_hold_IRanges); REGISTER_CCALLABLE(_get_length_from_IRanges_holder); REGISTER_CCALLABLE(_get_width_elt_from_IRanges_holder); REGISTER_CCALLABLE(_get_start_elt_from_IRanges_holder); REGISTER_CCALLABLE(_get_end_elt_from_IRanges_holder); REGISTER_CCALLABLE(_get_names_elt_from_IRanges_holder); REGISTER_CCALLABLE(_get_linear_subset_from_IRanges_holder); REGISTER_CCALLABLE(_set_IRanges_names); REGISTER_CCALLABLE(_copy_IRanges_slots); REGISTER_CCALLABLE(_new_IRanges); REGISTER_CCALLABLE(_new_IRanges_from_IntPairAE); REGISTER_CCALLABLE(_new_list_of_IRanges_from_IntPairAEAE); REGISTER_CCALLABLE(_alloc_IRanges); /* Grouping_class.c */ REGISTER_CCALLABLE(_get_H2LGrouping_high2low); REGISTER_CCALLABLE(_get_H2LGrouping_low2high); REGISTER_CCALLABLE(_get_Partitioning_names); REGISTER_CCALLABLE(_get_PartitioningByEnd_end); REGISTER_CCALLABLE(_new_PartitioningByEnd); /* CompressedList_class.c */ REGISTER_CCALLABLE(_get_CompressedList_unlistData); REGISTER_CCALLABLE(_get_CompressedList_partitioning); REGISTER_CCALLABLE(_get_CompressedList_length); REGISTER_CCALLABLE(_get_CompressedList_names); REGISTER_CCALLABLE(_new_CompressedList); REGISTER_CCALLABLE(_hold_CompressedIntegerList); REGISTER_CCALLABLE(_get_length_from_CompressedIntsList_holder); REGISTER_CCALLABLE(_get_elt_from_CompressedIntsList_holder); /* CompressedIRangesList_class.c */ REGISTER_CCALLABLE(_hold_CompressedIRangesList); REGISTER_CCALLABLE(_get_length_from_CompressedIRangesList_holder); REGISTER_CCALLABLE(_get_elt_from_CompressedIRangesList_holder); REGISTER_CCALLABLE(_get_eltNROWS_from_CompressedIRangesList_holder); return; } IRanges/src/Ranges_class.c0000644000175400017540000000452213175724757016514 0ustar00biocbuildbiocbuild/**************************************************************************** * Low-level manipulation of Ranges objects * * Author: H. Pag\`es * ****************************************************************************/ #include "IRanges.h" #include "S4Vectors_interface.h" /* * --- .Call ENTRY POINT --- * Doesn't raise an error but returns NULL or a single string describing the * first encountered validity failure. */ SEXP valid_Ranges(SEXP x_start, SEXP x_end, SEXP x_width) { static char validity_failures[200]; int x_len, i, tmp; const int *x_start_p, *x_end_p, *x_width_p; if (!IS_INTEGER(x_start) || !IS_INTEGER(x_end) || !IS_INTEGER(x_width)) { snprintf(validity_failures, sizeof(validity_failures), "'%s', '%s', and '%s' must be integer vectors", "start(x)", "end(x)", "width(x)"); goto failure; } x_len = LENGTH(x_start); if (LENGTH(x_end) != x_len || LENGTH(x_width) != x_len) { snprintf(validity_failures, sizeof(validity_failures), "'%s', '%s', and '%s' must have the same length", "start(x)", "end(x)", "width(x)"); goto failure; } for (i = 0, x_start_p = INTEGER(x_start), x_end_p = INTEGER(x_end), x_width_p = INTEGER(x_width); i < x_len; i++, x_start_p++, x_end_p++, x_width_p++) { if (*x_start_p == NA_INTEGER || *x_end_p == NA_INTEGER || *x_width_p == NA_INTEGER) { snprintf(validity_failures, sizeof(validity_failures), "'%s', '%s', and '%s' cannot contain NAs", "start(x)", "end(x)", "width(x)"); goto failure; } if (*x_width_p < 0) { snprintf(validity_failures, sizeof(validity_failures), "'%s' cannot contain negative integers", "width(x)"); goto failure; } /* Safe because NA_INTEGER == INT_MIN (see R_ext/Arith.h) */ tmp = *x_start_p - 1; /* The purpose of the 1st part of the test (the part before ||) is to avoid an integer overflow during the 2nd part of the test (the part after ||). */ if (tmp > INT_MAX - *x_width_p || tmp + *x_width_p != *x_end_p) { snprintf(validity_failures, sizeof(validity_failures), "'%s[i] - %s[i] != %s[i] + 1' for i = %d", "end(x)", "start(x)", "width(x)", i + 1); goto failure; } } return R_NilValue; failure: return mkString(validity_failures); } IRanges/src/Ranges_comparison.c0000644000175400017540000001552113175724757017562 0ustar00biocbuildbiocbuild/**************************************************************************** * Range-wise comparison of 2 Ranges objects * * Author: H. Pag\`es * ****************************************************************************/ #include "IRanges.h" #include "S4Vectors_interface.h" /**************************************************************************** * Generalized comparison of 2 integer ranges. * * There are 13 different ways 2 integer ranges x and y can be positioned * with respect to each other. They are summarized in the following table * together with the codes we assign them: * * numeric code & | numeric code & * 1-letter code & | 1-letter code & * long code | long code * --------------- --------------- | --------------- --------------- * x: .oooo....... -6 'a' "x y" | x: .......oooo. 6 'm' "y x" * y: .......oooo. | y: .oooo....... * --------------- --------------- | --------------- --------------- * x: ..oooo...... -5 'b' "xy" | x: ......oooo.. 5 'l' "yx" * y: ......oooo.. | y: ..oooo...... * --------------- --------------- | --------------- --------------- * x: ...oooo..... -4 'c' "x=y" | x: .....oooo... 4 'k' "y=x" * y: .....oooo... | y: ...oooo..... * --------------- --------------- | --------------- --------------- * x: ...oooooo... -3 'd' "x=" | x: .....oooo... 3 'j' "y=" * y: .....oooo... | y: ...oooooo... * --------------- --------------- | --------------- --------------- * x: ..oooooooo.. -2 'e' "x=x" | x: ....oooo.... 2 'i' "y=y" * y: ....oooo.... | y: ..oooooooo.. * --------------- --------------- | --------------- --------------- * x: ...oooo..... -1 'f' "=y" | x: ...oooooo... 1 'h' "=x" * y: ...oooooo... | y: ...oooo..... * --------------- --------------------------------- --------------- * \ x: ...oooooo... 0 'g' "=" / * \ y: ...oooooo... / * \---------------------------------/ * Notes: * o This way of comparing ranges is a refinement over the standard ranges * comparison defined by the ==, !=, <=, >=, < and > operators. In * particular a numeric code that is < 0, = 0, or > 0 corresponds to * x < y, x == y, or x > y, respectively. * o In this file we use the term "overlap" in a loose way even when there * is actually no overlap between ranges x and y. Real overlaps correspond * to numeric codes >= -4 and <= 4, and to long codes that contain an * equal ("="). * o Long codes are designed to be user-friendly whereas numeric and * 1-letter codes are designed to be more compact and memory efficient. * Typically the formers will be exposed to the end-user and translated * internally into the latters. * o Swapping x and y changes the sign of the corresponding numeric code and * substitutes "x" by "y" and "y" by "x" in the corresponding long code. * o Reflecting ranges x and y relative to an arbitrary position (i.e. doing * a symetry with respect to a vertical axis) has the effect of reversing * the associated long code e.g. "x=y" becomes "y=x". The effect on the * numeric code is implemented by the _invert_overlap_code() function. * * 'x_start', 'x_width', 'y_start' and 'y_width' are assumed to be non NA (not * checked). 'x_start' and 'y_start' must be 1-based. 'x_width' and 'y_width' * are assumed to be >= 0 (not checked). */ int _overlap_code(int x_start, int x_width, int y_start, int y_width) { int x_end_plus1, y_end_plus1; x_end_plus1 = x_start + x_width; if (x_end_plus1 < y_start) return -6; if (x_end_plus1 == y_start) { if (x_width == 0 && y_width == 0) return 0; return -5; } y_end_plus1 = y_start + y_width; if (y_end_plus1 < x_start) return 6; if (y_end_plus1 == x_start) return 5; if (x_start < y_start) { if (x_end_plus1 < y_end_plus1) return -4; if (x_end_plus1 == y_end_plus1) return -3; return -2; } if (x_start == y_start) { if (x_end_plus1 < y_end_plus1) return -1; if (x_end_plus1 == y_end_plus1) return 0; return 1; } if (x_end_plus1 < y_end_plus1) return 2; if (x_end_plus1 == y_end_plus1) return 3; return 4; } int _invert_overlap_code(int code) { if (code == -2 || code == 0 || code == 2) return code; if (code <= -4 || code >= 4) return - code; /* Only possible values left: -3, -1, 1, 3 */ return code < 0 ? code + 4 : code - 4; } /* Vectorized comparison of 2 vectors of ranges. */ static void pcompare_ranges( const int *x_start, const int *x_width, int x_len, const int *y_start, const int *y_width, int y_len, int *out, int out_len, int with_warning) { int i, j, k; for (i = j = k = 0; k < out_len; i++, j++, k++) { if (i >= x_len) i = 0; /* recycle i */ if (j >= y_len) j = 0; /* recycle j */ out[k] = _overlap_code(x_start[i], x_width[i], y_start[j], y_width[j]); } /* This warning message is meaningful only when 'out_len' is 'max(x_len, y_len)' and is consistent with the warning we get from binary arithmetic/comparison operations on numeric vectors. */ if (with_warning && out_len != 0 && (i != x_len || j != y_len)) warning("longer object length is not a multiple " "of shorter object length"); return; } /* --- .Call ENTRY POINT --- * 'x_start' and 'x_width': integer vectors of the same length M. * 'y_start' and 'y_width': integer vectors of the same length N. * The 4 integer vectors are assumed to be NA free and 'x_width' and * 'y_width' are assumed to contain non-negative values. For efficiency * reasons, those assumptions are not checked. * If M != N then the shorter object is recycled to the length of the longer * object, except if M or N is 0 in which case the object with length != 0 is * truncated to length 0. */ SEXP Ranges_pcompare(SEXP x_start, SEXP x_width, SEXP y_start, SEXP y_width) { int x_len, y_len, ans_len; const int *x_start_p, *x_width_p, *y_start_p, *y_width_p; SEXP ans; x_len = check_integer_pairs(x_start, x_width, &x_start_p, &x_width_p, "start(x)", "width(x)"); y_len = check_integer_pairs(y_start, y_width, &y_start_p, &y_width_p, "start(y)", "width(y)"); if (x_len == 0 || y_len == 0) ans_len = 0; else ans_len = x_len >= y_len ? x_len : y_len; PROTECT(ans = NEW_INTEGER(ans_len)); pcompare_ranges(x_start_p, x_width_p, x_len, y_start_p, y_width_p, y_len, INTEGER(ans), ans_len, 1); UNPROTECT(1); return ans; } IRanges/src/RleViews_utils.c0000644000175400017540000005171013175724757017071 0ustar00biocbuildbiocbuild#include "IRanges.h" #include #include #include #define R_INT_MIN (1+INT_MIN) /* * --- .Call ENTRY POINT --- */ SEXP RleViews_viewMins(SEXP x, SEXP na_rm) { char type = '?'; int i, start, width, ans_len, index, lower_run, upper_run, upper_bound; int max_index, *lengths_elt; SEXP ans, subject, values, lengths, ranges, names; IRanges_holder ranges_holder; subject = GET_SLOT(x, install("subject")); values = GET_SLOT(subject, install("values")); lengths = GET_SLOT(subject, install("lengths")); ranges = GET_SLOT(x, install("ranges")); ranges_holder = _hold_IRanges(ranges); ans_len = _get_length_from_IRanges_holder(&ranges_holder); ans = R_NilValue; switch (TYPEOF(values)) { case LGLSXP: case INTSXP: type = 'i'; PROTECT(ans = NEW_INTEGER(ans_len)); break; case REALSXP: type = 'r'; PROTECT(ans = NEW_NUMERIC(ans_len)); break; default: error("Rle must contain either 'integer' or 'numeric' values"); } if (!IS_LOGICAL(na_rm) || LENGTH(na_rm) != 1 || LOGICAL(na_rm)[0] == NA_LOGICAL) error("'na.rm' must be TRUE or FALSE"); lengths_elt = INTEGER(lengths); max_index = LENGTH(lengths) - 1; index = 0; upper_run = *lengths_elt; for (i = 0; i < ans_len; i++) { if (i % 100000 == 99999) R_CheckUserInterrupt(); start = _get_start_elt_from_IRanges_holder(&ranges_holder, i); width = _get_width_elt_from_IRanges_holder(&ranges_holder, i); if (type == 'i') { INTEGER(ans)[i] = INT_MAX; } else if (type == 'r') { REAL(ans)[i] = R_PosInf; } if (width > 0) { while (index > 0 && upper_run > start) { upper_run -= *lengths_elt; lengths_elt--; index--; } while (upper_run < start) { lengths_elt++; index++; upper_run += *lengths_elt; } lower_run = upper_run - *lengths_elt + 1; upper_bound = start + width - 1; if (type == 'i') { while (lower_run <= upper_bound) { if (INTEGER(values)[index] == NA_INTEGER) { if (!LOGICAL(na_rm)[0]) { INTEGER(ans)[i] = NA_INTEGER; break; } } else if (INTEGER(values)[index] < INTEGER(ans)[i]) { INTEGER(ans)[i] = INTEGER(values)[index]; } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; upper_run += *lengths_elt; } else { break; } } } else if (type == 'r') { while (lower_run <= upper_bound) { if (ISNAN(REAL(values)[index])) { if (!LOGICAL(na_rm)[0]) { REAL(ans)[i] = NA_REAL; break; } } else if (REAL(values)[index] < REAL(ans)[i]) { REAL(ans)[i] = REAL(values)[index]; } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; upper_run += *lengths_elt; } else { break; } } } } } PROTECT(names = duplicate(_get_IRanges_names(ranges))); SET_NAMES(ans, names); UNPROTECT(2); return ans; } /* * --- .Call ENTRY POINT --- */ SEXP RleViews_viewMaxs(SEXP x, SEXP na_rm) { char type = '?'; int i, start, width, ans_len, index, lower_run, upper_run, upper_bound; int max_index, *lengths_elt; SEXP ans, subject, values, lengths, ranges, names; IRanges_holder ranges_holder; subject = GET_SLOT(x, install("subject")); values = GET_SLOT(subject, install("values")); lengths = GET_SLOT(subject, install("lengths")); ranges = GET_SLOT(x, install("ranges")); ranges_holder = _hold_IRanges(ranges); ans_len = _get_length_from_IRanges_holder(&ranges_holder); ans = R_NilValue; switch (TYPEOF(values)) { case LGLSXP: case INTSXP: type = 'i'; PROTECT(ans = NEW_INTEGER(ans_len)); break; case REALSXP: type = 'r'; PROTECT(ans = NEW_NUMERIC(ans_len)); break; default: error("Rle must contain either 'integer' or 'numeric' values"); } if (!IS_LOGICAL(na_rm) || LENGTH(na_rm) != 1 || LOGICAL(na_rm)[0] == NA_LOGICAL) error("'na.rm' must be TRUE or FALSE"); lengths_elt = INTEGER(lengths); max_index = LENGTH(lengths) - 1; index = 0; upper_run = *lengths_elt; for (i = 0; i < ans_len; i++) { if (i % 100000 == 99999) R_CheckUserInterrupt(); start = _get_start_elt_from_IRanges_holder(&ranges_holder, i); width = _get_width_elt_from_IRanges_holder(&ranges_holder, i); if (type == 'i') { INTEGER(ans)[i] = R_INT_MIN; } else if (type == 'r') { REAL(ans)[i] = R_NegInf; } if (width > 0) { while (index > 0 && upper_run > start) { upper_run -= *lengths_elt; lengths_elt--; index--; } while (upper_run < start) { lengths_elt++; index++; upper_run += *lengths_elt; } lower_run = upper_run - *lengths_elt + 1; upper_bound = start + width - 1; if (type == 'i') { while (lower_run <= upper_bound) { if (INTEGER(values)[index] == NA_INTEGER) { if (!LOGICAL(na_rm)[0]) { INTEGER(ans)[i] = NA_INTEGER; break; } } else if (INTEGER(values)[index] > INTEGER(ans)[i]) { INTEGER(ans)[i] = INTEGER(values)[index]; } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; upper_run += *lengths_elt; } else { break; } } } else if (type == 'r') { while (lower_run <= upper_bound) { if (ISNAN(REAL(values)[index])) { if (!LOGICAL(na_rm)[0]) { REAL(ans)[i] = NA_REAL; break; } } else if (REAL(values)[index] > REAL(ans)[i]) { REAL(ans)[i] = REAL(values)[index]; } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; upper_run += *lengths_elt; } else { break; } } } } } PROTECT(names = duplicate(_get_IRanges_names(ranges))); SET_NAMES(ans, names); UNPROTECT(2); return ans; } /* * --- .Call ENTRY POINT --- */ SEXP RleViews_viewSums(SEXP x, SEXP na_rm) { char type = '?'; int i, start, width, ans_len, index, lower_run, upper_run, lower_bound, upper_bound; int max_index, *lengths_elt; SEXP ans, subject, values, lengths, ranges, names; IRanges_holder ranges_holder; subject = GET_SLOT(x, install("subject")); values = GET_SLOT(subject, install("values")); lengths = GET_SLOT(subject, install("lengths")); ranges = GET_SLOT(x, install("ranges")); ranges_holder = _hold_IRanges(ranges); ans_len = _get_length_from_IRanges_holder(&ranges_holder); ans = R_NilValue; switch (TYPEOF(values)) { case LGLSXP: case INTSXP: type = 'i'; PROTECT(ans = NEW_INTEGER(ans_len)); break; case REALSXP: type = 'r'; PROTECT(ans = NEW_NUMERIC(ans_len)); break; case CPLXSXP: type = 'c'; PROTECT(ans = NEW_COMPLEX(ans_len)); break; default: error("Rle must contain either 'integer', 'numeric', or 'complex' values"); } if (!IS_LOGICAL(na_rm) || LENGTH(na_rm) != 1 || LOGICAL(na_rm)[0] == NA_LOGICAL) error("'na.rm' must be TRUE or FALSE"); lengths_elt = INTEGER(lengths); max_index = LENGTH(lengths) - 1; index = 0; upper_run = *lengths_elt; for (i = 0; i < ans_len; i++) { if (i % 100000 == 99999) R_CheckUserInterrupt(); start = _get_start_elt_from_IRanges_holder(&ranges_holder, i); width = _get_width_elt_from_IRanges_holder(&ranges_holder, i); if (type == 'i') { INTEGER(ans)[i] = 0; } else if (type == 'r') { REAL(ans)[i] = 0; } else if (type == 'c') { COMPLEX(ans)[i].r = 0; COMPLEX(ans)[i].i = 0; } if (width > 0) { while (index > 0 && upper_run > start) { upper_run -= *lengths_elt; lengths_elt--; index--; } while (upper_run < start) { lengths_elt++; index++; upper_run += *lengths_elt; } lower_run = upper_run - *lengths_elt + 1; lower_bound = start; upper_bound = start + width - 1; if (type == 'i') { while (lower_run <= upper_bound) { if (INTEGER(values)[index] == NA_INTEGER) { if (!LOGICAL(na_rm)[0]) { INTEGER(ans)[i] = NA_INTEGER; break; } } else { INTEGER(ans)[i] += INTEGER(values)[index] * (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; lower_bound = lower_run; upper_run += *lengths_elt; } else { break; } } if (INTEGER(ans)[i] != NA_INTEGER && (INTEGER(ans)[i] > INT_MAX || INTEGER(ans)[i] < R_INT_MIN)) error("Integer overflow"); } else if (type == 'r') { while (lower_run <= upper_bound) { if (ISNAN(REAL(values)[index])) { if (!LOGICAL(na_rm)[0]) { REAL(ans)[i] = NA_REAL; break; } } else { REAL(ans)[i] += REAL(values)[index] * (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; lower_bound = lower_run; upper_run += *lengths_elt; } else { break; } } } else if (type == 'c') { while (lower_run <= upper_bound) { if (ISNAN(COMPLEX(values)[index].r) || ISNAN(COMPLEX(values)[index].i)) { if (!LOGICAL(na_rm)[0]) { COMPLEX(ans)[i].r = NA_REAL; COMPLEX(ans)[i].i = NA_REAL; break; } } else { COMPLEX(ans)[i].r += COMPLEX(values)[index].r * (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); COMPLEX(ans)[i].i += COMPLEX(values)[index].i * (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; lower_bound = lower_run; upper_run += *lengths_elt; } else { break; } } } } } PROTECT(names = duplicate(_get_IRanges_names(ranges))); SET_NAMES(ans, names); UNPROTECT(2); return ans; } /* * --- .Call ENTRY POINT --- */ SEXP RleViews_viewMeans(SEXP x, SEXP na_rm) { char type = '?'; int i, n, start, width, ans_len, index, lower_run, upper_run, lower_bound, upper_bound; int max_index, *lengths_elt; SEXP ans, subject, values, lengths, ranges, names; IRanges_holder ranges_holder; subject = GET_SLOT(x, install("subject")); values = GET_SLOT(subject, install("values")); lengths = GET_SLOT(subject, install("lengths")); ranges = GET_SLOT(x, install("ranges")); ranges_holder = _hold_IRanges(ranges); ans_len = _get_length_from_IRanges_holder(&ranges_holder); ans = R_NilValue; switch (TYPEOF(values)) { case LGLSXP: case INTSXP: type = 'i'; PROTECT(ans = NEW_NUMERIC(ans_len)); break; case REALSXP: type = 'r'; PROTECT(ans = NEW_NUMERIC(ans_len)); break; case CPLXSXP: type = 'c'; PROTECT(ans = NEW_COMPLEX(ans_len)); break; default: error("Rle must contain either 'integer', 'numeric', or 'complex' values"); } if (!IS_LOGICAL(na_rm) || LENGTH(na_rm) != 1 || LOGICAL(na_rm)[0] == NA_LOGICAL) error("'na.rm' must be TRUE or FALSE"); lengths_elt = INTEGER(lengths); max_index = LENGTH(lengths) - 1; index = 0; upper_run = *lengths_elt; for (i = 0; i < ans_len; i++) { if (i % 100000 == 99999) R_CheckUserInterrupt(); start = _get_start_elt_from_IRanges_holder(&ranges_holder, i); width = _get_width_elt_from_IRanges_holder(&ranges_holder, i); if (width <= 0) { if (type == 'i') { REAL(ans)[i] = R_NaN; } else if (type == 'r') { REAL(ans)[i] = R_NaN; } else if (type == 'c') { COMPLEX(ans)[i].r = R_NaN; COMPLEX(ans)[i].i = R_NaN; } } else { n = width; if (type == 'i') { REAL(ans)[i] = 0; } else if (type == 'r') { REAL(ans)[i] = 0; } else if (type == 'c') { COMPLEX(ans)[i].r = 0; COMPLEX(ans)[i].i = 0; } while (index > 0 && upper_run > start) { upper_run -= *lengths_elt; lengths_elt--; index--; } while (upper_run < start) { lengths_elt++; index++; upper_run += *lengths_elt; } lower_run = upper_run - *lengths_elt + 1; lower_bound = start; upper_bound = start + width - 1; if (type == 'i') { while (lower_run <= upper_bound) { if (INTEGER(values)[index] == NA_INTEGER) { if (!LOGICAL(na_rm)[0]) { REAL(ans)[i] = NA_REAL; break; } n -= (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); } else { REAL(ans)[i] += INTEGER(values)[index] * (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; lower_bound = lower_run; upper_run += *lengths_elt; } else { break; } } if (n == 0) { REAL(ans)[i] = R_NaN; } else if (REAL(ans)[i] != NA_REAL) { REAL(ans)[i] /= n; } } else if (type == 'r') { while (lower_run <= upper_bound) { if (ISNAN(REAL(values)[index])) { if (!LOGICAL(na_rm)[0]) { REAL(ans)[i] = NA_REAL; break; } n -= (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); } else { REAL(ans)[i] += REAL(values)[index] * (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; lower_bound = lower_run; upper_run += *lengths_elt; } else { break; } } if (n == 0) { REAL(ans)[i] = R_NaN; } else if (REAL(ans)[i] != NA_REAL) { REAL(ans)[i] /= n; } } else if (type == 'c') { while (lower_run <= upper_bound) { if (ISNAN(COMPLEX(values)[index].r) || ISNAN(COMPLEX(values)[index].i)) { if (!LOGICAL(na_rm)[0]) { COMPLEX(ans)[i].r = NA_REAL; COMPLEX(ans)[i].i = NA_REAL; break; } n -= (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); } else { COMPLEX(ans)[i].r += COMPLEX(values)[index].r * (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); COMPLEX(ans)[i].i += COMPLEX(values)[index].i * (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; lower_bound = lower_run; upper_run += *lengths_elt; } else { break; } } if (n == 0) { COMPLEX(ans)[i].r = R_NaN; COMPLEX(ans)[i].i = R_NaN; } else if (COMPLEX(ans)[i].r != NA_REAL) { COMPLEX(ans)[i].r /= n; COMPLEX(ans)[i].i /= n; } } } } PROTECT(names = duplicate(_get_IRanges_names(ranges))); SET_NAMES(ans, names); UNPROTECT(2); return ans; } /* * --- .Call ENTRY POINT --- */ SEXP RleViews_viewWhichMins(SEXP x, SEXP na_rm) { char type = '?'; int i, start, width, ans_len, index, lower_run, upper_run, lower_bound, upper_bound; int max_index, *ans_elt, *lengths_elt; SEXP curr, ans, subject, values, lengths, ranges, names; IRanges_holder ranges_holder; subject = GET_SLOT(x, install("subject")); values = GET_SLOT(subject, install("values")); lengths = GET_SLOT(subject, install("lengths")); ranges = GET_SLOT(x, install("ranges")); ranges_holder = _hold_IRanges(ranges); ans_len = _get_length_from_IRanges_holder(&ranges_holder); curr = R_NilValue; switch (TYPEOF(values)) { case LGLSXP: case INTSXP: type = 'i'; PROTECT(curr = NEW_INTEGER(1)); break; case REALSXP: type = 'r'; PROTECT(curr = NEW_NUMERIC(1)); break; default: error("Rle must contain either 'integer' or 'numeric' values"); } if (!IS_LOGICAL(na_rm) || LENGTH(na_rm) != 1 || LOGICAL(na_rm)[0] == NA_LOGICAL) error("'na.rm' must be TRUE or FALSE"); PROTECT(ans = NEW_INTEGER(ans_len)); lengths_elt = INTEGER(lengths); max_index = LENGTH(lengths) - 1; index = 0; upper_run = *lengths_elt; for (i = 0, ans_elt = INTEGER(ans); i < ans_len; i++, ans_elt++) { if (i % 100000 == 99999) R_CheckUserInterrupt(); start = _get_start_elt_from_IRanges_holder(&ranges_holder, i); width = _get_width_elt_from_IRanges_holder(&ranges_holder, i); *ans_elt = NA_INTEGER; if (width > 0) { if (type == 'i') { INTEGER(curr)[0] = INT_MAX; } else if (type == 'r') { REAL(curr)[0] = R_PosInf; } while (index > 0 && upper_run > start) { upper_run -= *lengths_elt; lengths_elt--; index--; } while (upper_run < start) { lengths_elt++; index++; upper_run += *lengths_elt; } lower_run = upper_run - *lengths_elt + 1; lower_bound = start; upper_bound = start + width - 1; if (type == 'i') { while (lower_run <= upper_bound) { if (INTEGER(values)[index] == NA_INTEGER) { if (!LOGICAL(na_rm)[0]) { break; } } else if (INTEGER(values)[index] < INTEGER(curr)[0]) { *ans_elt = lower_bound; INTEGER(curr)[0] = INTEGER(values)[index]; } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; lower_bound = lower_run; upper_run += *lengths_elt; } else { break; } } } else if (type == 'r') { while (lower_run <= upper_bound) { if (ISNAN(REAL(values)[index])) { if (!LOGICAL(na_rm)[0]) { break; } } else if (REAL(values)[index] < REAL(curr)[0]) { *ans_elt = lower_bound; REAL(curr)[0] = REAL(values)[index]; } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; lower_bound = lower_run; upper_run += *lengths_elt; } else { break; } } } } } PROTECT(names = duplicate(_get_IRanges_names(ranges))); SET_NAMES(ans, names); UNPROTECT(3); return ans; } /* * --- .Call ENTRY POINT --- */ SEXP RleViews_viewWhichMaxs(SEXP x, SEXP na_rm) { char type = '?'; int i, start, width, ans_len, index, lower_run, upper_run, lower_bound, upper_bound; int max_index, *ans_elt, *lengths_elt; SEXP curr, ans, subject, values, lengths, ranges, names; IRanges_holder ranges_holder; subject = GET_SLOT(x, install("subject")); values = GET_SLOT(subject, install("values")); lengths = GET_SLOT(subject, install("lengths")); ranges = GET_SLOT(x, install("ranges")); ranges_holder = _hold_IRanges(ranges); ans_len = _get_length_from_IRanges_holder(&ranges_holder); curr = R_NilValue; switch (TYPEOF(values)) { case LGLSXP: case INTSXP: type = 'i'; PROTECT(curr = NEW_INTEGER(1)); break; case REALSXP: type = 'r'; PROTECT(curr = NEW_NUMERIC(1)); break; default: error("Rle must contain either 'integer' or 'numeric' values"); } if (!IS_LOGICAL(na_rm) || LENGTH(na_rm) != 1 || LOGICAL(na_rm)[0] == NA_LOGICAL) error("'na.rm' must be TRUE or FALSE"); PROTECT(ans = NEW_INTEGER(ans_len)); lengths_elt = INTEGER(lengths); max_index = LENGTH(lengths) - 1; index = 0; upper_run = *lengths_elt; for (i = 0, ans_elt = INTEGER(ans); i < ans_len; i++, ans_elt++) { if (i % 100000 == 99999) R_CheckUserInterrupt(); start = _get_start_elt_from_IRanges_holder(&ranges_holder, i); width = _get_width_elt_from_IRanges_holder(&ranges_holder, i); *ans_elt = NA_INTEGER; if (width > 0) { if (type == 'i') { INTEGER(curr)[0] = R_INT_MIN; } else if (type == 'r') { REAL(curr)[0] = R_NegInf; } while (index > 0 && upper_run > start) { upper_run -= *lengths_elt; lengths_elt--; index--; } while (upper_run < start) { lengths_elt++; index++; upper_run += *lengths_elt; } lower_run = upper_run - *lengths_elt + 1; lower_bound = start; upper_bound = start + width - 1; if (type == 'i') { while (lower_run <= upper_bound) { if (INTEGER(values)[index] == NA_INTEGER) { if (!LOGICAL(na_rm)[0]) { break; } } else if (INTEGER(values)[index] > INTEGER(curr)[0]) { *ans_elt = lower_bound; INTEGER(curr)[0] = INTEGER(values)[index]; } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; lower_bound = lower_run; upper_run += *lengths_elt; } else { break; } } } else if (type == 'r') { while (lower_run <= upper_bound) { if (ISNAN(REAL(values)[index])) { if (!LOGICAL(na_rm)[0]) { break; } } else if (REAL(values)[index] > REAL(curr)[0]) { *ans_elt = lower_bound; REAL(curr)[0] = REAL(values)[index]; } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; lower_bound = lower_run; upper_run += *lengths_elt; } else { break; } } } } } PROTECT(names = duplicate(_get_IRanges_names(ranges))); SET_NAMES(ans, names); UNPROTECT(3); return ans; } IRanges/src/S4Vectors_stubs.c0000644000175400017540000000003613175724757017160 0ustar00biocbuildbiocbuild#include "_S4Vectors_stubs.c" IRanges/src/SimpleRangesList_class.c0000644000175400017540000000436613175724757020530 0ustar00biocbuildbiocbuild/**************************************************************************** * Low-level manipulation of SimpleRangesList objects * ****************************************************************************/ #include "IRanges.h" #include #define R_INT_MIN (1+INT_MIN) /* * --- .Call ENTRY POINT --- */ SEXP SimpleIRangesList_isNormal(SEXP x, SEXP use_names) { SEXP list_ir, ans, ans_names; IRanges_holder ir_holder; int x_len, i; list_ir = GET_SLOT(x, install("listData")); x_len = LENGTH(list_ir); PROTECT(ans = NEW_LOGICAL(x_len)); for (i = 0; i < x_len; i++) { ir_holder = _hold_IRanges(VECTOR_ELT(list_ir, i)); LOGICAL(ans)[i] = _is_normal_IRanges_holder(&ir_holder); } if (LOGICAL(use_names)[0]) { PROTECT(ans_names = duplicate(GET_NAMES(list_ir))); SET_NAMES(ans, ans_names); UNPROTECT(1); } UNPROTECT(1); return ans; } /* * --- .Call ENTRY POINT --- */ SEXP SimpleNormalIRangesList_min(SEXP x) { SEXP list_ir, ans, ans_names; IRanges_holder ir_holder; int x_len, ir_len, i; int *ans_elt; list_ir = GET_SLOT(x, install("listData")); x_len = LENGTH(list_ir); PROTECT(ans = NEW_INTEGER(x_len)); for (i = 0, ans_elt = INTEGER(ans); i < x_len; i++, ans_elt++) { ir_holder = _hold_IRanges(VECTOR_ELT(list_ir, i)); ir_len = _get_length_from_IRanges_holder(&ir_holder); if (ir_len == 0) { *ans_elt = INT_MAX; } else { *ans_elt = _get_start_elt_from_IRanges_holder(&ir_holder, 0); } } PROTECT(ans_names = duplicate(GET_NAMES(list_ir))); SET_NAMES(ans, ans_names); UNPROTECT(2); return ans; } /* * --- .Call ENTRY POINT --- */ SEXP SimpleNormalIRangesList_max(SEXP x) { SEXP list_ir, ans, ans_names; IRanges_holder ir_holder; int x_len, ir_len, i; int *ans_elt; list_ir = GET_SLOT(x, install("listData")); x_len = LENGTH(list_ir); PROTECT(ans = NEW_INTEGER(x_len)); for (i = 0, ans_elt = INTEGER(ans); i < x_len; i++, ans_elt++) { ir_holder = _hold_IRanges(VECTOR_ELT(list_ir, i)); ir_len = _get_length_from_IRanges_holder(&ir_holder); if (ir_len == 0) { *ans_elt = R_INT_MIN; } else { *ans_elt = _get_end_elt_from_IRanges_holder(&ir_holder, ir_len - 1); } } PROTECT(ans_names = duplicate(GET_NAMES(list_ir))); SET_NAMES(ans, ans_names); UNPROTECT(2); return ans; } IRanges/src/coverage_methods.c0000644000175400017540000005707313175724757017437 0ustar00biocbuildbiocbuild/**************************************************************************** * * * Weighted coverage of a set of integer ranges * * -------------------------------------------- * * * * Authors: H. Pag\`es and P. Aboyoun * * Code for "sort" method based on timing enhancements * * by Charles C. Berry * * * ****************************************************************************/ #include "IRanges.h" #include "S4Vectors_interface.h" #include /* for qsort() */ #include /* for R_CheckUserInterrupt() */ static const char *x_label, *shift_label, *width_label, *weight_label; static void check_recycling_was_round(int last_pos_in_current, int current_len, const char *current_label, const char *target_label) { if (current_len >= 2 && last_pos_in_current < current_len) warning("'%s' length is not a divisor of '%s' length", current_label, target_label); return; } /**************************************************************************** * "sort" method * ****************************************************************************/ /**************************************************************************** * Basic manipulation of the SEids buffer (Start/End ids). */ #define SEid_TO_1BASED_INDEX(SEid) ((SEid) >= 0 ? (SEid) : -(SEid)) #define SEid_IS_END(SEid) ((SEid) >= 0) static const int *base_start; static const int *base_width; static int compar_SEids_for_asc_order(const void *p1, const void *p2) { int SEid1, SEid2, index1, index2, s1, s2; SEid1 = *((const int *) p1); SEid2 = *((const int *) p2); index1 = SEid_TO_1BASED_INDEX(SEid1); index2 = SEid_TO_1BASED_INDEX(SEid2); /* If SEid is a Start id, then s = start If SEid is an End id, then s = end + 1 */ s1 = base_start[index1]; if (SEid_IS_END(SEid1)) s1 += base_width[index1]; s2 = base_start[index2]; if (SEid_IS_END(SEid2)) s2 += base_width[index2]; return s1 - s2; } /* Initialize the SEids buffer (integer weights). */ static int init_SEids_int_weight(int *SEids, const int *x_width, int x_len, const int *weight, int weight_len) { int SEids_len, i, j, index; SEids_len = 0; for (i = j = 0, index = 1; i < x_len; i++, j++, index++) { if (j >= weight_len) j = 0; /* recycle j */ if (x_width[i] == 0 || weight[j] == 0) continue; *(SEids++) = index; /* Start id */ *(SEids++) = - index; /* End id */ SEids_len += 2; } check_recycling_was_round(j, weight_len, weight_label, x_label); return SEids_len; } /* Initialize the SEids buffer (numeric weights). */ static int init_SEids_double_weight(int *SEids, const int *x_width, int x_len, const double *weight, int weight_len) { int SEids_len, i, j, index; SEids_len = 0; for (i = j = 0, index = 1; i < x_len; i++, j++, index++) { if (j >= weight_len) j = 0; /* recycle j */ if (x_width[i] == 0 || weight[j] == 0.0) continue; *(SEids++) = index; /* Start id */ *(SEids++) = - index; /* End id */ SEids_len += 2; } check_recycling_was_round(j, weight_len, weight_label, x_label); return SEids_len; } /* Sort the SEids buffer. */ static void sort_SEids(int *SEids, int SEids_len, const int *x_start, const int *x_width) { base_start = x_start - 1; base_width = x_width - 1; qsort(SEids, SEids_len, sizeof(int), compar_SEids_for_asc_order); return; } /**************************************************************************** * int_coverage_sort(), double_coverage_sort() */ /* 'values_buf' and 'lengths_buf' must have a length >= SEids_len + 1 */ static void compute_int_coverage_in_bufs(const int *SEids, int SEids_len, const int *x_start, const int *x_width, const int *weight, int weight_len, int cvg_len, int *values_buf, int *lengths_buf) { int curr_val, curr_weight, curr_pos, i, prev_pos, index; *(values_buf++) = curr_val = 0; curr_pos = 1; reset_ovflow_flag(); /* we use safe_int_add() in loop below */ for (i = 0; i < SEids_len; i++, SEids++) { if (i % 500000 == 499999) R_CheckUserInterrupt(); prev_pos = curr_pos; index = SEid_TO_1BASED_INDEX(*SEids) - 1; curr_pos = x_start[index]; curr_weight = weight[index % weight_len]; if (SEid_IS_END(*SEids)) { curr_weight = - curr_weight; curr_pos += x_width[index]; } curr_val = safe_int_add(curr_val, curr_weight); *(values_buf++) = curr_val; *(lengths_buf++) = curr_pos - prev_pos; } if (get_ovflow_flag()) warning("NAs produced by integer overflow"); *lengths_buf = cvg_len + 1 - curr_pos; return; } static void compute_double_coverage_in_bufs(const int *SEids, int SEids_len, const int *x_start, const int *x_width, const double *weight, int weight_len, int cvg_len, double *values_buf, int *lengths_buf) { double curr_val, curr_weight; int curr_pos, i, prev_pos, index; *(values_buf++) = curr_val = 0.0; curr_pos = 1; for (i = 0; i < SEids_len; i++, SEids++) { if (i % 500000 == 499999) R_CheckUserInterrupt(); prev_pos = curr_pos; index = SEid_TO_1BASED_INDEX(*SEids) - 1; curr_pos = x_start[index]; curr_weight = weight[index % weight_len]; if (SEid_IS_END(*SEids)) { curr_weight = - curr_weight; curr_pos += x_width[index]; } curr_val += curr_weight; *(values_buf++) = curr_val; *(lengths_buf++) = curr_pos - prev_pos; } *lengths_buf = cvg_len + 1 - curr_pos; return; } static SEXP int_coverage_sort(const int *x_start, const int *x_width, int x_len, const int *weight, int weight_len, int cvg_len) { int *SEids, SEids_len, zero, buf_len, *values_buf, *lengths_buf; SEids = (int *) R_alloc((long) 2 * x_len, sizeof(int)); SEids_len = init_SEids_int_weight(SEids, x_width, x_len, weight, weight_len); if (SEids_len == 0) { //return an Rle with one run of 0's zero = 0; return construct_integer_Rle(1, &zero, &cvg_len, 0); } sort_SEids(SEids, SEids_len, x_start, x_width); buf_len = SEids_len + 1; values_buf = (int *) R_alloc((long) buf_len, sizeof(int)); lengths_buf = (int *) R_alloc((long) buf_len, sizeof(int)); compute_int_coverage_in_bufs(SEids, SEids_len, x_start, x_width, weight, weight_len, cvg_len, values_buf, lengths_buf); return construct_integer_Rle(buf_len, values_buf, lengths_buf, 0); } static SEXP double_coverage_sort(const int *x_start, const int *x_width, int x_len, const double *weight, int weight_len, int cvg_len) { int *SEids, SEids_len, buf_len, *lengths_buf; double zero, *values_buf; SEids = (int *) R_alloc((long) 2 * x_len, sizeof(int)); SEids_len = init_SEids_double_weight(SEids, x_width, x_len, weight, weight_len); if (SEids_len == 0) { //return an Rle with one run of 0's zero = 0.0; return construct_numeric_Rle(1, &zero, &cvg_len, 0); } sort_SEids(SEids, SEids_len, x_start, x_width); buf_len = SEids_len + 1; values_buf = (double *) R_alloc((long) buf_len, sizeof(double)); lengths_buf = (int *) R_alloc((long) buf_len, sizeof(int)); compute_double_coverage_in_bufs(SEids, SEids_len, x_start, x_width, weight, weight_len, cvg_len, values_buf, lengths_buf); return construct_numeric_Rle(buf_len, values_buf, lengths_buf, 0); } static SEXP coverage_sort(const int *x_start, const int *x_width, int x_len, SEXP weight, int cvg_len) { int weight_len; weight_len = LENGTH(weight); return IS_INTEGER(weight) ? int_coverage_sort(x_start, x_width, x_len, INTEGER(weight), weight_len, cvg_len) : double_coverage_sort(x_start, x_width, x_len, REAL(weight), weight_len, cvg_len); } /**************************************************************************** * "hash" method * ****************************************************************************/ static SEXP int_coverage_hash( const int *x_start, const int *x_width, int x_len, const int *weight, int weight_len, int cvg_len) { int *cvg_buf, *cvg_p, w, cumsum, i, j; cvg_buf = (int *) R_alloc((long) cvg_len + 1, sizeof(int)); memset(cvg_buf, 0, cvg_len * sizeof(int)); reset_ovflow_flag(); /* we use safe_int_add() in loop below */ for (i = j = 0; i < x_len; i++, j++, x_start++, x_width++) { if (i % 500000 == 499999) R_CheckUserInterrupt(); if (j >= weight_len) j = 0; /* recycle j */ cvg_p = cvg_buf + *x_start - 1; w = weight[j]; *cvg_p = safe_int_add(*cvg_p, w); cvg_p += *x_width; *cvg_p = safe_int_add(*cvg_p, - w); } check_recycling_was_round(j, weight_len, weight_label, x_label); cumsum = 0; for (i = 0, cvg_p = cvg_buf; i < cvg_len; i++, cvg_p++) { cumsum = safe_int_add(*cvg_p, cumsum); *cvg_p = cumsum; } if (get_ovflow_flag()) warning("NAs produced by integer overflow"); return construct_integer_Rle(cvg_len, cvg_buf, NULL, 0); } static SEXP double_coverage_hash( const int *x_start, const int *x_width, int x_len, const double *weight, int weight_len, int cvg_len) { double *cvg_buf, *cvg_p, w, cumsum; int i, j; cvg_buf = (double *) R_alloc((long) cvg_len + 1, sizeof(double)); for (i = 0, cvg_p = cvg_buf; i < cvg_len; i++, cvg_p++) *cvg_p = 0.0; for (i = j = 0; i < x_len; i++, j++, x_start++, x_width++) { if (i % 500000 == 499999) R_CheckUserInterrupt(); if (j >= weight_len) j = 0; /* recycle j */ cvg_p = cvg_buf + *x_start - 1; w = weight[j]; *cvg_p += w; cvg_p += *x_width; *cvg_p -= w; } check_recycling_was_round(j, weight_len, weight_label, x_label); cumsum = 0.0; for (i = 0, cvg_p = cvg_buf; i < cvg_len; i++, cvg_p++) { cumsum += *cvg_p; *cvg_p = cumsum; } return construct_numeric_Rle(cvg_len, cvg_buf, NULL, 0); } static SEXP coverage_hash(const int *x_start, const int *x_width, int x_len, SEXP weight, int cvg_len) { int weight_len; weight_len = LENGTH(weight); return IS_INTEGER(weight) ? int_coverage_hash(x_start, x_width, x_len, INTEGER(weight), weight_len, cvg_len) : double_coverage_hash(x_start, x_width, x_len, REAL(weight), weight_len, cvg_len); } /**************************************************************************** * Helper functions for checking args of type SEXP. * * They either pass (and return nothing) or raise an error with an * * informative message. * ****************************************************************************/ static void check_arg_is_integer(SEXP arg, const char *arg_label) { if (!IS_INTEGER(arg)) error("'%s' must be an integer vector", arg_label); return; } static void check_arg_is_numeric(SEXP arg, const char *arg_label) { if (!(IS_INTEGER(arg) || IS_NUMERIC(arg))) error("'%s' must be an integer or numeric vector", arg_label); return; } static void check_arg_is_list(SEXP arg, const char *arg_label) { if (!isVectorList(arg)) error("'%s' must be a list", arg_label); return; } /* * Check that 'arg_len' is equal to 'x_len', or that it's the length of an * argument that can be recycled to the length of 'x'. * Assumes that 'arg_len' and 'x_len' are >= 0. */ static void check_arg_is_recyclable(int arg_len, int x_len, const char *arg_label, const char *x_label) { if (arg_len < x_len) { if (arg_len == 0) error("cannot recycle zero-length '%s' " "to the length of '%s'", arg_label, x_label); } else if (arg_len > x_len) { if (arg_len >= 2) error("'%s' is longer than '%s'", arg_label, x_label); } return; } /**************************************************************************** * compute_coverage_from_IRanges_holder() * ****************************************************************************/ /* * This is probably overly cautious. Could be that the cast from double to int * with (int) already does exactly this (i.e. produces an NA_INTEGER for all * the cases explicitely handled here) and is portable. */ static int double2int(double x) { if (x == R_PosInf || x == R_NegInf || ISNAN(x) /* NA or NaN */ || x >= (double) INT_MAX + 1.00 || x <= (double) INT_MIN) return NA_INTEGER; return (int) x; } /* * Args: * x_holder: A IRanges_holder struct holding the input ranges, those * ranges being those of a fictive IRanges object 'x'. * shift: A numeric (integer or double) vector parallel to 'x' (will * get recycled if necessary) with no NAs. * width: A single integer. NA or >= 0. * circle_len: A single integer. NA or > 0. * After the input ranges are shifted: * - If 'width' is a non-negative integer, then the ranges are clipped with * respect to the [1, width] interval and the function returns 'width'. * - If 'width' is NA, then the ranges are clipped with respect to the * [1, +inf) interval (i.e. they're only clipped on the left) and the * function returns 'max(end(x))' or 0 if 'x' is empty. * The shifted and clipped ranges are returned in 'out_ranges'. * Let's call 'cvg_len' the value returned by the function. If the output * ranges are in a tiling configuration with respect to the [1, cvg_len] * interval (i.e. they're non-overlapping, ordered from left to right, and * they fully cover the interval), then '*out_ranges_are_tiles' is set to 1. * Otherwise, it's set to 0. */ static int shift_and_clip_ranges(const IRanges_holder *x_holder, SEXP shift, int width, int circle_len, IntPairAE *out_ranges, int *out_ranges_are_tiles) { int x_len, shift_len, cvg_len, auto_cvg_len, prev_end, i, j, x_start, x_end, shift_elt, tmp; x_len = _get_length_from_IRanges_holder(x_holder); /* Check 'shift'. */ check_arg_is_numeric(shift, shift_label); shift_len = LENGTH(shift); check_arg_is_recyclable(shift_len, x_len, shift_label, x_label); /* Infer 'cvg_len' from 'width' and 'circle_len'. */ *out_ranges_are_tiles = 1; if (width == NA_INTEGER) { auto_cvg_len = 1; } else if (width < 0) { error("'%s' cannot be negative", width_label); } else if (width == 0) { return width; } else if (circle_len == NA_INTEGER) { auto_cvg_len = 0; } else if (circle_len <= 0) { error("length of underlying circular sequence is <= 0"); } else if (width > circle_len) { error("'%s' cannot be greater than length of " "underlying circular sequence", width_label); } else { auto_cvg_len = 1; } cvg_len = auto_cvg_len ? 0 : width; if (x_len == 0) { if (cvg_len != 0) *out_ranges_are_tiles = 0; return cvg_len; } IntPairAE_set_nelt(out_ranges, 0); prev_end = 0; for (i = j = 0; i < x_len; i++, j++) { if (j >= shift_len) j = 0; /* recycle j */ x_start = _get_start_elt_from_IRanges_holder(x_holder, i); x_end = _get_end_elt_from_IRanges_holder(x_holder, i); if (IS_INTEGER(shift)) { shift_elt = INTEGER(shift)[j]; if (shift_elt == NA_INTEGER) error("'%s' contains NAs", shift_label); } else { shift_elt = double2int(REAL(shift)[j]); if (shift_elt == NA_INTEGER) error("'%s' contains NAs, NaNs, or numbers " "that cannot be turned into integers", shift_label); } /* Risk of integer overflow! */ x_start += shift_elt; x_end += shift_elt; if (circle_len != NA_INTEGER) { tmp = x_start % circle_len; if (tmp <= 0) tmp += circle_len; x_end += tmp - x_start; x_start = tmp; } if (x_end < 0) { x_end = 0; } else if (x_end > cvg_len) { if (auto_cvg_len) cvg_len = x_end; else x_end = cvg_len; } if (x_start < 1) x_start = 1; else if (x_start > (tmp = cvg_len + 1)) x_start = tmp; if (*out_ranges_are_tiles) { if (x_start == prev_end + 1) prev_end = x_end; else *out_ranges_are_tiles = 0; } IntPairAE_insert_at(out_ranges, i, x_start, x_end - x_start + 1); } check_recycling_was_round(j, shift_len, shift_label, x_label); if (*out_ranges_are_tiles && x_end != cvg_len) *out_ranges_are_tiles = 0; return cvg_len; } /* * Args: * x_holder: A IRanges_holder struct holding the input ranges, those * ranges being those of a fictive IRanges object 'x'. * shift: A numeric (integer or double) vector parallel to 'x' (will * get recycled if necessary) with no NAs. * width: A single integer. NA or >= 0. * weight: A numeric (integer or double) vector parallel to 'x' (will * get recycled if necessary). * circle_len: A single integer. NA or > 0. * method: Either "auto", "sort", or "hash". * Returns an Rle object. */ static SEXP compute_coverage_from_IRanges_holder( const IRanges_holder *x_holder, SEXP shift, int width, SEXP weight, int circle_len, SEXP method, IntPairAE *ranges_buf) { int x_len, cvg_len, out_ranges_are_tiles, weight_len, effective_method, take_short_path; const int *x_start, *x_width; const char *method0; x_len = _get_length_from_IRanges_holder(x_holder); cvg_len = shift_and_clip_ranges(x_holder, shift, width, circle_len, ranges_buf, &out_ranges_are_tiles); x_start = ranges_buf->a->elts; x_width = ranges_buf->b->elts; /* Check 'weight'. */ check_arg_is_numeric(weight, weight_label); weight_len = LENGTH(weight); check_arg_is_recyclable(weight_len, x_len, weight_label, x_label); /* Infer 'effective_method' from 'method' and 'cvg_len'. */ if (!IS_CHARACTER(method) || LENGTH(method) != 1) error("'method' must be a single string"); method = STRING_ELT(method, 0); if (method == NA_STRING) error("'method' cannot be NA"); method0 = CHAR(method); if (strcmp(method0, "auto") == 0) { /* Based on empirical observation. */ effective_method = x_len <= 0.25 * cvg_len ? 1 : 2; } else if (strcmp(method0, "sort") == 0) { effective_method = 1; } else if (strcmp(method0, "hash") == 0) { effective_method = 2; } else { error("'method' must be \"auto\", \"sort\", or \"hash\""); } //Rprintf("out_ranges_are_tiles = %d\n", out_ranges_are_tiles); //Rprintf("x_len = %d\n", x_len); //Rprintf("cvg_len = %d\n", cvg_len); if (out_ranges_are_tiles) { if (cvg_len == 0) { take_short_path = 1; x_len = 0; } else if (weight_len == 1) { take_short_path = 1; x_len = 1; x_width = &cvg_len; } else if (weight_len == x_len) { take_short_path = 1; } else { take_short_path = 0; } if (take_short_path) { /* Short path for the tiling case. */ //Rprintf("taking short path\n"); return IS_INTEGER(weight) ? construct_integer_Rle(x_len, INTEGER(weight), x_width, 0) : construct_numeric_Rle(x_len, REAL(weight), x_width, 0); } } //Rprintf("taking normal path\n"); return effective_method == 1 ? coverage_sort(x_start, x_width, x_len, weight, cvg_len) : coverage_hash(x_start, x_width, x_len, weight, cvg_len); } /* --- .Call ENTRY POINT --- * Args: * x: An IRanges object. * shift: A numeric (integer or double) vector parallel to 'x' (will * get recycled if necessary) with no NAs. * width: A single integer. NA or >= 0. * weight: A numeric (integer or double) vector parallel to 'x' (will * get recycled if necessary). * circle_len: A single integer. NA or > 0. * method: Either "auto", "sort", or "hash". * Returns an Rle object. */ SEXP IRanges_coverage(SEXP x, SEXP shift, SEXP width, SEXP weight, SEXP circle_len, SEXP method) { IRanges_holder x_holder; int x_len; IntPairAE *ranges_buf; x_holder = _hold_IRanges(x); x_len = _get_length_from_IRanges_holder(&x_holder); /* Check 'width'. */ check_arg_is_integer(width, "width"); if (LENGTH(width) != 1) error("'%s' must be a single integer", "width"); /* Check 'circle_len'. */ check_arg_is_integer(circle_len, "circle.length"); if (LENGTH(circle_len) != 1) error("'%s' must be a single integer", "circle.length"); ranges_buf = new_IntPairAE(x_len, 0); x_label = "x"; shift_label = "shift"; width_label = "width"; weight_label = "weight"; return compute_coverage_from_IRanges_holder(&x_holder, shift, INTEGER(width)[0], weight, INTEGER(circle_len)[0], method, ranges_buf); } /* --- .Call ENTRY POINT --- * Args: * x: A CompressedIRangesList object of length N. * shift: A list of length N (will get recycled if necessary). After * recycling, each list element must be a numeric (integer or * double) vector parallel to x[[i]] that will itself get * recycled if necessary, and with no NAs. * width: An integer vector of length N (will get recycled if * necessary). Values must be NAs or >= 0. * or a single non-negative number. * weight: A list of length N (will get recycled if necessary). After * recycling, each list element must be a numeric (integer or * double) vector parallel to x[[i]] that will itself get * recycled if necessary. * circle_lens: An integer vector of length N (will get recycled if * necessary). Values must be NAs or > 0. * method: Either "auto", "sort", or "hash". * Returns a list of N RleList objects. */ SEXP CompressedIRangesList_coverage(SEXP x, SEXP shift, SEXP width, SEXP weight, SEXP circle_lens, SEXP method) { CompressedIRangesList_holder x_holder; int x_len, shift_len, width_len, weight_len, circle_lens_len, i, j, k, l, m; IntPairAE *ranges_buf; SEXP ans, ans_elt, shift_elt, weight_elt; IRanges_holder x_elt_holder; char x_label_buf[40], shift_label_buf[40], width_label_buf[40], weight_label_buf[40]; x_holder = _hold_CompressedIRangesList(x); x_len = _get_length_from_CompressedIRangesList_holder(&x_holder); /* Check 'shift'. */ check_arg_is_list(shift, "shift"); shift_len = LENGTH(shift); check_arg_is_recyclable(shift_len, x_len, "shift", "x"); /* Check 'width'. */ check_arg_is_integer(width, "width"); width_len = LENGTH(width); check_arg_is_recyclable(width_len, x_len, "width", "x"); /* Check 'weight'. */ check_arg_is_list(weight, "weight"); weight_len = LENGTH(weight); check_arg_is_recyclable(weight_len, x_len, "weight", "x"); /* Check 'circle_lens'. */ check_arg_is_integer(circle_lens, "circle.length"); circle_lens_len = LENGTH(circle_lens); check_arg_is_recyclable(circle_lens_len, x_len, "circle.length", "x"); ranges_buf = new_IntPairAE(0, 0); x_label = x_label_buf; shift_label = shift_label_buf; width_label = width_label_buf; weight_label = weight_label_buf; PROTECT(ans = NEW_LIST(x_len)); for (i = j = k = l = m = 0; i < x_len; i++, j++, k++, l++, m++) { if (j >= shift_len) j = 0; /* recycle j */ if (k >= width_len) k = 0; /* recycle k */ if (l >= weight_len) l = 0; /* recycle l */ if (m >= circle_lens_len) m = 0; /* recycle m */ snprintf(x_label_buf, sizeof(x_label_buf), "x[[%d]]", i + 1); snprintf(shift_label_buf, sizeof(shift_label_buf), "shift[[%d]]", j + 1); snprintf(width_label_buf, sizeof(width_label_buf), "width[%d]", k + 1); snprintf(weight_label_buf, sizeof(weight_label_buf), "weight[[%d]]", l + 1); x_elt_holder = _get_elt_from_CompressedIRangesList_holder( &x_holder, i); shift_elt = VECTOR_ELT(shift, j); weight_elt = VECTOR_ELT(weight, l); PROTECT(ans_elt = compute_coverage_from_IRanges_holder( &x_elt_holder, shift_elt, INTEGER(width)[k], weight_elt, INTEGER(circle_lens)[m], method, ranges_buf)); SET_VECTOR_ELT(ans, i, ans_elt); UNPROTECT(1); } check_recycling_was_round(j, shift_len, "shift", "x"); check_recycling_was_round(k, width_len, "width", "x"); check_recycling_was_round(l, weight_len, "weight", "x"); check_recycling_was_round(m, circle_lens_len, "circle.length", "x"); UNPROTECT(1); return ans; } IRanges/src/inter_range_methods.c0000644000175400017540000003367413175724757020142 0ustar00biocbuildbiocbuild/**************************************************************************** * Fast inter-range methods * * Author: H. Pag\`es * ****************************************************************************/ #include "IRanges.h" #include "S4Vectors_interface.h" #include #define R_INT_MIN (1+INT_MIN) /**************************************************************************** * Low-level helper functions. */ static int get_maxNROWS_from_CompressedIRangesList_holder( const CompressedIRangesList_holder *x_holder) { int x_len, ir_len_max, i, ir_len; x_len = _get_length_from_CompressedIRangesList_holder(x_holder); ir_len_max = 0; for (i = 0; i < x_len; i++) { ir_len = _get_eltNROWS_from_CompressedIRangesList_holder( x_holder, i); if (ir_len > ir_len_max) ir_len_max = ir_len; } return ir_len_max; } static int append_IRanges_holder_to_IntPairAE(IntPairAE *intpair_ae, const IRanges_holder *ir_holder) { int ir_len, j, start, width; ir_len = _get_length_from_IRanges_holder(ir_holder); for (j = 0; j < ir_len; j++) { start = _get_start_elt_from_IRanges_holder(ir_holder, j); width = _get_width_elt_from_IRanges_holder(ir_holder, j); IntPairAE_insert_at(intpair_ae, IntPairAE_get_nelt(intpair_ae), start, width); } return ir_len; } /**************************************************************************** * "range" method. */ /* --- .Call ENTRY POINT --- */ SEXP IRanges_range(SEXP x) { int x_len, min, max, i, end; const int *start_p, *width_p; SEXP ans, ans_start, ans_width; x_len = _get_IRanges_length(x); if (x_len == 0) { PROTECT(ans_start = NEW_INTEGER(0)); PROTECT(ans_width = NEW_INTEGER(0)); PROTECT(ans = _new_IRanges("IRanges", ans_start, ans_width, R_NilValue)); UNPROTECT(3); return ans; } start_p = INTEGER(_get_IRanges_start(x)); width_p = INTEGER(_get_IRanges_width(x)); min = *(start_p++); max = min + *(width_p++) - 1; for (i = 1; i < x_len; i++, start_p++, width_p++) { if (*start_p < min) min = *start_p; end = *start_p + *width_p - 1; if (end > max) max = end; } PROTECT(ans_start = ScalarInteger(min)); PROTECT(ans_width = ScalarInteger(max - min + 1)); PROTECT(ans = _new_IRanges("IRanges", ans_start, ans_width, R_NilValue)); UNPROTECT(3); return ans; } /**************************************************************************** * "ranges" methods. */ /* WARNING: The reduced ranges are *appended* to 'out_ranges'! Returns the number of ranges that were appended. */ static int reduce_ranges(const int *x_start, const int *x_width, int x_len, int drop_empty_ranges, int min_gapwidth, int *order_buf, IntPairAE *out_ranges, IntAEAE *revmap, int *out_inframe_start) { int out_len, out_len0, i, j, start_j, width_j, end_j, append_or_drop, max_end, gapwidth, delta, width_inc; IntAE *tmp, *revmap_elt; if (min_gapwidth < 0) error("IRanges internal error in reduce_ranges(): " "negative min_gapwidth not supported"); get_order_of_int_pairs(x_start, x_width, x_len, 0, 0, order_buf, 0); out_len = out_len0 = IntPairAE_get_nelt(out_ranges); for (i = 0; i < x_len; i++) { j = order_buf[i]; start_j = x_start[j]; width_j = x_width[j]; end_j = start_j + width_j - 1; if (i == 0) { /* 'append_or_drop' is a toggle that indicates how the current input range should be added to 'out_ranges': 1 for appended (or dropped), 0 for merged. */ append_or_drop = 1; max_end = end_j; delta = start_j - 1; } else { /* If 'i' != 0 and 'append_or_drop' is 1 then the previous range was empty so 'gapwidth' will be >= 0. */ gapwidth = start_j - max_end - 1; if (gapwidth >= min_gapwidth) append_or_drop = 1; } if (append_or_drop) { if (width_j != 0 || (!drop_empty_ranges && (out_len == out_len0 || start_j != out_ranges->a->elts[ out_len - 1]))) { /* Append to 'out_ranges'. */ IntPairAE_insert_at(out_ranges, out_len, start_j, width_j); if (revmap != NULL) { /* Append to 'revmap'. */ tmp = new_IntAE(1, 1, j + 1); IntAEAE_insert_at(revmap, out_len, tmp); revmap_elt = revmap->elts[out_len]; } out_len++; append_or_drop = 0; } max_end = end_j; if (i != 0) delta += gapwidth; } else { width_inc = end_j - max_end; if (width_inc > 0) { /* Merge with last range in 'out_ranges'. */ out_ranges->b->elts[out_len - 1] += width_inc; max_end = end_j; } if (!(width_j == 0 && drop_empty_ranges) && revmap != NULL) { /* Append to 'revmap'. */ IntAE_insert_at(revmap_elt, IntAE_get_nelt(revmap_elt), j + 1); } } if (out_inframe_start != NULL) out_inframe_start[j] = start_j - delta; } return out_len - out_len0; } /* --- .Call ENTRY POINT --- */ SEXP Ranges_reduce(SEXP x_start, SEXP x_width, SEXP drop_empty_ranges, SEXP min_gapwidth, SEXP with_revmap, SEXP with_inframe_start) { int x_len, *inframe_start; const int *x_start_p, *x_width_p; SEXP ans, ans_names, ans_revmap, ans_inframe_start; IntPairAE *out_ranges; IntAE *order_buf; IntAEAE *revmap; x_len = check_integer_pairs(x_start, x_width, &x_start_p, &x_width_p, "start(x)", "width(x)"); if (LOGICAL(with_revmap)[0]) { revmap = new_IntAEAE(0, 0); } else { revmap = NULL; } if (LOGICAL(with_inframe_start)[0]) { PROTECT(ans_inframe_start = NEW_INTEGER(x_len)); inframe_start = INTEGER(ans_inframe_start); } else { inframe_start = NULL; } out_ranges = new_IntPairAE(0, 0); order_buf = new_IntAE(x_len, 0, 0); reduce_ranges(x_start_p, x_width_p, x_len, LOGICAL(drop_empty_ranges)[0], INTEGER(min_gapwidth)[0], order_buf->elts, out_ranges, revmap, inframe_start); /* Make 'ans' */ PROTECT(ans = NEW_LIST(4)); PROTECT(ans_names = NEW_CHARACTER(4)); SET_STRING_ELT(ans_names, 0, mkChar("start")); SET_STRING_ELT(ans_names, 1, mkChar("width")); SET_STRING_ELT(ans_names, 2, mkChar("revmap")); SET_STRING_ELT(ans_names, 3, mkChar("inframe.start")); SET_NAMES(ans, ans_names); UNPROTECT(1); SET_VECTOR_ELT(ans, 0, new_INTEGER_from_IntAE(out_ranges->a)); SET_VECTOR_ELT(ans, 1, new_INTEGER_from_IntAE(out_ranges->b)); if (revmap != NULL) { PROTECT(ans_revmap = new_LIST_from_IntAEAE(revmap, 0)); SET_VECTOR_ELT(ans, 2, ans_revmap); UNPROTECT(1); } if (inframe_start != NULL) { SET_VECTOR_ELT(ans, 3, ans_inframe_start); UNPROTECT(1); } UNPROTECT(1); return ans; } /* --- .Call ENTRY POINT --- */ SEXP CompressedIRangesList_reduce(SEXP x, SEXP drop_empty_ranges, SEXP min_gapwidth, SEXP with_revmap) { SEXP ans, ans_names, ans_revmap, ans_breakpoints; //ans_unlistData, ans_partitioning; CompressedIRangesList_holder x_holder; IRanges_holder ir_holder; int x_len, in_len_max, i; IntAE *order_buf; IntPairAE *in_ranges, *out_ranges; IntAEAE *revmap; x_holder = _hold_CompressedIRangesList(x); x_len = _get_length_from_CompressedIRangesList_holder(&x_holder); if (LOGICAL(with_revmap)[0]) { revmap = new_IntAEAE(0, 0); } else { revmap = NULL; } in_len_max = get_maxNROWS_from_CompressedIRangesList_holder(&x_holder); order_buf = new_IntAE(in_len_max, 0, 0); in_ranges = new_IntPairAE(0, 0); out_ranges = new_IntPairAE(0, 0); PROTECT(ans_breakpoints = NEW_INTEGER(x_len)); for (i = 0; i < x_len; i++) { ir_holder = _get_elt_from_CompressedIRangesList_holder(&x_holder, i); IntPairAE_set_nelt(in_ranges, 0); append_IRanges_holder_to_IntPairAE(in_ranges, &ir_holder); reduce_ranges(in_ranges->a->elts, in_ranges->b->elts, IntPairAE_get_nelt(in_ranges), LOGICAL(drop_empty_ranges)[0], INTEGER(min_gapwidth)[0], order_buf->elts, out_ranges, revmap, NULL); INTEGER(ans_breakpoints)[i] = IntPairAE_get_nelt(out_ranges); } /* Make 'ans' */ PROTECT(ans = NEW_LIST(4)); PROTECT(ans_names = NEW_CHARACTER(4)); SET_STRING_ELT(ans_names, 0, mkChar("start")); SET_STRING_ELT(ans_names, 1, mkChar("width")); SET_STRING_ELT(ans_names, 2, mkChar("revmap")); SET_STRING_ELT(ans_names, 3, mkChar("breakpoints")); SET_NAMES(ans, ans_names); UNPROTECT(1); SET_VECTOR_ELT(ans, 0, new_INTEGER_from_IntAE(out_ranges->a)); SET_VECTOR_ELT(ans, 1, new_INTEGER_from_IntAE(out_ranges->b)); if (revmap != NULL) { PROTECT(ans_revmap = new_LIST_from_IntAEAE(revmap, 0)); SET_VECTOR_ELT(ans, 2, ans_revmap); UNPROTECT(1); } SET_VECTOR_ELT(ans, 3, ans_breakpoints); UNPROTECT(2); /* PROTECT(ans_unlistData = _new_IRanges_from_IntPairAE("IRanges", out_ranges)); PROTECT(ans_names = duplicate(_get_CompressedList_names(x))); PROTECT(ans_partitioning = _new_PartitioningByEnd( "PartitioningByEnd", ans_breakpoints, ans_names)); PROTECT(ans = _new_CompressedList(get_classname(x), ans_unlistData, ans_partitioning)); UNPROTECT(5); */ return ans; } /**************************************************************************** * "gaps" methods. */ /* WARNING: The ranges representing the gaps are *appended* to 'out_ranges'! Returns the number of ranges that were appended. */ static int gaps_ranges(const int *x_start, const int *x_width, int x_len, int restrict_start, int restrict_end, int *order_buf, IntPairAE *out_ranges) { int out_len, out_len0, i, j, start_j, width_j, end_j, max_end, gapstart, gapwidth; if (restrict_start != NA_INTEGER) max_end = restrict_start - 1; else max_end = NA_INTEGER; get_order_of_int_pairs(x_start, x_width, x_len, 0, 0, order_buf, 0); out_len = out_len0 = IntPairAE_get_nelt(out_ranges); for (i = 0; i < x_len; i++) { j = order_buf[i]; width_j = x_width[j]; if (width_j == 0) continue; start_j = x_start[j]; end_j = start_j + width_j - 1; if (max_end == NA_INTEGER) { max_end = end_j; } else { gapstart = max_end + 1; if (restrict_end != NA_INTEGER && start_j > restrict_end + 1) start_j = restrict_end + 1; gapwidth = start_j - gapstart; if (gapwidth >= 1) { /* Append to 'out_ranges'. */ IntPairAE_insert_at(out_ranges, out_len, gapstart, gapwidth); out_len++; max_end = end_j; } else if (end_j > max_end) { max_end = end_j; } } if (restrict_end != NA_INTEGER && max_end >= restrict_end) break; } if (restrict_end != NA_INTEGER && max_end != NA_INTEGER && max_end < restrict_end) { gapstart = max_end + 1; gapwidth = restrict_end - max_end; /* Append to 'out_ranges'. */ IntPairAE_insert_at(out_ranges, out_len, gapstart, gapwidth); out_len++; } return out_len - out_len0; } /* --- .Call ENTRY POINT --- */ SEXP IRanges_gaps(SEXP x_start, SEXP x_width, SEXP start, SEXP end) { int x_len; const int *x_start_p, *x_width_p; SEXP ans, ans_names; IntPairAE *out_ranges; IntAE *order_buf; x_len = check_integer_pairs(x_start, x_width, &x_start_p, &x_width_p, "start(x)", "width(x)"); out_ranges = new_IntPairAE(0, 0); order_buf = new_IntAE(x_len, 0, 0); gaps_ranges(x_start_p, x_width_p, x_len, INTEGER(start)[0], INTEGER(end)[0], order_buf->elts, out_ranges); PROTECT(ans = NEW_LIST(2)); PROTECT(ans_names = NEW_CHARACTER(2)); SET_STRING_ELT(ans_names, 0, mkChar("start")); SET_STRING_ELT(ans_names, 1, mkChar("width")); SET_NAMES(ans, ans_names); UNPROTECT(1); SET_VECTOR_ELT(ans, 0, new_INTEGER_from_IntAE(out_ranges->a)); SET_VECTOR_ELT(ans, 1, new_INTEGER_from_IntAE(out_ranges->b)); UNPROTECT(1); return ans; } /* --- .Call ENTRY POINT --- */ SEXP CompressedIRangesList_gaps(SEXP x, SEXP start, SEXP end) { SEXP ans, ans_names, ans_unlistData, ans_breakpoints, ans_partitioning; CompressedIRangesList_holder x_holder; IRanges_holder ir_holder; int x_len, in_len_max, start_len, end_len, *start_elt, *end_elt, i; IntAE *order_buf; IntPairAE *in_ranges, *out_ranges; x_holder = _hold_CompressedIRangesList(x); x_len = _get_length_from_CompressedIRangesList_holder(&x_holder); in_len_max = get_maxNROWS_from_CompressedIRangesList_holder(&x_holder); order_buf = new_IntAE(in_len_max, 0, 0); in_ranges = new_IntPairAE(0, 0); out_ranges = new_IntPairAE(0, 0); start_len = LENGTH(start); end_len = LENGTH(end); if (start_len != 1 && start_len != x_len) error("'start' must have length 1 or the length of 'x'"); if (end_len != 1 && end_len != x_len) error("'end' must have length 1 or the length of 'x'"); PROTECT(ans_breakpoints = NEW_INTEGER(x_len)); start_elt = INTEGER(start); end_elt = INTEGER(end); for (i = 0; i < x_len; i++) { ir_holder = _get_elt_from_CompressedIRangesList_holder(&x_holder, i); IntPairAE_set_nelt(in_ranges, 0); append_IRanges_holder_to_IntPairAE(in_ranges, &ir_holder); gaps_ranges(in_ranges->a->elts, in_ranges->b->elts, IntPairAE_get_nelt(in_ranges), *start_elt, *end_elt, order_buf->elts, out_ranges); INTEGER(ans_breakpoints)[i] = IntPairAE_get_nelt(out_ranges); if (start_len != 1) start_elt++; if (end_len != 1) end_elt++; } PROTECT(ans_unlistData = _new_IRanges_from_IntPairAE("IRanges", out_ranges)); PROTECT(ans_names = duplicate(_get_CompressedList_names(x))); PROTECT(ans_partitioning = _new_PartitioningByEnd( "PartitioningByEnd", ans_breakpoints, ans_names)); PROTECT(ans = _new_CompressedList(get_classname(x), ans_unlistData, ans_partitioning)); UNPROTECT(5); return ans; } /**************************************************************************** * "disjointBins" method. */ /* --- .Call ENTRY POINT --- * Worst case complexity of O(n^2) :(, but in practice very fast. */ SEXP Ranges_disjointBins(SEXP x_start, SEXP x_width) { SEXP ans; IntAE *bin_ends = new_IntAE(128, 0, 0); PROTECT(ans = NEW_INTEGER(length(x_start))); for (int i = 0; i < length(x_start); i++) { // find a bin, starting at first int j = 0, end = INTEGER(x_start)[i] + INTEGER(x_width)[i] - 1; for (; j < IntAE_get_nelt(bin_ends) && bin_ends->elts[j] >= INTEGER(x_start)[i]; j++); // remember when this bin will be open if (j == IntAE_get_nelt(bin_ends)) IntAE_append(bin_ends, &end, 1); else bin_ends->elts[j] = end; // store the bin for this range INTEGER(ans)[i] = j + 1; } UNPROTECT(1); return ans; } IRanges/tests/0000755000175400017540000000000013175713360014300 5ustar00biocbuildbiocbuildIRanges/tests/run_unitTests.R0000644000175400017540000000011713175713360017310 0ustar00biocbuildbiocbuildrequire("IRanges") || stop("unable to load IRanges package") IRanges:::.test() IRanges/vignettes/0000755000175400017540000000000013175724757015162 5ustar00biocbuildbiocbuildIRanges/vignettes/IRangesOverview.Rnw0000644000175400017540000007512613175713360020730 0ustar00biocbuildbiocbuild%\VignetteIndexEntry{An Introduction to IRanges} %\VignetteDepends{} %\VignetteKeywords{Ranges} %\VignettePackage{IRanges} \documentclass[10pt]{article} \usepackage{times} \usepackage{hyperref} \textwidth=6.5in \textheight=8.5in %\parskip=.3cm \oddsidemargin=-.1in \evensidemargin=-.1in \headheight=-.3in \newcommand{\Rfunction}[1]{{\texttt{#1}}} \newcommand{\Robject}[1]{{\texttt{#1}}} \newcommand{\Rpackage}[1]{{\textit{#1}}} \newcommand{\Rmethod}[1]{{\texttt{#1}}} \newcommand{\Rfunarg}[1]{{\texttt{#1}}} \newcommand{\Rclass}[1]{{\textit{#1}}} \newcommand{\Rcode}[1]{{\texttt{#1}}} \newcommand{\software}[1]{\textsf{#1}} \newcommand{\R}{\software{R}} \newcommand{\IRanges}{\Rpackage{IRanges}} \title{An Introduction to \IRanges{}} \author{Patrick Aboyoun, Michael Lawrence, Herv\'e Pag\`es} \date{\today} \begin{document} \maketitle <>= options(width=72) @ \section{Introduction} The \IRanges{} package is designed to represent sequences, ranges representing indices along those sequences, and data related to those ranges. In this vignette, we will rely on simple, illustrative example datasets, rather than large, real-world data, so that each data structure and algorithm can be explained in an intuitive, graphical manner. We expect that packages that apply \IRanges{} to a particular problem domain will provide vignettes with relevant, realistic examples. The \IRanges{} package is available at bioconductor.org and can be downloaded via \Rfunction{biocLite}: <>= source("http://bioconductor.org/biocLite.R") biocLite("IRanges") @ <>= library(IRanges) @ \section{Vector objects} In the context of the \IRanges{} package, a sequence is an ordered finite collection of elements. The \IRanges{} packages represents two types of objects as sequences: (1) atomic sequences and (2) lists (or non-atomic sequences). The following subsections describe each in turn. All \IRanges{}-derived sequences inherit from the \Rclass{Vector} virtual class. \subsection{Atomic Vectors} In \R{}, atomic sequences are typically stored in atomic vectors. The \IRanges{} package includes an additional atomic sequence object type, \Rclass{Rle}, which compresses an atomic sequence through run-length encoding. We begin our discussion of atomic sequences using two \Rclass{Rle} vectors. <>= set.seed(0) lambda <- c(rep(0.001, 4500), seq(0.001, 10, length = 500), seq(10, 0.001, length = 500)) xVector <- Rle(rpois(1e7, lambda)) yVector <- Rle(rpois(1e7, lambda[c(251:length(lambda), 1:250)])) @ All atomic sequences in \R{} have three main properties: (1) a notion of length or number of elements, (2) the ability to extract elements to create new atomic sequences, and (3) the ability to be combined with one or more atomic sequences to form larger atomic sequences. The main functions for these three operations are \Rfunction{length}, \Rfunction{[}, and \Rfunction{c}. <>= length(xVector) xVector[1] zVector <- c(xVector, yVector) @ While these three methods may seem trivial, they provide a great deal of power and many atomic sequence manipulations can be constructed using them. \subsubsection{Vector Subsetting} As with ordinary \R{} atomic vectors, it is often necessary to subset one sequence from another. When this subsetting does not duplicate or reorder the elements being extracted, the result is called a \textit{subsequence}. In general, the \Rfunction{[} function can be used to construct a new sequence or extract a subsequence, but its interface is often inconvenient and not amenable to optimization. To compensate for this, the \IRanges{} package supports seven additional functions for sequence extraction: \begin{enumerate} \item \Rfunction{window} - Extracts a subsequence over a specified region. \item \Rfunction{subset} - Extracts the subsequence specified by a logical vector. \item \Rfunction{head} - Extracts a consecutive subsequence containing the first n elements. \item \Rfunction{tail} - Extracts a consecutive subsequence containing the last n elements. \item \Rfunction{rev} - Creates a new sequence with the elements in the reverse order. \item \Rfunction{rep} - Creates a new sequence by repeating sequence elements. \end{enumerate} The following code illustrates how these functions are used on an ordinary \R{} \Rclass{integer} vector: <>= xSnippet <- xVector[IRanges(4751, 4760)] xSnippet head(xSnippet) tail(xSnippet) rev(xSnippet) rep(xSnippet, 2) subset(xSnippet, xSnippet >= 5L) @ \subsubsection{Combining Vectors} The \IRanges{} package uses two generic functions, \Rfunction{c} and \Rfunction{append}, for combining two \Rclass{Vector} objects. The methods for \Rclass{Vector} objects follow the definition that these two functions are given the the \Rpackage{base} package. <>= c(xSnippet, rev(xSnippet)) append(xSnippet, xSnippet, after = 3) @ \subsubsection{Looping over Vectors and Vector subsets} In \R{}, \Rfunction{for} looping can be an expensive operation. To compensate for this, \IRanges{} uses three generics, \Rfunction{endoapply}, \Rfunction{lapply}, and \Rfunction{sapply}, for looping over sequences and two generics, \Rfunction{aggregate} and \Rfunction{shiftApply}, to perform calculations over subsequences. The \Rfunction{lapply} and \Rfunction{sapply} functions are familiar to many \R{} users since they are the standard functions for looping over the elements of an \R{} \Rclass{list} object. The \Rfunction{endoapply} function performs an endomorphism equivalent to \Rfunction{lapply}, i.e. returns a \Rclass{Vector} object of the same class as the input rather than a \Rclass{list} object. More will be given on these three functions in the Lists subsection. The \Rfunction{aggregate} function combines sequence extraction functionality of the \Rfunction{window} function with looping capabilities of the \Rfunction{sapply} function. For example, here is some code to compute medians across a moving window of width 3 using the function \Rfunction{aggregate}: <>= xSnippet aggregate(xSnippet, start = 1:8, width = 3, FUN = median) @ The \Rfunction{shiftApply} function is a looping operation involving two sequences whose elements are lined up via a positional shift operation. For example, the elements of \Robject{xVector} and \Robject{yVector} were simulated from Poisson distributions with the mean of element i from \Robject{yVector} being equivalent to the mean of element i + 250 from \Robject{xVector}. If we did not know the size of the shift, we could estimate it by finding the shift that maximizes the correlation between \Robject{xVector} and \Robject{yVector}. <>= cor(xVector, yVector) shifts <- seq(235, 265, by=3) corrs <- shiftApply(shifts, yVector, xVector, FUN = cor) @ % <>= plot(shifts, corrs) @ The result is shown in Fig.~\ref{figshiftcorrs}. \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-figshiftcorrs} \caption{\label{figshiftcorrs}% Correlation between \Robject{xVector} and \Robject{yVector} for various shifts.} \end{center} \end{figure} \subsubsection{Run Length Encoding} Up until this point we have used \R{} atomic vectors to represent atomic sequences, but there are times when these object become too large to manage in memory. When there are lots of consecutive repeats in the sequence, the data can be compressed and managed in memory through a run-length encoding where a data value is paired with a run length. For example, the sequence \{1, 1, 1, 2, 3, 3\} can be represented as values = \{1, 2, 3\}, run lengths = \{3, 1, 2\}. The \Rclass{Rle} class in \IRanges{} is used to represent a run-length encoded (compressed) sequence of \Rclass{logical}, \Rclass{integer}, \Rclass{numeric}, \Rclass{complex}, \Rclass{character}, or \Rclass{raw} values. One way to construct an \Rclass{Rle} object is through the \Rclass{Rle} constructor function: <>= xRle <- Rle(xVector) yRle <- Rle(yVector) xRle yRle @ When there are lots of consecutive repeats, the memory savings through an RLE can be quite dramatic. For example, the \Robject{xRle} object occupies less than one quarter of the space of the original \Robject{xVector} object, while storing the same information: <>= as.vector(object.size(xRle) / object.size(xVector)) identical(as.vector(xRle), xVector) @ The functions \Rfunction{runValue} and \Rfunction{runLength} extract the run values and run lengths from an \Rclass{Rle} object respectively: <>= head(runValue(xRle)) head(runLength(xRle)) @ The \Rclass{Rle} class supports many of the basic methods associated with \R{} atomic vectors including the Ops, Math, Math2, Summary, and Complex group generics. Here is a example of manipulating \Rclass{Rle} objects using methods from the Ops group: <>= xRle > 0 xRle + yRle xRle > 0 | yRle > 0 @ Here are some from the Summary group: <>= range(xRle) sum(xRle > 0 | yRle > 0) @ And here is one from the Math group: <>= log1p(xRle) @ As with the atomic vectors, the \Rfunction{cor} and \Rfunction{shiftApply} functions operate on \Rclass{Rle} objects: <>= cor(xRle, yRle) shiftApply(249:251, yRle, xRle, FUN = function(x, y) var(x, y) / (sd(x) * sd(y))) @ For more information on the methods supported by the \Rclass{Rle} class, consult the \Rcode{Rle} man page. \subsection{Lists} In many data analysis situation there is a desire to organize and manipulate multiple objects simultaneously. Typically this is done in \R{} through the usage of a list. While a list serves as a generic container, it does not confer any information about the specific class of its elements, provides no infrastructure to ensure type safety, and the S3 and S4 method dispatch mechanisms do not support method selection for lists with homogeneous object types. The \Rclass{List} virtual class defined in the \IRanges{} package addresses these issues. \Rclass{List} is a direct extension of \Rclass{Vector}. \subsubsection{Lists of Atomic Vectors} The first type of lists we consider are those containing atomic sequences such as \Rclass{integer} vectors or \Rclass{Rle} objects. We may wish to define a method that retrieves the length of each atomic sequence element, without special type checking. To enable this, we define collection classes such as \Rclass{IntegerList} and \Rclass{RleList}, which inherit from the \Rclass{List} virtual class, for representing lists of \Rclass{integer} vectors and \Rclass{Rle} objects respectively. <>= getClassDef("RleList") @ As the class definition above shows, the \Rclass{RleList} class is virtual with subclasses \Rclass{SimpleRleList} and \Rclass{CompressedRleList}. A \Rclass{SimpleRleList} class uses a regular \R{} list to store the underlying elements and the \Rclass{CompressedRleList} class stores the elements in an unlisted form and keeps track of where the element breaks are. The former ``simple list" class is useful when the Rle elements are long and the latter ``compressed list" class is useful when the list is long and/or sparse (i.e. a number of the list elements have length 0). In fact, all of the atomic vector types (raw, logical, integer, numeric, complex, and character) have similar list classes that derive from the \Rclass{List} virtual class. For example, there is an \Rclass{IntegerList} virtual class with subclasses \Rclass{SimpleIntegerList} and \Rclass{CompressedIntegerList}. Each of the list classes for atomic sequences, be they stored as vectors or \Rclass{Rle} objects, have a constructor function with a name of the appropriate list virtual class, such as \Rclass{IntegerList}, and an optional argument \Rfunarg{compress} that takes an argument to specify whether or not to create the simple list object type or the compressed list object type. The default is to create the compressed list object type. <>= args(IntegerList) cIntList1 <- IntegerList(x = xVector, y = yVector) cIntList1 sIntList2 <- IntegerList(x = xVector, y = yVector, compress = FALSE) sIntList2 ## sparse integer list xExploded <- lapply(xVector[1:5000], function(x) seq_len(x)) cIntList2 <- IntegerList(xExploded) sIntList2 <- IntegerList(xExploded, compress = FALSE) object.size(cIntList2) object.size(sIntList2) @ The \Rfunction{length} function returns the number of elements in a \Rclass{Vector}-derived object and, for a \Rclass{List}-derived object like ``simple list" or ``compressed list", the \Rfunction{elementNROWS} function returns an integer vector containing the lengths of each of the elements: <>= length(cIntList2) Rle(elementNROWS(cIntList2)) @ Just as with ordinary \R{} \Rclass{list} objects, \Rclass{List}-derived object support the \Rfunction{[[} for element extraction, \Rfunction{c} for combining, and \Rfunction{lapply}/\Rfunction{sapply} for looping. When looping over sparse lists, the ``compressed list" classes can be much faster during computations since only the non-empty elements are looped over during the \Rfunction{lapply}/\Rfunction{sapply} computation and all the empty elements are assigned the appropriate value based on their status. <>= system.time(sapply(xExploded, mean)) system.time(sapply(sIntList2, mean)) system.time(sapply(cIntList2, mean)) identical(sapply(xExploded, mean), sapply(sIntList2, mean)) identical(sapply(xExploded, mean), sapply(cIntList2, mean)) @ Unlist ordinary \R{} \Rclass{list} objects, \Rclass{AtomicList} objects support the \Rfunction{Ops} (e.g. \Rfunction{+}, \Rfunction{==}, \Rfunction{\&}), \Rfunction{Math} (e.g. \Rfunction{log}, \Rfunction{sqrt}), \Rfunction{Math2} (e.g. \Rfunction{round}, \Rfunction{signif}), \Rfunction{Summary} (e.g. \Rfunction{min}, \Rfunction{max}, \Rfunction{sum}), and \Rfunction{Complex} (e.g. \Rfunction{Re}, \Rfunction{Im}) group generics. <>= xRleList <- RleList(xRle, 2L * rev(xRle)) yRleList <- RleList(yRle, 2L * rev(yRle)) xRleList > 0 xRleList + yRleList sum(xRleList > 0 | yRleList > 0) @ Since these atomic lists inherit from \Rclass{List}, they can also use the looping function \Rfunction{endoapply} to perform endomorphisms. <>= safe.max <- function(x) { if(length(x)) max(x) else integer(0) } endoapply(sIntList2, safe.max) endoapply(cIntList2, safe.max) endoapply(sIntList2, safe.max)[[1]] @ \section{Data Tables} To Do: \Rclass{DataTable}, \Rclass{DataFrame}, \Rclass{DataFrameList}, \Rclass{SplitDataFrameList} \section{Vector Annotations} Often when one has a collection of objects, there is a need to attach metadata that describes the collection in some way. Two kinds of metadata can be attached to a \Rclass{Vector} object: \begin{enumerate} \item Metadata about the object as a whole: this metadata is accessed via the \Rfunction{metadata} accessor and is represented as an ordinary \Rclass{list}; \item Metadata about the individual elements of the object: this metadata is accessed via the \Rfunction{mcols} accessor (\Rfunction{mcols} stands for {\it metadata columns}) and is represented as a \Rclass{DataTable} object (i.e. as an instance of a concrete subclass of \Rclass{DataTable}, e.g. a \Rclass{DataFrame} object). This \Rclass{DataTable} object can be thought of as the result of binding together one or several vector-like objects (the metadata columns) of the same length as the \Rclass{Vector} object. Each row of the \Rclass{DataTable} object annotates the corresponding element of the \Rclass{Vector} object. \end{enumerate} \section{Vector Ranges} When analyzing sequences, we are often interested in particular consecutive subsequences. For example, the \Sexpr{letters} vector could be considered a sequence of lower-case letters, in alphabetical order. We would call the first five letters (\textit{a} to \textit{e}) a consecutive subsequence, while the subsequence containing only the vowels would not be consecutive. It is not uncommon for an analysis task to focus only on the geometry of the regions, while ignoring the underlying sequence values. A list of indices would be a simple way to select a subsequence. However, a sparser representation for consecutive subsequences would be a range, a pairing of a start position and a width, as used when extracting sequences with \Rfunction{window}. When analyzing subsequences in \IRanges{}, each range is treated as an observation. The virtual \Rclass{Ranges} class represents lists of ranges, or, equivalently and as a derivative \Rclass{IntegerList}, sequences of consecutive integers. The most commonly used implementation of \Rclass{Ranges} is \Rclass{IRanges}, which stores the starts and widths as ordinary integer vectors. To construct an \Rclass{IRanges} instance, we call the \Rfunction{IRanges} constructor. Ranges are normally specified by passing two out of the three parameters: start, end and width (see \Rcode{help(IRanges)} for more information). % <>= ir1 <- IRanges(start = 1:10, width = 10:1) ir2 <- IRanges(start = 1:10, end = 11) ir3 <- IRanges(end = 11, width = 10:1) identical(ir1, ir2) & identical(ir2, ir3) ir <- IRanges(c(1, 8, 14, 15, 19, 34, 40), width = c(12, 6, 6, 15, 6, 2, 7)) @ % All of the above calls construct an \Rclass{IRanges} instance with the same ranges, using different combinations of the \Rfunarg{start}, \Rfunarg{end} and \Rfunarg{width} parameters. Accessing the starts, widths and ends is supported by every \Rclass{Ranges} implementation. <>= start(ir) @ <>= end(ir) @ <>= width(ir) @ For \Rclass{IRanges} and some other \Rclass{Ranges} derivatives, subsetting is also supported, by numeric and logical indices. <>= ir[1:4] @ <>= ir[start(ir) <= 15] @ One may think of each range as a sequence of integer ranges, and \Rclass{Ranges} is, in fact, derived from \Rclass{IntegerList}. <>= ir[[1]] @ In order to illustrate range operations, we'll create a function to plot ranges. <>= plotRanges <- function(x, xlim = x, main = deparse(substitute(x)), col = "black", sep = 0.5, ...) { height <- 1 if (is(xlim, "Ranges")) xlim <- c(min(start(xlim)), max(end(xlim))) bins <- disjointBins(IRanges(start(x), end(x) + 1)) plot.new() plot.window(xlim, c(0, max(bins)*(height + sep))) ybottom <- bins * (sep + height) - height rect(start(x)-0.5, ybottom, end(x)+0.5, ybottom + height, col = col, ...) title(main) axis(1) } @ <>= plotRanges(ir) @ \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-ir-plotRanges} \caption{\label{fig-ir-plotRanges}% Plot of original ranges.} \end{center} \end{figure} \subsection{Normality} Sometimes, it is necessary to formally represent a subsequence, where no elements are repeated and order is preserved. Also, it is occasionally useful to think of a \Rclass{Ranges} object as a set, where no elements are repeated and order does not matter. While every \Rclass{Ranges} object, as a \Rclass{Vector} derivative, has an implicit ordering, one can enforce the same ordering for all such objects, so that ordering becomes inconsequential within that context. The \Rclass{NormalIRanges} class formally represents either a subsequence encoding or a set of integers. By definition a Ranges object is said to be \textit{normal} when its ranges are: (a) not empty (i.e. they have a non-null width); (b) not overlapping; (c) ordered from left to right; (d) not even adjacent (i.e. there must be a non empty gap between 2 consecutive ranges). There are three main advantages of using a \textit{normal} \Rclass{Ranges} object: (1) it guarantees a subsequence encoding or set of integers, (2) it is compact in terms of the number of ranges, and (3) it uniquely identifies its information, which simplifies comparisons. The \Rfunction{reduce} function reduces any \Rclass{Ranges} object to a \Rclass{NormalIRanges} by merging redundant ranges. <>= reduce(ir) plotRanges(reduce(ir)) @ \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-ranges-reduce} \caption{\label{fig-ranges-reduce}% Plot of reduced ranges.} \end{center} \end{figure} \subsection{Lists of \Rclass{Ranges} objects} It is common to manipulate collections of \Rclass{Ranges} objects during an analysis. Thus, the \IRanges{} package defines some specific classes for working with multiple \Rclass{Ranges} objects. The \Rclass{RangesList} class asserts that each element is a \Rclass{Ranges} object and provides convenience methods, such as \Rfunction{start}, \Rfunction{end} and \Rfunction{width} accessors that return \Rclass{IntegerList} objects, aligning with the \Rclass{RangesList} object. To explicitly construct a \Rclass{RangesList}, use the \Rfunction{RangesList} function. <>= rl <- RangesList(ir, rev(ir)) @ % <>= start(rl) @ \subsection{Vector Extraction} As the elements of a \Rclass{Ranges} object encode consecutive subsequences, they may be used directly in sequence extraction. Note that when a \textit{normal} \Rclass{Ranges} is given as the index, the result is a subsequence, as no elements are repeated or reordered. If the sequence is a \Rclass{Vector} subclass (i.e. not an ordinary \Rclass{vector}), the canonical \Rfunction{[} function accepts a \Rclass{Ranges} instance. % <>= irextract <- IRanges(start = c(4501, 4901) , width = 100) xRle[irextract] @ % \subsection{Finding Overlapping Ranges} The function \Rfunction{findOverlaps} detects overlaps between two \Rclass{Ranges} objects. <>= ol <- findOverlaps(ir, reduce(ir)) as.matrix(ol) @ \subsection{Counting Overlapping Ranges} The function \Rfunction{coverage} counts the number of ranges over each position. <>= cov <- coverage(ir) plotRanges(ir) cov <- as.vector(cov) mat <- cbind(seq_along(cov)-0.5, cov) d <- diff(cov) != 0 mat <- rbind(cbind(mat[d,1]+1, mat[d,2]), mat) mat <- mat[order(mat[,1]),] lines(mat, col="red", lwd=4) axis(2) @ \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-ranges-coverage} \caption{\label{fig-ranges-coverage}% Plot of ranges with accumulated coverage.} \end{center} \end{figure} \subsection{Finding Neighboring Ranges} The \Rfunction{nearest} function finds the nearest neighbor ranges (overlapping is zero distance). The \Rfunction{precede} and \Rfunction{follow} functions find the non-overlapping nearest neighbors on a specific side. \subsection{Transforming Ranges} Utilities are available for transforming a \Rclass{Ranges} object in a variety of ways. Some transformations, like \Rfunction{reduce} introduced above, can be dramatic, while others are simple per-range adjustments of the starts, ends or widths. \subsubsection{Adjusting starts, ends and widths} Perhaps the simplest transformation is to adjust the start values by a scalar offset, as performed by the \Rfunction{shift} function. Below, we shift all ranges forward 10 positions. % <>= shift(ir, 10) @ There are several other ways to transform ranges. These include \Rfunction{narrow}, \Rfunction{resize}, \Rfunction{flank}, \Rfunction{reflect}, \Rfunction{restrict}, and \Rfunction{threebands}. For example \Rfunction{narrow} supports the adjustment of start, end and width values, which should be relative to each range. These adjustments are vectorized over the ranges. As its name suggests, the ranges can only be narrowed. % <>= narrow(ir, start=1:5, width=2) @ The \Rfunction{restrict} function ensures every range falls within a set of bounds. Ranges are contracted as necessary, and the ranges that fall completely outside of but not adjacent to the bounds are dropped, by default. % <>= restrict(ir, start=2, end=3) @ The \Rfunction{threebands} function extends \Rfunction{narrow} so that the remaining left and right regions adjacent to the narrowed region are also returned in separate \Rclass{Ranges} objects. % <>= threebands(ir, start=1:5, width=2) @ The arithmetic operators \Rfunction{+}, \Rfunction{-} and \Rfunction{*} change both the start and the end/width by symmetrically expanding or contracting each range. Adding or subtracting a numeric (integer) vector to a \Rclass{Ranges} causes each range to be expanded or contracted on each side by the corresponding value in the numeric vector. <>= ir + seq_len(length(ir)) @ % The \Rfunction{*} operator symmetrically magnifies a \Rclass{Ranges} object by a factor, where positive contracts (zooms in) and negative expands (zooms out). % <>= ir * -2 # double the width @ WARNING: The semantic of these arithmetic operators might be revisited at some point. Please restrict their use to the context of interactive visualization (where they arguably provide some convenience) but avoid to use them programmatically. \subsubsection{Making ranges disjoint} A more complex type of operation is making a set of ranges disjoint, \textit{i.e.} non-overlapping. For example, \Rfunction{threebands} returns a disjoint set of three ranges for each input range. The \Rfunction{disjoin} function makes a \Rclass{Ranges} object disjoint by fragmenting it into the widest ranges where the set of overlapping ranges is the same. <>= disjoin(ir) plotRanges(disjoin(ir)) @ \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-ranges-disjoin} \caption{\label{fig-ranges-disjoin}% Plot of disjoined ranges.} \end{center} \end{figure} A variant of \Rfunction{disjoin} is \Rfunction{disjointBins}, which divides the ranges into bins, such that the ranges in each bin are disjoint. The return value is an integer vector of the bins. <>= disjointBins(ir) @ \subsubsection{Other transformations} Other transformations include \Rfunction{reflect} and \Rfunction{flank}. The former ``flips'' each range within a set of common reference bounds. <>= reflect(ir, IRanges(start(ir), width=width(ir)*2)) @ % The \Rfunction{flank} returns ranges of a specified width that flank, to the left (default) or right, each input range. One use case of this is forming promoter regions for a set of genes. <>= flank(ir, width = seq_len(length(ir))) @ % \subsection{Set Operations} Sometimes, it is useful to consider a \Rclass{Ranges} object as a set of integers, although there is always an implicit ordering. This is formalized by \Rclass{NormalIRanges}, above, and we now present versions of the traditional mathematical set operations \textit{complement}, \textit{union}, \textit{intersect}, and \textit{difference} for \Rclass{Ranges} objects. There are two variants for each operation. The first treats each \Rclass{Ranges} object as a set and returns a \textit{normal} value, while the other has a ``parallel'' semantic like \Rfunction{pmin}/\Rfunction{pmax} and performs the operation for each range pairing separately. The \textit{complement} operation is implemented by the \Rfunction{gaps} and \Rfunction{pgap} functions. By default, \Rfunction{gaps} will return the ranges that fall between the ranges in the (normalized) input. It is possible to specify a set of bounds, so that flanking ranges are included. <>= gaps(ir, start=1, end=50) plotRanges(gaps(ir, start=1, end=50), c(1,50)) @ \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-ranges-gaps} \caption{\label{fig-ranges-gap}% Plot of gaps from ranges.} \end{center} \end{figure} \Rfunction{pgap} considers each parallel pairing between two \Rclass{Ranges} objects and finds the range, if any, between them. Note that the function name is singular, suggesting that only one range is returned per range in the input. <>= @ The remaining operations, \textit{union}, \textit{intersect} and \textit{difference} are implemented by the \Rfunction{[p]union}, \Rfunction{[p]intersect} and \Rfunction{[p]setdiff} functions, respectively. These are relatively self-explanatory. <>= @ <>= @ <>= @ <>= @ <>= @ <>= @ % \subsection{Mapping Ranges Between Vectors} \section{Vector Views} The \IRanges{} package provides the virtual \Rclass{Views} class, which stores a sequence together with an \Rclass{IRanges} object defining ranges on the sequence. Each range is said to represent a \textit{view} onto the sequence. Here, we will demonstrate the \Rclass{RleViews} class, where the sequence is of class \Rclass{Rle}. Other \Rclass{Views} implementations exist, such as \Rclass{XStringViews} in the \Rpackage{Biostrings} package. \subsection{Creating Views} There are two basic constructors for creating views: the \Rfunction{Views} function based on indicators and the \Rfunction{slice} based on numeric boundaries. <>= xViews <- Views(xRle, xRle >= 1) xViews <- slice(xRle, 1) xViewsList <- slice(xRleList, 1) @ \subsection{Aggregating Views} While \Rfunction{sapply} can be used to loop over each window, the native functions \Rfunction{viewMaxs}, \Rfunction{viewMins}, \Rfunction{viewSums}, and \Rfunction{viewMeans} provide fast looping to calculate their respective statistical summaries. <>= head(viewSums(xViews)) viewSums(xViewsList) head(viewMaxs(xViews)) viewMaxs(xViewsList) @ \section{IRanges in Biological Sequence Analysis} The \IRanges{} packages was primarily designed with biological sequence analysis in mind and Table \ref{table:bioseq} shows how some biological sequence analysis concepts are represented in the \IRanges{} class system. \begin{table}[ht] \begin{center} \begin{tabular}{l|l} \hline Biological Entity & \Rclass{Vector} Subclass \\ \hline Genome browser track(s) & \Rclass{GRanges}/\Rclass{GRangesList} \\ Coverage across chromosomes/contigs & \Rclass{RleList} \\ Mapped ranges to genome & \Rclass{CompressedIRangesList} \\ Data (sans ranges) across chroms/contigs & \Rclass{SplitDataFrameList} \\ \hline \end{tabular} \end{center} \caption{\Rclass{Vector} subclasses for Biological Sequence Analysis} \label{table:bioseq} \end{table} \pagebreak[4] \section{Session Information} \begin{table*}[tbp] \begin{minipage}{\textwidth} <>= toLatex(sessionInfo()) @ \end{minipage} \caption{\label{tab:sessioninfo}% The output of \Rfunction{sessionInfo} on the build system after running this vignette.} \end{table*} \end{document}