IRanges/DESCRIPTION0000644000175100017510000000546014641351314014637 0ustar00biocbuildbiocbuildPackage: IRanges Title: Foundation of integer range manipulation in Bioconductor Description: Provides efficient low-level and highly reusable S4 classes for storing, manipulating and aggregating over annotated ranges of integers. Implements an algebra of range operations, including efficient algorithms for finding overlaps and nearest neighbors. Defines efficient list-like classes for storing, transforming and aggregating large grouped data, i.e., collections of atomic vectors and DataFrames. biocViews: Infrastructure, DataRepresentation URL: https://bioconductor.org/packages/IRanges BugReports: https://github.com/Bioconductor/IRanges/issues Version: 2.38.1 License: Artistic-2.0 Encoding: UTF-8 Authors@R: c( person("Hervé", "Pagès", role=c("aut", "cre"), email="hpages.on.github@gmail.com"), person("Patrick", "Aboyoun", role="aut"), person("Michael", "Lawrence", role="aut")) Depends: R (>= 4.0.0), methods, utils, stats, BiocGenerics (>= 0.39.2), S4Vectors (>= 0.33.3) Imports: stats4 LinkingTo: S4Vectors Suggests: XVector, GenomicRanges, Rsamtools, GenomicAlignments, GenomicFeatures, BSgenome.Celegans.UCSC.ce2, pasillaBamSubset, RUnit, BiocStyle Collate: range-squeezers.R Vector-class-leftovers.R DataFrameList-class.R DataFrameList-utils.R AtomicList-class.R AtomicList-utils.R Ranges-and-RangesList-classes.R IPosRanges-class.R IPosRanges-comparison.R IntegerRangesList-class.R IRanges-class.R IRanges-constructor.R IRanges-utils.R Rle-class-leftovers.R IPos-class.R subsetting-utils.R Grouping-class.R Views-class.R RleViews-class.R RleViews-utils.R extractList.R seqapply.R multisplit.R SimpleGrouping-class.R IRangesList-class.R IPosList-class.R ViewsList-class.R RleViewsList-class.R RleViewsList-utils.R RangedSelection-class.R MaskCollection-class.R read.Mask.R CompressedList-class.R CompressedList-comparison.R CompressedHitsList-class.R CompressedDataFrameList-class.R CompressedAtomicList-class.R CompressedGrouping-class.R CompressedRangesList-class.R Hits-class-leftovers.R NCList-class.R findOverlaps-methods.R windows-methods.R intra-range-methods.R inter-range-methods.R reverse-methods.R coverage-methods.R cvg-methods.R slice-methods.R setops-methods.R nearest-methods.R cbind-Rle-methods.R tile-methods.R extractListFragments.R zzz.R git_url: https://git.bioconductor.org/packages/IRanges git_branch: RELEASE_3_19 git_last_commit: d4d8da9 git_last_commit_date: 2024-07-03 Repository: Bioconductor 3.19 Date/Publication: 2024-07-03 NeedsCompilation: yes Packaged: 2024-07-03 22:38:04 UTC; biocbuild Author: Hervé Pagès [aut, cre], Patrick Aboyoun [aut], Michael Lawrence [aut] Maintainer: Hervé Pagès IRanges/NAMESPACE0000644000175100017510000002312414626176651014361 0ustar00biocbuildbiocbuilduseDynLib(IRanges) import(methods) importFrom(utils, stack, read.table) importFrom(stats, cov, cor, median, quantile, smoothEnds, runmed, "window<-", aggregate, setNames) import(BiocGenerics) import(S4Vectors) importFrom(stats4, summary) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Export S4 classes ### exportClasses( ## DataFrameList-class.R: DataFrameList, DFrameList, SimpleDataFrameList, SimpleDFrameList, SplitDataFrameList, SplitDFrameList, SimpleSplitDataFrameList, SimpleSplitDFrameList, ## AtomicList-class.R: AtomicList, SimpleAtomicList, LogicalList, SimpleLogicalList, IntegerList, SimpleIntegerList, NumericList, SimpleNumericList, ComplexList, SimpleComplexList, CharacterList, SimpleCharacterList, RawList, SimpleRawList, RleList, SimpleRleList, FactorList, SimpleFactorList, ## Ranges-and-RangesList-classes.R: Ranges, IntegerRanges, Pos, RangesList, SimpleRangesList, IntegerRangesList, SimpleIntegerRangesList, PosList, SimplePosList, ## IPosRanges-class.R: IPosRanges, ## IntegerRangesList-class.R IntegerRangesList, SimpleIntegerRangesList, ## IRanges-class.R: IRanges, NormalIRanges, ## IPos-class.R: IPos, UnstitchedIPos, StitchedIPos, ## Grouping-class.R: Grouping, ManyToOneGrouping, ManyToManyGrouping, H2LGrouping, Dups, GroupingRanges, GroupingIRanges, Partitioning, PartitioningByEnd, PartitioningByWidth, PartitioningMap, ## Views-class.R: Views, ## RleViews-class.R: RleViews, ## SimpleGrouping-class.R: SimpleGrouping, SimpleManyToOneGrouping, BaseManyToManyGrouping, SimpleManyToManyGrouping, ## IRangesList-class.R: IRangesList, SimpleIRangesList, NormalIRangesList, SimpleNormalIRangesList, ## IPosList-class.R: IPosList, SimpleIPosList, ## ViewsList-class.R: ViewsList, SimpleViewsList, ## RleViewsList-class.R: RleViewsList, SimpleRleViewsList, ## RangedSelection-class.R: RangedSelection, ## MaskCollection-class.R: MaskCollection, ## CompressedList-class.R: CompressedList, ## CompressedDataFrameList-class.R: CompressedDataFrameList, CompressedDFrameList, CompressedSplitDataFrameList, CompressedSplitDFrameList, ## CompressedAtomicList-class.R: CompressedAtomicList, CompressedLogicalList, CompressedIntegerList, CompressedNumericList, CompressedComplexList, CompressedCharacterList, CompressedRawList, CompressedRleList, CompressedFactorList, ## CompressedGrouping-class.R: CompressedGrouping, CompressedManyToOneGrouping, CompressedManyToManyGrouping, ## CompressedRangesList-class.R: CompressedRangesList, CompressedPosList, CompressedIntegerRangesList, CompressedIRangesList, CompressedNormalIRangesList, CompressedIPosList, ## NCList-class.R: NCList, NCLists, ## nearest-methods.R: IntegerRanges_OR_missing ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Export S3 methods ### S3method(as.data.frame, IPosRanges) S3method(as.data.frame, IPos) S3method(diff, AtomicList) S3method(summary, IPosRanges) S3method(summary, IPos) S3method(`window<-`, Vector) S3method(`window<-`, vector) S3method(`window<-`, factor) ### We also export them thru the export() directive so that (a) they can be ### called directly, (b) tab-completion on the name of the generic shows them, ### and (c) methods() doesn't asterisk them. export( as.data.frame.IPosRanges, as.data.frame.IPos, diff.AtomicList, summary.IPosRanges, summary.IPos, "window<-.Vector", "window<-.vector", "window<-.factor" ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Export S4 methods for generics NOT defined in this package ### exportMethods( ## Methods for generics defined in the base package: length, "length<-", names, "names<-", dim, dimnames, "dimnames<-", "[", "[<-", "[[", "[[<-", as.integer, as.character, as.factor, as.matrix, drop, min, max, range, diff, transform, nchar, chartr, tolower, toupper, sub, gsub, startsWith, endsWith, "split<-", "!", merge, is.na, by, ## Methods for generics defined in the methods package: coerce, show, Ops, Math, Math2, Summary, Complex, ## Methods for generics defined in the utils package: stack, ## Methods for generics defined in the stats package: mean, var, cov, cor, sd, median, quantile, mad, IQR, smoothEnds, runmed, "window<-", ## Methods for generics defined in the stats4 package: summary, ## Methods for generics defined in the BiocGenerics package: nrow, ncol, NROW, NCOL, rownames, "rownames<-", colnames, "colnames<-", rbind, cbind, unlist, relist, unsplit, lapply, tapply, as.vector, as.list, as.data.frame, which, which.max, which.min, pmax, pmin, pmax.int, pmin.int, match, duplicated, unique, anyDuplicated, is.unsorted, order, append, paste, table, union, intersect, setdiff, start, "start<-", end, "end<-", width, "width<-", pos, dims, nrows, ncols, updateObject, ## Methods for generics defined in the S4Vectors package: ROWNAMES, "ROWNAMES<-", bindROWS, extractROWS, replaceROWS, getListElement, parallel_slot_names, classNameForDisplay, from, to, nLnode, nRnode, pcompare, pcompareRecursively, selfmatch, runLength, "runValue<-", runsum, runmean, runwtsum, runq, elementNROWS, isEmpty, revElements, as.env, active, "active<-" ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Export non-generic functions ### export( multisplit, IRanges, solveUserSEW, successiveIRanges, slidingIRanges, breakInChunks, whichAsIRanges, asNormalIRanges, rangeComparisonCodeToLetter, IPos, NCList, NCLists, H2LGrouping, Dups, PartitioningByEnd, PartitioningByWidth, PartitioningMap, RangedSelection, IRangesList, RleViewsList, "%over%", "%within%", "%outside%", "%pover%", "%pwithin%", "%poutside%", mergeByOverlaps, findOverlapPairs, MaskCollection.show_frame, Mask, read.gapMask, read.agpMask, read.liftMask, read.rmMask, read.trfMask, ##read.chain, successiveViews, slidingViews, LogicalList, IntegerList, NumericList, ComplexList, CharacterList, RawList, RleList, FactorList, DataFrameList, SplitDataFrameList, ManyToOneGrouping, ManyToManyGrouping, regroup, heads, tails, selectNearest, INCOMPATIBLE_ARANGES_MSG, extractListFragments, equisplit ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Export S4 generics defined in IRanges, and corresponding methods ### export( ## range-squeezers.R: ranges, rglist, ## Ranges-and-RangesList-classes.R: mid, isNormal, whichFirstNotNormal, ## Views-class.R: subject, "ranges<-", Views, trim, subviews, viewApply, viewMins, viewMaxs, viewSums, viewMeans, viewWhichMins, viewWhichMaxs, viewRangeMins, viewRangeMaxs, ## Grouping-class.R: nobj, grouplengths, members, vmembers, togroup, togrouplength, high2low, low2high, grouprank, togrouprank, mapOrder, ## Rle-class-leftovers.R: findRange, splitRanges, ## extractList.R: extractList, ## DataFrameList-class.R: commonColnames, "commonColnames<-", columnMetadata, "columnMetadata<-", ## IntegerRangesList-class.R: space, ## MaskCollection-class.R: nir_list, desc, "desc<-", maskedwidth, maskedratio, collapse, ## findOverlaps-methods.R: findOverlaps, countOverlaps, overlapsAny, subsetByOverlaps, overlapsRanges, poverlaps, ## windows-methods.R: windows, narrow, ## intra-range-methods.R: update_ranges, shift, resize, flank, promoters, terminators, reflect, restrict, threebands, ## inter-range-methods.R: reduce, gaps, disjoin, isDisjoint, disjointBins, ## reverse-methods.R: reverse, ## coverage-methods.R: coverage, ## cvg-methods.R: cvg, ## slice-methods.R: slice, ## setops-methods.R: punion, pintersect, psetdiff, pgap, ## nearest-methods.R: precede, follow, nearest, distance, distanceToNearest, ## tile-methods.R: tile, slidingWindows, ## AtomicList-utils.R: ifelse2 ) ### Same list as above. exportMethods( ranges, rglist, runsum, runmean, runwtsum, runq, mid, isNormal, whichFirstNotNormal, subject, "ranges<-", Views, trim, subviews, viewApply, viewMins, viewMaxs, viewSums, viewMeans, viewWhichMins, viewWhichMaxs, viewRangeMins, viewRangeMaxs, nobj, grouplengths, members, vmembers, togroup, togrouplength, high2low, low2high, grouprank, togrouprank, mapOrder, findRange, splitRanges, extractList, commonColnames, "commonColnames<-", columnMetadata, "columnMetadata<-", space, nir_list, desc, "desc<-", maskedwidth, maskedratio, collapse, findOverlaps, countOverlaps, overlapsAny, subsetByOverlaps, overlapsRanges, poverlaps, windows, narrow, update_ranges, shift, resize, flank, promoters, terminators, reflect, restrict, threebands, reduce, gaps, disjoin, isDisjoint, disjointBins, reverse, coverage, cvg, slice, punion, pintersect, psetdiff, pgap, precede, follow, nearest, distance, distanceToNearest, tile, slidingWindows, ifelse2 ) IRanges/NEWS0000644000175100017510000015733714626176651013657 0ustar00biocbuildbiocbuildCHANGES IN VERSION 2.38.0 ------------------------- NEW FEATURES o Add terminators(), same as promoters() but for terminator regions. CHANGES IN VERSION 2.36.0 ------------------------- SIGNIFICANT USER-VISIBLE CHANGES o Add link to revElements() in man page for reverse(). BUG FIXES o Fix is.unsorted() methods for Compressed[Integer|Numeric]List objects (they were never working since their introduction years ago). CHANGES IN VERSION 2.34.0 ------------------------- SIGNIFICANT USER-VISIBLE CHANGES o Improve error handling in AtomicList constructors when input is too big. CHANGES IN VERSION 2.32.0 ------------------------- NEW FEATURES o splitAsList() can now perform a "dumb split", that is, when no split factor is supplied, 'splitAsList(x)' is equivalent to 'unname(splitAsList(x, seq_along(x)))' but is slightly more efficient. SIGNIFICANT USER-VISIBLE CHANGES o Add ellipsis argument (...) to the gaps() generic function. CHANGES IN VERSION 2.30.0 ------------------------- SIGNIFICANT USER-VISIBLE CHANGES o Like the DataFrame class defined in the S4Vectors package, classes SimpleDataFrameList, CompressedDataFrameList, SimpleSplitDataFrameList, and CompressedSplitDataFrameList, are now virtual. This completes the replacement of DataFrame with DFrame announced in September 2019. See: https://www.bioconductor.org/help/course-materials/2019/BiocDevelForum/02-DataFrame.pdf CHANGES IN VERSION 2.28.0 ------------------------- SIGNIFICANT USER-VISIBLE CHANGES o Replace dim(), nrow(), and ncol() methods for DataFrameList objects with dims(), nrows(), and ncols() methods. DEPRECATED AND DEFUNCT o Deprecate dim(), nrow(), and ncol() methods for DataFrameList objects in favor of the new dims(), nrows(), and ncols() methods. CHANGES IN VERSION 2.26.0 ------------------------- NEW FEATURES o Add commonColnames() accessor to get or set the character vector of column names present in the individual DataFrames of a SplitDataFrameList object. o Implement unary + and - for AtomicList derivatives. SIGNIFICANT USER-VISIBLE CHANGES o Much improved error handling and messages in IRanges() constructor function DEPRECATED AND DEFUNCT o Remove RangesList() constructor (was deprecated in BioC 3.7 and defunct in BioC 3.8). BUG FIXES o Fix unplit() on named List objects. o Fix findOverlapPairs() for missing subject (fixes #35). o quantile() on an AtomicList object always returns a matrix (fixes #33). o Fix which.min()/which.max() for CompressedNumericList objects (fixes #30). o Export startsWith() and endsWith() methods for CharacterList/RleList objects (fixes #26). CHANGES IN VERSION 2.24.0 ------------------------- NEW FEATURES o coverage() now supports 'method="naive"'. This is in addition to the already supported methods "sort" and "hash". This new method is a slower version of the "hash" method that has the advantage of avoiding floating point artefacts in the no-coverage regions of the numeric-Rle object returned by coverage() when the weights are supplied as a numeric vector of type 'double'. See "FLOATING POINT ARITHMETIC CAN BRING A SURPRISE" example in '?coverage'. DEPRECATED AND DEFUNCT o Removed RangedData class and anything related to RangedData objects. BUG FIXES o Fix bug in list element recycling. CHANGES IN VERSION 2.22.0 ------------------------- SIGNIFICANT USER-VISIBLE CHANGES o Resync with change to smoothEnds() in R 4.0. In R 4.0, stats::smoothEnds() always returns an integer vector when the input is an integer vector. smoothEnds() on an IntegerList now reflects this: it returns an IntegerList object instead of a NumericList object. DEPRECATED AND DEFUNCT o RangedData objects are now defunct. RangedData objects are defunct in BioC 3.11. They were deprecated in BioC 3.9 and, before that, their use has been discouraged in favor of GRanges or GRangesList objects since BioC 2.12, that is, since 2014. BUG FIXES o Fix restrict() method for RangesList objects for when ranges are dropped. CHANGES IN VERSION 2.20.0 ------------------------- NEW FEATURES o IPos objects now exist in 2 flavors: UnstitchedIPos and StitchedIPos IPos is now a virtual class with 2 concrete subclasses: UnstitchedIPos and StitchedIPos. In an UnstitchedIPos instance the positions are stored as an integer vector. In a StitchedIPos instance, like with old IPos instances, the positions are stored as an IRanges object where each range represents a run of consecutive positions. See ?IPos for more information. Old serialized IPos instances need to be converted to StitchedIPos instances with updateObject(). o IPos objects now can hold names o The IRanges() and IPos() constructors now accept user-supplied metadata columns o Add grep(), startsWith() and endsWith() methods for CharacterList objects SIGNIFICANT USER-VISIBLE CHANGES o as.data.frame(IRanges) now propagates the metadata columns o Move splitAsList() to the S4Vectors package o Move S4 class "atomic" from the S4Vectors package o No longer export %in% (was a leftover from an older time when the package was defining an %in% method) DEPRECATED AND DEFUNCT o After being deprecated in BioC 3.9, the following RangedData methods are now defunct: findOverlaps, rownames<-, colnames<-, columnMetadata, columnMetadata<-, c, rbind, as.env, as.data.frame, and coercion from RangedData to DataFrame. o Remove the following RangedData methods: - score, score<-, lapply, within, countOverlaps; - coercions from list, data.frame, DataTable, Rle, RleList, RleViewsList, IntegerRanges, or IntegerRangesList to RangedData. These methods were deprecated in BioC 3.8 and defunct in BioC 3.9. BUG FIXES o Fix integer overflow issue in end() setter for IRanges objects. CHANGES IN VERSION 2.18.0 ------------------------- NEW FEATURES o Add some methods for CharacterList derivatives (nchar, substring, substr, chartr, toupper, tolower, sub, gsub, grepl). DEPRECATED AND DEFUNCT o Deprecate RangedData objects. The use of RangedData objects has been discouraged in favor of GRanges or GRangesList objects since BioC 2.12, that is, since 2014. Developers are required to migrate their code to use GRanges or GRangesList instead of RangedData objects (the GRanges and GRangesList classes are defined in the GenomicRanges package). o Several RangedData methods are now defunct (after being deprecated in BioC 3.8): - score, score<-, lapply, within, countOverlaps; - coercions from list, data.frame, DataTable, Rle, RleList, RleViewsList, IntegerRanges, or IntegerRangesList to RangedData. BUG FIXES o Fix unlist() on a SimpleRleList object of length 0 o Fix drop() for FactorList derivatives o Fix removed rownames upon replacing in a SplitDataFrameList CHANGES IN VERSION 2.16.0 ------------------------- SIGNIFICANT USER-VISIBLE CHANGES o Optimize unlist() on Views objects. o Optimize range(), any() and all() on CompressedRleList objects. o Optimize start(), end(), width() setters on CompressedRangesList objects. DEPRECATED AND DEFUNCT o Deprecate several RangedData methods: - score, score<-, lapply, within, countOverlaps; - coercions from list, data.frame, DataTable, Rle, RleList, RleViewsList, IntegerRanges, or IntegerRangesList to RangedData. RangedData objects will be deprecated in BioC 3.9 (their use has been discouraged since BioC 2.12, that is, since 2014). Package developers that are still using RangedData objects need to migrate their code to use GRanges or GRangesList objects instead. o The RangesList() constructor is now defunct (after being deprecated in BioC 3.7). BUG FIXES o Fix DF[IRanges(...), ] on a DataFrame with data.frame columns. o Make [[, as.list(), lapply(), and unlist() fail more graciously on a IRanges object. o NCList objects now properly support c(). CHANGES IN VERSION 2.14.0 ------------------------- NEW FEATURES o Add the windows() generic with various methods. This is a "parallel" version of window() for list-like objects i.e. it does 'mendoapply(window, x, start, end, width)' but uses a fast implementation. Also add heads() and tails() as convenience wrappers around windows(). They do 'mendoapply(head, x, n)' and 'mendoapply(tail, x, n)', respectively, but use a fast implementation. They're replacements for S4Vectors::phead() and S4Vectors::ptail() which are now deprecated. o Add equisplit() to split a vector-like object into a specified number of partitions with equal (total) width. This is useful for instance to ensure balanced loading of workers in parallel evaluation. o promoters() arguments 'upstream' and 'downstream' now can be integer vectors parallel to 'x' (for consistency with the other intra range transformations). o The promoters() generic and methods get the 'use.names' argument. o Add "resize", "flank", and "restrict" methods for Views objects. o Add "as.integer" method for Pos objects (equivalent to pos()). SIGNIFICANT USER-VISIBLE CHANGES o The Ranges virtual class is now the common parent of the IRanges, GRanges, and GAlignments classes (GRanges and GAlignments are defined in the GenomicRanges and GenomicAlignments packages, respectively). More precisely, Ranges is a virtual class that now serves as the parent class for any class that represents a vector of ranges. The ranges can be integer ranges (i.e. ranges on the space of integers) like in an IRanges object, or genomic ranges (i.e. ranges on a genome) like in a GRanges object. Note that because Ranges extends List, all Ranges derivatives are considered list-like objects. This means that GRanges objects and their derivatives are considered list-like objects, which is new (even though [[ don't work on them yet, this will be implemented in Bioconductor 3.8). o Similarly the RangesList virtual class is now the common parent of the IRangesList, GRangesList, and GAlignmentsList classes. o IRanges objects don't support [[, unlist(), as.list(), lapply(), and as.integer() anymore. This is a temporary situation only. These operations will be re-introduced in Bioconductor 3.8 but with a different semantic. The overall goal of all these changes is to bring more consitency between IRanges and GRanges objects (GRanges objects will also support [[, unlist(), as.list(), and lapply() in Bioconductor 3.8). Non-exported IRanges:::unlist_as_integer() helper is a temporary replacement for what unlist() and as.integer() used to do a IRanges object. o Move the pos() generic to BiocGenerics. o Switch order of breakInChunks() arguments 'chunksize' and 'nchunk' to be consistent with tileGenome(). o tile() and slidingWindows() now preserve names. o Optimize [[<- on a CompressedList object. Was very inefficient. The optimized method can be up to 100x faster or more on a long object. o All the S4Vectors-specific material in the IRangesOverview.Rnw vignette has moved to the new S4VectorsOverview.Rnw vignette located in the S4Vectors package. DEPRECATED AND DEFUNCT o Deprecate the RangesList() constructor. IRangesList() should be used instead. o The "ranges" methods for Hits and HitsList objects are now defunct (were deprecated in BioC 3.6). o The "overlapsAny", "subsetByOverlaps", "coverage" and "range" methods for RangedData objects are now defunct (were deprecated in BioC 3.6). o The universe() getter and setter as well as the 'universe' argument of the RangesList(), IRangesList(), RleViewsList(), and RangedData() constructor functions are now defunct (were deprecated in BioC 3.6). CHANGES IN VERSION 2.12.0 ------------------------- NEW FEATURES o Add IPos objects for storing a set of integer positions where most of the positions are typically (but not necessarily) adjacent. o Add coercion of a character vector or factor representing ranges (e.g. "22-155") to an IRanges object, as well as "as.character" and "as.factor" methods for Ranges objects. o Introduce overlapsRanges() as a replacement for "ranges" methods for Hits and HitsList objects, and deprecate the latter. o Add "is.unsorted" method for Ranges objects. o Add "ranges" method for Ranges objects (downgrade the object to an IRanges instance and drop its metadata columns). o Add 'use.names' and 'use.mcols' args to ranges() generic. SIGNIFICANT USER-VISIBLE CHANGES o Change 'maxgap' and 'minoverlap' defaults for findOverlaps() and family (i.e. countOverlaps(), overlapsAny(), and subsetByOverlaps()). This change addresses 2 long-standing issues: (1) by default zero-width ranges are not excluded anymore, and (2) control of zero-width ranges and adjacent ranges is finally decoupled (only partially though). New default for 'minoverlap' is 0 instead of 1. New default for 'maxgap' is -1 instead of 0. See ?findOverlaps for more information about 'maxgap' and the meaning of -1. For example, if 'type' is "any", you need to set 'maxgap' to 0 if you want adjacent ranges to be considered as overlapping. Note that poverlaps() still uses the old 'maxgap' and 'minoverlap' defaults. o subsetByOverlaps() first 2 arguments are now named 'x' and 'ranges' (instead of 'query' and 'subject') for consistency with the transcriptsByOverlaps(), exonsByOverlaps(), and cdsByOverlaps() functions from the GenomicFeatures package and with the snpsByOverlaps() function from the BSgenome package. o Replace ifelse() generic and methods with ifelse2() (eager semantics). o Coercion from Ranges to IRanges now propagates the metadata columns. o Move rglist() generic from GenomicRanges to IRanges package. o The "union", "intersect", and "setdiff" methods for Ranges objects don't act like endomorphisms anymore: now they always return an IRanges *instance* whatever Ranges derivatives are passed to them (e.g. NCList or NormalIRanges). DEPRECATED AND DEFUNCT o Deprecate "ranges" methods for Hits and HitsList objects (replaced with overlapsRanges()). o Deprecate the "overlapsAny", "subsetByOverlaps", "coverage" and "range" methods for RangedData objects. o Deprecate the universe() getter and setter as well as the 'universe' argument of the RangesList(), IRangesList(), RleViewsList(), and RangedData() constructor functions. o Default "togroup" method is now defunct (was deprecated in BioC 3.3). o Remove grouplength() (was deprecated in BioC 3.3 and replaced with grouplengths, then defunct in BioC 3.4). BUG FIXES o nearest() and distanceToNearest() now call findOverlaps() internally with maxgap=0 and minoverlap=0. This fixes incorrect results obtained in some situations e.g. in the situation reported here: https://support.bioconductor.org/p/99369/ (zero-width ranges) but also in this situation: nearest(IRanges(5, 10), IRanges(1, 4:5), select="all") where the 2 ranges in the subject are *both* nearest to the 5-10 range. o Fix restrict() and reverse() on IRanges objects with metadata columns. o Fix table() on Ranges objects. o Various other minor fixes. CHANGES IN VERSION 2.10.0 ------------------------- NEW FEATURES o "range" methods now have a 'with.revmap' argument (like "reduce" and "disjoin" methods). o Add coercion from list-like objects to IRangesList objects. o Add "table" method for SimpleAtomicList objects. o The "gaps" method for CompressedIRangesList objects now uses a chunk processing strategy if the input object has more than 10 million list elements. The hope is to reduce memory usage on very big input objects. DEPRECATED AND DEFUNCT o Remove the RangedDataList and RDApplyParams classes, rdapply(), and the "split" and "reduce" methods for RangedData objects. All these things were defunct in BioC 3.4. o Remove 'ignoreSelf' and 'ignoreRedundant' arguments (replaced by 'drop.self' and 'drop.redundant') from findOverlaps,Vector,missing method (were defunct in BioC 3.4). o Remove GappedRanges class (was defunct in BioC 3.4). BUG FIXES o Fix "setdiff" method for CompressedIRangesList for when all ranges are empty. o Fix long standing bug in coercion from Ranges to PartitioningByEnd when the object to coerce has names. CHANGES IN VERSION 2.8.0 ------------------------ NEW FEATURES o "disjoin" methods now support 'with.revmap' argument. o Add 'invert' argument to subsetByOverlaps(), like grep()'s invert. o Add "unstrsplit" method for RleList objects. o findOverlapPairs() allows 'subject' to be missing for self pairing. o Add "union", "intersect" and "setdiff" methods for Pairs. o Add distance,Pairs,missing method. o Add ManyToManyGrouping, with coercion targets from FactorList and DataFrame. o Add Hits->List and Hits->(ManyToMany)Grouping coercions. o Add "as.matrix" method for AtomicList objects. o Add "selfmatch", "duplicated", "order", "rank", and "median" methods for CompressedAtomicList objects. o Add "anyNA" method for CompressedAtomicList objects that ensures recursive=FALSE. o Add "mean" method for CompressedRleList objects. o Support 'global' argument on "which.min" and "which.max" methods for CompressedAtomicList objects. SIGNIFICANT USER-VISIBLE CHANGES o Make mstack,Vector method more consistent with stack,List method. o Optimize and document coercion from AtomicList to RleViews objects. DEPRECATED AND DEFUNCT o Are now defunct (were deprecated in BioC 3.3): - RangedDataList objects. - RDApplyParams objects and rdapply(). - The "split" and "reduce" methods for RangedData objects. - The 'ignoreSelf' and/or 'ignoreRedundant' arguments of the findOverlaps,Vector,missing method (a.k.a. "self findOverlaps" method). - grouplength() - GappedRanges objects. BUG FIXES o Fix special meaning of findOverlaps's maxgap argument when type="within". o isDisjoint(IRangesList()) now returns logical(0) instead of NULL. o Fixes to regroup() and Grouping construction. o Fix rank,CompressedAtomicList method. o Fix fromLast=TRUE for duplicated,CompressedAtomicList method. CHANGES IN VERSION 2.6.0 ------------------------ NEW FEATURES o Add regroup() function. SIGNIFICANT USER-VISIBLE CHANGES o Remove 'algorithm' argument from findOverlaps(), countOverlaps(), overlapsAny(), subsetByOverlaps(), nearest(), distanceToNearest(), findCompatibleOverlaps(), countCompatibleOverlaps(), findSpliceOverlaps(), summarizeOverlaps(), Union(), IntersectionStrict(), and IntersectionNotEmpty(). The argument was added in BioC 3.1 to facilitate the transition from an Interval Tree to a Nested Containment Lists implementation of findOverlaps() and family. The transition is over. o Restore 'maxgap' special meaning (from BioC < 3.1) when calling findOverlaps() (or other member of the family) with 'type' set to "within". o No more limit on the max depth of *on-the-fly* NCList objects. Note that the limit remains and is still 100000 when the user explicitely calls the NCList() or GNCList() constructor. o Rename 'ignoreSelf' and 'ignoreRedundant' argument of the findOverlaps,Vector,missing method -> 'drop.self' and 'drop.redundant'. The old names are still working but deprecated. o Rename grouplength() -> grouplengths() (old name still available but deprecated). o Modify "replaceROWS" method for IRanges objects so that the replaced elements in 'x' get their metadata columns from 'value'. See this thread on bioc-devel: https://stat.ethz.ch/pipermail/bioc-devel/2015-November/008319.html o Optimized which.min() and which.max() for atomic lists. o Remove the ellipsis (...) from all the setops methods, except the methods for Pairs objects. o Add "togroup" method for ManyToOneGrouping objects and deprecate default method. o Modernize "show" method for Ranges objects: now they're displayed more like GRanges objects. o Coercion from IRanges to NormalIRanges now propagates the metadata columns when the object to coerce is already normal. o Don't export CompressedHitsList anymore from the IRanges package. This doesn't seem to be used at all and it's not clear that we need it. DEPRECATED AND DEFUNCT o Deprecate RDApplyParams objects and rdapply(). o Deprecate RangedDataList objects. o Deprecate the "reduce" method for RangedData objects. o Deprecate GappedRanges objects. o Deprecate the 'ignoreSelf' and 'ignoreRedundant' arguments of the findOverlaps,Vector,missing method in favor of the new 'drop.self' and 'drop.redundant' arguments. o Deprecate grouplength() in favor of grouplengths(). o Default "togroup" method is deprecated. o Remove IntervalTree and IntervalForest classes and methods (were defunct in BioC 3.2). o Remove mapCoords() and pmapCoords() generics (were defunct in BioC 3.2). o Remove all "updateObject" methods (they were all obsolete). BUG FIXES o Fix segfault when calling window() on an Rle object of length 0. o Fix "which.min" and "which.max" methods for IntegerList, NumericList, and RleList objects when 'x' is empty or contains empty list elements. o Fix mishandling of zero-width ranges when calling findOverlaps() (or other member of the family) with 'type' set to "within". o Various fixes to "countOverlaps" method for Vector#missing. See svn commit message for commit 116112 for the details. o Fix validity method for NormalIRanges objects (was not checking anything). CHANGES IN VERSION 2.4.0 ------------------------ NEW FEATURES o Add "cbind" methods for binding Rle or RleList objects together. o Add coercion from Ranges to RangesList. o Add "paste" method for CompressedAtomicList objects. o Add "expand" method for Vector objects for expanding a Vector object 'x' based on a column in mcols(x). o Add overlapsAny,integer,Ranges method. o coverage" methods now accept 'shift' and 'weight' supplied as an Rle. SIGNIFICANT USER-VISIBLE CHANGES o The following was moved to S4Vectors: - The FilterRules stuff. - The "aggregate" methods. - The "split" methods. o The "sum", "min", "max", "mean", "any", and "all" methods on CompressedAtomicList objects are 100X faster on lists with 500k elements, 80X faster for 50k elements. o Tweak "c" method for CompressedList objects to make sure it always returns an object of the same class as its 1st argument. o NCList() constructor now propagates the metadata columns. DEPRECATED AND DEFUNCT o RangedData/RangedDataList are not formally deprecated yet but the documentation now officially declares them as superseded by GRanges/GRangesList and discourages their use. o After being deprecated in BioC 3.1, IntervalTree and IntervalForest objects and the "intervaltree" algorithm in findOverlaps() are now defunct. o After being deprecated in BioC 3.1, mapCoords() and pmapCoords() are now defunct. o Remove seqapply(), mseqapply(), tseqapply(), seqsplit(), and seqby() (were defunct in BioC 3.1). BUG FIXES o Fix FactorList() constructor when 'compress=TRUE' (note that the levels are combined during compression). o Fix c() on CompressedFactorList objects (was returning a CompressedIntegerList object). CHANGES IN VERSION 2.2.0 ------------------------ NEW FEATURES o Add NCList() and NCLists() for preprocessing a Ranges or RangesList object into an NCList or NCLists object that can be used for fast overlap search with findOverlaps(). NCList() and NCLists() are replacements for IntervalTree() and IntervalForest() that use Nested Containment Lists instead of interval trees. For a one time use, it's not advised to explicitely preprocess the input. This is because findOverlaps() or countOverlaps() will take care of it and do a better job at it (that is, they preprocess only what's needed when it's needed and release memory as they go). o Add coercion methods from Hits to CompressedIntegerList, to PartitioningByEnd, and to Partitioning. SIGNIFICANT USER-VISIBLE CHANGES o The code behind overlap-based operations like findOverlaps(), countOverlaps(), subsetByOverlaps(), summarizeOverlaps(), nearest(), etc... was refactored and improved. Some highlights on what has changed: - The underlying code used for finding/counting overlaps is now based on the Nested Containment List algorithm by Alexander V. Alekseyenko and Christopher J. Lee. - The old algorithm based on interval trees is still available (but deprecated). The 'algorithm' argument was added to most overlap-based operations to let the user choose between the new (algorithm="nclist", the default) and the old (algorithm="intervaltree") algorithm. - With the new algorithm, the hits returned by findOverlaps() are not fully ordered (i.e. ordered by queryHits and subject Hits) anymore, but only partially ordered (i.e. ordered by queryHits only). Other than that, and except for the 3 particular situations mentioned below, choosing one or the other doesn't affect the output, only performance. - Either the query or subject can be preprocessed with NCList() for a Ranges object (replacement for IntervalTree()), NCLists() for a RangesList object (replacement for IntervalForest()), and GNCList() for a GenomicRanges object (replacement for GIntervalTree()). However, for a one time use, it's not advised to explicitely preprocess the input. This is because findOverlaps() or countOverlaps() will take care of it and do a better job at it (that is, they preprocess only what's needed when it's needed and release memory as they go). - With the new algorithm, countOverlaps() on Ranges or GenomicRanges objects doesn't call findOverlaps() to collect all the hits in a growing Hits object and count them only at the end. Instead the counting happens at the C level and the hits are not kept. This reduces memory usage considerably when there is a lot of hits. - When 'minoverlap=0', zero-width ranges are interpreted as insertion points and are considered to overlap with ranges that contain them. This is the 1st situation where using 'algorithm="nclist"' or 'algorithm="intervaltree"' produces different output. - When using 'select="arbitrary"', the new algorithm will generally not select the same hits as the old algorithm. This is the 2nd situation where using 'algorithm="nclist"' or 'algorithm="intervaltree"' produces different output. - When using the old interval tree algorithm, 'maxgap' has a special meaning if 'type' is "start", "end", or "within". This is not yet the case with the new algorithm. That feature seems somewhat useful though so maybe the new algorithm should also support it? Anyway, this is the 3rd situation where using 'algorithm="nclist"' or 'algorithm="intervaltree"' produces different output. - Objects preprocessed with NCList(), NCLists(), and GNCList() are serializable. o The RleViewsList() constructor function now reorders its 'rleList' argument so that its names match the names on the 'rangesList' argument. o Minor changes to breakInChunks(): - Add 'nchunk' arg. - Now returns a PartitioningByEnd instead of a PartitioningByWidth object. - Now accepts 'chunksize' of 0 if 'totalsize' is 0. o 300x speedup or more when doing unique() on a CompressedRleList object. o 20x speedup or more when doing unlist() on a SimpleRleList object. o Moved the RleTricks.Rnw vignette to the S4Vectors package. DEPRECATED AND DEFUNCT o Deprecated mapCoords() and pmapCoords(). They're replaced by mapToTranscripts() and pmapToTranscripts() from the GenomicFeatures package and mapToAlignments() and pmapToAlignments() from the GenomicAlignments package. o Deprecated IntervalTree and IntervalForest objects. o seqapply(), seqby(), seqsplit(), etc are now defunct (were deprecated in IRanges 2.0.0). o Removed map(), pmap(), and splitAsListReturnedClass() (were defunct in IRanges 2.0.0). o Removed 'with.mapping' argunment from reduce() methods (was defunct in IRanges 2.0.0). BUG FIXES o findOverlaps,Vector,missing method now accepts extra arguments via ... so for example one can specify 'ignore.strand=TRUE' when calling it on a GRanges object (before that, 'findOverlaps(gr, ignore.strand=TRUE)' would fail). o PartitioningByEnd() and PartitioningByWidth() constructors now check that, when 'x' is an integer vector, it cannot contain NAs or negative values. CHANGES IN VERSION 2.0.0 ------------------------ NEW FEATURES o Add mapCoords() and pmapCoords() as replacements for map() and pmap(). o Add coercion from list to RangesList. o Add slice,ANY method as a convenience for slice(as(x, "Rle"), ...). o Add mergeByOverlaps(); acts like base::merge as far as it makes sense. o Add overlapsAny,Vector,missing method. SIGNIFICANT USER-VISIBLE CHANGES o Move Annotated, DataTable, Vector, Hits, Rle, List, SimpleList, and DataFrame classes to new S4Vectors package. o Move isConstant(), classNameForDisplay(), and low-level argument checking helpers isSingleNumber(), isSingleString(), etc... to new S4Vectors package. o Rename Grouping class -> ManyToOneGrouping. Redefine Grouping class as the parent of all groupings (it formalizes the most general kind of grouping). o Change splitAsList() to a generic. o In rbind,DataFrame method, no longer coerce the combined column to the class of the column in the first argument. o Do not carry over row.names attribute from data.frame to DataFrame. o No longer make names valid in [[<-,DataFrame method. o Make the set operations dispatch on Ranges instead of IRanges; they usually return an IRanges, but the input could be any implementation. o Add '...' to splitAsList() generic. o Speed up trim() on a Views object when trimming is actually not needed (no-op). o Speed up validation of IRanges objects by 2x. o Speed up "flank" method for Ranges objects by 4x. DEPRECATED AND DEFUNCT o Defunct map() and pmap(). o reduce() argument 'with.mapping' is now defunct. o splitAsListReturnedClass() is now defunct. o Deprecate seqapply(), mseqapply(), tseqapply(), seqsplit(), and seqby(). BUG FIXES o Fix rbind,DataFrame method when first column is a matrix. o Fix a memory leak in the interval tree code. o Fix handling of minoverlap > 1 in findOverlaps(), so that it behaves more consistently and respects 'maxgap', as documented. o Fix findOverlaps,IRanges method for select="last". o Fix subset,Vector-method to handle objects with NULL mcols(x) (e.g. Rle object). o Fix internal helper rbind.mcols() for DataFrame (and potentially other tables). o ranges,SimpleRleList method now returns a SimpleRangesList (instead of CompressedRangesList). o Make flank() work on Ranges object of length 0. CHANGES IN VERSION 1.20.0 ------------------------- NEW FEATURES o Add IntervalForest class from Hector Corrada Bravo. o Add a FilterMatrix class, for holding the results of multiple filters. o Add selfmatch() as a faster equivalent of 'match(x, x)'. o Add "c" method for Views objects (only combine objects with same subject). o Add coercion from SimpleRangesList to SimpleIRangesList. o Add an `%outside%` that is the opposite of `%over%`. o Add validation of length() and names() of Vector objects. o Add "duplicated" and "table" methods for Vector objects. o Add some split methods that dispatch to splitAsList() even when only 'f' is a Vector. o Add set methods (setdiff, intersect, union) for Rle. o Add anyNA methods for Rle and Vector. o Add support for subset(), with(), etc on Vector objects, where the expressions are evaluated in the scope of the mcols and fixed columns. For symbols that should resolve in the calling frame, it is supported and encouraged to escape them with bquote-style ".(x)". o Add "tile" generic and methods for partitioning a ranges object into tiles; useful for iterating over subregions. SIGNIFICANT USER-VISIBLE CHANGES o All functionalities related to XVector objects have been moved to the new XVector package. o Refine how isDisjoint() handles empty ranges. o Remove 'keepLength' argument from "window<-" methods. o unlist( , use.names=FALSE) on a CompressedSplitDataFrameList object now preserves the rownames of the list elements, which is more consistent with what unlist() does on other CompressedList objects. o Splitting a list by a Vector just yields a list, not a List. o The rbind,DataFrame method now handles the case where Rle and vector columns need to be combined (assuming an equivalence between Rle and vector). Also the way the result DataFrame is constructed was changed (avoids undesirable coercions and should be faster). o as.data.frame.DataFrame now passes 'stringsAsFactors=FALSE' and 'check.names=!optional' to the underlying data.frame() call. as(x,"DataFrame") sets 'optional=TRUE' when delegating. Most places where we called as.data.frame(), we now call 'as(x,"data.frame")'. o The [<-,DataFrame method now coerces column sub-replacement value to class of column when the column already exists. o DataFrame() now automatically derives rownames (from the first argument that has some). This is a fairly significant change in behavior, but it probably does better match user behavior. o Make sure that SimpleList objects are coerced to a DataFrame with a single column. The automatic coecion methods created by the methods package were trying to create a DataFrame with one column per element, because DataFrame extends SimpleList. o Change default to 'compress=TRUE' for RleList() constructor. o tapply() now handles the case where only INDEX is a Vector (e.g. an Rle object). o Speedup coverage() in the "tiling case" (i.e. when 'x' is a tiling of the [1, width] interval). This makes it much faster to turn into an Rle a coverage loaded from a BigWig, WIG or BED as a GRanges object. o Allow logical Rle return values from filter rules. o FilterRules no longer requires its elements to be named. o The select,Vector method now returns a DataFrame even when a single column is selected. o Move is.unsorted() generic to BiocGenerics. DEPRECATED AND DEFUNCT o Deprecate seqselect() and subsetByRanges(). o Deprecate 'match.if.overlap' arg of "match" method for Ranges objects. o "match" and "%in%" methods that operate on Views, ViewsList, RangesList, or RangedData objects (20 methods in total) are now defunct. o Remove previously defunct tofactor(). BUG FIXES o The subsetting code for Vector derivatives was substancially refactored. As a consequence, it's now cleaner, simpler, and [ and [[ behave more consistently across Vector derivatives. Some obscure long-standing bugs have been eliminated and the code can be slightly faster in some circumstances. o Fix bug in findOverlaps(); zero-width ranges in the query no longer produce hits ever (regardless of 'maxgap' and 'minoverlap' values). o Correctly free memory allocated for linked list of results compiled for findOverlaps(select="all"). o Various fixes for AsIs and DataFrames. o Allow zero-row replacement values in [<-,DataFrame. o Fix long standing segfault in "[" method for Rle objects (when doing Rle()[0]). o "show" methods now display its most specific class when a column or slot is an S3 object for which class() returns more than one class. o "show" methods now display properly cells that are arrays. o Fix the [<-,DataFrame method for when a value DataFrame has matrix columns. o Fix ifelse() for when one or more of the arguments are Rle objects. o Fix coercion from SimpleList to CompressedList via AtomicList constructors. o Make "show" methods robust to "showHeadLines" and "showTailLines" global options set to NA, Inf or non-integer values. o Fix error condition in eval,FilterRules method. o Corrected an error formatting in eval,FilterRules,ANY method. CHANGES IN VERSION 1.18.0 ------------------------- NEW FEATURES o Add global options 'showHeadLines' and 'showTailLines' to control the number of head/tails lines displayed by "show" methods for Ranges, DataTable, and Hits objects. o "subset" method for Vector objects now considers metadata columns. o Add classNameForDisplay() generic and use it in all "show" methods defined in IRanges and GenomicRanges. o as(x, "DataFrame") now works on *any* R object. o Add findMatches(), an enhanced version of match() that returns all the matches between 'x' and 'table'. The hits are returned in a Hits object. Also add countMatches() for counting the number of matches in 'table' for each element in 'x'. o Add overlapsAny() as a replacement for %in% (now deprecated on range-based objects), and %over% and %within% as convenience wrappers for overlapsAny(). %over% is the replacement for %in%. o Add 'with.mapping' arg to "reduce" methods for IRanges, Ranges, Views, RangesList, and CompressedIRangesList objects. o Add "order" method for Rle objects. o Add subsetByRanges() generic with methods for ANY, NULL, vector, and IRanges for now. This is work-in-progress and more methods will be added soon. The long term plan is to make this a replacement for seqselect(), but with a faster and cleaner implementation. o Add promoters() generic with methods for Ranges, RangesList, Views, and CompressedIRangesList objects. o elementLengths() now works on XVectorList objects (and thus works on DNAStringSet objects and family defined in the Biostrings package). Note that this is the first step towards having relist() work on XVector objects (e.g. DNAString objects) eventhough this is not ready yet. o Add "mstack" method for DataFrame objects. o Add 'name.var' argument to "stack" method for List objects for naming the optional column formed when the elements themselves have named elements. SIGNIFICANT USER-VISIBLE CHANGES o "distanceToNearest" methods now return a Hits instead of a DataFrame object. o The behavior of distance() has changed. Adjacent and overlapping ranges now return a distance of 0L. See ?distance man page for details. A temporary warning will be emitted by distance() until the release of Bioconductor 2.13. o Change arg list of expand() generic: function(x, ...) instead of function(x, colnames, keepEmptyRows). o Dramatic duplicated() and unique() speedups on CompressedAtomicList objects. o Significant endoapply() speedup on XVectorList objects (this benefits DNAStringSet objects and family defined in the Biostrings package). o 2x speedup to "c" method for CompressedList objects. o classNameForDisplay() strips 'Simple' or 'Compressed', which affects all the "show" methods based on it. So now: > IntegerList(1:4, 2:-3) IntegerList of length 2 [[1]] 1 2 3 4 [[2]] 2 1 0 -1 -2 -3 instead of: > IntegerList(1:4, 2:-3) CompressedIntegerList of length 2 [[1]] 1 2 3 4 [[2]] 2 1 0 -1 -2 -3 o Optimization of "[<-" method for Rle objects when no indices are selected (just return self). o "stack" method for List objects now creates a factor for the optional name variable. o Evaluating FilterRules now subsets by each filter individually, rather than subsetting by all at the end. o Optimized which() on CompressedLogicalList objects. o All the binary comparison operations (==, <=, etc...) on Ranges objects are now using compare() behind the scene. This makes them slightly faster and also slightly more memory efficient. DEPRECATED AND DEFUNCT o %in% is now deprecated on range-based objects. Please use %over% instead. More precisely: - "match" and "%in%" methods that operate on Views, ViewsList, RangesList, or RangedData objects (20 methods in total) are now deprecated. - Behavior of match() and %in% on Ranges objects was changed (and will issue a warning) to use equality instead of overlap for comparing elements between Ranges objects 'x' and 'table'. The old behavior is still available for match() via new 'match.if.overlap' arg that is FALSE by default (the arg will be deprecated in BioC 2.13 and removed in BioC 2.14). o tofactor() is now defunct. o '.ignoreElementMetadata' argument of "c" method for IRanges objects is now defunct. BUG FIXES o Small fix to "unlist" method for CompressedList objects when 'use.names' is TRUE and 'x' is a zero-length named List (the zero-length vector returned in that case was not named, now it is). o "resize" method for Ranges objects now allows zero-length 'fix' when 'x' is zero-length. o Subsetting a Views object now subsets its metadata columns. o Names on the vector-like columns of a DataFrame object are now preserved when calling DataFrame(), or when coercing to DataFrame, or when combining DataFrame objects with rbind(). o relist() now propagates the names on 'skeleton' when returning a SimpleList. o Better argument checking in breakInChunks(). o Fix broken "showAsCell" method for ANY. Now tries to coerce uni-dimensional objects to vector instead of data.frame (which never worked anyway, due to a bug). o Fix long standing bug in "flank" method for Ranges objects: it no longer returns an invalid object when NAs are passed thru the 'width' arg. Now it's an error to try to do so. o Fix issue with some of the "as.env" methods not being able to find the environment of the caller. o Fix bug in "showAsCell" method for AtomicList objects: now returns character(0) instead of NULL on an object of length 0. o sort() now drops NA's when 'na.last=NA' on an Rle object (consistent with base::sort). o table() now handles NA's appropriately on an Rle object. o table() now returns all the levels on a factor-Rle object. o Fix sub-replacement of Rles when using Ranges as the index. o Fix bug in [<- method for DataFrame objects. The fix corrects the way a new column created by a subset assignment is filled. Previously, if the first row was set, say, to '1', all values in the column were set to '1' when they needed to be set to NA (for consistency with data.frame). o Fix bug in compare() (was not returning 0 when comparing a 0-width range to itself). o Fix naming of column when passing an AsIs matrix to DataFrame() -- no more .X suffix. o Fix "rbind" method for DataFrame objects when some columns are matrix objects. CHANGES IN VERSION 1.16.0 ------------------------- NEW FEATURES o as( , "SimpleList"), as( , "CompressedList"), and as( , "List") now work on atomic vectors, and each element of the vector corresponds to an element of the returned List (this is consistent with as.list). o Add as.list,Rle method. o Add as.matrix,Views method. Each view corresponds to a row in the returned matrix. Rows corresponding to views shorter than the longest view are right-padded with NAs. o Add FilterClosure closure class for functions placed into a FilterRules. Has methods for getting parameters and showing. o Support 'na.rm' argument in "runsum", "runwtsum", "runq", and "runmean" methods for Rle and RleList objects. o Add splitAsList() and splitAsListReturnedClass(). o Improve summary,FilterRules to support serial evaluation, discarded counts (instead of passed) and percentages. o Make rename work on ordinary vector (in addition to Vector). o Add coercion from RangedData to CompressedIRangesList, IRangesList, or RangesList. It propagates the data columns (aka values) of the RangedData object to the inner metadata columns of the RangesList object. o Add 'NG' arg to PartitioningByEnd() and PartitioningByWidth() constructors. o Make PartitioningByEnd() work on list-like objects (like PartitioningByWidth()). o Fast disjoin() for moderate-sized CompressedIRangesList. o Add countQueryHits() and countSubjectHits(). o coverage() now supports method="auto" and this is the new default. o Add the flippedQuery(), levels(), ngap(), Lngap(), Rngap(), Lencoding(), and Rencoding() getters for OverlapEncodings objects. o Add "encodeOverlaps" method for GRangesList objects. o Enhance "[" methods for IRanges, XVector, XVectorList, and MaskCollection objects, as well as "[<-" method for IRanges objects, by supporting the following subscript types: NULL, Rle, numeric, logical, character, and factor. (All the methods listed above already supported some of those types but no method supported them all). o Add remapHits() for remapping the query and subject hits of a Hits object. o Add match,Hits method. o Add %in%,Vector method. o Add "compare", "==", "!=", "<=", ">=", "<", ">", "is.unsorted", "order", "rank", "match", and "duplicated" methods for XRawList objects. unique() and sort() also work on these objects via the "unique" and "sort" methods for Vector objects. o Add expand() for expanding a DataFrame based on the contents of one or more designated columms. o After being deprecated (in BioC 2.9) and defunct (in BioC 2.10), the "as.vector" method for AtomicList objects is back, but now it mimics what as.vector() does on an ordinary list i.e. it's equivalent to 'as.vector(as.list(x), mode=mode)'. Also coercions from AtomicList to logical/integer/numeric/double/complex/character/raw are back and based on the "as.vector" method for AtomicList objects i.e. they work only on objects with top-level elements of length <= 1. o DataFrame constructor now supports 'check.names' argument. o Add revElements() generic with methods for List and CompressedList objects. SIGNIFICANT USER-VISIBLE CHANGES o Splitting / relisting a Hits object now returns a HitsList instead of an ordinary list. o Operations in the Ops group between a List and an atomic vector operand now coerce the atomic vector to List (SimpleList or CompressedList) before performing the operation. Also, operands are recycled and a better job is done returning zero length results of the correct type. o Change the warning for 'Integer overflow ...' thrown by sum() on integer-Rle's o DataFrame now coerces List/list value to DataFrame in [<-. o Fix as.matrix,DataFrame for zero column DataFrames. Returns an nrow()x0 logical matrix. o union,Hits method now sorts the returned hits first by query hit, then by subject hit. o Add mcols() accessor as the preferred way (over elementMetadata() and values()) to access the metadata columns of a Vector object. o By default, mcols(x) and elementMetadata(x) do NOT propagate the names of x as the row names of the returned DataTable anymore. However the user can still get the old behavior by doing mcols(x, use.names=TRUE). o [<-,XVectorList now preserves the original names instead of propagating the names of the replacement value, which is consistent with how [<- operates on an ordinary vector/list. o coverage() now returns a numeric-Rle when passed numeric weights. o When called on a List object with use.names=TRUE, unlist() no longer tries to mimic the kind of non-sense name mangling that base::unlist() does (e.g. on list(a=1:3)) in a pointless effort to return a vector with unique names. o Remove 'hits' argument from signature of encodeOverlaps() generic function. o unique,Vector now drops the names for consistency with base::unique(). o Remove make.names() coercion in colnames<-,DataFrame for consistency with data.frame. DEPRECATED AND DEFUNCT o Deprecated tofactor(). o Remove RangesMatching, RangesMatchingList, and Binning classes. o Change from deprecated to defunct: matchMatrix(), "dim" method for Hits objects, and RangesMatchingList(). BUG FIXES o Fix bug in pintersect,IRanges,IRanges when input had empty ranges (broken since 2010-03-04). o Avoid integer overflow in mean,Rle method by coercing integer-Rle to numeric-Rle internally. o Change evaluation frame of with,List to parent.frame(), and get the enclosure correct in eval,List. o Many fixes and improvements to coercion from RangesList to RangedData (see commit 68195 for the details). o Fix "runValue" and "ranges" methods for CompressedRleList objects (broken for a very long time). o shift,Ranges method now fails in case of integer overflow instead of returning an invalid Ranges object. o mstack() now works on Vector objects with NULL metadata columns. o In case of integer overflow, coverage() now puts NAs in the returned Rle and issues a warning. o Fix bug in xvcopy,XRawList objects that prevented sequences from being removed from the cache of a BSgenome object. See commit 67171 for the details. o Fix issues related to duplicate column names in DataFrame (see commit 67163 for the details). o Fix a bunch of subsetting methods that were not subsetting the metadata columns: "[", "subseq", and "seqselect" methods for XVector objects, "seqselect" and "window" methods for XVectorList objects, and "[" method for MaskCollection objects. o Fix empty replacement with [<-,Vector o Make %in% robust on an empty 'table' argument when operating on Hits objects. CHANGES IN VERSION 1.14.0 ------------------------- NEW FEATURES o The map generic and RangesMapping class for mapping ranges between sequences according to some alignment. Some useful methods are implemented in GenomicRanges. o The Hits class has experimental support for basic set operations, including setdiff, union and intersect. o Added a number of data manipulation functions and methods, including mstack, multisplit, rename, unsplit for Vector. o Added compare() generic for generalized range-wise comparison of 2 range-based objects. o Added OverlapEncodings class and encodeOverlaps() generic for dealing with "overlap encodings". o subsetByOverlaps() should now work again on an RleViews object. o DataFrame now supports storing an array (like a matrix) in a column. o Added as.matrix,DataFrame method. o Added merge,DataTable,DataTable method. o Added disjointBins,RangesList method. o Added ranges,Rle and ranges,RleList methods. o Added which.max,Rle method. o Added drop,AtomicList method. o Added tofactor() wrapper around togroup(). o Added coercions from vector to any AtomicList subtype (compressed and uncompressed). o Added AtomicList to Character/Numeric/Logical/Integer/Raw/ComplexList coercions. o Added revElements() for reversing individual elements of a List object. SIGNIFICANT USER-VISIBLE CHANGES o RangesMatching has been renamed to Hits and extends Vector, so that it supports metadata columns and other features. o RangesMatchingList has been renamed to HitsList. o The 2 columns of the matrix returned by the "as.matrix" method for Hits objects are now named queryHits/subjectHits instead of query/subject, for consistency with the queryHits() and subjectHits() getters. o queryLength()/subjectLength() are recommended alternatives to dim,Hits. o breakInChunks() returns a PartitioningByWidth object. o The 'weight' arg in "coverage" methods for IRanges, Views and MaskCollection objects now can also be a single string naming a column in elementMetadata(x). o "countOverlaps" methods now propagate the names of the query. DEPRECATED AND DEFUNCT o matchMatrix,Hits is deprecated. o Moved the following deprecated features to defunct status: - use of as.data.frame() or as( , "data.frame") on an AtomicList object; - all coercion methods from AtomicList to atomic vectors; - subsetting an IRanges by Ranges; - subsetting a RangesList or RangedData by RangesList. BUG FIXES o within,RangedData/List now support replacing columns o aggregate() override no longer breaks on . ~ x formulas o "[", "c", "rep.int" and "seqselect" methods for Rle objects are now safer and will raise an error if the object to be returned has a length > .Machine$integer.max o Avoid blowing up memory by not expanding 'logical' Rle's into logical vectors internally in "slice" method for RleList objects. CHANGES IN VERSION 1.12.0 ------------------------- NEW FEATURES o Add "relist" method that works on a List skeleton. o Add XDoubleViews class with support of most of the functionalities available for XIntegerViews. o c() now works on XInteger and XDouble objects (in addition to XRaw objects). o Add min, max, mean, sum, which.min, which.max methods as synonyms for the view* functions. SIGNIFICANT USER-VISIBLE CHANGES o Views and RleViewsList classes don't derive from IRanges and IRangesList classes anymore. o When used on a List or a list, togroup() now returns an integer vector (instead of a factor) for consistency with what it does on other objects (e.g. on a Partitioning object). o Move compact() generic from Biostrings to IRanges. o Drop deprecated 'multiple' argument from "findOverlaps" methods. o Drop deprecated 'start' and 'symmetric' arguments from "resize" method for Ranges objects. DEPRECATED AND DEFUNCT o Using as.data.frame() and or as( , "data.frame") on an AtomicList object is deprecated. o Deprecate all coercion methods from AtomicList to atomic vectors. Those methods were unlisting the object, which can still be done with unlist(). o Deprecate the Binning class. o Remove defunct overlap() and countOverlap(). BUG FIXES o togroup() on a List or a list does not look at the names anymore to infer the grouping, only at the shape of the list-like object. o Fix 'relist(IRanges(), IRangesList())'. o Fix 'rep.int(Rle(), integer(0))'. o Fix some long-standing issues with the XIntegerViews code (better handling of "out of limits" or empty views, overflows, NAs). IRanges/R/0000755000175100017510000000000014641310665013331 5ustar00biocbuildbiocbuildIRanges/R/AtomicList-class.R0000644000175100017510000003342314626176651016644 0ustar00biocbuildbiocbuild### ========================================================================= ### AtomicList objects ### ------------------------------------------------------------------------- ## A list that holds atomic objects setClass("AtomicList", representation("VIRTUAL"), prototype = prototype(elementType = "logical"), contains = "List") setClass("LogicalList", representation("VIRTUAL"), prototype = prototype(elementType = "logical"), contains = "AtomicList") setClass("IntegerList", representation("VIRTUAL"), prototype = prototype(elementType = "integer"), contains = "AtomicList") setClass("NumericList", representation("VIRTUAL"), prototype = prototype(elementType = "numeric"), contains = "AtomicList") setClass("ComplexList", representation("VIRTUAL"), prototype = prototype(elementType = "complex"), contains = "AtomicList") setClass("CharacterList", representation("VIRTUAL"), prototype = prototype(elementType = "character"), contains = "AtomicList") setClass("RawList", representation("VIRTUAL"), prototype = prototype(elementType = "raw"), contains = "AtomicList") setClass("RleList", representation("VIRTUAL"), prototype = prototype(elementType = "Rle"), contains = "AtomicList") setClass("FactorList", representation("VIRTUAL"), prototype = prototype(elementType = "factor"), contains = "IntegerList") setClass("SimpleAtomicList", contains = c("AtomicList", "SimpleList"), representation("VIRTUAL")) setClass("SimpleLogicalList", prototype = prototype(elementType = "logical"), contains = c("LogicalList", "SimpleAtomicList")) setClass("SimpleIntegerList", prototype = prototype(elementType = "integer"), contains = c("IntegerList", "SimpleAtomicList")) setClass("SimpleNumericList", prototype = prototype(elementType = "numeric"), contains = c("NumericList", "SimpleAtomicList")) setClass("SimpleComplexList", prototype = prototype(elementType = "complex"), contains = c("ComplexList", "SimpleAtomicList")) setClass("SimpleCharacterList", prototype = prototype(elementType = "character"), contains = c("CharacterList", "SimpleAtomicList")) setClass("SimpleRawList", prototype = prototype(elementType = "raw"), contains = c("RawList", "SimpleAtomicList")) setClass("SimpleRleList", prototype = prototype(elementType = "Rle"), contains = c("RleList", "SimpleAtomicList")) setClass("SimpleFactorList", prototype = prototype(elementType = "factor"), contains = c("FactorList", "SimpleAtomicList")) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructors ### .dotargsAsList <- function(type, ...) { listData <- list(...) if (length(listData) == 1) { arg1 <- listData[[1]] if (is.list(arg1) || is(arg1, "List")) listData <- arg1 else if (type == "integer" && class(arg1) == "character") listData <- toListOfIntegerVectors(arg1) # weird special case } listData } AtomicListConstructor <- function(type, compress.default = TRUE) { constructor <- eval(substitute(function(..., compress = compress.default) { if (!isTRUEorFALSE(compress)) stop("'compress' must be TRUE or FALSE") listData <- .dotargsAsList(type, ...) CompressedOrSimple <- if (compress) "Compressed" else "Simple" ans_class <- S4Vectors:::listClassName(CompressedOrSimple, type) if (compress && sum(lengths(listData)) > .Machine$integer.max) { contructor_name <- paste0(toupper(substr(type, start=1L, stop=1L)), substr(type, start=2L, stop=nchar(type)), "List") stop(wmsg("input of ", contructor_name, "() is too big ", "for 'compress=TRUE' (the cumulated length of the ", "list elements in the resulting ", ans_class, " object ", "would exceed 2^31); please call the constructor ", "with 'compress=FALSE' instead")) } if (is(listData, ans_class)) listData else CoercerToList(type, compress)(listData) }, list(type = type))) formals(constructor)$compress <- compress.default constructor } LogicalList <- AtomicListConstructor("logical") IntegerList <- AtomicListConstructor("integer") NumericList <- AtomicListConstructor("numeric") ComplexList <- AtomicListConstructor("complex") CharacterList <- AtomicListConstructor("character") RawList <- AtomicListConstructor("raw") RleList <- AtomicListConstructor("Rle") FactorList <- AtomicListConstructor("factor") ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion ### ### Equivalent to 'as.vector(as.list(x), mode=mode)' but faster on ### CompressedAtomicList objects (10x, 75x, or more, depending on 'length(x)'). setMethod("as.vector", "AtomicList", function(x, mode="any") { valid_modes <- c("any", S4Vectors:::ATOMIC_TYPES, "double", "list") mode <- match.arg(mode, valid_modes) if (mode %in% c("any", "list")) return(as.list(x)) x_eltNROWS <- elementNROWS(x) if (any(x_eltNROWS > 1L)) stop("coercing an AtomicList object to an atomic vector ", "is supported only for\n", " objects with top-level elements of length <= 1") ans <- base::rep.int(as.vector(NA, mode=mode), length(x)) ans[x_eltNROWS == 1L] <- as.vector(unlist(x, use.names=FALSE), mode=mode) ans } ) as.matrix.AtomicList <- function(x, col.names=NULL, ...) { p <- PartitioningByEnd(x) vx <- decode(unlist(x, use.names=FALSE)) if (is.null(col.names)) { col.names <- names(vx) } if (is.null(col.names) || is.character(col.names)) { col.ind <- unlist_as_integer(IRanges(1, width(p))) } else if (is.list(col.names) || is(col.names, "List")) { col.names <- unlist(col.names, use.names=FALSE) if (is.factor(col.names)) { col.ind <- as.integer(col.names) col.names <- levels(col.names) } else { col.ind <- selfmatch(col.names) col.names <- col.names[col.ind == seq_along(col.ind)] } } else { stop("'col.names' should be NULL, a character vector or list") } row.ind <- togroup(p) nc <- if (!is.null(col.names)) length(col.names) else max(width(p)) m <- matrix(nrow=length(x), ncol=nc) m[cbind(row.ind, col.ind)] <- vx if (!is.null(col.names)) colnames(m) <- col.names m } setMethod("as.matrix", "AtomicList", function(x, col.names=NULL) as.matrix.AtomicList(x, col.names)) setMethod("drop", "AtomicList", function(x) { x_eltNROWS <- elementNROWS(x) if (any(x_eltNROWS > 1)) stop("All element lengths must be <= 1") x_dropped <- rep.int(NA, sum(x_eltNROWS)) x_unlisted <- unlist(x, use.names = FALSE) x_dropped[x_eltNROWS > 0L] <- x_unlisted if (is.factor(x_unlisted)) { x_dropped <- structure(as.integer(x_dropped), levels=levels(x_unlisted), class="factor") } names(x_dropped) <- names(x) x_dropped }) CoercerToList <- function(type, compress) { .coerceToList <- if (compress) coerceToCompressedList else S4Vectors:::coerceToSimpleList function(from) { .coerceToList(from, type) } } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### General methods ### ### Could actually be made the "table" method for List objects. Will work on ### any List object 'x' for which 'as.factor(unlist(x))' works. setMethod("table", "AtomicList", function(...) { args <- list(...) if (length(args) != 1L) stop("\"table\" method for AtomicList objects ", "can only take one input object") x <- args[[1L]] if (!pcompareRecursively(x)) { ## Not sure why callNextMethod() doesn't work. Is it because of ## dispatch on the ellipsis? #return(callNextMethod()) return(selectMethod("table", "Vector")(...)) } y1 <- togroup(PartitioningByWidth(x)) attributes(y1) <- list(levels=as.character(seq_along(x)), class="factor") y2 <- as.factor(unlist(x, use.names=FALSE)) ans <- table(y1, y2) names(dimnames(ans)) <- NULL x_names <- names(x) if (!is.null(x_names)) rownames(ans) <- x_names ans } ) setMethod("table", "SimpleAtomicList", function(...) { args <- list(...) if (length(args) != 1L) stop("\"table\" method for SimpleAtomicList objects ", "can only take one input object") x <- args[[1L]] levs <- sort(unique(unlist(lapply(x, function(xi) { if (!is.null(levels(xi))) levels(xi) else unique(xi) }), use.names=FALSE))) as.table(do.call(rbind, lapply(x, function(xi) { if (is(xi, "Rle")) runValue(xi) <- factor(runValue(xi), levs) else xi <- factor(xi, levs) table(xi) }))) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Rle methods ### ### 'use.names' is ignored. setMethod("unlist", "SimpleRleList", function (x, recursive=TRUE, use.names=TRUE) { if (!identical(recursive, TRUE)) stop("\"unlist\" method for SimpleRleList objects ", "does not support the 'recursive' argument") if (length(x) == 0L) return(Rle()) ans_values <- unlist(lapply(x@listData, slot, "values"), use.names=FALSE) ans_lengths <- unlist(lapply(x@listData, slot, "lengths"), use.names=FALSE) Rle(ans_values, ans_lengths) } ) setMethod("runLength", "RleList", function(x) { as(lapply(x, runLength), "IntegerList") }) setMethod("runValue", "RleList", function(x) { as(lapply(x, runValue), "List") }) setReplaceMethod("runValue", "SimpleRleList", function(x, value) { if (!identical(elementNROWS(ranges(x)), elementNROWS(value))) stop("elementNROWS() of 'x' and 'value' must match") x@listData <- mapply(function(rle, v) { runValue(rle) <- v rle }, x, value, SIMPLIFY=FALSE) x }) setMethod("ranges", "RleList", function(x, use.names=TRUE, use.mcols=FALSE) { as(lapply(x, ranges, use.names=use.names, use.mcols=use.mcols), "List") }) diceRangesByList <- function(x, list) { listPart <- PartitioningByEnd(list) ## 'x' cannot contain empty ranges so using ## 'hit.empty.query.ranges=TRUE' won't affect the result but ## it makes findOverlaps_IntegerRanges_Partitioning() just a little ## bit faster. hits <- findOverlaps_IntegerRanges_Partitioning( x, listPart, hit.empty.query.ranges=TRUE) ov <- overlapsRanges(x, listPart, hits) ans_unlistData <- shift(ov, 1L - start(listPart)[subjectHits(hits)]) ans_partitioning <- PartitioningByEnd(subjectHits(hits), NG=length(list)) ans <- relist(ans_unlistData, ans_partitioning) names(ans) <- names(list) ans } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Factor methods ### setMethod("levels", "FactorList", function(x) { CharacterList(lapply(x, levels)) }) setMethod("unlist", "SimpleFactorList", function(x, recursive = TRUE, use.names = TRUE) { levs <- levels(x) if (length(x) > 1L && !all(vapply(levs[-1L], identical, logical(1L), levs[[1L]]))) { stop("inconsistent level sets") } structure(callNextMethod(), levels=as.character(levs[[1L]]), class="factor") }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "show" method. ### .showAtomicList <- function(object, minLines, ...) { len <- length(object) object_names <- names(object) k <- min(minLines, len) d <- len - minLines for (i in seq_len(k)) { if (is.null(object_names)) { label <- i } else { nm <- object_names[[i]] if (is.na(nm)) { label <- "NA" } else { label <- paste0("\"", nm, "\"") } } label <- paste0("[[", label, "]]") if (length(object[[i]]) == 0) { cat(label, " ", sep = "") print(object[[i]]) } else { cat(S4Vectors:::labeledLine(label, object[[i]], labelSep = "", count = FALSE)) } } if (d > 0) cat("...\n<", d, ifelse(d == 1, " more element>\n", " more elements>\n"), sep="") } setMethod("show", "AtomicList", function(object) { cat(classNameForDisplay(object), " of length ", length(object), "\n", sep = "") .showAtomicList(object, 10) } ) setMethod("show", "RleList", function(object) { lo <- length(object) k <- min(5, length(object)) diffK <- lo - 5 cat(classNameForDisplay(object), " of length ", lo, "\n", sep = "") show(as.list(head(object, k))) if (diffK > 0) cat("...\n<", diffK, ifelse(diffK == 1, " more element>\n", " more elements>\n"), sep="") }) IRanges/R/AtomicList-utils.R0000644000175100017510000003022014626176651016667 0ustar00biocbuildbiocbuild### ========================================================================= ### Common operations on AtomicList objects ### ------------------------------------------------------------------------- ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Group generic methods ### emptyOpsReturnValue <- function(.Generic, e1, e2, compress) { dummy.vector <- do.call(.Generic, list(vector(e1@elementType), vector(e2@elementType))) CoercerToList(NULL, compress)(dummy.vector) } recycleList <- function(x, length.out) { if (length.out %% length(x) > 0L) warning("shorter object is not a multiple of longer object length") rep(x, length.out = length.out) } setMethod("Ops", signature(e1 = "AtomicList", e2 = "AtomicList"), function(e1, e2) { if (length(e1) == 0L || length(e2) == 0L) { return(emptyOpsReturnValue(.Generic, e1, e2, compress = FALSE)) } n <- max(length(e1), length(e2)) e1 <- recycleList(e1, n) e2 <- recycleList(e2, n) as(Map(.Generic, e1, e2), "List") }) setMethod("Ops", signature(e1 = "AtomicList", e2 = "atomic"), function(e1, e2) { e2 <- as(e2, class(e1)) callGeneric(e1, e2) }) setMethod("Ops", signature(e1 = "atomic", e2 = "AtomicList"), function(e1, e2) { e1 <- as(e1, class(e2)) callGeneric(e1, e2) }) ### Only to make unary + and - work (i.e. '+x' and '-x'). setMethod("Ops", c("AtomicList", "missing"), function(e1, e2) callGeneric(0L, e1) ) setMethod("Math", "AtomicList", function(x) as(lapply(x, .Generic), "List")) setMethod("Math2", "AtomicList", function(x, digits) { if (missing(digits)) digits <- ifelse(.Generic == "round", 0, 6) as(lapply(x, .Generic, digits = digits), "List") }) setMethod("Summary", "AtomicList", function(x, ..., na.rm = FALSE) { sapply(x, .Generic, na.rm = na.rm) }) setMethod("Complex", "AtomicList", function(z) as(lapply(z, .Generic), "List")) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Logical methods ### ifelseReturnValue <- function(yes, no, len) { proto <- function(x) new(if (is.atomic(x)) class(x) else x@elementType) v <- logical() v[1L] <- proto(yes)[1L] v[1L] <- proto(no)[1L] v compress <- is(yes, "CompressedList") || is(no, "CompressedList") as(rep(v, length.out = len), if(compress) "CompressedList" else "SimpleList") } setGeneric("ifelse2", function(test, yes, no) standardGeneric("ifelse2")) setMethods("ifelse2", list(c("ANY", "ANY", "List"), c("ANY", "List", "List"), c("ANY", "List", "ANY")), function(test, yes, no) { ans <- ifelseReturnValue(yes, no, length(test)) ok <- !(nas <- is.na(test)) if (any(test[ok])) ans[test & ok] <- rep(yes, length.out = length(ans))[test & ok] if (any(!test[ok])) ans[!test & ok] <- rep(no, length.out = length(ans))[!test & ok] ans[nas] <- NA names(ans) <- names(test) ans }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Numerical methods ### ### which.min() and which.max() setMethods("which.min", list("IntegerList", "NumericList", "RleList"), function(x) setNames(as.integer(lapply(x, which.min)), names(x)) ) setMethods("which.max", list("IntegerList", "NumericList", "RleList"), function(x) setNames(as.integer(lapply(x, which.max)), names(x)) ) toglobal <- function(i, x) { start(PartitioningByEnd(x)) + i - 1L } for (i in c("IntegerList", "NumericList", "RleList")) { setMethod("pmax", i, function(..., na.rm = FALSE) mendoapply(pmax, ..., MoreArgs = list(na.rm = na.rm))) setMethod("pmin", i, function(..., na.rm = FALSE) mendoapply(pmin, ..., MoreArgs = list(na.rm = na.rm))) setMethod("pmax.int", i, function(..., na.rm = FALSE) mendoapply(pmax.int, ..., MoreArgs = list(na.rm = na.rm))) setMethod("pmin.int", i, function(..., na.rm = FALSE) mendoapply(pmin.int, ..., MoreArgs = list(na.rm = na.rm))) } setMethod("mean", "AtomicList", function(x, ...) sapply(x, mean, ...) ) setMethod("var", c("AtomicList", "missing"), function(x, y=NULL, na.rm=FALSE, use) { if (missing(use)) use <- ifelse(na.rm, "na.or.complete", "everything") sapply(x, var, na.rm=na.rm, use=use) } ) setMethod("var", c("AtomicList", "AtomicList"), function(x, y=NULL, na.rm=FALSE, use) { if (missing(use)) use <- ifelse(na.rm, "na.or.complete", "everything") mapply(var, x, y, MoreArgs=list(na.rm=na.rm, use=use)) } ) setMethod("cov", c("AtomicList", "AtomicList"), function(x, y=NULL, use="everything", method=c("pearson", "kendall", "spearman")) mapply(cov, x, y, MoreArgs=list(use=use, method=match.arg(method))) ) setMethod("cor", c("AtomicList", "AtomicList"), function(x, y=NULL, use="everything", method=c("pearson", "kendall", "spearman")) mapply(cor, x, y, MoreArgs=list(use=use, method=match.arg(method))) ) setMethod("sd", "AtomicList", function(x, na.rm=FALSE) sapply(x, sd, na.rm=na.rm) ) setMethod("median", "AtomicList", function(x, na.rm=FALSE) sapply(x, median, na.rm=na.rm) ) setMethod("quantile", "AtomicList", function(x, ...) do.call(rbind, lapply(x, quantile, ...)) ) setMethod("mad", "AtomicList", function(x, center=median(x), constant=1.4826, na.rm=FALSE, low=FALSE, high=FALSE) { if (!missing(center)) stop("'center' argument is not supported") sapply(x, mad, constant=constant, na.rm=na.rm, low=low, high=high) } ) setMethod("IQR", "AtomicList", function(x, na.rm=FALSE, type=7) sapply(x, IQR, na.rm=na.rm, type=type) ) diff.AtomicList <- function(x, ...) diff(x, ...) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Running window statistic methods ### setMethod("runmed", "SimpleIntegerList", function(x, k, endrule = c("median", "keep", "constant"), algorithm = NULL, print.level = 0) NumericList(lapply(x, runmed, k = k, endrule = match.arg(endrule), algorithm = algorithm, print.level = print.level), compress = FALSE)) setMethod("runmed", "NumericList", function(x, k, endrule = c("median", "keep", "constant"), algorithm = NULL, print.level = 0) endoapply(x, runmed, k = k, endrule = match.arg(endrule), algorithm = algorithm, print.level = print.level)) setMethod("runmed", "RleList", function(x, k, endrule = c("median", "keep", "constant"), algorithm = NULL, print.level = 0) endoapply(x, runmed, k = k, endrule = match.arg(endrule))) setMethod("runmean", "RleList", function(x, k, endrule = c("drop", "constant"), na.rm = FALSE) endoapply(x, runmean, k = k, endrule = match.arg(endrule), na.rm = na.rm)) setMethod("runsum", "RleList", function(x, k, endrule = c("drop", "constant"), na.rm = FALSE) endoapply(x, runsum, k = k, endrule = match.arg(endrule), na.rm = na.rm)) setMethod("runwtsum", "RleList", function(x, k, wt, endrule = c("drop", "constant"), na.rm = FALSE) endoapply(x, runwtsum, k = k, wt = wt, endrule = match.arg(endrule), na.rm = na.rm)) setMethod("runq", "RleList", function(x, k, i, endrule = c("drop", "constant"), na.rm = FALSE) endoapply(x, runq, k = k, i = i, endrule = match.arg(endrule), na.rm = na.rm)) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Character ### ### TODO: paste, ... setMethod("unstrsplit", "CharacterList", function(x, sep="") unstrsplit(as.list(x), sep=sep) ) setMethod("unstrsplit", "RleList", function(x, sep="") unstrsplit(CharacterList(x, compress=FALSE), sep=sep) ) setMethods("nchar", c("CharacterList", "RleList"), function(x) { relist(nchar(unlist(x, use.names=FALSE)), x) }) setMethods("substring", c("CharacterList", "RleList"), function(text, first, last = 1000000L) { relist(substring(unlist(text, use.names=FALSE), first, last), text) }) setMethods("substr", c("CharacterList", "RleList"), function(x, start, stop) { relist(substr(unlist(x, use.names=FALSE), start, stop), x) }) setMethods("chartr", list(c(old = "ANY", new = "ANY", x = "CharacterList"), c(old = "ANY", new = "ANY", x = "RleList")), function(old, new, x) { relist(chartr(old, new, unlist(x, use.names=FALSE)), x) }) setMethods("toupper", c("CharacterList", "RleList"), function(x) { relist(toupper(unlist(x, use.names=FALSE)), x) }) setMethods("tolower", c("CharacterList", "RleList"), function(x) { relist(tolower(unlist(x, use.names=FALSE)), x) }) setMethods("sub", list(c("ANY", "ANY", "CharacterList"), c("ANY", "ANY", "RleList")), function(pattern, replacement, x, ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE) { relist(sub(pattern, replacement, unlist(x, use.names=FALSE), ignore.case, perl, fixed, useBytes), x) }) setMethods("gsub", list(c("ANY", "ANY", "CharacterList"), c("ANY", "ANY", "RleList")), function(pattern, replacement, x, ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE) { relist(gsub(pattern, replacement, unlist(x, use.names=FALSE), ignore.case, perl, fixed, useBytes), x) }) setMethods("grepl", list(c("ANY", "CharacterList"), c("ANY", "RleList")), function(pattern, x, ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE) { relist(grepl(pattern, unlist(x, use.names=FALSE), ignore.case, perl, fixed, useBytes), x) }) setMethods("grep", list(c("ANY", "CharacterList"), c("ANY", "RleList")), function(pattern, x, ignore.case = FALSE, perl = FALSE, value = FALSE, fixed = FALSE, useBytes = FALSE, invert = FALSE) { stopifnot(isTRUEorFALSE(invert), isTRUEorFALSE(value)) g <- grepl(pattern, x, ignore.case, perl, fixed, useBytes) if (invert) { g <- !g } if (value) { x[g] } else { which(g) } }) setMethods("startsWith", list(c("CharacterList", "ANY"), c("RleList", "ANY")), function(x, prefix) { relist(startsWith(unlist(x, use.names = FALSE), prefix), x) }) setMethods("endsWith", list(c("CharacterList", "ANY"), c("RleList", "ANY")), function(x, suffix) { relist(endsWith(unlist(x, use.names = FALSE), suffix), x) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Set/comparison methods ### subgrouping <- function(x) { g <- grouping(togroup(PartitioningByEnd(x)), unlist(x, use.names=FALSE)) as(g, "ManyToOneGrouping") } .unique.RleList <- function(x, incomparables=FALSE, ...) unique(runValue(x), incomparables=incomparables, ...) setMethod("unique", "RleList", .unique.RleList) IRanges/R/CompressedAtomicList-class.R0000644000175100017510000007056314626176651020677 0ustar00biocbuildbiocbuild### ========================================================================= ### CompressedAtomicList objects ### ------------------------------------------------------------------------- ## Possible optimizations for compressed lists: ## - order/sort: unlist, order by split factor first ## - cumsum: unlist, cumsum and subtract offsets setClass("CompressedAtomicList", contains = c("AtomicList", "CompressedList"), representation("VIRTUAL")) setClass("CompressedLogicalList", prototype = prototype(elementType = "logical", unlistData = logical()), contains = c("LogicalList", "CompressedAtomicList")) setClass("CompressedIntegerList", prototype = prototype(elementType = "integer", unlistData = integer()), contains = c("IntegerList", "CompressedAtomicList")) setClass("CompressedNumericList", prototype = prototype(elementType = "numeric", unlistData = numeric()), contains = c("NumericList", "CompressedAtomicList")) setClass("CompressedComplexList", prototype = prototype(elementType = "complex", unlistData = complex()), contains = c("ComplexList", "CompressedAtomicList")) setClass("CompressedCharacterList", prototype = prototype(elementType = "character", unlistData = character()), contains = c("CharacterList", "CompressedAtomicList")) setClass("CompressedRawList", prototype = prototype(elementType = "raw", unlistData = raw()), contains = c("RawList", "CompressedAtomicList")) setClass("CompressedRleList", prototype = prototype(elementType = "Rle", unlistData = new("Rle")), contains = c("RleList", "CompressedAtomicList")) setClass("CompressedFactorList", prototype = prototype(elementType = "factor", unlistData = factor()), contains = c("FactorList", "CompressedAtomicList")) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion ### setListCoercions("logical") setListCoercions("integer") setListCoercions("numeric") setListCoercions("complex") setListCoercions("character") setListCoercions("raw") setListCoercions("Rle") setListCoercions("factor") setMethod("as.list", "CompressedAtomicList", function(x, use.names = TRUE) { if (is(x, "CompressedRleList")) { callNextMethod(x, use.names = use.names) } else { f <- S4Vectors:::map_inner_ROWS_to_list_elements( elementNROWS(x), as.factor=TRUE) ans <- split(x@unlistData, f) if (use.names) { names(ans) <- names(x) } else { names(ans) <- NULL } ans } }) setAs("CompressedAtomicList", "list", function(from) as.list(from)) .from_IPosRanges_to_CompressedIntegerList <- function(from) { ans <- relist(unlist_as_integer(from), from) metadata(ans) <- metadata(from) mcols(ans) <- mcols(from, use.names=FALSE) ans } ### Propagate the names, metadata, and metadata columns. setAs("IPosRanges", "CompressedIntegerList", .from_IPosRanges_to_CompressedIntegerList ) setAs("IPosRanges", "IntegerList", .from_IPosRanges_to_CompressedIntegerList ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### General methods ### setCompressedNumericalListMethod <- function(fun, def, where=topenv(parent.frame())) { types <- c("Logical", "Integer", "Numeric") classNames <- paste0("Compressed", types, "List") lapply(classNames, function(className) { C_fun <- paste0("C_", sub(".", "_", fun, fixed=TRUE), "_", className) body(def) <- eval(call("substitute", body(def))) setMethod(fun, className, def, where=where) }) } setCompressedNumericalListMethod("is.unsorted", function(x, na.rm = FALSE, strictly=FALSE) { stopifnot(isTRUEorFALSE(na.rm)) stopifnot(isTRUEorFALSE(strictly)) .Call2(C_fun, x, na.rm, strictly, PACKAGE="IRanges") } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Rle methods ### setMethod("runValue", "CompressedRleList", function(x) { rle <- unlist(x, use.names=FALSE) rlePart <- PartitioningByWidth(runLength(rle)) listPart <- PartitioningByEnd(x) ## 'rlePart' cannot contain empty ranges so using ## Using 'hit.empty.query.ranges=TRUE' won't affect the result ## (because 'rlePart' cannot contain empty ranges) but it makes ## findOverlaps_IntegerRanges_Partitioning() just a little bit faster. hits <- findOverlaps_IntegerRanges_Partitioning( rlePart, listPart, hit.empty.query.ranges=TRUE) ans_partitioning <- PartitioningByEnd(subjectHits(hits), NG=length(x)) ans_unlistData <- runValue(rle)[queryHits(hits)] ans <- relist(ans_unlistData, ans_partitioning) names(ans) <- names(x) ans } ) setReplaceMethod("runValue", "CompressedRleList", function(x, value) { if (!identical(elementNROWS(ranges(x)), elementNROWS(value))) stop("elementNROWS() of 'x' and 'value' must match") runValue(x@unlistData) <- unlist(value, use.names=FALSE) x }) setMethod("runLength", "CompressedRleList", function(x) { width(ranges(x)) }) setMethod("ranges", "CompressedRleList", function(x, use.names=TRUE, use.mcols=FALSE) { rle <- unlist(x, use.names=FALSE) rlePart <- PartitioningByWidth(runLength(rle)) diceRangesByList(rlePart, x) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Factor methods ### setMethod("levels", "CompressedFactorList", function(x) { setNames(rep(CharacterList(levels(x@unlistData)), length(x)), names(x)) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Iteration ### setMethod("lapply", "CompressedAtomicList", function(X, FUN, ...) { FUN <- match.fun(FUN) if (is(X, "CompressedRleList")) { callNextMethod(X, FUN, ...) } else { lapply(as.list(X), FUN, ...) } }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Group generic methods ### .do_CompressedList_binary_op <- function(OP, e1, e2, skeleton) { if (!missing(skeleton)) { n <- length(skeleton) } else { n <- max(length(e1), length(e2)) } e1 <- recycleList(e1, n) e2 <- recycleList(e2, n) if (missing(skeleton)) { n1 <- elementNROWS(e1) n2 <- elementNROWS(e2) if (any(n1 != n2)) { en <- ifelse(n1 == 0L | n2 == 0L, 0L, pmax.int(n1, n2)) } else { en <- NULL } nms <- names(e1) if (is.null(nms)) nms <- names(e2) } else { en <- elementNROWS(skeleton) nms <- names(skeleton) } if (!is.null(en)) { e1 <- recycleListElements(e1, en) e2 <- recycleListElements(e2, en) } partitioning <- PartitioningByEnd(e1) names(partitioning) <- nms relist(OP(unlist(e1, use.names=FALSE), unlist(e2, use.names=FALSE)), partitioning) } setMethod("Ops", signature(e1 = "CompressedAtomicList", e2 = "CompressedAtomicList"), function(e1, e2) { if (length(e1) == 0L || length(e2) == 0L) { return(emptyOpsReturnValue(.Generic, e1, e2, compress = TRUE)) } .do_CompressedList_binary_op(function(x, y) { .Generic <- .Generic callGeneric(x, y) }, e1, e2) }) setMethod("Ops", signature(e1 = "SimpleAtomicList", e2 = "CompressedAtomicList"), function(e1, e2) { if (sum(as.numeric(elementNROWS(e1))) < .Machine$integer.max) e1 <- as(e1, "CompressedList") else e2 <- as(e2, "SimpleList") callGeneric(e1, e2) }) setMethod("Ops", signature(e1 = "CompressedAtomicList", e2 = "SimpleAtomicList"), function(e1, e2) { if (sum(as.numeric(elementNROWS(e2))) < .Machine$integer.max) e2 <- as(e2, "CompressedList") else e1 <- as(e1, "SimpleList") callGeneric(e1, e2) }) setMethod("Ops", signature(e1 = "CompressedAtomicList", e2 = "atomic"), function(e1, e2) { if (length(e2) > 1) { e2 <- S4Vectors:::recycleVector(e2, length(e1)) e2 <- rep(e2, elementNROWS(e1)) } relist(callGeneric(e1@unlistData, e2), e1) }) setMethod("Ops", signature(e1 = "atomic", e2 = "CompressedAtomicList"), function(e1, e2) { if (length(e1) > 1) { e1 <- S4Vectors:::recycleVector(e1, length(e2)) e1 <- rep(e1, elementNROWS(e2)) } relist(callGeneric(e1, e2@unlistData), e2) }) setMethod("Math", "CompressedAtomicList", function(x) { relist(callGeneric(x@unlistData), x) }) setMethod("cumsum", "CompressedAtomicList", function(x) { xunlist <- unlist(x, use.names=FALSE) xcumsum <- cumsum(as.numeric(xunlist)) partition <- PartitioningByEnd(x) ans <- xcumsum - rep(xcumsum[start(partition)] - xunlist[start(partition)], width(partition)) relist(ans, x) }) setMethod("cumprod", "CompressedAtomicList", function(x) { as(lapply(x, .Generic), "CompressedList") }) setMethod("cummin", "CompressedAtomicList", function(x) { as(lapply(x, .Generic), "CompressedList") }) setMethod("cummax", "CompressedAtomicList", function(x) { as(lapply(x, .Generic), "CompressedList") }) setMethod("Math2", "CompressedAtomicList", function(x, digits) { if (missing(digits)) digits <- ifelse(.Generic == "round", 0, 6) relist(callGeneric(x@unlistData, digits = digits), x) }) setMethod("any", "CompressedAtomicList", function(x, na.rm = FALSE) { stopifnot(isTRUEorFALSE(na.rm)) ans <- sum(x, na.rm=TRUE) > 0L if (!na.rm) { ans[!ans & any(is.na(x), na.rm=TRUE)] <- NA } ans }) setMethod("all", "CompressedAtomicList", function(x, na.rm = FALSE) { stopifnot(isTRUEorFALSE(na.rm)) ans <- !any(!x, na.rm=TRUE) if (!na.rm) { ans[ans & any(is.na(x), na.rm=TRUE)] <- NA } ans }) setMethod("anyNA", "CompressedAtomicList", function(x, recursive=FALSE) { callNextMethod(x, recursive=FALSE) ## recursion will just slow us down }) rowsumCompressedList <- function(x, ..., na.rm = FALSE) { x_flat <- unlist(x, use.names = FALSE) ans <- vector(class(x_flat), length(x)) non_empty <- elementNROWS(x) > 0 if (is.logical(x_flat)) x_flat <- as.integer(x_flat) ans[non_empty] <- rowsum(x_flat, togroup(PartitioningByWidth(x)), reorder = FALSE, na.rm = na.rm)[,1] setNames(ans, names(x)) } setCompressedListSummaryMethod <- function(fun, where=topenv(parent.frame())) { setCompressedNumericalListMethod(fun, function(x, na.rm = FALSE) { stopifnot(isTRUEorFALSE(na.rm)) .Call2(C_fun, x, na.rm, PACKAGE="IRanges") }, where) } setCompressedListSummaryMethod("sum") setCompressedListSummaryMethod("prod") setCompressedListSummaryMethod("min") setCompressedListSummaryMethod("max") setMethods("range", list("CompressedLogicalList", "CompressedIntegerList", "CompressedNumericList", "CompressedRleList"), function(x, na.rm=FALSE) { stopifnot(isTRUEorFALSE(na.rm)) cbind(min(x, na.rm=na.rm), max(x, na.rm=na.rm)) }) setMethod("Summary", "CompressedRleList", function(x, ..., na.rm = FALSE) { toViewFun <- list(max = viewMaxs, min = viewMins, sum = viewSums) if (!is.null(viewFun <- toViewFun[[.Generic]])) { ans <- viewFun(as(x, "RleViews"), na.rm = na.rm) names(ans) <- names(x) ans } else if (.Generic %in% c("any", "all")) callNextMethod() else sapply(x, .Generic, na.rm = na.rm) }) setMethod("all", "CompressedRleList", function(x, ..., na.rm = FALSE) { args <- list(...) if (length(args) > 0L) stop("Only a single argument in '...' is supported for now") if (!isTRUEorFALSE(na.rm)) stop("'na.rm' must be TRUE or FALSE") rv <- runValue(x) if (na.rm) rv <- rv[!is.na(rv)] rv_eltNROWS <- elementNROWS(rv) ans <- rv_eltNROWS == 0L singletons <- rv_eltNROWS == 1L ans[singletons] <- unlist(rv, use.names = FALSE)[singletons[togroup(PartitioningByWidth(rv))]] ans }) setMethod("Complex", "CompressedAtomicList", function(z) relist(callGeneric(z@unlistData), z)) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### More list-ized methods ### .setListMethod <- function(f, inputClass, outputBaseClass, whichArg = 1L, remainingSignature = character(), mapply = FALSE, endoapply = FALSE, applyToUnlist = FALSE, where = topenv(parent.frame())) { fargs <- formals(args(get(f))) args <- sapply(names(fargs), as.name) names(args) <- sub("...", "", names(args), fixed = TRUE) if (applyToUnlist) { call2 <- as.call(c(as.name("@"), args[[whichArg]], "partitioning")) args[[whichArg]] <- as.call(c(as.name("@"), args[[whichArg]], "unlistData")) call1 <- as.call(c(as.name(f), args)) call <- as.call(c(as.name("new2"), paste0("Compressed", outputBaseClass), unlistData = call1, partitioning = call2, check = FALSE)) } else { args <- c(args[[whichArg]], as.name(f), args[-whichArg]) if (endoapply) { call <- as.call(c(as.name("endoapply"), args)) } else if (missing(outputBaseClass)) { call <- as.call(c(as.name("sapply"), args, list(simplify = TRUE))) } else { if (mapply) { if (length(args) <= 3) { call <- as.call(c(as.name("mapply"), args[c(2:1,3L)], SIMPLIFY = FALSE)) } else { call <- as.call(c(as.name("mapply"), args[c(2:1,3L)], MoreArgs = as.call(c(as.name("list"), tail(args, -3))), SIMPLIFY = FALSE)) } } else { call <- as.call(c(as.name("lapply"), args)) } if (extends(inputClass, "SimpleList")) { call <- as.call(c(as.name("new2"), paste0("Simple", outputBaseClass), listData = call, check = FALSE)) } else { call <- as.call(c(as.name(outputBaseClass), call, compress = TRUE)) } } } def <- as.function(c(fargs, call)) environment(def) <- parent.frame() setMethod(f, c(rep("ANY", whichArg - 1L), inputClass, remainingSignature), def, where) } .setAtomicListMethod <- function(f, inputBaseClass = "AtomicList", outputBaseClass, whichArg = 1L, remainingSignature = character(), mapply = FALSE, endoapply = FALSE, applyToUnlist = FALSE, addRleList = TRUE, rleListOutputBaseClass = "RleList", where = topenv(parent.frame())) { if (missing(outputBaseClass)) { for (i in inputBaseClass) .setListMethod(f, i, whichArg = whichArg, remainingSignature = remainingSignature, endoapply = endoapply, where = where) } else if (endoapply) { .setListMethod(f, "AtomicList", whichArg = whichArg, remainingSignature = remainingSignature, endoapply = TRUE, where = where) } else { .setListMethod(f, paste0("Simple", inputBaseClass), outputBaseClass = outputBaseClass, whichArg = whichArg, remainingSignature = remainingSignature, mapply = mapply, where = where) .setListMethod(f, paste0("Compressed", inputBaseClass), outputBaseClass = outputBaseClass, whichArg = whichArg, remainingSignature, mapply = mapply, applyToUnlist = applyToUnlist, where = where) if (addRleList) { .setListMethod(f, "SimpleRleList", outputBaseClass = rleListOutputBaseClass, whichArg = whichArg, remainingSignature = remainingSignature, mapply = mapply, where = where) .setListMethod(f, "CompressedRleList", outputBaseClass = rleListOutputBaseClass, whichArg = whichArg, remainingSignature = remainingSignature, mapply = mapply, applyToUnlist = applyToUnlist, where = where) } } } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Logical methods ### .setAtomicListMethod("which", inputBaseClass = "LogicalList", outputBaseClass = "IntegerList", rleListOutputBaseClass = "IntegerList") setMethod("which", "CompressedLogicalList", function(x) { x.flat <- unlist(x, use.names = FALSE) part <- PartitioningByEnd(x) which.global <- which(x.flat) group <- findInterval(which.global, start(part)) which.local <- which.global - start(part)[group] + 1L ans <- splitAsList(which.local, factor(group, seq_len(length(x)))) names(ans) <- names(x) ans }) setMethods("ifelse2", list(c("CompressedLogicalList", "ANY", "ANY"), c("CompressedLogicalList", "ANY", "List"), c("CompressedLogicalList", "List", "ANY"), c("CompressedLogicalList", "List", "List")), function(test, yes, no) { .do_CompressedList_binary_op(function(yes, no) { ifelse(unlist(test, use.names=FALSE), yes, no) }, as(yes, "List"), as(no, "List"), test) }) setMethods("ifelse2", list(c("SimpleLogicalList", "ANY", "ANY"), c("SimpleLogicalList", "ANY", "List"), c("SimpleLogicalList", "List", "ANY"), c("SimpleLogicalList", "List", "List")), function(test, yes, no) { as(mapply(ifelse, test, yes, no, SIMPLIFY=FALSE), "List") }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Numerical methods ### setCompressedListWhichSummaryMethod <- function(fun, where=topenv(parent.frame())) { def <- function(x, global = FALSE) { stopifnot(isTRUEorFALSE(global)) ans <- .Call2(C_fun, x, PACKAGE="IRanges") if (global) { ans <- toglobal(ans, x) } ans } setCompressedNumericalListMethod(fun, def, where) } setCompressedListWhichSummaryMethod("which.min") setCompressedListWhichSummaryMethod("which.max") setMethod("which.min", "CompressedRleList", function(x) { viewWhichMins(as(x, "RleViews"), na.rm=TRUE) - c(0L, head(cumsum(elementNROWS(x)), -1)) }) setMethod("which.max", "CompressedRleList", function(x) { viewWhichMaxs(as(x, "RleViews"), na.rm=TRUE) - c(0L, head(cumsum(elementNROWS(x)), -1)) }) for (i in c("IntegerList", "NumericList", "RleList")) { .setAtomicListMethod("diff", inputBaseClass = i, endoapply = TRUE) } setMethods("mean", list("CompressedLogicalList", "CompressedIntegerList", "CompressedNumericList", "CompressedRleList"), function(x, trim = 0, na.rm = FALSE) { stopifnot(isTRUEorFALSE(na.rm)) stopifnot(isSingleNumber(trim)) if (trim > 0) { return(callNextMethod()) } x_eltNROWS <- if (na.rm) sum(!is.na(x)) else elementNROWS(x) sum(x, na.rm=na.rm) / x_eltNROWS }) setMethod("median", "CompressedAtomicList", function(x, na.rm=FALSE) { stopifnot(isTRUEorFALSE(na.rm)) sx <- sort(x) n <- lengths(sx) half <- (n + 1L)%/%2L even <- n%%2L != 1L ind <- IRanges(half, width=1L+even) NAs <- half == 0L ind <- relist(ind[!NAs], PartitioningByWidth(as.integer(!NAs))) ## ind <- as(half, "IntegerList") ## ind[even] <- ind[even] + as(0:1, "IntegerList") ans <- mean(sx[ind]) if (!na.rm) { NAs <- NAs | anyNA(x) } if (any(NAs)) { ans[NAs] <- as(NA, elementType(x)) } ans }) setMethod("diff", "CompressedAtomicList", function(x, lag = 1L, differences = 1L) { stopifnot(isSingleNumber(lag)) stopifnot(isSingleNumber(differences)) r <- x for (i in seq_len(differences)) r <- tails(r, -lag) - heads(r, -lag) r }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Running window statistic methods ### .setAtomicListMethod("smoothEnds", inputBaseClass = "IntegerList", endoapply = TRUE) .setAtomicListMethod("smoothEnds", inputBaseClass = "NumericList", endoapply = TRUE) .setAtomicListMethod("smoothEnds", inputBaseClass = "RleList", endoapply = TRUE) setMethod("runmed", "CompressedIntegerList", function(x, k, endrule = c("median", "keep", "constant"), algorithm = NULL, print.level = 0) NumericList(lapply(x, runmed, k = k, endrule = match.arg(endrule), algorithm = algorithm, print.level = print.level))) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Character ### nchar_CompressedList <- function(x, type="chars", allowNA=FALSE) { unlisted_x <- unlist(x, use.names=FALSE) unlisted_ans <- nchar(unlisted_x, type=type, allowNA=allowNA) relist(unlisted_ans, x) } ### H.P. (Feb 5, 2018): Does not seem right to output a CompressedList object ### when the input is SimpleList! setMethod("nchar", "CompressedCharacterList", nchar_CompressedList) setMethod("nchar", "SimpleCharacterList", nchar_CompressedList) # not good! setMethod("nchar", "CompressedRleList", nchar_CompressedList) setMethod("nchar", "SimpleRleList", nchar_CompressedList) # not good! setMethod("paste", "CompressedAtomicList", function(..., sep=" ", collapse=NULL) { args <- lapply(list(...), as, "CharacterList") x_eltNROWS <- do.call(pmax, lapply(args, elementNROWS)) args <- lapply(args, recycleListElements, x_eltNROWS) unlisted <- lapply(args, unlist, use.names=FALSE) ans <- relist(do.call(paste, c(unlisted, sep=sep)), PartitioningByWidth(x_eltNROWS)) if (!is.null(collapse)) { ans <- unstrsplit(ans, collapse) } ans }) ## need vectorized start, end ##.setAtomicListMethod("substr") ##.setAtomicListMethod("substring") .setAtomicListMethod("chartr", inputBaseClass = "CharacterList", outputBaseClass = "CharacterList", whichArg = 3L, applyToUnlist = TRUE) .setAtomicListMethod("tolower", inputBaseClass = "CharacterList", outputBaseClass = "CharacterList", applyToUnlist = TRUE) .setAtomicListMethod("toupper", inputBaseClass = "CharacterList", outputBaseClass = "CharacterList", applyToUnlist = TRUE) .setAtomicListMethod("sub", inputBaseClass = "CharacterList", outputBaseClass = "CharacterList", whichArg = 3L, applyToUnlist = TRUE) .setAtomicListMethod("gsub", inputBaseClass = "CharacterList", outputBaseClass = "CharacterList", whichArg = 3L, applyToUnlist = TRUE) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Comparison / sorting ### setMethod("selfmatch", "CompressedAtomicList", function(x, global=FALSE) { g <- subgrouping(x) first <- unlist(g)[start(PartitioningByEnd(g))] ux <- unlist(x, use.names=FALSE) ux[unlist(g)] <- rep(first, lengths(g)) ans <- relist(ux, x) if (!global) { ans <- ans - start(ans) + 1L } ans }) .duplicated.CompressedAtomicList <- function(x, incomparables=FALSE, fromLast=FALSE, nmax=NA, ...) { if (!identical(incomparables, FALSE)) stop("\"duplicated\" method for CompressedList objects ", "does not support the 'incomparables' argument") if (length(list(...)) > 0L) { stop("arguments in '...' are not supported") } stopifnot(isTRUEorFALSE(fromLast)) g <- subgrouping(x) p <- PartitioningByEnd(g) first <- unlist(g)[if (fromLast) end(p) else start(p)] v <- rep(TRUE, length(unlist(g))) v[first] <- FALSE relist(v, x) } setMethod("duplicated", "CompressedAtomicList", .duplicated.CompressedAtomicList) setMethod("rank", "CompressedAtomicList", function (x, na.last = TRUE, ties.method = c("average", "first", "last", "random", "max", "min")) { stopifnot(isTRUE(na.last)) ties.method <- match.arg(ties.method) if (ties.method == "last" || ties.method == "random") stop("'ties.method' last/random not yet supported") p <- PartitioningByEnd(x) o <- order(togroup(p), unlist(x, use.names=FALSE)) r <- unlist_as_integer(IRanges(1L, width=width(p))) gp <- PartitioningByEnd(end(Rle(unlist(x, use.names=FALSE)[o]))) v <- switch(ties.method, average=(r[start(gp)] + r[end(gp)])/2, first=r, ## last=, ## random=, max=r[end(gp)], min=r[start(gp)]) if (ties.method != "first") v <- rep(v, width(gp)) r[o] <- v relist(r, x) }) setMethod("order", "CompressedAtomicList", function (..., na.last = TRUE, decreasing = FALSE, method = c("auto", "shell", "radix")) { args <- list(...) if (length(args) != 1L) stop("\"order\" method for CompressedAtomicList objects ", "can only take one input object") x <- args[[1L]] p <- PartitioningByEnd(x) ux <- unlist(x, use.names=FALSE) o <- order(togroup(p), ux, na.last=na.last, decreasing=decreasing, method=method) skeleton <- if (is.na(na.last) && anyNA(ux)) { skeleton <- PartitioningByWidth(width(p) - sum(is.na(x))) } else p relist(o, skeleton) - start(p) + 1L }) IRanges/R/CompressedDataFrameList-class.R0000644000175100017510000001265614626176651021306 0ustar00biocbuildbiocbuild### ========================================================================= ### CompressedDataFrameList objects ### ------------------------------------------------------------------------- setClass("CompressedDataFrameList", contains=c("DataFrameList", "CompressedList"), representation("VIRTUAL", unlistData="DataFrame"), prototype(unlistData=new("DFrame")) ) setClass("CompressedDFrameList", contains=c("DFrameList", "CompressedDataFrameList"), representation(unlistData="DFrame") ) setClass("CompressedSplitDataFrameList", contains=c("SplitDataFrameList", "CompressedDataFrameList"), representation("VIRTUAL") ) setClass("CompressedSplitDFrameList", contains=c("SplitDFrameList", "CompressedDFrameList", "CompressedSplitDataFrameList") ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Accessor methods. ### ### Deprecated. ### IMPORTANT NOTE: We won't be able to go thru the Defunct cycle because ### a lot of code around assumes that ncol() can be called on an arbitrary ### object! setMethod("ncol", "CompressedSplitDataFrameList", function(x) { msg <- c("The ncol() method for CompressedSplitDataFrameList ", "objects is deprecated. Please use ncols() on these ", "objects instead.") .Deprecated(msg=wmsg(msg)) if (length(x) == 0L) 0L else structure(rep.int(ncol(x@unlistData), length(x)), names = names(x)) }) setMethod("ncols", "CompressedSplitDataFrameList", function(x, use.names=TRUE) { if (!isTRUEorFALSE(use.names)) stop(wmsg("'use.names' must be TRUE or FALSE")) ans_names <- if (use.names) names(x) else NULL structure(rep.int(ncol(x@unlistData), length(x)), names=ans_names) } ) setMethod("colnames", "CompressedSplitDataFrameList", function(x, do.NULL = TRUE, prefix = "col") { if (length(x)) { nms <- colnames(x@unlistData, do.NULL = do.NULL, prefix = prefix) rep(CharacterList(nms), length(x)) } else NULL }) setReplaceMethod("rownames", "CompressedSplitDataFrameList", function(x, value) { if (is.null(value)) { rownames(x@unlistData) <- NULL } else if (is(value, "CharacterList")){ if (length(x) != length(value)) stop("replacement value must be the same length as x") rownames(x@unlistData) <- unlist(value, use.names=FALSE) } else { stop("replacement value must either be NULL or a CharacterList") } x }) setReplaceMethod("colnames", "CompressedSplitDataFrameList", function(x, value) { if (is.null(value)) { colnames(x@unlistData) <- NULL } else if (is.character(value)) { colnames(x@unlistData) <- value } else if (is(value, "CharacterList")){ if (length(x) != length(value)) stop("replacement value must be the same length as x") if (length(x) > 0) colnames(x@unlistData) <- unlist(value[[1L]]) } else { stop("replacement value must either be NULL or a CharacterList") } x }) setMethod("commonColnames", "CompressedSplitDataFrameList", function(x) colnames(unlist(x, use.names=FALSE))) setMethod("columnMetadata", "CompressedSplitDataFrameList", function(x) { mcols(x@unlistData, use.names=FALSE) }) setReplaceMethod("columnMetadata", "CompressedSplitDataFrameList", function(x, value) { mcols(x@unlistData) <- value x }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Subsetting. ### setMethod("[", "CompressedSplitDataFrameList", function(x, i, j, ..., drop=TRUE) { if (!missing(j)) x@unlistData <- x@unlistData[, j, drop=FALSE] if (!missing(i)) x <- callNextMethod(x, i) if (((nargs() - !missing(drop)) > 2) && (ncol(x@unlistData) == 1) && (missing(drop) || drop)) { x <- relist(x@unlistData[[1L]], x) } x }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion ### setAs("ANY", "CompressedDataFrameList", function(from) as(from, "CompressedDFrameList") ) setAs("ANY", "CompressedSplitDataFrameList", function(from) as(from, "CompressedSplitDFrameList") ) setListCoercions("DFrame") setAs("ANY", "CompressedSplitDFrameList", function(from) { coerceToCompressedList(from, "DFrame") }) setAs("ANY", "SplitDFrameList", function(from) as(from, "CompressedSplitDFrameList")) setAs("DataFrame", "SplitDFrameList", function(from) as(from, "CompressedSplitDFrameList")) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Display ### setMethod("classNameForDisplay", "CompressedDFrameList", function(x) sub("^Compressed", "", sub("DFrame", "DataFrame", class(x))) ) IRanges/R/CompressedGrouping-class.R0000644000175100017510000000161214626176651020406 0ustar00biocbuildbiocbuild### ========================================================================= ### CompressedGrouping objects ### ------------------------------------------------------------------------- setClass("CompressedGrouping", ### TODO: contain VIRTUAL after R 3.4 release contains=c("Grouping", "CompressedIntegerList")) setClass("CompressedManyToOneGrouping", contains=c("ManyToOneGrouping", "CompressedGrouping")) setClass("CompressedManyToManyGrouping", contains=c("BaseManyToManyGrouping", "CompressedGrouping")) ### ------------------------------------------------------------------------- ### Grouping API implementation ### ---------------------------- ### setMethod("grouplengths", "CompressedGrouping", function(x, i=NULL) grouplengths(PartitioningByEnd(x), i)) setMethod("nobj", "CompressedManyToOneGrouping", function(x) nobj(PartitioningByEnd(x))) IRanges/R/CompressedHitsList-class.R0000644000175100017510000000325614626176651020365 0ustar00biocbuildbiocbuild### ========================================================================= ### CompressedHitsList objects ### ------------------------------------------------------------------------- ### [H.P. - 2015/12/17] Why do we need this? Where is it used? ### It's not even exported. setClass("CompressedHitsList", prototype = prototype(elementType = "Hits", unlistData = new("Hits")), contains="CompressedList") ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Accessors ### setMethod("from", "CompressedHitsList", function(x) from(x@unlistData)) setMethod("to", "CompressedHitsList", function(x) to(x@unlistData)) setMethod("nLnode", "CompressedHitsList", function(x) nLnode(x@unlistData)) setMethod("nRnode", "CompressedHitsList", function(x) nRnode(x@unlistData)) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructor ### CompressedHitsList <- function(hits, query) { if (!(is(query, "CompressedIRangesList"))) stop("'query' must be a 'CompressedIRangesList' object") if (!is(hits, "Hits")) stop("'hits' must be a 'Hits' object") qspace <- space(query) hspace <- as.integer(qspace[queryHits(hits)]) partitioning <- PartitioningByEnd(hspace, names=names(query@partitioning), NG=length(names(query@partitioning))) newCompressedList0("CompressedHitsList", unlistData=hits, partitioning=partitioning) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion ### ## return as.matrix as on Hits, with indices adjusted setMethod("as.matrix", "CompressedHitsList", function(x) { cbind(queryHits=queryHits(x), subjectHits=subjectHits(x)) }) IRanges/R/CompressedList-class.R0000644000175100017510000003753014641310665017527 0ustar00biocbuildbiocbuild### ========================================================================= ### CompressedList objects ### ------------------------------------------------------------------------- setClass("CompressedList", contains="List", representation( "VIRTUAL", unlistData="ANY", partitioning="PartitioningByEnd" ) ) setMethod("classNameForDisplay", "CompressedList", function(x) sub("^Compressed", "", class(x)) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Validity. ### .valid.CompressedList.partitioning <- function(x) { dataLength <- NROW(x@unlistData) if (nobj(x@partitioning) != dataLength) "improper partitioning" else NULL } .valid.CompressedList.unlistData <- function(x) { ## FIXME: workaround to support CompressedNormalIRangesList ## elementTypeX <- elementType(x) elementTypeX <- elementType(new(class(x))) if (!extends(class(x@unlistData), elementTypeX)) paste("the 'unlistData' slot must be of class", elementTypeX) else NULL } .valid.CompressedList <- function(x) { c(.valid.CompressedList.unlistData(x), .valid.CompressedList.partitioning(x)) } setValidity2("CompressedList", .valid.CompressedList) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### updateObject() ### ### This just implements the generic updateObject strategy that consists ### in calling updateObject() on each **proper** CompressedList slot i.e. ### on the slots added by the CompressedList class, or, said otherwise, on ### the slots that are not inherited. setMethod("updateObject", "CompressedList", function(object, ..., verbose=FALSE) { ## The 'unlistData' slot could be an Rle, DataFrame, IRanges or ## GRanges object, or any vector-like object that needs an update. object@unlistData <- updateObject(object@unlistData, ..., verbose=verbose) ## The 'partitioning' slot is a PartitioningByEnd object which derives ## from IPosRanges so its elementType slot might need to be updated. ## See "updateObject" method for IPosRanges objects for more ## information. object@partitioning <- updateObject(object@partitioning, ..., verbose=verbose) callNextMethod() } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Getters and setters ### setMethod("length", "CompressedList", function(x) length(x@partitioning)) setMethod("names", "CompressedList", function(x) names(x@partitioning)) setMethod("elementNROWS", "CompressedList", function(x) { ans <- elementNROWS(x@partitioning) names(ans) <- names(x) ans } ) setReplaceMethod("names", "CompressedList", function(x, value) { names(x@partitioning) <- value x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructor ### ### Use ### IRanges:::newCompressedList0(getClass("MyClass"), ### unlistData, partitioning) ### when calling this from another package. ### .reconcile_mcols <- function(x) { x_mcols <- mcols(x, use.names=FALSE) if (is(x_mcols, "DataFrame") && nrow(x_mcols) == 0L && ncol(x_mcols) == 0L) { x_mcols <- make_zero_col_DFrame(length(x)) mcols(x) <- x_mcols } x } ### Low-level. NOT exported. newCompressedList0 <- function(Class, unlistData, partitioning) { ## Note that 'unlistData_target_class' could also be obtained ## with 'getClassDef(Class)@slots[["unlistData"]]', in which ## case the class name would be returned with the "package" attribute. unlistData_target_class <- getSlots(Class)[["unlistData"]] ## 'unlistData' must derive from the class expected by the "unlistData" ## slot. If it doesn't (e.g. if 'Class' is "CompressedSplitDFrameList" ## and 'unlistData' is an ordinary data.frame), then we coerce it. Note ## that this coercion could fail. if (!is(unlistData, unlistData_target_class)) unlistData <- as(unlistData, unlistData_target_class) ans <- new2(Class, unlistData=unlistData, partitioning=partitioning, check=FALSE) .reconcile_mcols(ans) } ### Low-level. NOT exported. ### Stuff to put in elementMetadata slot can be passed either with ### new_CompressedList_from_list(..., elementMetadata=somestuff) ### or with ### new_CompressedList_from_list(..., mcols=somestuff) ### The latter is the new recommended form. new_CompressedList_from_list <- function(Class, x, ..., mcols) { if (!extends(Class, "CompressedList")) stop("class ", Class, " must extend CompressedList") if (!is.list(x)) stop("'x' must be a list") ans_elementType <- elementType(new(Class)) if (!all(sapply(x, function(xi) extends(class(xi), ans_elementType)))) stop("all elements in 'listData' must be ", ans_elementType, " objects") ans_partitioning <- PartitioningByEnd(x) if (length(x) == 0L) { if (missing(mcols)) return(new2(Class, partitioning=ans_partitioning, ..., check=FALSE)) return(new2(Class, partitioning=ans_partitioning, ..., elementMetadata=mcols, check=FALSE)) } ans_unlistData <- S4Vectors:::compress_listData(x, ans_elementType) if (missing(mcols)) { ans <- new2(Class, unlistData=ans_unlistData, partitioning=ans_partitioning, ..., check=FALSE) } else { ans <- new2(Class, unlistData=ans_unlistData, partitioning=ans_partitioning, ..., elementMetadata=mcols, check=FALSE) } .reconcile_mcols(ans) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### unlist() ### ### Overwrite method for List objects with super fast method for CompressedList ### objects. setMethod("unlist", "CompressedList", function(x, recursive=TRUE, use.names=TRUE) { if (!isTRUEorFALSE(use.names)) stop("'use.names' must be TRUE or FALSE") unlisted_x <- x@unlistData if (use.names) unlisted_x <- S4Vectors:::set_unlisted_names(unlisted_x, x) unlisted_x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion. ### coerceToCompressedList <- function(from, element.type = NULL, ...) { if (is(from, S4Vectors:::listClassName("Compressed", element.type))) return(from) if (is.list(from) || (is(from, "List") && !is(from, "DataFrame"))) { if (is.list(from)) { v <- S4Vectors:::compress_listData(from, element.type) } else { v <- unlist(from, use.names = FALSE) } part <- PartitioningByEnd(from) } else { v <- from part <- PartitioningByEnd(seq_len(NROW(from))) } if (!is.null(element.type)) { v <- S4Vectors:::coercerToClass(element.type)(v, ...) } to <- relist(v, part) names(to) <- names(from) if (is(from, "List")) mcols(to) <- mcols(from) to } setAs("ANY", "CompressedList", function(from) coerceToCompressedList(from)) setListCoercions <- function(type) { CompressedClass <- S4Vectors:::listClassName("Compressed", type) SimpleClass <- S4Vectors:::listClassName("Simple", type) Class <- S4Vectors:::listClassName("", type) hasCompressedList <- CompressedClass != "CompressedList" if (hasCompressedList) { setAs("ANY", CompressedClass, CoercerToList(type, compress = TRUE)) } setAs("ANY", SimpleClass, CoercerToList(type, compress = FALSE)) setAs("ANY", Class, CoercerToList(type, compress = hasCompressedList)) setAs("SimpleList", Class, CoercerToList(type, compress = FALSE)) setAs("list", Class, CoercerToList(type, compress = FALSE)) } ### Used by the following coercion methods to perform a "dumb split" with ### propagation of the metadata columns: ### - from IntegerRanges to CompressedIRangesList ### - from GenomicRanges to CompressedGRangesList ### - from XStringSet to XStringSetList ### - and maybe more... ### NOT exported. from_Vector_to_CompressedList <- function(from) { ans_mcols <- mcols(from, use.names=FALSE) mcols(from) <- NULL ans <- splitAsList(from) # dumb split mcols(ans) <- ans_mcols ans } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Subsetting. ### setMethod("extractROWS", "CompressedList", function(x, i) { i <- normalizeSingleBracketSubscript(i, x, as.NSBS=TRUE) ans_eltNROWS <- extractROWS(width(x@partitioning), i) ans_breakpoints <- suppressWarnings(cumsum(ans_eltNROWS)) nbreakpoints <- length(ans_breakpoints) if (nbreakpoints != 0L && is.na(ans_breakpoints[[nbreakpoints]])) stop(wmsg("Subsetting operation on ", class(x), " object 'x' ", "produces a result that is too big to be ", "represented as a CompressedList object. ", "Please try to coerce 'x' to a SimpleList object ", "first (with 'as(x, \"SimpleList\")').")) idx_on_unlisted_x <- IRanges(end=extractROWS(end(x@partitioning), i), width=ans_eltNROWS) ans_unlistData <- extractROWS(x@unlistData, idx_on_unlisted_x) ans_partitioning <- new2("PartitioningByEnd", end=ans_breakpoints, NAMES=extractROWS(names(x), i), check=FALSE) ans_elementMetadata <- extractROWS(x@elementMetadata, i) initialize(x, unlistData=ans_unlistData, partitioning=ans_partitioning, elementMetadata=ans_elementMetadata) } ) setMethod("getListElement", "CompressedList", function(x, i, exact=TRUE) { i2 <- normalizeDoubleBracketSubscript(i, x, exact=exact, allow.NA=TRUE, allow.nomatch=TRUE) if (is.na(i2)) return(NULL) unlisted_x <- unlist(x, use.names=FALSE) x_partitioning <- PartitioningByEnd(x) window_start <- start(x_partitioning)[i2] window_end <- end(x_partitioning)[i2] S4Vectors:::Vector_window(unlisted_x, start=window_start, end=window_end) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Concatenation ### .concatenate_CompressedList_objects <- function(x, objects=list(), use.names=TRUE, ignore.mcols=FALSE, check=TRUE) { objects <- S4Vectors:::prepare_objects_to_bind(x, objects) all_objects <- c(list(x), objects) ## 1. Take care of the parallel slots ## Call method for Vector objects to concatenate all the parallel slots ## (only "elementMetadata" in the case of CompressedList) and stick them ## into 'ans'. Note that the resulting 'ans' can be an invalid object ## because its "elementMetadata" slot can be longer (i.e. have more rows) ## than 'ans' itself so we use 'check=FALSE' to skip validation. ans <- callNextMethod(x, objects, use.names=use.names, ignore.mcols=ignore.mcols, check=FALSE) ## 2. Take care of the non-parallel slots ## Concatenate the "unlistData" slots. unlistData_list <- lapply(all_objects, slot, "unlistData") ## Skip validation here too (we'll validate the final object). ans_unlistData <- bindROWS(unlistData_list[[1L]], objects=unlistData_list[-1L], check=FALSE) ## Concatenate the "partitioning" slots. ans_breakpoints <- cumsum(unlist(lapply(all_objects, elementNROWS), use.names=use.names)) ans_partitioning <- PartitioningByEnd(ans_breakpoints) ## Update 'ans' and validate it (if the caller has set 'check' to TRUE). BiocGenerics:::replaceSlots(ans, unlistData=ans_unlistData, partitioning=ans_partitioning, check=check) } setMethod("bindROWS", "CompressedList", .concatenate_CompressedList_objects) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Looping. ### ### Cannot really avoid the cost of extracting X[[i]] for all valid i but tries ### to minimize this cost by using 2 tricks: ### 1. Avoids looping on values of i for which X[[i]] has length 0. Instead ### FUN(X[[i]], ...) is computed only once (because it's the same for all ### these values of i) and placed at the corresponding positions in the ### returned list. ### 2. Turn off object validation during the main loop. Note that there is no ### reason to restrict this trick to CompressedList objects and the same ### trick could be used in the "lapply" method for List objects. ### Does NOT propagate the names. lapply_CompressedList <- function(X, FUN, ...) { FUN <- match.fun(FUN) ans <- vector(mode="list", length=length(X)) unlisted_X <- unlist(X, use.names=FALSE) X_partitioning <- PartitioningByEnd(X) X_elt_width <- width(X_partitioning) empty_idx <- which(X_elt_width == 0L) if (length(empty_idx) != 0L) ans[empty_idx] <- list(FUN(extractROWS(unlisted_X, integer(0)), ...)) non_empty_idx <- which(X_elt_width != 0L) if (length(non_empty_idx) == 0L) return(ans) X_elt_start <- start(X_partitioning) X_elt_end <- end(X_partitioning) ans[non_empty_idx] <- lapply(non_empty_idx, function(i) FUN(extractROWS(unlisted_X, IRanges(X_elt_start[i], X_elt_end[i])), ...)) ans } setMethod("lapply", "CompressedList", function(X, FUN, ...) { ans <- lapply_CompressedList(X, FUN, ...) names(ans) <- names(X) ans } ) setMethod("revElements", "CompressedList", function(x, i) { i <- normalizeSingleBracketSubscript(i, x, as.NSBS=TRUE) if (length(i) == 0L) return(x) rev <- logical(length(x)) rev <- replaceROWS(rev, i, TRUE) x_partitioning <- PartitioningByEnd(x) from <- ifelse(rev, end(x_partitioning), start(x_partitioning)) by <- ifelse(rev, -1L, 1L) ii <- sequence(width(x_partitioning), from, by) x@unlistData <- extractROWS(x@unlistData, ii) x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Recycling ### repLengthOneElements <- function(x, times) { x@unlistData <- rep(x@unlistData, times) x@partitioning@end <- cumsum(times) x } recycleListElements <- function(x, newlen) { x_eltNROWS <- elementNROWS(x) if (identical(x_eltNROWS, newlen)) { return(x) } if (all(x_eltNROWS == 1L)) { ans <- repLengthOneElements(x, newlen) } else { if (any(x_eltNROWS == 0L & newlen > 0L)) { if (is(x, "AtomicList")) { x[x_eltNROWS == 0L & newlen > 0L] <- list(NA) } else { stop("recycling of zero-length elements not supported") } x_eltNROWS <- elementNROWS(x) } times <- ceiling(newlen / x_eltNROWS) times[x_eltNROWS == 0L] <- 0L ans_ir <- rep(as(PartitioningByEnd(x), "IRanges"), times) remainder <- newlen %% x_eltNROWS remainder[x_eltNROWS == 0L] <- 0L if (any(remainder > 0L)) { last <- cumsum(times) width(ans_ir)[last[remainder > 0]] <- remainder[remainder > 0] warning("Some element lengths are not multiples of their ", "corresponding element length in ", deparse(substitute(x))) } ans <- relist(extractROWS(unlist(x, use.names=FALSE), ans_ir), PartitioningByWidth(newlen)) } ans } IRanges/R/CompressedList-comparison.R0000644000175100017510000000543714626176651020605 0ustar00biocbuildbiocbuild### ========================================================================= ### Comparing and ordering CompressedList objects ### ------------------------------------------------------------------------- ### ### Overwrite methods defined in S4Vectors/R/List-comparison.R for List ### objects with optimized methods for CompressedList objects. ### ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Element-wise (aka "parallel") comparison of 2 List objects. ### ### TODO: Add optimized "==" and "<=" methods for CompressedList objects. ### setMethod("!", "CompressedList", function(x) relist(!unlist(x, use.names=FALSE), x) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### match() ### ### The first match method catches CompressedList,list; 'table' is atomic setMethod("match", c("CompressedList", "vector"), function(x, table, nomatch = NA_integer_, incomparables = NULL, ...) { m <- match(x@unlistData, table, nomatch=nomatch, incomparables=incomparables, ...) relist(m, x) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### duplicated() & unique() ### .duplicated.CompressedList <- function(x, incomparables=FALSE, fromLast=FALSE, nmax=NA) { if (!identical(incomparables, FALSE)) stop("\"duplicated\" method for CompressedList objects ", "does not support the 'incomparables' argument") x_unlistData <- x@unlistData sm <- match(x_unlistData, x_unlistData) # doesn't work on an Rle x_group <- rep.int(seq_along(x), elementNROWS(x)) ans_unlistData <- duplicatedIntegerPairs(x_group, sm, fromLast=fromLast) relist(ans_unlistData, x) } setMethod("duplicated", "CompressedList", .duplicated.CompressedList) .unique.CompressedList <- function(x, ...) { is_dup <- duplicated(x, ...) x_unlistData <- x@unlistData keep_idx <- which(!is_dup@unlistData) ans_unlistData <- x_unlistData[keep_idx] x_group <- rep.int(seq_along(x), elementNROWS(x)) ans_group <- x_group[keep_idx] ans_partitioning <- PartitioningByEnd(ans_group, NG=length(x), names=names(x)) relist(ans_unlistData, ans_partitioning) } setMethod("unique", "CompressedList", .unique.CompressedList) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### %in% ### ### The "%in%" method for Vector objects calls is.na() internally. setMethod("is.na", "CompressedList", function(x) relist(is.na(unlist(x, use.names=FALSE)), x) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### order() and related methods. ### ### TODO: Add optimized methods for CompressedList objects. ### IRanges/R/CompressedRangesList-class.R0000644000175100017510000002376614626176651020705 0ustar00biocbuildbiocbuild### ========================================================================= ### CompressedRangesList objects ### ------------------------------------------------------------------------- ### setClass("CompressedRangesList", contains=c("RangesList", "CompressedList"), representation("VIRTUAL") ) setClass("CompressedPosList", contains=c("PosList", "CompressedRangesList"), representation("VIRTUAL") ) setClass("CompressedIntegerRangesList", contains=c("IntegerRangesList", "CompressedRangesList"), representation("VIRTUAL") ) setClass("CompressedIRangesList", contains=c("IRangesList", "CompressedIntegerRangesList"), representation(unlistData="IRanges") ) ### CompressedNormalIRangesList cannot hold NormalIRanges as its elements, ### due to the compression concatenating everything into a single ### NormalIRanges (which could easily become non-normal). So just have it ### hold IRanges, instead. setClass("CompressedNormalIRangesList", contains=c("NormalIRangesList", "CompressedIRangesList"), prototype=prototype( elementType="IRanges", unlistData=new("IRanges") ) ) setClass("CompressedIPosList", contains=c("IPosList", "CompressedPosList", "CompressedIntegerRangesList"), prototype=prototype(unlistData=new("StitchedIPos")) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Methods for CompressedRangesList objects ### setMethod("start", "CompressedRangesList", function(x) relist(start(unlist(x, use.names=FALSE)), x) ) setMethod("end", "CompressedRangesList", function(x) relist(end(unlist(x, use.names=FALSE)), x) ) setMethod("width", "CompressedRangesList", function(x) relist(width(unlist(x, use.names=FALSE)), x) ) setMethod("pos", "CompressedPosList", function(x) relist(pos(unlist(x, use.names=FALSE)), x) ) setMethod(".replaceSEW", "CompressedRangesList", function(x, FUN, ..., value) { if (extends(class(value), "IntegerList")) { if (!identical(lapply(x, names), lapply(value, names)) && !all(elementNROWS(x) == elementNROWS(value))) stop("'value' must have same length and names as current 'ranges'") value <- unlist(value) } else if (is.numeric(value)) { lelts <- sum(elementNROWS(x)) if (lelts != length(value)) value <- rep(value, length.out = lelts) if (!is.integer(value)) value <- as.integer(value) } else { stop("'value' must extend class IntegerList or integer") } FUN <- match.fun(FUN) slot(x, "unlistData", check=FALSE) <- FUN(x@unlistData, ..., value = value) x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Methods for CompressedIRangesList objects ### setMethod("isNormal", "CompressedIRangesList", function(x, use.names=FALSE) .Call2("C_isNormal_CompressedIRangesList", x, use.names, PACKAGE="IRanges") ) setMethod("summary", "CompressedIRangesList", function(object) .Call2("C_summary_CompressedIRangesList", object, PACKAGE="IRanges") ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion from list-like object to CompressedIRangesList ### ### From ordinary list to CompressedIRangesList .from_list_to_CompressedIRangesList <- function(from) { from <- as_list_of_IRanges(from) new_CompressedList_from_list("CompressedIRangesList", from) } setAs("list", "CompressedIRangesList", .from_list_to_CompressedIRangesList) ### From List derivative to CompressedIRangesList .from_List_to_CompressedIRangesList <- function(from) { new_CompressedList_from_list("CompressedIRangesList", as_list_of_IRanges(from), metadata=metadata(from), mcols=mcols(from, use.names=FALSE)) } ### IntegerRanges objects are List objects so this case is already covered ### by the .from_List_to_CompressedIRangesList() helper above. However, we ### can implement it much more efficiently. .from_IntegerRanges_to_CompressedIRangesList <- function(from) { if (!is(from, "IRanges")) from <- as(from, "IRanges", strict=FALSE) from_Vector_to_CompressedList(from) } setAs("List", "CompressedIRangesList", .from_List_to_CompressedIRangesList) setAs("IntegerRanges", "CompressedIRangesList", .from_IntegerRanges_to_CompressedIRangesList) setAs("List", "IRangesList", function(from) { if (is(from, "CompressedList") || is(from, "IntegerRanges")) as(from, "CompressedIRangesList") else as(from, "SimpleIRangesList") } ) ### This case is already covered by the List-to-CompressedIRangesList coercion ### above. However, we can implement it much more efficiently. setAs("CompressedRleList", "CompressedIRangesList", function(from) { if ((length(from) > 0) && (!is.logical(runValue(from[[1L]])) || S4Vectors:::anyMissing(runValue(from[[1L]])))) stop("cannot coerce a non-logical 'RleList' or a logical 'RleList' ", "with NAs to a CompressedIRangesList object") ranges <- as(unlist(from, use.names = FALSE), "IRanges") to <- diceRangesByList(ranges, from) metadata(to) <- metadata(from) mcols(to) <- mcols(from, use.names=FALSE) to }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Methods for CompressedNormalIRangesList objects ### setMethod("getListElement", "CompressedNormalIRangesList", function(x, i, exact=TRUE) newNormalIRangesFromIRanges(callNextMethod()) ) .min_CompressedNormalIRangesList <- function(x, use.names) { if (!is(x, "CompressedNormalIRangesList")) stop("'x' must be a CompressedNormalIRangesList object") use.names <- S4Vectors:::normargUseNames(use.names) .Call2("C_min_CompressedNormalIRangesList", x, use.names, PACKAGE="IRanges") } setMethod("min", "CompressedNormalIRangesList", function(x, ..., na.rm) .min_CompressedNormalIRangesList(x, TRUE)) .max_CompressedNormalIRangesList <- function(x, use.names) { if (!is(x, "CompressedNormalIRangesList")) stop("'x' must be a CompressedNormalIRangesList object") use.names <- S4Vectors:::normargUseNames(use.names) .Call2("C_max_CompressedNormalIRangesList", x, use.names, PACKAGE="IRanges") } setMethod("max", "CompressedNormalIRangesList", function(x, ..., na.rm) .max_CompressedNormalIRangesList(x, TRUE)) ### Coercion from list to CompressedNormalIRangesList. .as.list.CompressedNormalIRangesList <- function(x, use.names=TRUE) { if (!isTRUEorFALSE(use.names)) stop("'use.names' must be TRUE or FALSE") ans <- lapply_CompressedList(x, newNormalIRangesFromIRanges) if (use.names) names(ans) <- names(x) ans } setMethod("as.list", "CompressedNormalIRangesList", .as.list.CompressedNormalIRangesList) ### Coercion from IntegerRangesList to NormalIRangesList. setAs("NormalIRangesList", "CompressedNormalIRangesList", function(from) { ans <- as(from, "CompressedIRangesList", strict=FALSE) class(ans) <- "CompressedNormalIRangesList" ans } ) setAs("CompressedIRangesList", "CompressedNormalIRangesList", function(from) { if (!all(isNormal(from))) from <- reduce(from, drop.empty.ranges=TRUE) class(from) <- "CompressedNormalIRangesList" from } ) setAs("IntegerRangesList", "CompressedNormalIRangesList", function(from) { as(as(from, "CompressedIRangesList", strict=FALSE), "CompressedNormalIRangesList") } ) setAs("IntegerRangesList", "NormalIRangesList", function(from) { if (is(from, "SimpleIntegerRangesList")) as(from, "SimpleNormalIRangesList") else as(from, "CompressedNormalIRangesList") } ) ### Coercion from LogicalList to NormalIRangesList. setAs("LogicalList", "NormalIRangesList", function(from) { if (is(from, "CompressedList")) as(from, "CompressedNormalIRangesList") else as(from, "SimpleNormalIRangesList") }) setAs("LogicalList", "CompressedNormalIRangesList", function(from) new_CompressedList_from_list("CompressedNormalIRangesList", lapply(from, as, "NormalIRanges"), metadata = metadata(from), mcols = mcols(from, use.names=FALSE))) ### Coercion from RleList to NormalIRangesList. setAs("RleList", "NormalIRangesList", function(from) { if (is(from, "CompressedList")) as(from, "CompressedNormalIRangesList") else as(from, "SimpleNormalIRangesList") }) setAs("RleList", "CompressedNormalIRangesList", function(from) { if ((length(from) > 0) && (!is.logical(runValue(from[[1L]])) || S4Vectors:::anyMissing(runValue(from[[1L]])))) stop("cannot coerce a non-logical 'RleList' or a logical 'RleList' ", "with NAs to a CompressedNormalIRangesList object") new_CompressedList_from_list("CompressedNormalIRangesList", lapply(from, as, "NormalIRanges"), metadata = metadata(from), mcols = mcols(from, use.names=FALSE)) }) ### Coercion from IntegerRanges to IPosList. .from_IntegerRanges_to_CompressedIPosList <- function(from) { from <- as(from, "IRanges") ans <- relist(new_StitchedIPos(from), from) mcols(ans) <- mcols(from, use.names=FALSE) metadata(ans) <- metadata(from) ans } setAs("IntegerRanges", "CompressedIPosList", .from_IntegerRanges_to_CompressedIPosList ) setAs("IntegerRanges", "IPosList", .from_IntegerRanges_to_CompressedIPosList ) setAs("IRanges", "IPosList", .from_IntegerRanges_to_CompressedIPosList ) IRanges/R/DataFrameList-class.R0000644000175100017510000004006014626176651017247 0ustar00biocbuildbiocbuild### ========================================================================= ### DataFrameList objects ### ------------------------------------------------------------------------- setClass("DataFrameList", contains="List", representation("VIRTUAL"), prototype(elementType="DataFrame") ) setClass("DFrameList", contains="DataFrameList", representation("VIRTUAL"), prototype(elementType="DFrame") ) setClass("SimpleDataFrameList", contains=c("DataFrameList", "SimpleList"), representation("VIRTUAL") ) setClass("SimpleDFrameList", contains=c("DFrameList", "SimpleDataFrameList") ) setClass("SplitDataFrameList", contains="DataFrameList", representation("VIRTUAL") ) setClass("SplitDFrameList", contains=c("DFrameList", "SplitDataFrameList"), representation("VIRTUAL") ) setClass("SimpleSplitDataFrameList", contains=c("SplitDataFrameList", "SimpleDataFrameList"), representation("VIRTUAL") ) setClass("SimpleSplitDFrameList", contains=c("SplitDFrameList", "SimpleDFrameList", "SimpleSplitDataFrameList") ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Accessor methods. ### ### Deprecated. ### IMPORTANT NOTE: We won't be able to go thru the Defunct cycle because ### a lot of code around assumes that nrow() can be called on an arbitrary ### object! setMethod("nrow", "DataFrameList", function(x) { msg <- c("The nrow() method for DataFrameList objects is ", "deprecated. Please use nrows() on these objects ", "instead.") .Deprecated(msg=wmsg(msg)) if (length(x) == 0L) 0L else elementNROWS(x) }) setMethod("nrows", "DataFrameList", function(x, use.names=TRUE) { if (!isTRUEorFALSE(use.names)) stop(wmsg("'use.names' must be TRUE or FALSE")) ans <- elementNROWS(x) if (!use.names) names(ans) <- NULL ans } ) ### Deprecated. ### IMPORTANT NOTE: We won't be able to go thru the Defunct cycle because ### a lot of code around assumes that ncol() can be called on an arbitrary ### object! setMethod("ncol", "DataFrameList", function(x) { msg <- c("The ncol() method for DataFrameList objects is ", "deprecated. Please use ncols() on these objects ", "instead.") .Deprecated(msg=wmsg(msg)) if (length(x) == 0L) 0L else unlist(lapply(x, ncol)) }) setMethod("ncols", "DataFrameList", function(x, use.names=TRUE) { if (!isTRUEorFALSE(use.names)) stop(wmsg("'use.names' must be TRUE or FALSE")) vapply(x, ncol, integer(1), USE.NAMES=use.names) } ) ### Deprecated. ### IMPORTANT NOTE: We won't be able to go thru the Defunct cycle because ### a lot of code around assumes that ncol() can be called on an arbitrary ### object! setMethod("ncol", "SimpleSplitDataFrameList", function(x) { msg <- c("The ncol() method for SimpleSplitDataFrameList objects ", "is deprecated. Please use ncols() on these objects ", "instead.") .Deprecated(msg=wmsg(msg)) if (length(x) == 0L) 0L else structure(rep.int(ncol(x[[1L]]), length(x)), names = names(x)) }) setMethod("ncols", "SimpleSplitDataFrameList", function(x, use.names=TRUE) { if (!isTRUEorFALSE(use.names)) stop(wmsg("'use.names' must be TRUE or FALSE")) ans_names <- if (use.names) names(x) else NULL structure(rep.int(ncol(x[[1L]]), length(x)), names=ans_names) } ) ### Deprecated. ### IMPORTANT NOTE: We won't be able to go thru the Defunct cycle because ### a lot of code around assumes that dim() can be called on an arbitrary ### object e.g. in S4Vectors:::.NSBS.character_OR_factor(). setMethod("dim", "DataFrameList", function(x) { msg <- c("The dim() method for DataFrameList objects is ", "deprecated. Please use dims() on these objects ", "instead.") .Deprecated(msg=wmsg(msg)) cbind(nrow(x), ncol(x)) }) setMethod("dims", "DataFrameList", function(x, use.names=TRUE) { cbind(nrows(x, use.names=use.names), ncols(x, use.names=FALSE)) } ) setMethod("rownames", "DataFrameList", function(x, do.NULL = TRUE, prefix = "row") { CharacterList(lapply(x, rownames, do.NULL = do.NULL, prefix = prefix)) }) setMethod("colnames", "DataFrameList", function(x, do.NULL = TRUE, prefix = "col") { CharacterList(lapply(x, colnames, do.NULL = do.NULL, prefix = prefix)) }) setMethod("colnames", "SplitDataFrameList", function(x, do.NULL = TRUE, prefix = "col") { if (length(x)) { nms <- colnames(x[[1]], do.NULL = do.NULL, prefix = prefix) rep(CharacterList(nms), length(x)) } else NULL }) setMethod("dimnames", "DataFrameList", function(x) { list(rownames(x), colnames(x)) }) setReplaceMethod("rownames", "SimpleDataFrameList", function(x, value) { if (is.null(value) || is(value, "CharacterList")) { if (is.null(value)) value <- list(NULL) else if (length(x) != length(value)) stop("replacement value must be the same length as x") x@listData <- mapply(function(y, rn) {rownames(y) <- rn; y}, x@listData, value, SIMPLIFY=FALSE) } else { stop("replacement value must be NULL or a CharacterList") } x }) setReplaceMethod("colnames", "SimpleDataFrameList", function(x, value) { if (is.null(value)) { x@listData <- lapply(x@listData, function(y) {colnames(y) <- NULL; y}) } else if (is.character(value)) { for (i in seq_len(length(x))) colnames(x@listData[[i]]) <- value } else if (is(value, "CharacterList")){ if (length(x) != length(value)) stop("replacement value must be the same length as x") for (i in seq_len(length(x))) colnames(x@listData[[i]]) <- value[[i]] } else { stop("replacement value must either be NULL or a CharacterList") } x }) setReplaceMethod("dimnames", "DataFrameList", function(x, value) { if (!is.list(value)) stop("replacement value must be a list") rownames(x) <- value[[1L]] colnames(x) <- value[[2L]] x }) ### NROW(x) and ROWNAMES(x) need to retun length(x) and names(x), ### respectively, on a DataFrameList object, but the default methods ### return dim(x)[1L] and rownames(x), which is not what we want. ### So we need to override them. setMethod("NROW", "DataFrameList", function(x) length(x)) setMethod("ROWNAMES", "DataFrameList", function(x) names(x)) setReplaceMethod("ROWNAMES", "DataFrameList", function(x, value) { names(x) <- value x }) setGeneric("commonColnames", function(x) standardGeneric("commonColnames")) setMethod("commonColnames", "SimpleSplitDataFrameList", function(x) { if (length(x)) colnames(x[[1]]) else NULL }) setGeneric("commonColnames<-", function(x, value) standardGeneric("commonColnames<-")) setReplaceMethod("commonColnames", "SplitDataFrameList", function(x, value) { colnames(x) <- value x }) setGeneric("columnMetadata", function(x, ...) standardGeneric("columnMetadata")) setMethod("columnMetadata", "SimpleSplitDataFrameList", function(x) { if (length(x)) mcols(x[[1]], use.names=FALSE) else NULL }) setGeneric("columnMetadata<-", function(x, ..., value) standardGeneric("columnMetadata<-")) setReplaceMethod("columnMetadata", "SimpleSplitDataFrameList", function(x, value) { x@listData <- lapply(x@listData, function(xi) { mcols(xi) <- value xi }) x }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Validity. ### .valid.SimpleSplitDataFrameList <- function(x) { if (length(x)) { firstNames <- colnames(x[[1L]]) l <- as.list(x, use.names = FALSE) if (!all(sapply(l, function(df) identical(firstNames, colnames(df))))) return("column counts or names differ across elements") firstMetaData <- mcols(x[[1L]], use.names=FALSE) # could be NULL if (!all(sapply(l, function(df) { identical(firstMetaData, mcols(df, use.names=FALSE)) }))) return("metadata columns must be identical across elements") } NULL } setValidity2("SimpleSplitDataFrameList", .valid.SimpleSplitDataFrameList) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructor. ### DataFrameList <- function(...) { listData <- list(...) if (length(listData) == 1 && is.list(listData[[1L]]) && !is.data.frame(listData[[1L]])) listData <- listData[[1L]] if (length(listData) > 0 && !is(listData[[1L]], "DFrame")) listData <- lapply(listData, as, "DFrame") S4Vectors:::new_SimpleList_from_list("SimpleDFrameList", listData) } SplitDataFrameList <- function(..., compress = TRUE, cbindArgs = FALSE) { if (!isTRUEorFALSE(compress)) stop("'compress' must be TRUE or FALSE") listData <- list(...) if (length(listData) == 1 && (is.list(listData[[1L]]) || is(listData[[1L]], "List")) && !(is.data.frame(listData[[1L]]) || is(listData[[1L]], "DFrame"))) listData <- listData[[1L]] if (cbindArgs) { if (is.null(names(listData))) names(listData) <- paste("X", seq_len(length(listData)), sep = "") listData <- do.call(Map, c(list(DataFrame), listData)) } as(listData, if (compress) "CompressedSplitDFrameList" else "SimpleSplitDFrameList") } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Subsetting. ### setMethod("[", "SimpleSplitDataFrameList", function(x, i, j, ..., drop=TRUE) { if (!missing(j)) x@listData <- lapply(x@listData, function(y) y[,j,drop=FALSE]) if (!missing(i)) x <- callNextMethod(x, i) if (((nargs() - !missing(drop)) > 2) && (length(x@listData) > 0) && (ncol(x@listData[[1L]]) == 1) && (missing(drop) || drop)) { x <- as(lapply(x@listData, "[[", 1), "List") } x }) setMethod("normalizeSingleBracketReplacementValue", "SplitDataFrameList", function(value, x) { value <- callNextMethod() # call default method if (length(x) != 0L && ncols(x)[[1L]] == ncols(value)[[1L]]) colnames(value)[[1L]] <- colnames(x)[[1L]] value } ) setReplaceMethod("[", "SplitDataFrameList", function(x, i, j,..., value) { if (length(list(...)) > 0L) stop("invalid replacement") value <- normalizeSingleBracketReplacementValue(value, x) if (missing(j)) { if (missing(i)) ans <- callNextMethod(x=x, value=value) else ans <- callNextMethod(x=x, i=i, value=value) return(ans) } colind <- setNames(seq_along(commonColnames(x)), commonColnames(x)) if (missing(i) && is.character(j)) { colnames(value) <- j } j <- normalizeSingleBracketSubscript(j, colind, allow.append=missing(i)) if (missing(i)) { y <- value } else { y <- x[, j, drop=FALSE] if (is.list(i) || (is(i, "List") && !is(i, "IntegerRanges"))) { y <- S4Vectors:::lsubset_List_by_List(y, i, value) } else { y[i] <- value } } if (length(y) < length(x)) { y <- rep(y, length.out=length(x)) } if (is(x, "CompressedList")) { x_eltNROWS <- elementNROWS(x) y_eltNROWS <- elementNROWS(y) if (any(x_eltNROWS != y_eltNROWS)) { indices <- IRanges(start(y@partitioning), width=y_eltNROWS) indices <- rep(indices, x_eltNROWS / y_eltNROWS) if (sum(width(indices)) != sum(x_eltNROWS)) { stop("some element lengths of 'x' are not multiples of the ", "corresponding element lengths of 'value'") } y@unlistData <- y@unlistData[indices, , drop=FALSE] } x@unlistData[, j] <- y@unlistData } else if (is(x, "SimpleList")) { indices <- structure(seq_len(length(x)), names = names(x)) x@listData <- lapply(indices, function(k) { z <- x@listData[[k]] z[j] <- y[[k]] z }) } else { stop(class(x), " objects not supported") } x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion ### setAs("ANY", "DataFrameList", function(from) as(from, "DFrameList") ) setAs("ANY", "SimpleDataFrameList", function(from) as(from, "SimpleDFrameList") ) setAs("ANY", "SplitDataFrameList", function(from) as(from, "SplitDFrameList") ) setAs("ANY", "SimpleSplitDataFrameList", function(from) as(from, "SimpleSplitDFrameList") ) ## Casting DataFrameList -> DFrame implies cast to SplitDataFrameList setAs("DataFrameList", "DFrame", function(from) { as(as(from, "SplitDFrameList"), "DFrame") }) setAs("SplitDataFrameList", "DFrame", function(from) { cols <- sapply(commonColnames(from), function(j) from[,j], simplify=FALSE) DataFrame(cols, check.names=FALSE) } ) setAs("list", "SplitDFrameList", function(from) as(from, "SimpleSplitDFrameList")) setAs("SimpleList", "SplitDFrameList", function(from) as(from, "SimpleSplitDFrameList")) setAs("ANY", "SimpleSplitDFrameList", function(from) { new("SimpleSplitDFrameList", as(from, "SimpleDFrameList")) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Display ### setMethod("classNameForDisplay", "SimpleDFrameList", function(x) sub("^Simple", "", sub("DFrame", "DataFrame", class(x))) ) setMethod("show", "SplitDataFrameList", function(object) { k <- length(object) cumsumN <- cumsum(elementNROWS(object)) N <- tail(cumsumN, 1) cat(classNameForDisplay(object), " of length ", k, "\n", sep = "") if (k == 0L) { cat("<0 elements>\n") } else if ((k == 1L) || (N <= 20L)) { show(as.list(object)) } else { sketch <- function(x) c(head(x, 3), "...", tail(x, 3)) if (k >= 3 && cumsumN[3L] <= 20) showK <- 3 else if (k >= 2 && cumsumN[2L] <= 20) showK <- 2 else showK <- 1 diffK <- k - showK show(as.list(head(object, showK))) if (diffK > 0) cat("...\n<", k - showK, ifelse(diffK == 1, " more element>\n", " more elements>\n"), sep="") } }) IRanges/R/DataFrameList-utils.R0000644000175100017510000000360614626176651017307 0ustar00biocbuildbiocbuild### ========================================================================= ### DataFrameList utilities ### ------------------------------------------------------------------------- ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Combining. ### setMethod("cbind", "DataFrameList", function(..., deparse.level=1) mendoapply(cbind, ...)) setMethod("rbind", "DataFrameList", function(..., deparse.level=1) mendoapply(rbind, ...)) setMethod("stack", "DataFrameList", function(x, index.var = "name") { DataFrame(S4Vectors:::stack_index(x, index.var), unlist(x, use.names=FALSE), row.names = unlist(lapply(x, rownames))) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Transforming. ### setClass("SDFLWrapperForTransform", representation(delegate = "SplitDataFrameList"), contains="Vector") setMethod("colnames", "SDFLWrapperForTransform", function(x) { commonColnames(x@delegate) }) setMethod("[[", "SDFLWrapperForTransform", function (x, i, j, ...) { x@delegate[,i] }) setReplaceMethod("[[", "SDFLWrapperForTransform", function(x, i, j, ..., value) { x@delegate[,i] <- value x }) setMethod(S4Vectors:::`column<-`, "SDFLWrapperForTransform", function(x, name, value) { x[[name]] <- value x }) setMethod("as.env", "SDFLWrapperForTransform", function(x, enclos = parent.frame(2)) { env <- S4Vectors:::makeEnvForNames(x, colnames(x), enclos) S4Vectors:::addSelfRef(x@delegate, env) }) transform.SplitDataFrameList <- function(`_data`, ...) { illConceivedWrapper <- new("SDFLWrapperForTransform", delegate=`_data`) transform.DataFrame(illConceivedWrapper, ...)@delegate } setMethod("transform", "SplitDataFrameList", transform.SplitDataFrameList) IRanges/R/Grouping-class.R0000644000175100017510000007527714626176651016403 0ustar00biocbuildbiocbuild### ========================================================================= ### Grouping objects ### ------------------------------------------------------------------------- ### ### We call "grouping" an arbitrary mapping from a collection of NO objects ### to a collection of NG groups, or, more formally, a bipartite graph ### between integer sets [1, NO] and [1, NG]. Objects mapped to a given group ### are said to belong to, or to be assigned to, or to be in that group. ### Additionally, the objects in each group are ordered. So for example the ### 2 following groupings are considered different: ### ### Grouping 1: NG = 3, NO = 5 ### group objects ### 1 : 4, 2 ### 2 : ### 3 : 4 ### ### Grouping 2: NG = 3, NO = 5 ### group objects ### 1 : 2, 4 ### 2 : ### 3 : 4 ### ### There are no restriction on the mapping e.g. any object can be mapped ### to 0, 1, or more groups, and can be mapped twice to the same group. Also ### some or all the groups can be empty. ### ### The Grouping class is a virtual class that formalizes the most general ### kind of grouping. More specific groupings (e.g. many-to-one mappings) ### are formalized via specific Grouping subclasses. setClass("Grouping", contains="IntegerList", representation("VIRTUAL")) setGeneric("nobj", function(x) standardGeneric("nobj")) setGeneric("grouplengths", signature="x", function(x, i=NULL) standardGeneric("grouplengths") ) .subset_by_integer <- function(x, i=NULL) { if (is.null(i)) return(x) if (!is.numeric(i)) stop(wmsg("subscript must be NULL or an integer vector")) if (!is.integer(i)) i <- as.integer(i) x_len <- length(x) if (S4Vectors:::anyMissingOrOutside(i, -x_len, x_len)) stop(wmsg("subscript contains NAs or out of bounds indices")) x[i] } setMethod("grouplengths", "Grouping", function(x, i=NULL) { x_grouplens <- elementNROWS(x) .subset_by_integer(x_grouplens, i) } ) setMethod("show", "Grouping", function(object) { NG <- length(object) NO <- nobj(object) cat(class(object), " with ", NG, ifelse(NG == 1, " group ", " groups "), "and ", NO, ifelse(NO == 1, " object\n", " objects\n"), sep="") if (NG == 0L) return(invisible(NULL)) empty_groups <- which(grouplengths(object) == 0L) cat("Nb of empty groups: ", length(empty_groups), " (", 100.00 * length(empty_groups) / NG, "%)\n", sep="") } ) ### ------------------------------------------------------------------------- ### ManyToOneGrouping objects ### ------------------------- ### A ManyToOneGrouping object represents a grouping where every object in ### the collection belongs to exactly one group. setClass("ManyToOneGrouping", contains="Grouping", representation("VIRTUAL")) setMethod("nobj", "ManyToOneGrouping", function(x) sum(grouplengths(x))) setGeneric("members", signature="x", function(x, i) standardGeneric("members") ) setMethod("members", "ManyToOneGrouping", function(x, i) { if (!is.numeric(i)) stop(wmsg("subscript 'i' must be a vector of integers")) if (!is.integer(i)) i <- as.integer(i) sort(unlist(sapply(i, function(ii) x[[ii]]))) } ) setGeneric("vmembers", signature="x", function(x, L) standardGeneric("vmembers") ) setMethod("vmembers", "ManyToOneGrouping", function(x, L) { if (!is.list(L)) stop(wmsg("'L' must be a list of integer vectors")) lapply(L, function(i) members(x, i)) } ) setGeneric("togroup", signature="x", function(x, j=NULL) standardGeneric("togroup") ) ### Works on any ManyToOneGrouping object 'x' for which unlist() and ### elementNROWS() work. setMethod("togroup", "ManyToOneGrouping", function(x, j=NULL) { x_togroup <- unlist(x, use.names=FALSE) x_eltNROWS <- elementNROWS(x) x_togroup[x_togroup] <- rep.int(seq_along(x_eltNROWS), x_eltNROWS) .subset_by_integer(x_togroup, j) } ) setGeneric("togrouplength", signature="x", function(x, j=NULL) standardGeneric("togrouplength") ) setMethod("togrouplength", "ManyToOneGrouping", function(x, j=NULL) grouplengths(x, togroup(x, j)) ) ### ------------------------------------------------------------------------- ### ManyToManyGrouping objects ### ------------------------- ### A ManyToManyGrouping object represents a grouping where objects ### can map to any number of groups. setClass("ManyToManyGrouping", contains="Grouping", representation("VIRTUAL")) ### ------------------------------------------------------------------------- ### BiIndexGrouping objects ### ----------------------- #setClass("BiIndexGrouping", # contains="ManyToOneGrouping", # representation( # group2object="list", # object2group="integer" # ) #) #setMethod("length", "BiIndexGrouping", function(x) length(x@group2object)) #setMethod("nobj", "BiIndexGrouping", function(x) length(x@object2group)) ### ------------------------------------------------------------------------- ### H2LGrouping and Dups objects ### ---------------------------- ### ### High-to-Low Index ManyToOneGrouping objects. ### setClass("H2LGrouping", contains="ManyToOneGrouping", representation( high2low="integer", low2high="list" ) ) ### For storing the grouping implicitly defined by the "duplicated" ### relationship between elements of an arbitrary vector. setClass("Dups", contains="H2LGrouping") ### Two additional accessors for H2LGrouping objects. setGeneric("high2low", function(x) standardGeneric("high2low")) setMethod("high2low", "H2LGrouping", function(x) x@high2low) setGeneric("low2high", function(x) standardGeneric("low2high")) setMethod("low2high", "H2LGrouping", function(x) x@low2high) ### 'length(x)' and 'nobj(x)' are the same. setMethod("length", "H2LGrouping", function(x) length(x@low2high)) setMethod("nobj", "H2LGrouping", function(x) length(x@high2low)) setMethod("getListElement", "H2LGrouping", function(x, i, exact=TRUE) { i <- normalizeDoubleBracketSubscript(i, x, exact=exact) if (is.na(x@high2low[i])) c(i, x@low2high[[i]]) else integer() } ) ### Should be more efficient than the default method for ManyToOneGrouping ### objects. setMethod("grouplengths", "H2LGrouping", function(x, i=NULL) { x_grouplens <- elementNROWS(x@low2high) + 1L x_grouplens[!is.na(x@high2low)] <- 0L .subset_by_integer(x_grouplens, i) } ) setMethod("members", "H2LGrouping", function(x, i) { if (!is.numeric(i)) stop(wmsg("subscript 'i' must be a vector of integers")) if (!is.integer(i)) i <- as.integer(i) ## NAs and "subscript out of bounds" are checked at the C level .Call2("C_members_H2LGrouping", x, i, PACKAGE="IRanges") } ) setMethod("vmembers", "H2LGrouping", function(x, L) { if (!is.list(L)) stop(wmsg("'L' must be a list of integer vectors")) .Call2("C_vmembers_H2LGrouping", x, L, PACKAGE="IRanges") } ) setMethod("togroup", "H2LGrouping", function(x, j=NULL) { x_togroup <- x@high2low x_togroup[is.na(x_togroup)] <- which(is.na(x_togroup)) .subset_by_integer(x_togroup, j) } ) ### The default method should be as good (if not better) as this. #setMethod("togrouplength", "H2LGrouping", # function(x) # { # ans <- rep.int(1L, length(x)) # mapped_lows <- setdiff(unique(x@high2low), NA) # for (low in mapped_lows) { # ii <- as.integer(c(low, x@low2high[[low]])) # ans[ii] <- length(ii) # } # ans # } #) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### More operations on H2LGrouping objects. These operations are NOT part of ### the core Grouping API. ### ### The rank of group G_i is the number of non-empty groups that are before ### G_i plus one. Or, equivalently, it's the number of non-empty groups with ### an index <= i. setGeneric("grouprank", signature="x", function(x, i=NULL) standardGeneric("grouprank") ) setMethod("grouprank", "H2LGrouping", function(x, i=NULL) { ans <- cumsum(is.na(high2low(x))) if (!is.null(i)) ans <- ans[i] return(ans) } ) ### togrouprank() returns the mapping from objects to group ranks. ### An important property of togrouprank() is that: ### togrouprank(x, neg_idx) ### and ### seq_along(neg_idx) ### are identical, where 'neg_idx' is the vector of the indices of ### the non-empty groups i.e. ### neg_idx <- which(grouplengths(x) != 0L) setGeneric("togrouprank", signature="x", function(x, j=NULL) standardGeneric("togrouprank") ) setMethod("togrouprank", "H2LGrouping", function(x, j=NULL) { to_group <- togroup(x) group_rank <- grouprank(x) ans <- group_rank[to_group] if (!is.null(j)) ans <- ans[j] return(ans) } ) setReplaceMethod("length", "H2LGrouping", function(x, value) { if (!isSingleNumber(value)) stop(wmsg("length must be a single integer")) if (!is.integer(value)) value <- as.integer(value) if (value < 0L) stop(wmsg("length cannot be negative")) if (value > length(x)) stop(wmsg("cannot make a ", class(x), " instance longer")) length(x@high2low) <- value x@low2high <- S4Vectors:::reverseSelfmatchMapping(x@high2low) x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Validity. ### .valid.H2LGrouping <- function(x) { if (!is.integer(x@high2low)) return("the 'high2low' slot must contain an integer vector") if (!all(x@high2low >= 1L, na.rm=TRUE)) return("the 'high2low' slot must contain integer values >= 1") if (!all(x@high2low < seq_along(x@high2low), na.rm=TRUE)) { problem <- c("when mapped, elements in the 'high2low' slot must be mapped ", "to elements at a lower position") return(paste(problem, collapse="")) } if (!all(is.na(x@high2low[x@high2low]))) { problem <- c("when mapped, elements in the 'high2low' slot must be mapped ", "to unmapped elements") return(paste(problem, collapse="")) } if (!is.list(x@low2high)) return("the 'low2high' slot must contain a list") if (length(x@high2low) != length(x@low2high)) return("the 'high2low' and 'low2high' slots must have the same length") if (!identical(S4Vectors:::reverseSelfmatchMapping(x@high2low), x@low2high)) { problem <- c("the 'low2high' slot must contain the reverse mapping ", "of the 'high2low' slot") return(paste(problem, collapse="")) } NULL } setValidity("H2LGrouping", function(object) { problems <- .valid.H2LGrouping(object) if (is.null(problems)) TRUE else problems } ) ### For Dups objects only. .duplicated.Dups <- function(x, incomparables=FALSE) { if (!identical(incomparables, FALSE)) stop(wmsg("\"duplicated\" method for Dups objects ", "only accepts 'incomparables=FALSE'")) !is.na(high2low(x)) } setMethod("duplicated", "Dups", .duplicated.Dups) ### For Dups objects only. setMethod("show", "Dups", function(object) { percentage <- 100.00 * sum(duplicated(object)) / length(object) cat(class(object), " of length ", length(object), " (", percentage, "% of duplicates)\n", sep="") } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructors. ### .newH2LGrouping <- function(Class, high2low) { if (!is.numeric(high2low)) stop(wmsg("'high2low' must be a vector of integers")) if (!is.integer(high2low)) high2low <- as.integer(high2low) new2(Class, high2low=high2low, low2high=S4Vectors:::reverseSelfmatchMapping(high2low), check=FALSE) } H2LGrouping <- function(high2low=integer()) .newH2LGrouping("H2LGrouping", high2low) Dups <- function(high2low=integer()) .newH2LGrouping("Dups", high2low) setMethod("high2low", "ANY", function(x) { ans <- selfmatch(x) ans[ans == seq_along(x)] <- NA_integer_ ans } ) ### ------------------------------------------------------------------------- ### GroupingRanges objects ### ---------------------- ### ### A GroupingRanges object represents a "block-grouping", that is, a ### grouping where each group is a block of adjacent elements in the original ### collection of objects. GroupingRanges objects support the IPosRanges ### API (e.g. start/end/width) in addition to the Grouping API. ### setClass("GroupingRanges", ## We put IPosRanges before Grouping so GroupingRanges objects inherit ## the "show" method for IPosRanges objects instead of the method for ## Grouping objects. contains=c("IPosRanges", "Grouping"), representation("VIRTUAL") ) ### Overwrite default method with optimized method for GroupingRanges objects. setMethod("grouplengths", "GroupingRanges", function(x, i=NULL) { x_width <- width(x) .subset_by_integer(x_width, i) } ) setClass("GroupingIRanges", contains=c("IRanges", "GroupingRanges")) ### ------------------------------------------------------------------------- ### Partitioning objects ### -------------------- ### ### A Partitioning object is a GroupingRanges object where the blocks fully ### cover the original collection of objects and don't overlap. This makes ### them many-to-one groupings. Furthermore, the blocks must be ordered by ### ascending position on the original collection of objects. ### Note that for a Partitioning object 'x', 'togroup(x)' is sorted in ### increasing order (not necessarily strictly increasing). ### ### The Partitioning class is virtual with 2 concrete direct subclasses: ### PartitioningByEnd and PartitioningByWidth. ### setClass("Partitioning", contains=c("GroupingRanges", "ManyToOneGrouping"), representation( "VIRTUAL", NAMES="character_OR_NULL" # R doesn't like @names !! ), prototype( NAMES=NULL ) ) setMethod("parallel_slot_names", "Partitioning", function(x) c("NAMES", callNextMethod()) ) setMethod("extractROWS", "Partitioning", function(x, i) { i <- normalizeSingleBracketSubscript(i, x) if (!isStrictlySorted(i)) stop(wmsg("Partitioning objects only support subsetting by a ", "strictly sorted subscript that drops empty partitions")) x_nobj <- nobj(x) ans <- callNextMethod() if (nobj(ans) != x_nobj) stop(wmsg("Partitioning objects only support subsetting by a ", "strictly sorted subscript that drops empty partitions")) ans } ) setMethod("bindROWS", "Partitioning", function(x, objects=list(), use.names=TRUE, ignore.mcols=FALSE, check=TRUE) stop(wmsg("Partitioning objects don't support concatenation")) ) ### The default methods below assume that the "length + start/end/width" API ### is already implemented. setMethod("getListElement", "Partitioning", function(x, i, exact=TRUE) { i <- normalizeDoubleBracketSubscript(i, x, exact=exact) ## The purpose of the code below is to extract 'start(x)[i] - 1' ## (stored in 'ans_shift') and 'width(x)[i]' (stored in 'ans_len') ## in the fastest possible way. Looks like a convoluted way to ## extract those 2 values but it is actually 1000x faster than the ## naive way. ans_shift <- 0L ans_len <- end(x)[i] if (i >= 2L) { ans_shift <- end(x)[i - 1L] ans_len <- ans_len - ans_shift } seq_len(ans_len) + ans_shift } ) ### Overwrite method for ManyToOneGrouping objects with optimized method for ### Partitioning objects. setMethod("togroup", "Partitioning", function(x, j=NULL) { x_width <- width(x) x_togroup <- rep.int(seq_along(x_width), x_width) .subset_by_integer(x_togroup, j) } ) setMethod("names", "Partitioning", function(x) x@NAMES) setReplaceMethod("names", "Partitioning", set_IRanges_names) setMethod("NSBS", "Partitioning", function(i, x, exact=TRUE, strict.upper.bound=TRUE, allow.NAs=FALSE) { i <- range(i) callNextMethod() }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### PartitioningByEnd uses a compact internal representation that allows ### fast mapping from groups to objects. However, it is not efficient for ### mapping from objects to groups. ### setClass("PartitioningByEnd", contains="Partitioning", representation( end="integer" ), prototype( end=integer() ) ) setMethod("parallel_slot_names", "PartitioningByEnd", function(x) c("end", callNextMethod()) ) setMethod("end", "PartitioningByEnd", function(x) x@end) ### Overwrite method for Ranges objects with optimized method for ### PartitioningByEnd objects. setMethod("length", "PartitioningByEnd", function(x) length(end(x))) ### Overwrite method for ManyToOneGrouping objects with optimized method for ### PartitioningByEnd objects. setMethod("nobj", "PartitioningByEnd", function(x) S4Vectors:::last_or(end(x), 0L) ) setMethod("start", "PartitioningByEnd", function(x) { x_end <- end(x) if (length(x_end) == 0L) return(integer()) c(1L, x_end[-length(x_end)] + 1L) } ) setMethod("width", "PartitioningByEnd", function(x) S4Vectors:::diffWithInitialZero(end(x)) ) .valid.PartitioningByEnd <- function(x) { if (!is.integer(end(x))) return("the ends must be integers") if (length(x) == 0L) return(NULL) if (S4Vectors:::anyMissing(end(x))) return("the ends cannot be NAs") if (S4Vectors:::isNotSorted(end(x))) return("the ends must be sorted") if (end(x)[1L] < 0L) return("the ends cannot be negative") if (!is.null(names(end(x)))) return("the ends should not be named") NULL } setValidity2("PartitioningByEnd", .valid.PartitioningByEnd) .numeric2end <- function(x=integer(0), NG=NULL) { if (!is.integer(x)) x <- as.integer(x) if (S4Vectors:::anyMissingOrOutside(x, 0L)) stop(wmsg("when 'x' is an integer vector, ", "it cannot contain NAs or negative values")) if (S4Vectors:::isNotSorted(x)) stop(wmsg("when 'x' is an integer vector, ", "it must be sorted")) if (is.null(NG)) return(x) ## When 'NG' (number of groups) is supplied, then 'x' is considered ## to represent the group assignment of a collection of 'length(x)' ## objects. Therefore the values in 'x' must be >= 1 and <= 'NG'. ## ADDITIONALLY, 'x' must be *sorted* (not strictly) so it can be ## reconstructed from the object returned by PartitioningByEnd() ## by doing togroup() on that object. if (!isSingleNumber(NG)) stop(wmsg("'NG' must be either NULL or a single integer")) if (!is.integer(NG)) NG <- as.integer(NG) NO <- length(x) # nb of objects if (NG == 0L) { if (NO != 0L) stop(wmsg("when 'NG' is 0, 'x' must be of length 0")) } else { ## 'x' is expected to be non-decreasing and with values >= 1 ## and <= 'NG'. x <- cumsum(tabulate(x, nbins=NG)) ## 'x[NG]' is guaranteed to be <= 'NO'. if (x[NG] != NO) stop(wmsg("when 'NG' is supplied, values in 'x' must ", "be >= 1 and <= 'NG'")) } x } .prepare_Partitioning_names <- function(names, ans_len, NG, x_names) { if (!is.null(names)) { if (!is.character(names) || length(names) != ans_len) stop(wmsg("'names' must be either NULL or a character vector ", "of length 'NG' (if supplied) or 'length(x)' ", "(if 'NG' is not supplied)")) return(names) } if (is.null(NG)) return(x_names) # should be of length 'ans_len' NULL } PartitioningByEnd <- function(x=integer(0), NG=NULL, names=NULL) { if (is(x, "List") || is.list(x)) { if (!is.null(NG)) warning(wmsg("'NG' argument is ignored when 'x' ", "is a list-like object")) if (is(x, "CompressedList")) { ## Behaves like a getter for the 'partitioning' slot. ans <- x@partitioning if (!is.null(names)) names(ans) <- names return(ans) } if (is(x, "PartitioningByEnd")) { if (!is.null(names)) names(x) <- names return(x) } x_names <- names(x) if (length(x) == 0L) { ans_end <- integer(0) } else { x_NROWS <- elementNROWS(x) ans_end <- suppressWarnings(cumsum(x_NROWS)) if (is.na(ans_end[[length(ans_end)]])) stop(wmsg(class(x)[[1L]], " object 'x' is too big (the ", "cumulated length of its list elements is >= 2^32)")) } } else { if (!is.numeric(x)) stop(wmsg("'x' must be either a list-like object or ", "a sorted vector of non-NA non-negative integers")) x_names <- names(x) ans_end <- .numeric2end(x, NG) } ans_names <- .prepare_Partitioning_names(names, length(ans_end), NG, x_names) new2("PartitioningByEnd", end=unname(ans_end), NAMES=ans_names, check=FALSE) } setAs("IntegerRanges", "PartitioningByEnd", function(from) { ans <- PartitioningByEnd(end(from), names=names(from)) if (!identical(start(ans), start(from))) stop(wmsg("the IntegerRanges object to coerce does not ", "represent a partitioning")) ans } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### PartitioningByWidth uses a compact internal representation too. Storing ### the widths instead of the ends would allow the total number of objects ### (nobj(x)) to be greater than 2^31-1 but note that some methods will break ### when this happens, e.g. nobj, end, etc... ### setClass("PartitioningByWidth", contains="Partitioning", representation( width="integer" ), prototype( width=integer() ) ) setMethod("parallel_slot_names", "PartitioningByWidth", function(x) c("width", callNextMethod()) ) setMethod("width", "PartitioningByWidth", function(x) x@width) ### Overwrite method for Ranges objects with optimized method for ### PartitioningByWidth objects. setMethod("length", "PartitioningByWidth", function(x) length(width(x))) setMethod("end", "PartitioningByWidth", function(x) cumsum(width(x))) setMethod("start", "PartitioningByWidth", function(x) { x_width <- width(x) if (length(x_width) == 0L) return(integer()) c(1L, cumsum(x_width[-length(x_width)]) + 1L) } ) .valid.PartitioningByWidth <- function(x) { if (!is.integer(width(x))) return("the widths must be integers") if (length(x) == 0L) return(NULL) if (S4Vectors:::anyMissingOrOutside(width(x), 0L)) return("the widths cannot be NAs or negative") if (!is.null(names(width(x)))) return("the widths should not be named") NULL } setValidity2("PartitioningByWidth", .valid.PartitioningByWidth) .numeric2width <- function(x=integer(0), NG=NULL) { if (!is.integer(x)) x <- as.integer(x) if (S4Vectors:::anyMissingOrOutside(x, 0L)) stop(wmsg("when 'x' is an integer vector, ", "it cannot contain NAs or negative values")) if (is.null(NG)) return(x) ## When 'NG' (number of groups) is supplied, then 'x' is considered ## to represent the group assignment of a collection of 'length(x)' ## objects. Therefore the values in 'x' must be >= 1 and <= 'NG'. ## ADDITIONALLY, 'x' must be *sorted* (not strictly) so it can be ## reconstructed from the object returned by PartitioningByWidth() ## by doing togroup() on that object. if (S4Vectors:::isNotSorted(x)) stop(wmsg("when 'x' is an integer vector, it must be sorted")) if (!isSingleNumber(NG)) stop(wmsg("'NG' must be either NULL or a single integer")) if (!is.integer(NG)) NG <- as.integer(NG) NO <- length(x) # nb of objects if (NG == 0L) { if (NO != 0L) stop(wmsg("when 'NG' is 0, 'x' must be of length 0")) } else { ## 'x' is expected to be non-decreasing and with values >= 1 ## and <= 'NG'. x <- tabulate(x, nbins=NG) ## 'sum(x)' is guaranteed to be <= 'NO'. if (sum(x) != NO) stop(wmsg("when 'NG' is supplied, values in 'x' must ", "be >= 1 and <= 'NG'")) } x } PartitioningByWidth <- function(x=integer(0), NG=NULL, names=NULL) { if (is(x, "List") || is.list(x)) { if (!is.null(NG)) warning(wmsg("'NG' argument is ignored when 'x' ", "is a list-like object")) if (is(x, "PartitioningByWidth")) { if (!is.null(names)) names(x) <- names return(x) } x_names <- names(x) ans_width <- elementNROWS(x) } else { if (!is.numeric(x)) stop(wmsg("'x' must be either a list-like object or ", "a vector of non-NA non-negative integers")) x_names <- names(x) ans_width <- .numeric2width(x, NG) } ans_names <- .prepare_Partitioning_names(names, length(ans_width), NG, x_names) new2("PartitioningByWidth", width=unname(ans_width), NAMES=ans_names, check=FALSE) } setAs("IntegerRanges", "PartitioningByWidth", function(from) { ans <- PartitioningByWidth(width(from), names(from)) if (!identical(start(ans), start(from))) stop(wmsg("the IntegerRanges object to coerce does not ", "represent a partitioning")) ans } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### PartitioningMap contains PartitioningByEnd and one additional slot, ### 'mapOrder', to specify a different order. This object is used by the ### pack() function in GenomicFiles and is put in @partitioning of a ### GRangesList of pack()ed ranges. 'mapOrder' records the order of the ### unpacked() ranges. ### setClass("PartitioningMap", contains="PartitioningByEnd", representation( mapOrder="integer" ), prototype( mapOrder=integer() ) ) setGeneric("mapOrder", function(x) standardGeneric("mapOrder")) setMethod("mapOrder", "PartitioningMap", function(x) x@mapOrder) .valid.PartitioningMap <- function(x) { if (length(x) == 0L) return(NULL) if (S4Vectors:::anyMissing(mapOrder(x))) return("mapOrder cannot contain NA values") if (any(mapOrder(x) < 0L)) return("mapOrder values cannot be negative") if (!is.null(names(mapOrder(x)))) return("the mapOrder should not be named") if (length(maporder <- mapOrder(x))) { maxorder <- max(maporder) if (max(maporder) > max(end(x))) return("max mapOrder value must be == max(end(object))") } NULL } setValidity2("PartitioningMap", .valid.PartitioningMap) PartitioningMap <- function(x=integer(), mapOrder=integer(), ...) new("PartitioningMap", PartitioningByEnd(x=x), mapOrder=mapOrder, ...) setAs("PartitioningByEnd", "PartitioningMap", function(from) new("PartitioningMap", from, mapOrder=numeric()) ) setMethod("show", "PartitioningMap", function(object) { cat(class(object), " of length ", length(object), "\n") cat("mapOrder: ", mapOrder(object), "\n") print(PartitioningByEnd(object)) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### findOverlaps() ### ### A simple findOverlaps method that doesn't use NCList but works only ### on a subject with *adjacent* ranges sorted non-decreasingly. ### Can be 30% faster or more than the real findOverlaps() (NCList-based) ### when 'query' is such that 'start(query)' and 'end(query)' are also sorted ### non-decreasingly (which is the case if for example 'query' is a ### Partitioning object). ### TODO: Add a "findOverlaps" method for Partitioning,Partitioning in the ### findOverlaps-methods.R file that calls this. findOverlaps_IntegerRanges_Partitioning <- function(query, subject, hit.empty.query.ranges=FALSE, hit.empty.subject.ranges=FALSE) { if (!is(query, "IntegerRanges")) stop(wmsg("'query' must be an IntegerRanges object")) if (!is(subject, "Partitioning")) stop(wmsg("'subject' must be a Partitioning object")) if (!isTRUEorFALSE(hit.empty.query.ranges) || !isTRUEorFALSE(hit.empty.subject.ranges)) stop(wmsg("'hit.empty.query.ranges' and 'hit.empty.subject.ranges' ", "must be TRUE or FALSE")) q_len <- length(query) q_start <- start(query) q_end <- end(query) s_len <- length(subject) s_end <- end(subject) if (!hit.empty.query.ranges) { q_idx <- which(width(query) != 0L) q_start <- q_start[q_idx] q_end <- q_end[q_idx] } if (!hit.empty.subject.ranges) { s_idx <- which(width(subject) != 0L) s_end <- s_end[s_idx] } vec <- c(0L, s_end) + 0.5 q_start2subject <- findInterval(q_start, vec) q_end2subject <- findInterval(q_end, vec) q_hits <- rep.int(seq_along(q_start), q_end2subject - q_start2subject + 1L) s_hits <- sequence(q_end2subject - q_start2subject + 1L, q_start2subject) ## If 'query' is a Partitioning object, all hits are guaranteed to be ## valid. if (!is(query, "Partitioning")) { ## Remove invalid hits. is_valid <- 1L <= s_hits & s_hits <= length(s_end) q_hits <- q_hits[is_valid] s_hits <- s_hits[is_valid] } ## Remap hits to original query/subject. if (!hit.empty.query.ranges) q_hits <- q_idx[q_hits] if (!hit.empty.subject.ranges) s_hits <- s_idx[s_hits] ## Make and return Hits object. Hits(q_hits, s_hits, q_len, s_len, sort.by.query=TRUE) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Old stuff (deprecated & defunct) ### setMethod("togroup", "ANY", function(x, j=NULL) { msg <- wmsg( "Using togroup() on a ", class(x), " object is defunct. ", "Please use togroup(PartitioningByWidth(...)) instead." ) .Defunct(msg=msg) } ) IRanges/R/Hits-class-leftovers.R0000644000175100017510000000467614626176651017522 0ustar00biocbuildbiocbuild### ========================================================================= ### IMPORTANT NOTE - 4/29/2014 ### Most of the stuff that used to be in the IRanges/R/Hits-class.R file was ### moved to the S4Vectors package (to R/Hits-class.R). ### The stuff that could not be moved there was *temporarily* kept here in ### Hits-class-leftovers.R but will need to find a new home (in S4Vectors ### or in IRanges). ### ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion ### ### Turn SortedByQueryHits object 'from' into a PartitioningByEnd object that ### describes the grouping of hits by query. .from_SortedByQueryHits_to_PartitioningByEnd <- function(from) PartitioningByEnd(queryHits(from), NG=queryLength(from)) setAs("SortedByQueryHits", "PartitioningByEnd", .from_SortedByQueryHits_to_PartitioningByEnd ) setAs("SortedByQueryHits", "Partitioning", .from_SortedByQueryHits_to_PartitioningByEnd ) setAs("SortedByQueryHits", "IntegerRanges", .from_SortedByQueryHits_to_PartitioningByEnd ) setAs("SortedByQueryHits", "IRanges", function(from) as(.from_SortedByQueryHits_to_PartitioningByEnd(from), "IRanges") ) ### Turn SortedByQueryHits object 'from' into a CompressedIntegerList object ### with one list element per element in the original query. .from_SortedByQueryHits_to_CompressedIntegerList <- function(from) { ans_partitioning <- .from_SortedByQueryHits_to_PartitioningByEnd(from) relist(subjectHits(from), ans_partitioning) } setAs("SortedByQueryHits", "CompressedIntegerList", .from_SortedByQueryHits_to_CompressedIntegerList ) setAs("SortedByQueryHits", "IntegerList", .from_SortedByQueryHits_to_CompressedIntegerList ) setAs("SortedByQueryHits", "List", .from_SortedByQueryHits_to_CompressedIntegerList ) .as.list.SortedByQueryHits <- function(x) as.list(.from_SortedByQueryHits_to_CompressedIntegerList(x)) setMethod("as.list", "SortedByQueryHits", .as.list.SortedByQueryHits) .from_Hits_to_CompressedIntegerList <- function(from) { as(as(from, "SortedByQueryHits"), "CompressedIntegerList") } setAs("Hits", "List", .from_Hits_to_CompressedIntegerList) setAs("Hits", "IntegerList", .from_Hits_to_CompressedIntegerList) setAs("Hits", "CompressedIntegerList", .from_Hits_to_CompressedIntegerList) setMethod("as.list", "Hits", function(x) as.list(as(x, "SortedByQueryHits"))) setAs("Hits", "Grouping", function(from) ManyToManyGrouping(as(from, "List"), nobj=nRnode(from))) IRanges/R/IPos-class.R0000644000175100017510000005221214626176651015443 0ustar00biocbuildbiocbuild### ========================================================================= ### IPos objects ### ------------------------------------------------------------------------- ### setClass("IPos", contains=c("Pos", "IPosRanges"), representation( "VIRTUAL", NAMES="character_OR_NULL" # R doesn't like @names !! ) ) ### Combine the new "parallel slots" with those of the parent class. Make ### sure to put the new parallel slots **first**. See R/Vector-class.R file ### in the S4Vectors package for what slots should or should not be considered ### "parallel". setMethod("parallel_slot_names", "IPos", function(x) c("NAMES", callNextMethod()) ) setClass("UnstitchedIPos", contains="IPos", representation( pos="integer" ) ) ### Combine the new "parallel slots" with those of the parent class. Make ### sure to put the new parallel slots **first**. See R/Vector-class.R file ### in the S4Vectors package for what slots should or should not be considered ### "parallel". setMethod("parallel_slot_names", "UnstitchedIPos", function(x) c("pos", callNextMethod()) ) setClass("StitchedIPos", contains="IPos", representation( pos_runs="IRanges" # An unnamed IRanges instance that has # been "stitched" (see below). ) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Validity ### .OLD_IPOS_INSTANCE_MSG <- c( "Starting with BioC 3.10, the class attribute of all ", "IPos **instances** needs to be set to \"StitchedIPos\". ", "Please update this object with 'updateObject(object, verbose=TRUE)' ", "and re-serialize it." ) .validate_IPos <- function(x) { if (class(x) == "IPos") return(paste(.OLD_IPOS_INSTANCE_MSG, collapse="")) NULL } setValidity2("IPos", .validate_IPos) ### TODO: Add validity methods for UnstitchedIPos and StitchedIPos objects. ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Very low-level UnstitchedIPos and StitchedIPos constructors ### ### For maximum efficiency, these constructors trust all the supplied ### arguments and do not validate the object. ### .unsafe_new_UnstitchedIPos <- function(pos, names=NULL, mcols=NULL, metadata=list()) { new2("UnstitchedIPos", pos=pos, NAMES=names, elementMetadata=mcols, metadata=metadata, check=FALSE) } ### Trusts all supplied arguments and does not validate the object. .unsafe_new_StitchedIPos <- function(pos_runs, names=NULL, mcols=NULL, metadata=list()) { new2("StitchedIPos", pos_runs=pos_runs, NAMES=names, elementMetadata=mcols, metadata=metadata, check=FALSE) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### updateObject() ### ### NOT exported but used in the GenomicRanges package. get_IPos_version <- function(object) { if (.hasSlot(object, "NAMES")) return("current") if (class(object) != "IPos") return(">= 2.19.4 and < 2.19.9") return("< 2.19.4") } .updateObject_IPos <- function(object, ..., verbose=FALSE) { if (.hasSlot(object, "NAMES")) { ## 'object' was made with IRanges >= 2.19.9. if (verbose) message("[updateObject] ", class(object), " object is current.\n", "[updateObject] Nothing to update.") return(callNextMethod()) } if (verbose) message("[updateObject] ", class(object), " object ", "uses internal representation from\n", "[updateObject] IRanges ", get_IPos_version(object), ". ", "Updating it ... ", appendLF=FALSE) if (class(object) == "UnstitchedIPos") { ## 'object' is an UnstitchedIPos instance that was made with ## IRanges >= 2.19.4 and < 2.19.9. object <- .unsafe_new_UnstitchedIPos(object@pos, NULL, object@elementMetadata, object@metadata) } else { ## 'object' is either an IPos instance that was made with ## IRanges < 2.19.4 or a StitchedIPos instance that was made with ## IRanges >= 2.19.4 and < 2.19.9. object <- .unsafe_new_StitchedIPos(object@pos_runs, NULL, object@elementMetadata, object@metadata) } if (verbose) message("OK") callNextMethod() } setMethod("updateObject", "IPos", .updateObject_IPos) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Accessors ### setMethod("pos", "UnstitchedIPos", function(x) x@pos) ### This really should be the method for StitchedIPos objects but we define a ### method for IPos objects for backward compatibility with old IPos instances. setMethod("pos", "IPos", function(x) unlist_as_integer(x@pos_runs)) setMethod("length", "UnstitchedIPos", function(x) length(x@pos)) ### This really should be the method for StitchedIPos objects but we define a ### method for IPos objects for backward compatibility with old IPos instances. setMethod("length", "IPos", function(x) sum(width(x@pos_runs))) setMethod("names", "IPos", function(x) x@NAMES) setReplaceMethod("names", "IPos", function(x, value) { x@NAMES <- S4Vectors:::normarg_names(value, "IPos", length(x)) x } ) ### No `pos<-` setter at the moment for IPos objects! Should we have it? ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Collapse runs of "stitchable integer ranges" ### ### In an IntegerRanges object 'x', 2 ranges x[i] and x[i+1] are "stitchable" ### if start(x[i+1]) == end(x[i])+1. For example, in the following object: ### 1: .....xxxx............. ### 2: ...xx................. ### 3: .........xxx.......... ### 4: ............xxxxxx.... ### 5: ..................x... ### x[3] and x[4] are stitchable, and x[4] and x[5] are stitchable. So ### x[3], x[4], and x[5] form a run of "stitchable ranges" that will collapse ### into the following single range after stitching: ### .........xxxxxxxxxx... ### Note that x[1] and x[3] are not stitchable because they are not ### consecutive vector elements (but they would if we removed x[2]). ### stitch_IntegerRanges() below takes any IntegerRanges derivative and ### returns an IRanges object (so is NOT an endomorphism). Note that this ### transformation preserves 'sum(width(x))'. ### Also note that this is an "inter range transformation". However unlike ### range(), reduce(), gaps(), or disjoin(), its result depends on the order ### of the elements in the input vector. It's also idempotent like range(), ### reduce(), and disjoin() (gaps() is not). ### TODO: Define and export stitch() generic and method for IntegerRanges ### objects (in inter-range-methods.R). ### Maybe it would also make sense to have an isStitched() generic like we ### have isDisjoint() to provide a quick and easy way to check the state of ### the object before applying the transformation to it. In theory each ### idempotent inter range transformation could have a "state checker" so ### maybe add isReduced() too (range() probably doesn't need one). stitch_IntegerRanges <- function(x) { if (length(x) == 0L) return(IRanges()) x_start <- start(x) x_end <- end(x) ## Find runs of stitchable elements along 'x'. ## Each run is described by the indices of its first ('run_from') and ## last ('run_to') elements in 'x'. ## The runs form a partitioning of 'x'. new_run_idx <- which(x_start[-1L] != x_end[-length(x)] + 1L) run_from <- c(1L, new_run_idx + 1L) run_to <- c(new_run_idx, length(x)) IRanges(x_start[run_from], x_end[run_to]) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructor ### ### 'pos' must be an integer vector with no NAs. .make_StitchedIPos_from_pos <- function(pos, names=NULL, mcols=NULL, metadata=list()) { pos_runs <- as(pos, "IRanges") .unsafe_new_StitchedIPos(pos_runs, names, mcols, metadata) } .from_UnstitchedIPos_to_StitchedIPos <- function(from) { .make_StitchedIPos_from_pos(from@pos, from@NAMES, from@elementMetadata, from@metadata) } ### 'pos_runs' must be an IRanges object. .make_UnstitchedIPos_from_pos_runs <- function(pos_runs, names=NULL, mcols=NULL, metadata=list()) { pos <- unlist_as_integer(pos_runs) .unsafe_new_UnstitchedIPos(pos, names, mcols, metadata) } .from_StitchedIPos_to_UnstitchedIPos <- function(from) { .make_UnstitchedIPos_from_pos_runs(from@pos_runs, from@NAMES, from@elementMetadata, from@metadata) } ### 'pos' must be an integer vector with no NAs or an IntegerRanges derivative. ### This is NOT checked! new_UnstitchedIPos <- function(pos=integer(0)) { if (is(pos, "UnstitchedIPos")) return(pos) if (is(pos, "StitchedIPos")) return(.from_StitchedIPos_to_UnstitchedIPos(pos)) if (is.integer(pos)) { ## Treat 'pos' as a vector of single positions. names <- names(pos) if (!is.null(names)) names(pos) <- NULL return(.unsafe_new_UnstitchedIPos(pos, names)) } ## 'pos' is an IntegerRanges derivative. Treat its ranges as runs of ## consecutive positions. ans_len <- sum(width(pos)) # no more integer overflow in R >= 3.5 if (ans_len > .Machine$integer.max) stop("too many positions in 'pos'") .make_UnstitchedIPos_from_pos_runs(pos) } ### 'pos' must be an integer vector with no NAs or an IntegerRanges derivative. ### This is NOT checked! new_StitchedIPos <- function(pos=integer(0)) { if (is(pos, "StitchedIPos")) return(pos) if (is(pos, "UnstitchedIPos")) return(.from_UnstitchedIPos_to_StitchedIPos(pos)) if (is.integer(pos)) { ## Treat 'pos' as a vector of single positions. names <- names(pos) if (!is.null(names)) names(pos) <- NULL return(.make_StitchedIPos_from_pos(pos, names)) } ## 'pos' is an IntegerRanges derivative. Treat its ranges as runs of ## consecutive positions. ans_len <- sum(width(pos)) # no more integer overflow in R >= 3.5 if (ans_len > .Machine$integer.max) stop("too many positions in 'pos'") pos_runs <- stitch_IntegerRanges(pos) pos_runs <- pos_runs[width(pos_runs) != 0L] .unsafe_new_StitchedIPos(pos_runs) } ### Returns an integer vector with no NAs or an IntegerRanges derivative. .normarg_pos <- function(pos) { if (is(pos, "IntegerRanges")) return(pos) if (is.numeric(pos)) { if (!is.integer(pos)) storage.mode(pos) <- "integer" # preserve the names if (anyNA(pos)) stop("'pos' cannot contain NAs") return(pos) } ans <- try(as(pos, "IRanges"), silent=TRUE) if (inherits(ans, "try-error")) stop("'pos' must represent positions") ans } .normarg_stitch <- function(stitch, pos) { if (!(is.logical(stitch) && length(stitch) == 1L)) stop("'stitch' must be TRUE, FALSE, or NA") if (!is.na(stitch)) return(stitch) is(pos, "IntegerRanges") && !is(pos, "UnstitchedIPos") } ### If the input object 'pos' is itself an IPos object, its metadata columns ### are propagated. IPos <- function(pos=integer(0), names=NULL, ..., stitch=NA) { mcols <- DataFrame(..., check.names=FALSE) pos <- .normarg_pos(pos) stitch <- .normarg_stitch(stitch, pos) if (stitch) { ans <- new_StitchedIPos(pos) } else { ans <- new_UnstitchedIPos(pos) } if (!is.null(names)) names(ans) <- names if (length(mcols) != 0L) mcols(ans) <- mcols ans } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion ### setAs("UnstitchedIPos", "StitchedIPos", .from_UnstitchedIPos_to_StitchedIPos) setAs("StitchedIPos", "UnstitchedIPos", .from_StitchedIPos_to_UnstitchedIPos) .check_IntegerRanges_for_coercion_to_IPos <- function(from, to) { if (!all(width(from) == 1L)) stop(wmsg("all the ranges in the ", class(from), " object to ", "coerce to ", to, " must have a width of 1")) } .from_IntegerRanges_to_UnstitchedIPos <- function(from) { .check_IntegerRanges_for_coercion_to_IPos(from, "UnstitchedIPos") ans <- new_UnstitchedIPos(from) names(ans) <- names(from) mcols(ans) <- mcols(from, use.names=FALSE) metadata(ans) <- metadata(from) ans } .from_IntegerRanges_to_StitchedIPos <- function(from) { .check_IntegerRanges_for_coercion_to_IPos(from, "StitchedIPos") ans <- new_StitchedIPos(from) names(ans) <- names(from) mcols(ans) <- mcols(from, use.names=FALSE) metadata(ans) <- metadata(from) ans } setAs("IntegerRanges", "UnstitchedIPos", .from_IntegerRanges_to_UnstitchedIPos) setAs("IntegerRanges", "StitchedIPos", .from_IntegerRanges_to_StitchedIPos) setAs("IntegerRanges", "IPos", .from_IntegerRanges_to_UnstitchedIPos) setAs("ANY", "UnstitchedIPos", function(from) IPos(from, stitch=FALSE)) setAs("ANY", "StitchedIPos", function(from) IPos(from, stitch=TRUE)) setAs("ANY", "IPos", function(from) IPos(from)) ### S3/S4 combo for as.data.frame.IPos ### The "as.data.frame" method for IntegerRanges objects works on an IPos ### object but returns a data.frame with identical "start" and "end" columns, ### and a "width" column filled with 1. We overwrite it to return a data.frame ### with a "pos" column instead of the "start" and "end" columns, and no ### "width" column. .as.data.frame.IPos <- function(x, row.names=NULL, optional=FALSE) { if (!identical(optional, FALSE)) warning(wmsg("'optional' argument was ignored")) ans <- data.frame(pos=pos(x), row.names=row.names, stringsAsFactors=FALSE) x_mcols <- mcols(x, use.names=FALSE) # can be NULL! if (!is.null(x_mcols)) ans <- cbind(ans, as.data.frame(x_mcols, optional=TRUE)) ans } as.data.frame.IPos <- function(x, row.names=NULL, optional=FALSE, ...) .as.data.frame.IPos(x, row.names=NULL, optional=FALSE, ...) setMethod("as.data.frame", "IPos", .as.data.frame.IPos) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Subsetting ### ### NOT exported but used in the GenomicRanges package. ### 'pos_runs' must be an IRanges or GRanges object or any range-based ### object as long as it supports start(), end(), width(), and is subsettable. ### 'i' must be an IntegerRanges object with no zero-width ranges. extract_pos_runs_by_ranges <- function(pos_runs, i) { map <- S4Vectors:::map_ranges_to_runs(width(pos_runs), start(i), width(i)) ## Because 'i' has no zero-width ranges, 'mapped_range_span' cannot ## contain zeroes and so 'mapped_range_Ltrim' and 'mapped_range_Rtrim' ## cannot contain garbbage. mapped_range_offset <- map[[1L]] mapped_range_span <- map[[2L]] mapped_range_Ltrim <- map[[3L]] mapped_range_Rtrim <- map[[4L]] run_idx <- sequence(mapped_range_span, from=mapped_range_offset+1L) pos_runs <- pos_runs[run_idx] if (length(run_idx) != 0L) { Rtrim_idx <- cumsum(mapped_range_span) Ltrim_idx <- c(1L, Rtrim_idx[-length(Rtrim_idx)] + 1L) trimmed_start <- start(pos_runs)[Ltrim_idx] + mapped_range_Ltrim trimmed_end <- end(pos_runs)[Rtrim_idx] - mapped_range_Rtrim start(pos_runs)[Ltrim_idx] <- trimmed_start end(pos_runs)[Rtrim_idx] <- trimmed_end new_len <- sum(width(pos_runs)) # no more integer overflow in R >= 3.5 if (new_len > .Machine$integer.max) stop("subscript is too big") } pos_runs } ### This really should be the method for StitchedIPos objects but we define a ### method for IPos objects for backward compatibility with old IPos instances. setMethod("extractROWS", "IPos", function(x, i) { ans <- callNextMethod() if (is(x, "UnstitchedIPos")) return(ans) i <- normalizeSingleBracketSubscript(i, x, as.NSBS=TRUE) ## TODO: Maybe make this the coercion method from NSBS to ## IntegerRanges. if (is(i, "RangesNSBS")) { ir <- i@subscript ir <- ir[width(ir) != 0L] } else { ir <- as(as.integer(i), "IRanges") } new_pos_runs <- extract_pos_runs_by_ranges(x@pos_runs, ir) new_pos_runs <- stitch_IntegerRanges(new_pos_runs) BiocGenerics:::replaceSlots(ans, pos_runs=new_pos_runs, check=FALSE) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Show ### .IPos_summary <- function(object) { object_class <- classNameForDisplay(object) object_len <- length(object) object_mcols <- mcols(object, use.names=FALSE) object_nmc <- if (is.null(object_mcols)) 0L else ncol(object_mcols) paste0(object_class, " object with ", object_len, " ", ifelse(object_len == 1L, "position", "positions"), " and ", object_nmc, " metadata ", ifelse(object_nmc == 1L, "column", "columns")) } ### S3/S4 combo for summary.IPos summary.IPos <- function(object, ...) .IPos_summary(object, ...) setMethod("summary", "IPos", summary.IPos) .from_IPos_to_naked_character_matrix_for_display <- function(x) { m <- cbind(pos=showAsCell(pos(x))) cbind_mcols_for_display(m, x) } setMethod("makeNakedCharacterMatrixForDisplay", "IPos", .from_IPos_to_naked_character_matrix_for_display ) show_IPos <- function(x, margin="", print.classinfo=FALSE) { version <- get_IPos_version(x) if (version != "current") stop(c(wmsg("This ", class(x), " object uses internal representation ", "from IRanges ", version, ", and so needs to be updated ", "before it can be displayed or used. ", "Please update it with:"), "\n\n object <- updateObject(object, verbose=TRUE)", "\n\n and re-serialize it.")) cat(margin, summary(x), ":\n", sep="") ## makePrettyMatrixForCompactPrinting() assumes that head() and tail() ## work on 'xx'. xx <- as(x, "IPos") out <- makePrettyMatrixForCompactPrinting(xx) if (print.classinfo) { .COL2CLASS <- c(pos="integer") classinfo <- makeClassinfoRowForCompactPrinting(x, .COL2CLASS) ## A sanity check, but this should never happen! stopifnot(identical(colnames(classinfo), colnames(out))) out <- rbind(classinfo, out) } if (nrow(out) != 0L) rownames(out) <- paste0(margin, " ", rownames(out)) ## We set 'max' to 'length(out)' to avoid the getOption("max.print") ## limit that would typically be reached when 'showHeadLines' global ## option is set to Inf. print(out, quote=FALSE, right=TRUE, max=length(out)) } setMethod("show", "IPos", function(object) show_IPos(object, print.classinfo=TRUE) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Concatenation ### .concatenate_StitchedIPos_objects <- function(x, objects=list(), use.names=TRUE, ignore.mcols=FALSE, check=TRUE) { objects <- S4Vectors:::prepare_objects_to_bind(x, objects) all_objects <- c(list(x), objects) ans_len <- sum(lengths(all_objects)) # no more integer overflow # in R >= 3.5 if (ans_len > .Machine$integer.max) stop("too many integer positions to concatenate") ## 1. Take care of the parallel slots ## Call method for Vector objects to concatenate all the parallel ## slots (only "elementMetadata" in the case of IPos) and stick them ## into 'ans'. Note that the resulting 'ans' can be an invalid object ## because its "elementMetadata" slot can be longer (i.e. have more rows) ## than 'ans' itself so we use 'check=FALSE' to skip validation. ans <- callNextMethod(x, objects, use.names=use.names, ignore.mcols=ignore.mcols, check=FALSE) ## 2. Take care of the non-parallel slots ## Concatenate the "pos_runs" slots. pos_runs_list <- lapply(all_objects, slot, "pos_runs") ans_pos_runs <- stitch_IntegerRanges( bindROWS(pos_runs_list[[1L]], pos_runs_list[-1L]) ) BiocGenerics:::replaceSlots(ans, pos_runs=ans_pos_runs, check=check) } ### This really should be the method for StitchedIPos objects but we define a ### method for IPos objects for backward compatibility with old IPos instances. setMethod("bindROWS", "IPos", function(x, objects=list(), use.names=TRUE, ignore.mcols=FALSE, check=TRUE) { if (is(x, "UnstitchedIPos")) return(callNextMethod()) x <- updateObject(x, check=FALSE) .concatenate_StitchedIPos_objects(x, objects, use.names, ignore.mcols, check) } ) IRanges/R/IPosList-class.R0000644000175100017510000000064114626176651016276 0ustar00biocbuildbiocbuild### ========================================================================= ### IPosList objects ### ------------------------------------------------------------------------- setClass("IPosList", contains=c("PosList", "IntegerRangesList"), representation("VIRTUAL"), prototype(elementType="IPos") ) setClass("SimpleIPosList", contains=c("IPosList", "SimplePosList", "SimpleIntegerRangesList") ) IRanges/R/IPosRanges-class.R0000644000175100017510000001507314626176651016607 0ustar00biocbuildbiocbuild### ========================================================================= ### IPosRanges objects ### ------------------------------------------------------------------------- ### ### The ranges in an IPosRanges derivative are closed, one-dimensional ### intervals with integer end points and on the domain of integers. ### ### The direct IPosRanges subclasses defined in the IRanges package are: ### IRanges, IPos, NCList, and GroupingRanges. setClass("IPosRanges", contains="IntegerRanges", representation("VIRTUAL") ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Validity. ### ### The checking of the names(x) is taken care of by the validity method for ### Vector objects. setValidity2("IPosRanges", validate_Ranges) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### updateObject() ### ### Value of elementType slot has changed from "integer" to "ANY" for ### IPosRanges objects in IRanges 2.13.22 (Bioc 3.7). It will soon change ### again to "StitchedIPos". ### setMethod("updateObject", "IPosRanges", function(object, ..., verbose=FALSE) { target <- new(class(object))@elementType current <- object@elementType if (identical(target, current)) { if (verbose) message("[updateObject] Internal representation of ", class(object), " object is current.\n", "[updateObject] Nothing to update.") } else { if (verbose) message("[updateObject] elementType slot of ", class(object), " object should be set to \"", target, "\",\n", "[updateObject] not to \"", current, "\".\n", "[updateObject] Updating it ... ", appendLF=FALSE) object@elementType <- target if (verbose) message("OK") } callNextMethod() } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion. ### ### Propagate the names. setMethod("as.character", "IPosRanges", function(x) { if (length(x) == 0L) return(setNames(character(0), names(x))) x_start <- start(x) x_end <- end(x) ans <- paste0(x_start, "-", x_end) idx <- which(x_start == x_end) ans[idx] <- as.character(x_start)[idx] names(ans) <- names(x) ans } ) ### The as.factor() generic doesn't have the ... argument so this method ### cannot support the 'ignore.strand' argument. setMethod("as.factor", "IPosRanges", function(x) factor(as.character(x), levels=as.character(sort(unique(x)))) ) setMethod("as.matrix", "IPosRanges", function(x, ...) matrix(data=c(start(x), width(x)), ncol=2, dimnames=list(names(x), NULL)) ) ### S3/S4 combo for as.data.frame.IPosRanges .as.data.frame.IPosRanges <- function(x, row.names=NULL, optional=FALSE) { if (!identical(optional, FALSE)) warning(wmsg("'optional' argument was ignored")) ans <- data.frame(start=start(x), end=end(x), width=width(x), row.names=row.names, check.names=FALSE, stringsAsFactors=FALSE) ans$names <- names(x) x_mcols <- mcols(x, use.names=FALSE) # can be NULL! if (!is.null(x_mcols)) ans <- cbind(ans, as.data.frame(x_mcols, optional=TRUE)) ans } as.data.frame.IPosRanges <- function(x, row.names=NULL, optional=FALSE, ...) .as.data.frame.IPosRanges(x, row.names=NULL, optional=FALSE, ...) setMethod("as.data.frame", "IPosRanges", .as.data.frame.IPosRanges) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### show() ### .IPosRanges_summary <- function(object) { object_class <- classNameForDisplay(object) object_len <- length(object) object_mcols <- mcols(object, use.names=FALSE) object_nmc <- if (is.null(object_mcols)) 0L else ncol(object_mcols) paste0(object_class, " object with ", object_len, " ", ifelse(object_len == 1L, "range", "ranges"), " and ", object_nmc, " metadata ", ifelse(object_nmc == 1L, "column", "columns")) } ### S3/S4 combo for summary.IPosRanges summary.IPosRanges <- function(object, ...) .IPosRanges_summary(object, ...) setMethod("summary", "IPosRanges", summary.IPosRanges) .from_IPosRanges_to_naked_character_matrix_for_display <- function(x) { m <- cbind(start=showAsCell(start(x)), end=showAsCell(end(x)), width=showAsCell(width(x))) cbind_mcols_for_display(m, x) } setMethod("makeNakedCharacterMatrixForDisplay", "IPosRanges", .from_IPosRanges_to_naked_character_matrix_for_display ) show_IPosRanges <- function(x, margin="", print.classinfo=FALSE) { cat(margin, summary(x), ":\n", sep="") ## makePrettyMatrixForCompactPrinting() assumes that 'x' is subsettable ## but not all IPosRanges objects are (and if even when they are, ## subsetting them can be costly). However IRanges objects are assumed ## to be subsettable so if 'x' is not one then we turn it into one (this ## coercion is expected to work on any IPosRanges object). if (!is(x, "IRanges")) x <- as(x, "IRanges", strict=FALSE) out <- makePrettyMatrixForCompactPrinting(x) if (print.classinfo) { .COL2CLASS <- c( start="integer", end="integer", width="integer" ) classinfo <- makeClassinfoRowForCompactPrinting(x, .COL2CLASS) ## A sanity check, but this should never happen! stopifnot(identical(colnames(classinfo), colnames(out))) out <- rbind(classinfo, out) } if (nrow(out) != 0L) rownames(out) <- paste0(margin, " ", rownames(out)) ## We set 'max' to 'length(out)' to avoid the getOption("max.print") ## limit that would typically be reached when 'showHeadLines' global ## option is set to Inf. print(out, quote=FALSE, right=TRUE, max=length(out)) } setMethod("show", "IPosRanges", function(object) show_IPosRanges(object, print.classinfo=TRUE) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Subsetting ### ### Avoid infinite recursion that we would otherwise get: ### IRanges(1:4, 8)[[1]] ### # Error: C stack usage 7969636 is too close to the limit setMethod("getListElement", "IPosRanges", function(x, i, exact=TRUE) { ## A temporary situation stop(wmsg(class(x), " objects don't support [[, as.list(), ", "lapply(), or unlist() at the moment")) } ) IRanges/R/IPosRanges-comparison.R0000644000175100017510000001270314626176651017651 0ustar00biocbuildbiocbuild### ========================================================================= ### Comparing and ordering the ranges in IPosRanges derivatives ### ------------------------------------------------------------------------- ### setMethod("pcompareRecursively", "IPosRanges", function(x) FALSE) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### pcompare() ### ### Ranges are ordered by starting position first and then by width. ### This way, the space of ranges is totally ordered. ### This "pcompare" method returns one of the 13 predefined codes (>= -6 and ### <= 6) described in the man page. The signs of those codes reflect this ### order. ### setMethod("pcompare", c("IPosRanges", "IPosRanges"), function(x, y) { .Call2("C_pcompare_IPosRanges", start(x), width(x), start(y), width(y), PACKAGE="IRanges") } ) rangeComparisonCodeToLetter <- function(code) { if (!is.integer(code)) stop("'code' must be an integer vector") code <- code + 7L code[code < 1L | 14L < code] <- 14L levels <- c(letters[1:13], "X") structure(code, levels=levels, class="factor") } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### match() ### setMethod("match", c("IPosRanges", "IPosRanges"), function(x, table, nomatch=NA_integer_, incomparables=NULL, method=c("auto", "quick", "hash")) { if (!is.null(incomparables)) stop("\"match\" method for IPosRanges objects ", "only accepts 'incomparables=NULL'") ## Equivalent to (but faster than): ## findOverlaps(x, table, type="equal", select="first") ## except when 'x' or 'table' contain empty ranges. matchIntegerPairs(start(x), width(x), start(table), width(table), nomatch=nomatch, method=method) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### selfmatch() ### setMethod("selfmatch", "IPosRanges", function(x, method=c("auto", "quick", "hash")) selfmatchIntegerPairs(start(x), width(x), method=method) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### order() and related methods. ### ### is.unsorted(), order(), sort(), rank() on IPosRanges derivatives are ### consistent with the order implied by pcompare(). ### is.unsorted() is a quick/cheap way of checking whether an IPosRanges ### derivative is already sorted, e.g., called prior to a costly sort. ### sort() and rank() will work out-of-the-box on an IPosRanges derivative ### thanks to the method for List objects (which delegates to the method for ### Vector objects). ### .IPosRanges_as_integer_pairs <- function(x) { a <- start(x) b <- width(x) list(a, b) } setMethod("is.unsorted", "IPosRanges", function(x, na.rm=FALSE, strictly=FALSE) { if (!identical(na.rm, FALSE)) warning("\"is.unsorted\" method for IPosRanges objects ", "ignores the 'na.rm' argument") if (!isTRUEorFALSE(strictly)) stop("'strictly' must be TRUE of FALSE") ## It seems that creating the integer pairs below is faster when ## 'x' is already sorted (TODO: Investigate why). Therefore, and ## somewhat counterintuitively, is.unsorted() can be faster when 'x' ## is already sorted (which, in theory, is the worst-case scenario ## because S4Vectors:::sortedIntegerPairs() will then need to take a ## full walk on 'x') than when it is unsorted (in which case ## S4Vectors:::sortedIntegerPairs() might stop walking on 'x' after ## checking its first 2 elements only -- the best-case scenario). pairs <- .IPosRanges_as_integer_pairs(x) !S4Vectors:::sortedIntegerPairs(pairs[[1L]], pairs[[2L]], strictly=strictly) } ) .order_IPosRanges <- function(x, decreasing=FALSE) { if (!isTRUEorFALSE(decreasing)) stop("'decreasing' must be TRUE or FALSE") pairs <- .IPosRanges_as_integer_pairs(x) orderIntegerPairs(pairs[[1L]], pairs[[2L]], decreasing=decreasing) } ### 'na.last' is pointless (IPosRanges derivatives don't contain NAs) so is ### ignored. ### 'method' is also ignored at the moment. setMethod("order", "IPosRanges", function(..., na.last=TRUE, decreasing=FALSE, method=c("auto", "shell", "radix")) { ## Turn off this warning for now since it triggers spurious warnings ## when calling sort() on an IPosRangesList derivative. The root of ## the problem is inconsistent defaults for 'na.last' between order() ## and sort(), as reported here: ## https://stat.ethz.ch/pipermail/r-devel/2015-November/072012.html #if (!identical(na.last, TRUE)) # warning("\"order\" method for IPosRanges objects ", # "ignores the 'na.last' argument") if (!isTRUEorFALSE(decreasing)) stop("'decreasing' must be TRUE or FALSE") ## All arguments in '...' are guaranteed to be IPosRanges derivatives. args <- list(...) if (length(args) == 1L) return(.order_IPosRanges(args[[1L]], decreasing)) order_args <- c(unlist(lapply(args, .IPosRanges_as_integer_pairs), recursive=FALSE, use.names=FALSE), list(na.last=na.last, decreasing=decreasing)) do.call(order, order_args) } ) IRanges/R/IRanges-class.R0000644000175100017510000002573414626176651016132 0ustar00biocbuildbiocbuild### ========================================================================= ### IRanges objects ### ------------------------------------------------------------------------- ### ### The IRanges class is a simple container for storing a vector of integer ### ranges. ### setClass("IRanges", contains="IPosRanges", representation( start="integer", width="integer", NAMES="character_OR_NULL" # R doesn't like @names !! ) ) ### A NormalIRanges object is an IRanges object where the ranges are: ### (a) not empty (i.e. they have a non-null width); ### (b) not overlapping; ### (c) ordered from left to right; ### (d) not even adjacent (i.e. there must be a non empty gap between 2 ### consecutive ranges). ### If 'x' is an IRanges object of length >= 2, then 'x' is normal iff: ### start(x)[i] <= end(x)[i] < start(x)[i+1] <= end(x)[i+1] ### for every 1 <= i < length(x). ### If length(x) == 1, then 'x' is normal iff width(x)[1] >= 1. ### If length(x) == 0, then 'x' is normal. setClass("NormalIRanges", contains="IRanges") ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### parallel_slot_names() ### ### Combine the new "parallel slots" with those of the parent class. Make ### sure to put the new parallel slots **first**. See R/Vector-class.R file ### in the S4Vectors package for what slots should or should not be considered ### "parallel". setMethod("parallel_slot_names", "IRanges", function(x) c("start", "width", "NAMES", callNextMethod()) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Getters ### setMethod("start", "IRanges", function(x, ...) x@start) setMethod("width", "IRanges", function(x) x@width) setMethod("names", "IRanges", function(x) x@NAMES) setMethod("ranges", "IntegerRanges", function(x, use.names=TRUE, use.mcols=FALSE) { if (!isTRUEorFALSE(use.names)) stop("'use.names' must be TRUE or FALSE") if (!isTRUEorFALSE(use.mcols)) stop("'use.mcols' must be TRUE or FALSE") ans_start <- start(x) ans_width <- width(x) ans_names <- if (use.names) names(x) else NULL ans_mcols <- if (use.mcols) mcols(x, use.names=FALSE) else NULL new2("IRanges", start=ans_start, width=ans_width, NAMES=ans_names, elementMetadata=ans_mcols, check=FALSE) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### isEmpty() and isNormal() ### .isNormal_IRanges <- function(x) .Call2("C_isNormal_IRanges", x, PACKAGE="IRanges") setMethod("isNormal", "IRanges", .isNormal_IRanges) ### Fast methods for NormalIRanges objects. setMethod("isEmpty", "NormalIRanges", function(x) length(x) == 0L) setMethod("isNormal", "NormalIRanges", function(x) TRUE) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "max" and "min" methods. ### ### Note: defined for NormalIRanges objects only. ### For an ordinary IRanges object 'x', it's not clear what the semantic ### should be. In particular, should empty ranges be ignored or not? If not ### then we could end up with 'min(x)' > 'max(x)' (e.g. when 'x' is made of 1 ### empty range) which is not nice. Another (and more pragmatic) reason for ### not defining these methods for IRanges objects is that I don't need them ### at the moment. ### setMethod("max", "NormalIRanges", function(x, ..., na.rm) { if (isEmpty(x)) { warning("empty ", class(x), " object; returning -Inf") -Inf } else { end(x)[length(x)] } } ) setMethod("min", "NormalIRanges", function(x, ..., na.rm) { if (isEmpty(x)) { warning("empty ", class(x), " object; returning Inf") Inf } else { start(x)[1L] } } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Validity. ### ### Validity of IRanges objects is taken care of by the validity method for ### IPosRanges objects. ### ### NormalIRanges objects .valid.NormalIRanges <- function(x) { if (!.isNormal_IRanges(x)) return("object is not normal") NULL } setValidity2("NormalIRanges", .valid.NormalIRanges) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion. ### setAs("IntegerRanges", "IRanges", function(from) ranges(from, use.mcols=TRUE) ) ### Helper function (not exported) used by the "coerce" methods defined in ### IRanges-utils.R. Believe it or not but the implicit "coerce" methods do ### NOT check that they return a valid object! newNormalIRangesFromIRanges <- function(x, check=TRUE) { if (!is(x, "IRanges")) stop("'x' must be an IRanges object") if (!isTRUEorFALSE(check)) stop("'check' must be TRUE or FALSE") ## Check only what needs to be checked. if (check) { msg <- .valid.NormalIRanges(x) if (!is.null(msg)) stop(wmsg(msg)) } class(x) <- "NormalIRanges" x } ### The returned IRanges instance is guaranteed to be normal. setAs("logical", "IRanges", function(from) as(as(from, "NormalIRanges"), "IRanges") ) .from_logical_to_NormalIRanges <- function(from) .Call2("C_from_logical_to_NormalIRanges", from, PACKAGE="IRanges") setAs("logical", "NormalIRanges", .from_logical_to_NormalIRanges) ### coercion from integer .from_integer_to_IRanges <- function(from) .Call2("C_from_integer_to_IRanges", from, PACKAGE="IRanges") setAs("integer", "IRanges", .from_integer_to_IRanges) setAs("integer", "NormalIRanges", function(from) newNormalIRangesFromIRanges(as(from, "IRanges")) ) setMethod("as.integer", "NormalIRanges", function(x) unlist_as_integer(x)) setAs("numeric", "IRanges", function(from) as(as.integer(from), "IRanges")) setAs("numeric", "NormalIRanges", function(from) newNormalIRangesFromIRanges(as(as.integer(from), "IRanges"))) ### coercion from character .from_character_to_IRanges <- function(from) { stopifnot(is.character(from)) if (anyNA(from)) stop(wmsg("converting a character vector to an IRanges object ", "does not support NAs")) error_msg <- wmsg( "The character vector to convert to an IRanges object must ", "contain strings of the form \"start-end\" or \"start..end\", ", "with end >= start - 1, or just \"pos\". For example: \"2501-2900\", ", "\"2501..2900\", or \"740\"." ) ## We want to split on the first occurence of "-" that is preceeded by ## a digit (ignoring and removing the spaces in between if any). from <- sub("([[:digit:]])[[:space:]]*-", "\\1..", from) split2 <- CharacterList(strsplit(from, "..", fixed=TRUE)) split2_eltNROWS <- elementNROWS(split2) if (!all(split2_eltNROWS <= 2L)) stop(error_msg) ans_start <- suppressWarnings(as.integer(heads(split2, n=1L))) ans_end <- suppressWarnings(as.integer(tails(split2, n=1L))) if (anyNA(ans_start) || anyNA(ans_end)) stop(error_msg) IRanges(ans_start, ans_end, names=names(from)) } setAs("character", "IRanges", .from_character_to_IRanges) .from_factor_to_IRanges <- function(from) { from <- setNames(as.character(from), names(from)) .from_character_to_IRanges(from) } setAs("factor", "IRanges", .from_factor_to_IRanges) setAs("ANY", "IPosRanges", function(from) as(from, "IRanges")) setAs("ANY", "IntegerRanges", function(from) as(from, "IRanges")) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Low-level setters for IRanges objects. ### ### All these low-level setters preserve the length of the object. ### The choice was made to implement a "resizing" semantic: ### (1) changing the start preserves the end (so it changes the width) ### (2) changing the end preserves the start (so it changes the width) ### (3) changing the width preserves the start (so it changes the end) ### .set_IRanges_start <- function(x, value, check=TRUE) { if (!isTRUEorFALSE(check)) stop("'check' must be TRUE or FALSE") ## Fix elementType slot on-the-fly. x <- updateObject(x, check=FALSE) old_start <- start(x) ## Use 'x@start[]' instead of 'x@start' so the right value is recycled. x@start[] <- S4Vectors:::numeric2integer(value) x@width <- width(x) - start(x) + old_start if (check) validObject(x) x } setReplaceMethod("start", "IRanges", function(x, ..., value) .set_IRanges_start(x, value) ) .set_IRanges_end <- function(x, value, check=TRUE) { if (!isTRUEorFALSE(check)) stop("'check' must be TRUE or FALSE") ## Fix elementType slot on-the-fly. x <- updateObject(x, check=FALSE) ## Use 'x@width[]' instead of 'x@width' so the right value is recycled. x@width[] <- width(x) - end(x) + S4Vectors:::numeric2integer(value) if (check) validObject(x) x } setReplaceMethod("end", "IRanges", function(x, ..., value) .set_IRanges_end(x, value) ) .set_IRanges_width <- function(x, value, check=TRUE) { if (!isTRUEorFALSE(check)) stop("'check' must be TRUE or FALSE") ## Fix elementType slot on-the-fly. x <- updateObject(x, check=FALSE) ## Use 'x@width[]' instead of 'x@width' so the right value is recycled. x@width[] <- S4Vectors:::numeric2integer(value) if (check) validObject(x) x } setReplaceMethod("width", "IRanges", function(x, ..., value) .set_IRanges_width(x, value) ) set_IRanges_names <- function(x, value) { ## Fix elementType slot on-the-fly. x <- updateObject(x, check=FALSE) x@NAMES <- S4Vectors:::normarg_names(value, class(x), length(x)) ## No need to validate an IRanges object after setting its names so ## this should be safe. x } setReplaceMethod("names", "IRanges", set_IRanges_names) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Subsetting ### setMethod("extractROWS", "NormalIRanges", function(x, i) { i <- normalizeSingleBracketSubscript(i, x, as.NSBS=TRUE) if (is(x, "NormalIRanges")) { if (!isStrictlySorted(i)) stop("subscript must extract elements at strictly sorted ", "positions when\n subsetting a ", class(x), " object") } callNextMethod() } ) setMethod("replaceROWS", "IRanges", function(x, i, value) { i <- normalizeSingleBracketSubscript(i, x, as.NSBS=TRUE) ans_start <- replaceROWS(start(x), i, start(value)) ans_width <- replaceROWS(width(x), i, width(value)) ans_mcols <- replaceROWS(mcols(x, use.names=FALSE), i, mcols(value, use.names=FALSE)) BiocGenerics:::replaceSlots(x, start=ans_start, width=ans_width, mcols=ans_mcols, check=FALSE) } ) setMethod("replaceROWS", "NormalIRanges", function(x, i, value) { ans <- callNextMethod() validObject(ans) ans } ) IRanges/R/IRanges-constructor.R0000644000175100017510000002517314626176651017407 0ustar00biocbuildbiocbuild### ========================================================================= ### The IRanges constructor ### ------------------------------------------------------------------------- ### ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Tree low-level helpers ### .is_numeric_or_NAs <- function(x) { is.numeric(x) || is.logical(x) && all(is.na(x)) } ### Input can contain NAs. Output must be an unnamed integer vector. .start_as_unnamed_integer <- function(start, what="a start") { if (is.integer(start)) ## We still pass 'start' thru 'as.integer()' to drop any troublemaker ## attribute like "names", "dim", or "dimnames". ## See https://github.com/Bioconductor/IRanges/issues/37 return(as.integer(start)) old_warn <- getOption("warn") options(warn=2L) on.exit(options(warn=old_warn)) start <- try(as.integer(start), silent=TRUE) if (inherits(start, "try-error")) stop(wmsg("each range must have ", what, " that ", "is < 2^31 and > - 2^31")) start } ### Input can contain NAs. Output must be an unnamed integer vector. .width_as_unnamed_integer <- function(width, msg="a non-negative width") { if (any(width < 0, na.rm=TRUE)) stop(wmsg("each range must have ", msg)) if (is.integer(width)) ## We still pass 'width' thru 'as.integer()' to drop any troublemaker ## attribute like "names", "dim", or "dimnames". ## See https://github.com/Bioconductor/IRanges/issues/37 return(as.integer(width)) old_warn <- getOption("warn") options(warn=2L) on.exit(options(warn=old_warn)) width <- try(as.integer(width), silent=TRUE) if (inherits(width, "try-error")) stop(wmsg("each range must have a width that is < 2^31")) width } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Low-level IRanges constructors ### .new_empty_IRanges <- function() new2("IRanges", check=FALSE) .new_IRanges_from_start_end <- function(start, end) { if (!.is_numeric_or_NAs(start) || !.is_numeric_or_NAs(end)) stop(wmsg("'start' and 'end' must be numeric vectors")) if (anyNA(start) || anyNA(end)) stop(wmsg("'start' or 'end' cannot contain NAs")) if (length(start) == 0L || length(end) == 0L) return(.new_empty_IRanges()) start <- .start_as_unnamed_integer(start) end <- .start_as_unnamed_integer(end, what="an end") ## We want to perform this operation in "double" space rather ## than in "integer" space so we use 1.0 instead of 1L. width <- 1.0 + end - start width <- .width_as_unnamed_integer(width, msg="an end that is greater or equal to its start minus one") start <- S4Vectors:::recycleVector(start, length(width)) new2("IRanges", start=start, width=width, check=FALSE) } .new_IRanges_from_start_width <- function(start, width) { if (!.is_numeric_or_NAs(start) || !.is_numeric_or_NAs(width)) stop(wmsg("'start' and 'width' must be numeric vectors")) if (anyNA(start) || anyNA(width)) stop(wmsg("'start' or 'width' cannot contain NAs")) if (length(start) == 0L || length(width) == 0L) return(.new_empty_IRanges()) start <- .start_as_unnamed_integer(start) width <- .width_as_unnamed_integer(width) ## We want to perform this operation in "double" space rather ## than in "integer" space so we use -1.0 instead of -1L. end <- -1.0 + start + width end <- .start_as_unnamed_integer(end, what="an end") start <- S4Vectors:::recycleVector(start, length(end)) width <- S4Vectors:::recycleVector(width, length(end)) new2("IRanges", start=start, width=width, check=FALSE) } .new_IRanges_from_end_width <- function(end, width) { if (!.is_numeric_or_NAs(end) || !.is_numeric_or_NAs(width)) stop(wmsg("'end' and 'width' must be numeric vectors")) if (anyNA(end) || anyNA(width)) stop(wmsg("'end' or 'width' cannot contain NAs")) if (length(end) == 0L || length(width) == 0L) return(.new_empty_IRanges()) end <- .start_as_unnamed_integer(end, what="an end") width <- .width_as_unnamed_integer(width) ## We want to perform this operation in "double" space rather ## than in "integer" space so we use 1.0 instead of 1L. start <- 1.0 + end - width start <- .start_as_unnamed_integer(start) start <- suppressWarnings(as.integer(start)) width <- S4Vectors:::recycleVector(width, length(start)) new2("IRanges", start=start, width=width, check=FALSE) } .solve_start_end_width <- function(start, end, width) { if (!.is_numeric_or_NAs(start) || !.is_numeric_or_NAs(end) || !.is_numeric_or_NAs(width)) stop(wmsg("'start', 'end', and 'width', must be numeric vectors")) L1 <- length(start) L2 <- length(end) L3 <- length(width) if (min(L1, L2, L3) == 0L) return(.new_empty_IRanges()) if (is.logical(start)) { start <- as.integer(start) } else { start <- .start_as_unnamed_integer(start) } if (is.logical(end)) { end <- as.integer(end) } else { end <- .start_as_unnamed_integer(end, what="an end") } if (is.logical(width)) { width <- as.integer(width) } else { width <- .width_as_unnamed_integer(width) } ans_len <- max(L1, L2, L3) start <- S4Vectors:::recycleVector(start, ans_len) end <- S4Vectors:::recycleVector(end, ans_len) width <- S4Vectors:::recycleVector(width, ans_len) .Call2("C_solve_start_end_width", start, end, width, PACKAGE="IRanges") } .new_IRanges <- function(start=NULL, end=NULL, width=NULL) { start_is_null <- is.null(start) end_is_null <- is.null(end) width_is_null <- is.null(width) nb_of_nulls <- sum(start_is_null, end_is_null, width_is_null) if (nb_of_nulls == 3L) return(.new_empty_IRanges()) if (nb_of_nulls == 2L) stop(wmsg("at least two of the 'start', 'end', and 'width' ", "arguments must be supplied")) if (width_is_null) return(.new_IRanges_from_start_end(start, end)) if (end_is_null) return(.new_IRanges_from_start_width(start, width)) if (start_is_null) return(.new_IRanges_from_end_width(end, width)) .solve_start_end_width(start, end, width) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### High-level IRanges constructor ### IRanges <- function(start=NULL, end=NULL, width=NULL, names=NULL, ...) { mcols <- DataFrame(..., check.names=FALSE) if (!is.null(start) && is.null(end) && is.null(width)) { ans <- as(start, "IRanges") } else { ans <- .new_IRanges(start=start, end=end, width=width) } if (!is.null(names)) names(ans) <- names if (length(mcols) != 0L) mcols(ans) <- mcols ans } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The SEW interface: start=NA/end=NA/width=NA ### ### Some of the functions that support the SEW interface: narrow(), ### XVector::subseq(), XVector::xvcopy(), Biostrings::BStringSet() (and ### family), BSgenome::getSeq(), etc... ### .normargSEW <- function(x, argname) { if (!S4Vectors:::isNumericOrNAs(x)) stop("'", argname, "' must be a vector of integers") if (!is.integer(x)) x <- as.integer(x) x } ### Use of 'rep.refwidths=TRUE' is supported only when 'refwidths' is of ### length 1. ### If 'rep.refwidths=FALSE' (the default) then 'start', 'end' and 'width' ### are recycled to 'length(refwidths)' (it's an error if one of them is ### longer than 'refwidths'). Otherwise, 'refwidths' is replicated L times ### where L is the length of the longest of 'start', 'end' and 'width'. ### The returned value is an IRanges object of the same length as 'refwidths' ### (after replication if 'rep.refwidths=TRUE'). solveUserSEW <- function(refwidths, start=NA, end=NA, width=NA, rep.refwidths=FALSE, translate.negative.coord=TRUE, allow.nonnarrowing=FALSE) { if (!is.numeric(refwidths)) stop("'refwidths' must be a vector of integers") if (!is.integer(refwidths)) refwidths <- as.integer(refwidths) start <- .normargSEW(start, "start") end <- .normargSEW(end, "end") width <- .normargSEW(width, "width") ## From here, 'refwidths', 'start', 'end' and 'width' are guaranteed to be ## integer vectors. NAs in 'start', 'end' and 'width' are OK but not in ## 'refwidths' so this should be checked at the C level. if (!isTRUEorFALSE(rep.refwidths)) stop("'rep.refwidths' must be TRUE or FALSE") if (!isTRUEorFALSE(translate.negative.coord)) stop("'translate.negative.coord' must be TRUE or FALSE") if (!isTRUEorFALSE(allow.nonnarrowing)) stop("'allow.nonnarrowing' must be TRUE or FALSE") Lsew <- c(length(start), length(end), length(width)) maxLsew <- max(Lsew) minLsew <- min(Lsew) if (minLsew == 0L && maxLsew > 1L) stop("'start', 'end' and 'width' cannot mix zero-length ", "and longer-than-one vectors") ## Check 'start', 'end', and 'width' *without* recycling them. Recycling ## is done at the C level. if (rep.refwidths) { if (length(refwidths) != 1L) stop("'rep.refwidths=TRUE' can be used only when 'refwidths' ", "is of length 1") ## 'ans_len' is the length of the longest of 'start', 'end' ## and 'width'. if (minLsew == 0L) { ans_len <- 0L } else { ans_len <- maxLsew } refwidths <- rep.int(refwidths, ans_len) } else { ans_len <- length(refwidths) if (ans_len == 0L) { if (maxLsew > 1L) stop("'start', 'end' or 'width' is longer than 'refwidths'") } else { if (minLsew == 0L) stop("cannot recycle empty 'start', 'end' or 'width'") if (maxLsew > ans_len) stop("'start', 'end' or 'width' is longer than 'refwidths'") } } .Call2("C_solve_user_SEW", refwidths, start, end, width, translate.negative.coord, allow.nonnarrowing, PACKAGE="IRanges") } ### Returns an IRanges instance of length 1. Not exported. solveUserSEWForSingleSeq <- function(x_length, start=NA, end=NA, width=NA) { solved_SEW <- try(solveUserSEW(x_length, start=start, end=end, width=width), silent = TRUE) if (is(solved_SEW, "try-error")) stop("Invalid sequence coordinates.\n", " Please make sure the supplied 'start', 'end' and 'width' arguments\n", " are defining a region that is within the limits of the sequence.") solved_SEW } IRanges/R/IRanges-utils.R0000644000175100017510000001466514626176651016166 0ustar00biocbuildbiocbuild### ========================================================================= ### Utility functions for creating or modifying IRanges objects ### ------------------------------------------------------------------------- ### ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "successiveIRanges" function. ### ### Note that the returned IRanges object is guaranteed to be normal in the ### following cases: ### (a) when length(width) == 0 ### (b) when length(width) == 1 and width > 0 ### (c) when length(width) >= 2 and all(width > 0) and all(gapwidth > 0) ### However, the function doesn't try to turn the result into a NormalIRanges ### object. ### successiveIRanges <- function(width, gapwidth=0, from=1) { if (!is.numeric(width)) stop("'width' must be an integer vector") if (length(width) == 0L) return(IRanges()) if (!is.integer(width)) width <- as.integer(width) # this drops the names else if (!is.null(names(width))) names(width) <- NULL # unname() used to be broken on 0-length vectors if (S4Vectors:::anyMissingOrOutside(width, 0L)) stop("'width' cannot contain NAs or negative values") if (!is.numeric(gapwidth)) stop("'gapwidth' must be an integer vector") if (!is.integer(gapwidth)) gapwidth <- as.integer(gapwidth) if (S4Vectors:::anyMissing(gapwidth)) stop("'gapwidth' cannot contain NAs") if (length(gapwidth) != length(width) - 1L) { if (length(gapwidth) != 1L) stop("'gapwidth' must a single integer or an integer vector ", "with one less element than the 'width' vector") gapwidth <- rep.int(gapwidth, length(width) - 1L) } if (!isSingleNumber(from)) stop("'from' must be a single integer") if (!is.integer(from)) from <- as.integer(from) ans_start <- cumsum(width[-length(width)] + gapwidth) ans_start <- from + c(0L, ans_start) ## 'ans_start' could contain NAs in case of an integer overflow in ## cumsum(), hence the use of 'check=TRUE' here: new2("IRanges", start=ans_start, width=width, check=TRUE) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### slidingIRanges() ### slidingIRanges <- function(len, width, shift = 1L) { start <- seq(1L, len-width, by=shift) end <- seq(width, len, by=shift) IRanges(start, end) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### breakInChunks() ### ### TODO: Should not be in IRanges-utils.R because it returns a ### PartitioningByEnd object, not an IRanges object. So move it to another ### file, e.g. to Partitioning-class.R. breakInChunks() is actually a ### specialized PartitioningByEnd constructor. ### .normarg_totalsize <- function(totalsize) { if (!isSingleNumber(totalsize)) stop("'totalsize' must be a single integer") if (!is.integer(totalsize)) totalsize <- as.integer(totalsize) if (totalsize < 0L) stop("'totalsize' cannot be negative") totalsize } .normarg_nchunk_or_chunksize <- function(nchunk, totalsize, what) { if (!isSingleNumber(nchunk)) stop("'", what, "' must be a single integer") if (!is.integer(nchunk)) nchunk <- as.integer(nchunk) if (nchunk < 0L) stop("'", what, "' cannot be negative") if (nchunk == 0L && totalsize != 0L) stop("'", what, "' can be 0 only if 'totalsize' is 0") nchunk } breakInChunks <- function(totalsize, nchunk, chunksize) { totalsize <- .normarg_totalsize(totalsize) if (!missing(nchunk)) { if (!missing(chunksize)) stop("only one of 'nchunk' and 'chunksize' can be specified") ## All chunks will have more or less the same size, with the difference ## between smallest and biggest chunks guaranteed to be <= 1. nchunk <- .normarg_nchunk_or_chunksize(nchunk, totalsize, "nchunk") if (nchunk == 0L) return(PartitioningByEnd()) chunksize <- totalsize / nchunk # floating point division breakpoints <- as.integer(cumsum(rep.int(chunksize, nchunk))) ## The last value in 'breakpoints' *should* be 'totalsize' but there is ## always some uncertainty about what coercing the result of a floating ## point operation to integer will produce. So we set this value ## manually to 'totalsize' just in case. breakpoints[[nchunk]] <- totalsize } else { if (missing(chunksize)) stop("one of 'nchunk' and 'chunksize' must be specified") ## All chunks will have the requested size, except maybe the last one. chunksize <- .normarg_nchunk_or_chunksize(chunksize, totalsize, "chunksize") if (totalsize == 0L) return(PartitioningByEnd()) quot <- totalsize %/% chunksize # integer division breakpoints <- cumsum(rep.int(chunksize, quot)) if (quot == 0L || breakpoints[[quot]] != totalsize) breakpoints <- c(breakpoints, totalsize) } PartitioningByEnd(breakpoints) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "centeredIRanges" function. ### centeredIRanges <- function(center, flank) { if (!is.numeric(center)) stop("'center' must be a numeric vector") if (!is.numeric(flank)) stop("'flank' must be a numeric vector") IRanges(start=center-flank, end=center+flank) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "whichAsIRanges" function. ### ### Note that unlike the standard which() function, whichAsIRanges() drops ### the names of 'x'. ### whichAsIRanges <- function(x) { if (!is.logical(x)) stop("'x' must be a logical vector") as(x, "NormalIRanges") } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercing an IRanges object to a NormalIRanges object. ### asNormalIRanges <- function(x, force=TRUE) { if (!is(x, "IntegerRanges")) stop("'x' must be an IntegerRanges object") else if (!is(x, "IRanges")) x <- as(x, "IRanges") if (!isTRUEorFALSE(force)) stop("'force' must be TRUE or FALSE") if (force) x <- reduce(x, drop.empty.ranges=TRUE) newNormalIRangesFromIRanges(x, check=!force) } .asNormalIRanges <- function(from) asNormalIRanges(from, force=TRUE) setAs("IRanges", "NormalIRanges", .asNormalIRanges) IRanges/R/IRangesList-class.R0000644000175100017510000001621014626176651016753 0ustar00biocbuildbiocbuild### ========================================================================= ### IRangesList objects ### ------------------------------------------------------------------------- ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### IRangesList ### setClass("IRangesList", contains="IntegerRangesList", representation("VIRTUAL"), prototype(elementType="IRanges") ) setClass("SimpleIRangesList", contains=c("IRangesList", "SimpleIntegerRangesList") ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### NormalIRangesList ### setClass("NormalIRangesList", contains="IRangesList", representation("VIRTUAL"), prototype(elementType="NormalIRanges") ) setClass("SimpleNormalIRangesList", contains=c("NormalIRangesList", "SimpleIRangesList") ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Validity. ### .valid.NormalIRangesList <- function(x) { if (!all(isNormal(x))) return("at least one element of object is not normal") NULL } setValidity2("NormalIRangesList", .valid.NormalIRangesList) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion from list-like object to SimpleIRangesList ### ### Try to turn an arbitrary list-like object into an ordinary list of ### IRanges objects. as_list_of_IRanges <- function(from) { if (is(from, "IntegerRanges")) { if (!is(from, "IRanges")) from <- as(from, "IRanges", strict=FALSE) along_idx <- setNames(seq_along(from), names(from)) names(from) <- NULL mcols(from) <- NULL lapply(along_idx, function(i) from[i]) } else { lapply(from, as, "IRanges", strict=FALSE) } } ### From ordinary list to SimpleIRangesList .from_list_to_SimpleIRangesList <- function(from) { from <- as_list_of_IRanges(from) S4Vectors:::new_SimpleList_from_list("SimpleIRangesList", from) } setAs("list", "SimpleIRangesList", .from_list_to_SimpleIRangesList) setAs("list", "IRangesList", .from_list_to_SimpleIRangesList) ### From List derivative to SimpleIRangesList .from_List_to_SimpleIRangesList <- function(from) { S4Vectors:::new_SimpleList_from_list("SimpleIRangesList", as_list_of_IRanges(from), metadata=metadata(from), mcols=mcols(from, use.names=FALSE)) } setAs("List", "SimpleIRangesList", .from_List_to_SimpleIRangesList) ### Automatic coercion methods from SimpleList, IntegerRangesList, or ### SimpleIntegerRangesList to SimpleIRangesList silently return a broken ### object (unfortunately these dummy automatic coercion methods don't bother ### to validate the object they return). So we overwrite them. setAs("SimpleList", "SimpleIRangesList", .from_List_to_SimpleIRangesList) setAs("IntegerRangesList", "SimpleIRangesList", .from_List_to_SimpleIRangesList) setAs("SimpleIntegerRangesList", "SimpleIRangesList", .from_List_to_SimpleIRangesList) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### isNormal() ### setMethod("isNormal", "SimpleIRangesList", function(x, use.names=FALSE) .Call2("C_isNormal_SimpleIRangesList", x, use.names, PACKAGE="IRanges") ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructor ### IRangesList <- function(..., compress=TRUE) { if (!isTRUEorFALSE(compress)) stop("'compress' must be TRUE or FALSE") objects <- list(...) if (length(objects) == 2L && setequal(names(objects), c("start", "end")) && !is(objects[[1L]], "IntegerRanges") && !is(objects[[2L]], "IntegerRanges")) { if (!compress) stop(wmsg("'compress' must be TRUE when passing the 'start' ", "and 'end' arguments")) ans_start <- IntegerList(objects[["start"]], compress=TRUE) ans_end <- IntegerList(objects[["end"]], compress=TRUE) ans_partitioning <- PartitioningByEnd(ans_start) if (!identical(ans_partitioning, PartitioningByEnd(ans_end))) stop("'start' and 'end' are not compatible") unlisted_start <- unlist(ans_start, use.names=FALSE) unlisted_end <- unlist(ans_end, use.names=FALSE) unlisted_ans <- IRanges(start=unlisted_start, end=unlisted_end) return(relist(unlisted_ans, ans_partitioning)) } if (length(objects) == 1L) { tmp <- objects[[1L]] if (is.list(tmp) || (is(tmp, "List") && !is(tmp, "IntegerRanges"))) objects <- tmp } if (compress) as(objects, "CompressedIRangesList") else as(objects, "SimpleIRangesList") } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### More coercions ### setMethod("unlist", "SimpleNormalIRangesList", function(x, recursive = TRUE, use.names = TRUE) { x <- S4Vectors:::new_SimpleList_from_list("SimpleIRangesList", lapply(x, as, "IRanges")) callGeneric() }) ### Coercion from IntegerRangesList to NormalIRangesList. .from_IntegerRangesList_to_SimpleNormalIRangesList <- function(from) { S4Vectors:::new_SimpleList_from_list("SimpleNormalIRangesList", lapply(from, as, "NormalIRanges"), mcols=mcols(from, use.names=FALSE), metadata=metadata(from)) } setAs("IntegerRangesList", "SimpleNormalIRangesList", .from_IntegerRangesList_to_SimpleNormalIRangesList ) setAs("SimpleIRangesList", "SimpleNormalIRangesList", .from_IntegerRangesList_to_SimpleNormalIRangesList ) setAs("LogicalList", "SimpleNormalIRangesList", function(from) S4Vectors:::new_SimpleList_from_list("SimpleNormalIRangesList", lapply(from, as, "NormalIRanges"), metadata = metadata(from), mcols = mcols(from, use.names=FALSE))) ### Coercion from RleList to NormalIRangesList. setAs("RleList", "SimpleNormalIRangesList", function(from) { if ((length(from) > 0) && (!is.logical(runValue(from[[1L]])) || S4Vectors:::anyMissing(runValue(from[[1L]])))) stop("cannot coerce a non-logical 'RleList' or a logical 'RleList' ", "with NAs to a SimpleNormalIRangesList object") S4Vectors:::new_SimpleList_from_list("SimpleNormalIRangesList", lapply(from, as, "NormalIRanges"), metadata = metadata(from), mcols = mcols(from, use.names=FALSE)) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "max" and "min" methods for NormalIRangesList objects. ### setMethod("max", "SimpleNormalIRangesList", function(x, ..., na.rm) .Call2("C_max_SimpleNormalIRangesList", x, PACKAGE="IRanges") ) setMethod("min", "SimpleNormalIRangesList", function(x, ..., na.rm) .Call2("C_min_SimpleNormalIRangesList", x, PACKAGE="IRanges") ) IRanges/R/IntegerRangesList-class.R0000644000175100017510000001325614626176651020167 0ustar00biocbuildbiocbuild### ========================================================================= ### IntegerRangesList objects ### ------------------------------------------------------------------------- ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Accessor methods. ### setGeneric(".replaceSEW", signature="x", # not exported function(x, FUN, ..., value) standardGeneric(".replaceSEW") ) setMethod(".replaceSEW", "IntegerRangesList", function(x, FUN, ..., value) { if (extends(class(value), "IntegerList")) { value <- S4Vectors:::VH_recycle(value, x, ".replaceSEW", "x") } else if (is.numeric(value)) { lelts <- sum(elementNROWS(x)) if (lelts != length(value)) value <- rep(value, length.out = lelts) if (!is.integer(value)) value <- as.integer(value) value <- split(value, factor(space(x), names(x))) } else { stop("'value' must extend class IntegerList or integer") } FUN <- match.fun(FUN) if (is(x, "CompressedRangesList")) { unlist_ans <- FUN(unlist(x, use.names=FALSE), ..., value=unlist(value, use.names=FALSE)) return(relist(unlist_ans, x)) } for (i in seq_len(length(x))) x[[i]] <- FUN(x[[i]], ..., value = value[[i]]) x } ) setReplaceMethod("start", "IntegerRangesList", function(x, ..., value) .replaceSEW(x, "start<-", ..., value=value) ) setReplaceMethod("end", "IntegerRangesList", function(x, ..., value) .replaceSEW(x, "end<-", ..., value=value) ) setReplaceMethod("width", "IntegerRangesList", function(x, ..., value) .replaceSEW(x, "width<-", ..., value=value) ) setMethod("space", "IntegerRangesList", function(x) { space <- names(x) if (!is.null(space)) space <- factor(rep.int(space, elementNROWS(x)), unique(space)) space }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### isNormal() ### ### Test the list elements of an IntegerRangesList object 'x' individually and ### return a vector of TRUE's or FALSE's parallel to 'x'. More precisely, is ### equivalent to 'sapply(x, FUN)', when FUN is 'isNormal'. ### setMethod("isNormal", "IntegerRangesList", function(x, use.names=FALSE) vapply(x, isNormal, logical(1), USE.NAMES=use.names) ) setMethod("whichFirstNotNormal", "IntegerRangesList", function(x) unlist(lapply(x, whichFirstNotNormal)) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "show" method. ### ### NOT exported but used in the Biostrings package. show_IntegerRangesList <- function(x, with.header=TRUE) { x_len <- length(x) if (with.header) cat(classNameForDisplay(x), " object of length ", x_len, if (x_len != 0L) ":" else "", "\n", sep="") cumsumN <- end(PartitioningByEnd(x)) N <- tail(cumsumN, 1) if (x_len == 0L) { ## Display nothing. } else if (x_len <= 3L || (x_len <= 5L && N <= 20L)) { ## Display full object. show(as.list(x)) } else { ## Display truncated object. if (cumsumN[[3L]] <= 20L) { showK <- 3L } else if (cumsumN[[2L]] <= 20L) { showK <- 2L } else { showK <- 1L } show(as.list(x[seq_len(showK)])) diffK <- x_len - showK cat("...\n", "<", diffK, " more element", ifelse(diffK == 1L, "", "s"), ">\n", sep="") } } setMethod("show", "IntegerRangesList", function(object) show_IntegerRangesList(object) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### merge() ### ### Merges various IntegerRangesList objects into a single IntegerRangesList ### object. The merging is either by name (if all the IntegerRangesList ### objects have names), or by position (if any IntegerRangesList object is ### missing names). When merging by name, and in case of duplicated names ### within a given IntegerRangesList, the elements corresponding to the ### duplicated names are ignored. ### When merging by position, all the IntegerRangesList objects must have the ### same length. ### Note that the "range" method for IntegerRangesList objects expects "merge" ### to behave like this. .merge_IntegerRangesList <- function(...) { args <- unname(list(...)) if (length(args) == 0L) stop("nothing to merge") x <- args[[1L]] spaceList <- lapply(args, names) names <- spaces <- unique(do.call(c, spaceList)) if (any(S4Vectors:::sapply_isNULL(spaceList))) { ## Merging by position. if (!all(unlist(lapply(args, length)) == length(x))) stop("if any IntegerRangesList objects to merge are missing ", "names, all must have same length") names <- NULL spaces <- seq_len(length(x)) } ranges <- lapply(spaces, function(space) { r <- lapply(args, `[[`, space) do.call(c, S4Vectors:::delete_NULLs(r)) }) names(ranges) <- names if (is(x, "CompressedList")) ans <- new_CompressedList_from_list(class(x), ranges) else ans <- S4Vectors:::new_SimpleList_from_list(class(x), ranges) ans } setMethod("merge", c("IntegerRangesList", "missing"), function(x, y, ...) .merge_IntegerRangesList(x, ...) ) setMethod("merge", c("missing", "IntegerRangesList"), function(x, y, ...) .merge_IntegerRangesList(y, ...) ) setMethod("merge", c("IntegerRangesList", "IntegerRangesList"), function(x, y, ...) .merge_IntegerRangesList(x, y, ...) ) IRanges/R/MaskCollection-class.R0000644000175100017510000002777014626176651017513 0ustar00biocbuildbiocbuild### ========================================================================= ### MaskCollection objects ### ------------------------------------------------------------------------- setClass("MaskCollection", contains="IntegerRangesList", representation( nir_list="list", # a list of NormalIRanges objects width="integer", active="logical", NAMES="character", # R doesn't like @names !! desc="character" ), prototype( nir_list=list(), width=0L, active=logical(0), NAMES=as.character(NA), desc=as.character(NA) ) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "length" and accessor methods. ### setGeneric("nir_list", function(x) standardGeneric("nir_list")) setMethod("nir_list", "MaskCollection", function(x) x@nir_list) setMethod("length", "MaskCollection", function(x) length(nir_list(x))) setMethod("width", "MaskCollection", function(x) x@width) setMethod("active", "MaskCollection", function(x) { ans <- x@active names(ans) <- names(x) ans } ) setReplaceMethod("active", "MaskCollection", function(x, value) { if (!is.logical(value) || S4Vectors:::anyMissing(value)) stop("'value' must be a logical vector with no NAs") x@active[] <- value x } ) setMethod("names", "MaskCollection", function(x) if (length(x@NAMES) == 1 && is.na(x@NAMES)) NULL else x@NAMES ) setReplaceMethod("names", "MaskCollection", function(x, value) { if (is.null(value)) { x@NAMES <- NA_character_ return(x) } value <- as.character(value) ii <- is.na(value) if (any(ii)) value[ii] <- "" if (length(value) > length(x)) stop("too many names") if (length(value) < length(x)) value <- c(value, character(length(x) - length(value))) x@NAMES <- value x } ) setGeneric("desc", function(x) standardGeneric("desc")) setMethod("desc", "MaskCollection", function(x) if (length(x@desc) == 1 && is.na(x@desc)) NULL else x@desc ) setGeneric("desc<-", signature="x", function(x, value) standardGeneric("desc<-") ) setReplaceMethod("desc", "MaskCollection", function(x, value) { if (is.null(value)) { x@desc <- as.character(NA) return(x) } if (!is.character(value)) stop("'value' must be NULL or a character vector") ii <- is.na(value) if (any(ii)) value[ii] <- "" if (length(value) > length(x)) stop("too many names") if (length(value) < length(x)) value <- c(value, character(length(x) - length(value))) x@desc <- value x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Validity. ### .valid.MaskCollection.width <- function(x) { if (!isSingleInteger(width(x)) || width(x) < 0) return("the width of the collection must be a single non-negative integer") NULL } .valid.MaskCollection.nir_list <- function(x) { if (!is.list(nir_list(x)) || !all(sapply(nir_list(x), function(nir) is(nir, "NormalIRanges")))) return("the 'nir_list' slot must contain a list of NormalIRanges objects") if (!all(1 <= min(x)) || !all(max(x) <= width(x))) return("the min and max of the masks must be >= 1 and <= width of the collection") NULL } .valid.MaskCollection.active <- function(x) { if (!is.logical(active(x)) || S4Vectors:::anyMissing(active(x))) return("the 'active' slot must be a logical vector with no NAs") if (length(active(x)) != length(x)) return("the length of the 'active' slot differs from the length of the object") NULL } .valid.MaskCollection.names <- function(x) { if (S4Vectors:::anyMissing(names(x))) return("the names must be non-NA strings") NULL } .valid.MaskCollection.desc <- function(x) { if (!is.character(x@desc)) return("the 'desc' slot must contain a character vector") if (is.null(desc(x))) return(NULL) if (S4Vectors:::anyMissing(desc(x))) return("the descriptions must be non-NA strings") if (length(desc(x)) != length(x)) return("number of descriptions and number of elements differ") NULL } .valid.MaskCollection <- function(x) { ## The 'width' slot needs to be checked separately and we must return ## if it's invalid. This is because .valid.MaskCollection.nir_list() ## won't work properly if 'x@width' is NA. problems <- .valid.MaskCollection.width(x) if (!is.null(problems)) return(problems) c(.valid.MaskCollection.nir_list(x), .valid.MaskCollection.active(x), .valid.MaskCollection.names(x), .valid.MaskCollection.desc(x)) } setValidity2("MaskCollection", .valid.MaskCollection) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The safe and user-friendly "Mask" constructor. ### Mask <- function(mask.width, start=NULL, end=NULL, width=NULL) { nir <- asNormalIRanges(IRanges(start=start, end=end, width=width), force=FALSE) new2("MaskCollection", nir_list=list(nir), width=S4Vectors:::numeric2integer(mask.width), active=TRUE, check=FALSE) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "max" and "min" methods. ### setMethod("max", "MaskCollection", function(x, ..., na.rm) { if (length(x) == 0) return(integer(0)) sapply(nir_list(x), max) } ) setMethod("min", "MaskCollection", function(x, ..., na.rm) { if (length(x) == 0) return(integer(0)) sapply(nir_list(x), min) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "maskedwidth" and "maskedratio" generics and methods. ### setGeneric("maskedwidth", function(x) standardGeneric("maskedwidth")) setMethod("maskedwidth", "MaskCollection", function(x) { nir_list <- nir_list(x) if (length(nir_list) == 0) integer(0) else sapply(nir_list, function(nir) sum(width(nir))) } ) setGeneric("maskedratio", function(x) standardGeneric("maskedratio")) setMethod("maskedratio", "MaskCollection", function(x) maskedwidth(x) / width(x)) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Subsetting. ### setMethod("getListElement", "MaskCollection", function(x, i, exact=TRUE) { i <- normalizeDoubleBracketSubscript(i, x, exact=exact) nir_list(x)[[i]] } ) ### Always behaves like an endomorphism (i.e. ignores the 'drop' argument and ### behaves like if it was actually set to FALSE). setMethod("extractROWS", "MaskCollection", function(x, i) { i <- normalizeSingleBracketSubscript(i, x, as.NSBS=TRUE) if (anyDuplicated(i)) stop("subscript would generate duplicated elements") slot(x, "nir_list", check=FALSE) <- extractROWS(nir_list(x), i) slot(x, "active", check=FALSE) <- extractROWS(active(x), i) if (!is.null(names(x))) slot(x, "NAMES", check=FALSE) <- extractROWS(names(x), i) if (!is.null(desc(x))) slot(x, "desc", check=FALSE) <- extractROWS(desc(x), i) mcols(x) <- extractROWS(mcols(x, use.names=FALSE), i) x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "append" method. ### ### TODO: Be more consistent with "[" which doesn't allow subscripts with ### duplicated positive values in order to make it harder for the user to ### produce a MaskCollection object with duplicated names. ### The "append" method below makes this too easy (with append(x, x)). ### .append.names.or.desc <- function(nm1, l1, nm2, l2, after) { if (is.null(nm1) && is.null(nm2)) return(as.character(NA)) if (is.null(nm1)) nm1 <- rep.int("", l1) if (is.null(nm2)) nm2 <- rep.int("", l2) append(nm1, nm2, after=after) } setMethod("append", c("MaskCollection", "MaskCollection"), function(x, values, after=length(x)) { if (width(values) != width(x)) stop("'x' and 'values' must have the same width") if (!isSingleNumber(after)) stop("'after' must be a single number") if (length(values) == 0) return(x) ans_nir_list <- append(nir_list(x), nir_list(values), after=after) ans_active <- append(active(x), active(values), after=after) l1 <- length(x) l2 <- length(values) ans_NAMES <- .append.names.or.desc(names(x), l1, names(values), l2, after) ans_desc <- .append.names.or.desc(desc(x), l1, desc(values), l2, after) ## This transformation must be atomic. x@nir_list <- ans_nir_list x@active <- ans_active x@NAMES <- ans_NAMES x@desc <- ans_desc x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### collapse() ### setGeneric("collapse", function(x) standardGeneric("collapse")) ### Always return a MaskCollection object of length 1 where the mask is active. setMethod("collapse", "MaskCollection", function(x) { keep_it <- active(x) if (!all(keep_it)) x <- x[keep_it] if (length(x) == 1) return(x) nir_list <- nir_list(x) if (length(nir_list) == 0) { nir1 <- new("NormalIRanges") } else { start1 <- unlist(lapply(nir_list, start)) width1 <- unlist(lapply(nir_list, width)) ranges <- new2("IRanges", start=start1, width=width1, check=FALSE) nir1 <- asNormalIRanges(ranges, force=TRUE) } ## This transformation must be atomic. x@nir_list <- list(nir1) x@active <- TRUE x@NAMES <- as.character(NA) x@desc <- as.character(NA) x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion. ### ### From a MaskCollection object to a NormalIRanges object. setAs("MaskCollection", "NormalIRanges", function(from) collapse(from)[[1L]] ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "show" method. ### MaskCollection.show_frame <- function(x) { lx <- length(x) cat("masks:") if (lx == 0) { cat(" NONE\n") } else { cat("\n") ## Explictely specify 'row.names=NULL' otherwise data.frame() will ## try to use the names of the first component that has suitable ## names, which could be 'active(x)' (3rd component) if 'x' has names. frame <- data.frame(maskedwidth=maskedwidth(x), maskedratio=maskedratio(x), active=active(x), row.names=NULL, check.names=FALSE) frame$names <- names(x) frame$desc <- desc(x) show(frame) if (lx >= 2) { margin <- format("", width=nchar(as.character(lx))) cat("all masks together:\n") mask0 <- collapse(`active<-`(x, TRUE)) frame <- data.frame(maskedwidth=maskedwidth(mask0), maskedratio=maskedratio(mask0), check.names=FALSE) row.names(frame) <- margin show(frame) if (sum(active(x)) < lx) { cat("all active masks together:\n") mask1 <- collapse(x) frame <- data.frame(maskedwidth=maskedwidth(mask1), maskedratio=maskedratio(mask1), check.names=FALSE) row.names(frame) <- margin show(frame) } } } } setMethod("show", "MaskCollection", function(object) { lo <- length(object) cat(class(object), " of length ", lo, " and width ", width(object), "\n", sep="") MaskCollection.show_frame(object) } ) IRanges/R/NCList-class.R0000644000175100017510000003744214626176651015735 0ustar00biocbuildbiocbuild### ========================================================================= ### NCList and NCLists objects ### ------------------------------------------------------------------------- ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### An S4 implementation of Nested Containment List (NCList). ### ### We deliberately do NOT extend IRanges. setClass("NCList", contains="IPosRanges", representation( nclist="integer", ranges="IRanges" ) ) setMethod("length", "NCList", function(x) length(x@ranges)) setMethod("names", "NCList", function(x) names(x@ranges)) setMethod("start", "NCList", function(x, ...) start(x@ranges)) setMethod("end", "NCList", function(x, ...) end(x@ranges)) setMethod("width", "NCList", function(x) width(x@ranges)) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### .shift_ranges_to_first_circle() and ### .shift_ranges_in_groups_to_first_circle() ### ### TODO: Move to intra-range-methods.R, rename (e.g. shiftToFirstCircle()), ### make it a generic with methods for IRanges and IRangesList, export, and ### document. ### ### Returns a single integer. .normarg_circle.length1 <- function(circle.length) { msg <- "'circle.length' must be a single positive integer or NA" if (!isSingleNumberOrNA(circle.length)) stop(msg) if (!is.integer(circle.length)) circle.length <- as.integer(circle.length) if (!is.na(circle.length) && circle.length <= 0L) stop(msg) circle.length } ### Returns an integer vector of length 'x_len'. .normarg_circle.length2 <- function(circle.length, x_len, what) { msg <- c("'circle.length' must be an integer vector ", "with positive or NA values") if (!is.atomic(circle.length)) stop(msg) if (!(length(circle.length) == 1L || length(circle.length) == x_len)) stop("'circle.length' must have length 1 or length of ", what) all_NAs <- all(is.na(circle.length)) if (!(all_NAs || is.numeric(circle.length))) stop(msg) if (!is.integer(circle.length)) circle.length <- as.integer(circle.length) if (!all_NAs && min(circle.length, na.rm=TRUE) <= 0L) stop(msg) if (length(circle.length) == x_len) return(circle.length) rep.int(circle.length, x_len) } ### 'circle.length' assumed to have length 1 or length of 'x'. .shift_ranges_to_first_circle <- function(x, circle.length) { if (all(is.na(circle.length))) return(x) x_start0 <- start(x) - 1L # 0-based start x_shift0 <- x_start0 %% circle.length - x_start0 x_shift0[is.na(x_shift0)] <- 0L shift(x, x_shift0) } ### 'length(circle.length)' assumed to be >= 'length(x_groups)'. .shift_ranges_in_groups_to_first_circle <- function(x, x_groups, circle.length) { circle.length <- head(circle.length, n=length(x_groups)) if (all(is.na(circle.length))) return(x) unlisted_groups <- unlist(x_groups, use.names=FALSE) circle_len <- rep.int(NA_integer_, length(x)) circle_len[unlisted_groups + 1L] <- rep.int(circle.length, elementNROWS(x_groups)) .shift_ranges_to_first_circle(x, circle_len) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### NCList constructor ### ### Returns an external pointer to the NCList C struct. .NCList_xp <- function(x_start, x_end, x_subset) { ans <- .Call2("C_new_NCList", PACKAGE="IRanges") reg.finalizer(ans, function(e) .Call("C_free_NCList", e, PACKAGE="IRanges") ) .Call2("C_build_NCList", ans, x_start, x_end, x_subset, PACKAGE="IRanges") } .nclist <- function(x_start, x_end, x_subset=NULL) { nclist_xp <- .NCList_xp(x_start, x_end, x_subset) .Call2("C_new_NCListAsINTSXP_from_NCList", nclist_xp, PACKAGE="IRanges") } NCList <- function(x, circle.length=NA_integer_) { if (!is(x, "IntegerRanges")) stop("'x' must be an IntegerRanges object") if (!is(x, "IRanges")) x <- as(x, "IRanges") ans_mcols <- mcols(x, use.names=FALSE) mcols(x) <- NULL circle.length <- .normarg_circle.length1(circle.length) x <- .shift_ranges_to_first_circle(x, circle.length) x_nclist <- .nclist(start(x), end(x)) new2("NCList", nclist=x_nclist, ranges=x, elementMetadata=ans_mcols, check=FALSE) } ### NOT exported. print_NCList <- function(x) { if (!is(x, "NCList")) stop("'x' must be an NCList object") .Call2("C_print_NCListAsINTSXP", x@nclist, start(x@ranges), end(x@ranges), PACKAGE="IRanges") invisible(NULL) } setAs("IntegerRanges", "NCList", function(from) NCList(from)) ### Inefficient because it rebuilds a new NCList object from scratch for the ### selected ranges. Supported for completeness only! setMethod("extractROWS", c("NCList", "ANY"), function(x, i) { x_class <- class(x) x <- as(x, "IRanges") as(callGeneric(), x_class) } ) ### Inefficient because it rebuilds a new NCList object from scratch for the ### concatenated ranges. Supported for completeness only! setMethod("bindROWS", "NCList", function(x, objects=list(), use.names=TRUE, ignore.mcols=FALSE, check=TRUE) { x_class <- class(x) x <- as(x, "IRanges") as(callGeneric(), x_class) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### findOverlaps_NCList() ### ### NOT exported. findOverlaps_NCList <- function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "extend", "equal"), select=c("all", "first", "last", "arbitrary", "count"), circle.length=NA_integer_) { if (!(is(query, "IntegerRanges") && is(subject, "IntegerRanges"))) stop("'query' and 'subject' must be IntegerRanges objects") if (!isSingleNumber(maxgap)) stop("'maxgap' must be a single integer") if (!is.integer(maxgap)) maxgap <- as.integer(maxgap) if (!isSingleNumber(minoverlap)) stop("'minoverlap' must be a single integer") if (!is.integer(minoverlap)) minoverlap <- as.integer(minoverlap) type <- match.arg(type) select <- match.arg(select) circle.length <- .normarg_circle.length1(circle.length) if (is(subject, "NCList")) { nclist <- subject@nclist nclist_is_q <- FALSE query <- .shift_ranges_to_first_circle(query, circle.length) } else if (is(query, "NCList")) { nclist <- query@nclist nclist_is_q <- TRUE subject <- .shift_ranges_to_first_circle(subject, circle.length) } else { ## We'll do "on-the-fly preprocessing". nclist <- NULL nclist_is_q <- NA query <- .shift_ranges_to_first_circle(query, circle.length) subject <- .shift_ranges_to_first_circle(subject, circle.length) } .Call2("C_find_overlaps_NCList", start(query), end(query), start(subject), end(subject), nclist, nclist_is_q, maxgap, minoverlap, type, select, circle.length, PACKAGE="IRanges") } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Representation of a list of NCList objects ### setClass("NCLists", contains="IntegerRangesList", representation( nclists="list", rglist="CompressedIRangesList" ), prototype( elementType="NCList" ) ) ### Combine the new "parallel slots" with those of the parent class. Make ### sure to put the new parallel slots **first**. See R/Vector-class.R file ### in the S4Vectors package for what slots should or should not be considered ### "parallel". setMethod("parallel_slot_names", "NCLists", function(x) c("nclists", "rglist", callNextMethod()) ) ### TODO: Move rglist() generic from GenomicRanges to IRanges #setMethod("rglist", "NCLists", function(x, ...) x@rglist) setMethod("ranges", "NCLists", function(x, use.names=TRUE, use.mcols=FALSE) { if (!isTRUEorFALSE(use.names)) stop("'use.names' must be TRUE or FALSE") if (!isTRUEorFALSE(use.mcols)) stop("'use.mcols' must be TRUE or FALSE") ans <- x@rglist if (!use.names) names(ans) <- NULL if (use.mcols) mcols(ans) <- mcols(x, use.names=FALSE) ans } ) setMethod("length", "NCLists", function(x) length(x@rglist)) setMethod("names", "NCLists", function(x) names(x@rglist)) setMethod("start", "NCLists", function(x, ...) start(x@rglist)) setMethod("end", "NCLists", function(x, ...) end(x@rglist)) setMethod("width", "NCLists", function(x) width(x@rglist)) setMethod("elementNROWS", "NCLists", function(x) elementNROWS(x@rglist)) setMethod("getListElement", "NCLists", function (x, i, exact=TRUE) { i <- normalizeDoubleBracketSubscript(i, x, exact=exact) new2("NCList", nclist=x@nclists[[i]], ranges=x@rglist[[i]], check=FALSE) } ) setAs("NCLists", "CompressedIRangesList", function(from) ranges(from, use.mcols=TRUE) ) setAs("NCLists", "IRangesList", function(from) ranges(from, use.mcols=TRUE) ) .extract_groups_from_RangesList <- function(x) { x_partitioning <- PartitioningByEnd(x) relist(unlist_as_integer(x_partitioning) - 1L, x_partitioning) } .nclists <- function(x, x_groups) { x_start <- start(x) x_end <- end(x) lapply(x_groups, function(group) .nclist(x_start, x_end, x_subset=group)) } ### NCLists constructor. NCLists <- function(x, circle.length=NA_integer_) { if (!is(x, "IntegerRangesList")) stop("'x' must be an IntegerRangesList object") if (!is(x, "CompressedIRangesList")) x <- as(x, "CompressedIRangesList") ans_mcols <- mcols(x, use.names=FALSE) mcols(x) <- NULL unlisted_x <- unlist(x, use.names=FALSE) x_groups <- .extract_groups_from_RangesList(x) circle.length <- .normarg_circle.length2(circle.length, length(x_groups), "'x'") unlisted_x <- .shift_ranges_in_groups_to_first_circle( unlisted_x, x_groups, circle.length) x <- relist(unlisted_x, x) x_nclists <- .nclists(unlisted_x, x_groups) new2("NCLists", nclists=x_nclists, rglist=x, elementMetadata=ans_mcols, check=FALSE) } setAs("IntegerRangesList", "NCLists", function(from) NCLists(from)) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### find_overlaps_in_groups_NCList() ### ### NOT exported. Workhorse behind findOverlaps_NCLists() below and behind ### GenomicRanges:::findOverlaps_GNCList(). find_overlaps_in_groups_NCList <- function( q, q_space, q_groups, s, s_space, s_groups, nclists, nclist_is_q, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "extend", "equal"), select=c("all", "first", "last", "arbitrary", "count"), circle.length) { if (!(is(q, "IntegerRanges") && is(s, "IntegerRanges"))) stop("'q' and 's' must be IntegerRanges object") if (!is(q_groups, "CompressedIntegerList")) stop("'q_groups' must be a CompressedIntegerList object") if (!is(s_groups, "CompressedIntegerList")) stop("'s_groups' must be a CompressedIntegerList object") if (!isSingleNumber(maxgap)) stop("'maxgap' must be a single integer") if (!is.integer(maxgap)) maxgap <- as.integer(maxgap) if (!isSingleNumber(minoverlap)) stop("'minoverlap' must be a single integer") if (!is.integer(minoverlap)) minoverlap <- as.integer(minoverlap) type <- match.arg(type) select <- match.arg(select) q_circle_len <- circle.length q_circle_len[which(nclist_is_q)] <- NA_integer_ q <- .shift_ranges_in_groups_to_first_circle(q, q_groups, q_circle_len) s_circle_len <- circle.length s_circle_len[which(!nclist_is_q)] <- NA_integer_ s <- .shift_ranges_in_groups_to_first_circle(s, s_groups, s_circle_len) .Call2("C_find_overlaps_in_groups_NCList", start(q), end(q), q_space, q_groups, start(s), end(s), s_space, s_groups, nclists, nclist_is_q, maxgap, minoverlap, type, select, circle.length, PACKAGE="IRanges") } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### findOverlaps_NCLists() ### .split_and_remap_hits <- function(all_hits, query, subject, select) { ## Compute list element lengths and offsets for 'query'. query_partitioning <- PartitioningByEnd(query) query_eltNROWS <- width(query_partitioning) query_offsets <- start(query_partitioning) - 1L ## Compute list element lengths and offsets for 'subject'. subject_partitioning <- PartitioningByEnd(subject) subject_eltNROWS <- width(subject_partitioning) subject_offsets <- start(subject_partitioning) - 1L if (select != "all") { ans <- head(relist(all_hits, query), n=length(subject)) if (select != "count") ans <- ans - head(subject_offsets, n=length(ans)) return(ans) } q_hits <- queryHits(all_hits) query_breakpoints <- end(query_partitioning) h_skeleton <- PartitioningByEnd(findInterval(query_breakpoints, q_hits)) lapply(seq_len(min(length(query), length(subject))), function(i) { hits <- all_hits[h_skeleton[[i]]] hits@from <- hits@from - query_offsets[[i]] hits@to <- hits@to - subject_offsets[[i]] hits@nLnode <- query_eltNROWS[[i]] hits@nRnode <- subject_eltNROWS[[i]] hits }) } ### NOT exported. ### Return an ordinary list of: ### (a) SortedByQueryHits objects if 'select' is "all". In that case the ### list has the length of the shortest of 'query' or 'subject'. ### (b) integer vectors if 'select' is not "all". In that case the list is ### parallel to and has the same shape as 'query'. findOverlaps_NCLists <- function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "extend", "equal"), select=c("all", "first", "last", "arbitrary", "count"), circle.length=NA_integer_) { if (!(is(query, "IntegerRangesList") && is(subject, "IntegerRangesList"))) stop("'query' and 'subject' must be IntegerRangesList objects") type <- match.arg(type) select <- match.arg(select) circle.length <- .normarg_circle.length2(circle.length, max(length(query), length(subject)), "longest of 'query' or 'subject'") if (is(subject, "NCLists")) { nclists <- subject@nclists nclist_is_q <- rep.int(FALSE, length(nclists)) subject <- subject@rglist } else if (is(query, "NCLists")) { nclists <- query@nclists nclist_is_q <- rep.int(TRUE, length(nclists)) query <- query@rglist } else { ## We'll do "on-the-fly preprocessing". NG <- min(length(query), length(subject)) nclists <- vector(mode="list", length=NG) nclist_is_q <- rep.int(NA, length(nclists)) } if (!is(query, "CompressedIRangesList")) query <- as(query, "CompressedIRangesList") q <- unlist(query, use.names=FALSE) q_groups <- .extract_groups_from_RangesList(query) if (!is(subject, "CompressedIRangesList")) subject <- as(subject, "CompressedIRangesList") s <- unlist(subject, use.names=FALSE) s_groups <- .extract_groups_from_RangesList(subject) all_hits <- find_overlaps_in_groups_NCList( q, NULL, q_groups, s, NULL, s_groups, nclists, nclist_is_q, maxgap, minoverlap, type, select, circle.length) .split_and_remap_hits(all_hits, query, subject, select) } IRanges/R/RangedSelection-class.R0000644000175100017510000000314314626176651017636 0ustar00biocbuildbiocbuild### ========================================================================= ### Selection of features and columns by intervals and column names ### ------------------------------------------------------------------------- setClass("RangedSelection", representation(ranges = "IntegerRangesList", colnames = "character")) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Accessor methods. ### setMethod("ranges", "RangedSelection", function(x, use.names=TRUE, use.mcols=FALSE) x@ranges ) setReplaceMethod("ranges", "RangedSelection", function(x, value) { x@ranges <- value x }) setMethod("colnames", "RangedSelection", function(x, do.NULL = TRUE, prefix = "col") x@colnames) setReplaceMethod("colnames", "RangedSelection", function(x, value) { x@colnames <- value x }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructor. ### RangedSelection <- function(ranges = IRangesList(), colnames = character()) { if (!is(ranges, "IntegerRangesList")) stop("'ranges' must be an IntegerRangesList") if (!is.character(colnames) || S4Vectors:::anyMissing(colnames)) stop("'colnames' must be a character vector without missing values") new("RangedSelection", ranges = ranges, colnames = colnames) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion. ### setAs("IntegerRangesList", "RangedSelection", function(from) RangedSelection(from) ) IRanges/R/Ranges-and-RangesList-classes.R0000644000175100017510000001626114626176651021155 0ustar00biocbuildbiocbuild### ========================================================================= ### Ranges and RangesList objects ### ------------------------------------------------------------------------- ### ### The Ranges class is a virtual class that serves as the parent class for ### any container that represents a vector of ranges. The core Ranges API ### consists of the start(), end() and width() getters. All 3 getters must ### return an integer vector parallel to the object and with no NAs. In ### addition the 3 vectors must satisfy the 2 following properties: ### ### (1) all(width(x) >= 0) is TRUE ### (2) all(start(x) + width(x) - 1L == end(x)) is TRUE ### ### The direct Ranges subclasses defined in the IRanges and GenomicRanges ### packages are: Pos, IntegerRanges, and GenomicRanges. ### ### The Ranges hierarchy: ### - showing the top-level classes only; ### - showing only classes defined in the IRanges package; ### - all classes showed in the diagram are virtual except IRanges, NCList, ### UnstitchedIPos, and StitchedIPos (marked with an asterisk). ### ### Ranges ### ^ ^ ### / \ ### IntegerRanges Pos ### ^ ^ ^ ### | | \ ### Views IPosRanges <----- IPos ### ^ ^ ^ ^ ^ ^ ### / / | | | \ ### . / | | | \ ### / | | UnstitchedIPos* \ ### / | | StitchedIPos* ### IRanges* | | ### ^ NCList* | ### | GroupingRanges ### | ^ ### . | ### . ### ### General ranges. setClass("Ranges", contains="List", representation("VIRTUAL")) ### Positions (i.e. ranges of with 1). setClass("Pos", contains="Ranges", representation("VIRTUAL")) ### All ranges are on a single space. ### Direct IntegerRanges subclasses: Views, IPosRanges. setClass("IntegerRanges", contains="Ranges", representation("VIRTUAL")) setClass("RangesList", contains="List", representation("VIRTUAL"), prototype(elementType="Ranges") ) setClass("SimpleRangesList", contains=c("RangesList", "SimpleList"), representation("VIRTUAL") ) setClass("PosList", contains="RangesList", representation("VIRTUAL"), prototype(elementType="Pos") ) setClass("SimplePosList", contains=c("PosList", "SimpleRangesList"), representation("VIRTUAL") ) setClass("IntegerRangesList", contains="RangesList", representation("VIRTUAL"), prototype(elementType="IntegerRanges") ) setClass("SimpleIntegerRangesList", contains=c("IntegerRangesList", "SimpleRangesList"), representation("VIRTUAL") ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Methods for Ranges derivatives ### ### Note that the 3 default methods below implement a circular relationship. ### So Ranges subclasses must overwrite at least 2 of them! setMethod("start", "Ranges", function(x, ...) {1L - width(x) + end(x)}) setMethod("end", "Ranges", function(x, ...) {width(x) - 1L + start(x)}) setMethod("width", "Ranges", function(x) {end(x) - start(x) + 1L}) setMethod("length", "Ranges", function(x) length(start(x))) setGeneric("mid", function(x, ...) standardGeneric("mid")) setMethod("mid", "Ranges", function(x) start(x) + as.integer((width(x) - 1) / 2) ) ### A Ranges object is considered empty iff all its ranges are empty. setMethod("isEmpty", "Ranges", function(x) all(width(x) == 0L)) setMethod("showAsCell", "Ranges", function(object) as.character(object)) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Ranges extends List ### setMethod("elementNROWS", "Ranges", function(x) setNames(width(x), names(x)) ) unlist_as_integer <- function(x) { stopifnot(is(x, "Ranges")) if (is(x, "Pos")) return(pos(x)) sequence(width(x), from=start(x)) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Methods for Pos derivatives ### ### Pos subclasses only need to implement a "pos" method. setMethod("start", "Pos", function(x) pos(x)) setMethod("end", "Pos", function(x) pos(x)) setMethod("width", "Pos", function(x) rep.int(1L, length(x))) setMethod("as.integer", "Pos", function(x) pos(x)) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Methods for IntegerRanges derivatives ### #setGeneric("isDisjointAndSorted", # function(x, ...) standardGeneric("isDisjointAndSorted") #) #setMethod("isDisjointAndSorted", "IntegerRanges", # function(x) # { # x_start <- start(x) # x_len <- length(x_start) # if (x_len <= 1L) # return(TRUE) # x_end <- end(x) # all(x_start[-1L] > x_end[-x_len]) # } #) ### Being "normal" is stronger that being "disjoint and sorted". setGeneric("isNormal", function(x, ...) standardGeneric("isNormal")) setMethod("isNormal", "IntegerRanges", function(x) { x_start <- start(x) x_end <- end(x) if (any(x_end < x_start)) return(FALSE) x_len <- length(x_start) if (x_len <= 1L) return(TRUE) all(x_start[-1L] > x_end[-x_len] + 1L) } ) setGeneric("whichFirstNotNormal", function(x) standardGeneric("whichFirstNotNormal") ) setMethod("whichFirstNotNormal", "IntegerRanges", function(x) { is_ok <- width(x) >= 1L if (length(x) >= 2) is_ok <- is_ok & c(TRUE, start(x)[-1L] - end(x)[-length(x)] >= 2L) which(!is_ok)[1L] } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Methods for RangesList derivatives ### setMethod("start", "RangesList", function(x) as(lapply(x, start), "SimpleIntegerList") ) setMethod("end", "RangesList", function(x) as(lapply(x, end), "SimpleIntegerList") ) setMethod("width", "RangesList", function(x) as(lapply(x, width), "SimpleIntegerList") ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Methods for PosList derivatives ### setMethod("pos", "PosList", function(x) as(lapply(x, pos), "SimpleIntegerList") ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Validity ### validate_Ranges <- function(x) { x_start <- start(x) x_end <- end(x) x_width <- width(x) validity_failures <- .Call2("C_validate_Ranges", x_start, x_end, x_width, PACKAGE="IRanges") if (!is.null(validity_failures)) return(validity_failures) if (!(is.null(names(x_start)) && is.null(names(x_end)) && is.null(names(x_width)))) return(wmsg("'start(x)', 'end(x)', and 'width(x)' ", "cannot have names on them")) NULL } validate_Pos <- function(x) { x_width <- width(x) if (!all(x_width == 1L)) return(wmsg()) x_pos <- pos(x) x_start <- start(x) if (!all(x_pos == x_start)) return(wmsg()) NULL } IRanges/R/Rle-class-leftovers.R0000644000175100017510000000552014626176651017322 0ustar00biocbuildbiocbuild### ========================================================================= ### IMPORTANT NOTE - 7/2/2014 ### Most of the stuff that used to be in the IRanges/R/Rle-class.R file was ### moved to the S4Vectors package (to R/Rle-class.R and R/Rle-utils.R). ### The stuff that could not be moved there was *temporarily* kept here in ### Rle-class-leftovers.R but will need to find a new home (in S4Vectors ### or in IRanges). ### ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Accessor methods. ### setMethod("ranges", "Rle", function(x, use.names=TRUE, use.mcols=FALSE) IRanges(start(x), width=width(x)) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion ### setAs("Rle", "IRanges", function(from) { if (!is.logical(runValue(from)) || S4Vectors:::anyMissing(runValue(from))) stop("cannot coerce a non-logical 'Rle' or a logical 'Rle' ", "with NAs to an IRanges object") keep <- runValue(from) ## The returned IRanges instance is guaranteed to be normal. ans_start <- start(from)[keep] ans_width <- runLength(from)[keep] new2("IRanges", start=ans_start, width=ans_width, check=FALSE) }) setAs("Rle", "NormalIRanges", function(from) newNormalIRangesFromIRanges(as(from, "IRanges"), check=FALSE)) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### General methods ### setGeneric("findRange", signature = "vec", function(x, vec) standardGeneric("findRange")) setMethod("findRange", signature = c(vec = "Rle"), function(x, vec) { run <- findRun(x, vec) if (S4Vectors:::anyMissing(run)) stop("all 'x' values must be in [1, 'length(vec)']") IRanges(start = start(vec)[run], width = width(vec)[run], names = names(x)) }) setGeneric("orderAsRanges", signature = c("x"), # not exported function(x, na.last = TRUE, decreasing = FALSE) standardGeneric("orderAsRanges")) setMethod("orderAsRanges", "Rle", function(x, na.last = TRUE, decreasing = FALSE) { ord <- base::order(runValue(x), na.last = na.last, decreasing = decreasing) new2("IRanges", start = start(x)[ord], width = runLength(x)[ord], check = FALSE) }) setGeneric("splitRanges", signature = "x", function(x) standardGeneric("splitRanges")) setMethod("splitRanges", "Rle", function(x) { split(IRanges(start = start(x), width = runLength(x)), runValue(x)) }) setMethod("splitRanges", "vector_OR_factor", function(x) { callGeneric(Rle(x)) }) IRanges/R/RleViews-class.R0000644000175100017510000001041714626176651016332 0ustar00biocbuildbiocbuild### ========================================================================= ### RleViews objects ### ------------------------------------------------------------------------- ### ### The RleViews class is the basic container for storing a set of views ### (start/end locations) on the same Rle object, called the "subject" ### vector. setClass("RleViews", contains=c("Views", "RleList"), representation( subject="Rle" ) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### User-friendly constructor. ### setMethod("Views", "Rle", function(subject, start=NULL, end=NULL, width=NULL, names=NULL) new_Views(subject, start=start, end=end, width=width, names=names, Class="RleViews") ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion. ### setAs("AtomicList", "RleViews", function(from) { to <- Views(as(unlist(from, use.names = FALSE), "Rle"), PartitioningByEnd(from)) names(to) <- names(from) to }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "show" method. ### ### The 2 helper functions below convert a given view on an Rle object ### into a character-string. ### Both assume that 'start' <= 'end' (so they don't check it) and ### padd the result with spaces to produce the "margin effect" ### if 'start' or 'end' are out of limits. RleViews.show_vframe_header <- function(iW, startW, endW, widthW) { cat(format("", width=iW+1), format("start", width=startW, justify="right"), " ", format("end", width=endW, justify="right"), " ", format("width", width=widthW, justify="right"), "\n", sep="") } RleViews.show_vframe_line <- function(x, i, iW, startW, endW, widthW) { lsx <- length(subject(x)) start <- start(x)[i] end <- end(x)[i] width <- end - start + 1 snippetWidth <- getOption("width") - 10 - iW - startW - endW - widthW if (width > 0 && lsx > 0 && start <= lsx && end >= 1) { snippetStart <- max(min(start,lsx),1) snippetEnd <- max(min(end,lsx,start + snippetWidth),1) snippet <- format(as.vector(extractROWS(subject(x), IRanges(snippetStart, snippetEnd)))) snippet <- snippet[cumsum(nchar(snippet) + 1L) < snippetWidth] if (length(snippet) < width) { snippet <- c(snippet, "...") } snippet <- paste(snippet, collapse = " ") } else { snippet <- " " } cat(format(paste("[", i,"]", sep=""), width=iW, justify="right"), " ", format(start, width=startW, justify="right"), " ", format(end, width=endW, justify="right"), " ", format(width, width=widthW, justify="right"), " ", "[", snippet, "]\n", sep="") } ### 'half_nrow' must be >= 1 RleViews.show_vframe <- function(x, half_nrow=9L) { cat("\nviews:") lx <- length(x) if (lx == 0) cat(" NONE\n") else { cat("\n") iW <- nchar(as.character(lx)) + 2 # 2 for the brackets startMax <- max(start(x)) startW <- max(nchar(startMax), nchar("start")) endMax <- max(end(x)) endW <- max(nchar(endMax), nchar("end")) widthMax <- max(width(x)) widthW <- max(nchar(widthMax), nchar("width")) RleViews.show_vframe_header(iW, startW, endW, widthW) if (lx <= 2*half_nrow+1) { for (i in seq_len(lx)) RleViews.show_vframe_line(x, i, iW, startW, endW, widthW) } else { for (i in 1:half_nrow) RleViews.show_vframe_line(x, i, iW, startW, endW, widthW) cat(format("...", width=iW, justify="right"), " ", format("...", width=startW, justify="right"), " ", format("...", width=endW, justify="right"), " ", format("...", width=widthW, justify="right"), " ...\n", sep="") for (i in (lx-half_nrow+1L):lx) RleViews.show_vframe_line(x, i, iW, startW, endW, widthW) } } } setMethod("show", "RleViews", function(object) { cat("Views on a ", length(subject(object)), "-length Rle subject\n", sep="") RleViews.show_vframe(object) } ) IRanges/R/RleViews-utils.R0000644000175100017510000000445014626176651016365 0ustar00biocbuildbiocbuild### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "viewApply", "viewMins", "viewMaxs", and "viewSums" generics and ### methods. ### setMethod("viewApply", "RleViews", function(X, FUN, ..., simplify = TRUE) { X <- trim(X) ans <- aggregate(subject(X), start = structure(start(X), names = names(X)), end = end(X), FUN = FUN, ..., simplify = simplify) if (!simplify) { ans <- S4Vectors:::new_SimpleList_from_list("SimpleList", ans, metadata=metadata(X), mcols=mcols(X)) } ans }) setMethod("viewMins", "RleViews", function(x, na.rm = FALSE) .Call2("C_viewMins_RleViews", trim(x), na.rm, PACKAGE="IRanges")) setMethod("viewMaxs", "RleViews", function(x, na.rm = FALSE) .Call2("C_viewMaxs_RleViews", trim(x), na.rm, PACKAGE="IRanges")) setMethod("viewSums", "RleViews", function(x, na.rm = FALSE) .Call2("C_viewSums_RleViews", trim(x), na.rm, PACKAGE="IRanges")) setMethod("viewMeans", "RleViews", function(x, na.rm = FALSE) .Call2("C_viewMeans_RleViews", trim(x), na.rm, PACKAGE="IRanges")) setMethod("viewWhichMins", "RleViews", function(x, na.rm = FALSE) .Call2("C_viewWhichMins_RleViews", trim(x), na.rm, PACKAGE="IRanges")) setMethod("viewWhichMaxs", "RleViews", function(x, na.rm = FALSE) .Call2("C_viewWhichMaxs_RleViews", trim(x), na.rm, PACKAGE="IRanges")) setMethod("viewRangeMaxs", "RleViews", function(x, na.rm = FALSE) { maxs <- viewWhichMaxs(trim(x), na.rm = na.rm) if (S4Vectors:::anyMissing(maxs)) stop("missing values present, set 'na.rm = TRUE'") findRange(maxs, subject(x)) }) setMethod("viewRangeMins", "RleViews", function(x, na.rm = FALSE) { mins <- viewWhichMins(trim(x), na.rm = na.rm) if (S4Vectors:::anyMissing(mins)) stop("missing values present, set 'na.rm = TRUE'") findRange(mins, subject(x)) }) IRanges/R/RleViewsList-class.R0000644000175100017510000000675114626176651017174 0ustar00biocbuildbiocbuild### ========================================================================= ### RleViewsList objects ### ------------------------------------------------------------------------- setClass("RleViewsList", representation("VIRTUAL"), prototype = prototype(elementType = "RleViews"), contains = "ViewsList") setClass("SimpleRleViewsList", prototype = prototype(elementType = "RleViews"), contains = c("RleViewsList", "SimpleViewsList")) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Accessor. ### setMethod("subject", "SimpleRleViewsList", function(x) S4Vectors:::new_SimpleList_from_list("SimpleRleList", lapply(x, slot, "subject")) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructor. ### setMethod("Views", "RleList", function(subject, start=NULL, end=NULL, width=NULL, names=NULL) RleViewsList(rleList = subject, rangesList = start)) RleViewsList <- function(..., rleList, rangesList) { views <- list(...) if (!missing(rleList) && !missing(rangesList)) { if (length(views) > 0) stop(wmsg("'...' must be empty when 'rleList' and 'rangesList' ", "are specified")) if (!is(rleList, "RleList")) stop(wmsg("'rleList' must be a RleList object")) if (!is(rangesList, "IntegerRangesList")) { rangesList <- try(IRangesList(rangesList), silent = TRUE) if (inherits(rangesList, "try-error")) stop(wmsg("'rangesList' must be a IntegerRangesList object")) } if (length(rleList) != length(rangesList)) stop("'rleList' and 'rangesList' must have the same length") rleList_names <- names(rleList) rangesList_names <- names(rangesList) if (!(is.null(rleList_names) || is.null(rangesList_names) || identical(rleList_names, rangesList_names))) { if (anyDuplicated(rleList_names,) || anyDuplicated(rangesList_names)) stop(wmsg("when both 'rleList' and 'rangesList' have names, ", "the names on each object cannot have duplicates")) if (!setequal(rleList_names, rangesList_names)) stop(wmsg("when both 'rleList' and 'rangesList' have names, ", "the set of names must be the same on each object")) warning(wmsg("'rleList' was reordered so that its names ", "match the names on 'rangesList'")) rleList <- rleList[rangesList_names] } views <- Map(Views, rleList, rangesList) } else if ((length(views) > 0) && (!missing(rleList) || !missing(rangesList))) { stop(wmsg("cannot specify 'rleList' or 'rangesList' ", "when specifying '...'")) } else { if (length(views) == 1 && is.list(views[[1L]])) views <- views[[1L]] if (!all(sapply(views, is, "RleViews"))) stop(wmsg("all elements in '...' must be RleViews objects")) } S4Vectors:::new_SimpleList_from_list("SimpleRleViewsList", views) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion. ### setAs("RleViewsList", "SimpleIRangesList", function(from) IRangesList(lapply(from, as, "IRanges"), compress=FALSE)) setAs("RleViewsList", "IRangesList", function(from) as(from, "SimpleIRangesList")) IRanges/R/RleViewsList-utils.R0000644000175100017510000000746014626176651017225 0ustar00biocbuildbiocbuild### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "viewApply", "viewMins", "viewMaxs", and "viewSums" generics and ### methods. ### setMethod("viewApply", "RleViewsList", function(X, FUN, ..., simplify = TRUE) { ans_listData <- lapply(structure(seq_along(X), names=names(X)), function(i) { ans_elt <- aggregate( subject(X[[i]]), start=structure(start(X[[i]]), names=names(start(X[[i]]))), end=end(X[[i]]), FUN=FUN, ..., simplify=simplify) if (!simplify) ans_elt <- S4Vectors:::new_SimpleList_from_list("SimpleList", ans_elt, metadata=metadata(X[[i]]), mcols=mcols(X[[i]], use.names=FALSE)) ans_elt }) S4Vectors:::new_SimpleList_from_list("SimpleList", ans_listData, metadata=metadata(X), mcols=mcols(X, use.names=FALSE))}) .summaryRleViewsList <- function(x, FUN, na.rm = FALSE, outputListType = NULL) { FUN <- match.fun(FUN) if (length(x) == 0) { outputListType <- "SimpleList" listData <- list() } else { if (is.null(outputListType)) { valuesClass <- class(runValue(subject(x[[1L]]))) if (valuesClass == "integer" || valuesClass == "logical") outputListType <- "SimpleIntegerList" else if (valuesClass == "numeric") outputListType <- "SimpleNumericList" else stop("cannot compute numeric summary over a non-numeric Rle") } listData <- lapply(structure(seq_len(length(x)), names = names(x)), function(i) FUN(x[[i]], na.rm = na.rm)) } S4Vectors:::new_SimpleList_from_list(outputListType, listData, metadata = metadata(x), mcols = mcols(x, use.names=FALSE)) } setMethod("viewMins", "RleViewsList", function(x, na.rm = FALSE) .summaryRleViewsList(x, FUN = viewMins, na.rm = na.rm)) setMethod("viewMaxs", "RleViewsList", function(x, na.rm = FALSE) .summaryRleViewsList(x, FUN = viewMaxs, na.rm = na.rm)) setMethod("viewSums", "RleViewsList", function(x, na.rm = FALSE) .summaryRleViewsList(x, FUN = viewSums, na.rm = na.rm)) setMethod("viewMeans", "RleViewsList", function(x, na.rm = FALSE) .summaryRleViewsList(x, FUN = viewMeans, na.rm = na.rm, outputListType = "SimpleNumericList")) setMethod("viewWhichMins", "RleViewsList", function(x, na.rm = FALSE) .summaryRleViewsList(x, FUN = viewWhichMins, na.rm = na.rm, outputListType = "SimpleIntegerList")) setMethod("viewWhichMaxs", "RleViewsList", function(x, na.rm = FALSE) .summaryRleViewsList(x, FUN = viewWhichMaxs, na.rm = na.rm, outputListType = "SimpleIntegerList")) setMethod("viewRangeMaxs", "RleViewsList", function(x, na.rm = FALSE) .summaryRleViewsList(x, FUN = viewRangeMaxs, na.rm = na.rm, outputListType = "SimpleIRangesList")) setMethod("viewRangeMins", "RleViewsList", function(x, na.rm = FALSE) .summaryRleViewsList(x, FUN = viewRangeMins, na.rm = na.rm, outputListType = "SimpleIRangesList")) IRanges/R/SimpleGrouping-class.R0000644000175100017510000001033514626176651017535 0ustar00biocbuildbiocbuild### ========================================================================= ### Grouping objects implemented with an IntegerList ### ------------------------------------------------------------------------- setClass("SimpleGrouping", ### TODO: contain VIRTUAL after R 3.4 release contains=c("Grouping", "SimpleIntegerList")) setClass("SimpleManyToOneGrouping", contains=c("ManyToOneGrouping", "SimpleGrouping")) setClass("BaseManyToManyGrouping", representation(nobj="integer"), ### TODO: contain VIRTUAL after R 3.4 release contains="ManyToManyGrouping", validity=function(object) { if (!isSingleNumber(object@nobj)) "'nobj' must be a single, non-NA number" }) setClass("SimpleManyToManyGrouping", contains=c("BaseManyToManyGrouping", "SimpleGrouping")) ### ------------------------------------------------------------------------- ### Grouping API implementation ### ---------------------------- ### setMethod("nobj", "BaseManyToManyGrouping", function(x) x@nobj) ### ------------------------------------------------------------------------- ### Constructors ### ---------------------------- ### ManyToOneGrouping <- function(..., compress=TRUE) { CompressedOrSimple <- if (compress) "Compressed" else "Simple" Class <- paste0(CompressedOrSimple, "ManyToOneGrouping") new(Class, IntegerList(..., compress=compress)) } ManyToManyGrouping <- function(..., nobj, compress=TRUE) { CompressedOrSimple <- if (compress) "Compressed" else "Simple" Class <- paste0(CompressedOrSimple, "ManyToManyGrouping") new(Class, IntegerList(..., compress=compress), nobj=nobj) } ### ------------------------------------------------------------------------- ### Coercion ### ---------------------------- ### setOldClass(c("grouping", "integer")) ## utils::relist dipatches only on 'skeleton' so this is here instead of in R setMethod("relist", c("grouping", "missing"), function(flesh, skeleton) { relist(as.integer(flesh), PartitioningByEnd(attr(flesh, "ends"))) }) setMethod("split", c("ANY", "ManyToOneGrouping"), function(x, f, drop=FALSE) { stopifnot(isTRUEorFALSE(drop)) ans <- extractList(x, f) if (drop) { ans <- ans[lengths(ans) > 0L] } ans }) setAs("grouping", "Grouping", function(from) { as(from, "ManyToOneGrouping") }) setAs("grouping", "ManyToOneGrouping", function(from) { ManyToOneGrouping(relist(from), compress=TRUE) }) setAs("vector", "Grouping", function(from) { if (anyNA(from)) as(from, "ManyToManyGrouping") else as(from, "ManyToOneGrouping") }) setAs("vector", "ManyToOneGrouping", function(from) { to <- as(grouping(from), "Grouping") names(to) <- from[unlist(to)[end(PartitioningByEnd(to))]] to }) setAs("factor", "ManyToOneGrouping", function(from) { ManyToOneGrouping(splitAsList(seq_along(from), from)) }) setAs("vector", "ManyToManyGrouping", function(from) { g <- as(from, "ManyToOneGrouping") if (anyNA(from)) g <- g[-length(g)] ManyToManyGrouping(g, nobj=length(from)) }) setAs("ManyToOneGrouping", "factor", function(from) { levels <- if (!is.null(names(from))) { names(from) } else { as.character(seq_along(from)) } structure(togroup(from), levels=levels, class="factor") }) setMethod("as.factor", "ManyToOneGrouping", function(x) { as(x, "factor") }) makeGroupNames <- function(x) { if (is.null(x)) { x <- character(length(x)) } ind <- which(x == "") x[ind] <- paste("Group", ind, sep = ".") x } levelCols <- function(by) { DataFrame(expand.grid(lapply(by, levels))) } setAs("FactorList", "Grouping", function(from) { l <- as.list(from) names(l) <- makeGroupNames(names(from)) as(DataFrame(l), "Grouping") }) setAs("DataFrame", "Grouping", function(from) { factors <- lapply(from, as.factor) l <- splitAsList(seq_len(nrow(from)), factors) mcols(l) <- levelCols(factors) if (anyNA(from, recursive=TRUE)) { ManyToManyGrouping(l, nobj=nrow(from)) } else { ManyToOneGrouping(l) } }) IRanges/R/Vector-class-leftovers.R0000644000175100017510000000765314626176651020053 0ustar00biocbuildbiocbuild### ========================================================================= ### IMPORTANT NOTE - 4/29/2014 ### Most of the stuff that used to be in the IRanges/R/Vector-class.R file ### was moved to the S4Vectors package (to R/Vector-class.R). ### The stuff that could not be moved there was *temporarily* kept here in ### Vector-class-leftovers.R but will need to find a new home (in S4Vectors ### or in IRanges). ### ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Other subsetting-related operations ### ### S3/S4 combo for window<-.Vector `window<-.Vector` <- function(x, start=NA, end=NA, width=NA, ..., value) { window(x, start, end, width, ...) <- value x } `.window<-.Vector` <- function(x, start=NA, end=NA, width=NA, ..., value) { i <- solveUserSEWForSingleSeq(NROW(x), start, end, width) li <- width(i) if (li == 0L) { ## Surprisingly, in that case, `[<-` on standard vectors does not ## even look at 'value'. So neither do we... return(x) } lv <- NROW(value) if (lv == 0L) stop("replacement has length zero") value <- normalizeSingleBracketReplacementValue(value, x) if (li != lv) { if (li %% lv != 0L) warning("number of values supplied is not a sub-multiple ", "of the number of values to be replaced") value <- extractROWS(value, rep(seq_len(lv), length.out=li)) } c(window(x, end=start(i)-1L), value, window(x, start=end(i)+1L)) } setReplaceMethod("window", "Vector", `.window<-.Vector`) ### S3/S4 combo for window<-.vector `window<-.vector` <- `window<-.Vector` setReplaceMethod("window", "vector", `window<-.vector`) ### S3/S4 combo for window<-.factor `window<-.factor` <- function(x, start=NA, end=NA, width=NA, ..., value) { levels <- levels(x) x <- as.character(x) value <- as.character(value) factor(callGeneric(), levels=levels) } setReplaceMethod("window", "factor", `window<-.factor`) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Looping methods ### #.tapplyDefault <- base::tapply #environment(.tapplyDefault) <- topenv() .tapplyDefault <- function(X, INDEX, FUN = NULL, ..., simplify = TRUE) { if (!is.null(FUN)) FUN <- match.fun(FUN) if (missing(INDEX)) stop("'INDEX' is missing") if (!is(INDEX, "RleList")) { if (!is.list(INDEX) && !is(INDEX, "Rle")) INDEX <- Rle(INDEX) INDEX <- RleList(INDEX) } nI <- length(INDEX) namelist <- vector("list", nI) names(namelist) <- names(INDEX) extent <- integer(nI) nx <- NROW(X) one <- 1L group <- Rle(one, nx) ngroup <- one for (i in seq_len(nI)) { index <- INDEX[[i]] if (!is.factor(runValue(index))) runValue(index) <- factor(runValue(index)) offset <- index runValue(offset) <- ngroup * (as.integer(runValue(index)) - one) if (length(index) != nx) stop("arguments must have same length") namelist[[i]] <- levels(index) extent[i] <- nlevels(index) group <- group + offset ngroup <- ngroup * nlevels(index) } if (is.null(FUN)) return(as.vector(group)) groupRanges <- splitRanges(group) ans <- lapply(groupRanges, function(i) FUN(extractROWS(X, i), ...)) index <- as.integer(names(ans)) if (simplify && all(unlist(lapply(ans, length), use.names=FALSE) == 1L)) { ansmat <- array(dim = extent, dimnames = namelist) ans <- unlist(ans, recursive = FALSE) } else { ansmat <- array(vector("list", prod(extent)), dim = extent, dimnames = namelist) } if (length(index) > 0) { names(ans) <- NULL ansmat[index] <- ans } ansmat } setMethod("tapply", c("Vector", "ANY"), .tapplyDefault) setMethod("tapply", c("ANY", "Vector"), .tapplyDefault) setMethod("tapply", c("Vector", "Vector"), .tapplyDefault) IRanges/R/Views-class.R0000644000175100017510000002667614626176651015705 0ustar00biocbuildbiocbuild### ========================================================================= ### Views objects ### ------------------------------------------------------------------------- ### ### The Views virtual class is a general container for storing a set of views ### on an arbitrary Vector object, called the "subject". ### setClass("Views", contains="IntegerRanges", representation( "VIRTUAL", subject="Vector", ranges="IRanges" ) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### parallel_slot_names() ### ### Combine the new "parallel slots" with those of the parent class. Make ### sure to put the new parallel slots **first**. See R/Vector-class.R file ### in the S4Vectors package for what slots should or should not be considered ### "parallel". setMethod("parallel_slot_names", "Views", function(x) c("ranges", callNextMethod()) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Accessor methods. ### setGeneric("subject", function(x) standardGeneric("subject")) setMethod("subject", "Views", function(x) x@subject) setMethod("ranges", "Views", function(x, use.names=TRUE, use.mcols=FALSE) x@ranges ) setGeneric("ranges<-", function(x, ..., value) standardGeneric("ranges<-")) setReplaceMethod("ranges", "Views", function(x, ..., value) { stop("ranges setter for Views objects not ready yet") } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Methods derived from the IRanges interface. ### setMethod("start", "Views", function(x, ...) start(ranges(x))) setMethod("width", "Views", function(x) width(ranges(x))) setMethod("names", "Views", function(x) names(ranges(x))) setReplaceMethod("start", "Views", function(x, ..., value) { start(x@ranges, ...) <- value x } ) setReplaceMethod("end", "Views", function(x, ..., value) { end(x@ranges, ...) <- value x } ) setReplaceMethod("width", "Views", function(x, ..., value) { width(x@ranges, ...) <- value x } ) setReplaceMethod("names", "Views", function(x, value) { names(x@ranges) <- value x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Constructor ### ### The low-level "Views" constructor. ### NOT exported but used in XVector, Biostrings, and triplex packages. ### TODO: - add a 'check.limits' arg (default to TRUE) for raising an error if ### some views are "out of limits". new_Views <- function(subject, start=NULL, end=NULL, width=NULL, names=NULL, Class=NULL) { if (is(start, "IntegerRanges")) { if (!is.null(end) || !is.null(width)) stop(wmsg("'end' and 'width' must be NULLs when ", "'start' is an IntegerRanges object")) ans_ranges <- start if (class(ans_ranges) != "IRanges") ans_ranges <- as(ans_ranges, "IRanges") ## Keep the names that are already in 'ranges' unless the 'names' arg ## was specified. if (!is.null(names)) names(ans_ranges) <- names ans_mcols <- mcols(ans_ranges, use.names=FALSE) mcols(ans_ranges) <- NULL } else { ans_ranges <- IRanges(start=start, end=end, width=width, names=names) ans_mcols <- NULL } if (is.null(Class)) Class <- paste(class(subject), "Views", sep="") new2(Class, subject=subject, ranges=ans_ranges, elementMetadata=ans_mcols, check=FALSE) } ### The user-friendly "Views" constructor. ### TODO: Same as for the new_Views() function above. setGeneric("Views", signature="subject", function(subject, start=NULL, end=NULL, width=NULL, names=NULL) standardGeneric("Views") ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### unlist() ### ### Overwrite method for List objects with optimized method for Views objects. setMethod("unlist", "Views", function(x, recursive=TRUE, use.names=TRUE) { if (!isTRUEorFALSE(use.names)) stop("'use.names' must be TRUE or FALSE") unlisted_x <- subject(x)[ranges(x)] if (use.names) unlisted_x <- S4Vectors:::set_unlisted_names(unlisted_x, x) unlisted_x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion. ### ### Returns a single view covering the entire sequence. setAs("Vector", "Views", function(from) Views(from, start=1L, width=length(from)) ) setAs("Views", "IntegerRanges", function(from) ranges(from)) setAs("Views", "IRanges", function(from) ranges(from)) ### Unfortunately, even if we've already defined the IRanges->NormalIRanges ### "coerce" method to override the silly implicit one, we still need to ### define the ->NormalIRanges ones for every that contains ### IRanges. Otherwise, again, 'as(x, "NormalIRanges")' would call another ### silly implicit method when 'x' is a instance. ### Yes, this is another S4 "feature": ### https://stat.ethz.ch/pipermail/r-devel/2008-April/049027.html setAs("Views", "NormalIRanges", function(from) asNormalIRanges(ranges(from), force=TRUE) ) setMethod("as.matrix", "Views", function(x, rev = FALSE, max.width = NA) { x_ranges <- restrict(ranges(x), start = 1L) if (is.na(max.width)) { max.width <- max(width(x_ranges)) } rev <- S4Vectors:::recycleVector(rev, length(x)) part <- PartitioningByWidth(x_ranges) from <- ifelse(rev, end(part), start(part)) by <- ifelse(rev, -1L, 1L) ord <- sequence(width(part), from=from, by=by) v <- extractROWS(subject(x), x_ranges)[ord] v_fill <- rep.int(NA, max.width * length(x)) part <- PartitioningByWidth(rep(max.width, length(x))) i <- unlist_as_integer(IRanges(start(part), width = width(x_ranges))) v_fill[i] <- as.vector(v) matrix(v_fill, ncol = max.width, byrow = TRUE, dimnames = list(names(x), NULL)) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Extracting a view. ### setMethod("getListElement", "Views", function(x, i, exact=TRUE) { i <- normalizeDoubleBracketSubscript(i, x, exact=exact) start <- start(x)[i] end <- end(x)[i] if (start < 1L || end > length(subject(x))) stop("view is out of limits") extractROWS(subject(x), IRanges(start, end)) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Concatenation ### .check_that_Views_objects_are_concatenable <- function(x, objects) { ok <- vapply( objects, function(object) isTRUE(all.equal(subject(object), subject(x))), logical(1), USE.NAMES=FALSE ) if (!all(ok)) stop(wmsg("the Views objects to concatenate ", "must have the same subject")) } .concatenate_Views_objects <- function(x, objects=list(), use.names=TRUE, ignore.mcols=FALSE, check=TRUE) { objects <- S4Vectors:::prepare_objects_to_bind(x, objects) .check_that_Views_objects_are_concatenable(x, objects) callNextMethod() } setMethod("bindROWS", "Views", .concatenate_Views_objects) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "trim" function. ### setGeneric("trim", signature="x", function(x, use.names=TRUE, ...) standardGeneric("trim") ) setMethod("trim", "Views", function(x, use.names=TRUE) { if (length(x) == 0L) return(x) if (min(start(x)) >= 1L && max(end(x)) <= length(subject(x))) return(x) x@ranges <- restrict(ranges(x), start=1L, end=length(subject(x)), keep.all.ranges=TRUE, use.names=use.names) x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "subviews" function. ### ### TODO: - add a 'check.limits' arg (default to TRUE) for raising an error if ### some views are "out of limits" setGeneric("subviews", signature="x", function(x, start=NA, end=NA, width=NA, use.names=TRUE) standardGeneric("subviews") ) setMethod("subviews", "Views", function(x, start=NA, end=NA, width=NA, use.names=TRUE) trim(narrow(x, start=start, end=end, width=width, use.names=use.names)) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "successiveViews" function. ### successiveViews <- function(subject, width, gapwidth=0, from=1) { ranges <- successiveIRanges(width, gapwidth=gapwidth, from=from) Views(subject, ranges) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "slidingViews" function. ### slidingViews <- function(subject, width, shift = 1L) { ranges <- slidingIRanges(length(subject), width, shift) Views(subject, ranges) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "viewApply" function. ### setGeneric("viewApply", signature="X", function(X, FUN, ..., simplify = TRUE) standardGeneric("viewApply") ) setMethod("viewApply", "Views", function(X, FUN, ..., simplify = TRUE) { X <- trim(X) Xsubject <- subject(X) Xstart <- start(X) Xwidth <- width(X) ans <- sapply(structure(seq_len(length(X)), names=names(X)), function(i) FUN(extractROWS(Xsubject, IRanges(start=Xstart[i], width=Xwidth[i])), ...), simplify = simplify) if (!simplify) { ans <- S4Vectors:::new_SimpleList_from_list("SimpleList", ans, metadata = metadata(X), mcols = mcols(X, use.names=FALSE)) } ans } ) setGeneric("viewMins", signature="x", function(x, na.rm = FALSE) standardGeneric("viewMins")) setGeneric("viewMaxs", signature="x", function(x, na.rm = FALSE) standardGeneric("viewMaxs")) setGeneric("viewSums", signature="x", function(x, na.rm = FALSE) standardGeneric("viewSums")) setGeneric("viewMeans", signature="x", function(x, na.rm = FALSE) standardGeneric("viewMeans")) setGeneric("viewWhichMins", signature="x", function(x, na.rm = FALSE) standardGeneric("viewWhichMins")) setGeneric("viewWhichMaxs", signature="x", function(x, na.rm = FALSE) standardGeneric("viewWhichMaxs")) setGeneric("viewRangeMaxs", function(x, na.rm = FALSE) standardGeneric("viewRangeMaxs")) setGeneric("viewRangeMins", function(x, na.rm = FALSE) standardGeneric("viewRangeMins")) setMethod("Summary", "Views", function(x, ..., na.rm = FALSE) { viewSummaryFunMap <- list(min = viewMins, max = viewMaxs, sum = viewSums) viewSummaryFun <- viewSummaryFunMap[[.Generic]] if (!is.null(viewSummaryFun)) { if (length(list(...))) stop("Passing multiple arguments to '", .Generic, "' is not supported.") viewSummaryFun(x, na.rm = na.rm) } else { Summary(ranges(x), ..., na.rm = na.rm) } }) setMethod("mean", "Views", viewMeans) setMethod("which.max", "Views", function(x) { viewWhichMaxs(x, na.rm = TRUE) }) setMethod("which.min", "Views", function(x) { viewWhichMins(x, na.rm = TRUE) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Show ### setMethod("showAsCell", "Views", function(object) { showAsCell(as(object, relistToClass(subject(object)))) }) IRanges/R/ViewsList-class.R0000644000175100017510000000373214626176651016525 0ustar00biocbuildbiocbuild### ========================================================================= ### ViewsList objects ### ------------------------------------------------------------------------- setClass("ViewsList", contains="IntegerRangesList", representation("VIRTUAL"), prototype(elementType="Views") ) setClass("SimpleViewsList", contains=c("ViewsList", "SimpleList"), representation("VIRTUAL") ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Accessor methods. ### setMethod("ranges", "SimpleViewsList", function(x, use.names=TRUE, use.mcols=FALSE) S4Vectors:::new_SimpleList_from_list("SimpleIRangesList", lapply(x, ranges, use.names=use.names, use.mcols=use.mcols)) ) setMethod("start", "SimpleViewsList", function(x, ...) start(ranges(x))) setMethod("end", "SimpleViewsList", function(x, ...) end(ranges(x))) setMethod("width", "SimpleViewsList", function(x) width(ranges(x))) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Coercion ### setMethod("as.matrix", "ViewsList", function(x, rev = FALSE, use.names = FALSE) { if (!isTRUEorFALSE(use.names)) stop("use.names must be TRUE or FALSE") if (!is(rev, "List")) rev <- as(rev, "List") rev <- S4Vectors:::VH_recycle(rev, x, "rev", "x") max_width <- max(max(width(restrict(ranges(x), start = 1L)))) m <- do.call(rbind, mapply(as.matrix, x, rev, IntegerList(max_width), SIMPLIFY = FALSE)) nms <- names(x) if (!is.null(nms) && use.names) { nms <- rep(nms, elementNROWS(x)) rownms <- rownames(m) if (is.null(rownms)) rownms <- unlist_as_integer(IRanges(1L, width=elementNROWS(x))) rownames(m) <- paste(nms, rownms, sep = ".") } m }) IRanges/R/cbind-Rle-methods.R0000644000175100017510000000322514626176651016726 0ustar00biocbuildbiocbuild### ========================================================================= ### Binding Rle or RleList objects together ### ------------------------------------------------------------------------- ### Return a DataFrame object with 1 row per run. Its first column is ### "runLength" and is followed by 1 column per supplied Rle object. setMethod("cbind", "Rle", function(...) { args <- list(...) args_names <- names(args) if (is.null(args_names)) { noname_idx <- seq_along(args) } else { noname_idx <- which(args_names %in% c("", NA_character_)) } if (length(noname_idx) != 0L) names(args)[noname_idx] <- paste0("V", noname_idx) ## TODO: Call disjoin() with 'with.revmap=TRUE' and use the revmap ## to avoid the call to findOverlaps() below. ans_runs <- disjoin(do.call(c, unname(lapply(args, ranges)))) DataFrame( runLength=width(ans_runs), DataFrame( lapply(args, function(x) { run_idx <- findOverlaps(ans_runs, ranges(x), type="within", select="arbitrary") runValue(x)[run_idx] }) ) ) } ) ### The supplied RleList objects are recycled the "mapply way" if necessary. ### Return a CompressedSplitDataFrameList object parallel to the longest ### supplied RleList object. setMethod("cbind", "RleList", function(...) { args <- list(...) DF_list <- do.call(mapply, c(list(cbind), args, list(SIMPLIFY=FALSE))) as(DF_list, "CompressedSplitDataFrameList") } ) IRanges/R/coverage-methods.R0000644000175100017510000002556314626176651016733 0ustar00biocbuildbiocbuild### ========================================================================= ### coverage() ### ------------------------------------------------------------------------- ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### .coverage_IRanges() and coverage_CompressedIRangesList() ### ### These 2 internal helpers are the workhorses behind most "coverage" ### methods. All the hard work is almost entirely performed at the C level. ### Only some argument checking/normalization plus the "folding" of the ### result are performed in R. ### .fold_and_truncate_coverage <- function(cvg, circle.length, width) { cvg <- fold(cvg, circle.length) if (is.na(width)) return(cvg) head(cvg, n=width) } ### Returns an Rle object. .coverage_IRanges <- function(x, shift=0L, width=NULL, weight=1L, circle.length=NA, method=c("auto", "sort", "hash", "naive")) { ## Check 'x'. if (!is(x, "IRanges")) stop("'x' must be an IRanges object") ## 'shift' will be checked at the C level. if (is(shift, "Rle")) shift <- S4Vectors:::decodeRle(shift) ## Check 'width'. if (is.null(width)) { width <- NA_integer_ } else if (!isSingleNumberOrNA(width)) { stop("'width' must be NULL or a single integer") } else if (!is.integer(width)) { width <- as.integer(width) } ## 'weight' will be checked at the C level. if (is(weight, "Rle")) weight <- S4Vectors:::decodeRle(weight) ## Check 'circle.length'. if (!isSingleNumberOrNA(circle.length)) stop("'circle.length' must be a single integer") if (!is.integer(circle.length)) circle.length <- as.integer(circle.length) ## Check 'method'. method <- match.arg(method) ## Ready to go... ans <- .Call2("C_coverage_IRanges", x, shift, width, weight, circle.length, method, PACKAGE="IRanges") if (is.na(circle.length)) return(ans) .fold_and_truncate_coverage(ans, circle.length, width) } ### Return an ordinary list. .normarg_shift_or_weight_list <- function(arg, argname) { if (!is.list(arg)) { if (!(is.numeric(arg) || (is(arg, "Rle") && is.numeric(runValue(arg))) || is(arg, "List"))) stop("'", argname, "' must be a numeric vector ", "or a list-like object") arg <- as.list(arg) } if (length(arg) != 0L) { idx <- which(sapply(arg, is, "Rle")) if (length(idx) != 0L) arg[idx] <- lapply(arg[idx], S4Vectors:::decodeRle) } arg } .check_arg_names <- function(arg, argname, x_names, x_names.label) { arg_names <- names(arg) if (!(is.null(arg_names) || identical(arg_names, x_names))) stop("when '", argname, "' has names, ", "they must be identical to ", x_names.label) } ## Some packages like easyRNASeq or TEQC pass 'width' as a named list-like ## object where each list element is a single number, an NA, or a NULL, when ## calling coverage() on an IntegerRangesList object. They do so because, for ## whatever reason, we've been supporting this for a while. However, it never ## really made sense to support a named list-like object for 'width'. ## TODO: Deprecate support for this. Preferred 'width' form: NULL or an ## integer vector. An that's it. .unlist_width <- function(width, x_names, x_names.label) { if (!identical(names(width), x_names)) stop("when 'width' is a list-like object, it must be named ", "and its names must be identical to ", x_names.label) width_eltNROWS <- elementNROWS(width) if (!all(width_eltNROWS <= 1L)) stop("when 'width' is a list-like object, each list element ", "should contain at most 1 element or be NULL") width[width_eltNROWS == 0L] <- NA_integer_ setNames(unlist(width, use.names=FALSE), x_names) } ### NOT exported but used in the GenomicRanges package. ### Return a SimpleRleList object of the length of 'x'. coverage_CompressedIRangesList <- function(x, shift=0L, width=NULL, weight=1L, circle.length=NA, method=c("auto", "sort", "hash", "naive"), x_names.label="'x' names") { ## Check 'x'. if (!is(x, "CompressedIRangesList")) stop("'x' must be a CompressedIRangesList object") x_names <- names(x) ## Check and normalize 'shift'. shift <- .normarg_shift_or_weight_list(shift, "shift") .check_arg_names(shift, "shift", x_names, x_names.label) ## Check and normalize 'width'. if (is.null(width)) { width <- NA_integer_ } else { if (is.numeric(width)) { .check_arg_names(width, "width", x_names, x_names.label) } else if (is.list(width) || is(width, "List")) { width <- .unlist_width(width, x_names, x_names.label) } else { ## We purposedly omit to mention that 'width' can also be a named ## list-like object because this will be deprecated soon (this is ## why it's not documented in man/coverage-methods.Rd either). stop("'width' must be NULL or an integer vector") } if (!is.integer(width)) width <- setNames(as.integer(width), names(width)) } ## Check and normalize 'weight'. weight <- .normarg_shift_or_weight_list(weight, "weight") .check_arg_names(weight, "weight", x_names, x_names.label) ## Check and normalize 'circle.length'. if (identical(circle.length, NA)) { circle.length <- NA_integer_ } else if (!is.numeric(circle.length)) { stop("'circle.length' must be an integer vector") } else if (!is.integer(circle.length)) { circle.length <- setNames(as.integer(circle.length), names(circle.length)) } .check_arg_names(circle.length, "circle.length", x_names, x_names.label) ## Check and normalize 'method'. method <- match.arg(method) ## Ready to go... ans_listData <- .Call2("C_coverage_CompressedIRangesList", x, shift, width, weight, circle.length, method, PACKAGE="IRanges") ## "Fold" the coverage vectors in 'ans_listData' associated with a ## circular sequence. ## Note that the C code should have raised an error or warning already if ## the length of 'circle.length' or 'width' didn't allow proprer recycling ## to the length of 'x'. So using silent 'rep( , length.out=length(x))' is ## safe. circle.length <- rep(circle.length, length.out=length(x)) fold_idx <- which(!is.na(circle.length)) if (length(fold_idx) != 0L) { width <- rep(width, length.out=length(x)) ## Because we "fold" the coverage vectors in an lapply() loop, it will ## be inefficient if 'x' has a lot of list elements associated with a ## circular sequence. ans_listData[fold_idx] <- lapply(fold_idx, function(i) .fold_and_truncate_coverage(ans_listData[[i]], circle.length[i], width[i])) } names(ans_listData) <- names(x) S4Vectors:::new_SimpleList_from_list("SimpleRleList", ans_listData, metadata=metadata(x), mcols=mcols(x, use.names=FALSE)) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### coverage() generic and methods. ### setGeneric("coverage", signature="x", function(x, shift=0L, width=NULL, weight=1L, ...) standardGeneric("coverage") ) ### NOT exported but used in the GenomicRanges package. replace_with_mcol_if_single_string <- function(arg, x) { if (!isSingleString(arg)) return(arg) x_mcols <- mcols(x, use.names=FALSE) j <- which(colnames(x_mcols) == arg) if (length(j) == 0L) stop(wmsg("'mcols(x)' has no \"", arg, "\" column")) if (length(j) > 1L) stop(wmsg("'mcols(x)' has more than one \"", arg, "\" column")) x_mcols[[j]] } setMethod("coverage", "IntegerRanges", function(x, shift=0L, width=NULL, weight=1L, method=c("auto", "sort", "hash", "naive")) { shift <- replace_with_mcol_if_single_string(shift, x) weight <- replace_with_mcol_if_single_string(weight, x) .coverage_IRanges(as(x, "IRanges"), shift=shift, width=width, weight=weight, method=method) } ) ### Overwrite above method with optimized method for StitchedIPos objects. setMethod("coverage", "StitchedIPos", function(x, shift=0L, width=NULL, weight=1L, method=c("auto", "sort", "hash", "naive")) { CAN_ONLY_ETC <- c(" can only be a single number when ", "calling coverage() on a StitchedIPos object") if (!isSingleNumber(shift)) stop(wmsg("'shift'", CAN_ONLY_ETC)) if (!isSingleNumber(weight)) stop(wmsg("'weight'", CAN_ONLY_ETC)) x <- x@pos_runs callGeneric() } ) setMethod("coverage", "Views", function(x, shift=0L, width=NULL, weight=1L, method=c("auto", "sort", "hash", "naive")) { if (is.null(width)) width <- length(subject(x)) coverage(as(x, "IRanges"), shift=shift, width=width, weight=weight, method=method) } ) setMethod("coverage", "IntegerRangesList", function(x, shift=0L, width=NULL, weight=1L, method=c("auto", "sort", "hash", "naive")) { x_mcols <- mcols(x, use.names=FALSE) x_mcolnames <- colnames(x_mcols) if (isSingleString(shift)) { if (!(shift %in% x_mcolnames)) stop("the string supplied for 'shift' (\"", shift, "\")", "is not a valid metadata column name of 'x'") shift <- x_mcols[[shift]] } if (isSingleString(width)) { if (!(width %in% x_mcolnames)) stop("the string supplied for 'width' (\"", width, "\")", "is not a valid metadata column name of 'x'") width <- x_mcols[[width]] } if (isSingleString(weight)) { if (!(weight %in% x_mcolnames)) stop("the string supplied for 'weight' (\"", weight, "\")", "is not a valid metadata column name of 'x'") weight <- x_mcols[[weight]] } coverage_CompressedIRangesList(as(x, "CompressedIRangesList"), shift=shift, width=width, weight=weight, method=method) } ) IRanges/R/cvg-methods.R0000644000175100017510000001672214626176651015714 0ustar00biocbuildbiocbuild### ========================================================================= ### cvg() ### ------------------------------------------------------------------------- ### ### A better coverage(). ### ### It all started when I came across this: ### https://stackoverflow.com/questions/17138760/counting-overlaps-of-integer-ranges ### setGeneric("cvg", signature="x", function(x, from=NA, to=NA, weight=1L, varname="cvg", collapse=FALSE, ...) standardGeneric("cvg") ) ### TODO: Methods for IntegerRanges and IntegerRangesList objects (defined in ### this file) need to support the 'circle.length' argument. ### TODO: Implement method for GenomicRanges objects (in GenomicRanges package). ### Should it support the 'ignore.strand' argument? ### TODO: The end user should be able to switch between collapsed and expanded ### representation by calling collapse() and expand() on the returned object. ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Restriction window helpers ### ### Use these helpers in the "restrict" and "gaps" methods. ### ### Should we turn these helpers into methods of a generic function? ### Like effectiveRestrictionWindow()? or effectiveFromTo()? ### Also maybe export and document them so the user can actually use them ### to see what effective restriction windows are being used and also see ### the invalid windows causing an error (by calling ### effectiveRestrictionWindow() with check=FALSE). ### ### Return an integer vector of length 2. effective_restriction_window_for_IntegerRanges <- function(x, from=NA, to=NA, check=TRUE) { stopifnot(is(x, "IntegerRanges"), isSingleNumberOrNA(from), isSingleNumberOrNA(to)) if (!is.integer(from)) from <- as.integer(from) if (!is.integer(to)) to <- as.integer(to) if (is.na(from) || is.na(to)) { if (length(x) == 0L) return(c(from=from, to=to)) x_range <- range(x) if (is.na(from)) from <- start(x_range) if (is.na(to)) to <- end(x_range) } if (check) { width <- to - from + 1L if (width < 0L) stop(wmsg("invalid from-to: ", from, "-", to)) } c(from=from, to=to) } ### Return an N x 2 integer matrix where N is length(x). effective_restriction_windows_for_IntegerRangesList <- function(x, from=NA, to=NA, check=TRUE) { stopifnot(is(x, "IntegerRangesList"), is.numeric(from) || is.logical(from) && all(is.na(from)), is.numeric(to) || is.logical(to) && all(is.na(to))) if (!is.integer(from)) from <- as.integer(from) if (!is.integer(to)) to <- as.integer(to) from <- S4Vectors:::V_recycle(from, x, "from", "x") to <- S4Vectors:::V_recycle(to, x, "to", "x") x_range <- range(x) na_idx <- which(is.na(from)) from[na_idx] <- as.integer(start(x_range))[na_idx] na_idx <- which(is.na(to)) to[na_idx] <- as.integer(end(x_range))[na_idx] if (check) { width <- to - from + 1L if (any(width < 0L, na.rm=TRUE)) stop(wmsg("some of the restriction windows defined by the ", "supplied 'from' and 'to' have a negative width")) } cbind(from=from, to=to) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### "cvg" methods ### ### Takes an IntegerRanges derivative 'x' and returns its coverage as the ### metadata column of an IPos object by default. If 'collapse' is TRUE', ### the coverage is returned in an IRanges object instead. ### The 'from' and 'to' arguments control the window of integer positions ### for which to compute and return coverage. ### Each of these arguments must be a single integer or NA. When set to NA ### (the default), 'from' is replaced internally with 'min(start(x))', ### and 'to' with 'max(end(x))'. ### All the integer positions in the from-to window are represented in the ### returned object. More precisely, the returned IPos or IRanges 'ans' is ### disjoint, strictly sorted, and with no gaps between the ranges, and its ### ranges span the from-to window (i.e. 'reduce(ans)' will return the single ### range from-to). In particular, when 'ans' is an IPos object, 'pos(ans)' ### returns the from:to sequence. cvg_IntegerRanges <- function(x, from=NA, to=NA, weight=1L, varname="cvg", collapse=FALSE) { stopifnot(isSingleString(varname), isTRUEorFALSE(collapse)) from_to <- effective_restriction_window_for_IntegerRanges(x, from, to) shift <- 1L - from_to[[1L]] width <- from_to[[2L]] + shift if (length(x) == 0L) { if (is.na(width) || width == 0L) { ans <- IRanges() } else { ans <- IRanges(from_to[[1L]], width=width) } ## 'weight' determines the type of Rle. cvg0 <- Rle(weight * 0L, sum(width(ans))) } else { ## Compute coverage as an Rle object. cvg0 <- coverage(x, shift=shift, width=width, weight=weight) ans_width <- runLength(cvg0) ans_end <- cumsum(ans_width) - shift ans <- IRanges(end=ans_end, width=ans_width) } if (collapse) { var <- runValue(cvg0) } else { ans <- new_StitchedIPos(ans) var <- cvg0 } mcols(ans) <- S4Vectors:::new_DataFrame(setNames(list(var), varname)) ans } ### Takes an IntegerRangesList derivative 'x' and returns its coverage as the ### inner metadata column of an IPosList object by default. If 'collapse' is ### TRUE, the coverage is returned in an IRangesList object instead. ### The 'from' and 'to' arguments control the windows of integer positions ### for which to compute and return coverage. ### Each of these arguments must be an integer vector parallel to 'x', ### possibly with NAs. If shorter than 'x', they'll be recycled to the length ### of 'x'. ### The object to return is computed with a fast implementation of ### ### mapply(cvg_IntegerRanges, x, from, to, weight, ### MoreArgs=list(varname=varname, collapse=collapse)) ### ### and then returned as an IPosList or IRangesList, obeying 'collapse'. cvg_IntegerRangesList <- function(x, from=NA, to=NA, weight=1L, varname="cvg", collapse=FALSE) { stopifnot(isSingleString(varname), isTRUEorFALSE(collapse)) from_to <- effective_restriction_windows_for_IntegerRangesList(x, from, to) shift <- 1L - unname(from_to[ , 1L]) width <- unname(from_to[ , 2L]) + shift ## Compute coverage as a SimpleRleList object. cvg0 <- coverage(x, shift=shift, width=width, weight=weight) ans_width <- as(runLength(cvg0), "CompressedIntegerList") ans_end <- as(cumsum(ans_width), class(ans_width)) - shift unlisted_ans <- IRanges(end=unlist(ans_end, use.names=FALSE), width=unlist(ans_width, use.names=FALSE)) if (collapse) { var <- unlist(runValue(cvg0), use.names=FALSE) } else { unlisted_ans <- new_StitchedIPos(unlisted_ans) if (length(cvg0) == 0L) { ## 'weight' determines the type of Rle. var <- Rle(weight * 0L, 0L) } else { var <- unlist(cvg0, use.names=FALSE) } } mcols(unlisted_ans) <- S4Vectors:::new_DataFrame( setNames(list(var), varname)) if (collapse) { ans <- relist(unlisted_ans, ans_width) } else { ans <- relist(unlisted_ans, cvg0) } ans } setMethod("cvg", "IntegerRanges", cvg_IntegerRanges) setMethod("cvg", "IntegerRangesList", cvg_IntegerRangesList) IRanges/R/extractList.R0000644000175100017510000002756014626176651016004 0ustar00biocbuildbiocbuild### ========================================================================= ### Group elements of a vector-like object into a list-like object ### ------------------------------------------------------------------------- ### ### What should go in this file? ### ### - All "relist" methods defined in IRanges should go here. ### - extractList() generic and default method. ### ### TODO: Maybe put the default methods for the reverse transformations here ### (unlist, unsplit, and unsplit<-). ### ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### relist() ### setMethod("relist", c("ANY", "PartitioningByEnd"), function(flesh, skeleton) { ans_class <- relistToClass(flesh) skeleton_len <- length(skeleton) if (skeleton_len == 0L) { flesh_len2 <- 0L } else { flesh_len2 <- end(skeleton)[skeleton_len] } if (NROW(flesh) != flesh_len2) stop("shape of 'skeleton' is not compatible with 'NROW(flesh)'") if (extends(ans_class, "CompressedList")) return(newCompressedList0(ans_class, flesh, skeleton)) if (!extends(ans_class, "SimpleList")) stop("don't know how to split or relist a ", class(flesh), " object as a ", ans_class, " object") listData <- lapply(skeleton, function(i) extractROWS(flesh, i)) ## TODO: Once "window" methods have been revisited/tested and ## 'window(flesh, start=start, end=end)' is guaranteed to do the ## right thing for any 'flesh' object (in particular it subsets a ## data.frame-like object along the rows), then replace the line above ## by the code below (which should be more efficient): #skeleton_start <- start(skeleton) #skeleton_end <- end(skeleton) #FUN <- function(start, end) window(flesh, start=start, end=end) #names(skeleton_start) <- names(skeleton) #listData <- mapply(FUN, skeleton_start, skeleton_end) ## or, if we don't trust mapply(): #skeleton_start <- start(skeleton) #skeleton_end <- end(skeleton) #X <- seq_len(skeleton_len) #names(X) <- names(skeleton) #listData <- lapply(X, function(i) window(flesh, # start=skeleton_start[i], # end=skeleton_end[i])) S4Vectors:::new_SimpleList_from_list(ans_class, listData) } ) setMethod("relist", c("ANY", "List"), function(flesh, skeleton) { relist(flesh, PartitioningByEnd(skeleton)) } ) setMethod("relist", c("Vector", "list"), function(flesh, skeleton) { relist(flesh, PartitioningByEnd(skeleton)) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### default_splitAsList() ### ### Equivalent to 'setNames(splitAsList(unname(x), seq_along(x)), names(x))' ### but slightly faster (e.g. twice faster on 'IRanges(1, 1:500000)'). .dumb_splitAsList <- function(x) { f <- PartitioningByEnd(seq_along(x), names=names(x)) relist(unname(x), f) } ### 'f' is assumed to be an integer vector with no NAs. .splitAsList_by_integer <- function(x, f, drop) { if (length(f) > NROW(x)) stop("'f' cannot be longer than 'NROW(x)' when it's an integer vector") if (!identical(drop, FALSE)) warning("'drop' is ignored when 'f' is an integer vector") f_is_not_sorted <- S4Vectors:::isNotSorted(f) if (f_is_not_sorted) { idx <- base::order(f) f <- f[idx] x <- extractROWS(x, idx) } tmp <- Rle(f) f <- cumsum(runLength(tmp)) names(f) <- as.character(runValue(tmp)) f <- PartitioningByEnd(f) relist(x, f) } ### 'f' is assumed to be a factor with no NAs. .splitAsList_by_factor <- function(x, f, drop) { x_NROW <- NROW(x) f_len <- length(f) f_levels <- levels(f) f <- as.integer(f) if (f_len > x_NROW) f <- head(f, n=x_NROW) f_is_not_sorted <- S4Vectors:::isNotSorted(f) if (f_is_not_sorted) { idx <- base::order(f) x <- extractROWS(x, idx) } f <- tabulate(f, nbins=length(f_levels)) names(f) <- f_levels if (drop) f <- f[f != 0L] f <- cumsum(f) f <- PartitioningByEnd(f) relist(x, f) } ### 'f' is assumed to be an integer-Rle object with no NAs. .splitAsList_by_integer_Rle <- function(x, f, drop) { if (length(f) > NROW(x)) stop("'f' cannot be longer than data when it's an integer-Rle") if (!identical(drop, FALSE)) warning("'drop' is ignored when 'f' is an integer-Rle") f_vals <- runValue(f) f_lens <- runLength(f) f_is_not_sorted <- S4Vectors:::isNotSorted(f_vals) if (f_is_not_sorted) { idx <- base::order(f_vals) xranges <- successiveIRanges(f_lens)[idx] f_vals <- f_vals[idx] f_lens <- f_lens[idx] x <- extractROWS(x, xranges) } tmp <- Rle(f_vals, f_lens) f <- cumsum(runLength(tmp)) names(f) <- as.character(runValue(tmp)) f <- PartitioningByEnd(f) relist(x, f) } ### 'f' is assumed to be an Rle object with no NAs. .splitAsList_by_Rle <- function(x, f, drop) { x_NROW <- NROW(x) f_len <- length(f) f_vals <- runValue(f) if (!is.factor(f_vals)) { f_vals <- as.factor(f_vals) if (f_len > x_NROW) { runValue(f) <- f_vals f <- head(f, n=x_NROW) f_vals <- runValue(f) } } else if (f_len > x_NROW) { f <- head(f, n=x_NROW) f_vals <- runValue(f) } f_lens <- runLength(f) f_levels <- levels(f_vals) f_vals <- as.integer(f_vals) f_is_not_sorted <- S4Vectors:::isNotSorted(f_vals) if (f_is_not_sorted) { idx <- base::order(f_vals) xranges <- successiveIRanges(f_lens)[idx] x <- extractROWS(x, xranges) f <- S4Vectors:::tabulate2(f_vals, nbins=length(f_levels), weight=f_lens) if (drop) { f_levels <- f_levels[f != 0L] f <- f[f != 0L] } } else if (length(f_vals) == length(f_levels) || drop) { if (drop) f_levels <- as.character(runValue(f)) f <- f_lens } else { f <- integer(length(f_levels)) f[f_vals] <- f_lens } names(f) <- f_levels f <- cumsum(f) f <- PartitioningByEnd(f) relist(x, f) } .to_factor <- function(x) { if (is(x, "Rle")) { runValue(x) <- as.factor(runValue(x)) x } else as.factor(x) } ### Took this out of the still-in-incubation LazyList package .interaction2 <- function(factors) { nI <- length(factors) nx <- length(factors[[1L]]) factors <- lapply(factors, .to_factor) useRle <- any(vapply(factors, is, logical(1), "Rle")) if (useRle) { group <- as(factors[[1L]], "Rle") runValue(group) <- as.integer(runValue(group)) } else { group <- as.integer(factors[[1L]]) } ngroup <- nlevels(factors[[1L]]) for (i in tail(seq_len(nI), -1L)) { index <- factors[[i]] if (useRle) { offset <- as(index, "Rle") runValue(offset) <- ngroup * (as.integer(runValue(offset)) - 1L) } else { offset <- ngroup * (as.integer(index) - 1L) } group <- group + offset ngroup <- ngroup * nlevels(index) } if (useRle) { runValue(group) <- structure(runValue(group), levels=as.character(seq_len(ngroup)), class="factor") group } else { structure(group, levels=as.character(seq_len(ngroup)), class="factor") } } .normarg_f <- function(f, x) { if (is(f, "formula")) { if (length(f) == 3L) stop("formula 'f' should not have a left hand side") f <- S4Vectors:::formulaValues(x, f) } if (is.list(f) || is(f, "List")) { if (length(f) == 1L) { f <- .to_factor(f[[1L]]) } else { f <- .interaction2(f) } } f_len <- length(f) if (f_len < NROW(x)) { if (f_len == 0L) stop("split factor has length 0 but 'NROW(x)' is > 0") if (NROW(x) %% f_len != 0L) warning("'NROW(x)' is not a multiple of split factor length") f <- rep(f, length.out=NROW(x)) } f } ## about 3X faster than as.factor on a ~450k tx ids ## caveats: no NAs, and radix sort of levels does not support all encodings ## todo: Would be faster if sort() returned grouping info, ## but then we might coalesce this with the order/split. ## todo: if we could pass na.rm=TRUE to grouping(), NAs would be handled .as_factor <- function(x) { if (is.factor(x)) return(x) if (is.null(x)) return(factor()) g <- grouping(x) p <- PartitioningByEnd(relist(g)) levs <- as.character(x[g[end(p)]]) if (is.character(x)) { o <- order(levs, method="radix") map <- integer(length(levs)) # or rep(NA_integer_, length(levs)) for NAs map[o] <- seq_along(o) ref <- map[togroup(p)] levs <- levs[o] } else { ref <- togroup(p) } f <- integer(length(x)) f[g] <- ref structure(f, levels=levs, class="factor") } ### Called by the splitAsList,ANY,ANY method defined in the S4Vectors package. default_splitAsList <- function(x, f, drop=FALSE) { if (!isTRUEorFALSE(drop)) stop("'drop' must be TRUE or FALSE") if (missing(f)) return(.dumb_splitAsList(x)) f <- .normarg_f(f, x) if (anyNA(f)) { keep_idx <- which(!is.na(f)) x <- extractROWS(x, keep_idx) f <- f[keep_idx] } if (is.integer(f)) return(.splitAsList_by_integer(x, f, drop)) if (!is(f, "Rle")) { f <- .as_factor(f) return(.splitAsList_by_factor(x, f, drop)) } ## From now on, 'f' is guaranteed to be an Rle. f_vals <- runValue(f) if (!((is.vector(f_vals) && is.atomic(f_vals)) || is.factor(f_vals))) stop("'f' must be an atomic vector or a factor (possibly in Rle form)") if (is.integer(f_vals)) return(.splitAsList_by_integer_Rle(x, f, drop)) return(.splitAsList_by_Rle(x, f, drop)) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### extractList() ### ### Would extractGroups be a better name for this? ### Or extractGroupedROWS? (analog to extractROWS, except that the ROWS are ### grouped). ### ### 'x' must be a vector-like object and 'i' a list-like object. ### Must return a list-like object parallel to 'i' and with same "shape" as ### 'i' (i.e. same elementNROWS). If 'i' has names, they should be ### propagated to the returned value. The list elements of the returned value ### must have the class of 'x'. setGeneric("extractList", function(x, i) standardGeneric("extractList")) ### Default method. setMethod("extractList", c("ANY", "ANY"), function(x, i) { if (is(i, "IntegerRanges")) return(relist(extractROWS(x, i), i)) if (is.list(i)) { unlisted_i <- unlist(i, recursive=FALSE, use.names=FALSE) } else { i <- as(i, "List", strict=FALSE) ## The various "unlist" methods for List derivatives don't know ## how to operate recursively and don't support the 'recursive' ## arg. unlisted_i <- unlist(i, use.names=FALSE) } relist(extractROWS(x, unlisted_i), i) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### resplit() and regroup() ### ### Similar to regroupBySupergroup() but there is no assumption that ### the new grouping is a super-grouping of the current grouping. For ### resplit(), the grouping is expressed as a factor, although it is ### effectively a synonym of regroup(), since the latter coerces the ### input to a Grouping. ### resplit <- function(x, f) { regroup(x, f) } regroup <- function(x, g) { g <- as(g, "Grouping") gends <- end(PartitioningByEnd(g)) xg <- x[unlist(g, use.names=FALSE)] p <- PartitioningByEnd(end(PartitioningByEnd(xg))[gends]) names(p) <- names(g) relist(unlist(xg, use.names=FALSE, recursive=FALSE), p) } IRanges/R/extractListFragments.R0000644000175100017510000002227614626176651017652 0ustar00biocbuildbiocbuild### ========================================================================= ### extractListFragments() ### ------------------------------------------------------------------------- ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### regroupBySupergroup() ### ### A very efficient way to concatenate groups of successive list elements ### in 'x'. ### 'x' must be a list-like object (typically a CompressedList object). ### 'supergroups' must be an object that defines a partitioning of ### 'seq_along(x)' (i.e. it could be used to do ### 'relist(seq_along(x), supergroups)'). It will be immediately replaced with ### 'PartitioningByEnd(supergroups)' so it should be an object that is ### accepted by the PartitioningByEnd() constructor (note that this constructor ### is a no-op if 'supergroups' is already a PartitioningByEnd object). ### Return a list-like object of the same elementType() as 'x' and parallel ### to 'supergroups'. The names on 'supergroups' are propagated but not the ### metadata columns. ### ### Some properties: ### - Behaves as an endomorphism on a CompressedList or PartitioningByEnd ### object. ### - This ### regroupBySupergroup(x, length(x))[[1L]] ### is equivalent to ### unlist(x, use.names=FALSE) ### ### Other possible names for regroupBySupergroup: regroup, ### mergeGroupsInSupergroups, combineGroupsOfListElements, ### unlistGroupsOfListElements, unlistBySupergroup. ### ### TODO: Maybe export and document this? regroupBySupergroup <- function(x, supergroups) { supergroups <- PartitioningByEnd(supergroups) x_breakpoints <- end(PartitioningByEnd(x)) ans_breakpoints <- x_breakpoints[end(supergroups)] nleading0s <- length(supergroups) - length(ans_breakpoints) if (nleading0s != 0L) ans_breakpoints <- c(rep.int(0L, nleading0s), ans_breakpoints) ans_partitioning <- PartitioningByEnd(ans_breakpoints, names=names(supergroups)) if (is(x, "PartitioningByEnd")) return(ans_partitioning) relist(unlist(x, use.names=FALSE), ans_partitioning) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### extractListFragments() ### INCOMPATIBLE_ARANGES_MSG <- c( "'aranges' is incompatible with the cumulated ", "length of all the list elements in 'x'" ) ### If 'hits.only' is FALSE (the default), return a Partitioning object of ### the same class as 'x' (endomorphism). Otherwise, return a list of 2 ### integer vectors of the same length. .extractPartitioningFragments_by_Partitioning <- function(x, aranges, hits.only=FALSE, msg.if.incompatible=INCOMPATIBLE_ARANGES_MSG) { if (!is(x, "Partitioning")) stop(wmsg("'x' must be a Partitioning object")) if (!is(aranges, "Partitioning")) stop(wmsg("'aranges' must be a Partitioning object")) if (!isTRUEorFALSE(hits.only)) stop(wmsg("'hits.only' must be TRUE or FALSE")) if (!is.character(msg.if.incompatible)) stop(wmsg("'msg.if.incompatible' must be a character vector")) x_end <- end(x) aranges_end <- end(aranges) if (S4Vectors:::last_or(x_end, 0L) != S4Vectors:::last_or(aranges_end, 0L)) stop(wmsg(msg.if.incompatible)) C_ans <- .Call2("C_find_partition_overlaps", x_end, aranges_end, !hits.only, PACKAGE="IRanges") if (hits.only) return(C_ans) revmap <- C_ans[[1L]] revmap2 <- C_ans[[2L]] ans_names <- names(x)[revmap] ans <- new2("PartitioningByEnd", end=C_ans[[3L]], NAMES=ans_names, check=FALSE) ans <- as(ans, class(x)) mcols(ans) <- DataFrame(revmap=revmap, revmap2=revmap2) ans } .extractListFragments_by_Partitioning <- function(x, aranges, use.mcols=FALSE, msg.if.incompatible=INCOMPATIBLE_ARANGES_MSG) { if (is(x, "Partitioning")) { ans <- .extractPartitioningFragments_by_Partitioning(x, aranges, msg.if.incompatible=msg.if.incompatible) if (use.mcols) { revmap <- mcols(ans, use.names=FALSE)[ , "revmap"] mcols(ans) <- mcols(x, use.names=FALSE)[revmap, , drop=FALSE] } return(ans) } if (!is(x, "List")) { if (!is.list(x)) stop(wmsg("'x' must be a list-like object")) if (!use.mcols) stop(wmsg("'use.mcols' must be set to TRUE ", "when 'x' is an ordinary list")) } ## Will work out-of-the box on any List derivative 'x' that supports [ ## and windows() e.g. all the AtomicList derivatives, IRanges, GRanges, ## DNAStringSet, DNAStringSetList, GAlignments, GAlignmentsList objects ## and more... x_partitioning <- PartitioningByEnd(x) hits <- .extractPartitioningFragments_by_Partitioning( x_partitioning, aranges, hits.only=TRUE, msg.if.incompatible=msg.if.incompatible) revmap <- hits[[1L]] revmap2 <- hits[[2L]] ans <- x[revmap] if (!use.mcols) mcols(ans) <- DataFrame(revmap=revmap, revmap2=revmap2) Ltrim <- pmax(start(aranges)[revmap2] - start(x_partitioning)[revmap], 0L) Rtrim <- pmax(end(x_partitioning)[revmap] - end(aranges)[revmap2], 0L) windows(ans, start=1L+Ltrim, end=-1L-Rtrim) } ### Return a PartitioningByEnd object of length 2 * length(aranges) + 1. .make_PartitioningByEnd_from_aranges <- function(aranges, x, msg.if.incompatible) { if (!is(aranges, "IntegerRanges")) stop(wmsg("'aranges' must be an IntegerRanges derivative ", "(e.g. an IRanges object")) ## Check that 'aranges' is disjoint and sorted. ## This is the case if and only if 'start_end' is sorted. If 'aranges' ## is a NormalIRanges or Partitioning object, then it's disjoint and sorted ## so we can skip this check. start_end <- as.vector( rbind(start(aranges) - 1L, end(aranges), deparse.level=0) ) if (!is(aranges, "NormalIRanges") && !is(aranges, "Partitioning") && S4Vectors:::isNotSorted(start_end)) stop(wmsg("'aranges' must be disjoint and sorted")) ## Check that 'aranges' is compatible with 'x'. x_cumlen <- nobj(PartitioningByEnd(x)) start_end_len <- length(start_end) # = 2 * length(aranges) if (start_end_len >= 2L && (start_end[[1L]] < 0L || start_end[[start_end_len]] > x_cumlen)) stop(wmsg(msg.if.incompatible)) ans_end <- c(start_end, x_cumlen) new2("PartitioningByEnd", end=ans_end, check=FALSE) } ### Act as an endomorphism. ### 'x' must be a list-like object. ### 'aranges' must be an IntegerRanges object that is disjoint, sorted, ### and compatible with the cumulated length of all the list elements in 'x'. extractListFragments <- function(x, aranges, use.mcols=FALSE, msg.if.incompatible=INCOMPATIBLE_ARANGES_MSG) { if (!isTRUEorFALSE(use.mcols)) stop(wmsg("'use.mcols' must be TRUE or FALSE")) if (is(aranges, "Partitioning")) { ans <- .extractListFragments_by_Partitioning(x, aranges, use.mcols=use.mcols, msg.if.incompatible=msg.if.incompatible) return(ans) } aranges <- .make_PartitioningByEnd_from_aranges(aranges, x, INCOMPATIBLE_ARANGES_MSG) ans <- .extractListFragments_by_Partitioning(x, aranges, msg.if.incompatible=msg.if.incompatible) revmap2 <- mcols(ans, use.names=FALSE)[ , "revmap2"] ans <- ans[revmap2 %% 2L == 0L] if (use.mcols) { revmap <- mcols(ans, use.names=FALSE)[ , "revmap"] mcols(ans) <- mcols(x, use.names=FALSE)[revmap, , drop=FALSE] } else { mcols(ans)[ , "revmap2"] <- mcols(ans, use.names=FALSE)[ , "revmap2"] %/% 2L } ans } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### equisplit() ### ### A simple wrapper to extractListFragments() ### ### Will work out-of-the box on any object 'x' that supports ### extractListFragments() **and relist()** e.g. IRanges, GRanges, ### DNAStringSet, GAlignments objects and more... Won't work on AtomicList ### derivatives or DNAStringSetList or GAlignmentsList objects because they ### don't support relist(). equisplit <- function(x, nchunk, chunksize, use.mcols=FALSE) { if (!isTRUEorFALSE(use.mcols)) stop(wmsg("'use.mcols' must be TRUE or FALSE")) x_cumlen <- nobj(PartitioningByEnd(x)) aranges <- breakInChunks(x_cumlen, nchunk=nchunk, chunksize=chunksize) unlisted_ans <- extractListFragments(x, aranges) unlisted_ans_mcols <- mcols(unlisted_ans, use.names=FALSE) revmap <- unlisted_ans_mcols[ , "revmap"] revmap2 <- unlisted_ans_mcols[ , "revmap2"] if (use.mcols) { mcols(unlisted_ans) <- mcols(x, use.names=FALSE)[revmap, , drop=FALSE] } else { mcols(unlisted_ans) <- DataFrame(revmap=revmap) } ans_partitioning <- PartitioningByEnd(revmap2, NG=length(aranges)) relist(unlisted_ans, ans_partitioning) } IRanges/R/findOverlaps-methods.R0000644000175100017510000005067714626176651017600 0ustar00biocbuildbiocbuild### ========================================================================= ### findOverlaps (and related) methods ### ------------------------------------------------------------------------- ### ## internal generic setGeneric("process_self_hits", signature="x", # not exported function(x, select=c("all", "first", "last", "arbitrary"), drop.self=FALSE, drop.redundant=FALSE) standardGeneric("process_self_hits")) setMethod("process_self_hits", "SortedByQueryHits", function(x, select=c("all", "first", "last", "arbitrary"), drop.self=FALSE, drop.redundant=FALSE) { x <- as(x, "SortedByQuerySelfHits") select <- match.arg(select) if (!isTRUEorFALSE(drop.self)) stop("'drop.self' must be TRUE or FALSE") if (!isTRUEorFALSE(drop.redundant)) stop("'drop.redundant' must be TRUE or FALSE") if (drop.self) { self_idx <- which(isSelfHit(x)) if (length(self_idx) != 0L) x <- x[-self_idx] } if (drop.redundant) { redundant_idx <- which(isRedundantHit(x)) if (length(redundant_idx) != 0L) x <- x[-redundant_idx] } selectHits(x, select=select) } ) setMethod("process_self_hits", "SortedByQueryHitsList", function(x, select=c("all", "first", "last", "arbitrary"), drop.self=FALSE, drop.redundant=FALSE) { x <- as(x, "SortedByQuerySelfHitsList") select <- match.arg(select) ans <- lapply(x, process_self_hits, select, drop.self, drop.redundant) if (select != "all") return(IntegerList(ans)) S4Vectors:::new_SimpleList_from_list("HitsList", ans, subjectOffsets = x@subjectOffsets) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### findOverlaps() ### ### Find objects in the query that overlap those in the subject. ### setGeneric("findOverlaps", signature=c("query", "subject"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), select=c("all", "first", "last", "arbitrary"), ...) standardGeneric("findOverlaps") ) findOverlaps_IntegerRanges <- function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), select=c("all", "first", "last", "arbitrary")) { if (is.integer(query)) query <- IRanges(query, width=1L) type <- match.arg(type) select <- match.arg(select) findOverlaps_NCList(query, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) } setMethod("findOverlaps", c("IntegerRanges", "IntegerRanges"), findOverlaps_IntegerRanges ) setMethod("findOverlaps", c("integer", "IntegerRanges"), findOverlaps_IntegerRanges ) setMethod("findOverlaps", c("Vector", "missing"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), select=c("all", "first", "last", "arbitrary"), ..., drop.self=FALSE, drop.redundant=FALSE) { select <- match.arg(select) result <- findOverlaps(query, query, maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type), select="all", ...) process_self_hits(result, select, drop.self, drop.redundant) } ) # might consider making this the following: # setMethod("findOverlaps", c("IntegerRangesList", "IntegerRangesList"), # function(query, subject, maxgap = -1L, minoverlap = 0L, # type = c("any", "start", "end", "within", "equal"), # select = c("all", "first", "last", "arbitrary"), # drop = FALSE) # { # findOverlaps(query, NCLists(query), # maxgap = maxgap, minoverlap = minoverlap, # type = match.arg(type), select = match.arg(select), drop = drop) # } # ) setMethod("findOverlaps", c("IntegerRangesList", "IntegerRangesList"), function(query, subject, maxgap = -1L, minoverlap = 0L, type = c("any", "start", "end", "within", "equal"), select = c("all", "first", "last", "arbitrary"), drop = FALSE) { type <- match.arg(type) select <- match.arg(select) query <- as.list(query) subject <- as.list(subject) origSubject <- subject if (!is.null(names(subject)) && !is.null(names(query))) { subject <- subject[names(query)] names(subject) <- names(query) # get rid of NA's in names } else { subject <- subject[seq_along(query)] } ## NULL's are introduced where they do not match ## We replace those with empty IRanges subject[S4Vectors:::sapply_isNULL(subject)] <- list(IRanges()) ans <- lapply(seq_len(length(subject)), function(i) { findOverlaps(query[[i]], subject[[i]], maxgap = maxgap, minoverlap = minoverlap, type = type, select = select) }) names(ans) <- names(subject) if (select == "all") { ans <- HitsList(ans, origSubject) } else if (drop) { off <- head(c(0L, cumsum(sapply(origSubject, length))), -1) names(off) <- names(origSubject) if (is.null(names(ans))) off <- off[seq_along(ans)] else off <- off[names(ans)] ans <- unlist(ans, use.names=FALSE) + rep.int(unname(off), sapply(ans, length)) } else { ans <- IntegerList(ans) } ans }) setMethod("findOverlaps", c("Pairs", "missing"), function (query, subject, maxgap = -1L, minoverlap = 0L, type = c("any", "start", "end", "within", "equal"), select = c("all", "first", "last", "arbitrary"), ...) { findOverlaps(zipup(query), maxgap=maxgap, minoverlap=minoverlap, type=type, select=select, ...) }) setMethod("findOverlaps", c("Pairs", "ANY"), function (query, subject, maxgap = -1L, minoverlap = 0L, type = c("any", "start", "end", "within", "equal"), select = c("all", "first", "last", "arbitrary"), ...) { findOverlaps(zipup(query), subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select, ...) }) setMethod("findOverlaps", c("ANY", "Pairs"), function (query, subject, maxgap = -1L, minoverlap = 0L, type = c("any", "start", "end", "within", "equal"), select = c("all", "first", "last", "arbitrary"), ...) { findOverlaps(query, zipup(subject), maxgap=maxgap, minoverlap=minoverlap, type=type, select=select, ...) }) setMethod("findOverlaps", c("Pairs", "Pairs"), function (query, subject, maxgap = -1L, minoverlap = 0L, type = c("any", "start", "end", "within", "equal"), select = c("all", "first", "last", "arbitrary"), ...) { findOverlaps(zipup(query), zipup(subject), maxgap=maxgap, minoverlap=minoverlap, type=type, select=select, ...) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### countOverlaps() ### setGeneric("countOverlaps", signature=c("query", "subject"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), ...) standardGeneric("countOverlaps") ) .default_countOverlaps <- function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), ...) { if (is.integer(query)) query <- IRanges(query, width=1L) type <- match.arg(type) if (missing(subject)) { hits <- findOverlaps(query, maxgap=maxgap, minoverlap=minoverlap, type=type, ...) } else { hits <- findOverlaps(query, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, ...) } ans <- countQueryHits(hits) names(ans) <- names(query) ans } setMethod("countOverlaps", c("Vector", "Vector"), .default_countOverlaps) setMethod("countOverlaps", c("integer", "Vector"), .default_countOverlaps) setMethod("countOverlaps", c("Vector", "missing"), .default_countOverlaps) countOverlaps_IntegerRanges <- function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal")) { type <- match.arg(type) ans <- findOverlaps_NCList(query, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select="count") names(ans) <- names(query) ans } setMethod("countOverlaps", c("IntegerRanges", "IntegerRanges"), countOverlaps_IntegerRanges ) setMethod("countOverlaps", c("IntegerRangesList", "IntegerRangesList"), function(query, subject, maxgap=-1L, minoverlap=0L, type = c("any", "start", "end", "within", "equal")) { IntegerList(mapply(countOverlaps, query, subject, MoreArgs = list(maxgap = maxgap, minoverlap = minoverlap, type = match.arg(type)), SIMPLIFY = FALSE)) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### overlapsAny() ### ### Same args and signature as countOverlaps(). setGeneric("overlapsAny", signature=c("query", "subject"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), ...) standardGeneric("overlapsAny") ) ### NOT exported but used in the bsseq package. default_overlapsAny <- function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), ...) { if (is.integer(query)) query <- IRanges(query, width=1L) type <- match.arg(type) if (missing(subject)) { ahit <- findOverlaps(query, maxgap=maxgap, minoverlap=minoverlap, type=type, select="arbitrary", ...) } else { ahit <- findOverlaps(query, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select="arbitrary", ...) } !is.na(ahit) } setMethod("overlapsAny", c("Vector", "Vector"), default_overlapsAny) setMethod("overlapsAny", c("integer", "Vector"), default_overlapsAny) setMethod("overlapsAny", c("Vector", "missing"), default_overlapsAny) setMethod("overlapsAny", c("IntegerRangesList", "IntegerRangesList"), function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), ...) { query <- as.list(query) subject <- as.list(subject) type <- match.arg(type) if (!is.null(names(query)) && !is.null(names(subject))) { subject <- subject[names(query)] names(subject) <- names(query) # get rid of NA's in names } else { subject <- subject[seq_along(query)] } ## NULL's are introduced where they do not match ## We replace those with empty IRanges subject[S4Vectors:::sapply_isNULL(subject)] <- list(IRanges()) LogicalList(lapply(structure(seq_len(length(query)), names = names(query)), function(i) overlapsAny(query[[i]], subject[[i]], maxgap=maxgap, minoverlap=minoverlap, type=type, ...))) } ) ### Convenience wrappers for the 3 most common use cases. `%over%` <- function(query, subject) overlapsAny(query, subject) `%within%` <- function(query, subject) overlapsAny(query, subject, type="within") `%outside%` <- function(query, subject) !overlapsAny(query, subject) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### subsetByOverlaps() ### ### First 2 arguments are 'x' and 'ranges' like for the ### transcriptsByOverlaps(), exonsByOverlaps(), and cdsByOverlaps() functions ### from the GenomicFeatures package and the snpsByOverlaps() function from ### the BSgenome package. setGeneric("subsetByOverlaps", signature=c("x", "ranges"), function(x, ranges, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), invert=FALSE, ...) standardGeneric("subsetByOverlaps") ) ### NOT exported but used in the bsseq package. default_subsetByOverlaps <- function(x, ranges, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), invert=FALSE, ...) { ov_any <- overlapsAny(x, ranges, maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type), ...) if (invert) ov_any <- !ov_any x[ov_any] } setMethod("subsetByOverlaps", c("Vector", "Vector"), default_subsetByOverlaps) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### overlapsRanges() ### ### Extracts the actual regions of intersection between the overlapping ranges. ### setGeneric("overlapsRanges", signature=c("query", "subject"), function(query, subject, hits=NULL, ...) standardGeneric("overlapsRanges") ) setMethod("overlapsRanges", c("IntegerRanges", "IntegerRanges"), function(query, subject, hits=NULL, ...) { if (is.null(hits)) { hits <- findOverlaps(query, subject, ...) } else { if (!is(hits, "Hits")) stop("'hits' must be a Hits object") if (length(list(...)) != 0L) stop(wmsg("Extra arguments are only accepted when the 'hits' ", "argument is not supplied, in which case they are ", "passed to the internal call to findOverlaps(). ", "See ?overlapsRanges for more information.")) if (queryLength(hits) != length(query) || subjectLength(hits) != length(subject)) stop("'hits' is not compatible with 'query' and 'subject'") } ### Could be replaced by 1-liner: ### pintersect(query[queryHits(hits)], subject[subjectHits(hits)]) ### but will fail if 'query' or 'subject' is a kind of IntegerRanges ### object that cannot be subsetted (e.g. Partitioning object). m <- as.matrix(hits) qstart <- start(query)[m[,1L]] qend <- end(query)[m[,1L]] sstart <- start(subject)[m[,2L]] send <- end(subject)[m[,2L]] IRanges(pmax.int(qstart, sstart), pmin.int(send, qend)) } ) setMethod("overlapsRanges", c("IntegerRangesList", "IntegerRangesList"), function(query, subject, hits=NULL, ...) { if (is.null(hits)) { hits <- findOverlaps(query, subject, ...) } else { if (!is(hits, "HitsList")) stop("'hits' must be a HitsList object") if (length(list(...)) != 0L) stop(wmsg("Extra arguments are only accepted when the 'hits' ", "argument is not supplied, in which case they are ", "passed to the internal call to findOverlaps(). ", "See ?overlapsRanges for more information.")) if (length(hits) != length(query) || length(hits) != length(subject)) stop("'query', 'subject', and 'hits' must have the same length") } queries <- as.list(query, use.names = FALSE) subjects <- as.list(subject, use.names = FALSE) els <- as.list(hits, use.names = FALSE) ans <- lapply(seq_len(length(hits)), function(i) overlapsRanges(queries[[i]], subjects[[i]], els[[i]])) ans <- as(ans, "SimpleIRangesList") names(ans) <- names(hits) ans } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### poverlaps() ### setGeneric("poverlaps", signature=c("query", "subject"), function(query, subject, maxgap=0L, minoverlap=1L, type=c("any", "start", "end", "within", "equal"), ...) standardGeneric("poverlaps") ) setMethod("poverlaps", c("IntegerRanges", "IntegerRanges"), function(query, subject, maxgap=0L, minoverlap=1L, type=c("any", "start", "end", "within", "equal")) { stopifnot(isSingleNumber(maxgap)) stopifnot(isSingleNumber(minoverlap)) type <- match.arg(type) if (type == "any") { query <- query + maxgap } else if (type == "within") { if (maxgap > 0L) { warning("'maxgap' is ignored when type=='within'") } return(start(query) >= start(subject) & end(query) <= end(subject) & width(query) >= minoverlap) } amount <- pmin(end(query), end(subject)) - pmax(start(query), start(subject)) + 1L overlaps <- amount >= minoverlap samePos <- function(x, y) { x <= (y + maxgap) & x >= (y - maxgap) } keep <- switch(type, any = TRUE, start = samePos(start(query), start(subject)), end = samePos(end(query), end(subject)), equal = samePos(start(query), start(subject)) & samePos(end(query), end(subject))) overlaps & keep } ) setMethod("poverlaps", c("integer", "IntegerRanges"), function(query, subject, maxgap=0L, minoverlap=1L, type=c("any", "start", "end", "within", "equal")) { poverlaps(IRanges(query, width=1L), subject, maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type)) }) setMethod("poverlaps", c("IntegerRanges", "integer"), function(query, subject, maxgap=0L, minoverlap=1L, type=c("any", "start", "end", "within", "equal")) { poverlaps(query, IRanges(subject, width=1L), maxgap=maxgap, minoverlap=minoverlap, type=match.arg(type)) }) ### Convenience operators for poverlaps() `%pover%` <- function(query, subject) poverlaps(query, subject) `%pwithin%` <- function(query, subject) poverlaps(query, subject, type="within") `%poutside%` <- function(query, subject) !poverlaps(query, subject) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Merge two sets of ranges by overlap into a DataFrame ### mergeByOverlaps <- function(query, subject, ...) { hits <- findOverlaps(query, subject, ...) query_df <- as(extractROWS(query, queryHits(hits)), "DataFrame") colnames(query_df)[1L] <- deparse(substitute(query)) subject_df <- as(extractROWS(subject, subjectHits(hits)), "DataFrame") colnames(subject_df)[1L] <- deparse(substitute(subject)) cbind(query_df, subject_df) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Convenience for dereferencing overlap hits to a Pairs ### findOverlapPairs <- function(query, subject, ...) { if (missing(subject)) { hits <- findOverlaps(query, ...) subject <- query } else { hits <- findOverlaps(query, subject, ...) } Pairs(query, subject, hits=hits) } IRanges/R/inter-range-methods.R0000644000175100017510000004551114626176651017346 0ustar00biocbuildbiocbuild### ========================================================================= ### Inter-range methods ### ------------------------------------------------------------------------- ### ### TODO: We need a ranges() setter for Views objects that provides this ### functionality. Once we have it, use it instead of this. .set_Views_ranges <- function(x, new_ranges) { new_mcols <- mcols(new_ranges, use.names=FALSE) mcols(new_ranges) <- NULL BiocGenerics:::replaceSlots(x, ranges=new_ranges, elementMetadata=new_mcols, check=FALSE) } ### NOT exported but used in the GenomicRanges package global2local_revmap <- function(unlisted_revmap, y, x) { offsets <- rep.int(start(PartitioningByEnd(x)) - 1L, elementNROWS(y)) unlisted_revmap - offsets } ### NOT exported but used in the GenomicFeatures package local2global_revmap <- function(unlisted_revmap, y, x) { offsets <- rep.int(start(PartitioningByEnd(x)) - 1L, elementNROWS(y)) unlisted_revmap + offsets } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### range() ### ### Always return an IRanges (or NormalIRanges) *instance* whatever ### IntegerRanges derivative the input is, so does NOT act like an ### endomorphism in general. setMethod("range", "IntegerRanges", function(x, ..., with.revmap=FALSE, na.rm=FALSE) { if (!isTRUEorFALSE(with.revmap)) stop("'with.revmap' must be TRUE or FALSE") if (!identical(na.rm, FALSE)) warning("\"range\" method for IntegerRanges objects ", "ignores the 'na.rm' argument") args <- unname(list(x, ...)) ## TODO: Replace line below with ## args <- lapply(args, ranges) ## when ranges() works on IntegerRanges objects. args <- lapply(args, function(arg) IRanges(start(arg), width=width(arg))) ir <- do.call(c, args) ans <- .Call2("C_range_IRanges", ir, PACKAGE="IRanges") if (is(x, "NormalIRanges")) ans <- as(ans, "NormalIRanges") if (with.revmap){ mcols(ans) <- DataFrame(revmap=IntegerList(seq_along(ir))) } ans } ) ### Overwrite above method with optimized method for StitchedIPos objects. ### Like the above method, return an IRanges instance. setMethod("range", "StitchedIPos", function(x, ..., with.revmap=FALSE, ignore.strand=FALSE, na.rm=FALSE) { x <- x@pos_runs callGeneric() } ) setMethod("range", "IntegerRangesList", function(x, ..., with.revmap=FALSE, na.rm=FALSE) { if (length(list(x, ...)) >= 2L) x <- merge(x, ...) endoapply(x, range, with.revmap=with.revmap) } ) ### Equivalent to, but much faster than, 'endoapply(x, range)'. .range_CompressedIRangesList <- function(x, with.revmap=FALSE) { ## 'x_start' and 'x_end' are CompressedIntegerList objects with the ## same shape as 'x'. x_start <- start(x) x_end <- end(x) ## TEMPORARY HACK! if (!requireNamespace("XVector", quietly=TRUE)) stop("the XVector package is required by the \"range\" method ", "for CompressedIRangesList objects") ## 'sv' and 'ev' are XIntegerViews objects (see XVector package). sv <- Views(x_start@unlistData, x_start@partitioning) ev <- Views(x_end@unlistData, x_end@partitioning) is_not_empty_view <- width(sv) != 0L # same as 'width(ev) != 0L' unlisted_ans <- IRanges(viewMins(sv)[is_not_empty_view], viewMaxs(ev)[is_not_empty_view]) ans_partitioning <- PartitioningByEnd(cumsum(is_not_empty_view)) if (with.revmap) { x_partitioning <- unname(PartitioningByEnd(x)) global_revmap <- relist(seq_along(unlist(x, use.names=FALSE)), x_partitioning[width(x_partitioning) != 0L]) local_revmap <- global2local_revmap(global_revmap, ans_partitioning, x) mcols(unlisted_ans)$revmap <- local_revmap } ans <- relist(unlisted_ans, ans_partitioning) names(ans) <- names(x) ans } setMethod("range", "CompressedIRangesList", function(x, ..., with.revmap=FALSE, na.rm=FALSE) { if (!isTRUEorFALSE(with.revmap)) stop("'with.revmap' must be TRUE or FALSE") if (length(list(x, ...)) >= 2L) x <- merge(x, ...) .range_CompressedIRangesList(x, with.revmap=with.revmap) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### reduce() ### setGeneric("reduce", signature="x", function(x, drop.empty.ranges=FALSE, ...) standardGeneric("reduce") ) ### Always return an IRanges (or NormalIRanges) *instance* whatever ### IntegerRanges derivative the input is, so does NOT act like an ### endomorphism in general. setMethod("reduce", "IntegerRanges", function(x, drop.empty.ranges=FALSE, min.gapwidth=1L, with.revmap=FALSE, with.inframe.attrib=FALSE) { if (!isTRUEorFALSE(drop.empty.ranges)) stop("'drop.empty.ranges' must be TRUE or FALSE") if (!isSingleNumber(min.gapwidth)) stop("'min.gapwidth' must be a single integer") if (!is.integer(min.gapwidth)) min.gapwidth <- as.integer(min.gapwidth) if (min.gapwidth < 0L) stop("'min.gapwidth' must be non-negative") if (!isTRUEorFALSE(with.revmap)) stop("'with.revmap' must be TRUE or FALSE") if (!isTRUEorFALSE(with.inframe.attrib)) stop("'with.inframe.attrib' must be TRUE or FALSE") C_ans <- .Call2("C_reduce_IntegerRanges", start(x), width(x), drop.empty.ranges, min.gapwidth, with.revmap, with.inframe.attrib, PACKAGE="IRanges") ans <- new2("IRanges", start=C_ans$start, width=C_ans$width, check=FALSE) if (is(x, "NormalIRanges")) ans <- as(ans, "NormalIRanges") if (with.revmap) { mcols(ans) <- DataFrame(revmap=IntegerList(C_ans$revmap)) } if (with.inframe.attrib) { inframe <- new2("IRanges", start=C_ans$inframe.start, width=width(x), check=FALSE) attr(ans, "inframe") <- inframe } ans } ) setMethod("reduce", "Views", function(x, drop.empty.ranges=FALSE, min.gapwidth=1L, with.revmap=FALSE, with.inframe.attrib=FALSE) { new_ranges <- callGeneric(x@ranges, drop.empty.ranges=drop.empty.ranges, min.gapwidth=min.gapwidth, with.revmap=with.revmap, with.inframe.attrib=with.inframe.attrib) .set_Views_ranges(x, new_ranges) } ) setMethod("reduce", "IntegerRangesList", function(x, drop.empty.ranges=FALSE, min.gapwidth=1L, with.revmap=FALSE, with.inframe.attrib=FALSE) { endoapply(x, reduce, drop.empty.ranges = drop.empty.ranges, min.gapwidth = min.gapwidth, with.revmap=with.revmap, with.inframe.attrib = with.inframe.attrib) } ) ### 'with.inframe.attrib' is ignored for now. ### TODO: Support 'with.inframe.attrib=TRUE'. .reduce_CompressedIRangesList <- function(x, drop.empty.ranges=FALSE, min.gapwidth=1L, with.revmap=FALSE, with.inframe.attrib=FALSE) { if (!isTRUEorFALSE(drop.empty.ranges)) stop("'drop.empty.ranges' must be TRUE or FALSE") if (!isSingleNumber(min.gapwidth)) stop("'min.gapwidth' must be a single integer") if (!is.integer(min.gapwidth)) min.gapwidth <- as.integer(min.gapwidth) if (min.gapwidth < 0L) stop("'min.gapwidth' must be non-negative") if (!isTRUEorFALSE(with.revmap)) stop("'with.revmap' must be TRUE or FALSE") if (!identical(with.inframe.attrib, FALSE)) stop("'with.inframe.attrib' argument not yet supported ", "when reducing a CompressedIRangesList object") C_ans <- .Call2("C_reduce_CompressedIRangesList", x, drop.empty.ranges, min.gapwidth, with.revmap, PACKAGE="IRanges") unlisted_ans <- new2("IRanges", start=C_ans$start, width=C_ans$width, check=FALSE) if (with.revmap) mcols(unlisted_ans) <- DataFrame(revmap=IntegerList(C_ans$revmap)) ans_partitioning <- PartitioningByEnd(C_ans$breakpoints) names(ans_partitioning) <- names(x) relist(unlisted_ans, ans_partitioning) } setMethod("reduce", "CompressedIRangesList", .reduce_CompressedIRangesList) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### gaps() ### setGeneric("gaps", signature="x", function(x, start=NA, end=NA, ...) standardGeneric("gaps") ) ### Always return an IRanges (or NormalIRanges) *instance* whatever ### IntegerRanges derivative the input is, so does NOT act like an ### endomorphism in general. setMethod("gaps", "IntegerRanges", function(x, start=NA, end=NA) { start <- S4Vectors:::normargSingleStartOrNA(start) end <- S4Vectors:::normargSingleEndOrNA(end) C_ans <- .Call2("C_gaps_IntegerRanges", start(x), width(x), start, end, PACKAGE="IRanges") ans <- new2("IRanges", start=C_ans$start, width=C_ans$width, check=FALSE) if (is(x, "NormalIRanges")) ans <- as(ans, "NormalIRanges") ans } ) setMethod("gaps", "Views", function(x, start=NA, end=NA) { if (!isSingleNumberOrNA(start)) stop("'start' must be a single integer") if (!is.integer(start)) start <- as.integer(start) if (!isSingleNumberOrNA(end)) stop("'end' must be a single integer") if (!is.integer(end)) end <- as.integer(end) if (is.na(start)) start <- 1L if (is.na(end)) end <- length(subject(x)) new_ranges <- gaps(x@ranges, start=start, end=end) .set_Views_ranges(x, new_ranges) } ) .gaps_RangesList <- function(x, start=NA, end=NA) { x_len <- length(x) if (!S4Vectors:::isNumericOrNAs(start)) stop("'start' must be an integer vector or NA") if (!is.integer(start)) start <- as.integer(start) if (!S4Vectors:::isNumericOrNAs(end)) stop("'end' must be an integer vector or NA") if (!is.integer(end)) end <- as.integer(end) start <- IntegerList(as.list(S4Vectors:::recycleVector(start, x_len))) end <- IntegerList(as.list(S4Vectors:::recycleVector(end, x_len))) mendoapply(gaps, x, start = start, end = end) } setMethod("gaps", "IntegerRangesList", .gaps_RangesList) .gaps_CompressedIRangesList <- function(x, start=NA, end=NA) { ## Normalize 'start'. if (!S4Vectors:::isNumericOrNAs(start)) stop("'start' must be an integer vector or NA") if (!is.integer(start)) start <- as.integer(start) if (length(start) != 1L) start <- S4Vectors:::V_recycle(start, x, x_what="start", skeleton_what="x") ## Normalize 'end'. if (!S4Vectors:::isNumericOrNAs(end)) stop("'end' must be an integer vector or NA") if (!is.integer(end)) end <- as.integer(end) if (length(end) != 1L) end <- S4Vectors:::V_recycle(end, x, x_what="end", skeleton_what="x") chunksize <- 10000000L if (length(x) <= chunksize) { ## Process all at once. ans <- .Call2("C_gaps_CompressedIRangesList", x, start, end, PACKAGE="IRanges") return(ans) } ## Process by chunk. verbose <- getOption("verbose", default=FALSE) chunks <- as(breakInChunks(length(x), chunksize=chunksize), "IRanges") ans_chunks <- lapply(seq_along(chunks), function(i) { if (verbose) cat("Processing chunk #", i, "/", length(chunks), " ... ", sep="") chunk <- chunks[i] x_chunk <- extractROWS(x, chunk) start_chunk <- if (length(start) == 1L) start else extractROWS(start, chunk) end_chunk <- if (length(end) == 1L) end else extractROWS(end, chunk) ans_chunk <- .gaps_CompressedIRangesList(x_chunk, start=start_chunk, end=end_chunk) if (verbose) cat("OK\n") ans_chunk }) do.call(c, ans_chunks) } setMethod("gaps", "CompressedIRangesList", .gaps_CompressedIRangesList) ### 'start' and 'end' are ignored. setMethod("gaps", "MaskCollection", function(x, start=NA, end=NA) { start <- 1L end <- width(x) x@nir_list <- lapply(nir_list(x), function(nir) gaps(nir, start=start, end=end) ) x@NAMES <- as.character(NA) x@desc <- as.character(NA) x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### disjoin() ### setGeneric("disjoin", function(x, ...) standardGeneric("disjoin")) ### Always return an IRanges *instance* whatever IntegerRanges derivative ### the input is, so does NOT act like an endomorphism in general. ### FIXME: Does not properly handle zero-width ranges at the moment e.g. ### disjoin(IRanges(c(1, 11, 13), width=c(2, 5, 0)) returns ### IRanges(c(1, 11, 13), width=c(2, 2, 3)) when it should return ### IRanges(c(1, 11, 13, 13), width=c(2, 2, 0, 3)). setMethod("disjoin", "IntegerRanges", function(x, with.revmap=FALSE) { if (!isTRUEorFALSE(with.revmap)) stop("'with.revmap' must be TRUE or FALSE") ## starts: original starts and end+1 when inside another interval ## ends: original ends and start-1 when inside another interval starts <- unique(start(x)) ends <- unique(end(x)) adj_start <- head(sort(unique(c(starts, ends + 1L))), -1L) adj_end <- tail(sort(unique(c(ends, starts - 1L))), -1L) adj_width <- adj_end - adj_start + 1L adj <- new2("IRanges", start=adj_start, width=adj_width, check=FALSE) adj <- subsetByOverlaps(adj, x, minoverlap=1L) if (with.revmap) mcols(adj)$revmap <- as(sort(findOverlaps(adj, x)),"List") adj } ) ### Basically a no-op but returns a NormalIRanges *instance* for consistency ### with how the other inter-range transformations (range(), reduce(), gaps()) ### behave on a NormalIRanges object. setMethod("disjoin", "NormalIRanges", function(x) as(x, "NormalIRanges")) setMethod("disjoin", "IntegerRangesList", function(x, with.revmap=FALSE) endoapply(x, disjoin, with.revmap=with.revmap) ) setMethod("disjoin", "CompressedIRangesList", function(x, with.revmap=FALSE, ...) { if (!isTRUEorFALSE(with.revmap)) stop("'with.revmap' must be TRUE or FALSE") .wunlist <- function(x) ## unlist CompressedIntegerList, with integer(0) as 0 { w <- integer(length(x)) w[elementNROWS(x) != 0L] <- unlist(x, use.names=FALSE) w } rng <- range(x) if (sum(.wunlist(width(rng) + 1)) > .Machine$integer.max) return(endoapply(x, disjoin, with.revmap=with.revmap, ...)) ## localize coordinates off0 <- head(.wunlist(width(rng) + 1L), -1L) offset <- c(1L, cumsum(off0)) - .wunlist(start(rng)) local <- unlist(shift(x, offset), use.names=FALSE) ## disjoin d <- disjoin(local, with.revmap=with.revmap, ...) vec <- unlist(start(shift(rng, offset)), use.names=FALSE) lvls <- factor(seq_along(x)) lvls0 <- lvls[elementNROWS(rng) != 0] f <- lvls0[findInterval(start(d), vec)] ans <- split(d, f) ## globalize coordinates ans <- shift(ans, -offset) ## localize 'revmap' if (with.revmap) { unlisted_ans <- unlist(ans, use.names=FALSE) global_revmap <- mcols(unlisted_ans, use.names=FALSE)$revmap local_revmap <- global2local_revmap(global_revmap, ans, x) mcols(unlisted_ans)$revmap <- local_revmap ans <- relist(unlisted_ans, ans) } names(ans) <- names(x) ans }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### isDisjoint() ### setGeneric("isDisjoint", function(x, ...) standardGeneric("isDisjoint")) setMethod("isDisjoint", "IntegerRanges", function(x) { x_len <- length(x) if (x_len < 2L) return(TRUE) x_start <- start(x) x_end <- end(x) oo <- order(x) start2 <- x_start[oo] end2 <- x_end[oo] all(start2[-1L] > end2[-x_len]) } ) ### Overwrite above method with optimized method for StitchedIPos objects. setMethod("isDisjoint", "StitchedIPos", function(x) callGeneric(x@pos_runs)) setMethod("isDisjoint", "NormalIRanges", function(x) TRUE) setMethod("isDisjoint", "IntegerRangesList", function(x) vapply(x, isDisjoint, logical(1)) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### disjointBins() ### ## make intervals disjoint by segregating them into separate IntegerRanges setGeneric("disjointBins", function(x, ...) standardGeneric("disjointBins")) setMethod("disjointBins", "IntegerRanges", function(x) { x_ord <- NULL if (S4Vectors:::isNotSorted(start(x))) { # minimize work for sorted ranges (common) x_ord <- order(x) x <- x[x_ord] } bins <- .Call2("C_disjointBins_IntegerRanges", start(x), width(x), PACKAGE="IRanges") if (!is.null(x_ord)) { rev_ord <- integer(length(x_ord)) rev_ord[x_ord] <- seq_along(rev_ord) bins <- bins[rev_ord] } names(bins) <- names(x) bins } ) ### Overwrite above method with trivial method for NormalIRanges objects. setMethod("disjointBins", "NormalIRanges", function(x) rep.int(1L, length(x))) setMethod("disjointBins", "IntegerRangesList", function(x) as(lapply(x, disjointBins), "IntegerList") ) IRanges/R/intra-range-methods.R0000644000175100017510000006574314626176651017353 0ustar00biocbuildbiocbuild### ========================================================================= ### Intra-range methods ### ------------------------------------------------------------------------- ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### update_ranges() ### ### For internal use by the intra-range methods. ### Update the ranges of a Ranges derivative. Return an object of the same ### class as and parallel to 'x'. setGeneric("update_ranges", signature="x", function(x, start=NULL, end=NULL, width=NULL, use.names=TRUE) standardGeneric("update_ranges") ) ### Shallow check: only check type and length, not the content. .check_start_or_end_or_width <- function(start, x_len) { if (is.null(start)) return() stopifnot(is.integer(start)) stopifnot(length(start) == x_len) } ### Does not validate the modified object. setMethod("update_ranges", "IRanges", function(x, start=NULL, end=NULL, width=NULL, use.names=TRUE) { use.names <- S4Vectors:::normargUseNames(use.names) narg <- sum(!is.null(start), !is.null(end), !is.null(width)) if (narg == 0L) { if (!(use.names || is.null(x@NAMES))) x@NAMES <- NULL return(x) } stopifnot(narg <= 2L) x_len <- length(x) .check_start_or_end_or_width(start, x_len) .check_start_or_end_or_width(end, x_len) .check_start_or_end_or_width(width, x_len) if (narg == 2L) { if (!is.null(end)) { if (is.null(start)) { ## 'end' and 'width' supplied start <- end - width + 1L } else { ## 'start' and 'end' supplied width <- end - start + 1L } } args <- list(start=start, width=width) } else { stopifnot(is.null(width)) if (is.null(start)) { ## only 'end' supplied width <- end - x@start + 1L args <- list(width=width) } else { ## only 'start' supplied width <- x@width - (start - x@start) args <- list(start=start, width=width) } } if (use.names) { more_args <- list(check=FALSE) } else { more_args <- list(NAMES=NULL, check=FALSE) } args <- c(list(x), args, more_args) do.call(BiocGenerics:::replaceSlots, args) } ) setMethod("update_ranges", "Views", function(x, start=NULL, end=NULL, width=NULL, use.names=TRUE) { x@ranges <- update_ranges(x@ranges, start=start, end=end, width=width, use.names=use.names) x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### shift() ### setGeneric("shift", signature="x", function(x, shift=0L, use.names=TRUE) standardGeneric("shift") ) ### Returns an NA-free unnamed integer vector of length 1 or length 'x_len'. ### If the user-supplied 'shift' vector is constant then the returned vector ### is guaranteed to be of length <= 1. .normarg_shift <- function(shift, x_len) { if (!is.numeric(shift)) stop("'shift' must be a numeric vector") if (!is.integer(shift)) { shift <- as.integer(shift) } else if (!is.null(names(shift))) { names(shift) <- NULL } shift_len <- length(shift) if (shift_len == 0L) { if (x_len != 0L) stop(wmsg("'length(shift)' is 0 but 'length(x)' is not")) return(shift) } if (shift_len == 1L) { if (is.na(shift)) stop(wmsg("'shift' cannot be NA")) return(shift) } if (shift_len > x_len) stop(wmsg("'length(shift)' is greater than 'length(x)'")) if (x_len %% length(shift) != 0L) warning(wmsg("'length(x)' is not a multiple of 'length(shift)'")) if (anyNA(shift)) stop(wmsg("'shift' cannot contain NAs")) if (isConstant(shift)) return(shift[[1L]]) suppressWarnings(S4Vectors:::recycleVector(shift, x_len)) } setMethod("shift", "Ranges", function(x, shift=0L, use.names=TRUE) { shift <- .normarg_shift(shift, length(x)) if (is(x, "NormalIRanges") && length(shift) >= 2L) stop("'shift' must be a single number when shifting ", "a NormalIRanges object") new_start <- start(x) + shift x <- update_ranges(x, start=new_start, width=width(x), use.names=use.names) validObject(x) x } ) ### Overwrite above method with optimized method for IPos objects. ### An IPos object cannot hold names so the 'use.names' arg has no effect. ### NOTE: We only support shifting by a single value at the moment! setMethod("shift", "IPos", function(x, shift=0L, use.names=TRUE) { shift <- .normarg_shift(shift, length(x)) if (is(x, "UnstitchedIPos")) { new_pos <- pos(x) + shift ans <- BiocGenerics:::replaceSlots(x, pos=new_pos, check=FALSE) return(ans) } if (length(shift) >= 2L) stop("'shift' must be a single number when shifting ", "a StitchedIPos object") new_pos_runs <- callGeneric(x@pos_runs, shift=shift) BiocGenerics:::replaceSlots(x, pos_runs=new_pos_runs, check=FALSE) } ) setMethod("shift", "RangesList", function(x, shift=0L, use.names=TRUE) { if (!is(shift, "List")) shift <- as(shift, "List") shift <- S4Vectors:::VH_recycle(shift, x, "shift", "x") if (is(x, "CompressedRangesList")) { unlisted_shift <- unlist(shift, use.names=FALSE) new_unlistData <- shift(x@unlistData, shift=unlisted_shift, use.names=use.names) ans <- BiocGenerics:::replaceSlots(x, unlistData=new_unlistData, check=FALSE) return(ans) } mendoapply(shift, x, shift, MoreArgs=list(use.names=use.names)) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### narrow() ### ### The default "narrow" method calls windows() so we only need to implement ### a "windows" method for IntegerRanges objects to make narrow() work on ### these objects. setMethod("windows", "Ranges", function(x, start=NA, end=NA, width=NA) { ir <- make_IRanges_from_windows_args(x, start, end, width) if (length(x) == 0L) return(x) ans_start <- start(x) + start(ir) - 1L ans_width <- width(ir) update_ranges(x, start=ans_start, width=ans_width) } ) setMethod("narrow", "MaskCollection", function(x, start=NA, end=NA, width=NA, use.names=TRUE) { solved_SEW <- solveUserSEWForSingleSeq(width(x), start, end, width) solved_start <- start(solved_SEW) solved_end <- end(solved_SEW) solved_width <- width(solved_SEW) x@nir_list <- lapply(nir_list(x), function(nir) shift(restrict(nir, start=solved_start, end=solved_end), 1L - solved_start) ) x@width <- solved_width if (!S4Vectors:::normargUseNames(use.names)) names(x) <- NULL x } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### resize() ### setGeneric("resize", signature="x", function(x, width, fix="start", use.names=TRUE, ...) standardGeneric("resize") ) setMethod("resize", "Ranges", function(x, width, fix="start", use.names=TRUE) { if (is(x, "NormalIRanges")) stop("resizing a NormalIRanges object is not supported") lx <- length(x) if (!is.numeric(width) || S4Vectors:::anyMissing(width)) stop("'width' must be a numeric vector without NA's") if (!is.integer(width)) width <- as.integer(width) if (S4Vectors:::anyMissingOrOutside(width, 0L)) stop("'width' values must be non-negative") if (!(is.character(fix) || (is(fix, "Rle") && is.character(runValue(fix)))) || (length(fix) == 0L && length(x) > 0L) || (length(setdiff(unique(fix), c("start", "end", "center"))) > 0)) { stop("'fix' must be a character vector or character Rle ", "with values in \"start\", \"end\", and \"center\"") } if (!is(fix, "Rle")) fix <- Rle(fix) if (length(fix) != lx) fix <- rep(fix, length.out = lx) ans_width <- S4Vectors:::recycleVector(width, lx) ans_start <- start(x) if (!identical(runValue(fix), "start")) { fixEnd <- as(fix == "end", "IRanges") if (length(fixEnd) > 0) { value <- extractROWS(ans_start, fixEnd) + (extractROWS(width(x), fixEnd) - extractROWS(ans_width, fixEnd)) ans_start <- replaceROWS(ans_start, fixEnd, value) } fixCenter <- as(fix == "center", "IRanges") if (length(fixCenter) > 0) { value <- extractROWS(ans_start, fixCenter) + (extractROWS(width(x), fixCenter) - extractROWS(ans_width, fixCenter)) %/% 2L ans_start <- replaceROWS(ans_start, fixCenter, value) } } update_ranges(x, start=ans_start, width=ans_width, use.names=use.names) } ) setMethod("resize", "RangesList", function(x, width, fix="start", use.names=TRUE, ...) { if (!is(width, "List")) width <- as(width, "List") width <- S4Vectors:::VH_recycle(width, x, "width", "x") if (!is(fix, "List")) fix <- as(fix, "List") fix <- S4Vectors:::VH_recycle(fix, x, "fix", "x") if (is(x, "CompressedRangesList")) { unlisted_width <- unlist(width, use.names=FALSE) unlisted_fix <- unlist(fix, use.names=FALSE) new_unlistData <- resize(x@unlistData, width=unlisted_width, fix=unlisted_fix, use.names=use.names, ...) ans <- BiocGenerics:::replaceSlots(x, unlistData=new_unlistData, check=FALSE) return(ans) } mendoapply(resize, x, width, fix, MoreArgs=list(use.names=use.names, ...)) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### flank() ### setGeneric("flank", signature="x", function(x, width, start=TRUE, both=FALSE, use.names=TRUE, ...) standardGeneric("flank") ) setMethod("flank", "Ranges", function(x, width, start=TRUE, both=FALSE, use.names=TRUE) { if (is(x, "NormalIRanges")) stop("flanking a NormalIRanges object is not supported") width <- recycleIntegerArg(width, "width", length(x)) if (!is.logical(start) || S4Vectors:::anyMissing(start)) stop("'start' must be logical without NA's") start <- S4Vectors:::recycleVector(unname(start), length(x)) if (!isTRUEorFALSE(both)) stop("'both' must be TRUE or FALSE") ans_start <- integer(length(x)) if (both) { idx1 <- which(start) idx2 <- which(!start) width <- abs(width) ans_width <- 2L * width ans_start[idx1] <- start(x)[idx1] - width[idx1] ans_start[idx2] <- end(x)[idx2] - width[idx2] + 1L } else { idx1a <- which(start & width >= 0L) idx1b <- which(start & width < 0L) idx2a <- which(!start & width >= 0L) idx2b <- which(!start & width < 0L) ans_width <- abs(width) ans_start[idx1a] <- start(x)[idx1a] - width[idx1a] ans_start[idx1b] <- start(x)[idx1b] ans_start[idx2a] <- end(x)[idx2a] + 1L ans_start[idx2b] <- end(x)[idx2b] + width[idx2b] + 1L } update_ranges(x, start=ans_start, width=ans_width, use.names=use.names) } ) setMethod("flank", "RangesList", function(x, width, start=TRUE, both=FALSE, use.names=TRUE, ...) { if (!is(width, "List")) width <- as(width, "List") width <- S4Vectors:::VH_recycle(width, x, "width", "x") if (!is(start, "List")) start <- as(start, "List") start <- S4Vectors:::VH_recycle(start, x, "start", "x") if (is(x, "CompressedRangesList")) { unlisted_width <- unlist(width, use.names=FALSE) unlisted_start <- unlist(start, use.names=FALSE) new_unlistData <- flank(x@unlistData, width=unlisted_width, start=unlisted_start, both=both, use.names=use.names, ...) ans <- BiocGenerics:::replaceSlots(x, unlistData=new_unlistData, check=FALSE) return(ans) } mendoapply(flank, x, width, start, MoreArgs=list(both=both, use.names=use.names, ...)) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### promoters() and terminators() ### setGeneric("promoters", signature="x", function(x, upstream=2000, downstream=200, use.names=TRUE, ...) standardGeneric("promoters") ) setGeneric("terminators", signature="x", function(x, upstream=2000, downstream=200, use.names=TRUE, ...) standardGeneric("terminators") ) ### Computes the promoter regions if 'site' is set to "TSS" (Transcription ### Start Site), or the terminator regions if it's set to "TES" (Transcription ### End Site). .IntegerRanges_promoters <- function(x, upstream, downstream, use.names=TRUE, site=c("TSS", "TES")) { site <- match.arg(site) if (is(x, "NormalIRanges")) stop(wmsg("promoters() or terminators() is not supported ", "on a NormalIRanges object")) x_len <- length(x) upstream <- recycleIntegerArg(upstream, "upstream", x_len) downstream <- recycleIntegerArg(downstream, "downstream", x_len) if (x_len == 0L) return(update_ranges(x, use.names=use.names)) if (min(upstream) < 0L || min(downstream) < 0L) stop("'upstream' and 'downstream' must be integers >= 0") x_site <- if (site == "TSS") start(x) else end(x) new_start <- x_site - upstream new_end <- x_site + downstream - 1L update_ranges(x, start=new_start, end=new_end, use.names=use.names) } setMethod("promoters", "IntegerRanges", function(x, upstream=2000, downstream=200, use.names=TRUE) .IntegerRanges_promoters(x, upstream, downstream, use.names=use.names, site="TSS") ) setMethod("terminators", "IntegerRanges", function(x, upstream=2000, downstream=200, use.names=TRUE) .IntegerRanges_promoters(x, upstream, downstream, use.names=use.names, site="TES") ) .RangesList_promoters <- function(x, FUN, upstream, downstream, use.names=TRUE) { if (!is(upstream, "List")) upstream <- as(upstream, "List") upstream <- S4Vectors:::VH_recycle(upstream, x, "upstream", "x") if (!is(downstream, "List")) downstream <- as(downstream, "List") downstream <- S4Vectors:::VH_recycle(downstream, x, "downstream", "x") if (is(x, "CompressedRangesList")) { unlisted_upstream <- unlist(upstream, use.names=FALSE) unlisted_downstream <- unlist(downstream, use.names=FALSE) new_unlistData <- FUN(x@unlistData, upstream=unlisted_upstream, downstream=unlisted_downstream, use.names=use.names) ans <- BiocGenerics:::replaceSlots(x, unlistData=new_unlistData, check=FALSE) return(ans) } mendoapply(FUN, x, upstream, downstream, MoreArgs=list(use.names=use.names)) } setMethod("promoters", "RangesList", function(x, upstream=2000, downstream=200, use.names=TRUE) .RangesList_promoters(x, promoters, upstream, downstream, use.names=use.names) ) setMethod("terminators", "RangesList", function(x, upstream=2000, downstream=200, use.names=TRUE) .RangesList_promoters(x, terminators, upstream, downstream, use.names=use.names) ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### reflect() ### setGeneric("reflect", signature="x", function(x, bounds, use.names=TRUE) standardGeneric("reflect") ) setMethod("reflect", "IntegerRanges", function(x, bounds, use.names=TRUE) { if (is(x, "NormalIRanges")) stop("reflecting a NormalIRanges object is not supported") if (!is(bounds, "IntegerRanges")) stop("'bounds' must be an IntegerRanges object") if (length(x) > 1 && length(bounds) == 0) stop("'bounds' is an empty IntegerRanges object") if (length(x) < length(bounds)) bounds <- head(bounds, length(x)) ans_start <- (2L * start(bounds) + width(bounds) - 1L) - end(x) update_ranges(x, start=ans_start, width=width(x), use.names=use.names) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### restrict() ### setGeneric("restrict", signature="x", function(x, start=NA, end=NA, keep.all.ranges=FALSE, use.names=TRUE) standardGeneric("restrict") ) ### We distinguish 4 exclusive types of relationship between a range in 'x' ### and its corresponding restriction interval (specified by 'start' and ### 'end'): ### - Type 1: They have a non-empty intersection. ### - Type 2: The restriction interval is empty but its bounds are within ### the range in 'x'. ### - Type 3: The restriction interval is not empty and is adjacent to the ### range in 'x' i.e. the range in 'x' ends at start - 1 or starts ### at end - 1. ### ### drop.ranges.mode: ### 0L: Ranges in 'x' that are empty after restriction are dropped. ### 1L: Ranges in 'x' that are not overlapping and not even adjacent ### with the region of restriction are dropped. ### "Not overlapping and not adjacent" means that they end strictly ### before start - 1 or start strictly after end + 1. ### Those that are not overlapping but are however adjacent are kept ### (and are empty after restriction). ### 2L: All ranges in 'x' are kept after restriction. ### Note that the only mode compatible with a NormalIRanges object is 0L. .restrict_IntegerRanges <- function(x, start, end, drop.ranges.mode, use.names) { if (!S4Vectors:::isNumericOrNAs(start)) stop("'start' must be a vector of integers") if (!is.integer(start)) start <- as.integer(start) if (!S4Vectors:::isNumericOrNAs(end)) stop("'end' must be a vector of integers") if (!is.integer(end)) end <- as.integer(end) if (length(x) != 0L) { if (length(start) == 0L || length(start) > length(x)) stop("invalid 'start' length") if (length(end) == 0L || length(end) > length(x)) stop("invalid 'end' length") } start <- S4Vectors:::recycleVector(start, length(x)) end <- S4Vectors:::recycleVector(end, length(x)) use.names <- S4Vectors:::normargUseNames(use.names) ans_start <- start(x) ans_end <- end(x) if (use.names) ans_names <- names(x) else ans_names <- NULL ans_mcols <- mcols(x, use.names=FALSE) ## Compare ranges in 'x' with 'start'. if (drop.ranges.mode == 0L) far_too_left <- !is.na(start) & (ans_end < start) else far_too_left <- !is.na(start) & (ans_end < start - 1L) if (drop.ranges.mode == 2L) { ans_end[far_too_left] <- start[far_too_left] - 1L } else { ## Drop the ranges that are far too left with respect to the ## region of restriction. keep_idx <- which(!far_too_left) ans_start <- ans_start[keep_idx] ans_end <- ans_end[keep_idx] if (!is.null(ans_names)) ans_names <- ans_names[keep_idx] ans_mcols <- extractROWS(ans_mcols, keep_idx) start <- start[keep_idx] end <- end[keep_idx] } ## Fix 'ans_start'. too_left <- !is.na(start) & (ans_start < start) ans_start[too_left] <- start[too_left] ## Compare ranges in 'x' with 'end'. if (drop.ranges.mode == 0L) far_too_right <- !is.na(end) & (ans_start > end) else far_too_right <- !is.na(end) & (ans_start > end + 1L) if (drop.ranges.mode == 2L) { ans_start[far_too_right] <- end[far_too_right] + 1L } else { ## Drop the ranges that are far too right with respect to the ## region of restriction. keep_idx <- which(!far_too_right) ans_start <- ans_start[keep_idx] ans_end <- ans_end[keep_idx] if (!is.null(ans_names)) ans_names <- ans_names[keep_idx] ans_mcols <- extractROWS(ans_mcols, keep_idx) start <- start[keep_idx] end <- end[keep_idx] } ## Fix 'ans_end'. too_right <- !is.na(end) & (ans_end > end) ans_end[too_right] <- end[too_right] ans_width <- ans_end - ans_start + 1L BiocGenerics:::replaceSlots(x, start=ans_start, width=ans_width, NAMES=ans_names, mcols=ans_mcols, check=FALSE) } setMethod("restrict", "IntegerRanges", function(x, start=NA, end=NA, keep.all.ranges=FALSE, use.names=TRUE) { if (!isTRUEorFALSE(keep.all.ranges)) stop("'keep.all.ranges' must be TRUE or FALSE") use.names <- S4Vectors:::normargUseNames(use.names) if (is(x, "NormalIRanges")) { if (keep.all.ranges) stop("'keep.all.ranges=TRUE' is not supported ", "when 'x' is a NormalIRanges object") drop.ranges.mode <- 0L } else { if (keep.all.ranges) drop.ranges.mode <- 2L else drop.ranges.mode <- 1L } .restrict_IntegerRanges(x, start, end, drop.ranges.mode, use.names) } ) setMethod("restrict", "Views", function(x, start=NA, end=NA, keep.all.ranges=FALSE, use.names=TRUE) { new_ranges <- restrict(ranges(x), start=start, end=end, keep.all.ranges=keep.all.ranges, use.names=use.names) BiocGenerics:::replaceSlots(x, ranges=new_ranges, check=FALSE) } ) setMethod("restrict", "RangesList", function(x, start=NA, end=NA, keep.all.ranges=FALSE, use.names=TRUE) { if (!isTRUEorFALSE(keep.all.ranges)) stop("'keep.all.ranges' must be TRUE or FALSE") if (!is(start, "List")) start <- as(start, "List") start <- S4Vectors:::VH_recycle(start, x, "start", "x") if (!is(end, "List")) end <- as(end, "List") end <- S4Vectors:::VH_recycle(end, x, "end", "x") if (is(x, "CompressedRangesList")) { if (!keep.all.ranges) { drop <- (!is.na(end) & start(x) > end + 1L) | (!is.na(start) & end(x) < start - 1L) } unlisted_start <- unlist(start, use.names=FALSE) unlisted_end <- unlist(end, use.names=FALSE) new_unlistData <- restrict(x@unlistData, start=unlisted_start, end=unlisted_end, keep.all.ranges=keep.all.ranges, use.names=use.names) if (keep.all.ranges) { ans <- BiocGenerics:::replaceSlots(x, unlistData=new_unlistData, check=FALSE) } else { ans <- relist(new_unlistData, PartitioningByWidth(lengths(x) - sum(drop))) } return(ans) } mendoapply(restrict, x, start, end, MoreArgs=list(keep.all.ranges=keep.all.ranges, use.names=use.names)) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### threebands() ### setGeneric("threebands", signature="x", function(x, start=NA, end=NA, width=NA) standardGeneric("threebands") ) ### Method for IRanges only! setMethod("threebands", "IRanges", function(x, start=NA, end=NA, width=NA) { middle <- narrow(x, start=start, end=end, width=width, use.names=FALSE) left <- right <- middle left@start <- start(x) left@width <- start(middle) - start(x) right@start <- end(middle) + 1L right@width <- end(x) - end(middle) list(left=left, middle=middle, right=right) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### Zooming (symmetrically scales the width). ### ### TODO: Implement a zoom() generic and methods and make the "Ops" method ### below a simple convenience wrapper for zoom(). Recommend the use of zoom() ### over "Ops" methods in packages and scripts. Reserve "Ops" methods as a ### convenience when working interactively. ### setMethod("Ops", c("Ranges", "numeric"), function(e1, e2) { if (S4Vectors:::anyMissing(e2)) stop("NA not allowed as zoom factor") if ((length(e1) < length(e2) && length(e1)) || (length(e1) && !length(e2)) || (length(e1) %% length(e2) != 0)) stop("zoom factor length not a multiple of number of ranges") if (.Generic == "*") { e2 <- ifelse(e2 < 0, abs(1/e2), e2) r <- e1 mid <- (start(r)+end(r))/2 w <- width(r)/e2 update_ranges(r, start = as.integer(ceiling(mid - w/2)), width = as.integer(floor(w))) } else { if (.Generic == "-") { e2 <- -e2 .Generic <- "+" } if (.Generic == "+") { if (any(-e2*2 > width(e1))) stop("adjustment would result in ranges with negative widths") update_ranges(e1, start = as.integer(start(e1) - e2), end = as.integer(end(e1) + e2)) } } } ) setMethod("Ops", c("RangesList", "numeric"), function(e1, e2) { for (i in seq_len(length(e1))) e1[[i]] <- callGeneric(e1[[i]], e2) e1 }) setMethod("Ops", c("CompressedRangesList", "numeric"), function(e1, e2) { relist(callGeneric(unlist(e1, use.names = FALSE), e2), e1) }) IRanges/R/multisplit.R0000644000175100017510000000065714626176651015702 0ustar00biocbuildbiocbuild### ========================================================================= ### multisplit() ### ------------------------------------------------------------------------- ### multisplit <- function(x, f) { if (!is.list(f) && !is(f, "List")) stop("'f' must be a list") if (length(x) != length(f)) stop("Length of 'f' must equal length of 'x'") splitAsList(rep(x, elementNROWS(f)), unlist(f, use.names = FALSE)) } IRanges/R/nearest-methods.R0000644000175100017510000001540214626176651016570 0ustar00biocbuildbiocbuild### ========================================================================= ### nearest (and related) methods ### ------------------------------------------------------------------------- ### setClassUnion("IntegerRanges_OR_missing", c("IntegerRanges", "missing")) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### precede() and follow() ### setGeneric("precede", function(x, subject = x, ...) standardGeneric("precede")) setMethod("precede", c("IntegerRanges", "IntegerRanges_OR_missing"), function(x, subject, select = c("first", "all")) { select <- match.arg(select) s <- start(subject) ord <- NULL if (S4Vectors:::isNotSorted(s)) { ord <- base::order(s) s <- s[ord] } if (select == "all") { srle <- Rle(s) s <- runValue(srle) } i <- findInterval(end(x), s) + 1L i[i > length(s)] <- NA if (select == "all") { vectorToHits(i, srle, ord) } else { if (!is.null(ord)) i <- ord[i] i } } ) setGeneric("follow", function(x, subject = x, ...) standardGeneric("follow")) setMethod("follow", c("IntegerRanges", "IntegerRanges_OR_missing"), function(x, subject, select = c("last", "all")) { select <- match.arg(select) e <- end(subject) ord <- NULL if (S4Vectors:::isNotSorted(e)) { ord <- base::order(e) e <- e[ord] } if (select == "all") { srle <- Rle(e) e <- runValue(srle) } i <- findInterval(start(x) - 1L, e) i[i == 0] <- NA if (select == "all") { vectorToHits(i, srle, ord) } else { if (!is.null(ord)) i <- ord[i] i } } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### nearest() ### ### Used in GenomicRanges. ### TODO: Move to Hits-class.R vectorToHits <- function(i, srle, ord) { lx <- length(i) v <- !is.na(i) i <- i[v] w <- width(srle)[i] subj <- unlist_as_integer(IRanges(start(srle)[i], width=w)) m <- cbind(queryHits = rep(seq(lx)[v], w), subjectHits = if (!is.null(ord)) ord[subj] else subj) if (!is.null(ord)) m <- m[orderIntegerPairs(m[,1L], m[,2L]),,drop=FALSE] Hits(m[ , 1L], m[ , 2L], lx, length(srle), sort.by.query=TRUE) } setGeneric("nearest", function(x, subject, ...) standardGeneric("nearest")) setMethod("nearest", c("IntegerRanges", "IntegerRanges_OR_missing"), function(x, subject, select = c("arbitrary", "all")) { select <- match.arg(select) if (!missing(subject)) { ol <- findOverlaps(x, subject, maxgap = 0L, select = select) } else { subject <- x ol <- findOverlaps(x, maxgap = 0L, select = select, drop.self = TRUE) } if (select == "all") { olv <- selectHits(ol, select="first") } else olv <- ol x <- x[is.na(olv)] before <- precede(x, subject, if (select == "all") "all" else "first") after <- follow(x, subject, if (select == "all") "all" else "last") if (select == "all") { before0 <- before before <- selectHits(before, select="first") after0 <- after after <- selectHits(after, select="first") } leftdist <- (start(subject)[before] - end(x)) rightdist <- (start(x) - end(subject)[after]) left <- leftdist < rightdist left[is.na(left)] <- is.na(after)[is.na(left)] if (select == "all") { filterHits <- function(hits, i) { m <- as.matrix(hits[as(hits, "IRanges")[i]]) m[,1L] <- map[m[,1L]] m } map <- which(is.na(olv)) right <- !left left[leftdist == rightdist] <- TRUE m <- rbind(as.matrix(ol), filterHits(before0, left), filterHits(after0, right)) m <- m[orderIntegerPairs(m[,1L], m[,2L]),, drop=FALSE] ## unname() required because in case 'm' has only 1 row ## 'm[ , 1L]' and 'm[ , 2L]' will return a named atomic vector ol@from <- unname(m[ , 1L]) ol@to <- unname(m[ , 2L]) } else { olv[is.na(olv)] <- ifelse(left, before, after) ol <- olv } ol }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### distance() ### setGeneric("distance", function(x, y, ...) standardGeneric("distance")) setMethod("distance", c("IntegerRanges", "IntegerRanges"), function(x, y) { max_start <- pmax.int(start(x), start(y)) min_end <- pmin.int(end(x), end(y)) pmax.int(max_start - min_end - 1L, 0L) } ) setMethod("distance", c("Pairs", "missing"), function(x, y) { distance(first(x), second(x)) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### distanceToNearest() ### setGeneric("distanceToNearest", function(x, subject = x, ...) standardGeneric("distanceToNearest")) setMethod("distanceToNearest", c("IntegerRanges", "IntegerRanges_OR_missing"), function(x, subject, select = c("arbitrary", "all")) { select <- match.arg(select) if (missing(subject)) { subject <- x x_nearest <- nearest(x, select = select) } else { x_nearest <- nearest(x, subject, select = select) } if (select == "arbitrary") { queryHits <- seq_along(x)[!is.na(x_nearest)] subjectHits <- x_nearest[!is.na(x_nearest)] } else { queryHits <- queryHits(x_nearest) subjectHits <- subjectHits(x_nearest) } if (!length(subjectHits) || all(is.na(subjectHits))) { Hits(nLnode=length(x), nRnode=length(subject), distance=integer(0), sort.by.query=TRUE) } else { distance = distance(x[queryHits], subject[subjectHits]) Hits(queryHits, subjectHits, length(x), length(subject), distance, sort.by.query=TRUE) } } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### selectNearest() ### selectNearest <- function(hits, x, subject) { hits <- as(hits, "SortedByQueryHits") hitsByQuery <- relist(hits, as(hits, "Partitioning")) dist <- distance(x[queryHits(hits)], subject[subjectHits(hits)]) distByQuery <- relist(dist, hitsByQuery) unlist(hitsByQuery[distByQuery == min(distByQuery)]) } IRanges/R/range-squeezers.R0000644000175100017510000000160714626176651016610 0ustar00biocbuildbiocbuild### ========================================================================= ### Generic functions for squeezing the ranges out of a range-based object ### ------------------------------------------------------------------------- ### Extract the ranges as an IRanges object. setGeneric("ranges", signature="x", function(x, use.names=TRUE, use.mcols=FALSE, ...) standardGeneric("ranges") ) ### Extract the ranges as an IRangesList object. setGeneric("rglist", signature="x", function(x, use.names=TRUE, use.mcols=FALSE, ...) standardGeneric("rglist") ) ### Pairs method. setMethod("rglist", "Pairs", function(x, use.names=TRUE, use.mcols=FALSE) { stopifnot(isTRUEorFALSE(use.mcols)) rl <- zipup(ranges(first(x)), ranges(second(x))) if (!use.mcols) { mcols(rl) <- NULL } rl }) IRanges/R/read.Mask.R0000644000175100017510000003372014626176651015276 0ustar00biocbuildbiocbuild### ========================================================================= ### Read a mask from a file ### ----------------------- ### ### From an NCBI "agp" file (for chrY in hs b36v3): ### library(BSgenome.Hsapiens.NCBI.b36v3) ### file1 <- system.file("extdata", "hs_b36v3_chrY.agp", package="IRanges") ### mask1 <- read.agpMask(file1, seqname="chrY", mask.width=length(Hsapiens$chrY)) ### ### From an UCSC "gap" file (for chrY in hg18): ### library(BSgenome.Hsapiens.UCSC.hg18) ### file2 <- system.file("extdata", "chrY_gap.txt", package="IRanges") ### mask2 <- read.gapMask(file2, seqname="chrY", mask.width=length(Hsapiens$chrY)) ### ### From an UCSC "lift" file (for hg18): ### file3 <- system.file("extdata", "hg18liftAll.lft", package="IRanges") ### mask3 <- read.liftMask(file3, seqname="chr1") ### ### From a RepeatMasker .out file (for chrM in ce2): ### library(BSgenome.Celegans.UCSC.ce2) ### file4 <- system.file("extdata", "ce2chrM.fa.out", package="IRanges") ### mask4 <- read.rmMask(file4, seqname="chrM", mask.width=length(Celegans$chrM)) ### ### From a Tandem Repeats Finder .bed file (for chrM in ce2): ### file5 <- system.file("extdata", "ce2chrM.bed", package="IRanges") ### mask5 <- read.trfMask(file5, seqname="chrM", mask.width=length(Celegans$chrM)) ### ### ------------------------------------------------------------------------- .showDistinctSeqnamesAndStop <- function(seqnames) { distinct_seqnames <- paste("\"", unique(seqnames), "\"", sep="") distinct_seqnames <- paste(distinct_seqnames, collapse=", ") stop(length(distinct_seqnames), " distinct seqnames found in this file: ", distinct_seqnames) } .newEmptyMask <- function(seqname, mask.width, mask.name, mask.desc, nofound_what="information") { msg <- paste("No ", nofound_what, " found for sequence \"", seqname, "\" in this file. ", sep="") if (is.na(mask.width)) stop(msg, "Please use the\n", " 'mask.width' argument to specify the width of the empty mask to\n", " return (i.e. the length of the sequence this mask will be put on).") warning(msg, "returning empty mask") ans <- Mask(mask.width) # empty mask names(ans) <- mask.name desc(ans) <- paste(mask.desc, "(empty)") ans } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### read.agpMask() and read.gapMask() extract the AGAPS mask from an NCBI ### "agp" file or a UCSC "gap" file, respectively. ### .guessGapFileCOL2CLASS <- function(file) { ## UCSC "gap" files generally have the 9 columns below except for some ## organisms like Rhesus that have only 8 columns (no 'bin' column). COL2CLASS <- c( `bin`="integer", `chrom`="character", `chr_start`="integer", `chr_stop`="integer", `part_no`="integer", `part_type`="character", `gap_len`="integer", `gap_type`="character", `bridge`="character" ) line1 <- try(read.table(file, sep="\t", col.names=names(COL2CLASS), colClasses=COL2CLASS, nrows=1L, check.names=FALSE), silent=TRUE) if (!inherits(line1, "try-error")) return(COL2CLASS) COL2CLASS <- COL2CLASS[-1L] line1 <- try(read.table(file, sep="\t", col.names=names(COL2CLASS), colClasses=COL2CLASS, nrows=1L, check.names=FALSE), silent=TRUE) if (!inherits(line1, "try-error")) return(COL2CLASS) stop("unable to guess the column names in \"gap\" file '", file, "', sorry") } .read.agpORgapFile <- function(agp_or_gap, file) { if (agp_or_gap == "agp") { COL2CLASS <- c( `chrom`="character", `chr_start`="integer", `chr_stop`="integer", `part_no`="integer", `part_type`="character", `gap_len`="character", `gap_type`="character", `linkage`="character", `empty`="character" ) } else if (agp_or_gap == "gap") { COL2CLASS <- .guessGapFileCOL2CLASS(file) } else { stop("read.Mask internal error: please report") } COLS <- c( "chrom", "chr_start", "chr_stop", "part_type", "gap_len", "gap_type" ) COL2CLASS[!(names(COL2CLASS) %in% COLS)] <- "NULL" data <- read.table(file, sep="\t", col.names=names(COL2CLASS), colClasses=COL2CLASS, check.names=FALSE, fill=TRUE) } .read.agpORgapMask <- function(agp_or_gap, file, seqname, mask.width, gap.types, use.gap.types) { if (!isSingleString(seqname)) stop("'seqname' must be a single string") if (!isSingleNumberOrNA(mask.width)) stop("'mask.width' must be a single integer or 'NA'") if (!is.integer(mask.width)) mask.width <- as.integer(mask.width) if (!is.null(gap.types) && (!is.character(gap.types) || S4Vectors:::anyMissing(gap.types) || anyDuplicated(gap.types))) stop("'gap.types' must be 'NULL' or a character vector ", "with no NAs and no duplicated") if (!isTRUEorFALSE(use.gap.types)) stop("'use.gap.types' must be TRUE or FALSE") data <- .read.agpORgapFile(agp_or_gap, file) if (seqname == "?") .showDistinctSeqnamesAndStop(data$chrom) data <- data[data$chrom == seqname, ] ii <- data$part_type == "N" if (agp_or_gap == "agp") { data <- data[ii, ] } else if (!all(ii)) { warning("gap file contains gaps with a part_type that is not N") } if (length(gap.types) == 1 && gap.types == "?") { found_types <- paste("\"", unique(data$gap_type), "\"", sep="") found_types <- paste(found_types, collapse=", ") stop("gap types found in this file for sequence \"", seqname, "\": ", found_types) } mask.name <- "AGAPS" mask.desc <- "assembly gaps" if (!is.null(gap.types)) { data <- data[data$gap_type %in% gap.types, ] mask.desc <- paste(mask.desc, " [type=", paste(gap.types, collapse="|"), "]", sep="") } if (nrow(data) == 0) return(.newEmptyMask(seqname, mask.width, mask.name, mask.desc, mask.desc)) if (agp_or_gap == "agp") ranges_start <- data$chr_start else ranges_start <- data$chr_start + 1L ranges <- IRanges(start=ranges_start, width=as.integer(data$gap_len)) ## Sanity check if (!identical(end(ranges), data$chr_stop)) stop("broken \"", agp_or_gap, "\" file: contains inconsistent ", "chr_start/chr_stop/gap_len values ", "for assembly gaps in sequence \"", seqname, "\"") if (use.gap.types) { names(ranges) <- data$gap_type if (S4Vectors:::isNotStrictlySorted(start(ranges))) ranges <- ranges[base::order(start(ranges))] if (!isNormal(ranges)) stop("cannot use the gap types when some gaps are adjacent or overlap") nir1 <- asNormalIRanges(ranges, force=FALSE) } else { nir1 <- asNormalIRanges(ranges, force=TRUE) } ## Don't use new2(): the validity of the new mask needs to be checked! new2("MaskCollection", nir_list=list(nir1), width=mask.width, active=TRUE, NAMES=mask.name, desc=mask.desc, check=FALSE) } read.agpMask <- function(file, seqname="?", mask.width=NA, gap.types=NULL, use.gap.types=FALSE) .read.agpORgapMask("agp", file, seqname, mask.width, gap.types, use.gap.types) read.gapMask <- function(file, seqname="?", mask.width=NA, gap.types=NULL, use.gap.types=FALSE) .read.agpORgapMask("gap", file, seqname, mask.width, gap.types, use.gap.types) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### read.liftMask() extracts the AGAPS mask from a UCSC "lift" file. ### .read.liftFile <- function(file) { COL2CLASS <- c( `offset`="integer", `xxxx`="NULL", # not sure how to call this `width`="integer", `seqname`="character", `seqlen`="integer" ) read.table(file, col.names=names(COL2CLASS), colClasses=COL2CLASS, check.names=FALSE) } read.liftMask <- function(file, seqname="?", mask.width=NA) { if (!isSingleString(seqname)) stop("'seqname' must be a single string") if (!isSingleNumberOrNA(mask.width)) stop("'mask.width' must be a single integer or 'NA'") if (!is.integer(mask.width)) mask.width <- as.integer(mask.width) data <- .read.liftFile(file) if (seqname == "?") .showDistinctSeqnamesAndStop(data$seqname) data <- data[data$seqname == seqname, ] if (nrow(data) == 0) return(.newEmptyMask(seqname, mask.width, "AGAPS", "assembly gaps")) ## Sanity checks seqlen0 <- unique(data$seqlen) if (length(seqlen0) != 1) stop("broken \"lift\" file: contains different lengths ", "for sequence \"", seqname, "\"") if (!is.na(mask.width) && mask.width != seqlen0) stop("when supplied, 'mask.width' must match the length found ", "in the file for sequence \"", seqname, "\"") contigs0 <- IRanges(start=data$offset+1, width=data$width) contigs1 <- asNormalIRanges(contigs0, force=TRUE) if (length(contigs1) != length(contigs0)) warning("some contigs are adjacent or overlapping") contigs <- Mask(seqlen0, start=start(contigs1), width=width(contigs1)) ans <- gaps(contigs) names(ans) <- "AGAPS" desc(ans) <- "assembly gaps" ans } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### read.rmMask() extracts the RM mask from a RepeatMasker .out file. ### ### See http://www.repeatmasker.org/webrepeatmaskerhelp.html for a ### description of the RepeatMasker output format. ### .read.rmFile <- function(file) { COL2CLASS <- c( `SW_score`="integer", `perc_div`="numeric", `perc_del`="numeric", `perc_ins`="numeric", `query_sequence`="character", `begin_in_query`="integer", `end_in_query`="integer", `left_in_query`="character", `C`="character", `matching_repeat`="character", `repeat_class_or_family`="character", `begin_in_repeat`="integer", `end_in_repeat`="integer", `left_in_repeat`="character", `ID`="character" ) COLS <- c("query_sequence", "begin_in_query", "end_in_query", "ID") COL2CLASS[!(names(COL2CLASS) %in% COLS)] <- "NULL" read.table(file, col.names=names(COL2CLASS), colClasses=COL2CLASS, skip=3, check.names=FALSE) } read.rmMask <- function(file, seqname="?", mask.width=NA, use.IDs=FALSE) { if (!isSingleString(seqname)) stop("'seqname' must be a single string") if (!isSingleNumberOrNA(mask.width)) stop("'mask.width' must be a single integer or 'NA'") if (!is.integer(mask.width)) mask.width <- as.integer(mask.width) if (!isTRUEorFALSE(use.IDs)) stop("'use.IDs' must be TRUE or FALSE") data <- .read.rmFile(file) if (seqname == "?") .showDistinctSeqnamesAndStop(data$query_sequence) data <- data[data$query_sequence == seqname, ] if (nrow(data) == 0) return(.newEmptyMask(seqname, mask.width, "RM", "RepeatMasker")) ranges <- IRanges(start=data$begin_in_query, end=data$end_in_query) if (use.IDs) { names(ranges) <- data$ID if (S4Vectors:::isNotStrictlySorted(start(ranges))) ranges <- ranges[base::order(start(ranges))] if (!isNormal(ranges)) stop("cannot use the repeat IDs when some repeats are adjacent or overlap") nir1 <- asNormalIRanges(ranges, force=FALSE) } else { nir1 <- asNormalIRanges(ranges, force=TRUE) } ## Don't use new2(): the validity of the new mask needs to be checked! new2("MaskCollection", nir_list=list(nir1), width=mask.width, active=TRUE, NAMES="RM", desc="RepeatMasker", check=FALSE) } ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### read.trfMask() extracts the TRF mask from a Tandem Repeats Finder .bed ### file. ### .read.trfFile <- function(file) { COL2CLASS <- c( `chrom`="character", `chromStart`="integer", `chromEnd`="integer", `name`="character", `period`="integer", `copyNum`="numeric", `consensusSize`="integer", `perMatch`="integer", `perIndel`="integer", `score`="integer", `A`="integer", `C`="integer", `G`="integer", `T`="integer", `entropy`="numeric", `sequence`="character" ) COLS <- c("chrom", "chromStart", "chromEnd") COL2CLASS[!(names(COL2CLASS) %in% COLS)] <- "NULL" read.table(file, col.names=names(COL2CLASS), colClasses=COL2CLASS, check.names=FALSE) } read.trfMask <- function(file, seqname="?", mask.width=NA) { if (!isSingleString(seqname)) stop("'seqname' must be a single string") if (!isSingleNumberOrNA(mask.width)) stop("'mask.width' must be a single integer or 'NA'") if (!is.integer(mask.width)) mask.width <- as.integer(mask.width) data <- .read.trfFile(file) if (seqname == "?") .showDistinctSeqnamesAndStop(data$chrom) data <- data[data$chrom == seqname, ] if (nrow(data) == 0) return(.newEmptyMask(seqname, mask.width, "TRF", "Tandem Repeats Finder")) ranges <- IRanges(start=data$chromStart+1, end=data$chromEnd) nir1 <- asNormalIRanges(ranges, force=TRUE) ## Don't use new2(): the validity of the new mask needs to be checked! new2("MaskCollection", nir_list=list(nir1), width=mask.width, active=TRUE, NAMES="TRF", desc="Tandem Repeats Finder", check=FALSE) } IRanges/R/reverse-methods.R0000644000175100017510000000563314626176651016607 0ustar00biocbuildbiocbuild### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### The "reverse" generic and methods. ### setGeneric("reverse", function(x, ...) standardGeneric("reverse")) setMethod("reverse", "character", function(x, ...) { if (length(x) == 0) return(x) sapply(strsplit(x, NULL, fixed=TRUE), function(xx) paste(rev(xx), collapse="")) } ) ### This method does NOT preserve normality. .IRanges.reverse <- function(x, ...) { if (length(x) == 0L) return(x) args <- S4Vectors:::extraArgsAsList(NULL, ...) argnames <- names(args) n2p <- match(c("start", "end", "use.names"), argnames) if (is.na(n2p[1L])) { start <- min(start(x)) } else { start <- args[[n2p[1L]]] if (!is.numeric(start)) stop("'start' must be a vector of integers") if (!is.integer(start)) start <- as.integer(start) if (S4Vectors:::anyMissing(start)) stop("'start' contains NAs") } if (is.na(n2p[2L])) { end <- max(end(x)) } else { end <- args[[n2p[2L]]] if (!is.numeric(end)) stop("'end' must be a vector of integers") if (!is.integer(end)) end <- as.integer(end) if (S4Vectors:::anyMissing(end)) stop("'end' contains NAs") } if (!is.na(n2p[3L]) && !S4Vectors:::normargUseNames(args[[n2p[3L]]])) x <- set_IRanges_names(x, NULL) ## WARNING: -end(x) *must* appear first in this expression if we want ## the supplied 'start' and 'end' to be recycled properly. ## Remember that in R, because of the recycling, addition of numeric ## vectors of different lengths is not associative i.e. in general ## '(x + y) + z' is not the same as 'x + (y + z)'. For example: ## (integer(6) + 1:2) + 1:3 and integer(6) + (1:2 + 1:3) ## are not the same. x@start[] <- -end(x) + start + end x } setMethod("reverse", "IRanges", .IRanges.reverse) setMethod("reverse", "NormalIRanges", function(x, ...) { ## callNextMethod() temporarily breaks 'x' as a NormalIRanges object ## because the returned ranges are ordered from right to left. x <- callNextMethod() BiocGenerics:::replaceSlots(x, start=rev(start(x)), width=rev(width(x)), NAMES=rev(names(x)), mcols=S4Vectors:::revROWS(mcols(x, use.names=FALSE))) } ) setMethod("reverse", "Views", function(x, ...) { x@subject <- rev(subject(x)) x@ranges <- reverse(ranges(x), start=1L, end=length(subject(x))) x } ) setMethod("reverse", "MaskCollection", function(x, ...) { start <- 1L end <- width(x) x@nir_list <- lapply(nir_list(x), function(nir) reverse(nir, start=start, end=end) ) x } ) IRanges/R/seqapply.R0000644000175100017510000000250014626176651015317 0ustar00biocbuildbiocbuild### ========================================================================= ### The stuff in this file should go somewhere else, probably close to ### splitAsList() (currently defined in S4Vectors/R/split-methods.R) ### ------------------------------------------------------------------------- ### ## NOT exported. `splitAsList<-` <- function(x, f, drop = FALSE, ..., value) { if (!isTRUEorFALSE(drop)) stop("'drop' must be TRUE or FALSE") if (NROW(x) != length(f)) stop("Length of 'f' must equal the length of 'x'") ind <- splitAsList(seq_len(NROW(x)), f, drop = drop) if (length(ind) != length(value)) stop("Length of 'value' must equal the length of a split on 'f'") replaceROWS(x, unlist(ind, use.names=FALSE), unlist(value, use.names = FALSE)) } setMethod("unsplit", "List", function(value, f, drop = FALSE) { value_flat <- unlist(value, use.names = FALSE) if (NROW(value_flat) != length(f)) stop("Length of 'unlist(value)' must equal length of 'f'") splitAsList(value_flat, f, drop = drop) <- value if (!is.null(ROWNAMES(value_flat))) { nms <- relist(ROWNAMES(value_flat), value) splitAsList(ROWNAMES(value_flat), f, drop = drop) <- nms } value_flat }) setReplaceMethod("split", "Vector", function(x, f, drop = FALSE, ..., value) { splitAsList(x, f, drop = drop, ...) <- value x }) IRanges/R/setops-methods.R0000644000175100017510000002571714626176651016456 0ustar00biocbuildbiocbuild### ========================================================================= ### Set operations ### ------------------------------------------------------------------------- ### ### 1) Vector-wise set operations: union, intersect, setdiff ### ### When the input are IntegerRanges objects, the functions in that group ### interpret each supplied object ('x' or 'y') as a set of integer values. ### Therefore, if 2 IRanges objects 'x1' and 'x2' represent the same set ### of integers, then each of these functions will return the same result ### when 'x1' is replaced with 'x2' in the input. The returned IRanges ### object is guaranteed to be normal but is *not* promoted to ### NormalIRanges. ### ### 2) Element-wise (aka "parallel") set operations: punion, pintersect, ### psetdiff, pgap ### ### The functions in that group take 2 *objects* of the same length and ### return an object of the same class and length as the first argument. ### ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### union() ### ### Always return an IRanges *instance* whatever IntegerRanges derivatives ### are passed to it (e.g. IPos, NCList or NormalIRanges), so does NOT act ### like an endomorphism in general. setMethod("union", c("IntegerRanges", "IntegerRanges"), function(x, y) { ## We downgrade 'x' to an IRanges instance so 'c(x, y)' is guaranteed ## to work (even e.g. if 'x' is a NormalIRanges object). x <- as(x, "IRanges", strict=TRUE) reduce(c(x, y), drop.empty.ranges=TRUE) } ) setMethod("union", c("IntegerRangesList", "IntegerRangesList"), function(x, y) mendoapply(union, x, y)) setMethod("union", c("CompressedIRangesList", "CompressedIRangesList"), function(x, y) { len <- max(length(x), length(y)) if (length(x) != len) x <- x[S4Vectors:::recycleVector(seq_len(length(x)), len)] if (length(y) != len) y <- y[S4Vectors:::recycleVector(seq_len(length(y)), len)] xy <- c(unlist(x, use.names = FALSE), unlist(y, use.names = FALSE)) xy_list <- split(xy, factor(c(togroup(PartitioningByWidth(x)), togroup(PartitioningByWidth(y))), seq_len(length(x)))) names(xy_list) <- names(x) reduce(xy_list, drop.empty.ranges=TRUE) }) setMethod("union", c("Pairs", "missing"), function(x, y, ...) { callGeneric(first(x), second(x), ...) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### intersect() ### ### Always return an IRanges *instance* whatever IntegerRanges derivatives ### are passed to it (e.g. IPos, NCList or NormalIRanges), so does NOT act ### like an endomorphism in general. setMethod("intersect", c("IntegerRanges", "IntegerRanges"), function(x, y) { if (length(x) == 0L) return(x) start <- min(c(start(x), start(y))) end <- max(c(end(x), end(y))) setdiff(x, gaps(y, start=start, end=end)) } ) setMethod("intersect", c("IntegerRangesList", "IntegerRangesList"), function(x, y) mendoapply(intersect, x, y)) setMethod("intersect", c("CompressedIRangesList", "CompressedIRangesList"), function(x, y) { nonempty <- elementNROWS(x) != 0L rx <- unlist(range(x), use.names = FALSE) startx <- integer() startx[nonempty] <- start(rx) endx <- integer() endx[nonempty] <- end(rx) setdiff(x, gaps(y, start = startx, end = endx)) }) setMethod("intersect", c("Pairs", "missing"), function(x, y, ...) { callGeneric(first(x), second(x), ...) }) setMethod("intersect", c("CompressedAtomicList", "CompressedAtomicList"), function(x, y) { fx <- if (!is(x, "IntegerList")) as(x, "FactorList") else x fy <- if (!is(y, "IntegerList")) as(y, "FactorList") else y m <- S4Vectors:::matchIntegerPairs(togroup(PartitioningByEnd(x)), unlist(fx, use.names=FALSE), togroup(PartitioningByEnd(y)), unlist(fy, use.names=FALSE), nomatch=0L) m[duplicated(m)] <- 0L x[relist(m > 0L, x)] }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### setdiff() ### ### Always return an IRanges *instance* whatever IntegerRanges derivatives ### are passed to it (e.g. IPos, NCList or NormalIRanges), so does NOT act ### like an endomorphism in general. setMethod("setdiff", c("IntegerRanges", "IntegerRanges"), function(x, y) { if (length(x) == 0L) return(x) start <- min(c(start(x), start(y))) end <- max(c(end(x), end(y))) gaps(union(gaps(x, start=start, end=end), y), start=start, end=end) } ) setMethod("setdiff", c("IntegerRangesList", "IntegerRangesList"), function(x, y) mendoapply(setdiff, x, y)) setMethod("setdiff", c("CompressedIRangesList", "CompressedIRangesList"), function(x, y) { nonempty <- elementNROWS(x) != 0L rx <- unlist(range(x), use.names = FALSE) startx <- rep(NA_integer_, length(x)) startx[nonempty] <- start(rx) endx <- rep(NA_integer_, length(x)) endx[nonempty] <- end(rx) gaps(union(gaps(x), y), start = startx, end = endx) }) setMethod("setdiff", c("Pairs", "missing"), function(x, y, ...) { callGeneric(first(x), second(x), ...) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### punion() ### setGeneric("punion", signature=c("x", "y"), function(x, y, ...) standardGeneric("punion") ) setMethod("punion", c("IntegerRanges", "IntegerRanges"), function(x, y, fill.gap=FALSE) { if (length(x) != length(y)) stop("'x' and 'y' must have the same length") if (!isTRUEorFALSE(fill.gap)) stop("'fill.gap' must be TRUE or FALSE") if (!fill.gap) { gap <- pmax.int(start(x), start(y)) - pmin.int(end(x), end(y)) - 1L if (any(gap > 0L)) stop("some pair of ranges have a gap within ", "the 2 members of the pair.\n", " Use 'fill.gap=TRUE' to enforce their ", "union by filling the gap.") } ans_start <- pmin.int(start(x), start(y)) ans_end <- pmax.int(end(x), end(y)) ans_names <- names(x) if (is.null(ans_names)) ans_names <- names(y) IRanges(start=ans_start, end=ans_end, names=ans_names) } ) setMethod("punion", c("Pairs", "missing"), function(x, y, ...) { callGeneric(first(x), second(x), ...) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### pintersect() ### setGeneric("pintersect", signature=c("x", "y"), function(x, y, ...) standardGeneric("pintersect") ) setMethod("pintersect", c("IntegerRanges", "IntegerRanges"), function(x, y, resolve.empty=c("none", "max.start", "start.x")) { if (length(x) != length(y)) stop("'x' and 'y' must have the same length") ans_start <- pmax.int(start(x), start(y)) ans_end <- pmin.int(end(x), end(y)) ans_width <- ans_end - ans_start + 1L keep_empty_x <- width(x) == 0L if (any(keep_empty_x)) { keep_empty_x <- keep_empty_x & ((start(x) >= start(y) & start(x) <= end(y)) | (start(x) == start(y) & width(y) == 0L)) } if (any(keep_empty_x)) { ans_start[keep_empty_x] <- start(x)[keep_empty_x] ans_width[keep_empty_x] <- 0L } keep_empty_y <- width(y) == 0L if (any(keep_empty_y)) { keep_empty_y <- keep_empty_y & start(y) >= start(x) & start(y) <= end(x) } if (any(keep_empty_y)) { ans_start[keep_empty_y] <- start(y)[keep_empty_y] ans_width[keep_empty_y] <- 0L } check_empty <- ans_width < 0L check_empty[keep_empty_x | keep_empty_y] <- FALSE if (any(check_empty)) { resolve.empty <- match.arg(resolve.empty) if (resolve.empty == "none") { stop("some intersections produce ambiguous empty ranges.\n", " Use argument 'resolve.empty' to resolve them.") } else { ans_width[check_empty] <- 0L if (resolve.empty == "start.x") ans_start[check_empty] <- start(x)[check_empty] } } ans_names <- names(x) if (is.null(ans_names)) ans_names <- names(y) IRanges(start=ans_start, width=ans_width, names=ans_names) } ) setMethod("pintersect", c("Pairs", "missing"), function(x, y, ...) { callGeneric(first(x), second(x), ...) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### psetdiff() ### setGeneric("psetdiff", signature=c("x", "y"), function(x, y, ...) standardGeneric("psetdiff") ) setMethod("psetdiff", c("IntegerRanges", "IntegerRanges"), function(x, y) { if (length(x) != length(y)) stop("'x' and 'y' must have the same length") ans_start <- start(x) ans_end <- end(x) if (any((start(y) > ans_start) & (end(y) < ans_end))) stop("some ranges in 'y' have their end points strictly inside\n", " the range in 'x' that they need to be subtracted from.\n", " Cannot subtract them.") start2 <- pmax.int(ans_start, start(y)) end2 <- pmin.int(ans_end, end(y)) ii <- start2 <= end2 jj <- end2 == ans_end kk <- ii & jj ans_end[kk] <- start2[kk] - 1L kk <- ii & (!jj) ans_start[kk] <- end2[kk] + 1L ans_names <- names(x) if (is.null(ans_names)) ans_names <- names(y) IRanges(start=ans_start, end=ans_end, names=ans_names) } ) setMethod("psetdiff", c("Pairs", "missing"), function(x, y, ...) { callGeneric(first(x), second(x), ...) }) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### pgap() ### setGeneric("pgap", signature=c("x", "y"), function(x, y, ...) standardGeneric("pgap") ) setMethod("pgap", c("IntegerRanges", "IntegerRanges"), function(x, y) { if (length(x) != length(y)) stop("'x' and 'y' must have the same length") ans_end_plus1 <- pmax.int(start(x), start(y)) ans_start <- pmin.int(end(x), end(y)) + 1L ans_width <- ans_end_plus1 - ans_start ans_width[ans_width < 0L] <- 0L ans_names <- names(x) if (is.null(ans_names)) ans_names <- names(y) IRanges(start=ans_start, width=ans_width, names=ans_names) } ) IRanges/R/slice-methods.R0000644000175100017510000000662214626176651016232 0ustar00biocbuildbiocbuild### ========================================================================= ### Slice the bread ### ------------------------------------------------------------------------- setGeneric("slice", signature="x", function(x, lower=-Inf, upper=Inf, ...) standardGeneric("slice")) setMethod("slice", "Rle", function(x, lower = -Inf, upper = Inf, includeLower = TRUE, includeUpper = TRUE, rangesOnly = FALSE) { if (!isSingleNumber(lower)) { stop("'lower' must be a single number") } if (!isSingleNumber(upper)) { stop("'upper' must be a single number") } if (!isTRUEorFALSE(includeLower)) { stop("'includeLower' must be TRUE or FALSE") } if (!isTRUEorFALSE(includeUpper)) { stop("'includeUpper' must be TRUE or FALSE") } if (!isTRUEorFALSE(rangesOnly)) { stop("'rangesOnly' must be TRUE or FALSE") } if (lower == -Inf) { ranges <- Rle(TRUE, length(x)) } else if (includeLower) { ranges <- (x >= lower) } else { ranges <- (x > lower) } if (upper < Inf) { if (includeUpper) { ranges <- ranges & (x <= upper) } else { ranges <- ranges & (x < upper) } } if (rangesOnly) { as(ranges, "IRanges") } else { Views(x, ranges) } }) setMethod("slice", "RleList", function(x, lower = -Inf, upper = Inf, includeLower = TRUE, includeUpper = TRUE, rangesOnly = FALSE) { if (!isSingleNumber(lower)) stop("'lower' must be a single number") if (!isSingleNumber(upper)) stop("'upper' must be a single number") if (!isTRUEorFALSE(includeLower)) stop("'includeLower' must be TRUE or FALSE") if (!isTRUEorFALSE(includeUpper)) stop("'includeUpper' must be TRUE or FALSE") if (!isTRUEorFALSE(rangesOnly)) stop("'rangesOnly' must be TRUE or FALSE") if (lower == -Inf) { ranges <- RleList(lapply(elementNROWS(x), function(len) Rle(TRUE, len)), compress=FALSE) } else if (includeLower) { ranges <- (x >= lower) } else { ranges <- (x > lower) } if (upper < Inf) { if (includeUpper) { ranges <- ranges & (x <= upper) } else { ranges <- ranges & (x < upper) } } if (rangesOnly) { as(ranges, "CompressedIRangesList") } else { RleViewsList(rleList = x, rangesList = as(ranges, "SimpleIRangesList")) } }) setMethod("slice", "ANY", function(x, lower=-Inf, upper=Inf, ...) { slice(as(x, "Rle"), lower=lower, upper=upper, ...) }) IRanges/R/subsetting-utils.R0000644000175100017510000000655214626176651017021 0ustar00biocbuildbiocbuild### ========================================================================= ### Subsetting utility functions ### ------------------------------------------------------------------------- ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### RangesNSBS objects. ### setClass("RangesNSBS", # not exported contains="NSBS", representation( subscript="IRanges" ) ) ### Construction methods. ### Supplied arguments are trusted so we don't check them! setMethod("NSBS", "IntegerRanges", function(i, x, exact=TRUE, strict.upper.bound=TRUE, allow.NAs=FALSE) { i_len <- length(i) if (i_len == 0L) { ## Return a NativeNSBS object of length 0. i <- NULL return(callGeneric()) } x_NROW <- NROW(x) if (is(i, "UnstitchedIPos")) { ## Return a NativeNSBS object. i <- i@pos return(callGeneric()) } if (is(i, "StitchedIPos")) i <- i@pos_runs # TODO: Use collapse() (or stitch()?) when # it's available i_start <- start(i) i_end <- end(i) if (min(i_start) < 1L || strict.upper.bound && max(i_end) > x_NROW) S4Vectors:::.subscript_error("subscript contains out-of-bounds ", "ranges") if (i_len > 1L) { ans <- new2("RangesNSBS", subscript=i, upper_bound=x_NROW, upper_bound_is_strict=strict.upper.bound, check=FALSE) return(ans) } if (i_end > i_start) { ans <- new2("RangeNSBS", subscript=c(i_start, i_end), upper_bound=x_NROW, upper_bound_is_strict=strict.upper.bound, check=FALSE) return(ans) } ## Return a NativeNSBS object of length <= 1. if (i_end == i_start) { i <- i_start } else { i <- NULL } callGeneric() } ) ### Other methods. setMethod("as.integer", "RangesNSBS", function(x) unlist_as_integer(x@subscript) ) setMethod("length", "RangesNSBS", function(x) sum(width(x@subscript))) setMethod("anyDuplicated", "RangesNSBS", function(x, incomparables=FALSE, ...) !isDisjoint(x@subscript) ) setMethod("isStrictlySorted", "RangesNSBS", function(x) isNormal(x@subscript)) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### "extractROWS" methods for subsetting *by* an IntegerRanges object. ### setMethod("extractROWS", c("vector_OR_factor", "RangesNSBS"), function(x, i) { start <- start(i@subscript) width <- width(i@subscript) S4Vectors:::extract_ranges_from_vector_OR_factor(x, start, width) } ) setMethod("extractROWS", c("array", "RangesNSBS"), S4Vectors:::default_extractROWS ) setMethod("extractROWS", c("data.frame", "RangesNSBS"), S4Vectors:::default_extractROWS ) setMethod("extractROWS", c("Rle", "RangesNSBS"), function(x, i) { start <- start(i@subscript) width <- width(i@subscript) ans <- S4Vectors:::extract_ranges_from_Rle(x, start, width) mcols(ans) <- extractROWS(mcols(x, use.names=FALSE), i) ans } ) IRanges/R/tile-methods.R0000644000175100017510000000604314626176651016065 0ustar00biocbuildbiocbuild### ========================================================================= ### "tile" methods ### ------------------------------------------------------------------------- ### ### TODO: We have a profileration of tools for creating these "sliding ### windows" or "tiles": successiveIRanges(), tileGenome(), tile(), and now ### slidingWindows(). With no visible coherent naming scheme. Introducing ### a new verb each time we get frustrated because the existing tools don't ### let us create tiles or windows exactly the way we'd like for the use case ### of the day is not a sustainable strategy in the long run. This just adds ### more and more confusion for the end user. ### So some effort will need to be done towards unification of all these ### tools. H.P. -- Oct 16, 2016. setGeneric("tile", function(x, n, width, ...) standardGeneric("tile"), signature="x") setMethod("tile", "IntegerRanges", function(x, n, width, ...) { if (!missing(n)) { if (!missing(width)) stop("only one of 'n' and 'width' can be specified") if (any(IRanges::width(x) < n)) stop("some width(x) are less than 'n'") if (any(n < 0L)) stop("some 'n' are negative") n <- S4Vectors:::recycleVector(n, length(x)) } if (!missing(width)) { if (!missing(n)) stop("only one of 'n' and 'width' can be specified") if (any(width < 0L)) stop("some 'width' are negative") n <- ceiling(width(x) / width) } width <- IRanges::width(x) / n ## The floor() is intentional for compatibility with Jim Kent's BigWig code ## tileGenome() uses ceiling() instead tile.end <- floor(unlist_as_integer(IRanges(rep(1L, length(n)), width=n)) * rep(width, n)) tile.end.abs <- tile.end + rep(start(x), n) - 1L tile.width <- S4Vectors:::diffWithInitialZero(as.integer(tile.end.abs)) p <- PartitioningByWidth(n, names = names(x)) tile.width[start(p)] <- tile.end[start(p)] relist(IRanges(width=tile.width, end=tile.end.abs), p) }) ### ========================================================================= ### "slidingWindows" methods ### ------------------------------------------------------------------------- ### setGeneric("slidingWindows", function(x, width, step = 1L, ...) standardGeneric("slidingWindows"), signature="x") setMethod("slidingWindows", "IntegerRanges", function(x, width, step = 1L) { if (!isSingleNumber(width)) stop("'width' must be a single, non-NA number") if (!isSingleNumber(step)) stop("'step' must be a single, non-NA number") if (any(width < 0L)) stop("some 'width' are negative") if (any(step < 0L)) stop("some 'step' are negative") n <- ceiling(pmax(width(x) - width, 0L) / step) + 1L window.starts <- unlist_as_integer(IRanges(rep(0L, length(n)), width=n)) * step + 1L windows <- restrict(IRanges(window.starts, width=width), end=rep(width(x), n)) windows.abs <- shift(windows, rep(start(x), n) - 1L) relist(windows.abs, PartitioningByWidth(n, names = names(x))) }) IRanges/R/windows-methods.R0000644000175100017510000001332614626176651016624 0ustar00biocbuildbiocbuild### ========================================================================= ### windows() ### ------------------------------------------------------------------------- ### windows() is a "parallel" version of window() for list-like objects. That ### is, it does 'mendoapply(window, x, start, end, width)' but uses a fast ### implementation. setGeneric("windows", signature="x", function(x, start=NA, end=NA, width=NA) standardGeneric("windows") ) ### NOT exported. ### Low-level utility used by various "windows" methods. make_IRanges_from_windows_args <- function(x, start=NA, end=NA, width=NA) { x_eltNROWS <- elementNROWS(x) if (!is(start, "IntegerRanges")) return(solveUserSEW(x_eltNROWS, start=start, end=end, width=width)) if (!(identical(end, NA) && identical(width, NA))) stop(wmsg("'end' or 'width' should not be specified or must be ", "set to NA when 'start' is an IntegerRanges object")) if (!is(start, "IRanges")) start <- as(start, "IRanges") ir <- S4Vectors:::V_recycle(start, x, x_what="start", skeleton_what="x") if (any(start(ir) < 1L) || any(end(ir) > x_eltNROWS)) stop(wmsg("'start' contains out-of-bounds ranges")) ir } setMethod("windows", "list_OR_List", function(x, start=NA, end=NA, width=NA) { ir <- make_IRanges_from_windows_args(x, start, end, width) if (length(x) == 0L) return(x) ## -- Slow path (loops over the list elements of 'x') -- #for (k in seq_along(x)) # x[[k]] <- extractROWS(x[[k]], ir[k]) #return(x) ## -- Fast path -- ## Unlist 'x' (preserving the inner names) and shift the ranges ## in 'ir'. if (is.list(x)) { unlisted_x <- bindROWS(x[[1L]], x[-1L]) } else { unlisted_x <- unlist(x, use.names=FALSE) } offsets <- c(0L, end(PartitioningByEnd(x))[-length(x)]) ir <- shift(ir, shift=offsets) ## Subset. unlisted_ans <- extractROWS(unlisted_x, ir) ## Relist. ans_breakpoints <- cumsum(width(ir)) ans_partitioning <- PartitioningByEnd(ans_breakpoints, names=names(x)) ans <- as(relist(unlisted_ans, ans_partitioning), class(x)) ## Propagate 'metadata(x)' and 'mcols(x)'. if (is(x, "List")) { metadata(ans) <- metadata(x) mcols(ans) <- mcols(x, use.names=FALSE) } ans } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### narrow() ### ### A recursive version of windows() i.e. on a list-like object it's ### equivalent to: ### ### mendoapply(narrow, x, start, end, width, ### MoreArgs=list(use.names=use.names)) ### setGeneric("narrow", signature="x", function(x, start=NA, end=NA, width=NA, use.names=TRUE) standardGeneric("narrow") ) ### Should operate recursively on an ordinary list or an IntegerRangesList, ### GenomicRangesList, DNAStrinSetList, or GAlignmentsList derivative. ### But not on an IRanges, GRanges, DNAStringSet, or GAlignments object ### where it's equivalent to windows(). setMethod("narrow", "ANY", function(x, start=NA, end=NA, width=NA, use.names=TRUE) { call_windows <- is(x, "Ranges") || !is(x, "list_OR_List") || !pcompareRecursively(x) if (call_windows) { ## We've reached a leaf in the recursion tree. ans <- windows(x, start=start, end=end, width=width) if (!S4Vectors:::normargUseNames(use.names)) names(ans) <- NULL return(ans) } if (!is(start, "List")) start <- as(start, "List") start <- S4Vectors:::VH_recycle(start, x, "start", "x") if (!is(width, "List")) width <- as(width, "List") width <- S4Vectors:::VH_recycle(width, x, "width", "x") if (!is(end, "List")) end <- as(end, "List") end <- S4Vectors:::VH_recycle(end, x, "end", "x") if (is(x, "CompressedList")) { unlisted_start <- unlist(start, use.names=FALSE) unlisted_end <- unlist(end, use.names=FALSE) unlisted_width <- unlist(width, use.names=FALSE) new_unlistData <- narrow(x@unlistData, start=unlisted_start, end=unlisted_end, width=unlisted_width, use.names=use.names) ans <- BiocGenerics:::replaceSlots(x, unlistData=new_unlistData, check=FALSE) return(ans) } mendoapply(narrow, x, start, end, width, MoreArgs=list(use.names=use.names)) } ) ### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### heads() and tails() ### ### These are just convenience wrappers around windows(). ### ### They do 'mendoapply(head, x, n)' and 'mendoapply(tail, x, n)', ### respectively, but use a fast implementation. ### .normarg_n <- function(n, x_eltNROWS) { if (!is.numeric(n)) stop("'n' must be an integer vector") if (!is.integer(n)) n <- as.integer(n) if (any(is.na(n))) stop("'n' cannot contain NAs") n <- pmin(x_eltNROWS, n) neg_idx <- which(n < 0L) if (length(neg_idx) != 0L) n[neg_idx] <- pmax(n[neg_idx] + x_eltNROWS[neg_idx], 0L) n } heads <- function(x, n=6L) { x_eltNROWS <- unname(elementNROWS(x)) n <- .normarg_n(n, x_eltNROWS) windows(x, start=1L, width=n) } tails <- function(x, n=6L) { x_eltNROWS <- unname(elementNROWS(x)) n <- .normarg_n(n, x_eltNROWS) windows(x, end=x_eltNROWS, width=n) } IRanges/R/zzz.R0000644000175100017510000000022114626176651014314 0ustar00biocbuildbiocbuild### .onUnload <- function(libpath) { library.dynam.unload("IRanges", libpath) } .test <- function() BiocGenerics:::testPackage("IRanges") IRanges/README.md0000644000175100017510000000067114626176651014423 0ustar00biocbuildbiocbuild[](https://bioconductor.org/) **IRanges** is an R/Bioconductor package that provides the foundation of integer range manipulation in Bioconductor. See https://bioconductor.org/packages/IRanges for more information including how to install the release version of the package (please refrain from installing directly from GitHub). IRanges/TODO0000644000175100017510000001221014626176651013624 0ustar00biocbuildbiocbuildImmediate TODO list ------------------- - Bug fix: Combining RangedData objects is currently broken (IRanges 1.9.20): library(IRanges) ranges <- IRanges(c(1,2,3),c(4,5,6)) rd1 <- RangedData(ranges) rd2 <- RangedData(shift(ranges, 100)) rd <- c(rd1, rd2) # Seems to work (with some warnings)... validObject(rd) # but returns an invalid object! - Herve: Make the MaskCollection class a derivative of the SimpleIRangesList class. - Herve: Use a different name for "reverse" method for IRanges and MaskCollection objects. Seems like, for IRanges objects, reverse() and reflect() are doing the same thing, so I should just keep (and eventually adapt) the latter. Also, I should add a "reflect" method for SimpleIRangesList objects that would do what the current "reverse" method for MaskCollection objects does. Once this is done, adapt R/reverse.R file in Biostrings to use reflect() instead of reverse() wherever needed. - Clean up endomorphisms. Long term TODO list ------------------- o RangesList: - parallel rbind - binary ops: "nearest", "intersect", "setdiff", "union" - 'y' omitted: become n-ary ops on items in collection - 'y' specified: performed element-wise - unary ops: "coverage" etc are vectorized o DataTable: - group generics (Math, Ops, Summary) o SplitDataFrameList: - rbind o IO: - xscan() - read data directly into XVector objects ------------------------------------- Conceptual framework (by Michael) ------------------------------------- Basic problem: We have lots of (long) data series and need a way to efficiently represent and manipulate them. A series is a vector, except that the positions of the elements are meaningful. That is, we often expect strong auto-correlation. We have an abstraction called "Vector" for representing these series. There are currently two optimized means of storing long series: 1) Externally, currently only in memory, in XVector derivatives. The main benefit here is avoiding unnecessary copying, though there is potential for vectors stored in databases and flat files on disk (but this is outside our use case). 2) Run-length encoding (Rle class). This is a classic means of compressing discrete-valued series. It is very efficient, as long as there are long runs of equal value. Rle, so far, is far ahead of XVector in terms of direct usefulness. If XVector were implemented with an environment, rather than an external pointer, adding functionality would be easier. Could carry some things over from externalVector. As the sequence of observations in a series is important, we often want to manipulate specific regions of the series. We can use the window() function to select a particular region from a Vector, and a logical Rle can represent a selection of multiple regions. A slightly more general representation, that supports overlapping regions, is the IntegerRanges class. An IntegerRanges object holds any number of start,width pairs that describe closed intervals representing the set of integers that fall within the endpoints. The primary implementation is IRanges, which stores the information as two integer vectors. Often the endpoints of the intervals are interesting independent of the underlying sequence. Many utilities are implemented for manipulating and analyzing IntegerRanges. These include: 1) overlap detection 2) nearest neighbors: precede, follow, nearest 3) set operations: (p)union, (p)intersect, (p)setdiff, gaps 4) coverage, too bio specific? rename to 'table'? 5) resolving overlap: reduce and (soon) collapse 6) transformations: flank, reflect, restrict, narrow... 7) (soon) mapping/alignment There are two ways to explicitly pair an IntegerRanges object with a Vector: 1) Masking, as in MaskedXString, where only the elements outside of the IntegerRanges are considered by an operation. 2) Views, which are essentially lists of subsequences. This relies in the fly-weight pattern for efficiency. Several fast paths, like viewSums and viewMaxs, are implemented. There is an RleViews and an XIntegerViews (is this one currently used at all?). Views are limited to subsequences derived from a single sequence. For more general lists of sequences, we have a separate framework, based on the List class. The List optionally ensures that all of its elements are derived from a specified type, and it also aims to efficiently represent a major use case of lists: splitting a vector by a factor. The indices of the elements with each factor level are stored, but there is no physical split of the vector into separate list elements. A special case that often occurs in data analysis is a list containing a set of variables in the same dataset. This problem is solved by 'data.frame' in base R, and we have an equivalent DataFrame class that can hold any type of R object, as long as it has a vector semantic. Many of the important data structures have List analogs. These include all atomic types, as well as: * SplitDataFrameList: a list of DataFrames that have the same columns (usually the result of a split) * RangesList: Essentially just a list of IntegerRanges objects, but often used for splitting IntegerRanges by their "space" (e.g. chromosome) IRanges/build/0000755000175100017510000000000014641351314014223 5ustar00biocbuildbiocbuildIRanges/build/vignette.rds0000644000175100017510000000041214641351314016557 0ustar00biocbuildbiocbuildRJ0>mcu>@a0CA)"ކa;y%drr~|!XqBnIiю|[ |@}8JqW]r{-GMKZ-WJ짤]c?]j!e;E˭2آ玧ֆOr0)vle1:nka.;tҜenTfX9-MH+fC8̬F_2NIRanges/inst/0000755000175100017510000000000014641351314014101 5ustar00biocbuildbiocbuildIRanges/inst/CITATION0000644000175100017510000000165414626176651015260 0ustar00biocbuildbiocbuildcitEntry(entry="article", title = "Software for Computing and Annotating Genomic Ranges", author = personList( as.person("Michael Lawrence" ), as.person("Wolfgang Huber" ), as.person("Herv\\'e Pag\\`es" ), as.person("Patrick Aboyoun" ), as.person("Marc Carlson" ), as.person("Robert Gentleman" ), as.person("Martin Morgan" ), as.person("Vincent Carey" )), year = 2013, journal = "{PLoS} Computational Biology", volume = "9", issue = "8", doi = "10.1371/journal.pcbi.1003118", url = "http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003118", textVersion = "Lawrence M, Huber W, Pag\\`es H, Aboyoun P, Carlson M, et al. (2013) Software for Computing and Annotating Genomic Ranges. PLoS Comput Biol 9(8): e1003118. doi:10.1371/journal.pcbi.1003118" ) IRanges/inst/doc/0000755000175100017510000000000014641351314014646 5ustar00biocbuildbiocbuildIRanges/inst/doc/IRangesOverview.R0000644000175100017510000002317214641351314020055 0ustar00biocbuildbiocbuild### R code from vignette source 'IRangesOverview.Rnw' ################################################### ### code chunk number 1: style ################################################### BiocStyle::latex(use.unsrturl=FALSE) ################################################### ### code chunk number 2: options ################################################### options(width=72) ################################################### ### code chunk number 3: install (eval = FALSE) ################################################### ## if (!require("BiocManager")) ## install.packages("BiocManager") ## BiocManager::install("IRanges") ################################################### ### code chunk number 4: initialize ################################################### library(IRanges) ################################################### ### code chunk number 5: iranges-constructor ################################################### ir1 <- IRanges(start=1:10, width=10:1) ir1 ir2 <- IRanges(start=1:10, end=11) ir3 <- IRanges(end=11, width=10:1) identical(ir1, ir2) && identical(ir1, ir3) ir <- IRanges(c(1, 8, 14, 15, 19, 34, 40), width=c(12, 6, 6, 15, 6, 2, 7)) ir ################################################### ### code chunk number 6: iranges-start ################################################### start(ir) ################################################### ### code chunk number 7: iranges-end ################################################### end(ir) ################################################### ### code chunk number 8: iranges-width ################################################### width(ir) ################################################### ### code chunk number 9: iranges-subset-numeric ################################################### ir[1:4] ################################################### ### code chunk number 10: iranges-subset-logical ################################################### ir[start(ir) <= 15] ################################################### ### code chunk number 11: plotRanges ################################################### plotRanges <- function(x, xlim=x, main=deparse(substitute(x)), col="black", sep=0.5, ...) { height <- 1 if (is(xlim, "IntegerRanges")) xlim <- c(min(start(xlim)), max(end(xlim))) bins <- disjointBins(IRanges(start(x), end(x) + 1)) plot.new() plot.window(xlim, c(0, max(bins)*(height + sep))) ybottom <- bins * (sep + height) - height rect(start(x)-0.5, ybottom, end(x)+0.5, ybottom + height, col=col, ...) title(main) axis(1) } ################################################### ### code chunk number 12: ir-plotRanges ################################################### plotRanges(ir) ################################################### ### code chunk number 13: ranges-reduce ################################################### reduce(ir) plotRanges(reduce(ir)) ################################################### ### code chunk number 14: rangeslist-contructor ################################################### rl <- IRangesList(ir, rev(ir)) ################################################### ### code chunk number 15: rangeslist-start ################################################### start(rl) ################################################### ### code chunk number 16: bracket-ranges ################################################### set.seed(0) lambda <- c(rep(0.001, 4500), seq(0.001, 10, length=500), seq(10, 0.001, length=500)) xVector <- rpois(1e7, lambda) yVector <- rpois(1e7, lambda[c(251:length(lambda), 1:250)]) xRle <- Rle(xVector) yRle <- Rle(yVector) irextract <- IRanges(start=c(4501, 4901) , width=100) xRle[irextract] ################################################### ### code chunk number 17: overlap-ranges ################################################### ol <- findOverlaps(ir, reduce(ir)) as.matrix(ol) ################################################### ### code chunk number 18: ranges-coverage ################################################### cov <- coverage(ir) plotRanges(ir) cov <- as.vector(cov) mat <- cbind(seq_along(cov)-0.5, cov) d <- diff(cov) != 0 mat <- rbind(cbind(mat[d,1]+1, mat[d,2]), mat) mat <- mat[order(mat[,1]),] lines(mat, col="red", lwd=4) axis(2) ################################################### ### code chunk number 19: ranges-shift ################################################### shift(ir, 10) ################################################### ### code chunk number 20: ranges-narrow ################################################### narrow(ir, start=1:5, width=2) ################################################### ### code chunk number 21: ranges-restrict ################################################### restrict(ir, start=2, end=3) ################################################### ### code chunk number 22: ranges-threebands ################################################### threebands(ir, start=1:5, width=2) ################################################### ### code chunk number 23: ranges-plus ################################################### ir + seq_len(length(ir)) ################################################### ### code chunk number 24: ranges-asterisk ################################################### ir * -2 # double the width ################################################### ### code chunk number 25: ranges-disjoin ################################################### disjoin(ir) plotRanges(disjoin(ir)) ################################################### ### code chunk number 26: ranges-disjointBins ################################################### disjointBins(ir) ################################################### ### code chunk number 27: ranges-reflect ################################################### reflect(ir, IRanges(start(ir), width=width(ir)*2)) ################################################### ### code chunk number 28: ranges-flank ################################################### flank(ir, width=seq_len(length(ir))) ################################################### ### code chunk number 29: ranges-gaps ################################################### gaps(ir, start=1, end=50) plotRanges(gaps(ir, start=1, end=50), c(1,50)) ################################################### ### code chunk number 30: ranges-pgap ################################################### ################################################### ### code chunk number 31: ranges-union ################################################### ################################################### ### code chunk number 32: ranges-punion ################################################### ################################################### ### code chunk number 33: ranges-intersect ################################################### ################################################### ### code chunk number 34: ranges-pintersect ################################################### ################################################### ### code chunk number 35: ranges-setdiff ################################################### ################################################### ### code chunk number 36: ranges-psetdiff ################################################### ################################################### ### code chunk number 37: Views-constructors ################################################### xViews <- Views(xRle, xRle >= 1) xViews <- slice(xRle, 1) xRleList <- RleList(xRle, 2L * rev(xRle)) xViewsList <- slice(xRleList, 1) ################################################### ### code chunk number 38: views-looping ################################################### head(viewSums(xViews)) viewSums(xViewsList) head(viewMaxs(xViews)) viewMaxs(xViewsList) ################################################### ### code chunk number 39: AtomicList-intro ################################################### showClass("RleList") ################################################### ### code chunk number 40: list-construct ################################################### args(IntegerList) cIntList1 <- IntegerList(x=xVector, y=yVector) cIntList1 sIntList2 <- IntegerList(x=xVector, y=yVector, compress=FALSE) sIntList2 ## sparse integer list xExploded <- lapply(xVector[1:5000], function(x) seq_len(x)) cIntList2 <- IntegerList(xExploded) sIntList2 <- IntegerList(xExploded, compress=FALSE) object.size(cIntList2) object.size(sIntList2) ################################################### ### code chunk number 41: list-length ################################################### length(cIntList2) Rle(lengths(cIntList2)) ################################################### ### code chunk number 42: list-lapply ################################################### system.time(sapply(xExploded, mean)) system.time(sapply(sIntList2, mean)) system.time(sapply(cIntList2, mean)) identical(sapply(xExploded, mean), sapply(sIntList2, mean)) identical(sapply(xExploded, mean), sapply(cIntList2, mean)) ################################################### ### code chunk number 43: list-groupgenerics ################################################### xRleList > 0 yRleList <- RleList(yRle, 2L * rev(yRle)) xRleList + yRleList sum(xRleList > 0 | yRleList > 0) ################################################### ### code chunk number 44: list-endoapply ################################################### safe.max <- function(x) { if(length(x)) max(x) else integer(0) } endoapply(sIntList2, safe.max) endoapply(cIntList2, safe.max) endoapply(sIntList2, safe.max)[[1]] ################################################### ### code chunk number 45: SessionInfo ################################################### sessionInfo() IRanges/inst/doc/IRangesOverview.Rnw0000644000175100017510000005176214626176651020444 0ustar00biocbuildbiocbuild%\VignetteIndexEntry{An Overview of the IRanges package} %\VignetteDepends{IRanges} %\VignetteKeywords{Ranges,IntegerRanges,IRanges,IRangesList,Views,AtomicList} %\VignettePackage{IRanges} \documentclass{article} \usepackage[authoryear,round]{natbib} <>= BiocStyle::latex(use.unsrturl=FALSE) @ \title{An Overview of the \Biocpkg{IRanges} package} \author{Patrick Aboyoun, Michael Lawrence, Herv\'e Pag\`es} \date{Edited: February 2018; Compiled: \today} \begin{document} \maketitle \tableofcontents <>= options(width=72) @ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Introduction} When analyzing sequences, we are often interested in particular consecutive subsequences. For example, the \Sexpr{letters} vector could be considered a sequence of lower-case letters, in alphabetical order. We would call the first five letters (\textit{a} to \textit{e}) a consecutive subsequence, while the subsequence containing only the vowels would not be consecutive. It is not uncommon for an analysis task to focus only on the geometry of the regions, while ignoring the underlying sequence values. A list of indices would be a simple way to select a subsequence. However, a sparser representation for consecutive subsequences would be a range, a pairing of a start position and a width, as used when extracting sequences with \Rfunction{window}. Two central classes are available in Bioconductor for representing ranges: the \Rclass{IRanges} class defined in the \Biocpkg{IRanges} package for representing ranges defined on a single space, and the \Rclass{GRanges} class defined in the \Biocpkg{GenomicRanges} package for representing ranges defined on multiple spaces. In this vignette, we will focus on \Rclass{IRanges} objects. We will rely on simple, illustrative example datasets, rather than large, real-world data, so that each data structure and algorithm can be explained in an intuitive, graphical manner. We expect that packages that apply \Biocpkg{IRanges} to a particular problem domain will provide vignettes with relevant, realistic examples. The \Biocpkg{IRanges} package is available at bioconductor.org and can be downloaded via \Rfunction{BiocManager::install}: <>= if (!require("BiocManager")) install.packages("BiocManager") BiocManager::install("IRanges") @ <>= library(IRanges) @ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{\Rclass{IRanges} objects} To construct an \Rclass{IRanges} object, we call the \Rfunction{IRanges} constructor. Ranges are normally specified by passing two out of the three parameters: start, end and width (see \Rcode{help(IRanges)} for more information). % <>= ir1 <- IRanges(start=1:10, width=10:1) ir1 ir2 <- IRanges(start=1:10, end=11) ir3 <- IRanges(end=11, width=10:1) identical(ir1, ir2) && identical(ir1, ir3) ir <- IRanges(c(1, 8, 14, 15, 19, 34, 40), width=c(12, 6, 6, 15, 6, 2, 7)) ir @ % All of the above calls construct the same \Rclass{IRanges} object, using different combinations of the \Rcode{start}, \Rcode{end} and \Rcode{width} parameters. Accessing the starts, ends and widths is supported via the \Rfunction{start}, \Rfunction{end} and \Rfunction{width} getters: <>= start(ir) @ <>= end(ir) @ <>= width(ir) @ Subsetting an \Rclass{IRanges} object is supported, by numeric and logical indices: <>= ir[1:4] @ <>= ir[start(ir) <= 15] @ In order to illustrate range operations, we'll create a function to plot ranges. <>= plotRanges <- function(x, xlim=x, main=deparse(substitute(x)), col="black", sep=0.5, ...) { height <- 1 if (is(xlim, "IntegerRanges")) xlim <- c(min(start(xlim)), max(end(xlim))) bins <- disjointBins(IRanges(start(x), end(x) + 1)) plot.new() plot.window(xlim, c(0, max(bins)*(height + sep))) ybottom <- bins * (sep + height) - height rect(start(x)-0.5, ybottom, end(x)+0.5, ybottom + height, col=col, ...) title(main) axis(1) } @ <>= plotRanges(ir) @ \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-ir-plotRanges} \caption{\label{fig-ir-plotRanges}% Plot of original ranges.} \end{center} \end{figure} \subsection{Normality} Sometimes, it is necessary to formally represent a subsequence, where no elements are repeated and order is preserved. Also, it is occasionally useful to think of an \Rclass{IRanges} object as a set of integers, where no elements are repeated and order does not matter. The \Rclass{NormalIRanges} class formally represents a set of integers. By definition an \Rclass{IRanges} object is said to be \textit{normal} when its ranges are: (a) not empty (i.e. they have a non-null width); (b) not overlapping; (c) ordered from left to right; (d) not even adjacent (i.e. there must be a non empty gap between 2 consecutive ranges). There are three main advantages of using a \textit{normal} \Rclass{IRanges} object: (1) it guarantees a subsequence encoding or set of integers, (2) it is compact in terms of the number of ranges, and (3) it uniquely identifies its information, which simplifies comparisons. The \Rfunction{reduce} function reduces any \Rclass{IRanges} object to a \Rclass{NormalIRanges} by merging redundant ranges. <>= reduce(ir) plotRanges(reduce(ir)) @ \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-ranges-reduce} \caption{\label{fig-ranges-reduce}% Plot of reduced ranges.} \end{center} \end{figure} \subsection{Lists of \Rclass{IRanges} objects} It is common to manipulate collections of \Rclass{IRanges} objects during an analysis. Thus, the \Biocpkg{IRanges} package defines some specific classes for working with multiple \Rclass{IRanges} objects. The \Rclass{IRangesList} class asserts that each element is an \Rclass{IRanges} object and provides convenience methods, such as \Rfunction{start}, \Rfunction{end} and \Rfunction{width} accessors that return \Rclass{IntegerList} objects, aligning with the \Rclass{IRangesList} object. Note that \Rclass{IntegerList} objects will be covered later in more details in the ``Lists of Atomic Vectors'' section of this document. To explicitly construct an \Rclass{IRangesList}, use the \Rfunction{IRangesList} function. <>= rl <- IRangesList(ir, rev(ir)) @ % <>= start(rl) @ \subsection{Vector Extraction} As the elements of an \Rclass{IRanges} object encode consecutive subsequences, they may be used directly in sequence extraction. Note that when a \textit{normal} \Rclass{IRanges} is given as the index, the result is a subsequence, as no elements are repeated or reordered. If the sequence is a \Rclass{Vector} subclass (i.e. not an ordinary \Rclass{vector}), the canonical \Rfunction{[} function accepts an \Rclass{IRanges} object. % <>= set.seed(0) lambda <- c(rep(0.001, 4500), seq(0.001, 10, length=500), seq(10, 0.001, length=500)) xVector <- rpois(1e7, lambda) yVector <- rpois(1e7, lambda[c(251:length(lambda), 1:250)]) xRle <- Rle(xVector) yRle <- Rle(yVector) irextract <- IRanges(start=c(4501, 4901) , width=100) xRle[irextract] @ % \subsection{Finding Overlapping Ranges} The function \Rfunction{findOverlaps} detects overlaps between two \Rclass{IRanges} objects. <>= ol <- findOverlaps(ir, reduce(ir)) as.matrix(ol) @ \subsection{Counting Overlapping Ranges} The function \Rfunction{coverage} counts the number of ranges over each position. <>= cov <- coverage(ir) plotRanges(ir) cov <- as.vector(cov) mat <- cbind(seq_along(cov)-0.5, cov) d <- diff(cov) != 0 mat <- rbind(cbind(mat[d,1]+1, mat[d,2]), mat) mat <- mat[order(mat[,1]),] lines(mat, col="red", lwd=4) axis(2) @ \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-ranges-coverage} \caption{\label{fig-ranges-coverage}% Plot of ranges with accumulated coverage.} \end{center} \end{figure} \subsection{Finding Neighboring Ranges} The \Rfunction{nearest} function finds the nearest neighbor ranges (overlapping is zero distance). The \Rfunction{precede} and \Rfunction{follow} functions find the non-overlapping nearest neighbors on a specific side. \subsection{Transforming Ranges} Utilities are available for transforming an \Rclass{IRanges} object in a variety of ways. Some transformations, like \Rfunction{reduce} introduced above, can be dramatic, while others are simple per-range adjustments of the starts, ends or widths. \subsubsection{Adjusting starts, ends and widths} Perhaps the simplest transformation is to adjust the start values by a scalar offset, as performed by the \Rfunction{shift} function. Below, we shift all ranges forward 10 positions. % <>= shift(ir, 10) @ There are several other ways to transform ranges. These include \Rfunction{narrow}, \Rfunction{resize}, \Rfunction{flank}, \Rfunction{reflect}, \Rfunction{restrict}, and \Rfunction{threebands}. For example \Rfunction{narrow} supports the adjustment of start, end and width values, which should be relative to each range. These adjustments are vectorized over the ranges. As its name suggests, the ranges can only be narrowed. % <>= narrow(ir, start=1:5, width=2) @ The \Rfunction{restrict} function ensures every range falls within a set of bounds. Ranges are contracted as necessary, and the ranges that fall completely outside of but not adjacent to the bounds are dropped, by default. % <>= restrict(ir, start=2, end=3) @ The \Rfunction{threebands} function extends \Rfunction{narrow} so that the remaining left and right regions adjacent to the narrowed region are also returned in separate \Rclass{IRanges} objects. % <>= threebands(ir, start=1:5, width=2) @ The arithmetic operators \Rfunction{+}, \Rfunction{-} and \Rfunction{*} change both the start and the end/width by symmetrically expanding or contracting each range. Adding or subtracting a numeric (integer) vector to an \Rclass{IRanges} causes each range to be expanded or contracted on each side by the corresponding value in the numeric vector. <>= ir + seq_len(length(ir)) @ % The \Rfunction{*} operator symmetrically magnifies an \Rclass{IRanges} object by a factor, where positive contracts (zooms in) and negative expands (zooms out). % <>= ir * -2 # double the width @ WARNING: The semantic of these arithmetic operators might be revisited at some point. Please restrict their use to the context of interactive visualization (where they arguably provide some convenience) but avoid to use them programmatically. \subsubsection{Making ranges disjoint} A more complex type of operation is making a set of ranges disjoint, \textit{i.e.} non-overlapping. For example, \Rfunction{threebands} returns a disjoint set of three ranges for each input range. The \Rfunction{disjoin} function makes an \Rclass{IRanges} object disjoint by fragmenting it into the widest ranges where the set of overlapping ranges is the same. <>= disjoin(ir) plotRanges(disjoin(ir)) @ \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-ranges-disjoin} \caption{\label{fig-ranges-disjoin}% Plot of disjoined ranges.} \end{center} \end{figure} A variant of \Rfunction{disjoin} is \Rfunction{disjointBins}, which divides the ranges into bins, such that the ranges in each bin are disjoint. The return value is an integer vector of the bins. <>= disjointBins(ir) @ \subsubsection{Other transformations} Other transformations include \Rfunction{reflect} and \Rfunction{flank}. The former ``flips'' each range within a set of common reference bounds. <>= reflect(ir, IRanges(start(ir), width=width(ir)*2)) @ % The \Rfunction{flank} returns ranges of a specified width that flank, to the left (default) or right, each input range. One use case of this is forming promoter regions for a set of genes. <>= flank(ir, width=seq_len(length(ir))) @ % \subsection{Set Operations} Sometimes, it is useful to consider an \Rclass{IRanges} object as a set of integers, although there is always an implicit ordering. This is formalized by \Rclass{NormalIRanges}, above, and we now present versions of the traditional mathematical set operations \textit{complement}, \textit{union}, \textit{intersect}, and \textit{difference} for \Rclass{IRanges} objects. There are two variants for each operation. The first treats each \Rclass{IRanges} object as a set and returns a \textit{normal} value, while the other has a ``parallel'' semantic like \Rfunction{pmin}/\Rfunction{pmax} and performs the operation for each range pairing separately. The \textit{complement} operation is implemented by the \Rfunction{gaps} and \Rfunction{pgap} functions. By default, \Rfunction{gaps} will return the ranges that fall between the ranges in the (normalized) input. It is possible to specify a set of bounds, so that flanking ranges are included. <>= gaps(ir, start=1, end=50) plotRanges(gaps(ir, start=1, end=50), c(1,50)) @ \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-ranges-gaps} \caption{\label{fig-ranges-gap}% Plot of gaps from ranges.} \end{center} \end{figure} \Rfunction{pgap} considers each parallel pairing between two \Rclass{IRanges} objects and finds the range, if any, between them. Note that the function name is singular, suggesting that only one range is returned per range in the input. <>= @ The remaining operations, \textit{union}, \textit{intersect} and \textit{difference} are implemented by the \Rfunction{[p]union}, \Rfunction{[p]intersect} and \Rfunction{[p]setdiff} functions, respectively. These are relatively self-explanatory. <>= @ <>= @ <>= @ <>= @ <>= @ <>= @ % \subsection{Mapping Ranges Between Vectors} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Vector Views} The \Biocpkg{IRanges} package provides the virtual \Rclass{Views} class, which stores a vector-like object, referred to as the ``subject'', together with an \Rclass{IRanges} object defining ranges on the subject. Each range is said to represent a \textit{view} onto the subject. Here, we will demonstrate the \Rclass{RleViews} class, where the subject is of class \Rclass{Rle}. Other \Rclass{Views} implementations exist, such as \Rclass{XStringViews} in the \Biocpkg{Biostrings} package. \subsection{Creating Views} There are two basic constructors for creating views: the \Rfunction{Views} function based on indicators and the \Rfunction{slice} based on numeric boundaries. <>= xViews <- Views(xRle, xRle >= 1) xViews <- slice(xRle, 1) xRleList <- RleList(xRle, 2L * rev(xRle)) xViewsList <- slice(xRleList, 1) @ Note that \Rclass{RleList} objects will be covered later in more details in the ``Lists of Atomic Vectors'' section of this document. \subsection{Aggregating Views} While \Rfunction{sapply} can be used to loop over each window, the native functions \Rfunction{viewMaxs}, \Rfunction{viewMins}, \Rfunction{viewSums}, and \Rfunction{viewMeans} provide fast looping to calculate their respective statistical summaries. <>= head(viewSums(xViews)) viewSums(xViewsList) head(viewMaxs(xViews)) viewMaxs(xViewsList) @ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Lists of Atomic Vectors} In addition to the range-based objects described in the previous sections, the \Biocpkg{IRanges} package provides containers for storing lists of atomic vectors such as \Rclass{integer} or \Rclass{Rle} objects. The \Rclass{IntegerList} and \Rclass{RleList} classes represent lists of \Rclass{integer} vectors and \Rclass{Rle} objects respectively. They are subclasses of the \Rclass{AtomicList} virtual class which is itself a subclass of the \Rclass{List} virtual class defined in the \Biocpkg{S4Vectors} package. <>= showClass("RleList") @ As the class definition above shows, the \Rclass{RleList} class is virtual with subclasses \Rclass{SimpleRleList} and \Rclass{CompressedRleList}. A \Rclass{SimpleRleList} class uses an ordinary \R{} list to store the underlying elements and the \Rclass{CompressedRleList} class stores the elements in an unlisted form and keeps track of where the element breaks are. The former ``simple list" class is useful when the \Rclass{Rle} elements are long and the latter ``compressed list" class is useful when the list is long and/or sparse (i.e. a number of the list elements have length 0). In fact, all of the atomic vector types (logical, integer, numeric, complex, character, raw, and factor) have similar list classes that derive from the \Rclass{List} virtual class. For example, there is an \Rclass{IntegerList} virtual class with subclasses \Rclass{SimpleIntegerList} and \Rclass{CompressedIntegerList}. Each of the list classes for atomic sequences, be they stored as vectors or \Rclass{Rle} objects, have a constructor function with a name of the appropriate list virtual class, such as \Rclass{IntegerList}, and an optional argument \Rcode{compress} that takes an argument to specify whether or not to create the simple list object type or the compressed list object type. The default is to create the compressed list object type. <>= args(IntegerList) cIntList1 <- IntegerList(x=xVector, y=yVector) cIntList1 sIntList2 <- IntegerList(x=xVector, y=yVector, compress=FALSE) sIntList2 ## sparse integer list xExploded <- lapply(xVector[1:5000], function(x) seq_len(x)) cIntList2 <- IntegerList(xExploded) sIntList2 <- IntegerList(xExploded, compress=FALSE) object.size(cIntList2) object.size(sIntList2) @ The \Rfunction{length} function returns the number of elements in a \Rclass{Vector}-derived object and, for a \Rclass{List}-derived object like ``simple list" or ``compressed list", the \Rfunction{lengths} function returns an integer vector containing the lengths of each of the elements: <>= length(cIntList2) Rle(lengths(cIntList2)) @ Just as with ordinary \R{} \Rclass{list} objects, \Rclass{List}-derived object support the \Rfunction{[[} for element extraction, \Rfunction{c} for concatenating, and \Rfunction{lapply}/\Rfunction{sapply} for looping. When looping over sparse lists, the ``compressed list" classes can be much faster during computations since only the non-empty elements are looped over during the \Rfunction{lapply}/\Rfunction{sapply} computation and all the empty elements are assigned the appropriate value based on their status. <>= system.time(sapply(xExploded, mean)) system.time(sapply(sIntList2, mean)) system.time(sapply(cIntList2, mean)) identical(sapply(xExploded, mean), sapply(sIntList2, mean)) identical(sapply(xExploded, mean), sapply(cIntList2, mean)) @ Unlike ordinary \R{} \Rclass{list} objects, \Rclass{AtomicList} objects support the \Rfunction{Ops} (e.g. \Rfunction{+}, \Rfunction{==}, \Rfunction{\&}), \Rfunction{Math} (e.g. \Rfunction{log}, \Rfunction{sqrt}), \Rfunction{Math2} (e.g. \Rfunction{round}, \Rfunction{signif}), \Rfunction{Summary} (e.g. \Rfunction{min}, \Rfunction{max}, \Rfunction{sum}), and \Rfunction{Complex} (e.g. \Rfunction{Re}, \Rfunction{Im}) group generics. <>= xRleList > 0 yRleList <- RleList(yRle, 2L * rev(yRle)) xRleList + yRleList sum(xRleList > 0 | yRleList > 0) @ Since these atomic lists inherit from \Rclass{List}, they can also use the looping function \Rfunction{endoapply} to perform endomorphisms. <>= safe.max <- function(x) { if(length(x)) max(x) else integer(0) } endoapply(sIntList2, safe.max) endoapply(cIntList2, safe.max) endoapply(sIntList2, safe.max)[[1]] @ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Session Information} Here is the output of \Rcode{sessionInfo()} on the system on which this document was compiled: <>= sessionInfo() @ \end{document} IRanges/inst/doc/IRangesOverview.pdf0000644000175100017510000067200014641351314020425 0ustar00biocbuildbiocbuild%PDF-1.5 % 116 0 obj << /Length 2058 /Filter /FlateDecode >> stream x\Ks6WHUY{Nk30Ɂ`345ίO7m6e'H4h4@4Y$4撉Eѐj@N/.M8%YUru(JIf%W}zQO\{N0}vsN~zHY@+9ys2q۝^wMFLFli@;5XPB'L!3է\h´LgҢoTd: ?LMU},.nm=m2ȓ'\)'MB}vk2&O/U/)3eYm]U^q>rO%Y O&) z)U2{{[~'4К!* :c'XW~ 8ƛoj֗M}C; ɳ*s K H9HG!3.Nq1k~%a!BB kC9OCXΆ;뻧;A#Eb$nq^\=yaI,O0BX"'Φp~FgD+&#q h/gO̜M|h`39i9ӹԶ٦RZ6""]$>- +b qbh f)?-O'^99OqY?jo+X+*rCNGH"D3t5kjVu|S+"Lܓ~6c6v`֖;_ /ōWj8MҶn11|n)ʢ[Fb?8* P\ON>dbam]_K*HYHp3 bw?Z?)(¨8 Nws$$=g8Ĉ'Zpub.V8#"1ɋ" P!_A*ĥ}sWLlMO s("o E=Q#;uC7<%B4{I4h570[!:ibr| ZHבy> tVw{w,.X endstream endobj 134 0 obj << /Length 2033 /Filter /FlateDecode >> stream xn6=_XUN E恑G,CْdriR!FhO'߽,ME_G%Y&"xWEcX ſbY|[ ;Z^_NX¬2bf:305> wc.ZBDgv._p0vpz 84ReI8]r$0*R4΋k\&+׶@qn]YJm wۅ0qCP[6Ŏ.lcU&E{X*{V+RI&H~$ 2MRvgLe'Vk_3bL\ŏHO3?6in:ӭ N/NJm™*\BƮ^0agimJ=nNy /*$Rn 0pQcG  "0  JAz*;ʝw1)I)mJXʖf.(&/h뮬*]_C:3:(q5D; (A`MHk@[n@4[E]yMDrk+;! 9WӬ`.zUqZADnt4+\~s}} =~R治RCA.^ b]J-}RBJ!#)syo\] jBZ;BwX3:@g/R p!tے꽨2E6ֆVۍ҄Cdr\nh,uuwl-}s%#`1=.0 iҲ#(JG(wd &Q<5TnZy໺ںU oKB(mʺO=%Ϟg)5t_ -%4&RNX†=_gdu"teD\^;쫦k_6οF;F~ F:do!z1輂>q^2!Ѓ*!?Y.KUD[a^OL'Ǝ]Wk>HD/Ѓ` "(6 HMx#g~6>=G1]MU~Lד5Pz9 >a1(ujN'(h3p{Nʷ~S2NB2V\Wꎼ1AƇ/뮀- By/!_( nԳOfy!bp/Yx)|6 zx57StlH/x={%ֽblw407N&.b]61xt|g`e=O߿<#K9=Y'0bH`L0p, Bhf˩2_$`gKe{{?DBձЭy6%A:?*s endstream endobj 140 0 obj << /Length 1393 /Filter /FlateDecode >> stream xYKSFWĮXü[KB-@YY&̒ʟO˶pf4=|=UONG'IerzqÓyr6Ty=I 7?'秿0†`$ X:d [z SIV]9yE{ʃڨu&f` n@tr`>9" '$V06l1 D@e)$u\~{fD!#0&!"Bt"DJ~,6&!+IAs6~3r>I'zL&)D UOQ~DD"ކMmvLN2,Z|V"DIZ# i*ֲ"yZy5e+T&+\NleE~t}+R'ٜ((56 ZKR0a,%n6˛øhKLpZ㟬GW,mV77y<-NPK t|}\o2r,|j]i i@rkS(m) g2"B뮩X%l+m}# ++PY VO=/@ >\b0-͠`]>~5',u썻5amB@"cGESt>crnZ!NTMP $(:FÂӸ|NP@g'q8U'ИKE5Z~_]4P א(z-Njq%6`0 `v_\y]z\^gռ4vuj:3pdՂi%`,3B g [ v63lYwDfswQ` ;M4S|yMJ1Q1-T09^SCԐ?5v]WƢXƟIsC=`e8rմu~8|0\|Q]ΪY媚YpO oe%gIzkl f{& d&N͏р6~,GqȊhduE @hW2aC=@/@}fE;m#F+~h:/] ť_J]>9(s7A-n?p;%Y+( endstream endobj 148 0 obj << /Length 1977 /Filter /FlateDecode >> stream xXK6ϯZ0Ûݪvry8ss|HHÇBR3C{ EJǞ]WJ4Ѝ@͋:RF"nwY/VTy"NL xߕfõO`6o Y*.P'1qƱG~r.hx6k鉅Bx:Iq17"6-_7:=!Bԋd’X{y} -g'Unś{io~߸H;G,C'RTp/$.ΐKSTƖy +]a';ʃ)aq48fa"ׅ}ض^jצZi^Jtgma'Xb]õ1NN>"Zl4I[ 1Wʡ2'a},-"?/KOs߿% i@ H ⑊;N7e7<YQ@ #4 Nh%?Y{p+!TpQ't^'H`T:f(kӃ'(%r ﳍnZjwAwMv˨O~"orc9&`ܶ25u"dk1L6UV8ģa{}{ݙ2\ZM[,4thst.Z)Dp`2RqB)6il du6 c- o ~Ң˫G@iBm|FZN Y*Gb4_- HR5p\D>8T!"INOc%}VԳvKYHI%ͤ5IvF q\ǖ*e;pPت '1q@QPc$5^!3n&hNU̺}Y dOX{DP܉nrlX[|$hAXna-چb1>CЈ_N 0!(QYqw3d9L䭵Ma$y!)s``yp*6okۻ-P8Qv+=H؁T;tKdk(9 #0aԔpʡ,%F>\L)ה>VvMCݲS2@Ÿ*]y˙\HBTPŀ:[]_ =y$1 *3:}gT.o(ituEJ|C$)7UaSڝ2X''ōa-%P1|Vm% v^ՠ&j$bl |j)\أSS@ r' 7_zor np[Z@!} Y/xBF#ű_IF Y p%@] |΀"7 fWפ~>;$޲0%> /ExtGState << >>/ColorSpace << /sRGB 155 0 R >>>> /Length 341 /Filter /FlateDecode >> stream xSMK1WQ/ϼk UXov7]Æ{I˜VͱK-!kaPH"'ؽjp;r4iQ偞.>34V1Zq+h <;+LsG > stream xwTSϽ7PkhRH H.*1 J"6DTpDQ2(C"QDqpId߼y͛~kg}ֺLX Xňg` lpBF|،l *?Y"1P\8=W%Oɘ4M0J"Y2Vs,[|e92<se'9`2&ctI@o|N6(.sSdl-c(2-yH_/XZ.$&\SM07#1ؙYrfYym";8980m-m(]v^DW~ emi]P`/u}q|^R,g+\Kk)/C_|Rax8t1C^7nfzDp 柇u$/ED˦L L[B@ٹЖX!@~(* {d+} G͋љς}WL$cGD2QZ4 E@@A(q`1D `'u46ptc48.`R0) @Rt CXCP%CBH@Rf[(t CQhz#0 Zl`O828.p|O×X ?:0FBx$ !i@ڐH[EE1PL ⢖V6QP>U(j MFkt,:.FW8c1L&ӎ9ƌaX: rbl1 {{{;}#tp8_\8"Ey.,X%%Gщ1-9ҀKl.oo/O$&'=JvMޞxǥ{=Vs\x ‰N柜>ucKz=s/ol|ϝ?y ^d]ps~:;/;]7|WpQoH!ɻVsnYs}ҽ~4] =>=:`;cܱ'?e~!ańD#G&}'/?^xI֓?+\wx20;5\ӯ_etWf^Qs-mw3+?~O~ endstream endobj 161 0 obj << /Length 2330 /Filter /FlateDecode >> stream xZ[oܺ~У2Hѓ ikyqhi[VHZ;!J];Eιp8 4hgG>*$Mx]Fcx%#2*:?T%&ǍmKF.o}w}..~D M%%* :jCGʡ <:>fՕl[O"?a6Wm|i=[°|#sJRF0hGDB C Bǻk>,{[ǣI)C P$,J8'/jXE|1{D^Yb)t'ecWܮtm[򸦞{(xb0vfEJXo|̬pܶwDvfιUgl󀳓h1/7B-<] Jb~8bbGB#u3t 'b daۢ,}sh^G\ !/zg VFve(CuX\J9 ׮^oA2m XB9EQcƛtEv} c(υ~);߆vÏx0QJTv2un۠K0 Sp3q ï?Iڿ vvd2[[ec;b7n=K d~Lg0z[0R7 Q~~ x_fS}m66;w*㊝x5s•)ANK;& ew-;h9DbhgNd0uԟkNZ=i(#921/w"fJ{6.ms*d|n$,%T"-ߛB\t9/YG|$u&B:Xt$}@yaQ֠8LƴkMABС *v?f7wEpߔg>0)?!=_oϒ?KI$KF\R{3I Xw@tV.; ӝtyskɯf\3%p{F+DLb endstream endobj 143 0 obj << /Type /XObject /Subtype /Form /FormType 1 /PTEX.FileName (/tmp/RtmpwYqvEO/Rbuild2a70d72df0a837/IRanges/vignettes/IRangesOverview-ranges-reduce.pdf) /PTEX.PageNumber 1 /PTEX.InfoDict 164 0 R /BBox [0 0 432 162] /Resources << /ProcSet [ /PDF /Text ] /Font << /F2 165 0 R/F3 166 0 R>> /ExtGState << >>/ColorSpace << /sRGB 167 0 R >>>> /Length 318 /Filter /FlateDecode >> stream xSJ1+ޱ$ٽPP]`=,VhfQ ̾0&`5SlkHN@j x7W{cZr/x =ߧia4Izֆ[LH-8m FYKdVG25pB3o\ yiDtXI= c^w\7R B)Ty4@1d@iv|@o{dX tԷX,LCcAc?[Y*%:݀\`J,¬yj+l endstream endobj 169 0 obj << /Alternate /DeviceRGB /N 3 /Length 2596 /Filter /FlateDecode >> stream xwTSϽ7PkhRH H.*1 J"6DTpDQ2(C"QDqpId߼y͛~kg}ֺLX Xňg` lpBF|،l *?Y"1P\8=W%Oɘ4M0J"Y2Vs,[|e92<se'9`2&ctI@o|N6(.sSdl-c(2-yH_/XZ.$&\SM07#1ؙYrfYym";8980m-m(]v^DW~ emi]P`/u}q|^R,g+\Kk)/C_|Rax8t1C^7nfzDp 柇u$/ED˦L L[B@ٹЖX!@~(* {d+} G͋љς}WL$cGD2QZ4 E@@A(q`1D `'u46ptc48.`R0) @Rt CXCP%CBH@Rf[(t CQhz#0 Zl`O828.p|O×X ?:0FBx$ !i@ڐH[EE1PL ⢖V6QP>U(j MFkt,:.FW8c1L&ӎ9ƌaX: rbl1 {{{;}#tp8_\8"Ey.,X%%Gщ1-9ҀKl.oo/O$&'=JvMޞxǥ{=Vs\x ‰N柜>ucKz=s/ol|ϝ?y ^d]ps~:;/;]7|WpQoH!ɻVsnYs}ҽ~4] =>=:`;cܱ'?e~!ańD#G&}'/?^xI֓?+\wx20;5\ӯ_etWf^Qs-mw3+?~O~ endstream endobj 174 0 obj << /Length 1910 /Filter /FlateDecode >> stream xY[s6~}bFDLzI>OiYp&t/[g;^IJ\# ^}sr乢AJҘ2Ph̓D2"SiYDӈD߶f :_pZW!jZyb<݆`2Q B$&Xq%ә+  A$6kCJv9>F OR [gCasSb$f1&b17v$ !I$t'$h2'SI]>cb79gL* W,t_W+s,(0-D 8TEdVѱ){ZQ W^Hb=(;ʞ޾|mtLk$1궹!C'5'EVu2E\na'?{FzѬ> v<`I7Sw٥튲e@lOqh>MTGMew# @.҄(v 9ܱ<ə!yYm8MOŬ)>IMNJs=ղI"dzļ蝨{|hֺ˰ih3?%7!s?{җye~*v5`53>`U5nttk~+vA7(hsQY I7GїMDfӗO;7QWI idL5:!K"цB$ BojbqSl12H𛲶;Wn/Lbw >*caWΝ?"$2G< endstream endobj 2 0 obj << /Type /ObjStm /N 100 /First 829 /Length 2400 /Filter /FlateDecode >> stream xZmo_c徿3pNz&2RT>Cɔ2[nPgp%`iҰȤ1L 4SAbi'LGd`$Qb^Z,_r,*tEN*%;\cb{I+3PaN3CJz t&Z S*zf=SZ*N7qpYዒ9ǔMy q"S8AGc .8+#&Ő"QV2@3:Kfa,> DEGR0 p0JX&B0u'6(P&!x*܀` FqJKb9 8ΆH`8&_HB [FRCE@:x —Ai<=*4pWc TZ"/' "-l#=#6cB2yv N`z$$F."ݸ-<5S$JbFX HG!liy`43՛#lgMQc?8zV{V!y+153jn-Ԩ55Ԥ+jrj jwK4 fM@5^vZ;pV{4m{$M<]ڳf7 /iOK&Vm̪nⲋܭ+j>vԝ[7]sv=mUAYY=/;Ew]ˀ{lc6c7.'rCfy$=ۓߜ:w5/ig2}eU4{tĿvh>lwsrfzz_򕯥kb4ݔ<3auYoWY@7]uJ.˒GG',yk^vE(OԜ?:Pz`9"QǖHс}:0oӕ##S=s! :wV9xt trH틼0#iaN6`!-K'Y=ʧSH~t =0EҴkD9(c:ш47?rkC"!w~X)(mr%^hx[X{of;E޼+){5knf^K\t$-tU4g|ãA9/k4o'̪qQ^?!+O7Ndy3~rxH_c ]ߘ}(i}T\^仜|a,.ٔE=|_6K>tZ:459ϊ:]_NO)&_Y!Yqb|_yLnݬJ$5R`U'zS^UYcmB?"/jVewM{=e4'2O?_fsa7m9jfsS{=Q"XP靧a'=ix,E(Ќzh~ d?'J߼'}2T endstream endobj 170 0 obj << /Type /XObject /Subtype /Form /FormType 1 /PTEX.FileName (/tmp/RtmpwYqvEO/Rbuild2a70d72df0a837/IRanges/vignettes/IRangesOverview-ranges-coverage.pdf) /PTEX.PageNumber 1 /PTEX.InfoDict 177 0 R /BBox [0 0 432 162] /Resources << /ProcSet [ /PDF /Text ] /Font << /F2 178 0 R/F3 179 0 R>> /ExtGState << >>/ColorSpace << /sRGB 180 0 R >>>> /Length 656 /Filter /FlateDecode >> stream xMo1 >J q* Qu8L:g_;q=@p73܀hDA@cd'|?ݗgpyg,Zka~]ywIoCpWJ|4!!Ĉ>Brڨ783DC-LHպSj0K%1#9{$~XY֋3JU9?ٴo`ބ0yװ2ܕ,o<` v}!&7#ۮ7t{@9 =`)ਦ^sk0o\9˘ڵSzr6_'$suz,N!MNLЉ'NOly&F&Ti VZ,6X_t0ʺut.:z$phw6jst}`B;Gm0sمk2F[\,)؁,e;E7R]g?bہЇ S1x Zbte +jiR`W>d!s}H2PR" #ʤJlnRG?}x>2S72aZi77ufLАa]6\(dhQ?v4' endstream endobj 183 0 obj << /Alternate /DeviceRGB /N 3 /Length 2596 /Filter /FlateDecode >> stream xwTSϽ7PkhRH H.*1 J"6DTpDQ2(C"QDqpId߼y͛~kg}ֺLX Xňg` lpBF|،l *?Y"1P\8=W%Oɘ4M0J"Y2Vs,[|e92<se'9`2&ctI@o|N6(.sSdl-c(2-yH_/XZ.$&\SM07#1ؙYrfYym";8980m-m(]v^DW~ emi]P`/u}q|^R,g+\Kk)/C_|Rax8t1C^7nfzDp 柇u$/ED˦L L[B@ٹЖX!@~(* {d+} G͋љς}WL$cGD2QZ4 E@@A(q`1D `'u46ptc48.`R0) @Rt CXCP%CBH@Rf[(t CQhz#0 Zl`O828.p|O×X ?:0FBx$ !i@ڐH[EE1PL ⢖V6QP>U(j MFkt,:.FW8c1L&ӎ9ƌaX: rbl1 {{{;}#tp8_\8"Ey.,X%%Gщ1-9ҀKl.oo/O$&'=JvMޞxǥ{=Vs\x ‰N柜>ucKz=s/ol|ϝ?y ^d]ps~:;/;]7|WpQoH!ɻVsnYs}ҽ~4] =>=:`;cܱ'?e~!ańD#G&}'/?^xI֓?+\wx20;5\ӯ_etWf^Qs-mw3+?~O~ endstream endobj 187 0 obj << /Length 1872 /Filter /FlateDecode >> stream xZKs6W5c!x?2Mgҙ&\:M}s}%b".Iq}J"%˱GۜXvWMg?_z'pbT&D`5M'\m1Rl5v@DzIx6S;p6#VT+TW;9Țd1+ndH F"~̋*jt\OMdJHPڮPO{tI[ 6 Nt*v`v%fZCp)N|' V&."vUK p;@Ő2~{hP4R!V4PIk *7%Ħ*߻M~CV2lj_IW(e4 )42̋l1]93= tր¤DXD0(mH!iz|Ega=8 6 &e1YAJ 0IU{ 3"9&I\Uǰ~sYRAӃh ]ymw޹q;t,!a)g!c̲.SQ©- } u(re^~pGf/빒ĄJa1jK?~I@R斎ak}%a 7(f]Eʟr;R}qא8ȣ}$ĦkfZ=?p,U;z\E_̝CbZ4_D;5iRCCnX"4q|ЉlO _ :k]A6-7'E]b@IQCe `L&eD|4 zZ.fqΆIegYřSiQ<ЗDžti7゗= C;Vϗ vqU!Ey8%07 znƣbg^ywiÏ۷"޺yl[w iZ}87nt>y'Eͬ]M-n_# FF[k X6Xc_mÓwtF]6N N#k0d9%Şρ49zȓE1jPhzQ('ǟ0dYTmZ. stFX@n2*Ɓ|6j"6ALbIp @ٺ0J$1=X_̚t k礐aXT.:8E ls7y:5}9 *cd-2fy뱝dYs: oH 9շ`137lo8Lp-ac *Zp^u.0p,k0V>2Jz<RQ-(#ۿ4ݏy0[G5^ZKHUH<WG]/FԨvElgvwHw7ӸV:nh!8>E'}Y"An,aWY][EUy\۰?kl׿ ~G@_7aQ|c$;\M4AcJmI̾\>Vsgm{OO6* $P;0$ RN7 endstream endobj 193 0 obj << /Length 1369 /Filter /FlateDecode >> stream xXKo6Whrw)REQ` 4R4-jm)+Φ×,9IwÞ89?74O~M24f7H)INx4[DrP?v&ǻBOtuvW'a$PFb)|⽉( ෋\ZϺq.|C*A6O)ɧA046ʑu68! G$$e8曉T !3]"ut3 SEeS!GJ@/4ኤ1ADpbfRB Sȼ@W iP`<+Q8 VX$CB( e)})K{$)quN+ڕ¼\8a|Wl~D*ipJx%sw< ·J?e8y%Fڍcay8IOgѧXcK@sC.ʂO*KT؃abj} GC6~&lpECÏ,=c+䉧~vNu8(˨s>ޅ S *n5:o+s3 PA@_/!"lp}OmaL17 Iuo_'KKDEbY2V,%.KN\T\E@UD]<_&pw^Ploh믳P\v`׽y^zS)V*fC#3`yIT6wS~1n^Mn(]d}{;о%pJs{=Bvwym Cm6%}lhrlH~% }4cD\|Sn|F5ZlM[.Hzk]°lWwz]scvFGQ>xG^% ɶ`l8tHჃӍc"/9N7H`J̿~7A*?YoG^ WIv neY|[lňf ?fǯ 0oP„bJ-mS^ȔbXfkoePm@@{l2^|9> stream xY[o6~ϯУ "Y Zl):`]CfbI-1,IR<<~<fޝ89=S4.n2E1<Ӓiev1.7h̵Xv4f4_WnT]]vB E0j OlGˁ6Hv)Jײ8?#`8a@ЌeSb a*,N9! 6l!qŅ Ҝ eNG)2,+8'4]/A&zKM~8oX(nΏӪTo p7{`[E0JC." 1I rٍC5ݥ(NgFu}ق$g2w@};*3K028@wy4.(z\+iXU$qpP|P$dIcd~C&:Iվ.\wY$!vdWHD3nw9k>E:%~/AT cN3,_ `&b9w_q/ߖ ~)$!?nqᗫΏ<_e/8tF+`XϼZ.<0IaWͥLZe mR)B܀8{ʢG$ {TֹGE12~Ic@{\;D8\|ZׯںS",#D8?\XG\þ·@"az]9!U›heT=Z8!PAS*K{@^z{ A*'@$l*K,)$K@pw z7zV2 ilЖaG'Gh"k?In2K]tM[RA䣼_>`h Ն'*%Zk")M&yb;'HLC[vw‘ՕO1 ԉ;-:h 6+Dƚ761B} W p`MMO(bvgg.OoP,Ӳ/q6iE݅7S]W>SOm{8{6ݛG;޼" wK]і8°v;#ئucv8"aӺqmmd6 I%nzndPϱ$WGLbI w7=?;o&Q~*PK%u3~V3!YY&-@䓧$w,y~CS#&bI)!5pqb7e++?1 F0=cVV*ֈW>K(*M _5$P +Эܘϡ6 nYYî+LRK@T`(0CGrkβNbBA¶A:r$`TX*N7 endstream endobj 195 0 obj << /Type /XObject /Subtype /Form /FormType 1 /PTEX.FileName (/tmp/RtmpwYqvEO/Rbuild2a70d72df0a837/IRanges/vignettes/IRangesOverview-ranges-disjoin.pdf) /PTEX.PageNumber 1 /PTEX.InfoDict 202 0 R /BBox [0 0 432 162] /Resources << /ProcSet [ /PDF /Text ] /Font << /F2 203 0 R/F3 204 0 R>> /ExtGState << >>/ColorSpace << /sRGB 205 0 R >>>> /Length 369 /Filter /FlateDecode >> stream xSN0+^_[R%J#qN-HhQ8R+ٱGkTs؁(58ZҌ=+C$/haV0[ 5([Be@bҲK7Hiqdb@w JPdb݋ ' endstream endobj 207 0 obj << /Alternate /DeviceRGB /N 3 /Length 2596 /Filter /FlateDecode >> stream xwTSϽ7PkhRH H.*1 J"6DTpDQ2(C"QDqpId߼y͛~kg}ֺLX Xňg` lpBF|،l *?Y"1P\8=W%Oɘ4M0J"Y2Vs,[|e92<se'9`2&ctI@o|N6(.sSdl-c(2-yH_/XZ.$&\SM07#1ؙYrfYym";8980m-m(]v^DW~ emi]P`/u}q|^R,g+\Kk)/C_|Rax8t1C^7nfzDp 柇u$/ED˦L L[B@ٹЖX!@~(* {d+} G͋љς}WL$cGD2QZ4 E@@A(q`1D `'u46ptc48.`R0) @Rt CXCP%CBH@Rf[(t CQhz#0 Zl`O828.p|O×X ?:0FBx$ !i@ڐH[EE1PL ⢖V6QP>U(j MFkt,:.FW8c1L&ӎ9ƌaX: rbl1 {{{;}#tp8_\8"Ey.,X%%Gщ1-9ҀKl.oo/O$&'=JvMޞxǥ{=Vs\x ‰N柜>ucKz=s/ol|ϝ?y ^d]ps~:;/;]7|WpQoH!ɻVsnYs}ҽ~4] =>=:`;cܱ'?e~!ańD#G&}'/?^xI֓?+\wx20;5\ӯ_etWf^Qs-mw3+?~O~ endstream endobj 211 0 obj << /Length 2259 /Filter /FlateDecode >> stream xZK۸ϯJYXTA@N9ABڻoҹ=LM!5>Ednݢ"2gfyW-˔xAiAk*O(,I˛M4%o_O?p?peH)h+·T Kd K^-x!aDMth3nۆ{r"mfЁYv-m$K;-53φZjVlZ W4A}DZu,F_r|Kbf+⠽ 8po;,DNK-m?`YЮ $@ЩÓ'7 pk~t>"VdP#u~Îaa1-lݠc\ܠ7联{o>6%5{g&[, GyڷNVB×nU"=J|֛>!'8&S5C@@= gm:G_eEtkԪ%+LΏm{D:nzZ[8v"\2ENs297#t#|! `OkAЬOq n r@1Q %?`'x7vܠ}Žjs¤.ba*B*~,_33dW9**2'&|yl(rR3*-Huzt zu`tһ\ Y4(vTHڄ%yQrnࠤ HYE˺ߴ&l&$^1?Aja  0vSuaihH]A>?Ļc22b7!Swq 4 tk3fS" KhrGZ _ZJejW4-aV`QhA!jUu*Рj \Nd(/S y ՓAw22e18<ʦFŌqG2}+`E/c9U 2O8h9s4`ChQ8s ψBz3')> k 'b-`qn3юkWY(H>Nfaqp&1 ~;w) `kD60Bmbjq4 =9 endstream endobj 219 0 obj << /Length 2558 /Filter /FlateDecode >> stream xZ[s۸~#ձ\yiv&nImI;#ARIN}HKN2M\s`]F<짋' r'2.V,djte6~V2߮l3 _v&L|7+>ٻqs-Tę&KLQ=O|z !%뫢t;o``Q/}vŲ Tu:E:eYͅa϶+좥б_KHS"gB0I,^Gד g&X񾁣-Xi|2A:AtN ysfR5Х_#g }NLϲ9>5 羏~6E>hKz it>9=֑fL/: DKxFw' qdtRc:x%1zlc=Vu`/rڂgCvCZiwC@2)=! a[hSYEOP紥fF[ @ܧӖ 325S KэóL)OyL~< zLkm,WXfCY{<8E/+[ W!7z7*`9[[Ӷ"?XEΰk.&&|32Fs2-W'ro1z  fqb̦d;8ձBe 岊x7#dKYtFLGU ss*3{03 n "$ţ"s!H|cAXks$)tj7ݮ0B!p( ^UҒҒ "HjhSBadDVi|X41VqϢ R8۝[ M@[C?'tnz}9G\_wi #;$9eLJ j(ʀWCIAű5+82fOTD"avc!'$@aiʥ,[ËvMڹ 3ӓ󪲳LU cTWІaZ`B c4eFt)"4)K8W&c3dᢒT]PΫ&9pX (A;EC5"aϷM(3jTN<݅N _PH51hȠw< \[t O""{#0خ}і ?p s׻Xj<2 荜I~- ʜ,ߩ2=XIXT0hvU !𾫬P GU}UC=:fO"XI A۪\/_wq8OYk8;vxކʡs!",2I|T 0Xf}jJ2?]a?!,,5}ǧ9f[Vvbu0bx9YKp JB, _6_o9MF{,щO0<H/_7RTrtG`i~Jz'ņÖ1uAMhvEY!WJ]bV6k2df3K)c2KXbTJn<ê= _n\ڿnU{=#~( @"VK@WhtI6,ѽ7(u\Eftm6鋄|٥2M$WTz[\kQZR_zqz}N2hR_s(=TJveGj0$>βk@pg M:>ӭ&\p?nɿ poHnHeY/7$9g CjM~A7KAnU1B<5RSk}0 _v#Ab:]'h;WnO| EQ-v/f.T2!-y="^vy C)4<_s&az覊q} 2/K2 endstream endobj 213 0 obj << /Type /XObject /Subtype /Form /FormType 1 /PTEX.FileName (/tmp/RtmpwYqvEO/Rbuild2a70d72df0a837/IRanges/vignettes/IRangesOverview-ranges-gaps.pdf) /PTEX.PageNumber 1 /PTEX.InfoDict 222 0 R /BBox [0 0 432 162] /Resources << /ProcSet [ /PDF /Text ] /Font << /F2 223 0 R/F3 224 0 R>> /ExtGState << >>/ColorSpace << /sRGB 225 0 R >>>> /Length 355 /Filter /FlateDecode >> stream xRMK1W̱{܃ ՃU,ZlWk(V-2M#0X1pi`Z+RlJWC*Ik|fzrzvûAc4k.8OF#HR߭0TR8b6ޓNSMjF`LEK P`{n{O>Fo>$7Ximrm1jw$&B ߠii6R_fINL`av%60EwoIKڴd'-), MYtn}VCe^>#1gf9,N(l0Mf)2c endstream endobj 227 0 obj << /Alternate /DeviceRGB /N 3 /Length 2596 /Filter /FlateDecode >> stream xwTSϽ7PkhRH H.*1 J"6DTpDQ2(C"QDqpId߼y͛~kg}ֺLX Xňg` lpBF|،l *?Y"1P\8=W%Oɘ4M0J"Y2Vs,[|e92<se'9`2&ctI@o|N6(.sSdl-c(2-yH_/XZ.$&\SM07#1ؙYrfYym";8980m-m(]v^DW~ emi]P`/u}q|^R,g+\Kk)/C_|Rax8t1C^7nfzDp 柇u$/ED˦L L[B@ٹЖX!@~(* {d+} G͋љς}WL$cGD2QZ4 E@@A(q`1D `'u46ptc48.`R0) @Rt CXCP%CBH@Rf[(t CQhz#0 Zl`O828.p|O×X ?:0FBx$ !i@ڐH[EE1PL ⢖V6QP>U(j MFkt,:.FW8c1L&ӎ9ƌaX: rbl1 {{{;}#tp8_\8"Ey.,X%%Gщ1-9ҀKl.oo/O$&'=JvMޞxǥ{=Vs\x ‰N柜>ucKz=s/ol|ϝ?y ^d]ps~:;/;]7|WpQoH!ɻVsnYs}ҽ~4] =>=:`;cܱ'?e~!ańD#G&}'/?^xI֓?+\wx20;5\ӯ_etWf^Qs-mw3+?~O~ endstream endobj 233 0 obj << /Length 1979 /Filter /FlateDecode >> stream xYKs6WDUI!UQٝ$g*EK5EzEJ@)3rphF?it藳ξY(#:%iʣD2"3]ͣLylj2e4fTu΍L>]I&"JUiciD+_P6#0^ս4 / t%ݩ|64~'pNS(#/\E#VQMf3$K+8V&;ϗ 6|NdrkFA$tIRg'SMy0#Us{^68| 1Cc?Mwn'njcױd%P4 <_ܛp;q,Mu.ؓ ( (%&&@_yY$Lݘe3QcG:Y|\;ROI꽧4k]hBmL4IELgSD O)Yh.{!%F3sdg}SnUKwMC'mcʻNh7u8`Q˺mލ|=_wP"RcSګG`RS~3m>^DeVZ`$_|/6(p\h2IUl=$IR;Cf i*}K=: sk0 O@o:ҼXA@^4{es"*)%M\ $|- RrLqkHXZ@'p|l;Uth=RAH{i ec5჌xN AL XoGnHYK@*#:^* p`ɲ,^;$82/*>FVJ,sucTUM( WR N@,K!E mVplSڅ LB1-pn\Wh%rOc㋬Ta)i< lL]@x$<nqt}U;["B4ֹܖA&Z8n]vr3":VM]q7EYc/fe̛(eZDk8)UU^E`UyW$b(w~5Uĺo$6UHK3e'V"xO8ASEM!`H[Y@ڔ endstream endobj 238 0 obj << /Length 1885 /Filter /FlateDecode >> stream xnD=_a/^;՗"J(} rk7MzΙ3Z$ēznse~8;AHF"0% b-NupvW悇7 ~zAޞrOPg$o8g'ClБJOa-O4o]gL}6L7nRq0 K$Y=aYYҚc4W:=wUCʈ*ݨ-E遻h;eֶN/_ES/q$ahs"P xwj."ez}hUVnfs&Kޠf(g83HpJ̥J#dBbHϳ\mr"riHGawDU&6,L&L̏]N%,abB&#(iikRh8ˑ_ΝY VhU0YbҤ,Vmeym/<:lEqd:3bq,3ʎu8|vl%bDRÁ8ȗGnCD t6X@($'pI1 @ /022 ʄ/if3}aPԣڡ&tƯ [{bqG;,mÁqUOEr:+>w2Yg[i}& 3x7n+7D7Y.UOQ{׀xGE/Oį X }F3q+l6%Z?ͤ92Z aP$Bw2̚s R-&G}>5\?Ag4qFf/sv=}__|=9Њt ~;~ Fth؂_j;[C!.\On}K'}=s}/nɇ^'{_b?}:kZKc.*=Zz{|{]7 zs6Ap. }4r'Wਰ.iR8"|  :d†w(b|–{ƶ4zg=PFo?C+%yBu=BQE4i.L >:VįMGWoB1@FEF OʤӊiQBlt(b^z衑NVMxӗ PЂk [%ߪ_)ݑlʕ- \<0( D4F91&> stream x[[o~ϯp`k.=-ZAqaZ[qtjKeɿ IŖb;N6)r. CBiBξ8{NN..E<1dr1I> 7vu\ِ_ U6x%I&JUhjL{^r-9&<-sVQ uHCNCfs5{<0HZb) D~ZW+T *-B`uxEex l`7#Vq@)_fʬ+$c~YZr8[Y,^9w{lgc%>q~"!HHS>naU>~k`bVNɫ0p3N /avMizp*#@)I l o0Rm6mY79 o~Hbdދv[Qzƽj'hҎ7'5q:;/'6>i翾!8iϕy2?9%q.ހZ?1Ŝ[/&8Lڽ^p=K.] U9{nostl. /2?*}7eCUMUcX^!Z[k ow CeYSy} U^ș RcXwv2ϋ rW 2T]#)`l+Vu<9>3}h"!u2t/! Kfa02l,at?:ٍ{-<{76X_<5JR`E%)dPbȖǒd]AZ8YARhN\rs>;G4Bc~YVL\l0}CY92'h~}3^P ][L{ZױFf,Es:냜ڭ1\ LټXOq5݁;t̫@x/o# _HxbQd#!c+F8֜P& wR#7{<{qeo`r~6w> stream xXKo6W(+obh)q{IBD%Iw(R(qpF}=I 6)4] I %baeU1Y?8$1ѭ' ?7v|5nD01rD$R'0&$ Z'&0$-[ɶ"Z۵JgcF1l { 0 }?7# *hY0) kAڢq^SIJV89J cNZ4Q #{ێ1>x3%˄3T5D~ȪZY`3{C0]+(k}Ӝ1e lLP"+fvs)Uae| U~7vc.s[S0(hz(3>1c{wf0OgY怲t *Ws_S,'anL ӻK}Npf/b";6(nQ{8r9K)bU42Q}Me:. R .z]@`h_~'\\AcN  Ivy3ב]/1Sm1/sOYdU^B 5ns71(i f(LNNz mLocsKїo86\p>t1Zڮ:٬ [WKW`nOu^.X7}K^&ȍ^?`d%$'dgųU.;]ǞǮc7Sո i˜Pvfls[ջhԆwv/hdUWW~qwHܦG<$`F]+ʛІ~/"y2V~X9Jٙ_$kސ 2Wa sm˙?!-х>^ $kZgg\Eh:P6|IwVᥗ^T L XjA7܃Huɋ1 r{1Fx*Q~\\~ӝV-b $tP/ޢ+f2+hZX%߄IS')vdkSMc&:ӋdRa'AZ{t endstream endobj 254 0 obj << /Length 1216 /Filter /FlateDecode >> stream xW[s6~~@Df8$u'M:td,Ls_I}H:܅{ [Ӱ7u478Fl:#5Ƭ؅֩&eC#B{eF`mYNO{(lHevtVe^sU4 uqיA/JUc`;%r#$3Ɖ v|jk>4cKyTQ))Z)ݜ 7UUQ ᑎ&5 h>v͓xT2;]FLa>6^*5,";x|k^=by endstream endobj 269 0 obj << /Length1 1985 /Length2 7067 /Length3 0 /Length 8175 /Filter /FlateDecode >> stream xڍue\m6H#"twtw.K²tHIH 4HItIʳ^s_pa9f㘙?-"yr`8Pu׀Aa\0 ˋ$T<H?OT_D\@+ 4p#c^^П\>>-؀!P\?*P;Oo|`; l$- h*sCVz@pG pT n3@O;@{m;&OӅ'qTA! Y3 jC lÝ!P6.>^;@@NP;@Xo5apуt(hq+P B~!a"gDB?>)FV !G gB?0Gxo(GxA"[ćT}[$ѿEHv[d7EHR $tqEz rn "y0t{jsAN`3qkǁT !`˭6> %, ߉@&EFE*tsY\;0یH}p[ r>` r":ށș8݁*Fv#[r TB\P䳽G6vy/7"[72+k|kLe v qG-v;Ed<`͝ 71ϝ~!uA{i2}Kp'RڭP!~|g:^;9w Ѿ|{aj!P? b yԟߝ-5,'q Iy E#ˈ\v p&` 0Wj~AIr~c1v h_Zo6o͘`_P,S>sRVVB}л#ļ)wT[c_)!mI{U,5/ H=vXM"f}CMaa[b@e~`閥;x߾7tzQʹ.nY3h a Y'59smut\,92)Qѭ}Mzq7[sPM-Ni K5؁fգ9zPu5tG! Jݮ8❎̹S}=AT s%pͯ{ύF63хi`閞i!|hMIujLLfD4#C5 ݑdF^t˩,o4@Y3FnI2!#oόE6n^klgP=iaX+&(]{4| LU2;ޒ~lK9Ӳ(A3`(MBe}9d᥹hԴMރsd,̵G(hܯ %4 ĬmS~܂ʶss MAS:ۤ HΝ /5*0TupkO~p@4nX:,Wd3 l'jhC{=y!C"(?q}5u29W6J2e|_F+/ 1M%DY760Wf3z3q/%Pcx֧}* t\/xoB0)HAsF@*kKS4g=9{eC6jǟi_\1 m`[ Du%UUT֣#w,PSfs / 隀?S9RiVP/W|m'߾zbt@u|O2 0}n[yňw7nhCn)O ߫z\.][v(w]EU+7 `OԌ_ϰhanN "\C/06N:a4.ؖ.*JUPNEoQ8FFԎjO.O)^_z\R7D6֡=cP,?/qx-$W}@=ϋ(|=3.ᱧ24ד1r}e8_T_| D !5-פ/7f 7#r"ZwMA7> v1sTTgUi3&_7:S(պd@yAؤe5ԷzjlK{:>|swx"}!zW nO-%m%sڍ,Rڏ#\Ѫw;ٜLL DL fN!Jӓ Tk8yK4'lSZTpkt:jx"8Zc&4!2)g]c&nXCa,)fW66EbzƧ~%=a1Zy./}91e@ߖLwz%>y*8I7xp*%?og"AΒ!Yjz [4TY`H_00`G^-2wIeš255="y6DM~V'#D T%=J˖&JْiSFwS3,TI=EWh0$ѲQ݇.'` <@b/&D?P/p8.V)\f|A6/ ^D03`-^g(?]5Kq{yKVE^ߧAh -UuB8?1|M%|"llTկT O#=HkF_7>`hbXwLD%rluIvy9k"v,b~z:KAf_p(׸d&:?H*,N4&:Bۖb %_Zӳ/; Ǧfz7-&xa- ٫h=nF&IXiu3>f\MR|*ݦr\f8b"kC%K_GU x>kdhOL-:sac?QI667|{\Ds@y{f,ʍyDBG wYw`Roс2H`gxX?Vn CM$v΃mjp0æH9+YO-xkm"n6^gE!9KRGLud4G}*dhq;XXi[m&yx?{ZD'A 'q2!lwr!}^cu9z+aחrzPakqVF8'`8jhD`~{QuܥԕQL 명']PD&6V)BU{B^Eꐩ[!;ƤY(mpz (Yݯpy]FJM+$ Ne ?G 5qEi}EoāK"h(q޼iҴ lB{n~(TdѢ?e{,AARz8XzE dYbEJ)ѵd__CQۺwo FxTIQHBjg2Z_^_՗ plwSGq840;]8 q5حX# K1=Q6`Y],.vsޔ iٮԿB$…k UgIbrG-7=:^gp VTAE+˲:^IZ#4㩸g1x$Շx4mv-q0cOx0k]N%!dIKBXf$H()/q뗬`וּ} O3y,UZXGB}g&㋾ UE=:WjZpskE{ٯra.|q5Lx~]_ߏ:ѳ gI1E;7vp.j/Zol.')_te8ƎPXenUEuһ4Z$sA3[ :åXh=(~zLz" Q e {1LU^Ihu3}9m܌]zA ŢnF:#ꉧ>ɪ^h!'zPp-gs#a\nǬDcꠚgm$j<# h?&-`̧^<RV_oF7Z]˽e ݧa6wGn=$v ]c1; ZԸcFS nvw]amT(ߌ+ . \V/t ^'HU1}?^,vSn[7d k k: >{+u33zn*Ds!>k#^-3狅aQ/v-G-oΫW /Ip2p^1k[dOM^35JUX˅B3樂|V` Cttx}1 F\;i7h I7f芗ijoU {UxPD7pH-Ϯ$RߘzUr~2P=3G2DQ.ʇkʔmCNZd?3"J>&%J|6j?N{X=ZVsq5* 8ٺnT򍇇H\%7YPwtV7ѬZ@өXt`]DΛzrk [1 ?A?4Fڇ mHfNT>MRx(q'{TSVTViBYjCǙgVJI5MҮV=,U.5S8S:]{S$vi(7+5s+Xy9RVs"@s:IOPa:'k-D6i6#B=93۸>>Ou#Y$jQslLה+EyӐė艧)&99=7;0OtZŶe͊ `@^(j S T0 `̚y :Qc(JcChOCc|EEE2̶עuDR4#fQ-z %1kQ}!־WK[2߇W-[=a"z6fJDOmG;LKgj؛J/-FM|0JbM plU+1G S1a~1Mt壽(-4pސVn[HK":WU4^py,yޑCJZ|/NaHEytd(TiN$ Z.ftqe11ɀ٩kDH9uo Ņj{6?,ۻ<Ö[ _ ) [ l5Gz!I%H2{C_ӍAztMx.pj>D>!ADʼnC2P&]>.P !477:M.s'aKV1tHjK>?41sE p8ُǠ!9׀0GlO=\GܝQePOBXkG%$FGSigM5e|| t8766 endstream endobj 271 0 obj << /Length1 1861 /Length2 7544 /Length3 0 /Length 8582 /Filter /FlateDecode >> stream xmUu\֦aiFfhfn A@:DNF${ +4n}!OxD2P' ]Gd`5\J`$ E@񅪋3Dxx@??x@ bmA-ax*, ߿Bl@ֆavbgp! olom"S6YA\1Z;?vB[#̭'*u`5 lGdi[Y B-65=@a/2jj:sLk%/;9=#? FRr@ݑ=p? `w#?[_HY eo0[o$~vx܊E!"dyYn2ˋ02 o+DFj+2uv D >$;1 9͡vP 3N lm md#~$s͙7 pC@dIm*TsUI9r;,Hm,n =,]s w .uArvim@Lm@8M UnϑRt-ŏC>[a.fJ % "u=Fp;Aa6JZHv.w"-$ :P@d' wDni#"w "+Yep z"s`Ip!z܁HY=oeAf:?7s5 ݮ¿0`A~(o? %px  y 9W_&ZF~Pw9\4&)5QNp//rr"HKtp;U~3hQE Guƻ~uYg[TԨR%Z{oƬ-}M Z3v{]%m~ qsXD}X9V-@hHhHCeML߽K}>l8s z jjzOߓmw5a2؞$ڮc?z;my(e=CYo%8۩vTcufc@.(ūj{{˛ gGT(^~$:rUk{|~SFϹ|j?"lHd.pGoYڤh|! cQእ`JC7WU+ A:ܓ3]*,SGLɒg洂 M[Th=XZ 4}XN̯̟ n/x4zm2gگ鎭}Vij:͉NC+,UPO$i`[(ٽW76}ކכɪ}~bK%~=!yҹu(@9%H_;;EyB# $Yr[i(,!#A cG8: Pv^3R@DiJf5x&Y4&"HQ_.Fؽxlݱ,j5?؈ZvdMI"$avT/(-Kb7.K]I$Z&OD[xES=+f.j8%‘B<:Q5@uɓ|/:޹MX ,`hf-G~`VvP@)4.jَ5]ڞp#"XW=6JoI 炦b(`o~33y=+Q̓RvcubWw*XvQ[]1ެN"XqaUt]rò2 $qXu^sx:%tbCQpDVxa8{a]:vDU\F$fw)k=F޽gW gj P{f%xZ'~1@O=q[ʶ$u/jru~wG{˿G[s O`^6zqT}~ I E?}m7Hl]RG!?eN^%Cm&5 @P$;ЗHk+S)Hm4ow*r4џF Կcʄ|0۔ ݴJɩj)Qm{v[xSg*e06MyڊP%PqC[+lZ>33Gf vâf1 i=$TWƔthId$a(}ӳL+3oXa3goޓ6n(Hq2I\Yޤ^c7b lY.<˜ECSUu& X Žk 682#*Xp$Ę$WĻtdtpkMakNRiR=K`_d12ZȻ@ɋeق3c5+כ'yQc=(i)04+eB&4yG f-^}x& yj|E`Χ9T@e=$=D~vh[@qm+Mo)9bd.}}( 1q-3z`X)DI2F>G>)`i ]-py(PSLU^XMd:A-CwgӸnT Q$Ʃ7qFs1sP1%$`){+/r.^>Șу*[ae})j1]Uwvvh̝%"vxERko5mABuɻiӛ5?͛촅}1y 9nK}tƭ0 ZC;/)k'3]l 't‚9Of=]X O, S&NZ EN8HČKtڰ:D;jlhJ:jnj\y5%~cCiUHt-\6gq^)#-Ra QhQ yWCںO@nOCݖsB@ot=yn_M jUz0r]~@{ǐ}^`uFc ƫ+#XO5?i<,0FrQ6aHBe7f|փ3I.˧ܢ;_)5 tb('.,!|BGq b(ы|v%+|~,P/>C- "#,f2}]Z,TQNWUpYs%:Oo'G L']p s @^M<<^b#zqLlW,)ӣ˝?gj0]V!]B(MA\rf׿2 CI1)ܛdw0UC2.SM|by$2/E.+%Gbj#[Ohi6!$4o` >`MiH:C[ /1=!S sƌuLݛ ,=wMňxƿz\~繹G&N !yv>-Ұ6"pFםc-tUmݸ$pEاx<\[6Rf+l =K|Ʒ2j!5M_7zYP8I]x{~9#ܦUi cm;"ˤ9Q̈́0RuqdyIEM˺^uew?hxB3o7!433OQ],dQeϑ!;H¶tVl(}m<Q#kCն3ӏ}n~_1!UA٤Y:So`Мg3ĮLD;z kSItN/G|؃Κ( 9Bs{ e [GAI ;Lq@Hp˻7W`qDMYV[bq-\E:0Bl$@&o|[cpE!-AL{ݕXhRL)7j%QY)ae})Vpΰ=qlq-~Ysb3@iU}|LG.{\FXvT~goDБ~~&>:9X|󓃈mC(OQ OtN'?d{%XAY(DD]Dy%pt6j͛#IlN>}ض" oeխVB4x 5WsC|K߯ ߃q(Ƙ5MĽ΍{=6XlbY'|Kݱ`qw!agM[լܢoX4 v6Bl zLfS]޸r`LFV~En26 ʽ* -y7|̣}I;ʦi ,žDW0!qgqߛ1Hy0z$Ü3T,̳mjy2̖c%,:RW(q'{ўDx{=kY|#T멁 06t1{sʦ/=m㎄Aɤ-3>)N^`__nI/byA1VpJy*Uh'tWfPG%8'n$<#'/#:Q+i^Al -ufv z`&IUo&/h^oHˑӿw pMTyA4NAFYqq&W.)׿qSa!vIAO}gV9XԞ!*_Cmt!kS#ˑXcC534ո˰9X+8!KNXU-P.sd 76a?Yd⁺ɆB弨߮Owev?dp O”)dalHnZ'{) ^:LnJ',c*SgIS8>Ud|9/Wh֊\T!1S=-$j"o^s^e]4^21Tŕb,8D5ڬy%Qw8 O='js !#꒩_.0U`y6CHxpTA 5 +}YOX ;9U&<)PgN5\f0K40\I>': lJXjƴ9? &2sK `+ (Ƞa\ FP/yg3I[Vgf![2nC0:FqU[$|epd~_{.o6C՗ZޟZ׾>5O*s5d;0.yŖڦ}zݐgRr0'RBZA[c,ӂ &k_Bz (;kC|X-h, =1{e#]0hɎ="# IchHMp!H#/.*P^AQjOrJdǛU8u;_rոK7_[-n(92[A!YQC\/i7&jL'ţǦpJMZ۬WV0W{D6l.k%?0G:5,hWw.p,K !6p4=$^!˖c{eʎ=CQ1}c-ޅd$B#2"Bd^ߕ|OmCFh')KE %$-L]&|sRz:U g,|9]'fntb'@?񛝩:y\]aMp.'!< ʞE蠎0庩Yӧ :g)ɏfP>0 "шP ?,f}`"KDW8-NЦEPTAvCv62Fg3Ac[*ܰȉ^_ϣ4E:}N`-fܙڼ#Z\>P׋^ 0Sjl*-XgMAEڟ߿Ӭ1d}Ri==ns&\De:6t|T0,KEǬvԿe{lZ"vI8r]QΖui4&Z"Ψ{&e-^+#0frԇhRczn`Q2δՋ߃ z Iv?$^.i>QfG-/l&<7(f3k^ε2 jWtc!i|޹N#)+Dp@YL?wcם8p2;e|eqO5x[]9u\p?WtEᜌ> stream xڴeT\5 ݝ.h<ww -4C Hp:<Y6jZk19P ffga(*i\٘Ձnf.66.djjI $e x6 0bƏL .% iz9tfU'W0+D YۂI'g/[k̿#`țY;y@y%Dh ś6fV'+&P!UWRՠgpsvvr\$54dRʚ6@VKC_M 5@Y1$)*;j_@2I j v`e`vs8X8;+t:jN hk;8t16Ccc"@C` ;A @7T.ON ||쐙z5)Vpv\&7Uo`ElV?*qXAV? *A >S/A? A4 HEZOCb_ Y89@?.G?l-!B 9 l`N?!ZBlCJra#tbjdd@vH.)7 !9QC|  Gҿ!I:]l6vH 9!ussȗxعh[/9;G? hŇ: OPnH`6qCp$? $q@? Y?A"y]#Emgck yӀ rԳC䐟g?nxKH8y0sAj`L;'6s_ 0}@OҼ`]ZsXtt%<5?q|Rt1TP85 I񭀑_J T:y-zRMlOɏX6VPb@e|^^LV{B;@kH.c"°}ޣ{ "Q't'4'.ڬO|aִ0 oL{:Da/;G>6"aGGMD{(/ܤn)SՌGOn2)_gR8:. h[ie6~ zz\A4vsU&:7&ؒ:C<on]m/4?؅1h/4#j䍉> 5K.x0 ׀qZ+מ5s+:Nmc՚u`qȊc+rP+;зW{|YXgfg{N^(F*ÝdҔXĄ1bnD,=G!JjVq[D'U_,u\,Ju(n#b32DQY$::ތLv2nՊe>"ҥ0^n~ M|⯐9UX,ոX2}xﳡQ)j3j~`3N ],NkL076LJ)} e,TfGHH6*ƙˍ1H C2Ukvl2]Wyt͡Ur#&D݂)UL'x_\;{xS32ɑ V{(gV3y  ,i q{NߦEn*n d7عG6{8lP!nCDgAklus|wGF>Y9x/ 77dJ%k)aR&9ST[i&I-m+=0vrC.oO$Eηn #|板1EwUcolVj;"ɒtV5f| i0tb]k%.Ftޖ$] ݳ3ZdBlȫRZcTc}1Ķ%ጨфruRJ .xqf>aQ>h.2gS( r52wpbpG=DׁL™hJm{ IR+f.U˞:؎kRZTCL5_\;R"oM05_} -5mő>PaHD[5ؿjӋB \%BKHGu31&#E,MzT2}M@ \ժGCD7HxULpJk^&;BfW77)3G ((ZfX%ܽP>  k^To뤿TyW$6kZoJ^*1XKik:Sy9\z}c{@Rsd?G:E:~:U.:; x81Xp tDk7 澜,LP7b 3K PQRɞQ XSր [V#Gr ݅k+3ͣ#k8uaMW%7ך|:VCx_&Ht2|3Ͻu*786}!t,;8ڗOh"?'a, /Io )P%LI@ ;ukv|lDC PzWOAaU8 њεմexC]-^\?Oӥ"ҰWɴ4HS\Ng skQ//?_n(A]o3n=lM4+~mL<*KdR$(-Ǫ͖[\{AAna_ |H*%`$?ֶk=(׼n9: %:oN /)XK3fzkp?8\d^b{ySd>fec֜g>#vsrG~A: uy)n8AAܹI !!%*'jbW ldm']2T2Zf(UPlu=ϏK?/!JLיqi7_c =h&ٞXatk̫eX,O3uopDTu RˊAnEazO[q|zZE*ʼnr2cBd-Ȥ۰ NLEy8SJc;r5l@R?`0`fFy⪵Zξ_*PW>]6uTݞpnsBB/kFx.d4nnu\VFGsN/\M<œ^p?Z%T=NsZFL\+ы{aJ%j|#7 /ZL;k9 {k"Da+(ք~j%&D=n0d&XV+l]7ӰD.w8a c'=WQ&7 ~}cDQ$ W,/ȑrN(y^Xvr 2ꪫCs!KbL,ɻ tA HuJWO ?Mz5GFB p:-s+/J(W99a(Zn:Q=J*|B넏7Mh%pj}sQGka^Tr)XS#F<_@nnz+2DW(aO- &q 䰨 iՓZuh)~3W-w'!`hNG. sQ{eּ}n9B:G"eO<\9wWLf7VmQ KcZ|? aMܷCkO~w'C.%K17ݙGG\#{>DV~Aaޓ%3>eo?\]|X.3AKe-uZ;TZ ~2P-O0cf8YcSHwlE+`?kx2~}do7$AB[jP ol'vKC,А|FTk=Y5T0-qW,b]n ((jy>gv/|_|hF(n&JBE/>/o YS2w~r^j2&}c@W=h4va颂0/E3#KϽa2 X1„Tа]r "t i,d'N9EsVZ6VKR+v_eHMq|~sVa."G[$-iC̕g/T_ш! 0#7{$tI)K3bb#_(5*O&kd^`;޵oF\PWO%zA* )5a/F*${迏@XCPQ ;+A d)r9 ry,VVijKFyG{vBX31ϛVoRƽt9_\BbBiPnlQh(Ւ ޷ gKgָ*xgqt#xSe"׻ !™"g"4i"{sv~*Rξ Kty|OiݳrECҬftU裙 4]_f4aլy}|xRK/\&wkTW͍58?ӑ޲ V yVw]i[W{g4[[\2ƒ6sX(ԑ*{oD]ն=8W8KVh1Nu+ ǚ3߬_xp72 Įn|7@ɧMFsrn)Lk]iM)dU,,z*4*ibh}#۲9+|B>N']szr({#F<*RajocJrDXN>/zjkWKn&t0n}$&#=j^a~a)U۠-hmc#x?]Ʒ`!uA]:[kFN`"2i{eDjJ0I) f `>\HUn81L JP 55vYR6ҍ^q>3zU!B[avk B[V hSO9_o"wYLti|eIG15+8KL$K|8^Q0Ć񥳞'1-RQAM(nɛZ;+*-иTm܉²I 3B=WbJRGGb>|ܬQ,0^ǖOW|BՖ1>< ڛ$S>.o0q{1aԼz, 9euv"EdF,/}Vb a\E v&R?OcW :8H+,tp~+I/7xV[TGgPLr6 asyX'1U()Q)3!~Y+^3 M:Ӗ]ߪmhٔ%qR^^_APAnȼ)6eph ^6X? jK3y&A_] &euB8Q:oi'FuhI! XWRNr)]y6q+'Iߚ6F\K[.,Cۜ!EyXt?oԫb]?[wZ-%50s`1vcQ-w][>8kָ7$Ə$ ]_i/ 0n nؾ-N9NZPJV4 R?i2T_4=&3/%}e'c]L^-5}3l "vڐ_jyƯ+BY!a4% ;D0[_EΈUV* hw߆#劔 4 &*ԭnFT'Z//}ﱄgHI[m+p>@CPO-+ŀƋ`t '{.U_.*b0)(>W>)?l7J^zωMlZf8-ؿHA5@bh]I}/zWy{;ݍǾr?K1ݰaT.zE8&:\u B3=r= -(~P~JuN3FL)*.P褌h֎ 1fXMІ~znOB_8BEd΂*;=$}NO9̝̐}‚oQ{cV>l,| ʭM=T10T_s$"/Jc+ʡV ͊–oxG`3Uʴ~rK5#EGZ$+J[loG JaCZ| YzՋLb2X>%&K`+N׷1R}3k9b(=7ZR|,3gyy+2djϴ.+6η E{Z%@_f,CGlB& N#TTы⓾ +=dܘ\d( ϋPņm@ x8ӛ  :7L JN5hBŽ]nIZ ƪt.CW͟&FCQѾu&(/ bωer042X'g_ vhBX]vQL`m@%U 7JT @ Ewl) ۺNɸ!:L7膡 bC<YNwe=XDd܅oZ_c8㓥7M3Vh2ߴUzkݟV_mrѨ@}$-@ gK[r쫕\ZҏG__W_7 Ϥ]e|bٜWa 5%8<_$gVyNIM03ۢVO{0$hDA_~k^oK|lsq) m.1;F-S8̻`XA٪Voy>`^NڇmuQ^)(QZT7j>T)!\S.=cjW;[atI &}hLܖOu~ Dve1_G"y`VOj%z:CtK+V;ׯzg7{c FwM U oE;4`aQg#SN,4`{8bdώs0n?'M>2b/'<>,q*ȋj^ʷlTf;|;]jrS4#7&;^FWڟA6r/~!/!r/yfRk.ݸK@RL_+ݲ5fYKܟ4u^ׯy~Eoi3pXMUX~fN=Zѡ7.u7yD$33/Yd}sC"9&a4vFH{{>JV8wPxaBC+zI?xg\˟ߏi0".)aeBhquS Dvڈ/]vΔO O{YO.E`Q4-/2Q;. EwLY~>r2P cيmN`QwPIAʚ{ J'/m{4A1 4$Qo*aX˟-Tp>g֘wQ ; ei77acҤ{4A'Ϲ0*1$:3_V 呭iH{W?@AU[exgޒ*z {/C}DnA"-]iK'*2 vuʪܵgQnK+d2K",B.M8 -6t <gY\XEjnPXZ}"kLV[91 ܮ}K(5i;zrFw6[5KuXRBF hO3dxSoڍ?sU^ ow9ﴤ_9u)X9I&Vc{AGh4^`ZL ?N}y 3,$վMxxOA嘜3ab KZ2Nm"Y@Sj6Mc:(zWYq|J_CO;X 99HM;￐`&̶nxCkV+Kbi@aqR}0om<@[QFжxɤFcu<ܙNS@UWM2< B td\}AWH7Il(Jg݊ ht)~g9ũwPKF:`CLP)Ṇa^7~ Zc󶯱WIMj%\O+ Z1i0|R%?e)N0frc)eU!vnrS_ؗXF|3-υlUR. *ePCƪBڶ)=kB@v"<39vA2c)]JJ?"LUt7݇HmlϮT7a^MiMe]=*m2M+W/Gq;uue\|^*9jZS̵3ڕ%\g]|#h/#:lI[Læ) i}?yl9>Q"!SaTPm?x P1uv69'' -Q*nٴuVMùS9&m#J&VXp;&BHV q)[!/_v/E( oXX1t8mך`fFΡio4ɷX<@ 6=VbG>n^E91MiWdizRM}TwΌ`PlXΡ2{Ia/R¨Fuª&} a'9)'q@Wo_?o3IC;wx' oB.zrJ[\*M_ht+\^ԻRq%|ˡ%t۩V~{[&j ;,߭MZ 7Nދ_$,CGa%+O߻ '\',ނ+?:A_Rxv@^JzSwh5MbdgsI{l%RE pK5&Za+>}8v[ruLBSO2X8}2 9'{|k7̩OP lCg;HwqL,ųo[$8g}9=%zd#s!i<ԈA/"m(#g z?8,&C#E!ĿާmWOJ/nצ5rY2ёK:9'1 NQ&C(V>c2p6UE%v,BMDty&rl -709]Pgu:>Vu(ψ辵vC߻wHNЈ"Q@i4Z!󑺿~A:ŁxKevA72ٍr!$Y e dP!B6UJ>u`;.i8 ek7n ^6/s<+"zQS:`ODwoVnDGZ\K\Y5J'pyB3Uiʈ/[LmV Dd;.FuwUp0tpdh(_^/;ڸt~樥QID{ֺ{[0Y$)(lԶb6b+Q dLpR DUQ/[\ wފ &?ˠB]yԳsK5I6%,PrtZ\"YB4х#倞w4 БvA%d"%i2h#Zn%Lkr͏“"Ktm%aG[ J+Y]P)/f8ݽ{$&}3}aG7V뵰̉ž~wNMy:RC\4Jĺ3vzlMA'4] s) Dq6@{F?ǟpt]Pz6V(=N)vk1}(s2LGr.[g-3\ezo&LzKe#natX/Q5DQy!QWQT2F+MRfZ.^ۘ6d"Q(8[ +nOl!>?M ,3:|fMRMf10)p wjDrdN%Ig;'b˨*r.)T'^ 3'$;v!7_M-序r=3nz9HbeIJgnvxqAkax˙[/]7~mAzN |MK̘(tFD ܳ/!g8ԤQ&5:U?ng|ߪ3`W{vJ.4~VYsWdM+p$vt1mSC#sܴ< ;N.a }΋ng[Ü%Žaγ40O2Ä:%FIhb,gve%/_eOEB}~C6L lUI\H܄'U- B,^vgS:]߽7%~rr0]}0VLDM{崀lwKh(!fxDM&mM2>A|*̵q-f#jl|MB,7ƿ4& H<]&ꑵo8+32i|5śHF5}bsiison W7|s |#JeY;ćȗGdi-6bir":[nNl6uTe;/_}"gjVKj(_ gG=^,5< }N;1瓫LSdڛ>qfUhs)Dtډ9DIsk3-ikԻdgn!W )8ˋ*~ yks䖅1OO(F`ĩ/5$ӗa819{#k2~L¤ ">̊ (o",:BMIc #1FQ5#4-250Ұ7gy"cH n[fCϠZ#8RwV<9Wyd_&ApTA˻~GH 9,dJ >!, ~ALtz3ށf;{;XҺt%@T`ԉ 偐<,;kD٘w5r$hI7|¸ MuP !h0w!چPu֥--5N:35eVTB b ވ{ R-iy9,w(*|:ղ,r Wx ],w>%mU@,\}EE,&.D&cEU$\zh!\ .Asdt?<0k2mɻQcJz AS7H%dkc[ WFWj\'b,9Pҏ⣆C "RjI1b1OK`cT>7D(NZ fCtY5+$#wrf:g~ƶ47 M-~=0L^vcnR cm5*uNnۮТ4[{ݩٯnv<] 7'ӀL K@V%4RtGmيTt1l uk.Mf[ͱ&ݒ{ :˃<~T^H} G "rsw.*XXJ=b s+^^i/q#Sjdf#$gdTM%5[ pz> '0b~Ƃ%^a]c/QO Joz)G?:`=xt3;TLH`Z*wQ7nٗli,׋qr$>}}ZR-_=LuvA{$Ri FUDB|p$#OḶ݉"?ll4Sh|Bz?ʝSdĪ8%#<v,,ضh'FCFGx=a ;M͉g TN왹p0U.AG1Er$wbOSjvc5l:ù=I:6kH}~ge@<#J~A('7e%wT='-G5Og_ٳr7xEaiȶ،?!DX2&k~-*kn =z eO8tl^vj2 gb2@z_e "K18&RG$V:Kwmè XS}?ed/t?Yet!L Lf fXu_^jt$''dloRmSFѹ$px|_hB@>Ǩ`V.ﷶ&;\cKˠӒ.qP~:FiK>$JvhMD74ìK,EWNTl%`@ ;׾ju1YUH٘lƧwLxHS#LE@iշ^FVՙh">ʁ4+[F sBBD.aԃBG)K"ĉƢV_~^w|mYKGsi.{b|IuSqXŗOPˣschKƎwP ?yTbdWko֧=1*}C .͍/9Q-C(BޤGjw' jKHV䯀O_ޛD#eT4 3zŵAFT0%m\5c([ڥDF Ca_M/n݁沓[&ӉIٳб&ꉕbw{] XD2!78[ղ8sEL*R^yD~oA $TTm)PpRMHY,ڌ;hvY=f&߾Tdl\ :kTxDƢ<^?EiʏY-bׇ]E੶PO=Bᓵ\:L%<'-˝L,?,H\]C5R}d}3O 0Gn !W=(5"οNR'H{6n+h$Di3:Kbk[vdyr?Gl5-⹱7>XH#KXbPwHDrRGo|0C` zIQO0#SJ?~Ӓ?ըTyQsNg%~(cWIaЂgwTlL= h-K\As>fnՙ违D G-&+Vi*? õ+D=ۭF-wu3&^79;rqIL`Nܬ9f 3^Ix_$*]44Mr$XZrv_X i.Y[5ygZ ~,v^J:oȻ  y9TҴYb;Q5[r@ &gM`{Up*ژK=jSN&(%G kY5A0| %x'ʃ["XƅkWo7OAY60vh>C-~/'Y("f] Sz?o824)8V0Cy,~ԘSeFZo YR$X%TFU # 8@sm>-ՂmڼϿfĄz4u"߰NŊfMJ{t>m>aZ`:oM׽xkUC=p%(vK񱀳`@دt|QFO)QQIiN`V%~p$nhd=bݍ= Q ~~A8=czt8[bS q䈍墥`_puZ^2( 8UfH7Dm1?&V'wum%" lۣN]cU[9%lfu"[8\LݩcE*rśo2k̽ѻ ?YԚ02.E96^MKd+B}cS'K>113I1(+e+%P^y4v9 aqzh]T8 ǶwS٨FRje^my)WvbC&OX$Vy~ vY/h$/x9Ѡ Km%=6EaGykM+j 3S(+a|M{j4789G'"kwe?2QtGvѫ]Xy4Eב}e AeCFu3VƷw]O{1x/N|3OwZ!E=z fqag4oP]r?iDĹGd(rs[ J \}}h Zu& >1Psr>V_$%ꔇ{3O* #MS1J D{F֎/F+̷-*f3 :GCB9F*/mYڞ()8@ K~c ayp2{M,P\70U݉M}OGfCxII~<< P!pATd532%Rȑ-ogdHCdȀ|#c@ٺ|+WԻ 22! ZGQӉEupr^pw=K bYJl{= O/ˬN % c +vޅ }O6_ۈENKHESgj(-1!őipSzHz?Ihy%PQ<9Vˏtƛw!ڹwgJfuy;M&T2!;tLGS4:􉊘->;lse~gbLJj#j*a!>ZVU +YPAB+߭:SϞ-rY>!_N$跰5N6z40 %z3gS.dÝ'y* 1fjC&+F*j˄zAj)qIfTWw+ĤjὮ"A;Xc+|6&l uoq#Ĥ&јϥ,Rh Iuc JO>F,l@2J iҵBXk81PidNYm?$-vL@<|]חR.H=gu_C ΗS+u/1hJ, -x_؅A5tׂͺg7P@ه (Xa%O ~E m>&Dbb"b&̢e\0 9Nsj%̗ܩbT}٢x퉕ٖP[L[V '(Ws@:i[˟^~30Ѩ"Yg+F _{"˷YUsܢog f2zM97=1t+ICMy[Lf\6].L6.H;{@0cly9ιDUYk`癌VMyOڢ: 1B r|=(4H[)F_xK6(̧uQ꫟ѫxc?|\T`۵@d$@V~ w@@ta6 i5KHbOʆ}b -9JIi~ZA,^ILGeVo#=S'֣~_3Ud-Z8@VC)LJwV"#B%GFxX$pYCu(l(j9\r*˨+rJ͐{r:6t#6ؖ9aUh^pOেC 9US] ,!aM?[˯[rYg  X=ϐS2IvJIŸEɶO3!Lh>h[XUZ f~^LrNh⺜[ JV/ Z8B"A4GX"+쭚3c+?6XDm'_ԊiPRrpκ^C=SX_ sca|3\ؑGåZ&mij#_iK}Kv)uIU8_x̿Jf/ dr>u'Hnfs+?d@9].VΧD{XTmo^_MZb/(ԫ;aa.x=. Xv9#iq+vwp˪:KTj52îYGM3{% @1/lQ [&haNF/n’eB[ӄ Xa7erBϪ 9r-4nNJP*~ ,#.sAD:'8쒶kh(:Z"1J_ܶT,Լzlx N6!SM Muޖ?ڔ\>PwV37SF^i"Y $:B߻D0A! BCPԓeye_R kBtEl'ݮ%XH}Cںݻ[G\0'AxwBnxgK~WVrjE\85e쐈h4z~ u{x_nC`Oq;4%~li~fB90gL endstream endobj 275 0 obj << /Length1 1946 /Length2 14719 /Length3 0 /Length 15947 /Filter /FlateDecode >> stream xڵeTM;wi Xwwwiww݃%8 -%ߜ9'3ռݵ^@E lfoXyr gnJ h .%@̃DNoN3@6VphJ`c77daҽ;x8YYX`c`SO#@ ` 203ތVZ{hilk7*IEu%U:Ʒª.NETUM] &&j$UV[IWVVga lq~c?joNv4ZLLnnn.`F{' FYZ9loO'-a\@for-*gOrV@3OvoR%KM🚶 8\9%%9 L`g?Ќ_Q'?=r6.b2=[/c1c_MAV`Ṷl;3+?6yai qU51ۿbSOXL `a0 8L3Ĭt;y0Xۀ@^/f.L +Gf@GԒOf叙M/{3@r6vN.@XfV1;*HTxe~co ?ǔ팚كl=f@s$&{8s0c;+[+@)%܁fJV`S/4mA-Ǥ(پ۽c0?|ohj:;8qD/oa `RWQO8 d`;9{ 1M+m̀ d~K8}NH? dd-/`YLA6_d0Y9LAn_A7  oNVfy||q 1s 1s)9sUN6@M+BNVowрEDݽH7ޖY9o3oZ@S9{S`Ԧ2_rX*J\-x"|M `a@&uorX*e5ǵЖ/hz`y݁LNv tf[|)@}P!u2Bm9֭h}h~m'a(?wTΡk J.Ѡ+=tA R,QAl834і1D&WSi((؎6#eII^'ֆ7+aS;!m&TIO]pO/^ǂȭ泊¡`ho@ "lqN\x/knS ?-n&:h]^2v{9u+W/ؙY::xn Cķ\mcʭ?քٽ&SRE) *OE:*{CH/-*Bx?*H6jR9S b%~:zH*bv hyxyϠ75۩'UC߿{*QS܆S>&6dU PN(kӼ+l_OhYo FʊIΙOPMʷ0k`Dxa!dUD:8LiFIkq}ZagEmEr;D"Pʓ53DCd1)'LBDsiqC[=;(t^rJ~ZC*Vpz6r m )Hh`>҉+o?w `QOڱ_K/:=ߌLUΚ7HT7KB mˀM#;P592tdȚS9c~j<ț,=UHQ4|sGiFo+=jfȠ0HalyG΍#'fZ`|?ehbNmr ThʚDI/3k(V0:NTLH a>0 aFۦH19uv|^e >_ yH=WwC]hmd 2O"ګHz/h |#FYZGo3F.^e6JUJkƿM-gZHxqXqvSJX?- d-+ $]-(B҈b(;v^ˏ,{<61iesʜۤO5vbwP*MQ Cq]41;/ ͘&a26bnfs1'f5rTJkm} %gol{Fq/lghkkR4tok뿂- 5(/ *tuԾ98ޠ'&|Oԭ_@%!.kl0ӻCJRkVa_8R)4 \< oP3ӏEFR*{KY CdzySsihd7_`!A2L5d0a(ՑaD)4%Jixu,D?iOT ;#k[Ė-Nl4[[3`uZӳzm&C*]*+_#2Y:7T(PL "șJ}BK/3 <"e2,Kr8`ĞF_nߐv&/hwD:}(}Pg 2}ze庠kn=lH,*)W}J>WJ:66㳨t@ANb]-X`*940| d&}vd[~ 6Ғ0V%SV9n2xXSa)X$.NEǖn y쎎'ao$ 5~2U*z d/2ڞ4odSBuUvhP zힷ2؍'%K];uUV/ׄ!WAɴ"͕ k{kr|ՙ5zͱZm| !o%'uZ/0׎ٕW|a-#(@ن@<6n$O(HV["雰=pPt&^i G,R8f d (|urRJ(gJPos$ uD;eamSs+򙁍L@eQl='VJ3db #XIZ҆Çs(qX8Ήֱk&HXZsFZot,m-S=q FDpi䭊B :"XS>ۖ҅fbYt2ʨ I!v-$]U*D=@G7RCkP]J5UdQEk&g`DǺɏ Kv?j[ gH֟ױ))Gڊ)B>\*fG 43*8Oݦ#"F뱛"c)usS%wI }r]DGNY)|g,8bvh<)39%`VH m/#fj|x# ЯʺYrv̎B|NJqzJر#aZOjݪh=ώ/BGqߕgVegg˜0ѓG'VDUJ+eW5p-w_+4jsф 117ja̡ ϰ$sMW_,ˎ=P*u' -z֗+t c\H!:c'}hKvf+o ܬK/b9"IèT @96k~H(jD %š .g}nq;V*J ߒvqQN"8v?j,ӮDQM8N!U7{*rN sPhu:qlNJE[cvA< ?8\ ?dora)]jFeR7N2E/0oM2ّnxG9g2V@Ȟ^F?Sq<*,s9-!Q=fC|!G7wi8p&r#]}RO^HyYEME݁Wlzއ1%!WiپSo 멒~Ĵ"KMT˶:Nמ"1WJq?)SR+pn.nJp$qũxb5~"l-3P[K!*PL4[}֑g(S8UnrJ:"?Tk.C+ B!$7AX4{eq0op&:5CM XTyn32_@kA{|ƤD{ދ+_\,7席~܅6EM<}m( sIIU; ϶?#04(h ˜麹KKl^@9dKBV^YIAe2R^ZaG0T6kđ=r֭( Ij=db 81]yօhSW|%8 |m+=!x[e,xMWA%]NsaF>8i /eP<< ѥkx\̐tgi_:$򣒋1R!^/eYdpAZn|\~DA?E=4v L/"r mJ}4yԔfxbsκ)>p),<jؼ$˾&ĨF1I$&#i>V˷qhVbQxuh(V!?+ge κ 75թ-p;|&s(f*Wp"uU R)lh(w 1ˑ=wԀہRf {7X;i&*a!UV.](]O_e6XCw4ſYCUN-}pEyVF" X]'˂EVIa>Sۑ6OT߾:ᳶֿ&"Np$*Mf%\ a (ZvX%*FW7 j|wXW6jp n'uD;xDl_b][yt\U0+2&K褞1y0[=90?r gz5ήN_[iU`ƦzO](ʂjuns-V ͪv6ճkdwb?.`n|(aɇ_9qmeQw˸(ڕK*m[WViN_@N()#fbwCus*{ Ok:V%/u\(HPɲA@xbfDuxZlWxypo~&d>Zy1J[1xܸu,|&'Zhrkv;̔ү*-!gE%mzd>(H3J0F+vj@ALK L~Ne[9$;gEq& 8$Cj:<~m^Jm@ hZ*s\)ȩt1u]-% Y>', 6DAl-;1?9.gsB oL恲"l $'[&Fc&yX;ޯ*?fja&P_XN=ϯ#S/ ZPKӾ~dYPʸIBwC pŋCy'U{&z Jnwg9|m 0dņ?a䁎|.@'ś?@{^t׃"U8"YVSg~n {?//b0<'pB(][}+2GV⒱ @bsW[J6Q/O8-2Ϗ`眰m5Kyf֊kҮ3Eពh2+ W/ux^8:QxM( =:P( cܙ<.ؤ tȄQcK#M1 2S#˯ySa) $"ݟ|) ƼjLs&¤J\ա YNrcљ6a8J1f_ZrɅ=><-cD9jiTa8p{9jPӶ:)K`iQ.(>wV6$%S<E kTߺFh)/㗃 $_E&)ĉɟ;gss>U15ࣵ1F&O#,e=bemc.Cn @Qo4uёօVmsT7#P\OQ" nV92@e uGMAq蛎Y$qz95@^#{/W5\)d[{ֆ:޴p\'l9uL/\jw.TnCzywS/$ضD60.^BˉF5}>͐tf06نԃ@[#R' q:Vmɼ'S=wpyd=Lxߢae(2AJGIEu*+LWĢzB{5i>O<\%D+bS"Y#DQF'"F2,!|$qA4?R%'CHfQ8շb)N/FC?;D$m`s}pl>]NߵuHX&XF`>Ҽwق+qqXC=)mF!ꜞђee4|k9/*nHy*dY4l#Oɐ Ǫua* "e!6OBmi{f:ujfߟ B5C pWqLZf.Bņ&c[/_}KFI26ě}'xT] ŮӄE9m>U*p8`~ŗ_$G(à+Զj!ʆ|N6(A[՛ꏄ5 !EV&FFo'8,>E;̞/q*hn* M;.?:-MY4?xk#6v5Q4T]aszvn\$Rh4Z,ߧS0Y:BAtNf{$M =pd+TrF(8njSsμw^|'7כ(&KN\): _Σ5e19Uwܠ<ۉYH 91O0_n1I=2\x_WkzXgc#N-$NrrBRyQa֪&iK=/͒?'͗}Q36A9{q ww]hz}-+SW٣[h{>ףbI#6*f~*P~kmYWb^'mfߵmx!ϩHZ(zU\FFz#n؟߶dIJ L21R@A}0f#1yָkG{Lˀf͟.H "^Tpwz;m\SYӆaT޲nY; V2DrJ7Eh&58*[ ,Nr>ժ|` EB{XC(W:o^#33 W9jBt:[')p\Îr;J4v*aNCfûb00k݂SO_#7a:"!50N,wQsQXU[v>==m$ EnR4("ݖ1v(, 9^G+/]m+%(B,)#mw?fCuuȽSۻޚFvph3|T5MᅖзfU^ 36YV!nZGVneȝ #2I;hQi$#j}$ 5~Q^Z{>ZK%yDâWG8=/\:RK-*5X7%[?[Z}ϨexSI-c  xUWHA?nЍ<?/{)ljnԌܜ]0i/ʷ "((V|u4 _mnӊ D'e]h:*`s}9qz|b,m "p!uzy!Pޚ 'r@^hʆ!rԾ LQPrY_[`$M aF͜X|+<٧1jyg{!;IyPvWPwI]?|AV[[8* mIMw ,LCֺ,'U/n$tOu"LUSs>QT+!\1ܨ _$sJn卩[:bmn*Zjnr2C/-ӧeO&|⟠AI 9#\'Co A}r8A24]Yu{WZSuVe!b7۾jU( 7iq# { JbqϺ©,P3b3 'EeMzDwl=оbiޥ5Q{h[`Hl|vWj5ڜ)wNxaߔQ+e4  (8VhabfSCw SLDL;O-J o oOf-f>cviOATvt8aG,~[$Bᶄm6n{#<+^6UY;g Jv/SB$|4 mB#<,|'fwpF ׶OdcCVՀ@/ku_֛u&8gyBx;v!eׄcIE$#d<Ű;Y`eLш8m0*l23g=\tf 7ww4@<(~tL)bmƒMˁ;8P_Ѩ]STp+mѡ0M3'H8~n";lL/4dmA`^[ڒFxp&Ey3I@qXõU]E"g^Ţx ,:K8EI#Ob:ͽo"l:yԚ+e6MYmC]ycR&vXol}W9ϸ^;"GOZ0{- W#yUn@+TGvu;®8<,RGFbe P,iySp+Cz#UЫù:~5R лIr}B/q˓fGl9W':љb 2j?PGG( dBn-71# @=dm\pg>@LmF73/ գMJEg.AVzACvQW? ,/sKY1f҈"q{6OpK?ߙq\XJ &b5/i < "?Y㢩gǹ!d5_M_`giWt#DMҟG]b =_Z IIuR#m$) 7SdW9ܤwVa \OEp:*Cs9~oDu;o:S)x{72P.(6}^% 55D:S_rZyuߙc|Ѩ*M2=f>%~bo$ԆG_׬2#@ErLY "#A>1 L\@r*s”,·v%/YsX9Ya?߃Vˏ)ҊV4䵃ΙKm138]'`dgvLD]p * 7wQ, W0,?SAe'SK-"ɚiWOc }v8kuIϰVIr/:Cz N5Gx#u|%9k 6tnp4~5%â(=b[}M윘?ju*_ JV5Y[35"'}~Cz1gcX 7K+*i3. g"  gG=49Q._Fz]>BSv{ºFR Q]T/ۧ6ڊXCqkJdVWOOQ^e}qdSV? #ԣ z;62TbڧӃ}+4/;]޳\p"^ fl"2t^gWKHfsJ~NL7:ZCGMe}]#[ endstream endobj 277 0 obj << /Length1 2036 /Length2 27881 /Length3 0 /Length 29052 /Filter /FlateDecode >> stream xڴzst~l5v&ضhbNl5IVc47m5h_;{Yk<3k(IEL.L,y5c{gVFQ[3%[+'W R 4vr7v],J.  GJzWL< @cuOG +/hb[Xi]=AV.b32[ kljlc072)0߅V{ `Pj4$TRJjL\@ELM]C ..j24[0qWPQQ`e+ rQ3M`W#/3; Ȃ/~Vw UW{rXL?NRڽ]obp_g ?X;+,3wۛ:?f"@r( 4K]}e޾c^._s+[/:{3*8WǞ/?Dy<,\֏&7s{g|Vurqy2mcṋ쀙#xY],aj'_]GG^_oGG3l\T'B`Y7 ]_w&Vk`iߧ`4G`Vtpyo ?_$]mm4WY?B /͟r)[Z˸O-/ƟѲ}s1/{{</C#&+A_fV6N.1dlo7z1f&{w/gW8"D!n6߈ ,70+>F<fG;zYYXC6vwhlO;/O+ng}]Vm YcN=2=?{\'r|ߐ=_?_Qts lZVf([4ILAS~d:ԖZJl'& 6dP:K%kS~Ʋ}]oI1S1S#@dPX ($=1J =ïd2( ٠# #u-Df[JAZ4iŎ˜?Gmhfa!4O"# $ȟkp﹠豐iz`0͸s K4DCV ύ@0ڻ/QK$|jU@e+usA۪VRqlvͲ[ki]~ +̃/ l 㑸q+$28sL^w6{:VEMBx*W׌*i:%ꆃnA7N0*FGf*- fo#}"|7 :cd{$_[jvPeͲ Tlo"1;kwwcٹUbBis=_CQjyҠWinY>[S%̒:O#58{09=˕J&׫Hu6M.ke{Hz[|XYy.USe@ԋtNwV OuVlx浺(Rނ8,[m7΀c"UlTJvwdH+E4%`?S\W@&AG@F'J<?ՆFi#fwKLJ}lLS._W}§Bٍ99%Ro40ܾ=5NoJZKZ wu%;hC2Gkx(}Ԣ a{l=P &V?;T58AHשvs"hŰQ޲GZ&bsygq+2'޽lJܦc gj>Zr %hP HM !Y a ;K<2UaA4\Î7{c_eOtl\˕-dk%L;4>aWGk4)+?g4V+;k9w7O_yRO:halym8j.V0S@jݍ"5_1N"~*}Vrֹ7+I 6y34ðѝ+sֱ[+O0+w˵ىΖTRʔ,YehPlyN5*U!&C6*O--1>^Ɋ4P'V:3)мz¹ )+|9 ĞDgtiNqhw-̘[!˳ZC߀ARȌeN$7 37?ٮT`'Hh(OH4Geo j:Rl\9bq5q1tX(_rܣ 'Mk`sx+Ekm-ø}W;k:7*55̏TWQ5‰vbw)_'ٳ`EC%l ! ;nM2\Q>~q'^kGz`sjG׍bϽH44g:uP,*<9s(O">.a-ާWkYRdIb:dS]%XOQxjΙC%jw8wSJ~'P'KnM*hBt;ш3m(T<9. _"dF}Ip[/(a4:U 3onaZӽ̌K S-:D̶Ցn`9:X~% V/jҽ\k]$G 33OniwF\*U.1aFHu56O۴T1֔^|Omۙ8q5@㋺a$w[N{Sq#d;IA'JN3 _gx '> #in֘*|Ri2g=3YDe&Hh۬c_[~d6evS \mtEHۡ ;LTC*| =ߠzﶮpᫀELa)I|9_' x8qOT2E27mwJI,pyv:%vz>e<Y]Ja{JkDrjgob%[}\%|{|H4i]y7 ]Ͼ!g),8,*)ҥ9U40} Y$&L-)&ML/È|# _KBEdsC=e6RSaY FK[֍$Y: $PqX<2ۖ5'37*Hl F2ÄG ΁`z J%2agDIZ+dDAZ}N$UO".r~,*-w 3<7$1 [n*]VJlȳ:p"˟ X|ʤ&v a³Qq,ȋ-T|V698V WA*vsHEmv!fyQZ#iL$m^`Iv#T+v 4}PE70ڪz![gr@4Q0^':[8)*Gw2]d~jpV:T}槏ԖLez6 KӦ5-|@I%e5 sWg96$fvzH,joLe$glIoDi@p;{Mp(RۡT9?rʝp H/▅&r3s|/AU=,B-}N13h\*~J{+aƏ3[#"0?ܠsU \'W) [$R"4g3HλeOuɾC\OGm:Ϲ4껄H-ĔPEgr.h ]52DE'i_LlЏmUx+  7N C<}Jf|jLJai L'IC|$5ƃ&aFWCNh>]w1܄̪EOOT= H'n0̩B>Wg?h~CVTn9&ĘGKJ `]ޗVbNz7 M>х7M1pjJνp>{J&3Kv )UM8Y,`DU%*%MMj EIY\%9R՝ I5 Ta<[@\.vjƵ쇯Ym,ƽX_uzZK6Ơg>yЯ̰`wP+z8 }Lk`reVJk5\a=ߒ! d"sBg\ &~ 6.ϰ=ou.0!S1;H|;BS;dd|^Ũ{LN5d,8w^B3] r9>1{ZXkh^x abE{Af N V$Il0ڊJy\-45Vo)8:Gl9Q{mx3BlH3Lح>ZY}}R6jhw<,\ǀV )XޘCT~ ߫"'[b HJ8eC5cxפ,Wh2٧'xյWε I;L9ms$aOܭ~iXhB>$$v]q9-5g@G^6k]磜ĆbOhG7u<_8& L(h٦_jgU>`Ä1;2D$kfÈ%~̺YOԜe'Nk!:O)׎9Iүp5qB͔>B*wj/B:^w6I+JEJGB{ق+3$o{)ύhM1pQqN/ZujI恜2b?;rnn T="$(XdExN0n\ 0؅8}ԋ_ Hll/&ظs?C. [ȱI<&uz0 [`C.VQQ46g׹jJBwoeUN9v0ZlNL>egf|GR~¶"G. 0Y9j{LߤOBGPޯdipK.3|WOuA7HM]chn ѡ9U{ƭEgTp1wZ= o̭k6:V-AWFp,[_I{kk,E.DNko~8q.a˒ ÝynF&b)ӵ /3.C̶EԓL6~֡ @!~ 譍E5wIó<]¬|QP@Xl[(G2`"Ԯ>I"SxH*xPL9NauUgI,’U{Dž`Iܯ8gcCt^],tN{qJw] 9/q-c`g_)e7((n e zy>ZrlA* dc$bgA/~2 U˒!*?,:}se@c_nRJ0 HD(&Ft[%_DəhM@钜d* v܍@wFnB,`(엒y/ӾS8@5Cn=l4M_O+ҕ @iK$k<ȯŨEχx]!(3 Ѭ-|O3HAN.WaQMꔬK'P9j53 6%y*NZI15#k?OzybII']E#:0vyD@G>CǷ;?6&FI"^3JO VK4lW\XAi[fҽϕi+7-H^7跺LnĠe>#Hr0R¡ίlM:*LWh v>hdQ79VlV!*-quF6znf !)[qnGP2fg\Uȶ{ 84kfx [yj9kGJ5gˍZ>lɾX~HTSN6Q)am$1L]͋dz?N ; teuԃ 0_[v(t%¿;O"w] P`gUKQqV;g󀆫ӈSdC_ǒBtMSsլ~|C1<}ź؏XTn脙V\6%̘bEӅEJPvX x𞷿RܤYYVG Gu5:Ow:O3oAD"\V"EC7)tD8.c<I"wN:[i=7vMQ]Im훵DLp .s3F\UrJCU 8,Ki㟼F X=*|^~4 |$-ǽɓnӆedU @8컝ʍc|^_9tD ,S`k\-HHo-GOw<'mV `Ǿj_}yQщvR伣p7Yo/Oie`#*FUN|˳r9_Pn#iKbh@g!(2tW4%c~[MƼM,D&Jߒ}I[^8ji8J5P}߇P0C/*pq/Gg A`cUB*EdfT67æW)}v6׹ fT‚*ý7 .|!*C:9%XY8#[i(R2Cp~8 N>M3\rsM}W)I`k݁w;bty*Zq7IJgz,5a}4} vV~{-6;`#3]|[VfG,aTkp/qਓ`2Ey3 adYѯJ#sQlUqtU4*ǷJ}MU<'zWyE ^n2b}B&ۇUTdµ6kjV$ \6؎SCyx Uw?b.~n!IVDUUzmeQÒ"i?Y!-K(~k,]Yn޳Үp`Ahq8Tm']UMP)ӮGz Y ItQ{SK᳈…#A#J[%dƃa]lhmPă͔(BFrlà(I1W#fYAa+2T)>dMt`۝O;Ẅ{R΂u{B.(|ŴuͅnQ6h g1deSF1Wgdžqx.X-u@]K5ﱯCLwv¶d?.Q6>.r~zam?P*=Y 6݇X.npyϴwl{< }ox.iRB[(a}GeF_HuTHjO?ؔYG:ej*VJBÝ +]i-Y>Z!U!7-eɣ%x\yP'k7TbŨ "OK`I?-4+ŏp17! wjp p)O3Xؖ2Q2HN雌iK,vǤj [υw?:VF;*u{!{QOET 6fn1j+a2=}6D[ޑIgbz~2-HBʗ yjofP1E$W|V&c ųhHn=o/g<+Ǎ_#"%lPW}pӔnq4 i1Sn-աayżVsC|7Ǎ|Xʥ~afsZ3 %3njT3_ne!#7 EjNjF!^-ڸ#/H?KuͺvITSw@9WDƥ-J!vJ!a@/GH<'S)[lQ!1G -|dx鷛 ~Y "*ŝr-@[`?"ΠslX#~›![ Td]It>&*-ye^(;$#}+bEHuԲn24Ѽ>`Zv)0 AuMmMBF3VX7 QPX "?-9~:8QoZMP'#*JE1!FU?oPi˼bx QtyOxoZֈwq\WY`=Ӱ6 F]J8 Ե@rQh[] np\`*B丹#i:Uz Glo掑\ts \I4&M\XB6onɈXh 9[,ZBrύl0h?m. -)}] 55 :j>"Vk"dO@k\ϻ`6k%.'Iwu},m+Vqb}m.d0OžYDu`*c.`!R$vw V$KZ}b.1  *wsZ.( OX?; f= {Srhbp/kB߭~acsܥJi_q,M(\ A;zxp\!~fhIU0R.(I0罔8ikƷNɎ p.,(PbB0`dY`wexbCdA֏(M5jl\G, &\4{~unLLdXF>՟Q)/~W+3DҹkYDz@wӀKF%=Vl(YB)ѡ gӅޢӽ|Z^EmFUySE;y%hL:+)-K6 qJXe/i5o,ȌMFjZdKL Uay5cįgT@eն{ȭfttq,0]-K$3l)9aGhIqW TN+{Ž'"WN]{?luëeÖb!:IQYH?|HSNPiZ-fUyrMs%ΪM$X&͕ro|ucQ%:C})T [8L.ΏFϵ9>>4VY3E+4 ^דhYY Y0 s y}DHGtKgE&$Jâ܀h`j3gn ^@ty[?;߰#Nwz#ϊ(4]tr\M"ٵ_<]cw@/l4]Kl|}b.HEgJ G @f0`$3I7b7(,FB}413:KqeUIojJ>bxB us8|4]5 UոT;9v>hkulxpKV쪃dmϲIΒPV'l1'$0McP`[[i4|aty)dmS̸H Pf Y3~Sd.XAJb*#\'VQ{ˠ1!Y߱INJz0\!<5UڇAn'bjz1Lwԡu"ՐnRGvSDž_Z< zY累 ""0ƎpHvpVGt35Xz,͜y0#fXG<j~贒f^;ZezlP%s"򐁡U+sXzF=Pj]w9KZm?(WX Dqb_QuZtKO<^0Tz=bJti[vGdϯ}L%([Q\Du1AhF6ړK31\o}P or=&|is rdn{Iq5X*؂mkJ)T◾ә[Ss2?%kN c MT m9okl~_-U$YkVժ>WWN?gu'ɉL͍kWMNq X@Ye0v"ykB}۳-yFۏ;o]勭j`8v'bD[Y33~e_f9G3"CA")!; TiʛY "җ(}iSo RuorVUf+/c 6(=CڏC)*Բp Vd"0L9 tV\䫮Э~c<~Z v*׫JL%4Q @s7B$ٺ_m5phPOA4tk9ft y$a׊^7,ɵ\(! ge3׆XwVo JZu>HCDL9M-ac>ec=T)%,?<0t} Amf+Oݴs Q6M>뻅Yc㲥3j'?sOk I5&u_ |*Mj_-񌌇 RG/~%3ږzS˗ώ pbݍ;geЊ:rbLtXL:sHjKZvf`.rGa[ +H"[z.*[ oP ㎚vM@jRg]FFvM@$U]j=B5ڛ3EXVIIoIΘ4Bs 6~kz\R`҇ Ǫp(_%T nTx!:#$mCGNRѴTAg r\_ #"w@lr*!'>ꢇNOznME~׵B?1UC)Ѳjj ܍or]mdÜm&q%pijAjY$Ugb",&/T_ςь*h3NGCB[s R ~ *uj%ERg+f4Kw䎄RFD'kA;i6@MTiw,1y( PG ,_*}6H'ـzQv=U/W"دЋ֬2oɨ!w}ʼn]7Pn%3ym|_7|TdK7rHI Ώ_^ϫF-|t׷H Rލ[8#۝c3W{̭Gy['TN 2szn*l](#c_^03}V.9ǾjT5ӊZ 6M`mNYi~+D4 x`ν` ƋsP_4SJmv%Ck۶m۶mƩmM9Nm۶m?dEחT[@z2l0oV֋10]zIۢު ߦ^? U3S)ë٦oF9e g W7hXJKR&dA$w >ARuvf5_JhuB,EM?SPa#մQg #`Cacd~/ǙrBL9@ ?j62ގղ̼ͧ+y7_Ts-%f5;$|}FTд"ٟK"n `O*iEr^}/n[>YwcK\:'=fZ{륐-w]'kWǚL#NBVgapź3[oԖ \j Ici%] ix5oOB P Cnn>C0 ;yzwQX +=7# }(SCY@ɭɉ>;Cp :ddGtL]e=Iя܊r&Y0OyKԃ0,6u~1d!LY<܍*)fC)5ʨg }R$G^ (bZ,e9}"]&Ʃqv'j9='׋;Xv&f$it kc a m{zjgs8vΛ)kM3I45AtdD̡*&˂Eꀦ>E%W3b4:5DlT'#!)FIOڞ9)$6zmE* ЌA=8 r̕;% ~cP1CK)㻇dMt|#lPxj B7xAT$S!j\ߎ*2T^}j}ͩRXRo[L5l%ə[:u+8n0pFgՓYDLsORV!`/"!Dm uxsIYgӑ/p~x@f`{0Y30c}y=~j 7<ȝha(-D݁W6;h҈ҁ,ir3 72@9z2 1B%f[厩` x.-Yy\10H-dHY|O]LvA#v揑,+s*BQ@!3NB4}|zƚ:p9Y-\ԝ]n#&O6m@Ojf]LLx2F_)Vv<bF5b?j+iɜ-/xT:x)87j`JO ~h z(DqR^N }z亀Fی]ͩ@*tB2uo8a5ͮmܪr.i֢I8+a´i4͝9Z6=-B=>a>ʒ׵Aqϑk@̪sՆ!b(٬y*P1TE1_\` 廂Eݎs|(b$&T'2$|^0Uln6eԓ6 ܿ*gѹJNSqEK,h;2UشyH-o1vG80t|о9+9&>iźo0""mxHy%M'EYl0 foc#$<:`ld U>mIL$I0=߶xT*CBD}]$M'g,Z!hT;ghN˰oINm5 [ j#yܐFؿ}uPKeiE+d qĉIݠ `@WT% B2 FA2B5e.&LwxìHGbnRCu%nVXytq+>%7so/ X|U=ov A'406׃5Dmb J_kXySrL>d bP,m$j\)1IĐ@c$!P礬mG uc%[Oȓ' #e.~h[VTZ=Geif.6/n oT* q>@ ϱvr_~IKއܥ2CUZuTv*#9QUCJ:;|h]"Uh5e=~8?v{bФ j2v@ikcL7IǣP+0Tr&yH!]R^_Fxf$1$mb,C%F!\eKB]6ѓLBen6'Dm8Ḛ*Yi3\rCm @[%ߵ`^:WXJV*  ?&92aϲikP[p' c#zfv^3fڞB;$o*m~RMhF*o><*bI$Bș`JA'(Yzj=Ae]KIajQ2UGC)+WGE0e|US!f.#Q=^ꑢv,ⰏQ%* (4 AV._U?6.9XaLtr:W"ve)U 2LDl hmk%[t#_) auXUE2w`%;\XG{Y DMqt$AW)_?-#HWBc3[?YQPo51eϵ{;k SQȿ,(iY3zK]Zxl#h+Wv$lVhsAPif\dYcb9Bx cniӑTD/,c,1.ImN".r+!V,Kz>|ʥLJUKf/:&:+ T&2L6E+'3 gNrilnMŐe6itEUQ@1?I&ZUUbj_S8$\q`,ƖG TBBeobXuGfCT. :`1ޝ\4ZN.7nҎkc @]l aJ{^S lYDDA\yCZ';8.f>謌֣+3FiP'Ig~C%4$)[~39b`Lr[ϸ4^H29& +U|P9W*)B)^ʕ4/'ϫHʢ]X3!9T#&O=_lhPv}2'Cc*=XD%pPیÉP2HGp)jy(Lct*')<-vMZRFTR],"#R*b o(g%a AklE5xQ=uB]?4eƋKWi,Omf[i{ &&Gᬀ o®zL #׼osppgiJ(z"Rp{c5 VZ%I"JDڙx kPNV#AIX^^*2{GD!a(U.i12O(度\AVX?/k1ifZQwi,ϲ +b?WyL!+`+d/\bdF7 "4R _w)Ld7{9w> ~!U<\YnnWsmC8P7dgb? NOpkCHd^9*6X#ѢT~`jRo|(KDE(01/\! |K hMLj>e_9\XnEmuzQPJvDZclSʔo x#Teޅ_?{0CP! ¶#CE\H;]@2Sَ],s#5#S6ns[;+{}OlJrbuͶ6:|Ӑs}| DEx闸V͐syTd&E>֊nlܷ_tYmZ Xw0GkBeƹVX`0(c%;r@J">Yᶒ;GLwZ aóR&[m4)yINnjhI\jPzB+SwIqw%Gc0_Vj 9`RZ@^ a^sUA }ΖST|u!I|,x`W20X\bT%E 'D%9*DC6E24RĦ6u9)sI^]-:Qӧ^?7vªׄ+ޟEİk_MۅImS njEea᥽VbuEBNf̠Ou;x:y+>F{ZKtæ%4зiHs5|Ӹ- =lGJ>gB%xv8$>{fdNPQ~E;Yk#i`l: /x6fz9?aq$`G7oB{NY4OZk+Oո7rZtްNR3HԿvPJILzBQcPg(nU ฒFw4_W0*%[$NTpe:<0J<]*)V#n:*&z@ڞ_dN]6Y7w3NFWZnǺi0b8-4␱scuVp`{1@1yX}Iᅉꞷ yV1qba" ]F*ogC,yYҀcoG=2jޟzKchLQXI٭iϝN\H)$nwlFum78]{) i}[{ѻr`-,j\y_ɯ0,6>.pN x *W7]&'r3RvƋju9d6V2TՌrx4˙4WPYR ~jx*'.YX=q3J]6:L+"Z.vlgrԥ جN2Lɟ^3Q|X f!`|V9w0~ 蕘(4,;֧Vw)1_Zh@פ00wO&̈́ )i.v=sE:Lɩ /kjZ}$He f5:ŹeuʈLo{-hvI]zt&T-N!Ōļ1!)-{Tj_>8ߏTMV3IЄC'hR: /ݤ,eAbyNPr"LםLS;@ծzEEع*Ni0L6J bHu76]4'>|B?b7jξcwpD:kaW`*7hʐ2$N$ǎۓ74Ϥ'ĕ#:~)㩪OkSd&^mБ(Mы?ehS)82n6.t2 d0i fqUVYl5w"eȸ;#߰W]8iTVLo@IhCuyaB$N̟$ztm<+tKeW" F?[3]?Lp:;{ ("yh!zv2鄿-ɉFa焺:3}ҚufFX?tm=xYg\VyTXưl>ZCi"Ǩ"X[eb\R56KT2&._4xxMJAǤ,?2w(=۴vS5c վdG8y)I{3pA1dj-'Ռ@ʌytWͥB0gmqqm$W%n%" }p:aWp42 O+ԡ: l_k]fA3'#)V?ځ;bſ68;EqZNm"ZNAz#>z%S^' gh|GozEq"|?D/(u_ HÅDØs~r*ܣWNGKSy6^Z|P)E;MLp EsǶUk[VT~0pbX}n|t+>H:0G .Gp:yGF=^Iۣ L,8c/$T(0Q)LX&,͊|!A&On@^M(m?y.tar5(Ryg ht~|}APȗT0/՝ӑYM}9 fuWyq:'Gj ʕD)p~Q!\^zgSV Df>%B`lb[ԕ;?'R #;hķTs$~'AGdӸ.G +X`X V^.̃"!-d?2 7i0:Y/Z>z=*JÈ)5@. -X?ELnkn]=!"AmExn\#5s@T1]d4>j=> stream xڴuT[>!Kqw(nť;w(Z_z{gȳsOJRUFQ3 P dPPT5wfeaؘٚXX8()ŝ.V { c l`caEHNJ3'@bdT@.&j=Edea';#HŘrƦ6 wg+@I rZh@9dPjU%??)2Vuup9?.j Q%5IP Q h߂$+7՟ōojN h,]\ݙ,\]@NLSrlN@[_q7{o%_@h $Nr{o˟28/_ `4A @OS9;QWgk+flg٦ {g+gḘl;Y3+dJRj gϨz=_J(xX!7ٽvF> > 4%jkdl~&Qk,e4Sr1WO%u1~xQ{ [z%Rl߇s^Y9K>6@gg_*{{0kJRYFdfeo`;9{"''}̀Mppu"7Y_,o`qxF<f#wd }/$$vv{7d0K WxpߓWb7d}λd=?;ݿ3q[;$Vw  tqeCwϿ):?@f{,{_X dԴ2{?.d}$LÛج\%s鿎?l@S%)ujShdL94%/Y%\jL!D.P0s&U!HAO/%X2u5G=c?E?BI\ &Lŕ]d'r_K82>Oy;bئ!+o[χv/Z`mrXA#$\wy{6]73.ۍ]$HY?Dxa _z v> OYt^%,I.A36#^Kw-yQ|V:KeSY++҅}ժoUdVŖ ϯcQ`.DD#lQxnj@ŎIGeEwxsNnOndzFu%C("D'yI O3՞IYtA{}Y+~h'4ps=*i.n P~0ΌCB~ JtMX-]ˌSMs(GRpk'4UK.Orz8F-HaIb-wghس)y` h5qi+N;˄Gyՠ).Y#)3 "#W^o)"a~ *AvT\WßL#]7؅<¨FckMQ/; VcZKPo/U͋Hv=sR ©+.cO%53ofYl3P8O( OPp;Xõ 0Ok8ԩX&?HT GBO}lvq^m~Ꙟ1gd 녛"^Zf."@aݗiG @<g,!W^ZHi9^INԹ oFxEvqԹծݍ}nq{g GW2\$I^Tјx?%b4+u.)s im>{/'CޯhnQ69T籨mW',n?^~ E؝dh*Y[lP]?EyNiB}Rjx-%1!ǝ6H$WgMQo$9s6ߕ `3 ~13]g0;hÂRJs=݄*u4y2\:t8g A)r(%G(M0Ez;^JYMy#-PW落.\=ZIqls+Q 7j x n_>BmE|j;6"5hX1N5=SaCnւ+OI~8zf>#qz3VE % !!Mh ^%)!GcPPZ}}_|ݮ_B]MxC?)r!VOH[d-5ML(1Bk VH,5KGKh1n(S!4ݠ+hTQE#IOVKCǸ 9|֐*fLB4# Zb*m޾X76N}?Ϫs_,1PP- "i}JSQ,?J5 p\"2 %XlMnt9lFqJ rW8cdS0;I +/NX"3+pWv N`p9s*I4R]oqf3gOw hJDk{ق+_>g'J %Jr Wwv .[KBsNr;b*&iOOb3]IRتJ)7.͛~T_!vUwh vP ~R*6iD Ozp3LE;ȩP/Ǽ*6 `Yoz9wPi=x;j"ʬ(MH<ڐ h4VnqHshkdqۇKM@oVJr.e]j/2_5'G?" 1_px J0/iնU0P߅MQv UuR 6]Z]:F):ߗ{x&v &7[I{w;A%g0&1i/owp-5fDjT0~DRC8Y?9~Q] Va0WaO O2b^݉g *dNo4On|'(Q"cjg|fߍ|Iy2vgt,f.VeϻvV]L;`nuዺԭ*R V6ӿCKtFLΞ!VD!:x_UȕFNɤT6:am^c5͐wn!Dff"InɊ)c+1dw,0 (z ~gwQj,< /`eۮߧ_=z ,RW }oZŨ>ճ6T ^qwa gUu8O`EL? 'g3q@:MlG3Stӵbn홨H\87Dv$M")6'l5wƕmAOl`Eb ;)ëDWHަK&oQtN35tSb ^?=:i=yj:8bcڧg ;*gn>rFdG;Q#diDij08[1椪tҮl?ksH,Ą<=]RQovnfk^X E.j( d7ck73VjЭ۷J%Z CX*?f_Ȯ,- Yՙp&b$Z4#٣=%;k} Cop !?/%%r 6Tg IuOaOЈ1) OA)0AB&#|vł&Dq<"F!z.j[;Z[ܦON55xҟ*͟,^EO Bi8Ϧ wf[_k3΂t9!TEBw%@+zůȠY4BLZ4ƒA5ORO) Z~kgHiY-0ruCGeɻ]MpHee$:NpU0c jEԛ3\l`:wf:wEJ-7k|Bp!_'^3=4VLGXZ5>]:8tSv@ah'G(;>TI䄳[܋Qڎ嶥J~ylC]L@`,ƧHg &)\t\Wp^X7jNa +DCCޢ1,ɉԏO_{x#Lڢ0L5}]I yS=ɟ N1#0S'$Ͽ 8XZ^2 'H֌>Nuݤi}CD'% B)i ыqg]ǚ6`K%>}׮gX8*l蟖y&8{Thbiõg#Ve+n=;[k}&G_?PvWȁRNzxnQ8?=/Lf0>Brftr+>4 %J~xQP1h*GӳOIYh/I2-EOD>4?ab S)WoLeXxȿY54i{9 K  KϒښJD*ɹaVPFm-6ů\smWa1<+eɹTYmBP?߹F}4<DjB\P=I7y-l YnJ: *`ކ䝗chy=bZs^O_\ȅ {\a| 23j Cz"]{Qc؜"$BK]sq޿?>r:$CkG~+SK98VohwߔSϢ;b$vP߹(˲>Mg VVtkmh*y*VJ4f'Eq={s]NѫD:7i-M%,ncrشǫG˸֫μv1x Ř?%-P@N XdE" "{1raX\",*vai;Hjb2y-嶏Qfp]0ᲈ|@`x*r:V 84MJB$iO/Jw9X0siA }c:v^l`XF=OPQ9Sֽvf ðNI=yLXU MK\x#.㗟r>ng vlVNyp{rqWٽǛcΉq t^b^厉t=kmǕSyu!4 lW4'D7j0>HXђd.cF8I #Q+Xގ< =`gar1Àぼ0 |ĥKXuv ALKѷ#->ߧ&N)9%kW..n{>:l hd*CrM g!2* jk8Ne& 'R\ Rߢ[p͢Z8jo$QWrR^f=qA$},1 8~mx^S-%v.!oXM/ ⨣ 稸}0gr.DqhwqNsf7S;ʺCso]/(?X}+}@xP/_Q`v,5Sm9zN@r(jyHGbh-v惍<[(xU; [NM Ӱ*tip2/ztA [VL^ޯme=h=?neE-+)$3595syiN?OVs$vZ,^Lۺc0c0D[<F_ wٺOz4{ vKEav 8m09[oNT2.'GIyhHrz5ov1W'VVNZm)YKenh[{7gHN)_Seו3y_z$DZ̗%,}yS30n,^)`2^_)hۓ2o{}˛4=UN􎔸og%hg$wHouNZWhu; /d%ZLa6:)G b @㕻X;0:4o"05M_f2ef'?dy,6E~bs-+6B|7F7-1I#fGr5W /<;싩[(]~{ذwڮtW]6e]yh$'>ronGģ罘Hz WgZ5ҋB n8`.3ZjPIO7Rk(?qF_]tvllZ tSG$j|QHh\!8_Ao>؅2 o5kqsשbY[= :D Z/F$۝>処1KIRwn1os5 n]!S=)A#Yif6Q\F}T9p\פzp浣įs< /F}ϙ[U}5,Wb!r*G]Zhie8]FKs;')킌_A˱Ghd[gw ffZBAשX~/[]GBp݈iZ͆ oy2i~+|->jZÀ[00%ᑐS֩*Zuo&$$7وశ $TaUU(?Ow&`/?2}kV\̳ǥ-[_?y#nra*c3I0anF51auE0{VR~8"c!?۾qUy\OMHݎ"H!#OK O8V6C~*31e/xP @d׊E!y~Anz#5MV '~L,c62N҃= Rڪ5k_ߤucPԼ?Z&o&EYMG*laH˕CCEhv)q8p" /^aU)Xdm6c3 1CA6:q>G>IYnbz7{j<7ob ;d䊟oqa"3>/D6}rN _Mz YqĴCaG#̸oMBf*|es jPevD$kru ԁIs1'w1r'Ƿ*%&f$ nL%e[!/Lżbrx!_8negYMv>c4c="w*\,=%|V}5z%B%2aO"j4奣2kgx0u+բO,y?ؕH#0l_.s3}a+alMH~Ȓ[:%CD@md(_-cF|JSu'xԊc MXIϲMz28y(@cB~o 66T\z .l^ό `/p`_g4U7,C=o8,Bt 7lOM,M=s%#/y\VP/ 7=HDR5T`yn]R#^mA3?a8Ôh+t_*|K Ӽ"rhN{cj2U+܇Ҕu~o%@zhG]]k4)T9=ir*MmUh# !N5զw }@>D ]eM )Q|h&P|VGP\-drA׋]sZ4|әv:?YMoOR0gU>-3x@? Ӫ.$T؀?h5`A!Rf>H=~P&вtoOIdIfL/Bk4?|<鐳Zpn2d9#yMFGٚhm{.ŽBNe*% 9!Dɨ,hv%6\#7tNE|Z,@` 9 gϳtVE0OAXX!$0A<-m*R?Z8qͩqTa_LVQ솗($ٷ ޮONK0 Ll9XēsSK$|XYY*6ctfEt> rnKk:bH %͢R^/ʃ}v- FƞgBV5ua`vIf=ڱ-o'&>dkZ\o5ե!Nc?<+Xk.򇆦_8Qn0;ZC]V[Z4eB9W|P/ EDnClݿUw|Ъf\l#3 [e`p>H&lb*B(޸ۻkkĥQ4*Oኋ,SZmʅtISWi 5I.7Vgf]e(T+8O|~lwXOmV+&WDXѾ_ lXY{KC,$ͩMA6?=n~b5B_fnM1/nI:?3#h |%<ەb͍έVh@ } 3<# ÷\/;>4!_`V 8-i<ԧ$jD(}:|-3fqH4ތUUT砽sĀ PVh_ xg1G -t~dT~!S DLy54 wXmj'~]<4o%w&A1K{Q0/f\';BV jo9<jW1ۑT3+i5P/26dז 2fǖ_{´Hie4OРv)=nJr\Q@vl=D뉤LtpIhח3*LP3{VW%7;U핏'Lw5q 75qxw6 ׇV%,T[OF(~ |mkqqvjJз)T֠C0UIr"T:*#D46G-OVp$a70-IӁ=بA'.<6ܻ<8lkԑVhQsXb&,*&WqZvuYN(֘"]>;}[Ǒx -ݥ|$*&Q$P\\2G"{"ɣ0)dRkc%犎n䑖Pք{6sUCT9_u6[= Xș#@B:KP],;N~1҂/Bޮ;nuf-y6M.39gaQ荻RyqĥEF:Vdk󬏒sp) ,%=nذo<ڹbtC0%wv5Nc=\ˀ^?Bar1DKvp=A08.k@%Oώ-lslk=KlPFdP8U_[Mlr_)i q)imu~&h cY ƺ)~.GF$zh ES_\LE~&MQZu{eO _5O}2>EGGqb 5>CuHK\~'ٴA7 KIj<ÌWV6  Gd\GI[z"afN81DnOsGB0416a}TB̼[QSϕS<%KSt}|-wAF YCqE~́AU(< , 'Pe!DʂʒvxM` %'JY)*蓗嬆O@$̂HiwjU~-:w._ծŒ&s?T*#j|T:JHΔȻ E3t[) zJ\r["/(!P?Q[d/' f{rtfxB| >yxd--pVѿVz͂fs|kO['gaTQqɻE)"q\<QzZ T2gݜI- #xN=~R(lBprlNCF#< |? Jcgc77Y3A8'f} mVȊCxIgkk.Mot>uqu!D nҿcl$,Yi6p-MyG*5uùu"N+jRb1Pz~eO*fʽn*QRB  U Qco u8qeW2 ZtvM;inR B&*.S&eޓIG1qMYbLų<{鬀%W*AI¬oYc)EghIuaL|ĪB$i7d5r,A<5Te20ZXq=H .>qg/t@KZ#fGkoav[H _EPCAM"Ÿog^U: R 6}Ę?7 `pw-SCۃĀ+5_md]Sӳ}pW(ƀ@.yЂ(OoMI(Lxj,Kw*A*=<Ûy;>f藖rjFq_9ggl:lw  qt"b J=:;lX Ł šCLu3¬~  };r+oXp@pGC<=r:^|1@]n>SPhmg扔Z#"++ R> Fw4vWr9΅zyF4IoAٿi\5`GLd SDGWDM'قD"ЩK g6iwwDJG ArvI Uf{vQmagb΢nJg4XZuɑQow~V<6hǨ9pE{0JX QW1w/ZW,w`VQk2=5Slb]+<:m o-CDBv}`Ǐnʨ=fG;&Rv `ͰsrE7O'h_".UYS4^buo{1v4-`Jڵ^ 7L Caָ{FTrn Upi`c5ۢ췵%{wkG.L_nSE ޞʥtXX1cDJKt7 8UE@\Z'DRv0s; uQ L Gm0a$1TЙ$)>Ur^K3ʐ܀{j+E75I. FV(lYh*a^FfLia^q7[5CT q}ϮjS㊾ GFn3q5&fAzi_&v<a$]1f Kt9lazX~ wc#0 ;p` 8uʏѷcʔ~j7bGV ,`1 lKi&AXG8YceJ1;l-\ǯ\;zdk7FnO9dU!mE㊋WHq(x7$Gl)lsZ \+FE{IXcE 9߬((&HO2rc\xLO ]l1 2eQYfN+h=p1bgE/m nf¦JDJH6e5_RWMlܼ]ړ {} ":0%(m0Ӿqbu놠^wL{CI ;dh?i}Igr049m&%N *@cya6\$qI ?a[(A2Eu. Fٺ~`h*Gϒs!h.=/ຼ l `]t 3YCГD9&ܶƈ^J ( 3QQw (z3-?<Ն̞iX, C!]qJ v3 9"DM@kGz?i'<͆Wv>fa9lwUyG4Xfoa8j&$X|]o헩m-BKݰe_;k);'3iMz1#4Xk:3SZՊjT瓛(_5ݾafuhʀ^2- m>UE⑇_>ݟa/~¸dJMv,)twK1>\j#aF[ Ftk._SֶǦ$~<̳؅6EfbuJqdFUCԷqk JZ::*72&$=7iSBu+L,#Z9w\20 GlPND0ƫګػ#0f0Upm$ "!M$4vF95a>* 5GbS4QIqoLGV3"圐لpS^-A `PX;JxƔmT,BᄼOw,5>Sa4%A v3 sMְg]sjfie .͎'q.>Q򤔊QtkLzE57qd\Z9 0qoR) |cѲ0o^kuOx+1~s?߈l^dM)G!#AJJ_ˀ4%R1j1ʹ ~q?߱5fX& AOB\3m˓*{0M3r0PDUbǵ۸dS»^BÔ -j' /ڜӿ9BTP=F⿫S-ZVtzaF0Mm [& -_20j]yqh!{ֆQS^(F  +uuh%_Z=jBRΨWvBqY Z zt?:SRC+<2dW=\yiN)°WtƋ5j  P&kJz$q'\+ȳԃ}\aD-zT?>I LBa͇7%0jpiM'D 萓y.y13qc0gt /)ó]r$<r(gQO?[ &( ,cY4Gʰvr~o)@q&ӗ?,ӬOo ׹iJA AudžOWyʡ?nFx!YUcߓpCTFOG;i3ElXASQh>L- w&XpT4`.IŅ- W46} 5/lowAB▲Wu+G~k{ 4;$=mqpą yJ"DlƖ5x2B4)=Mm'4=o/#&xWFA0j=>d4QuzKIT7n VE38 >}HZ%LE£tsz0A̼;^%u: HObEN15:WYrܢ5̝ޣ((aImؤ-m,Ά8[ʿ"&P634oMZ,~ѥ ;FAɍ̚t5?f`;f Q˷ҕ(DC;aғBqڋ!T1vn2iea-=[4E}y!9XvUS E-ek?MocsRҦ'Zo(N,n] fYcnK(=>z|?P=(@迃_@~+*ܽXg;LO/BTs.ߵuR8L&6ֵ\=Msw8VGA\.:dVԒ/]8r~nj)qʂne_vHiJЉ =t|w Ϯ̅##H+[l1$t;o9lp aAVm":P% V泜6NJ'KZo ꔎ@xm5@ G*Bw#w =Nj:;z +ݙn#nha}m endstream endobj 281 0 obj << /Length1 2066 /Length2 21760 /Length3 0 /Length 23005 /Filter /FlateDecode >> stream xڴeT\۶5w'H] <Ͻgs5}}X9 1PDS1sffW0ttXEF K{;Q# (S,LL i 0AF@f_@Dolڙ[SD<,-@`ҟla% w{7Z@ #3@ PSSVH(+)P3/`?\DTT$Bb:@BME_U;s:O?rbBZb̌`-/nS{O5sradtssc0wq1;38Ofd xtj{;A-P&@;g$q9m[n{#@ִW82F**l,@@;#;@`hJ/@ӟrrw.l3]/#>1#;g?mbol r׊@ {?gfiMNH^J\LE^]xvrݱc,`z;kg?|Ƀѵk735ySF5?6)~7m3L#nb_jcfc~oci|r6r@N.@:3sL-M@B,.egfGT]T[jjog03ۃA=Z.66F@nZxGEhpw5/;Tdb.2z׾ @/ڟ ecgx8.Jk;3/EeEh&fgbojig`a99y3KjS_Z02كS. `0 1 qFF#Nf߈(7zϓqX<{=O߈߈ho`4|7dhf{UM[׵|/e7d~/e^ÿ93:, *ﵝ3@v|g/RQ3,Lk"a5P 9#%f~zQlaa{w/z6&={Eg{l>ky?MM>[6~˟(&f8)הNZ̘h+mP &E$or]&y0ZSRඑGd1uLOԇ9yZlS D#֎7O+n3?0lP>.L1㢍i BG?8tu`G gH1GpCl|DsݽOUHP:Dσ˃T|R[؁lo ]{iHRU'iȸ< ^)BV(HrI6J~>,Ѯt$esG=,M?\{`5Ɏ1Zzj xR ϡA.BΩҙDf{J=3;W>^x(^{}0bknԘ#HS/F6+Uͮr޳Fbb:6EnY^]CLpG߱'g.uSt /T5,iJ-%fؘ }j&ڭʊx"2㮸W9,?;F5SMnMG' (ě{خ*$-xus݊3rs#nj{+~Ew,;]np񭆌:HGQ4Hj\暌+ ATنn_!s֚2ﰐ5MkgMGffU6J.I SrWqg6^I` uQ RKrh7k8)[bu;IKv zш MϚ1{%tO s"6( .,2Ļ a]!.q~W59 bE 7!nK6vQ6l3n\J4,L@:4džCaDx:-QgU<ZrJԡ0 gT+a bK$XAjq VAh1H^:DK+b(@-qÜ6=I@#GF(*{2 D˜/ԍiI9\Hώm*}܌9E`uyQd$m~~OH_א#$r`ykA#.MN2^jD杅"Մeq A,Djл{DIu" k oePטDDR8esZu-R_؎EA9K)ֳܱ#YIٯ8%ZxJSW7ͶYg]\y4ʟ-<.U1*DPt w'~EV=}:յ q)PsWo4U2Ȥ5s`s,~ Aq x:3kf↼< Z߄V9H1 %%D+P   F^("OSwi+D 3 B/9jTBcZSe l$Jun Ce,*>r|FB׊R1zqo6#HErycXw 8>$߫yװ Sp[g}ZDc-`d]B#u!bEjvw_yq4J3 $Jb輆{S|;IntHCm`%,ZjCG{ |* a9d<0%YǤ~1,muP%_p~bҾ+Q@ٸ=sҦ嚔$7Q֢]W*=2Q uoOqaT]a$iik2&жԍvzj❓S 6'`D^RO4%RWQƒd X8 g3Uɘ)zɴwcrlf7g~FCI]t~P\rt؏ ̨C0^>샤kA5(|_?Ö@J+mlK$H tʲؗ-ZNS4f3>P]1^f> XaTUlzmvCPuidG?$g`FGe%)MG*,%c˕oתIC9O )GhGI7WLӊw+@%< &̄jyLàřu.ae뻓+S bW$k.%&'E,nM#PQw}k*'7(Jc3(uLCcMК,}GN'X3%$-'~y I(|I2Ja1. 7|E>8*Rn]iK(cc"*1~ OtØ8DOr4QQxN7BU%ccd\0胢x?Cy 7C@ q<אּ"[#8+m3*CTW#AMenp *X Nɘ M}(zV,&Å-_M(M$ %pgBz:˚tT{27QGn`P'ޜ1һ8:@Q92ߤcO]5Z"!&]ir>Ɩt^ĄDY(LeS}KJTkg R&]6l/cQ=ݒ#-2G3YzZ8?O%7 :"$ɫcWrM!w$աNj^Ӵ/̻|R~?'?Jm;s.~ٟ,622}$$btj`6ǗU'|h'da%ֈ$^{ßUK }¢C߸]/SVl܋ WL u>Xh.VeQ1A`8m+4~Xe/1|6OϢ?^0y{E͞Vz8Պ rg@v즻%'ފ[ђ䧯jw[?sX |YF>I2cҔXaR" D3 Pxv_Yc6w*ņ]$^ X,<|"K2ϊڌBj(ǍW6PQ>S;wY ;G \}Iy@K`_"/%~H_aY~=\J }|!Ykl~&2ʧGPڳJoHϵVlzs mP|KBZBټFXфʫ*!}ƇI{)DI ahӠ?hWL vO!3PbhjO7ҌP؏07N`9glA: S0؅lCql~`>2QD= 0BNrѧ߯F8]G֭UXqyx!+NuFL 9̤s/PLɭ*MddW,5{Ţ F% :W1mN5tɓȯpZav8ˌV=Տ7~pDM KAPL9ƒBT ̒l"P&݊Mˡ`V3X<]U;lG6a7+Vm nB\˺%h>1idѲʢ.aؠSlyn=Gzb܈l%ɘ~VoC*ljD&/t5U__;=*5s_muj%E8O._u&tG[5'HSeN#q~Dqt)3ŭS?)na Dw\7fH4f$ݠ K`6k>q#Yj,cܖ+3Y{̂G/ݐYHV5fI;%o6sHo193.˔yW~a{aṊFTq~ wo Pn6g\ݢvdКu4Fg7XVE'i[*L޾ P6u%2ʽė> N>(D)Ft+0f&134 e<`HY͆/]7ν9 2H̡ !> "5tcH &S,'GH`qQCLxhy×:!mB |0*P}17RMCwP[$H4ɬ .gŀqA-T V(*Uj(õ 5Bf^΄ע4AuT"ѭ&u9J?Yc:TpR,4BP!i;aQ#bp':%ڤI+SdDgCYU^.tSf ~LD 1n/]6fAH뀽E+r4 ,씄b K9^<;>hKi62Mɍ]ؠ5L\ϖ^+ˉv[/)CWϽ=b01vVg:O@Fl˖tB@SLJuTFr*ƒϣARvahbI6oSfB##Pc"*=J[Mtksؽ7L,hE-}K"v%CXlcs|MR;\0n,Ƚ>v6ZS#)xb+  ǞY ڦZ#+|lO6],Ceu`$S4BdһZl0ͺjE"BF4hxIC@U4)pn9A *u򻤝K5cW駶=dC^UGN@PO0W+~:tP6:ghǴU̷r&KF ;^D|#]t$ R.>12IK(dPqOIYGH%~?^H>b?5y#Q d%/j e+!}y *+!GŸ%DBZ&j)rHqdݠ<h/|r&TB5y4eV<ѣND!, tWt)C⒴~Nc?su"2UD89^6PH/;%=>}I{<>Iچh -=_$v?.m8JlZ0e"%I߯fi=qP&*'Io<8q\NrX|jU_4Q|pk--(<3'PdgvX5EGl]Gel(:c!Gۛ(V[g[Y6W!X dtĎW={id\t~$Q>|NA~Yf^0x<%o7Y8Wfm ,2\`]@>-tvzHgلp+̪6WLM>6H{S̪>Q"o[l6(`$4W|O<8`rߩ-L~6фn,-|M/h>ǿczҹkno6I8F:m\R~3 %l˝zCw سRD' .2~jv<(@%IM6)2^MK}oo7UnUgwI qK0a5+DSV#]=ٚID'y3)UHTRQi+)DKvſ^Kϳ2-5/e>LWpzخxN32H{RJ} /G.*UŹ[nV},y|m]7uuu3fluRgmǿ8R'2i`s54,Ӷ]t<ђT?IԴ ֧%~ݖrjYnUS*KHm1g7M^2:_H R0ܵrhCh+Ӗ2\fXg,@.BY:er][OB0i./wG%tKv\?׍*0[ աyBYWDw\2퐎\vY>D|: TeW 1PKJs9{˕7fza'ZSyIgX}gkwAaSmB-o> MYΣ+kF-9=p'VE;m׈tp/lxb'#سyc+5{M?_8lAKdl[uppPJsz"::rϳ$r!!ct,]P$zly3vX?CIe[ap~d$W@2./x5ueI! W%HꆇFN<othasyvEz^*-7gwB P#G鼗x'4UJD[S/g,iʶ$u)=qqǸH)SkY,H}>*CU8~e dB&z1Pțk | =.'/b<)>W@Wfٷ iK,ߴ'ܯ9l vQƔ%6Qq#YP@Cx}l-I!J[< D0>n}I3(iݶ-slݛ[^o4PX܎LHS(>98T/Qj(2 $K)L"_||ջ; guq,3['BqUQqjXy!gf1d5 5H~0raDTAks O G3}HnCT K6S#%[JD3y+ % 1337f*ڧ-ϲ%Qv‰PRиW>!`ߤ~6*!RN|:Į2/P΢}YT!0½HwB־FysV<.N{D,Yfɗ(]_|:rT’>K5xic28^-,B93}gH s2#)*gQ447<٬9mBQ0TP=j-!K7_.;tv7~RC[w`K,3$DWH8h%o qo>+:hv?N[=Xץ OgHXqgnm/j{Y% :^,1H@ӸaVWH5Dg>X4qDW3Łs,n*ɕ \>=0Hb&5\Q q2֓gf<*( O[h ) n"Zw!C2kAf̮6%>ҹmԍzJ_D+hXHshNAKxz ض3͕(M>wQ\x̀iَg^Ufz'nKq8ez_{? Hui";X[|r5  syD/pHU/ ~oIM̠4v3dƜt:ԉͭPj!^o2X/S< Ɩ)9ȇa/r60ѿR& 3G@ELalGיUfز`͊] Q=]':ޞ<ǟoI{z6õq~e[mN㺑80sq7qewhYI{hadc.[hƚRd^wbLqo}8ac01wRmU/^da W-NT,Sz +Z :#09tUYG1~}3[Zș tӈђKx]E1+TrJ<+VoxfXp7B~Dz@#жbetRN??Javc?Pv4Ps2,T*,:Σ*N!l8Нrd1,^~W$ >pg Ht@2SA?2KǬJ2~B2EIw WER3\62j  JFY&+v£؃+O vf eg('o̖ ) ^ SbU[Ce.\e$S߿@/7Pũv>Ss*解ZE1R:pK *NY`n$\ncG>ARi-X%<tSj 9DQWj.Ȱ7^#SO livbJ8SsgLt S3$NTh>OGS͢K*F2\t*qs^9Тm#JiEJr)a?#zvqtfb@?.v`[;+YZl}oZ%iώvSVȞ@- GZI#܇ R p*E b!SeN"AgLT"$ B<΢-Dq6cUɛ'O(L8yMƗ"`X{x1[W?nFp4lufAao$D෠\kC%cvӃ=]p&%5nIJv}3o]~4Mt0)T+Y"(Pdq eaf`>CjURj#&nET4OCP~Ǜ8^f n_K@xKv Q:C i\Ci+N| }BMhddD jV!m}YiMNŖUt~^65T5m_[AS'kF7m?X-i>πBTڍ`S 65@i"fnDPAb(nDIy8\+d袴&Agj(#m\j`m; 0<ӿJ0ZE0%@RF Bjx?sk9~8R!ž.ق|tY:~P^:$<4dޏ`.-Y7{R詙\K Ua0DR8& 3b|\R_\T^>p1 gKڈ+\PHZv Oc.~-{>؇⃠bʪ;}G~_'8dq.gE<O^E4uj8ehg]xsE=)BLXkUA ZЊVw2iV)1;-XD9u;*NDVUenvއ% 4 -u'ړ(iݬ!zxd~nyvym+$8x80'_Gp k^t 60Xv6~R K *buVI?:!h3Ifuy<`Ca"I.5|0*汸r%mHvnK޳kD4J>º7va4=;D>%'_lq,b*8#Ti;1Y>\2 0H}eY/K(\f9xZ588ঃ@YϾ Ả1XW,^]`E԰:_X`H1~ӇjxڡT .;Yzi=&ۜ&˄Z8s&ڮxV3.z LWk?iEuՊyE`UBw|N4eP,eHW2dViT1r;ӤQ/lʔVpXH[T/dxq43l)b6\[J=lOSCN=Vwo`F[.ɴAǟ^y{;{sx;8Y- j=ggsKr4 ?c~8\"5)ciU'"ψ0&5i;g -,Vk]Z؃JY"H!,'Lfi-9Ƈo7B E:I`^ :a* 21+6?SI5ǔiQ9}>:ː"+^jpmU xN}~g> ~j`Ka^d")^kp9-kImk9%wQtƷʥ?CDO,Gg.y{-R3!ے.Q)+@}9pEV}($-ϴɛPveA1FcTBwˋH=Nxuʼ~F΋ԄeZpa9ghN^@*BE-v r~vKbz9}ߒA#fׯ> m #r02{GVę[^Va\uB>%ً2mJǴV*9? }k'E.kvg0ɲ |~_s2ufbJ!c@-t/!GjɀL:8)1$9q!w<V2Kd}٨V(ܵOUVxz]4nVݒd0o6Xs4֒Ub>Ҕ]8x<`h6TT1?2ݺ,&*vK+'Acr1?v(Eƚ$o@8}'P&栈vWrJUҟX$TpT;Rܖe-μb 7# C@˜6iTlT?s;6rfq*fք Q1\̔i`! B싚f7_xdcgA6}H9Kv6jDOe]9u>6&'Tqv@SؤD~|pxbpD!H#/D+WopӃ5{-A嶻 !ܯώR*+Vób؉;$Uič 'ZlCO X8/~4$y[%Z[*P_~1Q9EA~\\bGlvi1w[a/].Fo:[ZchV;]Fd*Js]ci9}K/ j/{AYH hxA"*pYf4iD:(7J'_ EF!~Z7B`1X+Di1{G$*ln8<<^^b45t* U;6W4'iLp0h;%d>x%JD'CZ \̯ᅯۭpk/_LX%'?6">Ʊs+zK[jl\ǻ6{k2V|jۆEޓ [>uZOPN؀#GúlcEJ '!\X1p%#%%J! a|CPfy@?*!j"V0G3mW,z?0yFE.S0*әN *94oT"7i+ƾ{\](&ƫ˕ކ㣰H@aPvwC`70n`iW pAITc=0ˁROP!!ؑgk|2(%~%FKz |yDzZM?C{Uo&'? WB(j?]0dJgȺ&yC=TpaHu%yfa%eޒ?epf戕<=lCZz"+^p'9ϳq=<%3^m1c![鷱Mep(;G-5<8b03;*"/+l}uv(r2Cw9eWeZ]E@A-RM<#04w+ۅS;=T OĶ1e)R}ei&Sٸs4**-:)Ј]r]c>b+6ַuU뽢X~}\8G4A:Ḿ^{[Oj$<%:lnlE>?x2=~Ep^02^:zKfEaSqTw.+}Or?k#>(Fr e<'V%Őr1F|4}'eV6BM)rj0ڀ rI#~ 9يD*|/$9VS~0ARp$Aq^0{7)!'&tXD/VA̹37i"Ѕ/2 $L{SW-/jB1FeoFVj{`oŰlDZ:^D{-PVop3[uA0>vĿXwcU׏>d.lj7BR6[xSnyE%rj\S^wJZ.}u>fأ8OܷQO@}]k6{U \DĜ, dV #t}12tc >=*XҲ""iR[ cABW]T_ 7LN7Z[o`Ґ$ie k礅\X!_>U?C{NE #)()wފ>?:)$S@'f#O 1,O !j}rm V{P<7 a\Y8*֙ s%Yj(*;em[4X62 {LaD&+q KZROD id+d)\"g}̦5X\"f+ #\n*'rU.n QwMwxw"X-rҗ&3!&ܜJLzR\}~!L? :ne0{ۊE*@H%sQ>ƈ- nj43Ylcͫ,[0Vn@jK-l LiJsIY~&\?^.yW(֒"e6u3+{;B)) .>{0ԉl*)W1@4=Xb7*VďȐl7wh f疐E/4QFz@l[ywX{3v E roV3;_tގ-FK8:s FBK_Wpmb{$sgr-{ufdT<.e]HY(."tg= dJi O( dL]o [>v ֢Wdw8JVH( ڳa&x$Enމo>@!7#zw3Ѝݠ~8S)tZyJM+qQ[c3*n: CR]@%7ma^§Mg8wǽ8 qcp~ڊ=duq&'dmʜF/7Gr%\:wu>ʵ]؍az~s$Xub[3?\v4+{7ֱ@SRgU@:M F_#Yr(}cvwܣO_h_8]7s%p(I+U꒑2!rE q.O S¯ XA){[m74Z;_=vjGJFȱUwxlH N"L”2w39š趆dN-q_,:(lJXuι0N7w(1qiI7kl@]?+ɍ?Ё:86z&DTZӵt`QRɟ5VG?gt"IuGHrWXnN-d 藸SS/OYXofj]n/jTLor(q̮ۡj.kœa{ Ŵ&19{*9zѠ+㤓*΃zҀ𙗾^/Ju3V![4ƛx h(t6 Ӄt~O1R_+v!*7YVnD]"LLFeaߧX [ȝfVhu!3sJ8FGFCjAͰϲ"%iցk~ȳY0|sv ȭts~n+$N菲zjc] CO98Y: Pf]LcNrx4QеiG^QvĽ@Z:^E5 ѩpщ5 {pWeīM*ѕbUg?tV($0Bə}̛Zq>JnaQM]ۖQ6sBm˜gjMeTJm S{D^ىˍ~LG ("ś0*Ӱ~x{JDgpt1UsOi1~w>hf}a1Uk̜Ctk,)VeMU&~7;n99 wuF5M)Iՠo4#5b`&{(_ Ncᴆ*߃ 6v^+EIEzq )O!^_H\Hf/fRȞ|m^>RZnb'O1[10?a e\-R$D/'FK0bnUBΏrvjbS9ZU|#i(c9f*BW1`4 U[ݞa7”m3w: q RuV*⩸nA2,W0RO1i/e_ ,ܲ R& PoT<7VXg)si| *j;YǬ7XUEo vj3CIΨ0QhNJ38 v.q'"/qZA%ll;[xya _|tdFp 3W@F˜NKиƳ_ 6J!vi T;Y  Z,3˳bpb.YaFHgs咞ķmZ.A29sEf w]퐧khO&m|QvfU„/\ʼnL呵bgp$rSws^RK"Ph}rm=!{ 'gudިvxx\=s3`Au"^@k71Xϸ&#s[$n(p,55vlȟ23s#isM,!{GTUb~m6'cƝlxb$~-w:vRhn,GÇsl^tȱŵ-Zʱ ($#%CCQlsQkonX3 i.FM>v3Zd֐9r͈kTe[cR.IB =?t6P:>׫;i3ElXAT>ɤͬH_ږY,oIyUw{Z29VYUy=߲r# |њV؛EW,!%&8L9/BGț:>BѼ"EJ;nˋ_@öW[,&yDpbꋭ5T^i6{e K՝o V(U}c\ܦ;vguzpRqå"^ۋLͲϖ| Ñr7٩aXkL^ f1> ԯ: $GCmo{f~nɻHdj.$쏟btHA]~(bh6Y©u _* $c{Zr2*9{ե5}Etш[ilE2eOw*r%3ڐIs\G+lW>wsӧ:<.q|_Xqoe"PIhT$Zdwg撏yu;"V:@j?/i IGlώH6Yq=# '&`k`nkeX!^њ8ԝjBM`wmT6!^[d[PNJ#vJ%XqߞpNDNKڋEK˓SQWNfvшUo@ r$jF 1T_hصې['cv(r_T"haY-AMQW$ VlL2  uA~m;_JGO!l+& Wgp*@p=g%pUŋVx{ r"="b':nm+dl(>o;c XL$MWuC@6 ^ 륍qvb{XW{g&)ؚyuiN!OP}SFݜ9vjԓJ(!)3[MYun fOx <>%x͘ .ۛEخ_Ĥlް=K2.L[c{aڻ%.20,( [e_$t[kP@1if?c*I/zǮs1>QY$Npm5Hm]uS 1EL/?1]:@ѫ!0J84{&rL7ɨ8n-]怠HDނ< p7a[8Ū>=%oM~u*GA9'/c*Ww{H1|B& 񓘥SȖO;鬹LJmͨEOVm.-U$F㯴`XV\2įb*z gA_A͑+2z_%nrVz{YG`-X/ *YnV7d~c?@,NFVT kv0=3)>|>rQI<+5b|jp jae{S1-(w)k_?M<ə΋d*~;a/ÎΜ;bFkP|٤;Wj;c fӡRI|v;28BC)+;Wiԭ?{4Ԏ }o؞"v#Dڴ2ӵ&VDBROHN:"e(~r#c?C ܡjGE~rZۘRїA)9 m8 ܾ߯N$NOϿAR'鏛J(AbQrҩtlH ('oc#S endstream endobj 283 0 obj << /Length1 1608 /Length2 7838 /Length3 0 /Length 8656 /Filter /FlateDecode >> stream xڭWeTے-x[@pw4ҍ4`=[wyͺ~ͼwUNZ- Y(% Pۙ;;iA%'#?=# B` a xxBBB)# /?1'fP;d =0Tω?"D<{;`O cWD=P3{jkՓ\\N`7:|4}h F;7?|@ q g$ޜZXMD?v"u@-y!) uxs y/@Oÿ_g3# h.?_D fx~ΎOyO%gA 7ca NNKUgHF/vz' Wpԍ ?4RdEdؙ:&@ܝ*Ⱥi\ry2 PKw{}DCӸbه%LJ TE؆[_{u7y'rkF:W>u<˕BS\(5 M%ݳ.:d&~ 6,zMK}'S~}ni%t,ȜO:*^.EO4J2z+|^}<)]Эt+䞽o4>{P}7.Gͭ ޼"<?`e}|mErZQX,3+2iӾQ.Γ2YV#[ d@WύW}'!=*1gUS_Id{XIxj$)Ṽl)&4ie I0h3dm(@;Ui} hMǕmfEs!Ībp1e:+g~8L\.9C[&Q K),KrʒJfI4"[-dGߔRt\9_߯>r]q"BCiEEތvXMVz@# 4%.z!"Lډ:\yl˼48e#6ˏr:29h+Rxcod\2ed}aq3r*=aiR 8g'բ]G"`X"I |!johORp8ۺ'41NGDA)hK 4Z`7ߴC$ֱH1aAљADȱqMHy/K DIyrئoj^0ȀVAB`S2!Ku0UJuײ ԝ =C+n G3#:]GgL-dt̀Ur?lCk<ޣ$a B(š߂UN9dls.]pr6uS-/kx^*l9S*TLj`_1[1 +#Z䒻W8ocSI~ dK| gz~;EVhrYD#?n=p"5cSB Q E!S]+?=z%5D\Fe\Kab+ x0*dTt~|K'=3@Pt2O Wf/(U)^m+aK36xNU +riNkWm9EecKv6aO9kg d - ŒF~&ݝErsM\bCth&MI/FY'Zx^!+ ]g/VơRTRjpя,d ud(\Pjߣ߸ʱ ̞'j'dXZ[7Y0)T1 q''{\!{+3 Unv<)4\S/BCɥs=Cw1n:̧9 љ> (g8~a\Z JuEh;dfJd1 MS0hy,%x#t+mOi+Σe1`c3uX.nskO[sc3Pe i; UM] ۴\cZ딃Y&&T]C A#G?ׁ>HTbgU <Bl[`̆La5+vQ3Sox-*"Pw@w ǽݡZh!8En W'Ȟ!'(=&YE>JHnU!PDB5(43B,V^}b9~EƋl]8չk!9=p*m^-E'?`ayU!E,$){Lϗ %G.U)+{Сۂo5BAjBnT;g}obQr Q ؠ业RE]sH018pȠ @>;AݎXa@'wR]ىy|!TˏT< 9@\ʙzKKL+4zS+{]O(_Ug8%)+WﴣE>鲱W<bMN2C nzQ>AB2^ayz2b,5GCm.::yc&WEuO 3 q; nvD7VD{fw .ٙ~5-[CL[GBal~egCA~ !g| i-)З*C;H u刽C >p )|[z(شqOߙM~zՈDrqLh\lx-REDqSUqxoTs>ӠȬn"IfěLmL@ݢ~3RX7"˗VyJksdT !.Z!AM:z-y,吚yI)]?~ 3o7.5Y9Q1dP!ڸrc_ZEd_AQ[t1bq5 U" Z#fnXu*Fb72\:K>wx\%4?p/i~z[P4㒜ʜ /;bh9/p&z}RGVAܮOe89(nm&9u!+ʌ$t5؂g,LU-ڗ6d?xǐc(ms ;yiI𸤯&@g߭p_\J;pvr*ٓ Y(e|&D龄Av@CﺚmUuG&Σ,\F.J!Ze\|rB\_?gZj/T*oc:?񕎅މFJT0ck7峫W}|9lDPpMqnr7/adw}#C}^,jx#|Q<&d8MYkS\BWiƲ'^'t6 ^:ԗRJ?6+ҭ ,Swоz8%]6Cc\θH7X]U64|՜7N?!9۲WS}\lr1YM"x-MQRz 17&oo;L׊U⊔*lge+БIE~<7ḱ{m^ͫt#]&5i7;^kg( o̺l}hz3LLFv3fϟQDau :Q5|;$NOD$m'@Y[%Vy#xet4](%uh c{ @[jȰ74s|BYɇ_.㦼-nu(n?ttzG(ǺbBHVwe@ѳ+t+ben(ċ}{K+lwaqx)úVx-g"%Y@~4a nh=`qR%A6\yDY*SU>`Ѣ^Xix\+ɹIK+;.eȏ; Ӆ pm_hupƟ09ʠxCpdh /,>1U^V8cLxbI{\"߈VO/}Ry?oNJFN˄S'44ַ uJtwm(8? 5xu%d*W5NifTm^+A֙A_#j9!jFkR,cW/E*u~ty&ʋ'8׆kǨ#gXw/io2QYėY\ XJ5O'7eUqC}aw٨D,(>m=TcYaIHBM|-dbr~ls1cOϹQ"xINU߻d_SsiE4셞`RU 0` Idf?R K֭3pwBY{Q8/_3=X>"̩0k@v㓲O˝4r~q4EV;j|-Ä2`IgUeuaI֟viBx.Ō̽6)&fI~fN ,z,\/r2}*VBVlx2Kdmz#-ǏaV:(;g['72~z7zZ&Y_l#iO?u4cM^u "582'zTeYB r)߿"`Z otO Obަm xzwhuq8]%ׁKU Aʜ/?{a!\g޸amEU%PUG-SܦqCl} Ơ8vnK;jWǤe~+#~rWjV{F`!b/ ܢӴt~.`r0U -DR).I$I~瑈 O%ej#J.6꜀BN+0kN~zpGա3Уt w>K ZTg1C2@u{ UX" /엡n/O=~8+6pl i gU5e$ȌC-鳎LT*#>'=CcD$V~jo+X0`OB7m59KI++Pl7VRqC{/!YJ1wSҮI.4u܈,G৐a(YrQ\$Onآ0^jjQKޕLL2a1sR;`DҘSQ~UguQ^kiFy6rk6ZsF_82{Ȇt,lnN|!^75x]Ǥ4]+܊TYp(ϣ1u\sX9O)[6ZJp缦R]F- H?7ƨ'7gĺLe UO5FcϟNϟ?0Yϳ`Yؗ+QY7I'\D@CIW^tk΄=6/'^MzWq2G- _ESs 2> R6 ~so3!45ƃauyMj endstream endobj 285 0 obj << /Length1 1625 /Length2 5084 /Length3 0 /Length 5906 /Filter /FlateDecode >> stream xڭTw<{ZE)jR=j;V- B$!TiQgkRjjϪhZjRu_y9FT  07O=BG@ hbpSj!*  @QQ4)'@99c<Ɔ||i.M"8O4 }xAHQ׎FP( 8P&P4CP\pCh(/!HvYZǥ((sKzh78y#0` \&;"'@,pLƠ0Gg{el4 8KyYo GC10`.c9@Ƒ<`DNe:{@P4G_uz{ _90h(QTDv!H.Exļs3$!H: \H7en# 3G}VÁnpW \塁 nh #KcP+"pC C|} p[o@=079 9KHswBFFJ|ǭ䋂75S]$_%/ ") KDu10/o40r4F{gR\`Ox?}\ʿ7 I!!.)R1e*ECQ@y9A%ȦY"P~5٢]lg3) ےC5U'ɇ 'Oa5BB;k`h攈NoWN !8}4m=euYC]mM;-2I8e%&`|ks+ 9-Ru~pgЬEM6? )t1X鋑wO%˅5xJ4[#ع՛[=xNm7{9IV>K>9 ڶ,J:f(RU\6!e]R׽Uˏ%(C0@5s5`kRɳ5ȉ(㑗T5e f[: [O(6 jk?!ՉrjC6QmC9~@9LfRJ# !x8gpAiY&&&7\yg\+O{UM{O!mƐkO 6oQL^dYLPsQԢ t` ]Pι4/#oݮrhvM?u?zLbXJmg/QWrayt.Q$5gj{;MI{_ ҅hjazIqJ=2BwTgțe2ψnoTlekԾ^q}MN?YF0""OËf6S,^rlu(X3K?.=yׅ6T x9+CA/= ʳ+ 3efD" ţ}TEb\ rvEy,{~=,.ϸזk%^LeIM\]*YbT6 mC: 8lqZYiqa'qHs5R:ʔ\AM<:g/B;09J$;e2$ hD{q$2S-jɞrWxv?M=POm׆1!c{=9W#k}zB*UvpږlWC{&LdBG99RL2v-徵 v86Iǽmv w~]/ZG/2k4U+1\b,:,gD4 Kn׳"L 9?&%$wv[N@OٶKi:QF GDgX6ʱʑ י]xKn,~Hs{e| 2nZuđ_N>\EQB2Z*mBT8,ex/)H=ӃZ.^w?-ի.媔i5_0~3I+FsgrǙX ?i+~/'+˛o+Ziʳ޾J}[yy_ m]^rjMyF<A"DJ^zpFW`qVq6'I4iY1{_s`3A*۞OLYA!p8fWI<}[w_Z+~v484j$9 !FZք8ρ*UM#[޻> Yb2_M?<ص( Pdex5/@ŜǻrPw7` [~qB؃/6KI9})q%J~ ?K2PNKڢ houEu㒲Qs X!Xھ7),B?nn`<*,YgL5[L=#tym1 yM8޸f6rضl]|Q#M0!ub@iZ0`էQn<@&/Ê%HArh:40]ŹedÄ{YK;YY~b1[ Jz*[Ay`'ޛ#%Kr9L O ,̇ڪ:`2o{J,p(j+]K'r#!lm'*_-؃#XPesWC΋y-7硴㙯Q0NG-ےhÕMo= i 2b6){u`i|G2|hه8f]׸zQTWl΍% )&`5~Q. QS ]|ձX1|$qDCkǢχ^HxyEG:3wE~8#½-o^ '- n<4U9*iNt|žGFz O][ݺҿ.?|Z2a-flK7?TL\h +T2o2I* ٟ}:(1E"*B+ւ(w܁Q1.5AtH͛{!L2QvpWvHQiO]WnEj ͜j#ˠgK=O%̭(ܱՍ/SE}|" VR⺐tPY14BZxvJ޺)slUGg1zmi|S":2 ҂ozˮnTseGK+ϮbG]g?$ܤr.]`j9nd6g~@[m.({4j[UTCNz&IhME;Y/a4~TсK$Y@M>)7%Pc׬}7Ib%(6QitH%{k͝X9Z N&AVҊ˟&ڐȺv5f=-eA<(רpke`R*Z#@OѺ@ЇL"Yqd=\Nm~.Ƥ6D1V _nK?Ѭ!_N#fZvO`lIxǒa8ֶ\jQs\DYn !r GB*sk+8rf-62[_I`0]I.54~HUNr#PZ2sGU7)-BP'.!'nRܳ@[_rM nd8"iGJx4)eo4(GD* 4UbWͫbN])п%8UK؂5ze\nF) >xぺЄ%}7\ R=yk6'3*gω|0!As wJ9۩5'*χoc|#51؊<  XS5viU쮔iɴ;sĝ&y$xUEK:ŕ~ qص+ )-bW2W=}?|Hn 1bʆqT:ό]01#>|ʏ ϢXY2HYm C}ef2 ,7I>>ȓ$^%pMx2nD/M\c:!^)j tc/LM T% endstream endobj 287 0 obj << /Length1 1144 /Length2 7106 /Length3 0 /Length 7873 /Filter /FlateDecode >> stream xusu\6Hw#J 0CtwC0t (]4HJݩt ! /{}u]Xb咶Y`P( ЀX{ZAոtGP IpHV *gx<Vp/E@k ߿apQh`3Hl<\P3lvym;+@; :l# Ͽb0?.<0?2=J(`cѶjYB `u@ \Eyx\7a{,Tj+ sCA`Ǧ|x97'( _jgK?I6 rv@?l?w+O0A -?^0@ >kX<2:2zQ22GI.?WX)"|TԲoE+C` ؿ]`ӰlTҀ! 6`&1 ?s(x8;?_;w _V.g? rHC=&lA8ep}/sf\ ?8=?)0)60[x8?/v`o ,F,̱6Zҋkk[P;yIS)%Ol9!*Su B^N"2m$O"r׉kb'PߓtBW~i$);BFD}m릉NIS^P;&Bձs(978* 7,*&8 W$mĖ3=2xAӯ֔ƥB_Fy9J}A,e9}}m" .>X؂ &eL~O<4}ԪoiN.aޡyݹyLXikcMӕImOٮ/kH5f4HTAo_yJu-: E O>Y@T%d'aP鞼So]Lu5AIqN:B !EMl=ѽ9if †^R1zt`>>2|c堟ZŔyhd8ӰF,*)ͬG_FF̬sc"PpY4 ?]%$\nуL)q&L'q5ʅ4 a o_<\$߸r@J/1k;1_0_R Y|"D_{N2^?j ?e7s#?}.Fޑnj\18CL,)sP a\ $/f @6Y]S*'/_$G;9oSwvk,W L9*E6Z8I+\lы6S p'6*ث]O&7; D#$9_N[MK4L`8f{P)Dr]˚"+&.d1|6ȾyӅEH#Z"YY6'7}Mf%nR+)Vx $ %lqP緬,2"Oը^xr2_M&'"Q)m2~Ӵhn=^:oJp'Ђ) Ԁ8^tXxvrxKlPm.֋Κ-o*G zdz.ʰ?3i )3"n2$8D'2S׉`R6It'Xq?ؠB&Ch#d8+H #9u"֝/PJoP8\eh_FZ/V07_x.Blڽ?,ƛZ|}.%-VES-t2f;S,gn 0GG>&|Y&ZZy |'}*;s w>-_) 8#`Gs[:Q f$5>j |o[ n~eHNd8:)uҫh(RQƟQ㔱cdqQO|?>kV*0qB ݥn7Fn?qbђQ1eL% OTsh%6u޺eK2kt'y[+f9˜C^ŴVK$2~6Y$!t&El-oA6?ّ q6wEٖRNdT =:h8V*A8p' 㒐m¹J!خSץчs;֥[_^.Zң4t<кF:&wyoA_2w QkWiĬ4XpC\bIν|q=b+10\VN wc%eB6IήRץZ,OE9.4umc/`j)U3z5эu/*:֩Jރk]/Il,.+t,|ţጐ;XÐw`YGyuhzl_$ѯ 3}+Þ VcvH'0sL|Dw*SA^ w4~&/Dvv\wN&ĒBծNENgX3*6\W\ghMtppu%] K=W$E!?Ahm(ˏIQ ;}EK$59DoR+Wo(pcopӜSܛp!S?=FҔ:B,w7r5ljXM廵eC \M $w _V8[^pϸǵ=m}-Wi(ޮ_4ܗ|IL2o{J1iw<bwx}쩥;ϸ4& q6V+;ȍ}7r܌ Fl!ФнhM]t!$b O3 11r7!yS>,ODUg[#IcNz^;CW\ uuquq k ;lCz+lj:- H'~j4k_-5ڟZ(P]>,USb\AOo*kI6Z:6SoM*A%OKzzt.m^MH9wUvop4I3Ѝ`śrx`x%l6H;}/m 6_,k1<ܞΠjsP̝ acW1~}Y:0(kn=s<2R&3<9}`)K &@~[G wYꕉ8e^*J̌ՔO _00ϼؕGJ2WqyOJ[l k006s}PwcHS޴!]Ɔ\'x'?;;.%8R*y_|Fx:&9tJP"I3ˑwg |OM> e=-J 5fϲ Sw ~#5@&ė.k7ca㗓s9~e9/VOϩ <&܁w?9z(^b'Rh]N(u,~Sr^A@G.47u{":zw f?~Hqӆ'"Ůg:n?D:I6Sh$/mZ;PiA<)J֕ʗ2yʄ2oy 7-_r(4K^0`^<:.k3nl!c¹% )n&ͭWԑ<'i!MkTVmLNEb=r;*è:5o u N7 VIMuj}4 j)oPS9U\D%#1%0A@>kxV0027A/ܱpnTy3e±/z1׋v`[RI#~TNcY#åmTɳfr}3ARgKv  *!-?⋺̪~#e;5̥}.X9ҵAsVwt r펷:Q0N=RN Jvo.;( +uKm*E6)vh OsacO9OILhMh6`oA7'B+訿Y.@7Jjqr4V|%ܬ'IpQmŹӓo%w|2dtT\ ^ѥGZvC?ni׉z)6: =$w-ʦVqӔ=~iPt&z#%.Fx _F=e竼ǡ~A]f-SMW= T(9 I>VA"şKv0w .Rq$(whH4 MD`&-~Ij;:M]΍Izpi?x@P\Z-2zY.ӊҢ6R!w94G>;$"1Fn; e|7wY)AO0oy EA$;/婸1smwaw޸3c3"F5wI/ Vrwc# endstream endobj 289 0 obj << /Length1 1177 /Length2 2989 /Length3 0 /Length 3734 /Filter /FlateDecode >> stream xmUydA USWRP uÇ6 $`18*eG@ɞŀ(;' Q@S+}f@Lx GӒ2?ab9F a @pԾ a$|H$uyy4brD$KR5ġ~3@$`ąBcp8L45ہ ļA U¡jAHw#pP?F``8 R_D9 `x%cI@oNA%D.P9(@WnEqV?6uu]ZaeajQ++*OgSO?  D*" O A@`\PV/or5>?zEgA?/_ O?w_uqX)1D#L>?Twġ_EO_f2ŁD"U)%!Gapހ=*F/O#uNOA uxԈ>[dJ(;JYqc|B5K;%G܊߷vMvD@_^7O 7I\:2D YZFG6eNnX [γs֌e,spEky[uгQ2~7sf{S90آǡ|?֘Tq*QUJiTi0?@a޷wOܗSE~a™zH'c rvd0tRA4|}8^w̢1[cR*`_eNo_:EnS׉˖ALs:Bo0'7!l&Z7:T?e(05mZQ[]ӟΉM т𢏙+Wgiep,lI,+ūc&e {Vw| |2xH8ܽkqIdlX~/ ЅrNk ^FXWMGҴe<ĞVRŬJΦ2QO҇DwHNtkcu;_DDu2ɘ%=R(+ H],ok |m34uYjgUeQYZ='S`{y'_dY ]Ư<Ҫy@n{.ZԀ6KgCph¸׆ԉ4[*)!*٩AQdzqqo.݆' ?(3hbm{.Ci7)c܉!P 5y1p말W 0,4pdT,DaPK.&(q\ 1q&6uR?֏aw|n#֚iϕރWg+W9Jb^X:1ھdVbWjکP tkL aW!W%(ڙ$UaHlV: F9w~P~q6LX8z[rgtԦ ߸gaja"YFbэ/Wᩢmo׹DZ< Y>wHSʺJ9#)مJ/b!n-_Uiyv}_m˵eXU&adY'RY̤#02ǫaQ[SV!+|cv Au. 7zIjTys;]gGY6 EWqhCSAvA-qd˞y޶ESftL`Z͸quxI% '܂O=s8ӱZ`^ DL-sb&篑T2u_|:b5Ų=~'Dٰ-ُ& WU ]Ei3v'o^:عcy~imA'q0;q'kLݲ-I5"YzH\٪ѭv fm#swON#G蟼_]ަ4*}r3~6ܴ;SC8ԕwt w^ͪY5<2HSmӤX>=o]'a=ƑAy4nAY{DrK}@B̒O 1T(VtRgT6X,m*0)Vq#Z|ONt\v\i>P~'*aJK<?U[\!'˕Ҏ7Wm>\ܧvS+/3GNZi|^YvjKS6mv ?)9D1;f Ӵ?ƛG=RgI;WܸfUr#w'_+q7^d N]uA銔W>>ٱ6}֙w`T-Jv/ŷZ~*>I 4]=4 tx\-vy\/ev2TH6^qi,nXuJZ~ě~x[%MgXRAB8?g&R(@4 ా9"LUȜ{mfJ .{츑r;V-' ? endstream endobj 181 0 obj << /Type /ObjStm /N 100 /First 892 /Length 3695 /Filter /FlateDecode >> stream x\[s8}ׯ㤦Lw`+ULO&q\2<mk#%eiHKN_<8}hTs,gy&=0)s&ʼdJ3Znϖ9+@yz0Se!#Y ǥ@=`edPBO kC).%iR=FQaB+j( \S@&= Ua2Ǚ+ehϔo1'S2h<0%*t+%B=)!.-h05)Ey0m=5%\al3CK e$1L`FY{l`,iIg&P  bsC@&T(1ìHZo3A@X\QFd &(X4NC{=sFQpIy 0@&4 $42<ь&Ti!+I!('z "$PGav-]RTKtj 1 S zҡBv*PF6Jb@QF.ӭ’m)Ʌ=~uYL쇃:'r<`ctc.. ;MfST_T3GX=qןJƟTiN E,.N^1)Igc9iܯyt9;Ejm;{h߷p P]ky=}8OYY~9ao6r8MXìl Sk>d2uE6-yXggG_Ӻ ~9CL<5r0El|^ԳQ1y&WE|Eswf^Umw?o3w=~\2";tȜV q'!]?\LWtX1dU}Ρ⼜b#h{]ixkg a<mUI4ŊFV[! 붒*F!e#?^lNȼy 7~ K[q!d38<$=n,S 8:7=Tau`,Ӕ]c+X&j@T mֱ\UX$**p1b &b شa}^T$|H}G` ۭVlH8qb !v}+SwqgU{o!Z#7ցQGFkdqʌ_lv>&OԍJmKI!{):)ߟ{ww[+eȜu[0l߉\+[ aGr?Jw_%ʵ[%~ 8UY)4mfoL[bӝ|2*<˭ ɔ~c:3Jo%DFVk2e󑪵Ӭ2Y}^V>?{xȁչɂL*.2AU9LFPߥBFYKNo%^w%JnJ:ifwzK *:_:NЏ}`78 vzZާݵxv!}ݽ>xoj5a^7_~n]bVW?a0#}+> fzi3IM.2OOZ:+Ifsٹ }(K!k鐺9R޻0rA8p6E}mPk!BzӈnOhَ\x@G$"Rޘ|iShHXqcr4M JԱ]@iVo)-My7kzMLXnh2'AVGZ:T%IpR[K*)J|B:*%sV%cEl0\fz[,61⁎4[)咾oy&iJ`WAnN;&[۩ǖ&n gmJDy&iB9 ¤lv =!NiBH iϲ赬NF"IZ3$)d lܯ_K 7sc;|JUd(އER>|tr4(kȢ]w>}9('zx5+}ߟǽ.NE5vb Qq>a:Iŗ;_kIv' =_\=+Sz74 퀎=~8-F|TOo#idCEM~|Ÿ/ɏ+~Ok+/.zRLz8yOi]ߗSfi jT^^|hTԼQ^~#~φK~Vj~;E9Co8}/WBzK8p̥al?{#^ dOFO.K>Og|6`v.G-Q͓!06i|kGYN4f+qwdhh6# c 95~?*.Ɓ_w!!‹g?h |հZl D8#y<8$I\? [pd;~6VE4-3QrS&]:MfS1۫:;jմn~t5FM|B}ZBj`n›㓟Zrּщ՝rm0RY_D%٤85gI ៻۠ç']t6b Pv#(ɱ Y녣FjZY5yZ^|şKn]@_ZF'x_X.4~[m'owO|9[擓go#Fۄk(%]0ꮨuو!hDNL;{u{ mq9EwjԞT]bcǗM:ƏK/<0@/M'[97':7q_o+@5wڋ82/*<ÂhJXP L Yp e Rp: OF}]$ endstream endobj 292 0 obj << /Length 843 /Filter /FlateDecode >> stream xmUMo0+J! ᫊"RVmk N7R$ݪ70W?g_,ɍehܬ=WWU\;;׺v7MOtҺ=po>fv8 | G՗_n}w̭][GL2sQ擾ݾk^!00jYV%H~~v}\; C}h{ϗC`Rރѩc~^ON6[7ݛ ZԲW/{FR^ww?U4H6!L@@B@q\s *G|F/+>㹴3Z~Z83f3[:٭ ߬Lg3t33 ~!>CO!>S 33>IY ?BXIAup*Çq G潪N$p|eO_:q;:'dE_kCvW endstream endobj 293 0 obj << /Length 842 /Filter /FlateDecode >> stream xmUn0CƆ"RVmk N7R$L̛O3 /~\k4~VzhO{|wޝn8O.oN?'uRG]>3dX;ҷ*נ_~vC̵:}W {1Esgq]ߍG@]dbڣH~z~ohTǰ9wxΏU]~NÛ Ju~*6{y~?xڰvtش~>ZjR˦YE3=sׁpuRA)`*R2$!`8li9UEХGSj043`4`4Ý(?Q  rt\e #q5p眛[q>x \iEܰpNMk l4\? 皞c:gN5^ ELOup3%M6`^ۘ1ل150ym 1F}3&ԗ0 bKl+֌>oRa Oѷ`)w`)?\֟agYg ֙P.L(ulgYˉx/N|N|&ٝ N|N'>cv'>7'>S} ~)>_Sϔ+>cR|&L|'a9i0K)cR{XTG5;)NͽRPs> stream xuUMo@Wla_BZXʡIW ld!fm웙7շĶM[؟McpuUӃsk/zfN꺼Ɠfn݅R^w}9qdMoXj_v}EQ>>pø;en>ڲ?`1&5vaj UkNAm<}\MxHM0}Z7WuI]ǽBnz/_ N{y;:ڰox\7nXw.kP^k3^Kյ u/A )`JbD>`2$`TY'``9&Dkx+0*NXXQQ3c w"]j~1F60aG+gıcW c rn q9Qܗ8% DMq.5Sh]`4$a]~9Vk ]8 IncT5obY:socsOPcYB?9Os֙3\Q.4ٰX3Z9#>^Z} ?L[ V|V|oV|3[: } B|)W|L| ,Y a!SMV,鸞:?8C8G潪N$ĸ<ޏ< Nuν_B,u7zl endstream endobj 295 0 obj << /Length 846 /Filter /FlateDecode >> stream xuUMo@Wla_BZXʡMW ldiof<ۻW_W7nzrc7)U7Nߜk]{7+wR}uN7|5s. )裮ݏk&8n~iyQqE0N[,g IM/*D@f`B9xczOpm`>W'9WRzL E]PwWqD`PދoSφ}= imX]ӷn<7̵^y]/׵Il/ܥ: ل0%1 " 0Z{q́0R0r0QK5<T`,if,1L.S5?׃[#M cL#F3X1+N978Nsk`q KpN8q )q4ϮEp O.5Ypc.Y7ь1O*ezl,d mY%0ymȋ,aYʘ8 xA} 3/Y1<*T71މf 97g19w(g1?\֟`g Yg 9LsQ.(ulgYˊx/V|V|&٭ V|N+>cv+>7+>S} ~!>_Sϔ+>cB|&LOr`B,&+jwRP{xᇣI^U E'b\o|s C:].cDܛX=oNܙ endstream endobj 296 0 obj << /Length 845 /Filter /FlateDecode >> stream xuUMo@+H.ȲrhQիԒ ؇6jo73o{q3mfѭVOn/Cf)rtskzf꺼Ɠpi?p>fv8coJ?< a9(})suזÌ\$qATh L}s6G 7o],jotuþ{UןtptZ|MÏѩNN6[7ݫ ZԲWO&suB`ilB =@ )U 9yI(ѥ S*043``MSiv|kiCXc, pDˆzA:x0)ljsn l9u}SrI4"nXCA8%&ٵ6AI cMϱXS_S/w"': fyRy(#c^g!ch"ƨ-kC^d cRx~h K^| МQV14Nd5cY9Y?C9돡'g ?%>O:ShYggΈrYgDg>[bghX|&^V|{ig33qgng3tZ[Yog,g-g B|B|\3gg3?f)O5[TT+&GUP#a#7a/c?w:'dEgtdbP2ڂ endstream endobj 297 0 obj << /Length 665 /Filter /FlateDecode >> stream xmTMk0WhFG*! miʲV6vFrBbތf}\xM}qV'7t羋<]swrո:܉Ǿ-w$mm o\1A+Z7!؛~B?Fߗb n;nX7U{[LG5 @@N,Gw͡ 1}ԿhWWq}QEݹ-r*FNL7uY~~l+l+7tE )b,#TTHy9)9>*QKr7P:MȡQ^s$LD6aȑ*s.$S56`>ƄmÁ#TL 5kd}WXssc*{Rh/#? bE$L|ږ8^y>eSQc̯bV̯cNa'O;Q~{5pX2]$\^snaK??q FqMyc0=) &l(mi,s|d &\cV ]͸&ӈ9w{d :mB Ƈ\..Ա g~n59&\pe[N 8\4<[n6|kq_]~&)a endstream endobj 298 0 obj << /Length 666 /Filter /FlateDecode >> stream xmTn0C6U@"mTt@Կyct+%13nU틛ķR<=]tuUӽsƷÝxrN:ۦ>P)Εrus ~v?'Ǿ5~D !8뇺mRn=MuSxHiQ)YiH޽'w66Z,^DӇr}ݼ-w{s d\{?:1 kmn_~߼h!R,6ew*ؔb%k e+Kӄ$a"1x*s.$S56P>Ƅm„A Fs 5577vر׾+uaя6R:!,əCxg+ѧy*JcL|*m:fvui0ܓ`†›F2g'I`2e?fyx0j5F̹k#n'im7>T20P-9[A˲,p~nE8|p9j7o-kݸJv?ƏVR`c endstream endobj 299 0 obj << /Length 665 /Filter /FlateDecode >> stream xmTMk0WhFG*! miʲVZCcYy#9톅ļ{3񼛤es^7箰 nn8l=hzI-._뫦~^JIu]f `tTsr*o8{&X,dew+mWos~X(2X.EiTz}ܟ^7uY~lVNMєo R.bY.֔O9؄b%9vsr(MXa#D$ar bqMDs!FKRLDP0.BEHQ#͸FuŎ577v}QȕanOd$g;A,əCR;6+ѧx**Ę$90q'oקfQ%n;5pX2]$^q~+s"F!CyhIh~CMnOf1$#h)r~hмj5F̹k#ni<7>Tsa>s\8s&wsaY1:+r1\ut[ZM,k4w6_%aJ endstream endobj 300 0 obj << /Length 666 /Filter /FlateDecode >> stream xmTn0CB*D rضj^pZ;olvR3ތm~<&i$͹+$o)'[֖wkuͷu5P.Υ/U} ~'C $D !8Rˬ9zLU]vރ8QBQVW,N4$  1}н`Еq}Eܶo KQ#U~'+xZZ9?ESھ/6XHfغ)Pb$b ab4aeILD!ID bq&"Q\H&(61*"TDDi5RH׮+&ElƮ}G= WA?Пe aLL\ږq8^9>eSQ!$"VFN??5J195wkdY]$^q~+s~"F!CyhIx~CMnOf1$#x)r<qh|utgmZdGGMYcu endstream endobj 301 0 obj << /Length 665 /Filter /FlateDecode >> stream xmTMk0WhFG*! miʲVZCcYy#9햅ļ{3񸟤e&Oo]&C]]Mq>zwt߉Ǯ)n.pCx?nڽVgx=itO"i [\l\WM}'ԭ̚t4pXeȉeU oq yM\-CnCW_Ey}wP dZz891euB)] W-\v\]~[S!8&+Zce"'2Ɍ5I@|"B2AQhSlLء28a}ɑFq5ҍnnbfǮCG= Wܢe$g;A,:sx l=NOTƘ$0_س/vЧQ%~Zx pX2]$^qnaK??q FqMyc0=) &l(mi,3|d &\c ]͹&ӈ9w{d-tx\ \cΜekqLJs?<@>qhx .׷8wl~1V<*m"mmDa endstream endobj 302 0 obj << /Length 666 /Filter /FlateDecode >> stream xmTn0C6U@"۪V{Mi@Կyct+%13nUķR<=]tuU*Wo;зΝu-M}mS+7F?h^q~M}k $|y'BpOu u+$bTy{!y1  GҢSX< {NmmX#N;{}y[D]`Ah;P5K_;'4S}}⢅Klkީ|cSs&^s 1eΘOd~`xՌk?s׾G0N-۰o|e>ha>6h Z8sseY1:@++܊psqsoZ׺q=7÷c endstream endobj 317 0 obj << /Producer (pdfTeX-1.40.22) /Author()/Title(\376\377\000A\000n\000\040\000O\000v\000e\000r\000v\000i\000e\000w\000\040\000o\000f\000\040\000t\000h\000e\000\040\000I\000R\000a\000n\000g\000e\000s\000\040\000p\000a\000c\000k\000a\000g\000e)/Subject()/Creator(LaTeX with hyperref)/Keywords() /CreationDate (D:20240703183803-04'00') /ModDate (D:20240703183803-04'00') /Trapped /False /PTEX.Fullbanner (This is pdfTeX, Version 3.141592653-2.6-1.40.22 (TeX Live 2022/dev/Debian) kpathsea version 6.3.4/dev) >> endobj 291 0 obj << /Type /ObjStm /N 47 /First 396 /Length 1638 /Filter /FlateDecode >> stream xڭYsHTҚK#URel'&>p^. "DO=<6}GLw?G(sADz~(2'f?Gr-4>AqڡQa^& 1AH0>p&{(a~ѮBA DH"D*J `d Dq+q>:S0S&2|L9>DPs KB.*C#B) H1@BCD$#D/L!93c$€"v޽ϕ&p%x9I|$?zx:P%L`d֝_<`a;*\QXD0%pp2F|,?x0t$m7F;疼Kwg6q}x~Z%p/'GOlӗ~rw^& YDKBhE>AW_]n/݆fZlKÉaUxܬm61.%o]^LmK-u}<0X l^lp ^Kowa;jT;plC׿<G ^,yنE+T\3M;)h6/0mrL-cgͅe3Aa['zVOT=yb_[&YhUe[r;X4)P'9H8usךd).OwjlTV/o,pOJ/rl1Z endstream endobj 318 0 obj << /Type /XRef /Index [0 319] /Size 319 /W [1 3 1] /Root 316 0 R /Info 317 0 R /ID [ ] /Length 793 /Filter /FlateDecode >> stream x%KOVW:)BZxAE@mTPO" rZM6M;lmbS&a'vЩaPv?/'~a9,3/H^sd }P &b2DVm`lV>L2nɶfX jV*>e|LVGa IVO C8HlX9g*rs&ӿԡyx1Lփ4٘>@SqYs YK)YkiY[Y{YYGyYgE١d˲Ve]ɮˎ$!NvS֓얬7ײnˎ%3qp 'ݬRx9(nPR΀΁>5U@k 4 @;(M~c endstream endobj startxref 225255 %%EOF IRanges/inst/extdata/0000755000175100017510000000000014626176651015547 5ustar00biocbuildbiocbuildIRanges/inst/extdata/ce2chrM.bed0000644000175100017510000000027114626176651017506 0ustar00biocbuildbiocbuildchrM 13357 13651 trf 2 162.5 2 67 19 173 47 0 0 52 1.00 TA chrM 13436 13585 trf 7 23.9 7 66 28 61 46 0 0 53 1.00 ATTATAA chrM 13406 13658 trf 9 28.7 9 66 15 63 45 0 0 54 0.99 TATTATATT IRanges/inst/extdata/ce2chrM.fa.out0000644000175100017510000000714314626176651020155 0ustar00biocbuildbiocbuild SW perc perc perc query position in query matching repeat position in repeat score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID 210 19.5 0.0 7.2 chrM 433 515 (13279) + T-rich Low_complexity 2 78 (0) 1 28 6.1 0.0 0.0 chrM 543 591 (13203) + AT_rich Low_complexity 1 49 (0) 2 34 2.4 0.0 0.0 chrM 1345 1385 (12409) + AT_rich Low_complexity 1 41 (0) 3 26 8.2 0.0 0.0 chrM 2464 2524 (11270) + AT_rich Low_complexity 1 61 (0) 4 239 21.5 0.8 9.7 chrM 2566 2699 (11095) + T-rich Low_complexity 3 124 (0) 5 27 7.3 0.0 0.0 chrM 3922 3976 (9818) + AT_rich Low_complexity 1 55 (0) 6 37 5.7 0.0 3.3 chrM 4206 4296 (9498) + AT_rich Low_complexity 1 88 (0) 7 22 5.6 0.0 0.0 chrM 4759 4794 (9000) + AT_rich Low_complexity 1 36 (0) 8 28 2.9 0.0 0.0 chrM 5437 5471 (8323) + AT_rich Low_complexity 1 35 (0) 9 228 18.5 1.9 0.0 chrM 5538 5591 (8203) + T-rich Low_complexity 5 59 (0) 10 22 0.0 0.0 0.0 chrM 6584 6605 (7189) + AT_rich Low_complexity 1 22 (0) 11 36 9.7 0.0 0.0 chrM 6699 6811 (6983) + AT_rich Low_complexity 1 113 (0) 12 21 8.9 0.0 0.0 chrM 6978 7033 (6761) + AT_rich Low_complexity 1 56 (0) 13 26 8.8 0.0 0.0 chrM 7521 7588 (6206) + AT_rich Low_complexity 1 68 (0) 14 30 8.3 0.0 0.0 chrM 7794 7865 (5929) + AT_rich Low_complexity 1 72 (0) 15 21 0.0 0.0 0.0 chrM 8054 8074 (5720) + AT_rich Low_complexity 1 21 (0) 16 39 5.0 0.0 0.0 chrM 10561 10620 (3174) + AT_rich Low_complexity 1 60 (0) 17 207 16.8 6.6 4.7 chrM 10656 10761 (3033) + (TTTAA)n Simple_repeat 1 108 (0) 18 32 2.6 0.0 0.0 chrM 10947 10985 (2809) + AT_rich Low_complexity 1 39 (0) 19 25 0.0 0.0 0.0 chrM 11805 11829 (1965) + AT_rich Low_complexity 1 25 (0) 20 21 3.6 0.0 0.0 chrM 12027 12054 (1740) + AT_rich Low_complexity 1 28 (0) 21 24 3.2 0.0 0.0 chrM 12320 12350 (1444) + AT_rich Low_complexity 1 31 (0) 22 27 8.1 0.0 0.0 chrM 12727 12788 (1006) + AT_rich Low_complexity 1 62 (0) 23 22 9.9 0.0 0.0 chrM 12883 12953 (841) + AT_rich Low_complexity 1 71 (0) 24 60 7.2 0.0 1.9 chrM 13006 13161 (633) + AT_rich Low_complexity 1 153 (0) 25 28 9.1 0.0 0.0 chrM 13193 13269 (525) + AT_rich Low_complexity 1 77 (0) 26 468 6.9 9.4 0.0 chrM 13358 13516 (278) + (TA)n Simple_repeat 1 174 (0) 27 344 8.5 8.3 2.4 chrM 13491 13658 (136) + (TTATA)n Simple_repeat 2 179 (0) 28 IRanges/inst/extdata/hg18liftAll.lft0000644000175100017510000004065714626176651020351 0ustar00biocbuildbiocbuild0 1/NT_077402 167280 chr1 247249719 217280 1/NT_077911 40302 chr1 247249719 307582 1/NT_077912 153649 chr1 247249719 511231 1/NT_004350 2112849 chr1 247249719 2674080 1/NT_004321 1161048 chr1 247249719 3895128 1/NT_004547 1440092 chr1 247249719 5385220 1/NT_021937 7590365 chr1 247249719 13025585 1/NT_113791 116914 chr1 247249719 13192499 1/NT_077382 237250 chr1 247249719 13479749 1/NT_004873 3518496 chr1 247249719 17048245 1/NT_004610 12702424 chr1 247249719 29800669 1/NT_032977 73835825 chr1 247249719 103686494 1/NT_113792 157344 chr1 247249719 103893838 1/NT_019273 16604841 chr1 247249719 120548679 1/NT_086586 189539 chr1 247249719 120788218 1/NT_077389 398739 chr1 247249719 141476957 1/NT_113793 195588 chr1 247249719 141722545 1/NT_113794 186739 chr1 247249719 141959284 1/NT_113795 175055 chr1 247249719 142184339 1/NT_113796 201709 chr1 247249719 142436048 1/NT_113797 126477 chr1 247249719 142612525 1/NT_079485 194615 chr1 247249719 142857140 1/NT_079497 78698 chr1 247249719 142985838 1/NT_077932 127263 chr1 247249719 143163101 1/NT_077933 170669 chr1 247249719 143383770 1/NT_113798 38311 chr1 247249719 143522081 1/NT_004434 1022394 chr1 247249719 144594475 1/NT_034398 281532 chr1 247249719 144926007 1/NT_034400 1566655 chr1 247249719 146542662 1/NT_113799 185320 chr1 247249719 146777982 1/NT_079483 172789 chr1 247249719 147000771 1/NT_034401 220313 chr1 247249719 147271084 1/NT_034403 455185 chr1 247249719 147776269 1/NT_004487 56413061 chr1 247249719 204239330 1/NT_086602 259514 chr1 247249719 204548844 1/NT_021877 17265625 chr1 247249719 221864469 1/NT_004559 11394365 chr1 247249719 233308834 1/NT_004836 13665999 chr1 247249719 247024833 1/NT_032968 174886 chr1 247249719 50000 10/NT_077567 5577110 chr10 135374737 5677110 10/NT_077569 12337571 chr10 135374737 18064681 10/NT_008705 20794160 chr10 135374737 38908841 10/NT_024133 286100 chr10 135374737 41674941 10/NT_079540 191752 chr10 135374737 41916693 10/NT_033985 3830277 chr10 135374737 45896970 10/NT_031847 952205 chr10 135374737 46999175 10/NT_077570 263307 chr10 135374737 47412482 10/NT_077571 163231 chr10 135374737 47725713 10/NT_030772 989829 chr10 135374737 48865542 10/NT_017696 1941874 chr10 135374737 50857416 10/NT_035036 211435 chr10 135374737 51118851 10/NT_008583 30112613 chr10 135374737 81241464 10/NT_030059 44617998 chr10 135374737 125909462 10/NT_035040 2696597 chr10 135374737 128656059 10/NT_008818 4615335 chr10 135374737 133281394 10/NT_025835 246123 chr10 135374737 133577517 10/NT_017795 1797220 chr10 135374737 50000 11/NT_035113 1102759 chr11 134452384 1169335 11/NT_009237 49571094 chr11 134452384 50947429 11/NT_035158 503352 chr11 134452384 54450781 11/NT_033903 14395596 chr11 134452384 68848982 11/NT_078088 588129 chr11 134452384 69454899 11/NT_033927 17911127 chr11 134452384 87378026 11/NT_008984 8549206 chr11 134452384 95942794 11/NT_033899 38509590 chr11 134452384 16000 12/NT_009759 7043293 chr12 132349534 7132293 12/NT_009714 27615668 chr12 132349534 36142961 12/NT_029419 38648979 chr12 132349534 75041940 12/NT_019546 32815934 chr12 132349534 107914874 12/NT_009775 13091146 chr12 132349534 121156020 12/NT_009755 10116045 chr12 132349534 131317065 12/NT_024477 972469 chr12 132349534 17918000 13/NT_024524 67740325 chr13 114142980 85708325 13/NT_009952 25443670 chr13 114142980 111551995 13/NT_027140 1821999 chr13 114142980 113473994 13/NT_077627 184056 chr13 114142980 113758050 13/NT_024498 369930 chr13 114142980 18070000 14/NT_026437 88290585 chr14 106368585 18260000 15/NT_037852 1403478 chr15 100338915 19763478 15/NT_077631 334079 chr15 100338915 20197557 15/NT_078094 868737 chr15 100338915 21116294 15/NT_026446 3571299 chr15 100338915 24731593 15/NT_078095 438013 chr15 100338915 25270606 15/NT_010280 1108140 chr15 100338915 26478746 15/NT_078096 417989 chr15 100338915 26996735 15/NT_010194 53619965 chr15 100338915 80676700 15/NT_077661 2098777 chr15 100338915 82835477 15/NT_010274 13510195 chr15 100338915 96367672 15/NT_035325 3971243 chr15 100338915 0 16/NT_037887 8576922 chr16 88827254 8594422 16/NT_010393 25336229 chr16 88827254 34030651 16/NT_024773 1112651 chr16 88827254 44943302 16/NT_010498 42003582 chr16 88827254 86966884 16/NT_010542 1855370 chr16 88827254 0 17/NT_024972 296854 chr17 78774742 343376 17/NT_010718 21163833 chr17 78774742 21607209 17/NT_024862 579924 chr17 78774742 22287133 17/NT_010799 9412828 chr17 78774742 31799961 17/NT_078100 1629269 chr17 78774742 33529230 17/NT_010755 5072491 chr17 78774742 38701721 17/NT_010783 24793602 chr17 78774742 63585323 17/NT_010641 11472733 chr17 78774742 75211056 17/NT_024871 2103126 chr17 78774742 77379182 17/NT_010663 1275560 chr17 78774742 0 18/NT_010859 15400898 chr18 76117153 16764896 18/NT_010966 33548238 chr18 76117153 50360134 18/NT_025028 20074199 chr18 76117153 70462333 18/NT_025004 3388475 chr18 76117153 73872808 18/NT_010879 2244345 chr18 76117153 11000 19/NT_011255 7286004 chr19 63811651 7302004 19/NT_077812 1291194 chr19 63811651 8598198 19/NT_011295 15825424 chr19 63811651 32423622 19/NT_011109 31383029 chr19 63811651 0 2/NT_022327 1254071 chr2 242951149 1255071 2/NT_022221 2252116 chr2 242951149 3557187 2/NT_022139 1426129 chr2 242951149 5083316 2/NT_005334 11088087 chr2 242951149 16221403 2/NT_015926 4791168 chr2 242951149 21037571 2/NT_022184 68373980 chr2 242951149 89561551 2/NT_032994 397279 chr2 242951149 90958830 2/NT_034508 731068 chr2 242951149 94689898 2/NT_026970 2594449 chr2 242951149 97296847 2/NT_022171 12173457 chr2 242951149 109612304 2/NT_034485 736346 chr2 242951149 110498650 2/NT_077407 359898 chr2 242951149 111008548 2/NT_022135 38390280 chr2 242951149 149498828 2/NT_005403 84213157 chr2 242951149 233731985 2/NT_005120 5688986 chr2 242951149 239453971 2/NT_113800 12944 chr2 242951149 239496915 2/NT_022173 952154 chr2 242951149 240474069 2/NT_005416 2277080 chr2 242951149 8000 20/NT_011387 26259569 chr20 62435964 28033230 20/NT_025215 234339 chr20 62435964 29267569 20/NT_028392 5092930 chr20 62435964 34380499 20/NT_011362 26144333 chr20 62435964 60551882 20/NT_035608 71932 chr20 62435964 60733814 20/NT_011333 1702150 chr20 62435964 9719767 21/NT_029490 490233 chr21 46944323 13260000 21/NT_011512 28617429 chr21 46944323 41878628 21/NT_030188 1627105 chr21 46944323 43507092 21/NT_011515 3437231 chr21 46944323 14430000 22/NT_028395 647850 chr22 49691432 15227850 22/NT_011519 3661581 chr22 49691432 18939431 22/NT_011520 23276302 chr22 49691432 42227733 22/NT_011521 830225 chr22 49691432 43107958 22/NT_011523 4248192 chr22 49691432 47366250 22/NT_011525 1384186 chr22 49691432 48767136 22/NT_019197 320440 chr22 49691432 49089176 22/NT_113818 17927 chr22 49691432 49126803 22/NT_011526 464629 chr22 49691432 0 22_h2_hap1/NT_113959 63661 chr22_h2_hap1 63661 35000 3/NT_022517 66080833 chr3 199501827 66375833 3/NT_022459 24211711 chr3 199501827 94987544 3/NT_005612 100530253 chr3 199501827 195537797 3/NT_005535 1299866 chr3 199501827 196864663 3/NT_029928 2582164 chr3 199501827 0 4/NT_037622 1413146 chr4 191273063 1464146 4/NT_006081 2419310 chr4 191273063 3963456 4/NT_006051 4820284 chr4 191273063 8933740 4/NT_006316 22487426 chr4 191273063 31492166 4/NT_022794 976586 chr4 191273063 32527752 4/NT_016297 7445039 chr4 191273063 39992791 4/NT_006238 9040907 chr4 191273063 49183698 4/NT_037645 171176 chr4 191273063 52354874 4/NT_022853 7074452 chr4 191273063 59479326 4/NT_022778 9796115 chr4 191273063 69375441 4/NT_077444 2161413 chr4 191273063 71711854 4/NT_006216 3929449 chr4 191273063 75671303 4/NT_016354 92123751 chr4 191273063 167825054 4/NT_022792 23438009 chr4 191273063 63000 5/NT_006576 46378398 chr5 180857866 49441398 5/NT_006713 42230486 chr5 180857866 91711884 5/NT_023148 5878002 chr5 180857866 97612886 5/NT_034772 41199371 chr5 180857866 138817257 5/NT_029289 16301663 chr5 180857866 155123020 5/NT_023133 25714846 chr5 180857866 0 5_h2_hap1/NT_113801 1146088 chr5_h2_hap1 1794870 1186088 5_h2_hap1/NT_113802 608782 chr5_h2_hap1 1794870 5000 6/NT_034880 9194728 chr6 170899992 9249728 6/NT_007592 48945890 chr6 170899992 58245618 6/NT_033172 642507 chr6 170899992 61938125 6/NT_033948 248423 chr6 170899992 62236548 6/NT_007299 33500716 chr6 170899992 95937264 6/NT_025741 61645385 chr6 170899992 157632649 6/NT_007422 10134273 chr6 170899992 167784922 6/NT_007302 2236975 chr6 170899992 170171897 6/NT_007583 725095 chr6 170899992 0 6_cox_hap1/NT_113891 4731698 chr6_cox_hap1 4731698 0 6_qbl_hap2/NT_113892 475847 chr6_qbl_hap2 4565931 635157 6_qbl_hap2/NT_113893 1999704 chr6_qbl_hap2 4565931 2686011 6_qbl_hap2/NT_113894 291112 chr6_qbl_hap2 4565931 3003554 6_qbl_hap2/NT_113895 266639 chr6_qbl_hap2 4565931 3322986 6_qbl_hap2/NT_113896 623992 chr6_qbl_hap2 4565931 3973953 6_qbl_hap2/NT_113897 591978 chr6_qbl_hap2 4565931 34000 7/NT_029998 293567 chr7 158821424 477567 7/NT_007819 47690382 chr7 158821424 48207949 7/NT_030008 2130176 chr7 158821424 50378125 7/NT_033968 6577293 chr7 158821424 57005418 7/NT_023629 1052855 chr7 158821424 61058273 7/NT_023603 256182 chr7 158821424 61364455 7/NT_077528 190137 chr7 158821424 61604592 7/NT_007758 12749068 chr7 158821424 74603660 7/NT_007933 64426257 chr7 158821424 139054917 7/NT_007914 14846650 chr7 158821424 154001567 7/NT_034885 736332 chr7 158821424 154817899 7/NT_007741 4003525 chr7 158821424 0 8/NT_023736 7462059 chr8 146274826 7562059 8/NT_077531 4537293 chr8 146274826 12199352 8/NT_030737 9464880 chr8 146274826 21681632 8/NT_023666 8051036 chr8 146274826 29798768 8/NT_007995 14159284 chr8 146274826 46958052 8/NT_023678 1291149 chr8 146274826 48309201 8/NT_008183 38454502 chr8 146274826 86851003 8/NT_008046 57155273 chr8 146274826 144106276 8/NT_023684 1290020 chr8 146274826 145403396 8/NT_037704 871430 chr8 146274826 0 9/NT_008413 39653686 chr9 140273252 39703686 9/NT_086745 261110 chr9 140273252 40014796 9/NT_078049 208233 chr9 140273252 40273029 9/NT_113811 142805 chr9 140273252 40465834 9/NT_078041 464507 chr9 140273252 40980341 9/NT_078042 152873 chr9 140273252 41183214 9/NT_113812 172579 chr9 140273252 41405793 9/NT_078043 1198158 chr9 140273252 42653951 9/NT_078055 549743 chr9 140273252 43253694 9/NT_078045 632871 chr9 140273252 43936565 9/NT_079529 680077 chr9 140273252 44666642 9/NT_078077 181647 chr9 140273252 44898289 9/NT_078051 291910 chr9 140273252 45240199 9/NT_078053 465318 chr9 140273252 45755517 9/NT_113813 350909 chr9 140273252 46156426 9/NT_086759 194609 chr9 140273252 46401035 9/NT_086755 498918 chr9 140273252 46949953 9/NT_078078 157546 chr9 140273252 65207499 9/NT_078052 450681 chr9 140273252 65708180 9/NT_113814 223855 chr9 140273252 65982035 9/NT_113815 162441 chr9 140273252 66194476 9/NT_078058 159539 chr9 140273252 66404015 9/NT_078059 199148 chr9 140273252 66653163 9/NT_078065 194491 chr9 140273252 66897654 9/NT_079533 158462 chr9 140273252 67106116 9/NT_078066 471702 chr9 140273252 67627818 9/NT_078067 376183 chr9 140273252 68054001 9/NT_078068 174765 chr9 140273252 68278766 9/NT_078069 289439 chr9 140273252 68618205 9/NT_078070 682157 chr9 140273252 69350362 9/NT_078071 158187 chr9 140273252 69558549 9/NT_113816 187806 chr9 140273252 69796355 9/NT_113817 178933 chr9 140273252 70025288 9/NT_023935 21507948 chr9 140273252 91583236 9/NT_079535 85380 chr9 140273252 91718616 9/NT_008470 40394265 chr9 140273252 132212881 9/NT_035014 3818133 chr9 140273252 136231014 9/NT_019501 2075804 chr9 140273252 138336818 9/NT_024000 1936434 chr9 140273252 0 M/NC_001807 16571 chrM 16571 0 X/NT_086925 34821 chrX 154913754 84821 X/NT_078115 86563 chrX 154913754 201384 X/NT_028413 766173 chrX 154913754 1017557 X/NT_086929 36556 chrX 154913754 1104113 X/NT_086931 80121 chrX 154913754 1274234 X/NT_033330 754004 chrX 154913754 2128238 X/NT_011757 34879939 chrX 154913754 37033177 X/NT_079573 12096764 chrX 154913754 49179941 X/NT_086939 680972 chrX 154913754 50040913 X/NT_011638 2371726 chrX 154913754 52462639 X/NT_011630 6136098 chrX 154913754 61598737 X/NT_011669 14971611 chrX 154913754 76590348 X/NT_011651 36813576 chrX 154913754 113473924 X/NT_028405 2122394 chrX 154913754 115616318 X/NT_011786 27718692 chrX 154913754 143365010 X/NT_011681 5427710 chrX 154913754 148832720 X/NT_011726 6081034 chrX 154913754 0 Y/NT_113967 34821 chrY 57772954 84821 Y/NT_113968 86563 chrY 57772954 201384 Y/NT_113969 766173 chrY 57772954 1017557 Y/NT_113970 36556 chrY 57772954 1104113 Y/NT_113971 80121 chrY 57772954 1274234 Y/NT_113972 754004 chrY 57772954 2128238 Y/NT_113973 581282 chrY 57772954 2709520 Y/NT_011896 6265435 chrY 57772954 9024955 Y/NT_086998 276367 chrY 57772954 9901322 Y/NT_011878 813231 chrY 57772954 11214553 Y/NT_087001 39401 chrY 57772954 11653954 Y/NT_113819 554624 chrY 57772954 12308578 Y/NT_011875 10002238 chrY 57772954 22360816 Y/NT_011903 4867933 chrY 57772954 57228749 Y/NT_025975 98295 chrY 57772954 57377044 Y/NT_091573 66393 chrY 57772954 57443437 Y/NT_113974 329517 chrY 57772954 0 1/NT_113870 145186 chr1_random 1663265 195186 1/NT_113871 197748 chr1_random 1663265 442934 1/NT_113872 183763 chr1_random 1663265 676697 1/NT_113873 51825 chr1_random 1663265 778522 1/NT_113874 136815 chr1_random 1663265 965337 1/NT_113875 114056 chr1_random 1663265 1129393 1/NT_113876 25994 chr1_random 1663265 1205387 1/NT_113877 208942 chr1_random 1663265 1464329 1/NT_113878 106433 chr1_random 1663265 1620762 1/NT_113879 42503 chr1_random 1663265 0 10/NT_113918 113275 chr10_random 113275 0 11/NT_113919 40524 chr11_random 215294 90524 11/NT_113920 35155 chr11_random 215294 175679 11/NT_113921 39615 chr11_random 215294 0 13/NT_113923 186858 chr13_random 186858 0 15/NT_113924 139260 chr15_random 784346 189260 15/NT_113925 168820 chr15_random 784346 408080 15/NT_113926 119514 chr15_random 784346 577594 15/NT_113927 111864 chr15_random 784346 739458 15/NT_113928 44888 chr15_random 784346 0 16/NT_113929 105485 chr16_random 105485 0 17/NT_113930 174588 chr17_random 2617613 224588 17/NT_113931 186078 chr17_random 2617613 460666 17/NT_113932 104495 chr17_random 2617613 615161 17/NT_113933 142595 chr17_random 2617613 807756 17/NT_113934 120350 chr17_random 2617613 978106 17/NT_113935 185449 chr17_random 2617613 1213555 17/NT_113936 163628 chr17_random 2617613 1427183 17/NT_113937 37443 chr17_random 2617613 1514626 17/NT_113938 45226 chr17_random 2617613 1609852 17/NT_113939 147354 chr17_random 2617613 1807206 17/NT_113940 19187 chr17_random 2617613 1876393 17/NT_113941 37498 chr17_random 2617613 1963891 17/NT_113942 117663 chr17_random 2617613 2131554 17/NT_113943 81310 chr17_random 2617613 2262864 17/NT_113944 182567 chr17_random 2617613 2495431 17/NT_113945 41001 chr17_random 2617613 2586432 17/NT_113946 31181 chr17_random 2617613 0 18/NT_113947 4262 chr18_random 4262 0 19/NT_113948 92689 chr19_random 301858 142689 19/NT_113949 159169 chr19_random 301858 0 2/NT_113880 185571 chr2_random 185571 0 21/NT_113950 28709 chr21_random 1679693 78709 21/NT_113951 152296 chr21_random 1679693 281005 21/NT_113952 184355 chr21_random 1679693 515360 21/NT_113953 131056 chr21_random 1679693 696416 21/NT_113954 129889 chr21_random 1679693 876305 21/NT_113955 178865 chr21_random 1679693 1105170 21/NT_113956 150002 chr21_random 1679693 1305172 21/NT_113957 166452 chr21_random 1679693 1521624 21/NT_113958 158069 chr21_random 1679693 0 22/NT_113960 40752 chr22_random 257318 90752 22/NT_113961 166566 chr22_random 257318 0 3/NT_113881 146010 chr3_random 749256 196010 3/NT_113882 172475 chr3_random 749256 418485 3/NT_113883 137703 chr3_random 749256 606188 3/NT_113884 143068 chr3_random 749256 0 4/NT_113885 189789 chr4_random 842648 239789 4/NT_113886 96249 chr4_random 842648 386038 4/NT_113887 3994 chr4_random 842648 440032 4/NT_113888 191469 chr4_random 842648 681501 4/NT_113889 161147 chr4_random 842648 0 5/NT_113890 143687 chr5_random 143687 0 6/NT_113898 1305230 chr6_random 1875562 1355230 6/NT_113899 520332 chr6_random 1875562 0 7/NT_113900 112804 chr7_random 549659 162804 7/NT_113901 182896 chr7_random 549659 395700 7/NT_113902 153959 chr7_random 549659 0 8/NT_113903 12854 chr8_random 943810 62854 8/NT_113904 50950 chr8_random 943810 163804 8/NT_113905 183161 chr8_random 943810 396965 8/NT_113906 46082 chr8_random 943810 493047 8/NT_113907 37175 chr8_random 943810 580222 8/NT_113908 13036 chr8_random 943810 643258 8/NT_113909 38914 chr8_random 943810 732172 8/NT_113910 211638 chr8_random 943810 0 9/NT_113911 36148 chr9_random 1146434 86148 9/NT_113912 185143 chr9_random 1146434 321291 9/NT_113913 154740 chr9_random 1146434 526031 9/NT_113914 90085 chr9_random 1146434 666116 9/NT_113915 187035 chr9_random 1146434 903151 9/NT_113916 173443 chr9_random 1146434 1126594 9/NT_113917 19840 chr9_random 1146434 0 X/NT_113962 217385 chrX_random 1719168 267385 X/NT_113963 24360 chrX_random 1719168 341745 X/NT_113964 204131 chrX_random 1719168 595876 X/NT_113965 1005289 chrX_random 1719168 1651165 X/NT_113966 68003 chrX_random 1719168 IRanges/inst/extdata/hs_b36v3_chrY.agp0000644000175100017510000003070514626176651020567 0ustar00biocbuildbiocbuild# # Homo sapiens chromosome Y, reference assembly, complete sequence # # This file provides assembly instructions for sequence NC_000024 # included in reference assembly of NCBI build 36 (HGSC Finished Genome v4.0). # #chrom chr_start chr_stop part_no part_type comp_id/gap_len comp_type/gap_type comp_end/linkage orientation/empty chrY 1 34821 1 F BX640545.2 1 34821 + chrY 34822 84821 2 N 50000 contig no chrY 84822 122592 3 F AL954722.18 1 37771 + chrY 122593 157464 4 F BX537334.4 1 34872 - chrY 157465 171384 5 F BX000483.7 1999 15918 + chrY 171385 201384 6 N 30000 contig no chrY 201385 232395 7 F AL954664.17 8190 39200 - chrY 232396 265528 8 F BX000476.5 208 33340 + chrY 265529 482251 9 F AL732314.18 2001 218723 + chrY 482252 550112 10 F BX004827.18 51695 119555 + chrY 550113 723877 11 F AL683871.15 2001 175765 + chrY 723878 837875 12 F AL672311.26 2001 115998 + chrY 837876 967557 13 F AL672277.20 2001 131682 + chrY 967558 1017557 14 N 50000 contig no chrY 1017558 1054113 15 F BX908402.3 1 36556 + chrY 1054114 1104113 16 N 50000 contig no chrY 1104114 1147822 17 F BX649635.3 1 43709 + chrY 1147823 1184234 18 F BX901949.9 1685 38096 + chrY 1184235 1274234 19 N 90000 contig no chrY 1274235 1307891 20 F BX908382.8 1 33657 + chrY 1307892 1390773 21 F BX649553.6 2001 84882 + chrY 1390774 1425289 22 F BX901885.7 1 34516 + chrY 1425290 1458955 23 F BX119906.16 2001 35666 + chrY 1458956 1618348 24 F AL683870.15 1001 160393 + chrY 1618349 1661433 25 F AL691415.17 1 43085 + chrY 1661434 1849259 26 F AL683807.22 1 187826 + chrY 1849260 1966556 27 F AL672040.10 1 117297 + chrY 1966557 2028238 28 F CR381640.8 2001 63682 + chrY 2028239 2128238 29 N 100000 contig no chrY 2128239 2165561 30 F CR856018.10 1 37323 + chrY 2165562 2202890 31 F CR381696.5 2000 39328 + chrY 2202891 2280454 32 F BX649443.16 2001 79564 + chrY 2280455 2333896 33 F BX119919.5 1 53442 - chrY 2333897 2514591 34 F AC079176.15 1 180695 - chrY 2514592 2593088 35 F AC097314.27 1 78497 - chrY 2593089 2709520 36 F AC006209.25 23323 139754 - chrY 2709521 2838553 37 F AC006040.3 57272 186304 + chrY 2838554 2845472 38 F AC074181.1 1 6919 + chrY 2845473 2999955 39 F AC006157.2 1 154483 + chrY 2999956 3170037 40 F AC006032.2 1 170082 + chrY 3170038 3247254 41 F AC006152.4 1 77217 + chrY 3247255 3316539 42 F AC011305.2 1 69285 + chrY 3316540 3471638 43 F AC009479.4 1 155099 + chrY 3471639 3567129 44 F AC019058.4 1 95491 + chrY 3567130 3650550 45 F AC024038.6 1 83421 + chrY 3650551 3827222 46 F AC012078.3 1 176672 + chrY 3827223 3945458 47 F AC010094.5 1 118236 + chrY 3945459 4015462 48 F AC010737.4 1 70004 + chrY 4015463 4164401 49 F AC010084.3 1 148939 + chrY 4164402 4316800 50 F AC010905.3 1 152399 + chrY 4316801 4335619 51 F AC010106.2 1 18819 + chrY 4335620 4367320 52 F AC024703.5 1 31701 + chrY 4367321 4576407 53 F AC012077.4 1 209087 + chrY 4576408 4712669 54 F AC010142.4 1 136262 + chrY 4712670 4829264 55 F AC019060.5 1 116595 + chrY 4829265 4917080 56 F AC023423.5 1 87816 + chrY 4917081 5077603 57 F AC010722.2 1 160523 + chrY 5077604 5237705 58 F AC010685.3 1 160102 + chrY 5237706 5281650 59 F AC010129.3 1 43945 + chrY 5281651 5445957 60 F AC012067.2 1 164307 + chrY 5445958 5520761 61 F AC012667.2 1 74804 + chrY 5520762 5697342 62 F AC010081.4 1 176581 + chrY 5697343 5811187 63 F AC010874.3 1 113845 + chrY 5811188 5947272 64 F AC010977.4 1 136085 + chrY 5947273 6015798 65 F AC016681.2 1 68526 + chrY 6015799 6163940 66 F AC010140.3 1 148142 + chrY 6163941 6386252 67 F AC006335.2 1 222312 + chrY 6386253 6441403 68 F AC010154.3 1 55151 + chrY 6441404 6607550 69 F AC010144.4 1 166147 + chrY 6607551 6623520 70 F AC010728.4 1 15970 + chrY 6623521 6823534 71 F AC013412.3 1 200014 + chrY 6823535 6877729 72 F AC011297.3 1 54195 + chrY 6877730 7044045 73 F AC012068.5 1 166316 + chrY 7044046 7154095 74 F AC010104.3 1 110050 + chrY 7154096 7233943 75 F AC010143.3 1 79848 + chrY 7233944 7405192 76 F AC007284.4 1 171249 + chrY 7405193 7521780 77 F AC007247.5 1 116588 + chrY 7521781 7638889 78 F AC007274.3 1 117109 + chrY 7638890 7808593 79 F AC007275.4 1 169704 + chrY 7808594 7883829 80 F AC010678.4 1 75236 + chrY 7883830 7997798 81 F AC010902.4 1 113969 + chrY 7997799 8184715 82 F AC016749.4 1 186917 + chrY 8184716 8278929 83 F AC051663.9 1 94214 + chrY 8278930 8370383 84 F AC025731.12 1 91454 + chrY 8370384 8544585 85 F AC016991.5 1 174202 + chrY 8544586 8634069 86 F AC064829.6 1 89484 + chrY 8634070 8774803 87 F AC009491.3 1 140734 + chrY 8774804 8962150 88 F AC007967.3 1 187347 + chrY 8962151 8972934 89 F AC068719.3 1 10784 + chrY 8972935 8974955 90 F AC079126.3 1 2021 + chrY 8974956 9024955 91 N 50000 clone no chrY 9024956 9030327 92 F AC079125.4 1 5372 + chrY 9030328 9178185 93 F AC009952.4 1 147858 + chrY 9178186 9190963 94 F AC025732.9 1 12778 + chrY 9190964 9301322 95 F AC006158.6 1 110359 + chrY 9301323 9901322 96 N 600000 clone no chrY 9901323 10013703 97 F AC006156.5 1 112381 + chrY 10013704 10088698 98 F AC025819.7 1 74995 + chrY 10088699 10250877 99 F AC017019.3 1 162179 + chrY 10250878 10283186 100 F AC010891.2 1 32309 + chrY 10283187 10456943 101 F AC006986.3 1 173757 + chrY 10456944 10622784 102 F AC006987.2 1 165841 + chrY 10622785 10714553 103 F AC010970.3 1 91769 + chrY 10714554 11214553 104 N 500000 clone no chrY 11214554 11253954 105 F AC069323.5 1 39401 + chrY 11253955 11653954 106 N 400000 centromere no chrY 11653955 11738549 107 F AC140113.3 71061 155655 - chrY 11738550 11861114 108 F AC134878.3 33249 155813 - chrY 11861115 12003063 109 F AC134882.2 10706 152654 - chrY 12003064 12208578 110 F AC134879.3 1 205515 - chrY 12208579 12308578 111 N 100000 centromere no chrY 12308579 12468100 112 F AC011293.5 1 159522 + chrY 12468101 12581699 113 F AC012502.3 1 113599 + chrY 12581700 12759636 114 F AC011302.3 1 177937 + chrY 12759637 12838587 115 F AC013735.5 1 78951 + chrY 12838588 12911566 116 F AC004772.2 40021 112999 + chrY 12911567 12936024 117 F AC005942.2 1 24458 - chrY 12936025 13059669 118 F AC002992.1 2001 125645 + chrY 13059670 13234892 119 F AC004617.2 1 175223 + chrY 13234893 13319390 120 F AC004810.1 1 84498 - chrY 13319391 13515290 121 F AC002531.1 2001 197900 + chrY 13515291 13619376 122 F AC004474.1 44195 148280 + chrY 13619377 13664255 123 F AC006565.4 1 44879 - chrY 13664256 13879980 124 F AC005820.1 1 215725 - chrY 13879981 13952171 125 F AC010877.3 64839 137029 + chrY 13952172 14123978 126 F AC006376.2 1 171807 + chrY 14123979 14159738 127 F AC007004.3 1 35760 + chrY 14159739 14264748 128 F AC006383.2 1 105010 + chrY 14264749 14466902 129 F AC006371.2 1 202154 + chrY 14466903 14639848 130 F AC006370.2 1 172946 + chrY 14639849 14742756 131 F AC018677.3 1 102908 + chrY 14742757 14779500 132 F AC010720.4 1 36744 + chrY 14779501 14953720 133 F AC010723.3 1 174220 + chrY 14953721 14981864 134 F AC019191.4 1 28144 + chrY 14981865 15158188 135 F AC010726.4 1 176324 + chrY 15158189 15267382 136 F AC010979.3 1 109194 + chrY 15267383 15447103 137 F AC010879.2 1 179721 + chrY 15447104 15512578 138 F AC011903.4 1 65475 + chrY 15512579 15557501 139 F AC017032.3 1 44923 + chrY 15557502 15714967 140 F AC006989.3 1 157466 + chrY 15714968 15795304 141 F AC011289.4 1 80337 + chrY 15795305 15909741 142 F AC010972.3 1 114437 + chrY 15909742 15967360 143 F AC007007.3 1 57619 + chrY 15967361 16111420 144 F AC006998.3 1 144060 + chrY 16111421 16274030 145 F AC006382.3 1 162610 + chrY 16274031 16413824 146 F AC006462.3 1 139794 + chrY 16413825 16509580 147 F AC006336.4 1 95756 + chrY 16509581 16528817 148 F AC016671.3 1 19237 + chrY 16528818 16695020 149 F AC017020.4 1 166203 + chrY 16695021 16899873 150 F AC011749.2 1 204853 + chrY 16899874 16926987 151 F AC053516.10 1 27114 + chrY 16926988 17103738 152 F AC010135.3 1 176751 + chrY 17103739 17166324 153 F AC010128.3 1 62586 + chrY 17166325 17345839 154 F AC011751.2 1 179515 + chrY 17345840 17488621 155 F AC016678.4 1 142782 + chrY 17488622 17510633 156 F AC015979.4 1 22012 + chrY 17510634 17659631 157 F AC007034.4 1 148998 + chrY 17659632 17766833 158 F AC007043.3 1 107202 + chrY 17766834 17879511 159 F AC006999.2 1 112678 + chrY 17879512 17886218 160 F AC007042.3 1 6707 + chrY 17886219 17923463 161 F AC091329.3 1 37245 + chrY 17923464 18051552 162 F AC007972.4 1 128089 + chrY 18051553 18230132 163 F AC015978.4 1 178580 + chrY 18230133 18244934 164 F AC068704.4 1 14802 + chrY 18244935 18443908 165 F AC007742.4 1 198974 + chrY 18443909 18474219 166 F AC095381.1 1 30311 + chrY 18474220 18653273 167 F AC009976.4 1 179054 + chrY 18653274 18698502 168 F AC095380.1 1 45229 + chrY 18698503 18861991 169 F AC024183.4 1 163489 + chrY 18861992 19023659 170 F AC007241.3 1 161668 + chrY 19023660 19090333 171 F AC069130.6 1 66674 + chrY 19090334 19096847 172 F AC073962.5 1 6514 + chrY 19096848 19134213 173 F AC068541.7 1 37366 + chrY 19134214 19290370 174 F AC022486.4 1 156157 + chrY 19290371 19464252 175 F AC007379.2 1 173882 + chrY 19464253 19623962 176 F AC009235.4 1 159710 + chrY 19623963 19808739 177 F AC007244.2 1 184777 + chrY 19808740 19871998 178 F AC021210.4 1 63259 + chrY 19871999 20023872 179 F AC010133.4 1 151874 + chrY 20023873 20066067 180 F AC012062.4 1 42195 + chrY 20066068 20189568 181 F AC010137.3 1 123501 + chrY 20189569 20221039 182 F AC009977.4 1 31471 + chrY 20221040 20400963 183 F AC010889.3 1 179924 + chrY 20400964 20472584 184 F AC010151.3 1 71621 + chrY 20472585 20667823 185 F AC009233.3 1 195239 + chrY 20667824 20778857 186 F AC079157.3 1 111034 + chrY 20778858 20829067 187 F AC079261.2 1 50210 + chrY 20829068 20846452 188 F AC079156.4 1 17385 + chrY 20846453 20889975 189 F AC024250.6 1 43523 + chrY 20889976 20991065 190 F AC009240.6 1 101090 + chrY 20991066 21140390 191 F AC011745.4 1 149325 + chrY 21140391 21249363 192 F AC007678.3 1 108973 + chrY 21249364 21437477 193 F AC009494.2 1 188114 + chrY 21437478 21456819 194 F AC026061.8 1 19342 + chrY 21456820 21557572 195 F AC009489.3 1 100753 + chrY 21557573 21722251 196 F AC007876.2 1 164679 + chrY 21722252 21858422 197 F AC009239.3 1 136171 + chrY 21858423 22029157 198 F AC010086.4 1 170735 + chrY 22029158 22150215 199 F AC010141.2 1 121058 + chrY 22150216 22310816 200 F AC021107.3 1 160601 + chrY 22310817 22360816 201 N 50000 clone no chrY 22360817 22384194 202 F AC078938.3 1 23378 + chrY 22384195 22403616 203 F AC024236.5 1 19422 + chrY 22403617 22587631 204 F AC007322.4 1 184015 + chrY 22587632 22688058 205 F AC007359.3 1 100427 + chrY 22688059 22757212 206 F AC023342.3 1 69154 + chrY 22757213 22817822 207 F AC025227.6 1 60610 + chrY 22817823 23005516 208 F AC007320.3 1 187694 + chrY 23005517 23210553 209 F AC008175.2 1 205037 + chrY 23210554 23230057 210 F AC016694.2 1 19504 + chrY 23230058 23386813 211 F AC010080.2 1 156756 + chrY 23386814 23417437 212 F AC016911.6 1 30624 + chrY 23417438 23583695 213 F AC006366.4 201 166458 - chrY 23583696 23691182 214 F AC010088.4 1 107487 + chrY 23691183 23794414 215 F AC053490.2 1 103232 + chrY 23794415 23811321 216 F AC007039.6 1 16907 + chrY 23811322 23991466 217 F AC006983.4 1 180145 + chrY 23991467 24140496 218 F AC009947.2 1 149030 + chrY 24140497 24157833 219 F AC016707.2 1 17337 + chrY 24157834 24324069 220 F AC016752.2 1 166236 + chrY 24324070 24353833 221 F AC025246.6 1 29764 + chrY 24353834 24474584 222 F AC073649.3 1 120751 + chrY 24474585 24538834 223 F AC073893.4 1 64250 + chrY 24538835 24614390 224 F AC068601.8 1 75556 + chrY 24614391 24751872 225 F AC023274.2 1 137482 + chrY 24751873 24941229 226 F AC012005.4 1 189357 + chrY 24941230 24948693 227 F AC013465.4 1 7464 + chrY 24948694 25102779 228 F AC016698.3 1 154086 + chrY 25102780 25206726 229 F AC010153.3 1 103947 + chrY 25206727 25291998 230 F AC025735.4 1 85272 + chrY 25291999 25395154 231 F AC010089.4 1 103156 + chrY 25395155 25397101 232 F AC006982.3 1 1947 + chrY 25397102 25572891 233 F AC006338.6 1 175790 + chrY 25572892 25724743 234 F AC016728.4 1 151852 + chrY 25724744 25888772 235 F AC006386.4 1 164029 + chrY 25888773 26066341 236 F AC006328.5 1 177569 + chrY 26066342 26203218 237 F AC007562.4 1 136877 + chrY 26203219 26348595 238 F AC010682.3 1 145377 + chrY 26348596 26443316 239 F AC017005.7 1 94721 + chrY 26443317 26625199 240 F AC007965.3 1 181883 + chrY 26625200 26798241 241 F AC006991.3 1 173042 + chrY 26798242 26906387 242 F AC024067.4 1 108146 + chrY 26906388 27086206 243 F AC013734.4 1 179819 + chrY 27086207 27194539 244 F AC019099.6 1 108333 + chrY 27194540 27228749 245 F AC073880.5 1 34210 + chrY 27228750 57228749 246 N 30000000 heterochromatin no chrY 57228750 57327044 247 F AC068123.5 1 98295 + chrY 57327045 57377044 248 N 50000 clone no chrY 57377045 57443437 249 F AC025226.4 101674 168066 - chrY 57443438 57614293 250 F AJ271735.1 69145 240000 + chrY 57614294 57772954 251 F AJ271736.1 1 158661 + IRanges/inst/include/0000755000175100017510000000000014626176651015540 5ustar00biocbuildbiocbuildIRanges/inst/include/IRanges_defines.h0000644000175100017510000000312214626176651020734 0ustar00biocbuildbiocbuild/***************************************************************************** IRanges C interface: typedefs and defines ----------------------------------------- The IRanges C interface is split in 2 files: 1. IRanges_defines.h (this file): contains the typedefs and defines of the interface. 2. IRanges_interface.h (in this directory): contains the prototypes of the IRanges C routines that are part of the interface. Please consult IRanges_interface.h for how to use this interface in your package. *****************************************************************************/ #ifndef IRANGES_DEFINES_H #define IRANGES_DEFINES_H #include "S4Vectors_defines.h" #include #include /* * *_holder structs. */ typedef struct compressed_chars_list_holder { int length; const char *unlisted; const int *breakpoints; } CompressedCharsList_holder; typedef struct compressed_ints_list_holder { int length; const int *unlisted; const int *breakpoints; } CompressedIntsList_holder; typedef struct compressed_doubles_list_holder { int length; const double *unlisted; const int *breakpoints; } CompressedDoublesList_holder; typedef struct iranges_holder { const char *classname; int is_constant_width; int length; const int *width; const int *start; const int *end; int SEXP_offset; /* offset in 'names' member below */ SEXP names; } IRanges_holder; typedef struct compressed_iranges_list_holder { const char *classname; int length; const int *end; IRanges_holder unlistData_holder; } CompressedIRangesList_holder; #endif IRanges/inst/include/IRanges_interface.h0000644000175100017510000000633314626176651021266 0ustar00biocbuildbiocbuild/***************************************************************************** IRanges C interface: prototypes ------------------------------- The IRanges C interface is split in 2 files: 1. IRanges_defines.h (in this directory): contains the typedefs and defines of the interface. 2. IRanges_interface.h (this file): contains the prototypes of the IRanges C routines that are part of the interface. *****************************************************************************/ #include "IRanges_defines.h" /* * Comparing integer ranges. * (see IPosRanges_comparison.c) */ int overlap_code( int x_start, int x_width, int y_start, int y_width ); int invert_overlap_code(int code); /* * Low-level manipulation of IRanges objects. * (see IRanges_class.c) */ SEXP get_IRanges_start(SEXP x); SEXP get_IRanges_width(SEXP x); SEXP get_IRanges_names(SEXP x); int get_IRanges_length(SEXP x); IRanges_holder hold_IRanges(SEXP x); int get_length_from_IRanges_holder(const IRanges_holder *x_holder); int get_width_elt_from_IRanges_holder(const IRanges_holder *x_holder, int i); int get_start_elt_from_IRanges_holder(const IRanges_holder *x_holder, int i); int get_end_elt_from_IRanges_holder(const IRanges_holder *x_holder, int i); SEXP get_names_elt_from_IRanges_holder(const IRanges_holder *x_holder, int i); IRanges_holder get_linear_subset_from_IRanges_holder(const IRanges_holder *x_holder, int offset, int length); void set_IRanges_names(SEXP x, SEXP names); void copy_IRanges_slots(SEXP x, SEXP x0); SEXP new_IRanges(const char *classname, SEXP start, SEXP width, SEXP names); SEXP new_IRanges_from_IntPairAE(const char *classname, const IntPairAE *intpair_ae); SEXP new_list_of_IRanges_from_IntPairAEAE(const char *element_type, const IntPairAEAE *intpair_aeae); SEXP alloc_IRanges(const char *classname, int length); /* * Low-level manipulation of Grouping objects. * (see Grouping_class.c) */ SEXP get_H2LGrouping_high2low(SEXP x); SEXP get_H2LGrouping_low2high(SEXP x); SEXP get_Partitioning_names(SEXP x); SEXP get_PartitioningByEnd_end(SEXP x); SEXP new_PartitioningByEnd(const char *classname, SEXP end, SEXP names); /* * Low-level manipulation of CompressedList objects. * (see CompressedList_class.c) */ SEXP get_CompressedList_unlistData(SEXP x); SEXP get_CompressedList_partitioning(SEXP x); int get_CompressedList_length(SEXP x); SEXP get_CompressedList_names(SEXP x); SEXP new_CompressedList(const char *classname, SEXP unlistData, SEXP partitioning); CompressedIntsList_holder hold_CompressedIntegerList(SEXP x); int get_length_from_CompressedIntsList_holder(const CompressedIntsList_holder *x_holder); Ints_holder get_elt_from_CompressedIntsList_holder(const CompressedIntsList_holder *x_holder, int i); /* * Low-level manipulation of CompressedIRangesList objects. * (see CompressedIRangesList_class.c) */ CompressedIRangesList_holder hold_CompressedIRangesList(SEXP x); int get_length_from_CompressedIRangesList_holder(const CompressedIRangesList_holder *x_holder); IRanges_holder get_elt_from_CompressedIRangesList_holder(const CompressedIRangesList_holder *x_holder, int i); int get_eltNROWS_from_CompressedIRangesList_holder(const CompressedIRangesList_holder *x_holder, int i); IRanges/inst/include/_IRanges_stubs.c0000644000175100017510000001271614626176651020622 0ustar00biocbuildbiocbuild#include "IRanges_interface.h" #define DEFINE_CCALLABLE_STUB(retT, stubname, Targs, args) \ typedef retT(*__ ## stubname ## _funtype__)Targs; \ retT stubname Targs \ { \ static __ ## stubname ## _funtype__ fun = NULL; \ if (fun == NULL) \ fun = (__ ## stubname ## _funtype__) R_GetCCallable("IRanges", "_" #stubname); \ return fun args; \ } /* * Using the above macro when retT (the returned type) is void will make Sun * Studio 12 C compiler unhappy. So we need to use the following macro to * handle that case. */ #define DEFINE_NOVALUE_CCALLABLE_STUB(stubname, Targs, args) \ typedef void(*__ ## stubname ## _funtype__)Targs; \ void stubname Targs \ { \ static __ ## stubname ## _funtype__ fun = NULL; \ if (fun == NULL) \ fun = (__ ## stubname ## _funtype__) R_GetCCallable("IRanges", "_" #stubname); \ fun args; \ return; \ } /* * Stubs for callables defined in IPosRanges_comparison.c */ DEFINE_CCALLABLE_STUB(int, overlap_code, (int x_start, int x_width, int y_start, int y_width), ( x_start, x_width, y_start, y_width) ) DEFINE_CCALLABLE_STUB(int, invert_overlap_code, (int code), ( code) ) /* * Stubs for callables defined in IRanges_class.c */ DEFINE_CCALLABLE_STUB(SEXP, get_IRanges_start, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(SEXP, get_IRanges_width, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(SEXP, get_IRanges_names, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(int, get_IRanges_length, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(IRanges_holder, hold_IRanges, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(int, get_length_from_IRanges_holder, (const IRanges_holder *x_holder), ( x_holder) ) DEFINE_CCALLABLE_STUB(int, get_width_elt_from_IRanges_holder, (const IRanges_holder *x_holder, int i), ( x_holder, i) ) DEFINE_CCALLABLE_STUB(int, get_start_elt_from_IRanges_holder, (const IRanges_holder *x_holder, int i), ( x_holder, i) ) DEFINE_CCALLABLE_STUB(int, get_end_elt_from_IRanges_holder, (const IRanges_holder *x_holder, int i), ( x_holder, i) ) DEFINE_CCALLABLE_STUB(SEXP, get_names_elt_from_IRanges_holder, (const IRanges_holder *x_holder, int i), ( x_holder, i) ) DEFINE_CCALLABLE_STUB(IRanges_holder, get_linear_subset_from_IRanges_holder, (const IRanges_holder *x_holder, int offset, int length), ( x_holder, offset, length) ) DEFINE_NOVALUE_CCALLABLE_STUB(set_IRanges_names, (SEXP x, SEXP names), ( x, names) ) DEFINE_NOVALUE_CCALLABLE_STUB(copy_IRanges_slots, (SEXP x, SEXP x0), ( x, x0) ) DEFINE_CCALLABLE_STUB(SEXP, new_IRanges, (const char *classname, SEXP start, SEXP width, SEXP names), ( classname, start, width, names) ) DEFINE_CCALLABLE_STUB(SEXP, new_IRanges_from_IntPairAE, (const char *classname, const IntPairAE *intpair_ae), ( classname, intpair_ae) ) DEFINE_CCALLABLE_STUB(SEXP, new_list_of_IRanges_from_IntPairAEAE, (const char *element_type, const IntPairAEAE *intpair_aeae), ( element_type, intpair_aeae) ) DEFINE_CCALLABLE_STUB(SEXP, alloc_IRanges, (const char *classname, int length), ( classname, length) ) /* * Stubs for callables defined in Grouping_class.c */ DEFINE_CCALLABLE_STUB(SEXP, get_H2LGrouping_high2low, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(SEXP, get_H2LGrouping_low2high, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(SEXP, get_Partitioning_names, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(SEXP, get_PartitioningByEnd_end, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(SEXP, new_PartitioningByEnd, (const char *classname, SEXP end, SEXP names), ( classname, end, names) ) /* * Stubs for callables defined in CompressedList_class.c */ DEFINE_CCALLABLE_STUB(SEXP, get_CompressedList_unlistData, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(SEXP, get_CompressedList_partitioning, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(int, get_CompressedList_length, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(SEXP, get_CompressedList_names, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(SEXP, new_CompressedList, (const char *classname, SEXP unlistData, SEXP partitioning), ( classname, unlistData, partitioning) ) DEFINE_CCALLABLE_STUB(CompressedIntsList_holder, hold_CompressedIntegerList, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(int, get_length_from_CompressedIntsList_holder, (const CompressedIntsList_holder *x_holder), ( x_holder) ) DEFINE_CCALLABLE_STUB(Ints_holder, get_elt_from_CompressedIntsList_holder, (const CompressedIntsList_holder *x_holder, int i), ( x_holder, i) ) /* * Stubs for callables defined in CompressedIRangesList_class.c */ DEFINE_CCALLABLE_STUB(CompressedIRangesList_holder, hold_CompressedIRangesList, (SEXP x), ( x) ) DEFINE_CCALLABLE_STUB(int, get_length_from_CompressedIRangesList_holder, (const CompressedIRangesList_holder *x_holder), ( x_holder) ) DEFINE_CCALLABLE_STUB(IRanges_holder, get_elt_from_CompressedIRangesList_holder, (const CompressedIRangesList_holder *x_holder, int i), ( x_holder, i) ) DEFINE_CCALLABLE_STUB(int, get_eltNROWS_from_CompressedIRangesList_holder, (const CompressedIRangesList_holder *x_holder, int i), ( x_holder, i) ) IRanges/inst/unitTests/0000755000175100017510000000000014626176651016117 5ustar00biocbuildbiocbuildIRanges/inst/unitTests/test_AtomicList-class.R0000644000175100017510000000712014626176651022454 0ustar00biocbuildbiocbuildtest_AtomicList_constructors <- function() { subclasses <- c(logical="LogicalList", integer="IntegerList", #double="NumericList", numeric="NumericList", complex="ComplexList", character="CharacterList", raw="RawList", Rle="RleList") for (elt_type in names(subclasses)) { subclass <- subclasses[[elt_type]] constructor <- get(subclass) vec1 <- get(elt_type)(6) vec2 <- get(elt_type)(8) target <- list(A=vec1, B=vec2) for (compress in c(TRUE, FALSE)) { current <- constructor(A=vec1, B=vec2, compress=compress) checkTrue(is(current, subclass)) checkIdentical(compress, is(current, "CompressedList")) checkIdentical(elt_type, elementType(current)) checkIdentical(target, as.list(current)) checkIdentical(unname(target), as.list(current, use.names=FALSE)) } } } test_AtomicList_general <- function() { vec1 <- c(1L,2L,NA,3L,NA,5L,2L,8L) vec2 <- c(NA,15L,45L,20L,NA,1L,15L,100L,80L,5L,NA) for (compress in c(TRUE, FALSE)) { for (type in c("IntegerList", "RleList")) { list1 <- do.call(type, list(one = vec1, vec2, compress = compress)) checkIdentical(as.list(list1 %in% c(1L, 5L)), lapply(list1, "%in%", c(1L, 5L))) checkIdentical(lapply(list1 %in% IntegerList(one = vec1, vec2, compress = compress), as.vector), mapply("%in%", lapply(list1, as.vector), list(one = vec1, vec2))) checkIdentical(as.list(is.na(list1)), lapply(list1, is.na)) checkIdentical(as.list(match(list1, c(1L, 5L))), lapply(list1, match, c(1L, 5L))) checkIdentical(lapply(match(list1, IntegerList(one = vec1, vec2, compress = compress)), as.vector), mapply(match, lapply(list1, as.vector), list(one = vec1, vec2))) checkIdentical(as.list(sort(list1)), lapply(list1, sort)) checkIdentical(as.list(unique(list1)), lapply(list1, unique)) } } } test_RleList_methods <- function() { x1 <- RleList(11:15, 15L, integer(0), 15:16, compress=FALSE) x2 <- RleList(11:15, 15L, integer(0), 15:16, compress=TRUE) checkIdentical(as(runValue(x1), "CompressedIntegerList"), runValue(x2)) checkIdentical(as(runLength(x1), "CompressedIntegerList"), runLength(x2)) checkIdentical(as(ranges(x1), "CompressedIRangesList"), ranges(x2)) a1 <- Rle(1, 999722111) a2 <- 20 * a1 a <- RleList(a1, a2, compress=TRUE) b1 <- c(a1, a1) b2 <- 20 * b1 b <- RleList(b1, b2, compress=FALSE) ## FIXME: 'a1 <= 19:21' is taking forever and eats up all the memory in ## BioC <= 2.12! Seems like 'a1' is expanded to integer vector first, which ## is not good :-/ #for (y in list(8L, 8, 19:21)) { for (y in list(8L, 8)) { ## With a CompressedRleList target <- RleList(a1 <= y, a2 <= y, compress=TRUE) current <- a <= y checkIdentical(target, current) ## With a SimpleRleList target <- RleList(b1 <= y, b2 <= y, compress=FALSE) current <- b <= y checkIdentical(target, current) } } IRanges/inst/unitTests/test_AtomicList-utils.R0000644000175100017510000002053214626176651022511 0ustar00biocbuildbiocbuildtest_AtomicList_GroupGenerics <- function() { vec1 <- c(1L,2L,3L,5L,2L,8L) vec2 <- c(15L,45L,20L,1L,15L,100L,80L,5L) for (compress in c(TRUE, FALSE)) { for (type in c("IntegerList", "RleList")) { list1 <- do.call(type, list(one = vec1, vec2, compress = compress)) checkIdentical(as.list(list1 + list1), Map("+", list1, list1)) checkIdentical(as.list(log(list1)), lapply(list1, log)) checkIdentical(as.list(round(sqrt(list1))), lapply(list1, function(x) round(sqrt(x)))) checkIdentical(sum(list1), sapply(list1, sum)) } } } test_AtomicList_logical <- function() { vec1 <- c(TRUE,NA,FALSE, NA) vec2 <- c(TRUE,TRUE,FALSE,FALSE,TRUE,FALSE,TRUE,TRUE,TRUE) for (compress in c(TRUE, FALSE)) { for (type in c("LogicalList", "RleList")) { list1 <- do.call(type, list(one = vec1, vec2, compress = compress)) checkIdentical(as.list(!list1), lapply(list1, "!")) checkIdentical(as.list(which(list1)), lapply(list1, which)) } } } test_AtomicList_numerical <- function() { vec1 <- c(1L,2L,NA,3L,NA,5L,2L,8L) vec2 <- c(NA,15L,45L,20L,NA,1L,15L,100L,80L,5L,NA) for (compress in c(TRUE, FALSE)) { for (type in c("IntegerList", "RleList")) { list1 <- do.call(type, list(one = vec1, vec2, compress = compress)) list2 <- endoapply(list1, rev) checkIdentical(as.list(diff(list1)), lapply(list1, diff)) checkIdentical(as.list(pmax(list1, list2)), mapply(pmax, list1, list2)) checkIdentical(as.list(pmin(list1, list2)), mapply(pmin, list1, list2)) checkIdentical(as.list(pmax.int(list1, list2)), mapply(pmax.int, list1, list2)) checkIdentical(as.list(pmin.int(list1, list2)), mapply(pmin.int, list1, list2)) checkIdentical(mean(list1, na.rm=TRUE), sapply(list1, mean, na.rm=TRUE)) checkIdentical(var(list1, na.rm=TRUE), sapply(list1, var, na.rm=TRUE)) checkIdentical(cov(list1, list2, use="complete.obs"), mapply(cov, list1, list2, MoreArgs = list(use="complete.obs"))) checkIdentical(cor(list1, list2, use="complete.obs"), mapply(cor, list1, list2, MoreArgs = list(use="complete.obs"))) checkIdentical(sd(list1, na.rm=TRUE), sapply(list1, sd, na.rm=TRUE)) checkIdentical(median(list1, na.rm=TRUE), sapply(list1, median, na.rm=TRUE)) checkIdentical(quantile(list1, na.rm=TRUE), do.call(rbind, lapply(list1, quantile, na.rm=TRUE))) checkIdentical(mad(list1, na.rm=TRUE), sapply(list1, mad, na.rm=TRUE)) checkIdentical(IQR(list1, na.rm=TRUE), sapply(list1, IQR, na.rm=TRUE)) vec3 <- (-20:20)^2 vec3[c(1,10,21,41)] <- c(100L, 30L, 400L, 470L) list3 <- do.call(type, list(one = vec3, rev(vec3), compress = compress)) checkIdentical(as.list(smoothEnds(list3)), lapply(list3, smoothEnds)) checkIdentical(as.list(runmed(list3, 7)), lapply(list3, function(x) { y <- runmed(x, 7) if (type != "RleList") y <- as.vector(y) y })) } } } test_AtomicList_character <- function() { txt <- c("The", "licenses", "for", "most", "software", "are", "designed", "to", "take", "away", "your", "freedom", "to", "share", "and", "change", "it.", "", "By", "contrast,", "the", "GNU", "General", "Public", "License", "is", "intended", "to", "guarantee", "your", "freedom", "to", "share", "and", "change", "free", "software", "--", "to", "make", "sure", "the", "software", "is", "free", "for", "all", "its", "users") for (compress in c(TRUE, FALSE)) { for (type in c("CharacterList", "RleList")) { list1 <- do.call(type, list(one = txt, rev(txt), compress = compress)) checkIdentical(as.list(nchar(list1)), lapply(list1, nchar)) checkIdentical(as.list(chartr("@!*", "alo", list1)), lapply(list1, chartr, old="@!*", new="alo")) checkIdentical(as.list(tolower(list1)), lapply(list1, tolower)) checkIdentical(as.list(toupper(list1)), lapply(list1, toupper)) checkIdentical(as.list(sub("[b-e]",".", list1)), lapply(list1, sub, pattern="[b-e]", replacement=".")) checkIdentical(as.list(gsub("[b-e]",".", list1)), lapply(list1, gsub, pattern="[b-e]", replacement=".")) } } } test_RleList_methods <- function() { ## na.rm x <- RleList(c(NA,1,1), c(1L,NA_integer_,1L), c(1,Inf,1,-Inf),compress=TRUE) target <- RleList(c(1,2), c(1L,1L), c(Inf,Inf,-Inf)) current <- runsum(x,2, na.rm = TRUE) checkIdentical(target, current) target <- RleList(c(NA,2), c(NA_integer_,NA_integer_), c(Inf,Inf,-Inf)) current <- runsum(x,2, na.rm = FALSE) checkIdentical(target, current) target <- RleList(c(2,4), c(2,2), c(Inf, Inf, -Inf)) current <- runwtsum(x,2, c(2,2), na.rm = TRUE) checkIdentical(target, current) target <- RleList(c(NA,4), c(NA_real_,NA_real_), c(Inf,Inf,-Inf)) current <- runwtsum(x,2, c(2,2), na.rm = FALSE) checkIdentical(target, current) target <- RleList(c(1,1), c(1,1), c(Inf,Inf,-Inf)) current <- runmean(x, 2, na.rm = TRUE) checkIdentical(target, current) target <- RleList(c(NA,1), c(NA_real_, NA_real_), c(Inf, Inf, -Inf)) current <- runmean(x, 2, na.rm = FALSE) checkIdentical(target, current) x <- RleList(c(NA,1,2), c(2L,NA_integer_,1L), c(1,Inf,1,-Inf),compress=TRUE) target <- RleList(c(1,2), c(2L,1L), c(Inf,Inf,1)) current <- runq(x, 2, 2, na.rm = TRUE) checkIdentical(target, current) target <- RleList(c(NA,2), c(NA_integer_, NA_integer_), c(Inf, Inf, 1)) current <- runq(x, 2, 2, na.rm = FALSE) checkIdentical(target, current) ## Binary operations between an RleList and an atomic vector: a1 <- Rle(1, 999722111) a2 <- 20 * a1 a <- RleList(a1, a2, compress=TRUE) b1 <- c(a1, a1) b2 <- 20 * b1 b <- RleList(b1, b2, compress=FALSE) for (y in list(8L, 8)) { ## With a CompressedRleList target <- RleList(a1 + y, a2 + y, compress=TRUE) current <- a + y checkIdentical(target, current) target <- RleList(a1 * y, a2 * y, compress=TRUE) current <- a * y checkIdentical(target, current) target <- RleList(a1 / y, a2 / y, compress=TRUE) current <- a / y checkIdentical(target, current) ## With a SimpleRleList target <- RleList(b1 + y, b2 + y, compress=FALSE) current <- b + y checkIdentical(target, current) target <- RleList(b1 * y, b2 * y, compress=FALSE) current <- b * y checkIdentical(target, current) target <- RleList(b1 / y, b2 / y, compress=FALSE) current <- b / y checkIdentical(target, current) } } test_AtomicList_repElements <- function() { test_addition <- function(x, y) { current <- x + y target <- IntegerList(Map(function(x, y) x + y, x, y)) checkIdentical(current, target) } test_addition(IntegerList(NULL), IntegerList(NULL)) test_addition(IntegerList(11:13), IntegerList(NULL)) test_addition(IntegerList(11:13, NULL), IntegerList(NULL, NULL)) test_addition(IntegerList(11:13, NULL), IntegerList(NULL, 10:12)) test_addition(IntegerList(11:13, NULL), IntegerList(10:12, NULL)) test_addition(IntegerList(11:13), IntegerList(NULL, 10:12)) test_addition(IntegerList(11:12), IntegerList(10:13)) test_addition(IntegerList(11:12), IntegerList(10:12)) test_addition(IntegerList(11:13, 11:12), IntegerList(10:12)) } IRanges/inst/unitTests/test_DataFrame-utils.R0000644000175100017510000000070314626176651022263 0ustar00biocbuildbiocbuild ## splitting test_DataFrame_splitting <- function() { data(swiss) rn <- rownames(swiss) sw <- DataFrame(swiss, row.names=rn) swisssplit <- split(swiss, swiss$Education) ## split swsplit <- split(sw, sw[["Education"]]) checkTrue(validObject(swsplit)) checkIdentical(as.list(lapply(swsplit, as.data.frame)), swisssplit) checkTrue(validObject(split(DataFrame(IRanges(1:26, 1:26), LETTERS), letters))) } IRanges/inst/unitTests/test_DataFrameList.R0000644000175100017510000002266514626176651021774 0ustar00biocbuildbiocbuildtest_DataFrameList_construction <- function() { checkDFL2dfl <- function(DFL, dfl) { checkIdentical(lapply(as.list(DFL), as.data.frame), dfl) } data(airquality) data(swiss) checkDFL2dfl(DataFrameList(swiss, airquality), list(swiss, airquality)) } test_SplitDataFrameList_construction <- function() { checkDFL2dfl <- function(DFL, dfl) { checkIdentical(lapply(as.list(DFL), as.data.frame), dfl) } striprownames <- function(x) { lapply(x, function(y) { rownames(y) <- NULL y }) } data(airquality) data(swiss) aq <- DataFrame(airquality) sw <- DataFrame(swiss, row.names=rownames(swiss)) aqsplit1 <- split(aq, aq[["Month"]]) aqsplit2 <- SplitDataFrameList(lapply(split(airquality, airquality[["Month"]]), as, "DataFrame")) checkIdentical(aqsplit1, aqsplit2) swsplit1 <- split(sw, sw[["Education"]]) swsplit2 <- SplitDataFrameList(lapply(split(swiss, swiss[["Education"]]), as, "DataFrame")) checkIdentical(swsplit1, swsplit2) for (compress in c(TRUE, FALSE)) { airqualitysplit <- striprownames(split(airquality, airquality[["Month"]])) aqsplit <- SplitDataFrameList(as.list(split(aq, aq[["Month"]])), compress = compress) checkDFL2dfl(aqsplit, airqualitysplit) swisssplit <- split(swiss, swiss[["Education"]]) swsplit <- SplitDataFrameList(as.list(split(sw, sw[["Education"]])), compress = compress) checkDFL2dfl(swsplit, swisssplit) } } test_DataFrameList_subset <- function() { checkDFL2dfl <- function(DFL, dfl) { checkIdentical(lapply(as.list(DFL), as.data.frame), dfl) } data(airquality) data(swiss) DFL1 <- DataFrameList(swiss, airquality) dfl1 <- list(swiss, airquality) checkDFL2dfl(DFL1[], dfl1[]) checkDFL2dfl(DFL1[1], dfl1[1]) checkDFL2dfl(DFL1[2:1], dfl1[2:1]) checkIdentical(as.data.frame(DFL1[[2]]), airquality) checkException(DFL1[[3]], silent = TRUE) DFL2 <- DataFrameList(s = swiss, a = airquality) dfl2 <- list(s = swiss, a = airquality) checkDFL2dfl(DFL2[], dfl2[]) checkDFL2dfl(DFL2[1], dfl2[1]) checkDFL2dfl(DFL2["a"], dfl2["a"]) checkDFL2dfl(DFL2[c("a", "s")], dfl2[c("a", "s")]) checkIdentical(as.data.frame(DFL2[["a"]]), airquality) checkIdentical(DFL2[["z"]], NULL) } test_SplitDataFrameList_subset <- function() { checkDFL2dfl <- function(DFL, dfl) { checkIdentical(lapply(as.list(DFL), as.data.frame), dfl) } data(swiss) sw <- DataFrame(swiss, row.names = rownames(swiss)) for (compress in c(TRUE, FALSE)) { swsplit <- SplitDataFrameList(as.list(split(sw, sw[["Education"]])), compress = compress) swisssplit <- split(swiss, swiss[["Education"]]) checkDFL2dfl(swsplit[], swisssplit[]) checkDFL2dfl(swsplit[1], swisssplit[1]) checkDFL2dfl(swsplit[2:1], swisssplit[2:1]) checkIdentical(as.data.frame(swsplit[[2]]), swisssplit[[2]]) checkIdentical(swsplit[["A"]], NULL) checkException(swsplit[[30]], silent = TRUE) checkIdentical(as.list(swsplit[,1]), split(swiss[[1]], swiss[["Education"]])) checkIdentical(as.list(swsplit[,"Examination"]), split(swiss[["Examination"]], swiss[["Education"]])) } } test_SplitDataFrameList_as.data.frame <- function() { checkDFL2dfl <- function(DFL, dfl, compress) { target <- data.frame(group = togroup(PartitioningByWidth(dfl)), group_name = names(dfl)[togroup(PartitioningByWidth(dfl))], do.call(rbind, dfl), stringsAsFactors=FALSE, row.names=NULL) rownames(target) <- unlist(lapply(dfl, row.names), use.names = FALSE) checkIdentical(target, as.data.frame(DFL)) } data(swiss) sw <- DataFrame(swiss, row.names = rownames(swiss)) for (compress in c(TRUE, FALSE)) { swsplit <- SplitDataFrameList(as.list(split(sw, sw[["Education"]])), compress = compress) swisssplit <- split(swiss, swiss[["Education"]]) checkDFL2dfl(swsplit, swisssplit, compress) } } test_SplitDataFrameList_columnUtils <- function() { set.seed(100001) original <- splitAsList(DataFrame(X=runif(100), Y=rpois(100, 5)), sample(letters, 100, replace=TRUE)) out <- original checkIdentical(commonColnames(out), c("X", "Y")) commonColnames(out) <- c("a", "b") checkIdentical(commonColnames(out), c("a", "b")) checkIdentical(colnames(out[[1]]), c("a", "b")) checkIdentical(colnames(out[[length(out)]]), c("a", "b")) checkIdentical(commonColnames(out[0]), c("a", "b")) # Same behavior for SimpleSDFLs. alt <- as(original, "SimpleSplitDataFrameList") checkIdentical(commonColnames(alt), c("X", "Y")) commonColnames(alt) <- c("a", "b") checkIdentical(commonColnames(alt), c("a", "b")) checkIdentical(colnames(alt[[1]]), c("a", "b")) checkIdentical(colnames(alt[[length(alt)]]), c("a", "b")) checkIdentical(commonColnames(alt[0]), NULL) } test_DataFrameList_replace <- function() { checkDFL2dfl <- function(DFL, dfl) { checkIdentical(lapply(as.list(DFL), as.data.frame), dfl) } data(airquality) data(swiss) DFL1 <- DataFrameList(swiss, airquality) dfl1 <- list(swiss, airquality) DFL1[] <- DFL1[1] dfl1[] <- dfl1[1] checkDFL2dfl(DFL1, dfl1) DFL1 <- DataFrameList(swiss, airquality) dfl1 <- list(swiss, airquality) DFL1[2] <- DFL1[1] dfl1[2] <- dfl1[1] checkDFL2dfl(DFL1, dfl1) DFL1 <- DataFrameList(swiss, airquality) dfl1 <- list(swiss, airquality) DFL1[[1]][[1]] <- DFL1[[1]][[1]] + 1L dfl1[[1]][[1]] <- dfl1[[1]][[1]] + 1L checkDFL2dfl(DFL1, dfl1) } test_SplitDataFrameList_replace <- function() { checkDFL2dfl <- function(DFL, dfl) { checkIdentical(lapply(as.list(DFL), as.data.frame), dfl) } striprownames <- function(x) { lapply(x, function(y) { rownames(y) <- NULL y }) } data(airquality) data(swiss) swiss2 <- swiss rownames(swiss2) <- NULL sw2 <- DataFrame(swiss2) for (compress in c(TRUE, FALSE)) { swiss2split <- striprownames(split(swiss2, swiss2[["Education"]])) sw2split <- SplitDataFrameList(as.list(split(sw2, sw2[["Education"]])), compress = compress) swiss2split[] <- swiss2split[1] sw2split[] <- sw2split[1] checkDFL2dfl(sw2split, swiss2split) swiss2split <- striprownames(split(swiss2, swiss2[["Education"]])) sw2split <- SplitDataFrameList(as.list(split(sw2, sw2[["Education"]])), compress = compress) swiss2split[c(2, 4, 5)] <- swiss2split[1] sw2split[c(2, 4, 5)] <- sw2split[1] checkDFL2dfl(sw2split, swiss2split) swiss2split <- striprownames(split(swiss2, swiss2[["Education"]])) swiss2split <- lapply(swiss2split, function(x) {x[["Examination"]] <- x[["Examination"]] + 1L; x}) sw2split <- SplitDataFrameList(as.list(split(sw2, sw2[["Education"]])), compress = compress) sw2split[,"Examination"] <- sw2split[,"Examination"] + 1L checkDFL2dfl(sw2split, swiss2split) swiss2split <- striprownames(split(swiss2, swiss2[["Education"]])) swiss2split <- lapply(swiss2split, function(x) { x[["Examination"]][x[["Examination"]] > 22] <- x[["Examination"]][x[["Examination"]] > 22] + 1L x }) sw2split <- SplitDataFrameList(as.list(split(sw2, sw2[["Education"]])), compress = compress) sw2split[sw2split[, "Examination"] > 22, "Examination"] <- sw2split[sw2split[, "Examination"] > 22,"Examination"] + 1L checkDFL2dfl(sw2split, swiss2split) } } test_DataFrameList_transform <- function() { DF <- DataFrame(state.division, state.region, state.area) DFL <- split(DF, DF$state.division) # NICER: split(DF, ~ state.devision) DFL <- transform(DFL, total.area=sum(state.area[state.region!="South"]), fraction=ifelse2(total.area == 0, 0, state.area/total.area)) ANS <- DataFrame(lapply(unlist(DFL, use.names=FALSE), unname)) df <- as.data.frame(DF) df$total.area <- with(subset(df, state.region != "South"), sapply(split(state.area, state.division), sum))[df$state.division] df$fraction <- with(df, ifelse(total.area == 0, 0, state.area/total.area)) df <- df[order(df$state.division),] rownames(df) <- NULL checkIdentical(ANS, DataFrame(df)) } test_SplitDataFrameList_rownames <- function() { csdfl <- SplitDataFrameList(DataFrame(one = c(1,2,3,4), row.names = seq_len(4)), DataFrame(one = c(11,12,13,14), row.names = c("a","b","c","d"))) csdfl[[1]] <- DataFrame(one = c(4,3,2,1), row.names = rev(seq_len(4))) csdfl2 <- SplitDataFrameList(DataFrame(one = c(1,2,3,4), row.names = rev(seq_len(4))), DataFrame(one = c(11,12,13,14), row.names = c("a","b","c","d"))) checkIdentical(rownames(csdfl), rownames(csdfl2)) } IRanges/inst/unitTests/test_Grouping-class.R0000644000175100017510000001341214626176651022177 0ustar00biocbuildbiocbuild### test_PartitioningByEnd <- function() { ## on a numeric vector, NG not supplied current0 <- PartitioningByEnd() checkTrue(validObject(current0)) target <- new("PartitioningByEnd") checkIdentical(target, current0) breakpoints <- c(0, 5, 5, 8) current1 <- PartitioningByEnd(breakpoints) checkTrue(validObject(current1)) checkIdentical(4L, length(current1)) checkIdentical(as.integer(breakpoints), end(current1)) checkIdentical(end(current1), cumsum(width(current1))) checkIdentical(NULL, names(current1)) checkException(PartitioningByEnd(breakpoints, names=letters), silent=TRUE) current2 <- PartitioningByEnd(breakpoints, names=letters[1:4]) checkTrue(validObject(current2)) checkIdentical(letters[1:4], names(current2)) names(breakpoints) <- names(current2) current3 <- PartitioningByEnd(breakpoints) checkIdentical(current2, current3) current4 <- PartitioningByEnd(breakpoints, names=LETTERS[4:1]) checkIdentical(LETTERS[4:1], names(current4)) breakpoints <- rep.int(0, 1000) current5 <- PartitioningByEnd(breakpoints) checkTrue(validObject(current5)) checkIdentical(as.integer(breakpoints), end(current5)) checkIdentical(end(current5), cumsum(width(current5))) ## on a PartitioningByEnd object checkIdentical(current1, PartitioningByEnd(current1)) # no-op checkIdentical(current2, PartitioningByEnd(current2)) # no-op checkException(PartitioningByEnd(current2, names=LETTERS), silent=TRUE) current6 <- PartitioningByEnd(current2, names=names(current4)) checkTrue(validObject(current6)) checkIdentical(names(current4), names(current6)) ## on CompressedList, SimpleList, IRanges, and list objects do_checks <- function(x) { checkIdentical(current1, PartitioningByEnd(x)) checkException(PartitioningByEnd(x, names=letters), silent=TRUE) checkIdentical(current2, PartitioningByEnd(x, names=names(current2))) names(x) <- names(current2) checkIdentical(current2, PartitioningByEnd(x)) checkIdentical(current4, PartitioningByEnd(x, names=names(current4))) } x <- RleList(Rle(), Rle(-3, 5), Rle(), Rle(1:0, c(2,1)), compress=TRUE) do_checks(x) do_checks(as(x, "SimpleList")) do_checks(as.list(x)) x <- IRanges(seq(148, by=-50, length.out=4), width=width(current1)) do_checks(x) ## TODO: Uncomment this when as.list() works again on IRanges objects #do_checks(as.list(x)) do_checks(list(NULL, integer(5), complex(0), raw(3))) } test_PartitioningByWidth <- function() { ## on a numeric vector, NG not supplied current0 <- PartitioningByWidth() checkTrue(validObject(current0)) target <- new("PartitioningByWidth") checkIdentical(target, current0) widths <- c(0, 5, 0, 3) current1 <- PartitioningByWidth(widths) checkTrue(validObject(current1)) checkIdentical(4L, length(current1)) checkIdentical(as.integer(widths), width(current1)) checkIdentical(end(current1), cumsum(width(current1))) checkIdentical(NULL, names(current1)) checkException(PartitioningByWidth(widths, names=letters), silent=TRUE) current2 <- PartitioningByWidth(widths, names=letters[1:4]) checkTrue(validObject(current2)) checkIdentical(letters[1:4], names(current2)) names(widths) <- names(current2) current3 <- PartitioningByWidth(widths) checkIdentical(current2, current3) current4 <- PartitioningByWidth(widths, names=LETTERS[4:1]) checkIdentical(LETTERS[4:1], names(current4)) widths <- rep.int(0, 1000) current5 <- PartitioningByWidth(widths) checkTrue(validObject(current5)) checkIdentical(as.integer(widths), width(current5)) checkIdentical(end(current5), cumsum(width(current5))) ## on a PartitioningByWidth object checkIdentical(current1, PartitioningByWidth(current1)) # no-op checkIdentical(current2, PartitioningByWidth(current2)) # no-op checkException(PartitioningByWidth(current2, names=LETTERS), silent=TRUE) current6 <- PartitioningByWidth(current2, names=names(current4)) checkTrue(validObject(current6)) checkIdentical(names(current4), names(current6)) ## on CompressedList, SimpleList, IRanges, and list objects do_checks <- function(x) { checkIdentical(current1, PartitioningByWidth(x)) checkException(PartitioningByWidth(x, names=letters), silent=TRUE) checkIdentical(current2, PartitioningByWidth(x, names=names(current2))) names(x) <- names(current2) checkIdentical(current2, PartitioningByWidth(x)) checkIdentical(current4, PartitioningByWidth(x, names=names(current4))) } x <- RleList(Rle(), Rle(-3, 5), Rle(), Rle(1:0, c(2,1)), compress=TRUE) do_checks(x) do_checks(as(x, "SimpleList")) do_checks(as.list(x)) x <- IRanges(seq(148, by=-50, length.out=4), width=width(current1)) do_checks(x) ## TODO: Uncomment this when as.list() works again on IRanges objects #do_checks(as.list(x)) do_checks(list(NULL, integer(5), complex(0), raw(3))) } test_PartitioningByEndOrWidth_NG_supplied <- function() { for (class in c("PartitioningByEnd", "PartitioningByWidth")) { CONSTRUCTOR <- get(class) x <- c(3, 3, 4, 6) NG <- 8 current1 <- CONSTRUCTOR(x, NG) checkTrue(is(current1, class)) checkTrue(validObject(current1)) checkIdentical(8L, length(current1)) checkIdentical(tabulate(x, nbins=NG), width(current1)) checkException(CONSTRUCTOR(x, NG, names=letters[1:4]), silent=TRUE) current2 <- CONSTRUCTOR(x, NG, names=letters[1:8]) checkTrue(validObject(current2)) checkIdentical(letters[1:8], names(current2)) names(x) <- letters[1:4] current3 <- CONSTRUCTOR(x, NG) checkIdentical(current1, current3) } } IRanges/inst/unitTests/test_HitsList.R0000644000175100017510000000037014626176651021044 0ustar00biocbuildbiocbuildtest_HitsList_as_matrix <- function() { x <- IRangesList(chr1=IRanges(1, 5), chr2=IRanges(6, 10)) y <- IRangesList(chr2=IRanges(8, 10)) checkIdentical(as.matrix(findOverlaps(x, y)), cbind(queryHits = 2L, subjectHits = 1L)) } IRanges/inst/unitTests/test_IPos-class.R0000644000175100017510000003000614626176651021255 0ustar00biocbuildbiocbuildtest_IPos_constructor_and_getters <- function() { ## Empty object checkException(new("IPos")) ipos0a <- new("UnstitchedIPos") checkTrue(validObject(ipos0a)) checkIdentical(0L, length(ipos0a)) checkIdentical(integer(0), pos(ipos0a)) checkIdentical(integer(0), start(ipos0a)) checkIdentical(integer(0), end(ipos0a)) checkIdentical(integer(0), width(ipos0a)) checkTrue(is.null(names(ipos0a))) ipos0b <- new("StitchedIPos") checkTrue(validObject(ipos0b)) checkIdentical(0L, length(ipos0b)) checkIdentical(integer(0), pos(ipos0b)) checkIdentical(integer(0), start(ipos0b)) checkIdentical(integer(0), end(ipos0b)) checkIdentical(integer(0), width(ipos0b)) checkTrue(is.null(names(ipos0b))) checkIdentical(ipos0a, IPos()) checkIdentical(ipos0b, IPos(stitch=TRUE)) checkIdentical(ipos0a, IPos(stitch=FALSE)) ## Positions supplied in an unnamed integer vector pos <- c(44:53, 10:5, -3:6) # unnamed score <- runif(26) ipos1a <- IPos(pos, names=LETTERS, score=score) checkTrue(is(ipos1a, "UnstitchedIPos")) checkTrue(validObject(ipos1a)) checkIdentical(length(pos), length(ipos1a)) checkIdentical(pos, pos(ipos1a)) checkIdentical(pos, start(ipos1a)) checkIdentical(pos, end(ipos1a)) checkIdentical(rep.int(1L, length(pos)), width(ipos1a)) checkIdentical(LETTERS, names(ipos1a)) checkIdentical(DataFrame(score=score), mcols(ipos1a, use.names=FALSE)) checkIdentical(LETTERS, rownames(mcols(ipos1a))) ipos1b <- IPos(pos, names=LETTERS, score=score, stitch=TRUE) checkTrue(is(ipos1b, "StitchedIPos")) checkTrue(validObject(ipos1b)) checkIdentical(length(pos), length(ipos1b)) checkIdentical(pos, pos(ipos1b)) checkIdentical(pos, start(ipos1b)) checkIdentical(pos, end(ipos1b)) checkIdentical(rep.int(1L, length(pos)), width(ipos1b)) checkIdentical(LETTERS, names(ipos1b)) checkIdentical(DataFrame(score=score), mcols(ipos1b, use.names=FALSE)) checkIdentical(LETTERS, rownames(mcols(ipos1b))) ## Positions supplied in a named integer vector ipos2a <- IPos(setNames(pos, LETTERS), score=score) checkIdentical(ipos1a, ipos2a) ipos2b <- IPos(setNames(pos, LETTERS), score=score, stitch=TRUE) checkIdentical(ipos1b, ipos2b) ## Invalid positions checkException(IPos(c(35, NA, 5))) ## Positions specified as integer ranges ipos3 <- IPos(IRanges(c(25, 2), c(100, 50))) checkTrue(is(ipos3, "StitchedIPos")) checkTrue(validObject(ipos3)) checkIdentical(125L, length(ipos3)) checkIdentical(c(25:100, 2:50), pos(ipos3)) checkIdentical(ipos3, IPos(c("25-100", "2-50"))) } test_IPos_names_setter <- function() { ipos0a <- IPos(stitch=FALSE) ipos0 <- `names<-`(ipos0a, names(ipos0a)) # no-op checkIdentical(ipos0a, ipos0) names(ipos0) <- character(0) checkTrue(validObject(ipos0)) checkIdentical(character(0), names(ipos0)) checkIdentical(ipos0a, unname(ipos0)) ipos0b <- IPos(stitch=TRUE) ipos0 <- `names<-`(ipos0b, names(ipos0b)) # no-op checkIdentical(ipos0b, ipos0) names(ipos0) <- character(0) checkTrue(validObject(ipos0)) checkIdentical(character(0), names(ipos0)) checkIdentical(ipos0b, unname(ipos0)) pos <- c(44:53, 10:5, -3:6) # unnamed ipos1a <- IPos(pos) checkTrue(is.null(names(ipos1a))) checkIdentical(ipos1a, `names<-`(ipos1a, names(ipos1a))) # no-op checkException(names(ipos1a) <- c(letters, LETTERS)) names(ipos1a) <- LETTERS[26:22] checkIdentical(LETTERS[26:22], head(names(ipos1a), n=5)) checkIdentical(rep.int(NA_character_, 21), tail(names(ipos1a), n=21)) checkIdentical(ipos1a, `names<-`(ipos1a, names(ipos1a))) # no-op checkIdentical(IPos(pos), unname(ipos1a)) ipos1b <- IPos(pos, stitch=TRUE) checkTrue(is.null(names(ipos1b))) checkIdentical(ipos1b, `names<-`(ipos1b, names(ipos1b))) # no-op checkException(names(ipos1b) <- c(letters, LETTERS)) names(ipos1b) <- LETTERS[26:22] checkIdentical(LETTERS[26:22], head(names(ipos1b), n=5)) checkIdentical(rep.int(NA_character_, 21), tail(names(ipos1b), n=21)) checkIdentical(ipos1b, `names<-`(ipos1b, names(ipos1b))) # no-op checkIdentical(IPos(pos, stitch=TRUE), unname(ipos1b)) } test_IPos_mcols_setter <- function() { ipos0a <- IPos(names=character(0), stitch=FALSE) ipos0 <- `mcols<-`(ipos0a, value=mcols(ipos0a)) # no-op checkIdentical(ipos0a, ipos0) mcols(ipos0)$score <- numeric(0) checkTrue(validObject(ipos0)) checkTrue(is(mcols(ipos0), "DataFrame")) checkIdentical(c(0L, 1L), dim(mcols(ipos0))) checkIdentical(list(character(0), "score"), dimnames(mcols(ipos0))) checkIdentical(list(NULL, "score"), dimnames(mcols(ipos0, use.names=FALSE))) checkIdentical(ipos0a, `mcols<-`(ipos0, value=NULL)) ipos0b <- IPos(names=character(0), stitch=TRUE) ipos0 <- `mcols<-`(ipos0b, value=mcols(ipos0b)) # no-op checkIdentical(ipos0b, ipos0) mcols(ipos0)$score <- numeric(0) checkTrue(validObject(ipos0)) checkTrue(is(mcols(ipos0), "DataFrame")) checkIdentical(c(0L, 1L), dim(mcols(ipos0))) checkIdentical(list(character(0), "score"), dimnames(mcols(ipos0))) checkIdentical(list(NULL, "score"), dimnames(mcols(ipos0, use.names=FALSE))) checkIdentical(ipos0b, `mcols<-`(ipos0, value=NULL)) pos <- c(44:53, 10:5, -3:6) # unnamed ipos1a <- IPos(pos, names=LETTERS, stitch=FALSE) checkIdentical(ipos1a, `mcols<-`(ipos1a, value=mcols(ipos1a))) # no-op mcols(ipos1a)$stuff <- 1:2 mcols(ipos1a)$gene_id <- sprintf("ID%02d", 1:26) checkTrue(validObject(ipos1a)) checkTrue(is(mcols(ipos1a), "DataFrame")) checkIdentical(c(26L, 2L), dim(mcols(ipos1a))) checkIdentical(c("stuff", "gene_id"), colnames(mcols(ipos1a))) checkIdentical(LETTERS, rownames(mcols(ipos1a))) checkIdentical(NULL, rownames(mcols(ipos1a, use.names=FALSE))) checkIdentical(rep.int(1:2, 13), mcols(ipos1a)$stuff) ipos1b <- IPos(pos, names=LETTERS, stitch=TRUE) checkIdentical(ipos1b, `mcols<-`(ipos1b, value=mcols(ipos1b))) # no-op mcols(ipos1b)$stuff <- 1:2 mcols(ipos1b)$gene_id <- sprintf("ID%02d", 1:26) checkTrue(validObject(ipos1b)) checkTrue(is(mcols(ipos1b), "DataFrame")) checkIdentical(c(26L, 2L), dim(mcols(ipos1b))) checkIdentical(c("stuff", "gene_id"), colnames(mcols(ipos1b))) checkIdentical(LETTERS, rownames(mcols(ipos1b))) checkIdentical(NULL, rownames(mcols(ipos1b, use.names=FALSE))) checkIdentical(rep.int(1:2, 13), mcols(ipos1b)$stuff) } test_IPos_coercion <- function() { pos <- c(44:53, 10:5, -3:6) ipos1a <- IPos(pos, LETTERS, stuff=1:2, stitch=FALSE) ipos1b <- IPos(pos, LETTERS, stuff=1:2, stitch=TRUE) ## Back and forth between UnstitchedIPos and StitchedIPos checkIdentical(ipos1b, as(ipos1a, "StitchedIPos")) checkIdentical(ipos1a, as(ipos1b, "UnstitchedIPos")) ## From IPos to IRanges ir1a <- as(ipos1a, "IRanges") ir1b <- as(ipos1b, "IRanges") checkIdentical(ir1a, ir1b) checkIdentical(pos, start(ir1a)) checkIdentical(pos, end(ir1a)) checkIdentical(names(ipos1a), names(ir1a)) checkIdentical(mcols(ipos1a), mcols(ir1a)) ## From IRanges to IPos checkIdentical(ipos1a, as(ir1a, "UnstitchedIPos")) checkIdentical(ipos1b, as(ir1a, "StitchedIPos")) checkIdentical(ipos1a, as(ir1a, "IPos")) checkException(as(IRanges(1:5, 5), "UnstitchedIPos")) checkException(as(IRanges(1:5, 5), "StitchedIPos")) checkException(as(IRanges(1:5, 5), "IPos")) } test_IPos_subsetting <- function() { pos <- c(44:53, 10:5, -3:6) for (stitch in c(FALSE, TRUE)) { ## unnamed object ipos1 <- IPos(pos, stitch=stitch) ipos <- ipos1[12:5] checkIdentical(class(ipos1), class(ipos)) checkTrue(validObject(ipos)) checkIdentical(8L, length(ipos)) checkIdentical(pos[12:5], pos(ipos)) ipos <- ipos1[c(FALSE, TRUE)] checkIdentical(class(ipos1), class(ipos)) checkTrue(validObject(ipos)) checkIdentical(13L, length(ipos)) checkIdentical(pos[c(FALSE, TRUE)], pos(ipos)) ipos <- ipos1[-5] checkIdentical(class(ipos1), class(ipos)) checkTrue(validObject(ipos)) checkIdentical(25L, length(ipos)) checkIdentical(pos[-5], pos(ipos)) ipos <- tail(ipos1) checkIdentical(class(ipos1), class(ipos)) checkTrue(validObject(ipos)) checkIdentical(6L, length(ipos)) checkIdentical(tail(pos), pos(ipos)) ## named object names(ipos1) <- LETTERS ipos <- ipos1[12:5] checkIdentical(class(ipos1), class(ipos)) checkTrue(validObject(ipos)) checkIdentical(LETTERS[12:5], names(ipos)) ## with metadata columns mcols(ipos1)$stuff <- 1:2 mcols(ipos1)$ok <- c(TRUE, FALSE) ipos <- ipos1[12:5] checkIdentical(class(ipos1), class(ipos)) checkTrue(validObject(ipos)) checkIdentical(mcols(ipos1)[12:5, ], mcols(ipos)) } } test_IPos_concatenation <- function() { pos <- c(44:53, 10:5, -3:6) ## No medata columns ipos1 <- IPos(pos, names=LETTERS, stitch=FALSE) # unstitched, named ipos2 <- IPos(c("-9-5", "41-55")) # stitched, unnamed ipos12 <- c(ipos1, ipos2) checkTrue(is(ipos12, "UnstitchedIPos")) checkTrue(validObject(ipos12)) checkIdentical(length(ipos1) + length(ipos2), length(ipos12)) checkIdentical(c(pos(ipos1), pos(ipos2)), pos(ipos12)) checkIdentical(c(names(ipos1), character(length(ipos2))), names(ipos12)) ipos21 <- c(ipos2, ipos1) checkTrue(is(ipos21, "StitchedIPos")) checkTrue(validObject(ipos21)) checkIdentical(length(ipos2) + length(ipos1), length(ipos21)) checkIdentical(c(pos(ipos2), pos(ipos1)), pos(ipos21)) checkIdentical(c(character(length(ipos2)), names(ipos1)), names(ipos21)) ## With medata columns on one object mcols(ipos1)$stuff <- 1:2 mcols(ipos1)$ok <- c(TRUE, FALSE) checkIdentical(ipos12, c(ipos1, ipos2, ignore.mcols=TRUE)) ipos12 <- c(ipos1, ipos2) mcols12 <- mcols(ipos12) checkTrue(is(mcols12, "DataFrame")) checkIdentical(c(length(ipos12), 2L), dim(mcols12)) checkIdentical(c("stuff", "ok"), colnames(mcols12)) checkTrue(is.integer(mcols12$stuff)) checkIdentical(mcols(ipos1)$stuff, head(mcols12$stuff, n=length(ipos1))) checkIdentical(rep.int(NA_integer_, length(ipos2)), tail(mcols12$stuff, n=length(ipos2))) checkTrue(is.logical(mcols12$ok)) checkIdentical(mcols(ipos1)$ok, head(mcols12$ok, n=length(ipos1))) checkIdentical(rep.int(NA, length(ipos2)), tail(mcols12$ok, n=length(ipos2))) checkIdentical(ipos21, c(ipos2, ipos1, ignore.mcols=TRUE)) ipos21 <- c(ipos2, ipos1) mcols21 <- mcols(ipos21) checkTrue(is(mcols21, "DataFrame")) checkIdentical(c(length(ipos21), 2L), dim(mcols21)) checkIdentical(c("stuff", "ok"), colnames(mcols21)) checkTrue(is.integer(mcols21$stuff)) checkIdentical(rep.int(NA_integer_, length(ipos2)), head(mcols21$stuff, n=length(ipos2))) checkIdentical(mcols(ipos1)$stuff, tail(mcols21$stuff, n=length(ipos1))) checkTrue(is.logical(mcols21$ok)) checkIdentical(rep.int(NA, length(ipos2)), head(mcols21$ok, n=length(ipos2))) checkIdentical(mcols(ipos1)$ok, tail(mcols21$ok, n=length(ipos1))) ## With medata columns on the two objects mcols(ipos2)$ok <- "yes" mcols(ipos2)$more_stuff <- Rle(1:5, 6) ipos12 <- c(ipos1, ipos2) mcols12 <- mcols(ipos12) checkTrue(is(mcols12, "DataFrame")) checkIdentical(c(length(ipos12), 3L), dim(mcols12)) checkIdentical(c("stuff", "ok", "more_stuff"), colnames(mcols12)) checkTrue(is.integer(mcols12$stuff)) checkTrue(is.character(mcols12$ok)) ipos21 <- c(ipos2, ipos1) mcols21 <- mcols(ipos21) checkTrue(is(mcols21, "DataFrame")) checkIdentical(c(length(ipos21), 3L), dim(mcols21)) checkIdentical(c("ok", "more_stuff", "stuff"), colnames(mcols21)) checkTrue(is.character(mcols21$ok)) checkTrue(is.integer(mcols21$stuff)) } IRanges/inst/unitTests/test_IRanges-class.R0000644000175100017510000000407514626176651021742 0ustar00biocbuildbiocbuildtest_IRanges_names <- function() { range1 <- IRanges(start=c(1,2,3), end=c(5,2,8)) checkIdentical(names(range1), NULL) nms <- c("a", NA, "b") names(range1) <- nms checkIdentical(names(range1), nms) checkTrue(validObject(nms)) names(range1) <- NULL checkTrue(validObject(nms)) checkIdentical(names(range1), NULL) names(range1) <- "a" checkTrue(validObject(range1)) checkIdentical(names(range1), c("a", NA, NA)) checkException(names(range1) <- c("a", "b", "c", "d"), silent = TRUE) } test_IntegerRanges_isDisjoint <- function() { ir1 <- IRanges(c(2,5,1), c(3,7,3)) ir2 <- IRanges(c(2,9,5), c(3,9,6)) ir3 <- IRanges(1, 5) checkIdentical(isDisjoint(ir1), FALSE) checkIdentical(isDisjoint(ir2), TRUE) checkIdentical(isDisjoint(ir3), TRUE) ## Handling of zero-width ranges current <- sapply(11:17, function(i) isDisjoint(IRanges(c(12, i), width=c(4, 0)))) target <- rep(c(TRUE, FALSE, TRUE), c(2, 3, 2)) checkIdentical(target, current) } test_IRanges_concatenate <- function() { range <- IRanges(start=c(1,2,3,1), end=c(5,2,8,3)) srange <- split(range, start(range) == 1) checkIdentical(srange, IRangesList(`FALSE` = range[2:3], `TRUE` = range[c(1,4)])) checkIdentical(do.call(c, unname(as.list(srange))), IRanges(c(2,3,1,1), c(2,8,5,3))) ir1 <- IRanges(1, 10) ir2 <- IRanges(c(1, 15), width=5) mcols(ir2) <- DataFrame(score=1:2) checkIdentical(mcols(c(ir1, ir2)), DataFrame(score = c(NA, 1L, 2L))) ## Concatenating multiple IRanges object with varying mcols mcols(ir1) <- DataFrame(gc=0.78) ir12 <- c(ir1, ir2, ignore.mcols=TRUE) checkIdentical(mcols(ir12), NULL) target_mcols <- DataFrame(gc=c(0.78, NA, NA), score=c(NA, 1:2)) mcols(ir12) <- target_mcols checkIdentical(c(ir1, ir2), ir12) } test_IRanges_annotation <- function() { range <- IRanges(c(1, 4), c(5, 7)) mcols(range) <- DataFrame(a = 1:2) checkIdentical(mcols(range)[,1], 1:2) checkIdentical(mcols(range[2:1])[,1], 2:1) checkIdentical(mcols(c(range,range))[,1], rep(1:2,2)) } IRanges/inst/unitTests/test_IRanges-constructor.R0000644000175100017510000000235614626176651023222 0ustar00biocbuildbiocbuildtest_IRanges_constructor <- function() { ir0 <- IRanges() checkTrue(is(ir0, "IRanges")) checkTrue(validObject(ir0)) checkIdentical(0L, length(ir0)) ir1 <- IRanges(start=4:-2, end=13:7) checkTrue(is(ir1, "IRanges")) checkTrue(validObject(ir1)) checkIdentical(7L, length(ir1)) ir2 <- IRanges(start=4:-2, width=10) checkIdentical(ir1, ir2) ir3 <- IRanges(end=13:7, width=10) checkIdentical(ir1, ir3) ir <- IRanges(start=c(a=1, b=2, c=3), end=c(d=10)) checkTrue(is(ir, "IRanges")) checkTrue(validObject(ir)) checkIdentical(3L, length(ir)) ir <- IRanges(matrix(1:24), 30) checkTrue(is(ir, "IRanges")) checkTrue(validObject(ir)) checkIdentical(24L, length(ir)) ir <- IRanges(array(1:24, 4:2), 30) checkTrue(is(ir, "IRanges")) checkTrue(validObject(ir)) checkIdentical(24L, length(ir)) ## Solve NAs in 'start', 'end', or 'width'. ir <- IRanges(start=c(NA, -2, 15, -119), end =c(26, NA, 34, -100), width=c(20, 20, NA, 20)) checkTrue(is(ir, "IRanges")) checkTrue(validObject(ir)) checkIdentical(4L, length(ir)) checkIdentical(c( 7L, -2L, 15L, -119L), start(ir)) checkIdentical(c(26L, 17L, 34L, -100L), end(ir)) checkIdentical(c(20L, 20L, 20L, 20L), width(ir)) } IRanges/inst/unitTests/test_IRangesList-class.R0000644000175100017510000001120714626176651022571 0ustar00biocbuildbiocbuildtest_IRangesList_construction <- function() { range1 <- IRanges(start=c(1,2,3), end=c(5,2,8)) range2 <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5)) for (compress in c(TRUE, FALSE)) { empty <- IRangesList(compress=compress) checkTrue(validObject(empty)) checkIdentical(length(empty), 0L) named <- IRangesList(one = range1, two = range2, compress=compress) checkTrue(validObject(named)) checkIdentical(length(named), 2L) checkIdentical(start(named), IntegerList(one = start(range1), two = start(range2), compress=compress)) checkIdentical(end(named), IntegerList(one = end(range1), two = end(range2), compress=compress)) checkIdentical(width(named), IntegerList(one = width(range1), two = width(range2), compress=compress)) checkIdentical(names(named), c("one", "two")) checkIdentical(range1, named[[1]]) unnamed <- IRangesList(range1, range2, compress=compress) checkTrue(validObject(unnamed)) checkIdentical(length(unnamed), 2L) checkIdentical(range2, unnamed[[2]]) checkIdentical(names(unnamed), NULL) } } test_IRangesList_subset <- function() { for (compress in c(TRUE, FALSE)) { range1 <- IRanges(start=c(1,2,3), end=c(5,2,8)) range2 <- IRanges(start=c(1,15,20,45), end=c(5,15,100,80)) collection <- IRangesList(one = range1, range2, compress=compress) checkIdentical(subsetByOverlaps(collection, IRangesList()), IRangesList(one=IRanges(), IRanges(), compress=compress)) checkIdentical( subsetByOverlaps(collection, IRangesList(IRanges(4, 6), IRanges(50, 70), compress=compress)), IRangesList(one=IRanges(c(1,3),c(5,8)), IRanges(c(20,45),c(100,80)), compress=compress)) checkIdentical( subsetByOverlaps(collection, IRangesList(IRanges(50, 70), one=IRanges(4, 6), compress=compress)), IRangesList(one=IRanges(c(1,3),c(5,8)), IRanges(), compress=compress)) } } test_IRangesList_as_list <- function() { for (compress in c(TRUE, FALSE)) { range1 <- IRanges(start=c(1,2,3), end=c(5,2,8)) range2 <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5)) checkIdentical(list(range1, range2), as.list(IRangesList(range1, range2, compress=compress))) checkIdentical(list(a=range1, b=range2), as.list(IRangesList(a=range1, b=range2, compress=compress))) } } test_IRangesList_as_data_frame <- function() { for (compress in c(TRUE, FALSE)) { range1 <- IRanges(start=c(1,2,3), end=c(5,2,8)) range2 <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5)) rl <- IRangesList(range1, range2, compress=compress) df <- data.frame(group=togroup(PartitioningByWidth(rl)), group_name=NA_character_, as.data.frame(c(range1,range2)), stringsAsFactors=FALSE) checkIdentical(df, as.data.frame(rl)) names(rl) <- c("a", "b") df$group_name <- c("a", "b")[togroup(PartitioningByWidth(rl))] checkIdentical(df, as.data.frame(rl)) } } test_IRangesList_annotation <- function() { range1 <- IRanges(start=c(1,2,3), end=c(5,2,8)) range2 <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5)) for (compress in c(TRUE, FALSE)) { rl <- IRangesList(range1, range2, compress = compress) mcols(rl) <- DataFrame(a = 1:2) checkIdentical(mcols(rl)[,1], 1:2) checkIdentical(mcols(rl[2:1])[,1], 2:1) checkIdentical(mcols(c(rl,rl))[,1], rep(1:2,2)) checkIdentical(mcols(append(rl,rl))[,1], rep(1:2,2)) } } ## test_IRangesList_overlap <- function() { ## rl1 <- IRangesList(a = IRanges(c(1,2),c(4,3)), b = IRanges(c(4,6),c(10,7))) ## rl2 <- IRangesList(b = IRanges(c(0,2),c(4,5)), a = IRanges(c(4,5),c(6,7))) ## overlap(rl1, rl2) ## overlap(rl1, rl2, select = "first") ## overlap(rl1, rl2, select = "first", drop = TRUE) ## names(rl2)[1] <- "c" ## overlap(rl1, rl2) ## overlap(rl1, rl2, select = "first") ## overlap(rl1, rl2, select = "first", drop = TRUE) ## names(rl2) <- NULL ## overlap(rl1, rl2) ## overlap(rl1, rl2, select = "first") ## overlap(rl1, rl2, select = "first", drop = TRUE) ## overlap(rl1, rl2[1]) ## overlap(rl1, rl2[1], select = "first") ## overlap(rl1, rl2[1], select = "first", drop = TRUE) ## overlap(rl1[1], rl2) ## overlap(rl1[1], rl2, select = "first") ## overlap(rl1[1], rl2, select = "first", drop = TRUE) ## } IRanges/inst/unitTests/test_NCList-class.R0000644000175100017510000005120714626176651021545 0ustar00biocbuildbiocbuild### findOverlaps_NCList <- IRanges:::findOverlaps_NCList findOverlaps_NCLists <- IRanges:::findOverlaps_NCLists .transpose_hits <- function(hits) { if (is.list(hits)) return(lapply(hits, .transpose_hits)) t(hits) } ### Used in the unit tests for GNCList located in GenomicRanges. .compare_hits <- function(target, current) { if (is.list(target) || is(target, "List") && is.list(current) || is(current, "List")) return(all(mapply(.compare_hits, target, current))) identical(.transpose_hits(target), .transpose_hits(current)) } ### Used in the unit tests for GNCList located in GenomicRanges. .make_Hits_from_q2s <- function(q2s, s_len) { q_hits <- rep.int(seq_along(q2s), elementNROWS(q2s)) s_hits <- as.integer(unlist(q2s, use.names=FALSE)) Hits(q_hits, s_hits, length(q2s), s_len, sort.by.query=TRUE) } .make_Hits_from_s2q <- function(s2q, q_len) .transpose_hits(.make_Hits_from_q2s(s2q, q_len)) .select_hits <- function(x, select) { if (is.list(x)) return(lapply(x, .select_hits, select)) selectHits(x, select) } ### Vectorized. Return -1 if the query and subject overlap (i.e. if ### end(query) < start(subject) and end(subject) < start(query) are both ### false). Otherwise (i.e. if they are disjoint), return the width of the ### gap between them. Note that a gap width of 0 means that they are adjacent. ### TODO: Rename this pgapWidth(), make it a generic with various methods ### (at least one for IntegerRanges and one for GenomicRanges objects), and ### export it. .gapwidth <- function(query, subject) { ifelse(end(query) < start(subject), start(subject) - end(query), ifelse(end(subject) < start(query), start(query) - end(subject), 0L)) - 1L } ### Vectorized. ### TODO: Rename this poverlapWidth(), make it a generic with various methods ### (at least one for IntegerRanges and one for GenomicRanges objects), and ### export it. .overlapwidth <- function(query, subject) { score <- pmin.int(end(query), end(subject)) - pmax.int(start(query), start(subject)) + 1L pmax.int(score, 0L) } ### Used in the unit tests for GNCList located in GenomicRanges. .get_query_overlaps <- function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "extend", "equal")) { type <- match.arg(type) if (type == "any" && maxgap != -1L && minoverlap != 0L) stop("when 'type' is \"any\", at least one of 'maxgap' ", "and 'minoverlap' must be set to its default value") overlapwidth <- .overlapwidth(query, subject) ok <- overlapwidth >= minoverlap if (type == "any") { gapwidth <- .gapwidth(query, subject) ok <- ok & gapwidth <= maxgap return(ok) } if (maxgap == -1L) maxgap <- 0L if (type != "end") d1 <- abs(start(subject) - start(query)) if (type != "start") d2 <- abs(end(subject) - end(query)) if (type == "start") return(ok & d1 <= maxgap) if (type == "end") return(ok & d2 <= maxgap) if (type == "equal") return(ok & d1 <= maxgap & d2 <= maxgap) if (type == "within") { ok2 <- start(query) >= start(subject) & end(query) <= end(subject) } else { # type == "extend" ok2 <- start(query) <= start(subject) & end(query) >= end(subject) } ok <- ok & ok2 if (maxgap > 0L) ok <- ok & (d1 + d2) <= maxgap ok } .findOverlaps_naive <- function(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "extend", "equal"), select=c("all", "first", "last", "arbitrary", "count")) { type <- match.arg(type) select <- match.arg(select) hits_per_query <- lapply(seq_along(query), function(i) which(.get_query_overlaps(query[i], subject, maxgap=maxgap, minoverlap=minoverlap, type=type))) hits <- .make_Hits_from_q2s(hits_per_query, length(subject)) selectHits(hits, select=select) } test_NCList <- function() { x <- IRanges(rep.int(1:6, 6:1), c(0:5, 1:5, 2:5, 3:5, 4:5, 5), names=LETTERS[1:21]) mcols(x) <- DataFrame(score=seq(0.7, by=0.045, length.out=21)) nclist <- NCList(x) checkTrue(is(nclist, "NCList")) checkTrue(validObject(nclist, complete=TRUE)) checkIdentical(x, ranges(nclist, use.mcols=TRUE)) checkIdentical(length(x), length(nclist)) checkIdentical(names(x), names(nclist)) checkIdentical(start(x), start(nclist)) checkIdentical(end(x), end(nclist)) checkIdentical(width(x), width(nclist)) checkIdentical(x, as(nclist, "IRanges")) checkIdentical(x[-6], as(nclist[-6], "IRanges")) } ### Test findOverlaps_NCList() *default* behavior, that is, with all optional ### arguments (i.e. 'maxgap', 'minoverlap', 'type', 'select', and ### 'circle.length') set to their default value. test_findOverlaps_NCList <- function() { query <- IRanges(-3:7, width=3) subject <- IRanges(rep.int(1:6, 6:1), c(0:5, 1:5, 2:5, 3:5, 4:5, 5)) target0 <- .findOverlaps_naive(query, subject) current <- findOverlaps_NCList(query, NCList(subject)) checkTrue(.compare_hits(target0, current)) current <- findOverlaps_NCList(NCList(query), subject) checkTrue(.compare_hits(target0, current)) current <- findOverlaps_NCList(query, subject) checkTrue(.compare_hits(target0, current)) ## Shuffle query and/or subject elements. permute_input <- function(q_perm, s_perm) { q_revperm <- integer(length(q_perm)) q_revperm[q_perm] <- seq_along(q_perm) s_revperm <- integer(length(s_perm)) s_revperm[s_perm] <- seq_along(s_perm) target <- remapHits(target0, Lnodes.remapping=q_revperm, new.nLnode=length(q_perm), Rnodes.remapping=s_revperm, new.nRnode=length(s_perm)) current <- findOverlaps_NCList(query[q_perm], NCList(subject[s_perm])) checkTrue(.compare_hits(target, current)) current <- findOverlaps_NCList(NCList(query[q_perm]), subject[s_perm]) checkTrue(.compare_hits(target, current)) current <- findOverlaps_NCList(query[q_perm], subject[s_perm]) checkTrue(.compare_hits(target, current)) } q_perm <- rev(seq_along(query)) s_perm <- rev(seq_along(subject)) permute_input(q_perm, seq_along(subject)) # reverse query permute_input(seq_along(query), s_perm) # reverse subject permute_input(q_perm, s_perm) # reverse both set.seed(97) for (i in 1:33) { ## random permutations q_perm <- sample(length(query)) s_perm <- sample(length(subject)) permute_input(q_perm, seq_along(subject)) permute_input(seq_along(query), s_perm) permute_input(q_perm, s_perm) } } test_findOverlaps_NCList_with_filtering <- function() { query <- IRanges(-3:7, width=3) subject <- IRanges(rep.int(1:6, 6:1), c(0:5, 1:5, 2:5, 3:5, 4:5, 5)) pp_query <- NCList(query) pp_subject <- NCList(subject) for (type in c("any", "start", "end", "within", "extend", "equal")) { for (maxgap in -1:3) { if (type != "any" || maxgap == -1L) max_minoverlap <- 4L else max_minoverlap <- 0L for (minoverlap in 0:max_minoverlap) { for (select in c("all", "first", "last", "count")) { ## query - subject target <- .findOverlaps_naive(query, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) current <- findOverlaps_NCList(query, pp_subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) checkTrue(.compare_hits(target, current)) current <- findOverlaps_NCList(pp_query, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) checkTrue(.compare_hits(target, current)) current <- findOverlaps_NCList(query, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) checkTrue(.compare_hits(target, current)) ## subject - query target <- .findOverlaps_naive(subject, query, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) current <- findOverlaps_NCList(pp_subject, query, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) checkTrue(.compare_hits(target, current)) current <- findOverlaps_NCList(subject, pp_query, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) checkTrue(.compare_hits(target, current)) current <- findOverlaps_NCList(subject, query, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) checkTrue(.compare_hits(target, current)) ## subject - subject target <- .findOverlaps_naive(subject, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) current <- findOverlaps_NCList(pp_subject, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) checkTrue(.compare_hits(target, current)) current <- findOverlaps_NCList(subject, pp_subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) checkTrue(.compare_hits(target, current)) current <- findOverlaps_NCList(subject, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select=select) checkTrue(.compare_hits(target, current)) } } } } } ### Only test "start" and "end" types at the moment. test_findOverlaps_NCList_special_types <- function() { x <- IRanges(10, 10) x1 <- IRanges(10, 9) y1 <- IRanges(start=c(7, 7, 13, 13), width=c(2, 0, 2, 0)) stopifnot(all(abs(start(x) - start(y1)) == 3L)) stopifnot(all(abs(start(x1) - start(y1)) == 3L)) x2 <- IRanges(11, 10) y2 <- IRanges(end=c(7, 7, 13, 13), width=c(2, 0, 2, 0)) stopifnot(all(abs(end(x) - end(y2)) == 3L)) stopifnot(all(abs(end(x2) - end(y2)) == 3L)) test_maxgap_and_type <- function(maxgap, minoverlap, nhit) { hits <- findOverlaps(x, y1, maxgap=maxgap, minoverlap=minoverlap, type="start") checkEquals(nhit, length(hits)) hits <- findOverlaps(y1, x, maxgap=maxgap, minoverlap=minoverlap, type="start") checkEquals(nhit, length(hits)) hits <- findOverlaps(x1, y1, maxgap=maxgap, minoverlap=minoverlap, type="start") checkEquals(nhit, length(hits)) hits <- findOverlaps(y1, x1, maxgap=maxgap, minoverlap=minoverlap, type="start") checkEquals(nhit, length(hits)) hits <- findOverlaps(x, y2, maxgap=maxgap, minoverlap=minoverlap, type="end") checkEquals(nhit, length(hits)) hits <- findOverlaps(y2, x, maxgap=maxgap, minoverlap=minoverlap, type="end") checkEquals(nhit, length(hits)) hits <- findOverlaps(x2, y2, maxgap=maxgap, minoverlap=minoverlap, type="end") checkEquals(nhit, length(hits)) hits <- findOverlaps(y2, x2, maxgap=maxgap, minoverlap=minoverlap, type="end") checkEquals(nhit, length(hits)) } ## no hits for (maxgap in -1:2) { test_maxgap_and_type(maxgap, minoverlap=1L, 0L) test_maxgap_and_type(maxgap, minoverlap=0L, 0L) } for (maxgap in 3:5) { ## no hits test_maxgap_and_type(maxgap, minoverlap=1L, 0L) ## 4 hits test_maxgap_and_type(maxgap, minoverlap=0L, 4L) } } .test_arbitrary_selection <- function(query, subject) { pp_query <- NCList(query) pp_subject <- NCList(subject) for (type in c("any", "start", "end", "within", "extend", "equal")) { for (maxgap in -1:3) { if (type != "any" || maxgap == -1L) max_minoverlap <- 4L else max_minoverlap <- 0L for (minoverlap in 0:max_minoverlap) { target <- as(.findOverlaps_naive(query, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select="all"), "CompressedIntegerList") target_idx0 <- elementNROWS(target) == 0L check_arbitrary_hits <- function(current) { current_idx0 <- is.na(current) checkIdentical(target_idx0, current_idx0) current <- as(current, "CompressedIntegerList") checkTrue(all(current_idx0 | as.logical(current %in% target))) } current <- findOverlaps_NCList(query, pp_subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select="arbitrary") check_arbitrary_hits(current) current <- findOverlaps_NCList(pp_query, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select="arbitrary") check_arbitrary_hits(current) current <- findOverlaps_NCList(query, subject, maxgap=maxgap, minoverlap=minoverlap, type=type, select="arbitrary") check_arbitrary_hits(current) } } } } test_findOverlaps_NCList_arbitrary <- function() { query <- IRanges(4:3, 6) subject <- IRanges(2:4, 10) .test_arbitrary_selection(query, subject) query <- IRanges(-3:7, width=3) subject <- IRanges(rep.int(1:6, 6:1), c(0:5, 1:5, 2:5, 3:5, 4:5, 5)) .test_arbitrary_selection(query, subject) } .test_circularity <- function(query0, subject0, circle_length, target0, pp, findOverlaps_pp, type) { for (i in -2:2) { query <- shift(query0, shift=i*circle_length) pp_query <- pp(query, circle.length=circle_length) for (j in -2:2) { subject <- shift(subject0, shift=j*circle_length) pp_subject <- pp(subject, circle.length=circle_length) for (select in c("all", "first", "last", "count")) { target <- .select_hits(target0, select=select) current <- findOverlaps_pp(query, pp_subject, type=type, select=select, circle.length=circle_length) checkTrue(.compare_hits(target, current)) current <- findOverlaps_pp(pp_query, subject, type=type, select=select, circle.length=circle_length) checkTrue(.compare_hits(target, current)) current <- findOverlaps_pp(query, subject, type=type, select=select, circle.length=circle_length) checkTrue(.compare_hits(target, current)) target <- .select_hits(.transpose_hits(target0), select=select) current <- findOverlaps_pp(pp_subject, query, type=type, select=select, circle.length=circle_length) checkTrue(.compare_hits(target, current)) current <- findOverlaps_pp(subject, pp_query, type=type, select=select, circle.length=circle_length) checkTrue(.compare_hits(target, current)) current <- findOverlaps_pp(subject, query, type=type, select=select, circle.length=circle_length) checkTrue(.compare_hits(target, current)) } } } } test_findOverlaps_NCList_with_circular_space <- function() { query <- IRanges(-2:17, width=3) subject <- IRanges(c(4, -1, 599), c(7, 0, 999)) circle_length <- 10L ## type "any" s2q <- list(c(5:10, 15:20L), c(1:3, 10:13, 20L), 1:20) target <- .make_Hits_from_s2q(s2q, length(query)) .test_circularity(query, subject, circle_length, target, NCList, findOverlaps_NCList, "any") ## type "start" s2q <- lapply(start(subject), function(s) which((start(query) - s) %% circle_length == 0L)) target <- .make_Hits_from_s2q(s2q, length(query)) .test_circularity(query, subject, circle_length, target, NCList, findOverlaps_NCList, "start") ## type "end" s2q <- lapply(end(subject), function(e) which((end(query) - e) %% circle_length == 0L)) target <- .make_Hits_from_s2q(s2q, length(query)) .test_circularity(query, subject, circle_length, target, NCList, findOverlaps_NCList, "end") } test_NCLists <- function() { x1 <- IRanges(-3:7, width=3) x2 <- IRanges() x3 <- IRanges(rep.int(1:6, 6:1), c(0:5, 1:5, 2:5, 3:5, 4:5, 5)) x <- IRangesList(x1=x1, x2=x2, x3=x3) mcols(x) <- DataFrame(label=c("first", "second", "third")) nclists <- NCLists(x) checkTrue(is(nclists, "NCLists")) checkTrue(validObject(nclists, complete=TRUE)) checkIdentical(x, ranges(nclists, use.mcols=TRUE)) checkIdentical(length(x), length(nclists)) checkIdentical(names(x), names(nclists)) checkIdentical(start(x), start(nclists)) checkIdentical(end(x), end(nclists)) checkIdentical(width(x), width(nclists)) checkIdentical(x, as(nclists, "IRangesList")) checkIdentical(x[-1], as(nclists[-1], "IRangesList")) checkIdentical(elementNROWS(x), elementNROWS(nclists)) nclist <- nclists[[3]] checkTrue(is(nclist, "NCList")) checkTrue(validObject(nclist, complete=TRUE)) checkIdentical(x3, as(nclist, "IRanges")) } test_findOverlaps_NCLists <- function() { ir1 <- IRanges(-3:7, width=3) ir2 <- IRanges(rep.int(1:6, 6:1), c(0:5, 1:5, 2:5, 3:5, 4:5, 5)) target0 <- mapply(findOverlaps_NCList, list(ir1, ir2), list(ir2, ir1)) for (compress in c(TRUE, FALSE)) { query <- IRangesList(ir1, ir2, IRanges(2, 7), compress=compress) pp_query <- NCLists(query) subject <- IRangesList(ir2, ir1, compress=compress) pp_subject <- NCLists(subject) for (select in c("all", "first", "last", "count")) { target <- .select_hits(target0, select=select) current <- findOverlaps_NCLists(query, pp_subject, select=select) checkTrue(.compare_hits(target, current)) current <- findOverlaps_NCLists(pp_query, subject, select=select) checkTrue(.compare_hits(target, current)) current <- findOverlaps_NCLists(query, subject, select=select) checkTrue(.compare_hits(target, current)) } } } test_findOverlaps_NCLists_with_circular_space <- function() { query1 <- IRanges(-2:17, width=3) subject1 <- IRanges(c(4, -1, 599), c(7, 0, 999)) query <- IRangesList(query1, IRanges(), subject1) subject <- IRangesList(subject1, IRanges(), query1) circle_length <- c(10L, NA_integer_, 10L) s2q <- list(c(5:10, 15:20L), c(1:3, 10:13, 20L), 1:20) target1 <- .make_Hits_from_s2q(s2q, length(query1)) target2 <- .make_Hits_from_s2q(list(), 0) target3 <- .transpose_hits(target1) target <- list(target1, target2, target3) .test_circularity(query, subject, circle_length, target, NCLists, findOverlaps_NCLists, "any") } IRanges/inst/unitTests/test_Ranges-comparison.R0000644000175100017510000000367714626176651022705 0ustar00biocbuildbiocbuildtest_pcompare_IntegerRanges <- function() { x1 <- IRanges(6:16, width=4) y <- IRanges(11, 14) target <- c(-6:-4, -4L, -4L, 0L, 4L, 4L, 4:6) checkIdentical(target, pcompare(x1, y)) checkIdentical(-target, pcompare(y, x1)) x2 <- IRanges(4:16, width=6) target <- c(-6:-4, -4L, -4L, -3L, -2L, 1L, 4L, 4L, 4:6) checkIdentical(target, pcompare(x2, y)) checkIdentical(-target, pcompare(y, x2)) x3 <- IRanges(8:16, width=2) target <- c(-6:-4, -1L, 2L, 3L, 4:6) checkIdentical(target, pcompare(x3, y)) checkIdentical(-target, pcompare(y, x3)) ## Moving a 0-width range over a non 0-width range. ## Note that when the end of the 0-width range is equal to the start of ## the non 0-width range minus 1, returning code -5 (which describes ## a situation of adjacent ranges) seems appropriate. ## However, one could argue that returning code -1 (which describes a ## situation where one range is inside the other) would also be ## appropriate, because, in that case, the two ranges have the same start. ## So the question really is whether the 0-width range should be considered ## *outside* or *inside* the non 0-width range. ## It's an arbitrary choice and we chose the former. x0 <- IRanges(10:16, width=0) target <- c(-6:-5, 2L, 2L, 2L, 5:6) checkIdentical(target, pcompare(x0, y)) checkIdentical(-target, pcompare(y, x0)) ## Moving a 0-width range over a 0-width range. y0 <- IRanges(13, 12) target <- c(-6L, -6L, -6L, 0L, 6L, 6L, 6L) checkIdentical(target, pcompare(x0, y0)) checkIdentical(-target, pcompare(y0, x0)) } test_order_IntegerRanges <- function() { ir1 <- IRanges(c(2,5,1,5), c(3,7,3,6)) ir1.sort <- IRanges(c(1,2,5,5), c(3,3,6,7)) ir1.rev <- IRanges(c(5,5,2,1), c(7,6,3,3)) checkIdentical(sort(ir1), ir1.sort) checkIdentical(sort(ir1, decreasing=TRUE), ir1.rev) checkException(sort(ir1, decreasing=NA), silent = TRUE) } IRanges/inst/unitTests/test_RleViews.R0000644000175100017510000001615614626176651021052 0ustar00biocbuildbiocbuildtest_RleViews <- function() { empty <- Views(Rle(), IRanges()) checkIdentical(empty, new("RleViews")) checkIdentical(list(), viewApply(empty, min)) checkIdentical(integer(0), viewMins(empty)) checkIdentical(integer(0), viewMaxs(empty)) checkIdentical(integer(0), viewSums(empty)) checkIdentical(numeric(0), viewMeans(empty)) checkIdentical(integer(0), viewWhichMins(empty)) checkIdentical(integer(0), viewWhichMaxs(empty)) checkIdentical(IRanges(), viewRangeMins(empty)) checkIdentical(IRanges(), viewRangeMaxs(empty)) x <- rep(c(1L, 3L, NA, 7L, 9L), 1:5) xRle <- Rle(x) xRleViewsUntrimmed <- Views(xRle, IRanges(start = c(1,1), width = c(0,20))) checkIdentical(c(Inf, 1), suppressWarnings(viewApply(xRleViewsUntrimmed, min, na.rm = TRUE))) checkIdentical(c(2147483647L, 1L), viewMins(xRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(viewMins(xRleViewsUntrimmed, na.rm = TRUE), min(xRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(c(-2147483647L, 9L), viewMaxs(xRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(viewMaxs(xRleViewsUntrimmed, na.rm = TRUE), max(xRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(c(0L, 80L), viewSums(xRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(viewSums(xRleViewsUntrimmed, na.rm = TRUE), sum(xRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(c(NaN, 20/3), viewMeans(xRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(viewMeans(xRleViewsUntrimmed, na.rm = TRUE), mean(xRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(c(NA_integer_, 1L), viewWhichMins(xRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(viewWhichMins(xRleViewsUntrimmed, na.rm = TRUE), which.min(xRleViewsUntrimmed)) checkIdentical(c(NA_integer_, 11L), viewWhichMaxs(xRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(viewWhichMaxs(xRleViewsUntrimmed, na.rm = TRUE), which.max(xRleViewsUntrimmed)) checkException(max(xRleViewsUntrimmed, xRleViewsUntrimmed, na.rm = TRUE), silent = TRUE) xRleViews <- Views(xRle, start = c(1, 3, 5, 7, 9), end = c(1, 13, 11, 10, 9), names = letters[1:5]) xList <- lapply(structure(seq_len(length(xRleViews)), names = letters[1:5]), function(i) window(x, start = start(xRleViews)[i], end = end(xRleViews)[i])) checkIdentical(letters[1:5], names(viewApply(xRleViews, min))) checkIdentical(letters[1:5], names(viewMins(xRleViews))) checkIdentical(letters[1:5], names(viewMaxs(xRleViews))) checkIdentical(letters[1:5], names(viewSums(xRleViews))) checkIdentical(letters[1:5], names(viewMeans(xRleViews))) checkIdentical(letters[1:5], names(viewWhichMins(xRleViews))) checkIdentical(letters[1:5], names(viewWhichMaxs(xRleViews))) checkIdentical(letters[1:5], names(viewRangeMins(xRleViews, na.rm = TRUE))) checkIdentical(letters[1:5], names(viewRangeMaxs(xRleViews, na.rm = TRUE))) checkEqualsNumeric(sapply(xList, min), viewMins(xRleViews)) checkEqualsNumeric(sapply(xList, min), viewApply(xRleViews, min)) checkEqualsNumeric(sapply(xList, min, na.rm = TRUE), viewMins(xRleViews, na.rm = TRUE)) checkEqualsNumeric(sapply(xList, min, na.rm = TRUE), viewApply(xRleViews, min, na.rm = TRUE)) checkEqualsNumeric(sapply(xList, max), viewMaxs(xRleViews)) checkEqualsNumeric(sapply(xList, max), viewApply(xRleViews, max)) checkEqualsNumeric(sapply(xList, max, na.rm = TRUE), viewMaxs(xRleViews, na.rm = TRUE)) checkEqualsNumeric(sapply(xList, max, na.rm = TRUE), viewApply(xRleViews, max, na.rm = TRUE)) checkEqualsNumeric(sapply(xList, sum), viewSums(xRleViews)) checkEqualsNumeric(sapply(xList, mean), viewMeans(xRleViews)) checkEqualsNumeric(sapply(xList, sum), viewApply(xRleViews, sum)) checkEqualsNumeric(sapply(xList, sum, na.rm = TRUE), viewSums(xRleViews, na.rm = TRUE)) checkEqualsNumeric(sapply(xList, mean, na.rm = TRUE), viewMeans(xRleViews, na.rm = TRUE)) checkEqualsNumeric(sapply(xList, sum, na.rm = TRUE), viewApply(xRleViews, sum, na.rm = TRUE)) y <- rep(c(1.2, 3.4, NA, 7.8, 9.0), 1:5) yRle <- Rle(y) yRleViewsUntrimmed <- Views(yRle, IRanges(start = c(1,1), width = c(0,20))) checkIdentical(c(Inf, 1.2), suppressWarnings(viewApply(yRleViewsUntrimmed, min, na.rm = TRUE))) checkIdentical(c(Inf, 1.2), viewMins(yRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(c(-Inf, 9), viewMaxs(yRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(c(0, 84.2), viewSums(yRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(c(NaN, 84.2/12), viewMeans(yRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(c(NA_integer_, 1L), viewWhichMins(yRleViewsUntrimmed, na.rm = TRUE)) checkIdentical(c(NA_integer_, 11L), viewWhichMaxs(yRleViewsUntrimmed, na.rm = TRUE)) yRleViews <- Views(yRle, start = c(1, 3, 5, 7, 9), end = c(1, 13, 11, 10, 9)) yList <- lapply(seq_len(length(yRleViews)), function(i) window(y, start = start(yRleViews)[i], end = end(yRleViews)[i])) checkEqualsNumeric(sapply(yList, min), viewMins(yRleViews)) checkEqualsNumeric(sapply(yList, min), viewApply(yRleViews, min)) checkEqualsNumeric(sapply(yList, min, na.rm = TRUE), viewMins(yRleViews, na.rm = TRUE)) checkEqualsNumeric(sapply(yList, min, na.rm = TRUE), viewApply(yRleViews, min, na.rm = TRUE)) checkEqualsNumeric(sapply(yList, max), viewMaxs(yRleViews)) checkEqualsNumeric(sapply(yList, max), viewApply(yRleViews, max)) checkEqualsNumeric(sapply(yList, max, na.rm = TRUE), viewMaxs(yRleViews, na.rm = TRUE)) checkEqualsNumeric(sapply(yList, max, na.rm = TRUE), viewApply(yRleViews, max, na.rm = TRUE)) checkEqualsNumeric(sapply(yList, sum), viewSums(yRleViews)) checkEqualsNumeric(sapply(yList, mean), viewMeans(yRleViews)) checkEqualsNumeric(sapply(yList, sum), viewApply(yRleViews, sum)) checkEqualsNumeric(sapply(yList, sum, na.rm = TRUE), viewSums(yRleViews, na.rm = TRUE)) checkEqualsNumeric(sapply(yList, mean, na.rm = TRUE), viewMeans(yRleViews, na.rm = TRUE)) checkEqualsNumeric(sapply(yList, sum, na.rm = TRUE), viewApply(yRleViews, sum, na.rm = TRUE)) z <- rep(c(1+1i, 3.4-1i, NA, 7.8+3i, 9.0-2i), 1:5) zRle <- Rle(z) zRleViews <- Views(zRle, start = c(1, 3, 5, 7, 9), end = c(1, 13, 11, 10, 9)) zList <- lapply(seq_len(length(zRleViews)), function(i) window(z, start = start(zRleViews)[i], end = end(zRleViews)[i])) checkEqualsNumeric(sapply(zList, sum), viewSums(zRleViews)) checkEqualsNumeric(sapply(zList, mean), viewMeans(zRleViews)) checkEqualsNumeric(sapply(zList, sum), viewApply(zRleViews, sum)) checkEqualsNumeric(sapply(zList, sum, na.rm = TRUE), viewSums(zRleViews, na.rm = TRUE)) checkEqualsNumeric(sapply(zList, mean, na.rm = TRUE), viewMeans(zRleViews, na.rm = TRUE)) checkEqualsNumeric(sapply(zList, sum, na.rm = TRUE), viewApply(zRleViews, sum, na.rm = TRUE)) } IRanges/inst/unitTests/test_RleViewsList.R0000644000175100017510000001302714626176651021700 0ustar00biocbuildbiocbuildtest_RleViewsList <- function() { x1 <- rep(c(1L, 3L, NA, 7L, 9L), 1:5) x1Rle <- Rle(x1) x1Ranges <- IRanges(start = c(1, 3, 5, 7, 9), end = c(1, 13, 11, 10, 9)) x2 <- rev(x1) x2Rle <- Rle(x2) x2Ranges <- IRanges(start = c(2, 4, 6, 8, 10), end = c(3, 9, 11, 13, 15)) checkIdentical(RleViewsList(Views(x1Rle, x1Ranges), Views(x2Rle, x2Ranges)), RleViewsList(rleList = RleList(x1Rle, x2Rle), rangesList = IRangesList(x1Ranges, x2Ranges))) xRleViewsList <- RleViewsList(a = Views(x1Rle, x1Ranges), b = Views(x2Rle, x2Ranges)) xList <- list(a = lapply(seq_len(length(xRleViewsList[[1]])), function(i) window(x1, start = start(x1Ranges)[i], end = end(x1Ranges)[i])), b = lapply(seq_len(length(xRleViewsList[[2]])), function(i) window(x2, start = start(x2Ranges)[i], end = end(x2Ranges)[i]))) checkIdentical(c("a", "b"), names(viewApply(xRleViewsList, min))) checkIdentical(c("a", "b"), names(viewMins(xRleViewsList))) checkIdentical(c("a", "b"), names(viewMaxs(xRleViewsList))) checkIdentical(c("a", "b"), names(viewSums(xRleViewsList))) checkIdentical(c("a", "b"), names(viewMeans(xRleViewsList))) checkIdentical(c("a", "b"), names(viewWhichMins(xRleViewsList))) checkIdentical(c("a", "b"), names(viewWhichMaxs(xRleViewsList))) checkIdentical(c("a", "b"), names(viewRangeMins(xRleViewsList, na.rm = TRUE))) checkIdentical(c("a", "b"), names(viewRangeMaxs(xRleViewsList, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(xList, lapply, min)), unlist(viewMins(xRleViewsList))) checkEqualsNumeric(unlist(lapply(xList, lapply, min)), unlist(viewApply(xRleViewsList, min))) checkEqualsNumeric(unlist(lapply(xList, lapply, min, na.rm = TRUE)), unlist(viewMins(xRleViewsList, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(xList, lapply, min, na.rm = TRUE)), unlist(viewApply(xRleViewsList, min, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(xList, lapply, max)), unlist(viewMaxs(xRleViewsList))) checkEqualsNumeric(unlist(lapply(xList, lapply, max)), unlist(viewApply(xRleViewsList, max))) checkEqualsNumeric(unlist(lapply(xList, lapply, max, na.rm = TRUE)), unlist(viewMaxs(xRleViewsList, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(xList, lapply, max, na.rm = TRUE)), unlist(viewApply(xRleViewsList, max, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(xList, lapply, sum)), unlist(viewSums(xRleViewsList))) checkEqualsNumeric(unlist(lapply(xList, lapply, mean)), unlist(viewMeans(xRleViewsList))) checkEqualsNumeric(unlist(lapply(xList, lapply, sum)), unlist(viewApply(xRleViewsList, sum))) checkEqualsNumeric(unlist(lapply(xList, lapply, sum, na.rm = TRUE)), unlist(viewSums(xRleViewsList, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(xList, lapply, mean, na.rm = TRUE)), unlist(viewMeans(xRleViewsList, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(xList, lapply, sum, na.rm = TRUE)), unlist(viewApply(xRleViewsList, sum, na.rm = TRUE))) y1 <- rep(c(1.2, 3.4, NA, 7.8, 9.0), 1:5) y1Ranges <- IRanges(start = c(1, 3, 5, 7, 9), end = c(1, 13, 11, 10, 9)) y1Rle <- Rle(y1) y2 <- rev(y1) y2Rle <- Rle(y2) y2Ranges <- IRanges(start = c(2, 4, 6, 8, 10), end = c(3, 9, 11, 13, 15)) checkIdentical(RleViewsList(Views(y1Rle, y1Ranges), Views(y2Rle, y2Ranges)), RleViewsList(rleList = RleList(y1Rle, y2Rle), rangesList = IRangesList(y1Ranges, y2Ranges))) yRleViewsList <- RleViewsList(Views(y1Rle, y1Ranges), Views(y2Rle, y2Ranges)) yList <- list(lapply(seq_len(length(yRleViewsList[[1]])), function(i) window(y1, start = start(y1Ranges)[i], end = end(y1Ranges)[i])), lapply(seq_len(length(yRleViewsList[[2]])), function(i) window(y2, start = start(y2Ranges)[i], end = end(y2Ranges)[i]))) checkEqualsNumeric(unlist(lapply(yList, lapply, min)), unlist(viewMins(yRleViewsList))) checkEqualsNumeric(unlist(lapply(yList, lapply, min)), unlist(viewApply(yRleViewsList, min))) checkEqualsNumeric(unlist(lapply(yList, lapply, min, na.rm = TRUE)), unlist(viewMins(yRleViewsList, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(yList, lapply, min, na.rm = TRUE)), unlist(viewApply(yRleViewsList, min, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(yList, lapply, max)), unlist(viewMaxs(yRleViewsList))) checkEqualsNumeric(unlist(lapply(yList, lapply, max)), unlist(viewApply(yRleViewsList, max))) checkEqualsNumeric(unlist(lapply(yList, lapply, max, na.rm = TRUE)), unlist(viewMaxs(yRleViewsList, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(yList, lapply, max, na.rm = TRUE)), unlist(viewApply(yRleViewsList, max, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(yList, lapply, sum)), unlist(viewSums(yRleViewsList))) checkEqualsNumeric(unlist(lapply(yList, lapply, mean)), unlist(viewMeans(yRleViewsList))) checkEqualsNumeric(unlist(lapply(yList, lapply, sum)), unlist(viewApply(yRleViewsList, sum))) checkEqualsNumeric(unlist(lapply(yList, lapply, sum, na.rm = TRUE)), unlist(viewSums(yRleViewsList, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(yList, lapply, mean, na.rm = TRUE)), unlist(viewMeans(yRleViewsList, na.rm = TRUE))) checkEqualsNumeric(unlist(lapply(yList, lapply, sum, na.rm = TRUE)), unlist(viewApply(yRleViewsList, sum, na.rm = TRUE))) } IRanges/inst/unitTests/test_coverage-methods.R0000644000175100017510000000155714626176651022545 0ustar00biocbuildbiocbuildtest_IRanges_coverage <- function() { ir <- IRanges(c(1, 8, 14, 15, 19, 34, 40), width = c(12, 6, 6, 15, 6, 2, 7)) checkIdentical(as.vector(coverage(ir)), rep(c(1L, 2L, 1L, 2L, 3L, 2L, 1L, 0L, 1L, 0L, 1L), c(7, 5, 2, 4, 1, 5, 5, 4, 2, 4, 7))) ir <- IRanges(start=c(-2L, 6L, 9L, -4L, 1L, 0L, -6L, 10L), width=c( 5L, 0L, 6L, 1L, 4L, 3L, 2L, 3L)) checkIdentical(as.vector(coverage(ir)), rep(c(3L, 1L, 0L, 1L, 2L, 1L), c(2, 2, 4, 1, 3, 2))) checkIdentical(as.vector(coverage(ir, shift=7)), rep(c(1L, 0L, 1L, 2L, 3L, 1L, 0L, 1L, 2L, 1L), c(3, 1, 2, 1, 2, 2, 4, 1, 3, 2))) checkIdentical(as.vector(coverage(ir, shift=7, width=27)), rep(c(1L, 0L, 1L, 2L, 3L, 1L, 0L, 1L, 2L, 1L, 0L), c(3, 1, 2, 1, 2, 2, 4, 1, 3, 2, 6))) } IRanges/inst/unitTests/test_extractList.R0000644000175100017510000000030214626176651021602 0ustar00biocbuildbiocbuild### test_relistToClass <- function() { ## TODO } test_relist <- function() { ## TODO } test_splitAsList <- function() { ## TODO } test_extractList <- function() { ## TODO } IRanges/inst/unitTests/test_findOverlaps-methods.R0000644000175100017510000001317514626176651023405 0ustar00biocbuildbiocbuild### test_findOverlaps_IntegerRanges <- function() { ## ..... ## .... ## .. ## x ## xx ## xxx query <- IRanges(c(1, 4, 9), c(5, 7, 10)) subject <- IRanges(c(2, 2, 10), c(2, 3, 12)) result <- findOverlaps(query, subject, select = "first") checkIdentical(result, c(1L, NA, 3L)) result <- findOverlaps(query, subject, select = "last") checkIdentical(result, c(2L, NA, 3L)) result <- findOverlaps(query, subject, select = "arbitrary") checkIdentical(result, c(2L, NA, 3L)) checkOverlap <- function(a, q, s, r, c, self=FALSE) { target <- Hits(q, s, r, c, sort.by.query=TRUE) if (self) target <- as(target, "SortedByQuerySelfHits") checkIdentical(t(a), t(target)) } result <- findOverlaps(query, subject) checkOverlap(result, c(1, 1, 3), c(1, 2, 3), 3, 3) ## with 'maxgap' result <- findOverlaps(query, subject, maxgap = 0L) checkOverlap(result, c(1, 1, 2, 3), c(2, 1, 2, 3), 3, 3) ## with 'minoverlap' result <- findOverlaps(query, subject, minoverlap = 3L) checkOverlap(result, integer(0), integer(0), 3, 3) result <- findOverlaps(query, subject, minoverlap = 2L) checkOverlap(result, 1, 2, 3, 3) result <- findOverlaps(query, subject, minoverlap = 2L, select = "first") checkIdentical(result, c(2L, NA, NA)) result <- findOverlaps(query, subject, minoverlap = 2L, select = "last") checkIdentical(result, c(2L, NA, NA)) result <- findOverlaps(query, subject, minoverlap = 2L, select = "arbitrary") checkIdentical(result, c(2L, NA, NA)) ## zero-width ranges query <- IRanges(9:14, 8:13) result <- findOverlaps(query, subject, minoverlap = 1L) checkOverlap(result, integer(0), integer(0), 6, 3) result <- findOverlaps(query, subject) checkOverlap(result, c(3, 4), c(3, 3), 6, 3) result <- findOverlaps(query, subject, maxgap = 0L) checkOverlap(result, 2:5, c(3, 3, 3, 3), 6, 3) result <- findOverlaps(query, subject, maxgap = 1L) checkOverlap(result, 1:6, c(3, 3, 3, 3, 3, 3), 6, 3) result <- findOverlaps(subject, query, minoverlap = 1L) checkOverlap(result, integer(0), integer(0), 3, 6) result <- findOverlaps(subject, query) checkOverlap(result, c(3, 3), c(3, 4), 3, 6) result <- findOverlaps(subject, query, maxgap = 0L) checkOverlap(result, c(3, 3, 3, 3), 2:5, 3, 6) result <- findOverlaps(subject, query, maxgap = 1L) checkOverlap(result, c(3, 3, 3, 3, 3, 3), 1:6, 3, 6) ## ..... ## .... ## .. ## xxxx ## xxx query <- IRanges(c(1, 4, 9), c(5, 7, 10)) subject <- IRanges(c(2, 2), c(5, 4)) result <- findOverlaps(query, subject) checkOverlap(result, c(1, 1, 2, 2), c(1, 2, 1, 2), 3, 2) result <- findOverlaps(subject, query) checkOverlap(result, c(1, 1, 2, 2), c(1, 2, 1, 2), 2, 3) query <- IRanges(c(1, 4, 9, 11), c(5, 7, 10, 11)) result <- findOverlaps(query) checkOverlap(result, c(1, 1, 2, 2, 3, 4), c(1, 2, 1, 2, 3, 4), 4, 4, TRUE) ## check case of identical subjects ## ..... ## ..... ## .. ## xxxx ## xxxx ## xx ## xxx ## xx query <- IRanges(c(1, 4, 9), c(5, 7, 10)) subject <- IRanges(c(2, 2, 6, 6, 6), c(5, 5, 7, 8, 7)) result <- findOverlaps(query, subject) checkOverlap(result, c(1, 1, 2, 2, 2, 2, 2), c(1, 2, 1, 2, 3, 4, 5), 3, 5) subject <- IRanges(c(1, 6, 13), c(4, 9, 14)) # single points checkIdentical(findOverlaps(c(3L, 7L, 10L), subject, select = "first"), c(1L, 2L, NA)) checkIdentical(findOverlaps(c(3L, 7L, 10L), subject, select = "last"), c(1L, 2L, NA)) checkIdentical(findOverlaps(c(3L, 7L, 10L), subject, select = "arbitrary"), c(1L, 2L, NA)) checkIdentical(findOverlaps(IRanges(c(2,1),c(3,4)), subject), Hits(1:2, c(1, 1), 2, 3, sort.by.query=TRUE)) ## check other types of matching ## .. ## .. ## .... ## ...... ## xxxx ## xxxx ## xxxxx ## xxxx query <- IRanges(c(1, 5, 3, 4), width=c(2, 2, 4, 6)) subject <- IRanges(c(1, 3, 5, 6), width=c(4, 4, 5, 4)) ## 'start' result <- findOverlaps(query, subject, type = "start") checkOverlap(result, c(1, 2, 3), c(1, 3, 2), 4, 4) ## minoverlap > 1L result <- findOverlaps(query, subject, type = "start", minoverlap = 3L) checkOverlap(result, 3, 2, 4, 4) ## 'end' result <- findOverlaps(query, subject, type = "end") checkOverlap(result, c(2, 3, 4, 4), c(2, 2, 3, 4), 4, 4) result <- findOverlaps(subject, query, type = "end") checkOverlap(result, c(2, 2, 3, 4), c(2, 3, 4, 4), 4, 4) ## select = "first" result <- findOverlaps(query, subject, type = "end", select = "first") checkIdentical(result, c(NA, 2L, 2L, 3L)) ## 'within' result <- findOverlaps(query, subject, type = "within") checkOverlap(result, c(1, 2, 2, 3), c(1, 2, 3, 2), 4, 4) ## 'equal' result <- findOverlaps(query, subject, type = "equal") checkOverlap(result, 3, 2, 4, 4) checkException(findOverlaps(query, NULL), silent = TRUE) checkException(findOverlaps(NULL, query), silent = TRUE) } test_subsetByOverlaps_IntegerRanges <- function() { x <- IRanges(9:12, 15) ranges <- IRanges(1, 10) checkIdentical(x[1:2], subsetByOverlaps(x, ranges)) checkIdentical(x[3:4], subsetByOverlaps(x, ranges, invert=TRUE)) checkIdentical(x[1:3], subsetByOverlaps(x, ranges, maxgap=0)) checkIdentical(x[4], subsetByOverlaps(x, ranges, maxgap=0, invert=TRUE)) x <- IRanges(c(1, 4, 9), c(5, 7, 10)) ranges <- IRanges(c(6, 8, 10), c(7, 12, 14)) checkIdentical(x[2:3], subsetByOverlaps(x, ranges)) checkIdentical(x[1], subsetByOverlaps(x, ranges, invert=TRUE)) checkIdentical(x, subsetByOverlaps(x, ranges, maxgap=0)) checkIdentical(x[0], subsetByOverlaps(x, ranges, maxgap=0, invert=TRUE)) } IRanges/inst/unitTests/test_inter-range-methods.R0000644000175100017510000003121314626176651023155 0ustar00biocbuildbiocbuildtest_range_IntegerRanges <- function() { ir1 <- IRanges(c(2,5,1), c(3,7,3)) ir2 <- IRanges(c(5,2,0), c(6,3,1)) checkIdentical(range(ir1), IRanges(1, 7)) checkIdentical(range(ir1, ir2), IRanges(0, 7)) checkIdentical(range(IRanges()), IRanges()) checkException(range(ir1, c(2,3)), silent = TRUE) # check with.revmap rng1 <- range(ir1, with.revmap=TRUE) rng2 <- range(ir2, with.revmap=TRUE) rng3 <- range(ir1,ir2, with.revmap=TRUE) checkIdentical(mcols(rng1)$revmap, IntegerList(seq(3))) checkIdentical(mcols(rng2)$revmap, IntegerList(seq(3))) checkIdentical(mcols(rng3)$revmap, IntegerList(seq(6))) ir3 <- IRanges() checkIdentical(mcols(range(ir3, with.revmap=TRUE))$revmap, IntegerList()) } test_range_IntegerRangesList <- function() { for (compress in c(TRUE, FALSE)) { rl1 <- IRangesList(a = IRanges(c(1,2),c(4,3)), b = IRanges(c(4,6),c(10,7)), compress = compress) rl2 <- IRangesList(c = IRanges(c(0,2),c(4,5)), a = IRanges(c(4,5),c(6,7)), compress = compress) ans <- IRangesList(a = IRanges(1,7), b = IRanges(4,10), c = IRanges(0,5), compress = compress) checkIdentical(range(rl1, rl2), ans) names(rl2) <- NULL ans <- IRangesList(IRanges(0,5), IRanges(4,10), compress = compress) checkIdentical(range(rl1, rl2), ans) ## must be same length checkException(range(rl2, rep.int(rl2, 2L)), silent=TRUE) } # check with.revmap revmap1 <- mcols(range(rl1,rl2, with.revmap=TRUE)[[1]])$revmap revmap2 <- mcols(range(rl1,rl2, with.revmap=TRUE)[[2]])$revmap ans <- IntegerList(seq(4)) checkIdentical(revmap1, ans) checkIdentical(revmap2, ans) range1 <- IRanges(start=c(1, 2, 3), end=c(5, 2, 8)) range2 <- IRanges(start=c(15, 45, 20, 1), end=c(15, 100, 80, 5)) range3 <- IRanges() range4 <- IRanges(start=c(-2, 6, 7), width=c(8, 0, 0)) collection <- IRangesList(range1, range2, range3, range4) rng <- range(collection, with.revmap=TRUE) checkIdentical(mcols(rng[[1]])$revmap, IntegerList(1:3)) checkIdentical(mcols(rng[[2]])$revmap, IntegerList(1:4)) checkIdentical(mcols(rng[[3]])$revmap, IntegerList()) checkIdentical(mcols(rng[[4]])$revmap, IntegerList(1:3)) rng <- range(IRangesList(IRanges(), IRanges()), with.revmap=TRUE) checkIdentical(mcols(rng[[1]])$revmap, IntegerList()) checkIdentical(mcols(rng[[2]])$revmap, IntegerList()) } test_reduce_IntegerRanges <- function() { x <- IRanges() current <- reduce(x) checkIdentical(x, current) x <- IRanges(1:3, width=0) current <- reduce(x, with.revmap=TRUE) target <- x mcols(target) <- DataFrame(revmap=as(seq_along(target), "IntegerList")) checkIdentical(target, current) current <- reduce(x, drop.empty.ranges=TRUE, with.revmap=TRUE) target <- IRanges() mcols(target) <- DataFrame(revmap=IntegerList(seq_along(target))) checkIdentical(target, current) x <- IRanges(c(1:4, 10:11, 11), width=c(0,1,1,0,0,0,1)) current <- reduce(x, with.revmap=TRUE) target <- IRanges(c(1:2, 10:11), width=c(0,2,0,1)) mcols(target) <- DataFrame(revmap=IntegerList(1,2:4,5,6:7)) checkIdentical(target, current) current <- reduce(x, drop.empty.ranges=TRUE, with.revmap=TRUE) target <- IRanges(c(2, 11), width=c(2,1)) mcols(target) <- DataFrame(revmap=IntegerList(2:3,7)) checkIdentical(target, current) x <- IRanges(start=c(1,2,3), end=c(5,2,8)) y <- reduce(x, with.revmap=TRUE) target <- IRanges(start=1, end=8) mcols(target) <- DataFrame(revmap=IntegerList(1:3)) checkIdentical(target, y) mcols(target)$revmap <- as(seq_along(target), "IntegerList") checkIdentical(target, reduce(y, with.revmap=TRUE)) x <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5)) y <- reduce(x, with.revmap=TRUE) target <- IRanges(start=c(1,15,20), end=c(5,15,100)) mcols(target) <- DataFrame(revmap=IntegerList(4, 1, 3:2)) checkIdentical(target, y) mcols(target)$revmap <- as(seq_along(target), "IntegerList") checkIdentical(target, reduce(y, with.revmap=TRUE)) x <- IRanges(start=c(7,3,-2,6,7,-10,-2,3), width=c(3,1,0,0,0,0,8,0)) ## Before reduction: ## start end width ==-10===-5====0===+5==+10=== ## [1] 7 9 3 ....:....:....:....:.xxx:... ## [2] 3 3 1 ....:....:....:..x.:....:... ## [3] -2 -3 0 ....:....:..[.:....:....:... ## [4] 6 5 0 ....:....:....:....:[...:... ## [5] 7 6 0 ....:....:....:....:.[..:... ## [6] -10 -11 0 ....[....:....:....:....:... ## [7] -2 5 8 ....:....:..xxxxxxxx....:... ## [8] 3 2 0 ....:....:....:..[.:....:... ## ---------------------==-10===-5====0===+5==+10=== ## After reduction: ## y1: ....[....:..xxxxxxxx.xxx:... ## y3: ....:....:..xxxxxxxx....:... y1 <- reduce(x) checkIdentical(y1, IRanges(start=c(-10,-2,7), end=c(-11,5,9))) checkIdentical(reduce(y1), y1) y2 <- reduce(x, with.inframe.attrib=TRUE) checkIdentical(start(attr(y2, "inframe")), c(9L,6L,1L,9L,9L,1L,1L,6L)) checkIdentical(width(attr(y2, "inframe")), width(x)) y3 <- reduce(x, drop.empty.ranges=TRUE) checkIdentical(y3, y1[width(y1) != 0L]) checkIdentical(reduce(y3), y3) y4 <- reduce(x, drop.empty.ranges=TRUE, with.inframe.attrib=TRUE) checkIdentical(attr(y4, "inframe"), attr(y2, "inframe")) y5 <- reduce(x, min.gapwidth=0) checkIdentical(y5, IRanges(start=c(-10,-2,-2,6,7,7), end=c(-11,-3,5,5,6,9))) y6 <- reduce(x, drop.empty.ranges=TRUE, min.gapwidth=0) checkIdentical(y6, y5[width(y5) != 0L]) y7 <- reduce(x, min.gapwidth=2) checkIdentical(y7, IRanges(start=c(-10,-2), end=c(-11,9))) y8 <- reduce(x, min.gapwidth=8) checkIdentical(y8, y7) y9 <- reduce(x, min.gapwidth=9) checkIdentical(y9, IRanges(start=-10, end=9)) } test_reduce_IntegerRangesList <- function() { range1 <- IRanges(start=c(1,2,3), end=c(5,2,8)) range2 <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5)) range3 <- IRanges(start=c(3,-2,6,7,-10,-2,3), width=c(1,0,0,0,0,8,0)) range4 <- IRanges() for (compress in c(TRUE, FALSE)) { collection <- IRangesList(one=range1, range2, range3, range4, compress=compress) for (with.revmap in c(FALSE, TRUE)) { for (drop.empty.ranges in c(FALSE, TRUE)) { current <- reduce(collection, drop.empty.ranges=drop.empty.ranges, with.revmap=with.revmap) target <- IRangesList(one=reduce(range1, drop.empty.ranges=drop.empty.ranges, with.revmap=with.revmap), reduce(range2, drop.empty.ranges=drop.empty.ranges, with.revmap=with.revmap), reduce(range3, drop.empty.ranges=drop.empty.ranges, with.revmap=with.revmap), reduce(range4, drop.empty.ranges=drop.empty.ranges, with.revmap=with.revmap), compress=compress) checkIdentical(target, current) } } } } test_gaps_IntegerRanges <- function() { checkIdentical(gaps(IRanges()), IRanges()) checkIdentical(gaps(IRanges(), start=1, end=4), IRanges(start=1, end=4)) x <- IRanges(start=2, end=3) checkIdentical(gaps(x), IRanges()) checkIdentical(gaps(x, start=2), IRanges()) checkIdentical(gaps(x, start=4), IRanges()) checkIdentical(gaps(x, start=0), IRanges(start=0, end=1)) checkIdentical(gaps(x, end=3), IRanges()) checkIdentical(gaps(x, end=1), IRanges()) checkIdentical(gaps(x, end=5), IRanges(start=4, end=5)) checkIdentical(gaps(x, start=0, end=5), IRanges(start=c(0,4), end=c(1,5))) } test_gaps_IntegerRangesList <- function() { range1 <- IRanges(start=c(1,2,3), end=c(5,2,8)) range2 <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5)) for (compress in c(TRUE, FALSE)) { collection <- IRangesList(one = range1, range2, compress = compress) checkIdentical(gaps(collection), IRangesList(one = gaps(range1), gaps(range2), compress = compress)) } } test_disjoin_IntegerRanges <- function() { checkIdentical(disjoin(IRanges()), IRanges()) ir <- IRanges(c(1, 21, 10, 1, 15, 5, 20, 20), c(6, 20, 9, 3, 14, 11, 20, 19)) current <- disjoin(ir) checkTrue(validObject(current, complete=TRUE)) ## The result of disjoin(x) must verify the following properties: check_disjoin_general_properties <- function(y, x) { checkTrue(isDisjoint(y)) checkTrue(isStrictlySorted(y)) checkIdentical(reduce(x, drop.empty.ranges=TRUE), reduce(y)) checkTrue(all(start(y) %in% c(start(x), end(x) + 1L))) checkTrue(all(end(y) %in% c(end(x), start(x) - 1L))) } check_disjoin_general_properties(current, ir) target <- IRanges(c(1, 4, 5, 7, 10, 20), c(3, 4, 6, 9, 11, 20)) checkIdentical(target, current) ## Check 'revmap'. mcols(ir)$label <- LETTERS[seq_along(ir)] current <- disjoin(ir, with.revmap=TRUE) revmap <- IntegerList(c(1, 4), 1, c(1, 6), 6, 6, 7) mcols(target)$revmap <- revmap checkIdentical(target, current) ## With many randomly generated ranges. set.seed(2009L) ir <- IRanges(start=sample(580L, 500L, replace=TRUE), width=sample(10L, 500L, replace=TRUE) - 1L) check_disjoin_general_properties(disjoin(ir), ir) ir <- IRanges(start=sample(4900L, 500L, replace=TRUE), width=sample(35L, 500L, replace=TRUE) - 1L) check_disjoin_general_properties(disjoin(ir), ir) } test_disjoin_IntegerRangesList <- function() { ir0 <- IRanges(10, 20) checkTrue(validObject(disjoin(IRangesList()))) ## unnamed; incl. 0-length irl <- IRangesList(IRanges()) checkIdentical(irl, disjoin(irl)) irl <- IRangesList(ir0, IRanges(), ir0) checkIdentical(irl, disjoin(irl)) irl <- IRangesList(ir0, IRanges(), IRanges(), ir0) checkIdentical(irl, disjoin(irl)) ## named; incl. 0-length irl <- IRangesList(a=IRanges()) checkIdentical(irl, disjoin(irl)) irl <- IRangesList(a=ir0, b=IRanges(), c=ir0) checkIdentical(irl, disjoin(irl)) irl <- IRangesList(a=ir0, b=IRanges(), c=IRanges(), d=ir0) checkIdentical(irl, disjoin(irl)) ## no interference between separate elements ir0 <- IRanges(10, c(15, 20)) dr0 <- disjoin(ir0) irl <- IRangesList(ir0, ir0) checkIdentical(IRangesList(dr0, dr0), disjoin(irl)) irl <- IRangesList(ir0, IRanges(), ir0) checkIdentical(IRangesList(dr0, IRanges(), dr0), disjoin(irl)) ## 0-width ## 1-width ir0 <- IRanges(c(1, 10), 10) irl <- IRangesList(ir0, IRanges()) checkIdentical(disjoin(ir0), disjoin(irl)[[1]]) irl <- IRangesList(IRanges(), ir0) checkIdentical(disjoin(ir0), disjoin(irl)[[2]]) ## check don't collapse levels irl <- IRangesList(IRanges(1, 5), IRanges(3, 7)) names(irl) <- character(2) checkIdentical(irl, disjoin(irl)) ## check 'revmap' on many randomly generated ranges set.seed(2009L) ir1 <- IRanges(start=sample(580L, 500L, replace=TRUE), width=sample(10L, 500L, replace=TRUE) - 1L) ir2 <- IRanges(start=sample(4900L, 500L, replace=TRUE), width=sample(35L, 500L, replace=TRUE) - 1L) for (compress in c(TRUE, FALSE)) { collection <- IRangesList(one=ir1, IRanges(), ir0, ir0, ir2, IRanges(), compress=compress) for (with.revmap in c(FALSE, TRUE)) { current <- disjoin(collection, with.revmap=with.revmap) target <- IRangesList(one=disjoin(ir1, with.revmap=with.revmap), disjoin(IRanges(), with.revmap=with.revmap), disjoin(ir0, with.revmap=with.revmap), disjoin(ir0, with.revmap=with.revmap), disjoin(ir2, with.revmap=with.revmap), disjoin(IRanges(), with.revmap=with.revmap), compress=compress) checkIdentical(target, current) } } } test_disjointBins_IntegerRanges <- function() { checkIdentical(disjointBins(IRanges()), integer()) checkIdentical(disjointBins(IRanges(1, 5)), 1L) checkIdentical(disjointBins(IRanges(c(1, 3), c(5, 12))), c(1L, 2L)) checkIdentical(disjointBins(IRanges(c(1, 3, 10), c(5, 12, 13))), c(1L, 2L, 1L)) checkIdentical(disjointBins(IRanges(c(3, 1, 10), c(5, 12, 13))), c(2L, 1L, 2L)) } IRanges/inst/unitTests/test_intra-range-methods.R0000644000175100017510000002377414626176651023166 0ustar00biocbuildbiocbuildtest_shift_IntegerRanges <- function() { ir0 <- IRanges(0, 0) ir1 <- shift(ir0, .Machine$integer.max) checkTrue(validObject(ir1)) checkIdentical(.Machine$integer.max, start(ir1)) checkIdentical(.Machine$integer.max, end(ir1)) checkIdentical(1L, width(ir1)) checkIdentical(ir1, shift(ir1)) checkIdentical(ir1, shift(shift(ir1, -10), 10)) ir2 <- shift(ir0, -.Machine$integer.max) checkTrue(validObject(ir2)) checkIdentical(-.Machine$integer.max, start(ir2)) checkIdentical(-.Machine$integer.max, end(ir2)) checkIdentical(1L, width(ir2)) checkIdentical(ir2, shift(ir2)) checkIdentical(ir2, shift(shift(ir2, 10), -10)) ## shift() would produce an object with ranges that are not within the ## [-.Machine$integer.max, .Machine$integer.max] range. checkException(suppressWarnings(shift(ir1, 1)), silent=TRUE) checkException(suppressWarnings(shift(ir2, -1)), silent=TRUE) ir3 <- IRanges(1999222000, width=1000) checkException(suppressWarnings(shift(ir3, 188222000)), silent=TRUE) ir4 <- IRanges(1:20, width=222000000) checkException(suppressWarnings(shift(ir4, 1:20 * 99000000L)), silent=TRUE) } test_narrow_IntegerRanges <- function() { ir1 <- IRanges(c(2,5,1), c(3,7,3)) checkIdentical(narrow(ir1, start=1, end=2), IRanges(c(2, 5, 1), c(3, 6, 2))) checkException(narrow(ir1, start=10, end=20), silent = TRUE) } test_narrow_IRangesList <- function() { range1 <- IRanges(start=c(2,5), end=c(3,7)) range2 <- IRanges(start=1, end=3) for (compress in c(TRUE, FALSE)) { collection <- IRangesList(range1, range2, compress = compress) checkIdentical(narrow(collection, start=1, end=2), IRangesList(IRanges(c(2, 5), c(3, 6)), IRanges(1, 2), compress = compress)) checkException(narrow(collection, start=10, end=20), silent = TRUE) } } test_resize_IntegerRanges <- function() { ir1 <- IRanges(c(2,5,1), c(3,7,3)) checkIdentical(resize(ir1, width=10), IRanges(c(2, 5, 1), width=10)) checkIdentical(resize(ir1, width=10, fix="end"), IRanges(c(-6, -2, -6), width=10)) checkIdentical(resize(ir1, width=10, fix="center"), IRanges(c(-2, 1, -3), width=10)) checkIdentical(resize(ir1, width=10, fix=c("start", "end", "center")), IRanges(c(2, -2, -3), width=10)) checkException(resize(ir1, -1), silent = TRUE) } test_resize_IRangesList <- function() { range1 <- IRanges(start=c(2,5), end=c(3,7)) range2 <- IRanges(start=1, end=3) for (compress in c(TRUE, FALSE)) { collection <- IRangesList(range1, range2, compress = compress) checkIdentical(resize(collection, width=10), IRangesList(IRanges(c(2, 5), width=10), IRanges(1, width=10), compress = compress)) checkIdentical(resize(collection, width=10, fix="end"), IRangesList(IRanges(c(-6, -2), width=10), IRanges(-6, width=10), compress = compress)) checkIdentical(resize(collection, width=10, fix="center"), IRangesList(IRanges(c(-2, 1), width=10), IRanges(-3, width=10), compress = compress)) checkIdentical(resize(collection, width=10, fix=CharacterList(c("start", "end"), "center")), IRangesList(IRanges(c(2, -2), width=10), IRanges(-3, width=10), compress = compress)) checkException(resize(collection, -1), silent = TRUE) } } test_flank_IntegerRanges <- function() { checkIdentical(flank(IRanges(), 2), IRanges()) ir1 <- IRanges(c(2, 5, 1), c(3, 7, 3)) checkIdentical(flank(ir1, 2), IRanges(c(0, 3, -1), c(1, 4, 0))) checkIdentical(flank(ir1, 2, FALSE), IRanges(c(4, 8, 4), c(5, 9, 5))) checkIdentical(flank(ir1, 2, c(FALSE, TRUE, FALSE)), IRanges(c(4, 3, 4), c(5, 4, 5))) checkIdentical(flank(ir1, c(2, -2, 2)), IRanges(c(0, 5, -1), c(1, 6, 0))) checkIdentical(flank(ir1, 2, both = TRUE), IRanges(c(0, 3, -1), c(3, 6, 2))) checkIdentical(flank(ir1, 2, FALSE, TRUE), IRanges(c(2, 6, 2), c(5, 9, 5))) checkIdentical(flank(ir1, -2, FALSE, TRUE), IRanges(c(2, 6, 2), c(5, 9, 5))) checkException(flank(ir1, 2, both = c(TRUE, FALSE, TRUE)), silent = TRUE) # not vectorized checkException(flank(ir1, 2, c(FALSE, TRUE, NA)), silent = TRUE) checkException(flank(ir1, NA), silent = TRUE) } test_flank_IRangesList <- function() { range1 <- IRanges(start=c(2,5), end=c(3,7)) range2 <- IRanges(start=1, end=3) for (compress in c(TRUE, FALSE)) { collection <- IRangesList(range1, range2, compress = compress) checkIdentical(flank(collection, 2), IRangesList(IRanges(c(0, 3), c(1, 4)), IRanges(-1, 0), compress = compress)) checkIdentical(flank(collection, 2, FALSE), IRangesList(IRanges(c(4, 8), c(5, 9)), IRanges(4, 5), compress = compress)) checkIdentical(flank(collection, 2, LogicalList(c(FALSE, TRUE), FALSE)), IRangesList(IRanges(c(4, 3), c(5, 4)), IRanges(4, 5), compress = compress)) checkIdentical(flank(collection, IntegerList(c(2, -2), 2)), IRangesList(IRanges(c(0, 5), c(1, 6)), IRanges(-1, 0), compress = compress)) checkIdentical(flank(collection, 2, both = TRUE), IRangesList(IRanges(c(0, 3), c(3, 6)), IRanges(-1, 2), compress = compress)) checkIdentical(flank(collection, 2, FALSE, TRUE), IRangesList(IRanges(c(2, 6), c(5, 9)), IRanges(2, 5), compress = compress)) checkIdentical(flank(collection, -2, FALSE, TRUE), IRangesList(IRanges(c(2, 6), c(5, 9)), IRanges(2, 5), compress = compress)) checkException(flank(collection, 2, both = c(TRUE, FALSE, TRUE)), silent = TRUE) # not vectorized checkException(flank(collection, 2, LogicalList(c(FALSE, TRUE), NA)), silent = TRUE) checkException(flank(collection, NA), silent = TRUE) } } test_promoters_IntegerRanges <- function() { ir0 <- IRanges(score=integer(0)) # zero-length checkIdentical(promoters(ir0, 5, 2), ir0) checkIdentical(terminators(ir0, 5, 2), ir0) score <- c(0.8, 0.55) ir <- IRanges(c(10, 10), width=c(0, 1), score=score) checkIdentical(promoters(ir, 0, 0), IRanges(c(10, 10), width=0, score=score)) checkIdentical(terminators(ir, 0, 0), IRanges(c(9, 10), width=0, score=score)) checkIdentical(promoters(ir, 1, 0), IRanges(c(9, 9), width=1, score=score)) checkIdentical(terminators(ir, 1, 0), IRanges(c(8, 9), width=1, score=score)) checkIdentical(promoters(ir, 0, 1), IRanges(c(10, 10), width=1, score=score)) checkIdentical(terminators(ir, 0, 1), IRanges(c(9, 10), width=1, score=score)) ir <- IRanges(c(5, 2, 20), width=1) checkIdentical(promoters(ir, 5, 2), IRanges(c(0, -3, 15), width=7)) checkIdentical(terminators(ir, 5, 2), promoters(ir, 5, 2)) library(XVector) subject <- XInteger(10, 3:-6) view <- Views(subject, start=4:2, end=4:6) current <- promoters(view, 0, 0) checkIdentical(start(current), start(view)) checkIdentical(width(current), rep.int(0L, length(view))) current <- terminators(view, 0, 0) checkIdentical(start(current), end(view)) checkIdentical(width(current), rep.int(0L, length(view))) current <- promoters(view, 3, 10) checkIdentical(start(current), start(view) - 3L) checkIdentical(end(current), start(view) + 9L) current <- terminators(view, 3, 10) checkIdentical(start(current), end(view) - 3L) checkIdentical(end(current), end(view) + 9L) } test_promoters_IRangesList <- function() { irl <- IRangesList("A"=IRanges(5:7, width=1), "B"=IRanges(10:12, width=5)) current <- promoters(irl, 0, 0) checkIdentical(names(current), names(irl)) checkIdentical(start(current), start(irl)) current <- terminators(irl, 0, 0) checkIdentical(names(current), names(irl)) checkIdentical(start(current), end(irl)) current <- promoters(irl, 2, 0) checkIdentical(unique(unlist(width(current))), 2L) current <- terminators(irl, 2, 0) checkIdentical(unique(unlist(width(current))), 2L) } test_reflect_IntegerRanges <- function() { ir1 <- IRanges(c(2,5,1), c(3,7,3)) bounds <- IRanges(c(0, 5, 3), c(10, 6, 9)) checkIdentical(reflect(ir1, bounds), IRanges(c(7, 4, 9), c(8, 6, 11))) checkException(reflect(ir1, IRanges()), silent = TRUE) } test_restrict_IntegerRanges <- function() { ir1 <- IRanges(c(2,5,1), c(3,7,3)) checkIdentical(restrict(ir1, start=2, end=5), IRanges(c(2, 5, 2), c(3, 5, 3))) checkIdentical(restrict(ir1, start=1, end=2), IRanges(c(2, 1), c(2, 2))) checkIdentical(restrict(ir1, start=1, end=2, keep.all.ranges=TRUE), IRanges(c(2, 3, 1), c(2, 2, 2))) } test_restrict_IRangesList <- function() { range1 <- IRanges(start=c(2,5), end=c(3,7)) range2 <- IRanges(start=1, end=3) for (compress in c(TRUE, FALSE)) { collection <- IRangesList(range1, range2, compress = compress) checkIdentical(restrict(collection, start=2, end=5), IRangesList(IRanges(c(2, 5), c(3, 5)), IRanges(2, 3), compress = compress)) checkIdentical(restrict(collection, start=1, end=2), IRangesList(IRanges(2, 2), IRanges(1, 2), compress = compress)) checkIdentical(restrict(collection, start=1, end=2, keep.all.ranges=TRUE), IRangesList(IRanges(c(2, 3), c(2, 2)), IRanges(1, 2), compress = compress)) } } test_zoom_IntegerRanges <- function() { ir <- IRanges(c(1,5), c(3,10)) checkIdentical(ir*1, ir) checkIdentical(ir*c(1,2), IRanges(c(1,6), c(3, 8))) checkIdentical(ir*-2, IRanges(c(-1,2), c(4, 13))) checkException(ir*NA_integer_, silent = TRUE) checkException(ir*numeric(), silent = TRUE) checkException(ir*c(1,2,1), silent = TRUE) checkException(ir[rep(1,3)]*c(1,2), silent = TRUE) } IRanges/inst/unitTests/test_nearest-methods.R0000644000175100017510000001165114626176651022407 0ustar00biocbuildbiocbuildcheckMatching <- function(a, q, s, r, c) { mat <- cbind(queryHits = as.integer(q), subjectHits = as.integer(s)) checkIdentical(as.matrix(a), mat) checkIdentical(c(queryLength(a), subjectLength(a)), as.integer(c(r, c))) } test_precede_follow_IntegerRanges <- function() { query <- IRanges(c(1, 3, 9), c(3, 7, 10)) subject <- IRanges(c(3, 10, 2), c(3, 12, 5)) checkIdentical(precede(query, subject), c(2L, 2L, NA)) checkIdentical(precede(IRanges(), subject), integer()) checkIdentical(precede(query, IRanges()), rep(NA_integer_, 3)) checkIdentical(precede(query), c(3L, 3L, NA)) checkIdentical(follow(query, subject), c(NA, NA, 3L)) checkIdentical(follow(IRanges(), subject), integer()) checkIdentical(follow(query, IRanges()), rep(NA_integer_, 3)) checkIdentical(follow(query), c(NA, NA, 2L)) checkMatching(precede(query, subject, select="all"), c(1, 2), c(2, 2), 3, 3) ## xxxx ## xxx ## xx ## xx ## xxx ## .. ## .. ## .. ## .. ## .. subject <- IRanges(c(1, 2, 9, 15, 15), width=c(4, 3, 2, 2, 3)) query <- IRanges(c(6, 11, 1, 13, 18), width=c(2, 2, 2, 2, 2)) checkMatching(precede(query, subject, select="all"), c(1, 2, 2, 3, 4, 4), c(3, 4, 5, 3, 4, 5), 5, 5) checkMatching(precede(subject, query, select="all"), c(1, 2, 3, 4, 5), c(1, 1, 2, 5, 5), 5, 5) checkMatching(follow(query, subject, select="all"), c(1, 1, 2, 4, 5), c(1, 2, 3, 3, 5), 5, 5) checkMatching(follow(subject, query, select="all"), c(3, 4, 5), c(1, 4, 4), 5, 5) checkMatching(precede(query, select="all"), c(1, 2, 3, 4), c(2, 4, 1, 5), 5, 5) checkMatching(precede(subject, select="all"), c(1, 2, 3, 3), c(3, 3, 4, 5), 5, 5) checkMatching(follow(query, select="all"), c(1, 2, 4, 5), c(3, 1, 2, 4), 5, 5) checkMatching(follow(subject, select="all"), c(3, 3, 4, 5), c(1, 2, 3, 3), 5, 5) } test_nearest_IntegerRanges <- function() { query <- IRanges(c(1, 3, 9), c(2, 7, 10)) subject <- IRanges(c(3, 5, 12), c(3, 6, 12)) ## 2 possible results current <- nearest(query, subject) target1 <- c(1L, 1L, 3L) target2 <- c(1L, 2L, 3L) checkTrue(identical(target1, current) || identical(target2, current)) checkIdentical(nearest(query), c(2L, 1L, 2L)) checkIdentical(nearest(query, subject[c(2,3,1)]), c(3L, 3L, 2L)) ## xxxx ## xxx ## xx ## xx ## xxx ## .. ## .. ## .. ## .. ## .. subject <- IRanges(c(1, 2, 9, 15, 15), width=c(4, 3, 2, 2, 3)) query <- IRanges(c(6, 11, 1, 13, 18), width=c(2, 2, 2, 2, 2)) checkMatching(nearest(query, subject, select = "all"), c(1, 1, 1, 2, 3, 3, 4, 4, 5), c(1, 2, 3, 3, 1, 2, 4, 5, 5), 5, 5) checkMatching(nearest(subject, query, select = "all"), c(1, 2, 3, 4, 5, 5), c(3, 3, 2, 4, 4, 5), 5, 5) checkMatching(nearest(subject, select="all"), c(1, 2, 3, 3, 3, 3, 4, 5), c(2, 1, 1, 2, 4, 5, 5, 4), 5, 5) checkMatching(nearest(query, select="all"), c(1, 1, 2, 3, 4, 5), c(2, 3, 4, 1, 2, 4), 5, 5) } quiet <- suppressWarnings test_distance_IntegerRanges <- function() { checkIdentical(quiet(distance(IRanges(), IRanges())), integer()) ## adjacent, overlap, separated by 1 query <- IRanges(c(1, 3, 9), c(2, 7, 10)) subject <- IRanges(c(3, 5, 12), c(3, 6, 12)) checkIdentical(quiet(distance(query, subject)), c(0L, 0L, 1L)) ## recycling checkIdentical(quiet(distance(query[1:2], subject)), c(0L, 0L, 9L)) ## zero-width target <- abs(-3:3) current <- sapply(-3:3, function(i) quiet(distance(shift(IRanges(4,3), i), IRanges(4,3)))) checkIdentical(target, current) checkIdentical(quiet(distance(IRanges(4,3), IRanges(3,4))), 0L) } test_distanceToNearest_IntegerRanges <- function() { target <- Hits(sort.by.query=TRUE) current <- distanceToNearest(IRanges(), IRanges()) checkIdentical(queryHits(current), queryHits(target)) checkIdentical(subjectHits(current), subjectHits(target)) checkIdentical(queryLength(current), queryLength(target)) x <- IRanges(5, 10) subject <- IRanges(c(1, 1, 1), c(4, 5, 6)) current <- distanceToNearest(x, subject, select="all") checkIdentical(subjectHits(current), 1:3) current <- distanceToNearest(x, rev(subject), select="all") checkIdentical(subjectHits(current), 1:3) current <- distanceToNearest(x, IRanges()) checkIdentical(length(current), 0L) checkIdentical(queryLength(current), 1L) checkIdentical(subjectLength(current), 0L) x <- IRanges(c(2, 4, 12, 15), c(2, 3, 13, 14)) subject <- IRanges(1, 10) current <- distanceToNearest(x, subject) checkIdentical(queryHits(current), 1:4) checkIdentical(mcols(current)$distance, c(0L, 0L, 1L, 4L)) } IRanges/inst/unitTests/test_seqapply.R0000644000175100017510000000133014626176651021134 0ustar00biocbuildbiocbuildtest_unsplit <- function() { ir <- IRanges(1:5, 11:15) f <- factor(c("a", "b", "a", "b", "b"), c("b", "a", "c")) rl <- split(ir, f) checkIdentical(unsplit(rl, f), ir) rl <- split(ir, f, drop=TRUE) checkIdentical(unsplit(rl, Rle(f), drop=TRUE), ir) checkException(unsplit(rl, f, drop=FALSE), silent=TRUE) v <- 1:5 l <- splitAsList(v, f) checkIdentical(unsplit(l, Rle(f)), v) names(ir) <- letters[1:5] rl <- split(ir, f) checkIdentical(unsplit(rl, f), ir) df <- DataFrame(unname(ir), row.names=names(ir)) dfl <- split(df, f) checkIdentical(unsplit(dfl, f), df) ir <- IRanges(1:5, 11:15) names(ir)[c(1,3,5)] <- letters[1:3] rl <- split(ir, f) checkIdentical(unsplit(rl, f), ir) } IRanges/inst/unitTests/test_setops-methods.R0000644000175100017510000000770214626176651022265 0ustar00biocbuildbiocbuildtest_IRanges_union <- function() { x <- IRanges(c(1, 4, 9), c(5, 7, 10)) y <- IRanges(c(2, 2, 10), c(2, 3, 12)) ans <- union(x, y) ans0 <- IRanges(c(1, 9), c(7, 12)) checkIdentical(ans, ans0) } test_IRanges_intersect <- function() { x <- IRanges(c(1, 4, 9), c(5, 7, 10)) y <- IRanges(c(2, 2, 10), c(2, 3, 12)) ans <- intersect(x, y) ans0 <- IRanges(c(2,10),c(3,10)) checkIdentical(ans, ans0) } test_IRanges_setdiff <- function() { x <- IRanges(c(1, 4, 9), c(5, 7, 10)) y <- IRanges(c(2, 2, 10), c(2, 3, 12)) ans <- setdiff(x, y) ans0 <- IRanges(c(1,4,9), c(1,7,9)) checkIdentical(ans, ans0) ans <- setdiff(y, x) ans0 <- IRanges(c(11), c(12)) checkIdentical(ans, ans0) } test_IRanges_punion <- function() { x <- IRanges(start=c(1,11,21,31,41,51,61,71), end=c(5,10,25,35,40,55,65,75)) y <- IRanges(start=c(1, 8,18,35,43,48,63,78), end=c(4,15,22,36,45,50,62,79)) ans0 <- IRanges(start=c(1,8,18,31,41,48,61,71), end=c(5,15,25,36,45,55,65,79)) checkIdentical(punion(x, y, fill.gap=TRUE), ans0) checkIdentical(punion(y, x, fill.gap=TRUE), ans0) } test_IRanges_pintersect <- function() { x <- IRanges(start=c(22,22,22,22,22,22), end=c(28,28,28,28,21,21)) y <- IRanges(start=c(25,30,29,25,22,22), end=c(30,40,40,24,21,29)) ansMaxStart <- IRanges(start=c(25,30,29,25,22,22), end=c(28,29,28,24,21,21)) ansStartX <- IRanges(start=c(25,22,29,25,22,22), end=c(28,21,28,24,21,21)) ansStartY <- IRanges(start=c(25,30,29,25,22,22), end=c(28,29,28,24,21,21)) checkException(pintersect(x, y), silent = TRUE) checkException(pintersect(y, x), silent = TRUE) for (resolve.empty in c("none", "max.start", "start.x")) { checkIdentical(x, pintersect(x, x, resolve.empty = resolve.empty)) checkIdentical(y, pintersect(y, y, resolve.empty = resolve.empty)) } checkIdentical(pintersect(x[-c(2,3)], y[-c(2,3)]), ansMaxStart[-c(2,3)]) checkIdentical(pintersect(y[-c(2,3)], x[-c(2,3)]), ansMaxStart[-c(2,3)]) checkIdentical(pintersect(x, y, resolve.empty = "max.start"), ansMaxStart) checkIdentical(pintersect(y, x, resolve.empty = "max.start"), ansMaxStart) checkIdentical(pintersect(x, y, resolve.empty = "start.x"), ansStartX) checkIdentical(pintersect(y, x, resolve.empty = "start.x"), ansStartY) } test_IRanges_psetdiff <- function() { x <- IRanges(start=c(1,11,21,31,41,51,61,71), end=c(5,10,25,35,40,55,65,75)) y <- IRanges(start=c(1, 8,18,35,43,48,63,78), end=c(4,15,22,36,45,50,62,79)) ans <- psetdiff(x[-7], y[-7]) ans0 <- IRanges(start=c(5,11,23,31,41,51,71), end=c(5,10,25,34,40,55,75)) checkIdentical(ans, ans0) ans <- psetdiff(y[-2], x[-2]) ans0 <- IRanges(start=c(1,18,36,43,48,63,78), end=c(0,20,36,45,50,62,79)) checkIdentical(ans, ans0) } test_IRanges_pgap <- function() { x <- IRanges(start=c(1,11,21,31,41,51,61,71), end=c(5,10,25,35,40,55,65,75)) y <- IRanges(start=c(1, 8,18,35,43,48,63,78), end=c(4,15,22,36,45,50,62,79)) ans <- pgap(x, y) checkIdentical(width(ans), c(0L, 0L, 0L, 0L, 2L, 0L, 0L, 2L)) checkIdentical(start(ans)[width(ans) != 0L], c(41L, 76L)) } test_IntegerRangesList_setops <- function() { for (compress in c(TRUE, FALSE)) { rl1 <- IRangesList(IRanges(c(1,2),c(4,3)), IRanges(c(4,6),c(10,7)), compress=compress) rl2 <- IRangesList(IRanges(c(0,2),c(4,5)), IRanges(c(4,5),c(6,7)), compress=compress) checkIdentical(union(rl1, rl2), IRangesList(union(rl1[[1]], rl2[[1]]), union(rl1[[2]], rl2[[2]]), compress=compress)) checkIdentical(intersect(rl1, rl2), IRangesList(intersect(rl1[[1]], rl2[[1]]), intersect(rl1[[2]], rl2[[2]]), compress=compress)) checkIdentical(setdiff(rl1, rl2), IRangesList(setdiff(rl1[[1]], rl2[[1]]), setdiff(rl1[[2]], rl2[[2]]), compress=compress)) } } IRanges/inst/unitTests/test_split.R0000644000175100017510000000024014626176651020430 0ustar00biocbuildbiocbuildtest_splitAsList <- function() { ir <- IRanges(sample(100),sample(100)+100) ir2 <- unlist(split(ir, ceiling(1:100 / 10))) checkTrue(all(ir==ir2)) } IRanges/inst/unitTests/test_splitListElements.R0000644000175100017510000000764614626176651023002 0ustar00biocbuildbiocbuild### test_regroupBySupergroup <- function() { regroupBySupergroup <- IRanges:::regroupBySupergroup .do_checks <- function(x, breakpoints, target) { supergroups <- PartitioningByEnd(breakpoints) current <- regroupBySupergroup(x, supergroups) checkIdentical(target, current) checkIdentical(target, regroupBySupergroup(x, breakpoints)) x_partitioning <- PartitioningByEnd(x) current2 <- regroupBySupergroup(x_partitioning, supergroups) checkIdentical(PartitioningByEnd(target), current2) } x <- CharacterList( x1=NULL, x2=LETTERS[1:3], x3=LETTERS[4:5], x4=letters[1:5], x5=NULL, x6=letters[6:7] ) breakpoints <- c(SG1=3, SG2=6) target <- CharacterList(SG1=LETTERS[1:5], SG2=letters[1:7], compress=TRUE) .do_checks(as(x, "CompressedList"), breakpoints, target) .do_checks(as(x, "SimpleList"), breakpoints, target) breakpoints <- c(SG1=2, SG2=5, SG3=6) target <- CharacterList(SG1=LETTERS[1:3], SG2=c(LETTERS[4:5], letters[1:5]), SG3=letters[6:7], compress=TRUE) .do_checks(as(x, "CompressedList"), breakpoints, target) .do_checks(as(x, "SimpleList"), breakpoints, target) breakpoints <- c(SG1=2, 2, SG2=5, SG3=6) target <- CharacterList(SG1=LETTERS[1:3], NULL, SG2=c(LETTERS[4:5], letters[1:5]), SG3=letters[6:7], compress=TRUE) .do_checks(as(x, "CompressedList"), breakpoints, target) .do_checks(as(x, "SimpleList"), breakpoints, target) breakpoints <- 6 target <- CharacterList(unlist(x, use.names=FALSE), compress=TRUE) .do_checks(as(x, "CompressedList"), breakpoints, target) .do_checks(as(x, "SimpleList"), breakpoints, target) breakpoints <- c(SG1=6, SG2=6, SG3=6) target <- CharacterList(SG1=unlist(x, use.names=FALSE), SG2=NULL, SG3=NULL, compress=TRUE) .do_checks(as(x, "CompressedList"), breakpoints, target) .do_checks(as(x, "SimpleList"), breakpoints, target) breakpoints <- c(0, 0, 0, 6, 6) target <- CharacterList(NULL, NULL, NULL, unlist(x, use.names=FALSE), NULL, compress=TRUE) .do_checks(as(x, "CompressedList"), breakpoints, target) .do_checks(as(x, "SimpleList"), breakpoints, target) breakpoints <- seq_along(x) target <- unname(as(x, "CompressedList")) .do_checks(as(x, "CompressedList"), breakpoints, target) .do_checks(as(x, "SimpleList"), breakpoints, target) names(breakpoints) <- names(x) target <- as(x, "CompressedList") .do_checks(as(x, "CompressedList"), breakpoints, target) # no-op .do_checks(as(x, "SimpleList"), breakpoints, target) x0 <- CharacterList() breakpoints <- setNames(integer(0), character(0)) target <- setNames(CharacterList(compress=TRUE), character(0)) .do_checks(as(x0, "CompressedList"), breakpoints, target) # Fails at the moment because unlist() is doesn't work properly on # SimpleCharacterList #.do_checks(as(x0, "SimpleList"), breakpoints, target) x2 <- RleList(Rle(44:45, 2:1), Rle(45), Rle(-2, 3)) breakpoints <- c(SG1=2, SG2=3) target <- RleList(SG1=Rle(44:45, c(2,2)), SG2=Rle(-2, 3), compress=TRUE) .do_checks(as(x2, "CompressedList"), breakpoints, target) .do_checks(as(x2, "SimpleList"), breakpoints, target) x3 <- Views(unlist(x2, use.names=FALSE), start=c(3, 1, 1), end=c(6, 1, 3)) breakpoints <- c(SG1=2, SG2=3) target <- RleList(SG1=Rle(c(45,-2,44), c(2,2,1)), SG2=Rle(44:45, 2:1), compress=TRUE) .do_checks(x3, breakpoints, target) } IRanges/inst/unitTests/test_tile-methods.R0000644000175100017510000000360014626176651021676 0ustar00biocbuildbiocbuildtest_tile <- function() { ir <- IRanges() checkIdentical(tile(ir, n=3), IRangesList()) checkIdentical(tile(ir, width=2), IRangesList()) checkIdentical(tile(ir, n=0), IRangesList()) ir <- IRanges(1, 4) checkIdentical(tile(ir, n=2), IRangesList(IRanges(c(1, 3), c(2, 4)))) checkIdentical(tile(ir, n=2), tile(ir, width=2)) ir <- IRanges(1, 5) checkIdentical(tile(ir, n=3), IRangesList(IRanges(c(1, 2, 4), c(1, 3, 5)))) checkIdentical(tile(ir, n=3), tile(ir, width=2)) ir <- IRanges(1, 4) checkIdentical(tile(ir, n=3), IRangesList(IRanges(1:3, c(1, 2, 4)))) ir <- IRanges(1:3, width=5:3) checkIdentical(tile(ir, n=3), IRangesList(IRanges(c(1, 2, 4), c(1, 3, 5)), IRanges(c(2, 3, 4), c(2, 3, 5)), IRanges(c(3, 4, 5), c(3, 4, 5)))) checkIdentical(tile(ir, width=2), IRangesList(IRanges(c(1, 2, 4), c(1, 3, 5)), IRanges(c(2, 4), c(3, 5)), IRanges(c(3, 4), c(3, 5)))) checkIdentical(elementNROWS(tile(ir, width=4)), c(2L, 1L, 1L)) checkException(tile(ir, n=4), silent=TRUE) checkException(tile(ir, width=-1), silent=TRUE) checkException(tile(ir, n=-1), silent=TRUE) ir <- setNames(IRanges(1:3, width = 10), letters[1:3]) checkIdentical(names(ir), names(tile(ir, n = 2))) checkIdentical(names(ir), names(tile(ir, width = 3))) } test_slidingWindows <- function() { ir <- IRanges() checkIdentical(slidingWindows(ir, width=3), IRangesList()) ir <- IRanges(1:3, width=5:3) checkIdentical(slidingWindows(ir, width=3, step=2), IRangesList(IRanges(c(1, 3), c(3, 5)), IRanges(c(2, 4), c(4, 5)), IRanges(3, 5))) ir <- setNames(IRanges(1:3, width = 10), letters[1:3]) checkIdentical(names(ir), names(slidingWindows(ir, width = 3))) } IRanges/man/0000755000175100017510000000000014626176651013713 5ustar00biocbuildbiocbuildIRanges/man/AtomicList-class.Rd0000644000175100017510000002643314626176651017365 0ustar00biocbuildbiocbuild\name{AtomicList} \docType{class} % AtomicList classes \alias{class:AtomicList} \alias{AtomicList-class} \alias{AtomicList} \alias{class:CompressedAtomicList} \alias{CompressedAtomicList-class} \alias{CompressedAtomicList} \alias{class:SimpleAtomicList} \alias{SimpleAtomicList-class} \alias{SimpleAtomicList} \alias{class:LogicalList} \alias{LogicalList-class} \alias{LogicalList} \alias{class:CompressedLogicalList} \alias{CompressedLogicalList-class} \alias{CompressedLogicalList} \alias{class:SimpleLogicalList} \alias{SimpleLogicalList-class} \alias{SimpleLogicalList} \alias{class:IntegerList} \alias{IntegerList-class} \alias{IntegerList} \alias{class:CompressedIntegerList} \alias{CompressedIntegerList-class} \alias{CompressedIntegerList} \alias{class:SimpleIntegerList} \alias{SimpleIntegerList-class} \alias{SimpleIntegerList} \alias{class:NumericList} \alias{NumericList-class} \alias{NumericList} \alias{class:CompressedNumericList} \alias{CompressedNumericList-class} \alias{CompressedNumericList} \alias{class:SimpleNumericList} \alias{SimpleNumericList-class} \alias{SimpleNumericList} \alias{class:ComplexList} \alias{ComplexList-class} \alias{ComplexList} \alias{class:CompressedComplexList} \alias{CompressedComplexList-class} \alias{CompressedComplexList} \alias{class:SimpleComplexList} \alias{SimpleComplexList-class} \alias{SimpleComplexList} \alias{class:CharacterList} \alias{CharacterList-class} \alias{CharacterList} \alias{class:CompressedCharacterList} \alias{CompressedCharacterList-class} \alias{CompressedCharacterList} \alias{class:SimpleCharacterList} \alias{SimpleCharacterList-class} \alias{SimpleCharacterList} \alias{class:RawList} \alias{RawList-class} \alias{RawList} \alias{class:CompressedRawList} \alias{CompressedRawList-class} \alias{CompressedRawList} \alias{class:SimpleRawList} \alias{SimpleRawList-class} \alias{SimpleRawList} \alias{class:RleList} \alias{RleList-class} \alias{RleList} \alias{class:CompressedRleList} \alias{CompressedRleList-class} \alias{CompressedRleList} \alias{class:SimpleRleList} \alias{SimpleRleList-class} \alias{SimpleRleList} \alias{class:FactorList} \alias{FactorList-class} \alias{FactorList} \alias{class:CompressedFactorList} \alias{CompressedFactorList-class} \alias{CompressedFactorList} \alias{class:SimpleFactorList} \alias{SimpleFactorList-class} \alias{SimpleFactorList} % coercion \alias{as.list,CompressedAtomicList-method} \alias{coerce,CompressedAtomicList,list-method} \alias{as.vector,AtomicList-method} \alias{coerce,vector,AtomicList-method} \alias{as.matrix,AtomicList-method} \alias{lapply,CompressedAtomicList-method} \alias{coerce,vector,CompressedLogicalList-method} \alias{coerce,vector,SimpleLogicalList-method} \alias{coerce,vector,CompressedIntegerList-method} \alias{coerce,vector,SimpleIntegerList-method} \alias{coerce,vector,CompressedNumericList-method} \alias{coerce,vector,SimpleNumericList-method} \alias{coerce,vector,CompressedComplexList-method} \alias{coerce,vector,SimpleComplexList-method} \alias{coerce,vector,CompressedCharacterList-method} \alias{coerce,vector,SimpleCharacterList-method} \alias{coerce,vector,CompressedRawList-method} \alias{coerce,vector,SimpleRawList-method} \alias{coerce,vector,CompressedRleList-method} \alias{coerce,vector,SimpleRleList-method} \alias{coerce,AtomicList,LogicalList-method} \alias{coerce,AtomicList,IntegerList-method} \alias{coerce,AtomicList,NumericList-method} \alias{coerce,AtomicList,ComplexList-method} \alias{coerce,AtomicList,CharacterList-method} \alias{coerce,AtomicList,RawList-method} \alias{coerce,AtomicList,RleList-method} \alias{RleList,AtomicList,RleList-method} \alias{coerce,AtomicList,RleViews} \alias{unlist,SimpleFactorList-method} \alias{unlist,SimpleRleList-method} \alias{unique,RleList-method} \alias{unique,CompressedList-method} \alias{table,AtomicList-method} \alias{table,SimpleAtomicList-method} \alias{drop,AtomicList-method} \alias{duplicated,CompressedList-method} \alias{duplicated,CompressedAtomicList-method} \alias{sort,List-method} \alias{order,List-method} \alias{rank,List-method} \alias{runLength,RleList-method} \alias{runValue,RleList-method} \alias{runLength,CompressedRleList-method} \alias{runValue,CompressedRleList-method} \alias{runValue<-,CompressedRleList-method} \alias{runValue<-,SimpleRleList-method} \alias{ranges,RleList-method} \alias{ranges,CompressedRleList-method} \alias{show,AtomicList-method} \alias{show,RleList-method} \title{Lists of Atomic Vectors in Natural and Rle Form} \description{An extension of \code{\linkS4class{List}} that holds only atomic vectors in either a natural or run-length encoded form.} \details{ The lists of atomic vectors are \code{LogicalList}, \code{IntegerList}, \code{NumericList}, \code{ComplexList}, \code{CharacterList}, and \code{RawList}. There is also an \code{RleList} class for run-length encoded versions of these atomic vector types. Each of the above mentioned classes is virtual with Compressed* and Simple* non-virtual representations. } \section{Constructors}{ \describe{ \item{\code{LogicalList(..., compress = TRUE)}:}{ Concatenates the \code{logical} vectors in \code{...} into a new \code{LogicalList}. If \code{compress}, the internal storage of the data is compressed.} \item{\code{IntegerList(..., compress = TRUE)}:}{ Concatenates the \code{integer} vectors in \code{...} into a new \code{IntegerList}. If \code{compress}, the internal storage of the data is compressed.} \item{\code{NumericList(..., compress = TRUE)}:}{ Concatenates the \code{numeric} vectors in \code{...} into a new \code{NumericList}. If \code{compress}, the internal storage of the data is compressed.} \item{\code{ComplexList(..., compress = TRUE)}:}{ Concatenates the \code{complex} vectors in \code{...} into a new \code{ComplexList}. If \code{compress}, the internal storage of the data is compressed.} \item{\code{CharacterList(..., compress = TRUE)}:}{ Concatenates the \code{character} vectors in \code{...} into a new \code{CharacterList}. If \code{compress}, the internal storage of the data is compressed.} \item{\code{RawList(..., compress = TRUE)}:}{ Concatenates the \code{raw} vectors in \code{...} into a new \code{RawList}. If \code{compress}, the internal storage of the data is compressed.} \item{\code{RleList(..., compress = TRUE)}:}{ Concatenates the run-length encoded atomic vectors in \code{...} into a new \code{RleList}. If \code{compress}, the internal storage of the data is compressed.} \item{\code{FactorList(..., compress = TRUE)}:}{ Concatenates the \code{factor} objects in \code{...} into a new \code{FactorList}. If \code{compress}, the internal storage of the data is compressed.} } } \section{Coercion}{ \describe{ \item{\code{as(from, "CompressedSplitDataFrameList")}, \code{as(from, "SimpleSplitDataFrameList")}:}{ Creates a \linkS4class{CompressedSplitDataFrameList}/\linkS4class{SimpleSplitDataFrameList} instance from an AtomicList instance. } \item{\code{as(from, "IRangesList")}, \code{as(from, "CompressedIRangesList")}, \code{as(from, "SimpleIRangesList")}:}{ Creates a \linkS4class{CompressedIRangesList}/\linkS4class{SimpleIRangesList} instance from a LogicalList or logical RleList instance. Note that the elements of this instance are guaranteed to be normal. } \item{\code{as(from, "NormalIRangesList")}, \code{as(from, "CompressedNormalIRangesList")}, \code{as(from, "SimpleNormalIRangesList")}:}{ Creates a \linkS4class{CompressedNormalIRangesList}/\linkS4class{SimpleNormalIRangesList} instance from a LogicalList or logical RleList instance. } \item{\code{as(from, "CharacterList")}, \code{as(from, "ComplexList")}, \code{as(from, "IntegerList")}, \code{as(from, "LogicalList")}, \code{as(from, "NumericList")}, \code{as(from, "RawList")}, \code{as(from, "RleList")}:}{ Coerces an \code{AtomicList} \code{from} to another derivative of \code{AtomicList}. } \item{\code{as(from, "AtomicList")}:}{ If \code{from} is a vector, converts it to an \code{AtomicList} of the appropriate type. } \item{\code{drop(x)}:}{ Checks if every element of \code{x} is of length one, and, if so, unlists \code{x}. Otherwise, an error is thrown. } \item{\code{as(from, "RleViews")}:}{ Creates an RleViews where each view corresponds to an element of \code{from}. The subject is \code{unlist(from)}. } \item{\code{as.matrix(x, col.names=NULL)}:}{ Maps the elements of the list to rows of a matrix. The column mapping depends on whether there are inner names (either on the object or provided via \code{col.names} as a List object). If there are no inner names, each row is padded with NAs to reach the length of the longest element. If there are inner names, there is a column for each unique name and the mapping is by name. To provide inner names, the \code{col.names} argument should be a List, usually a CharacterList or FactorList (which is particularly efficient). If \code{col.names} is a character vector, it names the columns of the result, but does not imply inner names. } } } \section{Compare, Order, Tabulate}{ The following methods are provided for element-wise comparison of 2 AtomicList objects, and ordering or tabulating of each list element of an AtomicList object: \code{is.na}, \code{duplicated}, \code{unique}, \code{match}, \code{\%in\%}, \code{table}, \code{order}, \code{sort}. } \section{RleList Methods}{ RleList has a number of methods that are not shared by other AtomicList derivatives. \describe{ \item{\code{runLength(x)}:}{ Gets the run lengths of each element of the list, as an IntegerList. } \item{\code{runValue(x)}, \code{runValue(x) <- value}:}{ Gets or sets the run values of each element of the list, as an AtomicList. } \item{\code{ranges(x)}:}{ Gets the run ranges as a \code{IntegerRangesList}. } } } \author{P. Aboyoun} \seealso{ \itemize{ \item \link{AtomicList-utils} for common operations on AtomicList objects. \item \link[S4Vectors]{List} objects in the \pkg{S4Vectors} package for the parent class. } } \examples{ int1 <- c(1L,2L,3L,5L,2L,8L) int2 <- c(15L,45L,20L,1L,15L,100L,80L,5L) collection <- IntegerList(int1, int2) ## names names(collection) <- c("one", "two") names(collection) names(collection) <- NULL # clear names names(collection) names(collection) <- "one" names(collection) # c("one", NA) ## extraction collection[[1]] # range1 collection[["1"]] # NULL, does not exist collection[["one"]] # range1 collection[[NA_integer_]] # NULL ## subsetting collection[numeric()] # empty collection[NULL] # empty collection[] # identity collection[c(TRUE, FALSE)] # first element collection[2] # second element collection[c(2,1)] # reversed collection[-1] # drop first collection$one ## replacement collection$one <- int2 collection[[2]] <- int1 ## concatenating col1 <- IntegerList(one = int1, int2) col2 <- IntegerList(two = int2, one = int1) col3 <- IntegerList(int2) append(col1, col2) append(col1, col2, 0) col123 <- c(col1, col2, col3) col123 ## revElements revElements(col123) revElements(col123, 4:5) } \keyword{methods} \keyword{classes} IRanges/man/AtomicList-utils.Rd0000644000175100017510000002162114626176651017412 0ustar00biocbuildbiocbuild\name{AtomicList-utils} \alias{AtomicList-utils} \alias{Ops,AtomicList,AtomicList-method} \alias{Ops,CompressedAtomicList,CompressedAtomicList-method} \alias{Ops,SimpleAtomicList,CompressedAtomicList-method} \alias{Ops,CompressedAtomicList,SimpleAtomicList-method} \alias{Ops,AtomicList,atomic-method} \alias{Ops,atomic,AtomicList-method} \alias{Ops,AtomicList,missing-method} \alias{Ops,CompressedAtomicList,atomic-method} \alias{Ops,atomic,CompressedAtomicList-method} \alias{Math,AtomicList-method} \alias{Math,CompressedAtomicList-method} \alias{Math2,AtomicList-method} \alias{Math2,CompressedAtomicList-method} \alias{Summary,AtomicList-method} \alias{Summary,CompressedRleList-method} \alias{Complex,AtomicList-method} \alias{Complex,CompressedAtomicList-method} \alias{sum,CompressedIntegerList-method} \alias{sum,CompressedLogicalList-method} \alias{sum,CompressedNumericList-method} \alias{which,CompressedLogicalList-method} \alias{which,SimpleLogicalList-method} \alias{which,CompressedRleList-method} \alias{which,SimpleRleList-method} \alias{which.max,CompressedRleList-method} \alias{which.max,RleList-method} \alias{which.max,IntegerList-method} \alias{which.max,NumericList-method} \alias{which.min,CompressedRleList-method} \alias{which.min,RleList-method} \alias{which.min,IntegerList-method} \alias{which.min,NumericList-method} \alias{all,CompressedRleList-method} \alias{all,CompressedAtomicList-method} \alias{any,CompressedAtomicList-method} \alias{anyNA,CompressedAtomicList-method} \alias{diff.AtomicList} \alias{diff,CompressedAtomicList-method} \alias{pmax,IntegerList-method} \alias{pmax,NumericList-method} \alias{pmax,RleList-method} \alias{pmin,IntegerList-method} \alias{pmin,NumericList-method} \alias{pmin,RleList-method} \alias{pmax.int,IntegerList-method} \alias{pmax.int,NumericList-method} \alias{pmax.int,RleList-method} \alias{pmin.int,IntegerList-method} \alias{pmin.int,NumericList-method} \alias{pmin.int,RleList-method} \alias{mean,AtomicList-method} \alias{mean,CompressedLogicalList-method} \alias{mean,CompressedIntegerList-method} \alias{mean,CompressedNumericList-method} \alias{mean,CompressedRleList-method} \alias{var,AtomicList,missing-method} \alias{var,AtomicList,AtomicList-method} \alias{cov,AtomicList,AtomicList-method} \alias{cor,AtomicList,AtomicList-method} \alias{sd,AtomicList-method} \alias{median,AtomicList-method} \alias{median,CompressedAtomicList-method} \alias{quantile,AtomicList-method} \alias{mad,AtomicList-method} \alias{IQR,AtomicList-method} \alias{cumsum,CompressedAtomicList-method} \alias{cumprod,CompressedAtomicList-method} \alias{cummin,CompressedAtomicList-method} \alias{cummax,CompressedAtomicList-method} \alias{range,CompressedIntegerList-method} \alias{range,CompressedNumericList-method} \alias{range,CompressedLogicalList-method} \alias{smoothEnds,CompressedIntegerList-method} \alias{smoothEnds,SimpleIntegerList-method} \alias{smoothEnds,NumericList-method} \alias{smoothEnds,RleList-method} \alias{runmed,CompressedIntegerList-method} \alias{runmed,SimpleIntegerList-method} \alias{runmed,NumericList-method} \alias{runmed,RleList-method} \alias{runmean,RleList-method} \alias{runsum,RleList-method} \alias{runwtsum,RleList-method} \alias{runq,RleList-method} \alias{nchar,CompressedCharacterList-method} \alias{nchar,SimpleCharacterList-method} \alias{nchar,CompressedRleList-method} \alias{nchar,SimpleRleList-method} \alias{chartr,ANY,ANY,CompressedCharacterList-method} \alias{chartr,ANY,ANY,SimpleCharacterList-method} \alias{chartr,ANY,ANY,CompressedRleList-method} \alias{chartr,ANY,ANY,SimpleRleList-method} \alias{tolower,CompressedCharacterList-method} \alias{tolower,SimpleCharacterList-method} \alias{tolower,CompressedRleList-method} \alias{tolower,SimpleRleList-method} \alias{toupper,CompressedCharacterList-method} \alias{toupper,SimpleCharacterList-method} \alias{toupper,CompressedRleList-method} \alias{toupper,SimpleRleList-method} \alias{sub,ANY,ANY,CompressedCharacterList-method} \alias{sub,ANY,ANY,SimpleCharacterList-method} \alias{sub,ANY,ANY,CompressedRleList-method} \alias{sub,ANY,ANY,SimpleRleList-method} \alias{gsub,ANY,ANY,CompressedCharacterList-method} \alias{gsub,ANY,ANY,SimpleCharacterList-method} \alias{gsub,ANY,ANY,CompressedRleList-method} \alias{gsub,ANY,ANY,SimpleRleList-method} \alias{unstrsplit,CharacterList-method} \alias{unstrsplit,RleList-method} \alias{paste,CompressedAtomicList-method} \alias{startsWith,CharacterList,ANY-method} \alias{startsWith,RleList,ANY-method} \alias{endsWith,CharacterList,ANY-method} \alias{endsWith,RleList,ANY-method} \alias{rank,CompressedAtomicList-method} \alias{order,CompressedAtomicList-method} \alias{selfmatch,CompressedAtomicList-method} \alias{intersect,CompressedAtomicList,CompressedAtomicList-method} \alias{ifelse2} \alias{ifelse2,ANY,ANY,List-method} \alias{ifelse2,ANY,List,ANY-method} \alias{ifelse2,List,ANY,ANY-method} \alias{ifelse2,CompressedLogicalList,ANY,ANY-method} \alias{ifelse2,CompressedLogicalList,List,ANY-method} \alias{ifelse2,CompressedLogicalList,ANY,List-method} \alias{ifelse2,CompressedLogicalList,List,ANY-method} \alias{ifelse2,CompressedLogicalList,List,List-method} \alias{ifelse2,SimpleLogicalList,ANY,ANY-method} \alias{ifelse2,SimpleLogicalList,List,ANY-method} \alias{ifelse2,SimpleLogicalList,ANY,List-method} \alias{ifelse2,SimpleLogicalList,List,ANY-method} \alias{ifelse2,SimpleLogicalList,List,List-method} \title{Common operations on AtomicList objects} \description{ Common operations on \link{AtomicList} objects. } \section{Group Generics}{ AtomicList objects have support for S4 group generic functionality to operate within elements across objects: \describe{ \item{\code{Arith}}{\code{"+"}, \code{"-"}, \code{"*"}, \code{"^"}, \code{"\%\%"}, \code{"\%/\%"}, \code{"/"}} \item{\code{Compare}}{\code{"=="}, \code{">"}, \code{"<"}, \code{"!="}, \code{"<="}, \code{">="}} \item{\code{Logic}}{\code{"&"}, \code{"|"}} \item{\code{Ops}}{\code{"Arith"}, \code{"Compare"}, \code{"Logic"}} \item{\code{Math}}{\code{"abs"}, \code{"sign"}, \code{"sqrt"}, \code{"ceiling"}, \code{"floor"}, \code{"trunc"}, \code{"cummax"}, \code{"cummin"}, \code{"cumprod"}, \code{"cumsum"}, \code{"log"}, \code{"log10"}, \code{"log2"}, \code{"log1p"}, \code{"acos"}, \code{"acosh"}, \code{"asin"}, \code{"asinh"}, \code{"atan"}, \code{"atanh"}, \code{"exp"}, \code{"expm1"}, \code{"cos"}, \code{"cosh"}, \code{"sin"}, \code{"sinh"}, \code{"tan"}, \code{"tanh"}, \code{"gamma"}, \code{"lgamma"}, \code{"digamma"}, \code{"trigamma"}} \item{\code{Math2}}{\code{"round"}, \code{"signif"}} \item{\code{Summary}}{\code{"max"}, \code{"min"}, \code{"range"}, \code{"prod"}, \code{"sum"}, \code{"any"}, \code{"all"}} \item{\code{Complex}}{\code{"Arg"}, \code{"Conj"}, \code{"Im"}, \code{"Mod"}, \code{"Re"}} } See \link[methods]{S4groupGeneric} for more details. } \section{Other Methods}{ The AtomicList objects also support a large number of basic methods. Like the group generics above, these methods perform the corresponding operation on each element of the list separately. The methods are: \describe{ \item{Logical}{\code{!}, \code{which}, \code{which.max}, \code{which.min}} \item{Numeric}{\code{diff}, \code{pmax}, \code{pmax.int}, \code{pmin}, \code{pmin.int}, \code{mean}, \code{var}, \code{cov}, \code{cor}, \code{sd}, \code{median}, \code{quantile}, \code{mad}, \code{IQR}} \item{Running Window}{\code{smoothEnds}, \code{runmed}. \code{runmean}, \code{runsum}, \code{runwtsum}, \code{runq}} \item{Character}{\code{nchar}, \code{chartr}, \code{tolower}, \code{toupper}, \code{sub}, \code{gsub}, \code{startsWith}, \code{endsWith}} } The \code{which.min} and \code{which.max} functions have an extra argument, \code{global=FALSE}, which controls whether the returned subscripts are global (compatible with the unlisted form of the input) or local (compatible with the corresponding list element). The \code{rank} method only supports tie methods \dQuote{average}, \dQuote{first}, \dQuote{min} and \dQuote{max}. Since \code{\link{ifelse}} relies on non-standard evaluation for arguments that need to be in the generic signature, we provide \code{ifelse2}, which has eager but otherwise equivalent semantics. } \section{Specialized Methods}{ \describe{ \item{\code{unstrsplit(x, sep="")}:}{}: A fast \code{sapply(x, paste0, collapse=sep)}. See \code{?\link{unstrsplit}} for the details. } } \author{P. Aboyoun} \seealso{ \itemize{ \item \link{AtomicList} objects. } } \examples{ ## group generics int1 <- c(1L,2L,3L,5L,2L,8L) int2 <- c(15L,45L,20L,1L,15L,100L,80L,5L) col1 <- IntegerList(one = int1, int2) 2 * col1 col1 + col1 col1 > 2 sum(col1) # equivalent to (but faster than) 'sapply(col1, sum)' mean(col1) # equivalent to 'sapply(col1, mean)' } \keyword{methods} IRanges/man/CompressedHitsList-class.Rd0000644000175100017510000000146014626176651021076 0ustar00biocbuildbiocbuild\name{CompressedHitsList-class} \docType{class} \alias{class:CompressedHitsList} \alias{CompressedHitsList-class} \alias{CompressedHitsList} % coercion \alias{as.matrix,CompressedHitsList-method} % accessors \alias{space,CompressedHitsList-method} \alias{from,CompressedHitsList-method} \alias{to,CompressedHitsList-method} \alias{nLnode,CompressedHitsList-method} \alias{nRnode,CompressedHitsList-method} \title{CompressedHitsList objects} \description{ An efficient representation of \link{HitsList} objects. See \code{?\link{HitsList}} for more information about \link{HitsList} objects. } \note{ This class is highly experimental. It has not been well tested and may disappear at any time. } \author{Michael Lawrence} \seealso{ \link{HitsList} objects. } \keyword{methods} \keyword{classes} IRanges/man/CompressedList-class.Rd0000644000175100017510000001545114626176651020253 0ustar00biocbuildbiocbuild\name{CompressedList-class} \docType{class} \alias{class:CompressedList} \alias{CompressedList} \alias{CompressedList-class} % accessors \alias{length,CompressedList-method} \alias{names,CompressedList-method} \alias{names<-,CompressedList-method} \alias{elementNROWS,CompressedList-method} \alias{is.na,CompressedList-method} % coercion \alias{unlist,CompressedList-method} \alias{coerce,ANY,CompressedList-method} % concatenation \alias{bindROWS,CompressedList-method} % looping \alias{lapply,CompressedList-method} \alias{revElements,CompressedList-method} % displaying \alias{classNameForDisplay,CompressedList-method} % ops \alias{!,CompressedList-method} % setops \alias{match,CompressedList,vector-method} \title{CompressedList objects} \description{ Like the \link[S4Vectors]{SimpleList} class defined in the \pkg{S4Vectors} package, the CompressedList class extends the \link[S4Vectors]{List} virtual class. } \details{ Unlike the \link[S4Vectors]{SimpleList} class, CompressedList is virtual, that is, it cannot be instantiated. Many concrete (i.e. non-virtual) CompressedList subclasses are defined and documented in this package (e.g. \link{CompressedIntegerList}, \link{CompressedCharacterList}, \link{CompressedRleList}, etc...), as well as in other packages (e.g. \link[GenomicRanges]{GRangesList} in the \pkg{GenomicRanges} package, \link[GenomicAlignments]{GAlignmentsList} in the \pkg{GenomicAlignments} package, etc...). It's easy for developers to extend CompressedList to create a new CompressedList subclass and there is generally very little work involved to make this new subclass fully operational. In a CompressedList object the list elements are concatenated together in a single vector-like object. The \emph{partitioning} of this single vector-like object (i.e. the information about where each original list element starts and ends) is also kept in the CompressedList object. This internal representation is generally more memory efficient than \link[S4Vectors]{SimpleList}, especially if the object has many list elements (e.g. thousands or millions). Also it makes it possible to implement many basic list operations very efficiently. Many objects like \link{LogicalList}, \link{IntegerList}, \link{CharacterList}, \link{RleList}, etc... exist in 2 flavors: CompressedList and \link[S4Vectors]{SimpleList}. Each flavor is incarnated by a concrete subclass: \link{CompressedLogicalList} and \link{SimpleLogicalList} for virtual class \link{LogicalList}, \link{CompressedIntegerList} and \link{SimpleIntegerList} for virtual class \link{IntegerList}, etc... It's easy to switch from one representation to the other with \code{as(x, "CompressedList")} and \code{as(x, "SimpleList")}. Also the constructor function for those virtual classes have a switch that lets the user choose the representation at construction time e.g. \code{CharacterList(..., compress=TRUE)} or \code{CharacterList(..., compress=FALSE)}. See below for more information. } \section{Constructor}{ See the \link[S4Vectors]{List} man page in the \pkg{S4Vectors} package for a quick overview of how to construct \link{List} objects in general. Unlike for \link[S4Vectors]{SimpleList} objects, there is no \code{CompressedList} constructor function. However, many constructor functions for \link[S4Vectors]{List} derivatives provide the \code{compress} argument that lets the user choose between the CompressedList and \link[S4Vectors]{SimpleList} representations at construction time. For example, depending on whether the \code{compress} argument of the \code{\link{CharacterList}()} constructor is set to \code{TRUE} or \code{FALSE}, a \link{CompressedCharacterList} or \link{SimpleCharacterList} instance will be returned. Finally let's mention that the most efficient way to construct a CompressedList derivative is with \preformatted{ relist(unlisted, partitioning) } where \code{unlisted} is a vector-like object and \code{partitioning} a \link{PartitioningByEnd} object describing a partitioning of \code{unlisted}. The cost of this relist operation is virtually zero because \code{unlisted} and \code{partitioning} get stored \emph{as-is} in the returned object. } \section{Accessors}{ Same as for \link[S4Vectors]{List} objects. See the \link[S4Vectors]{List} man page in the \pkg{S4Vectors} package for more information. } \section{Coercion}{ All the coercions documented in the \link[S4Vectors]{List} man page apply to CompressedList objects. } \section{Subsetting}{ Same as for \link[S4Vectors]{List} objects. See the \link[S4Vectors]{List} man page for more information. } \section{Looping and functional programming}{ Same as for \link[S4Vectors]{List} objects. See \code{?`\link[S4Vectors]{List-utils}`} in the \pkg{S4Vectors} package for more information. } \section{Displaying}{ When a CompressedList object is displayed, the "Compressed" prefix is removed from the real class name of the object. See \code{\link[S4Vectors]{classNameForDisplay}} in the \pkg{S4Vectors} package for more information about this. } \seealso{ \itemize{ \item \link[S4Vectors]{List} in the \pkg{S4Vectors} package for an introduction to List objects and their derivatives (CompressedList is a direct subclass of List which makes CompressedList objects List derivatives). \item The \link[S4Vectors]{SimpleList} class defined and documented in the \pkg{S4Vectors} package for an alternative to CompressedList. \item \link[IRanges]{relist} and \link[IRanges]{extractList} for efficiently constructing a \link{List} derivative from a vector-like object. \item The \link{CompressedNumericList} class for an example of a concrete CompressedList subclass. \item \link{PartitioningByEnd} objects. These objects are used inside CompressedList derivatives to keep track of the \emph{partitioning} of the single vector-like object made of all the list elements concatenated together. } } \examples{ ## Fastest way to construct a CompressedList object: unlisted <- runif(12) partitioning <- PartitioningByEnd(c(5, 5, 10, 12), names=LETTERS[1:4]) partitioning x1 <- relist(unlisted, partitioning) x1 stopifnot(identical(lengths(partitioning), lengths(x1))) ## Note that the class of the CompressedList derivative returned by ## relist() is determined by relistToClass(): relistToClass(unlisted) stopifnot(relistToClass(unlisted) == class(x1)) ## Displaying a CompressedList object: x2 <- IntegerList(11:12, integer(0), 3:-2, compress=TRUE) class(x2) ## The "Simple" prefix is removed from the real class name of the ## object: x2 ## This is controlled by internal helper classNameForDisplay(): classNameForDisplay(x2) classNameForDisplay(x1) } \keyword{methods} \keyword{classes} IRanges/man/DataFrameList-class.Rd0000644000175100017510000002434714626176651017777 0ustar00biocbuildbiocbuild\name{DataFrameList-class} \docType{class} \alias{class:DataFrameList} \alias{DataFrameList-class} \alias{DataFrameList} \alias{class:DFrameList} \alias{DFrameList-class} \alias{DFrameList} \alias{class:SimpleDataFrameList} \alias{SimpleDataFrameList-class} \alias{SimpleDataFrameList} \alias{class:SimpleDFrameList} \alias{SimpleDFrameList-class} \alias{SimpleDFrameList} \alias{class:SplitDataFrameList} \alias{SplitDataFrameList-class} \alias{SplitDataFrameList} \alias{class:SplitDFrameList} \alias{SplitDFrameList-class} \alias{SplitDFrameList} \alias{class:SimpleSplitDataFrameList} \alias{SimpleSplitDataFrameList-class} \alias{SimpleSplitDataFrameList} \alias{class:SimpleSplitDFrameList} \alias{SimpleSplitDFrameList-class} \alias{SimpleSplitDFrameList} \alias{class:CompressedDataFrameList} \alias{CompressedDataFrameList-class} \alias{CompressedDataFrameList} \alias{class:CompressedDFrameList} \alias{CompressedDFrameList-class} \alias{CompressedDFrameList} \alias{class:CompressedSplitDataFrameList} \alias{CompressedSplitDataFrameList-class} \alias{CompressedSplitDataFrameList} \alias{class:CompressedSplitDFrameList} \alias{CompressedSplitDFrameList-class} \alias{CompressedSplitDFrameList} % accessors \alias{nrows,DataFrameList-method} \alias{ncols,DataFrameList-method} \alias{ncols,SimpleSplitDataFrameList-method} \alias{ncols,CompressedSplitDataFrameList-method} \alias{dims,DataFrameList-method} \alias{rownames,DataFrameList-method} \alias{colnames,DataFrameList-method} \alias{colnames,SplitDataFrameList-method} \alias{colnames,CompressedSplitDataFrameList-method} \alias{dimnames,DataFrameList-method} \alias{rownames<-,CompressedSplitDataFrameList-method} \alias{rownames<-,SimpleDataFrameList-method} \alias{colnames<-,CompressedSplitDataFrameList-method} \alias{colnames<-,SimpleDataFrameList-method} \alias{dimnames<-,DataFrameList-method} \alias{NROW,DataFrameList-method} \alias{ROWNAMES,DataFrameList-method} \alias{ROWNAMES<-,DataFrameList-method} \alias{columnMetadata} \alias{columnMetadata<-} \alias{columnMetadata,SimpleSplitDataFrameList-method} \alias{columnMetadata<-,SimpleSplitDataFrameList-method} \alias{columnMetadata,CompressedSplitDataFrameList-method} \alias{columnMetadata<-,CompressedSplitDataFrameList-method} \alias{commonColnames} \alias{commonColnames<-} \alias{commonColnames,SimpleSplitDataFrameList-method} \alias{commonColnames<-,SimpleSplitDataFrameList-method} \alias{commonColnames,CompressedSplitDataFrameList-method} \alias{commonColnames<-,CompressedSplitDataFrameList-method} % subsetting \alias{[,SimpleSplitDataFrameList-method} \alias{[,CompressedSplitDataFrameList-method} \alias{[<-,SplitDataFrameList-method} % transformation \alias{transform,SplitDataFrameList-method} % coercion \alias{coerce,ANY,DataFrameList-method} \alias{coerce,ANY,SimpleDataFrameList-method} \alias{coerce,ANY,SplitDataFrameList-method} \alias{coerce,ANY,SimpleSplitDataFrameList-method} \alias{coerce,ANY,CompressedDataFrameList-method} \alias{coerce,ANY,CompressedSplitDataFrameList-method} \alias{coerce,DataFrameList,DFrame-method} \alias{coerce,SplitDataFrameList,DFrame-method} \alias{coerce,ANY,CompressedSplitDFrameList-method} \alias{coerce,ANY,SimpleSplitDFrameList-method} \alias{coerce,List,CompressedSplitDFrameList-method} \alias{coerce,list,SplitDFrameList-method} \alias{coerce,List,SimpleSplitDFrameList-method} \alias{coerce,ANY,SplitDFrameList-method} \alias{coerce,SimpleList,SplitDFrameList-method} \alias{coerce,list,SplitDFrameList-method} \alias{coerce,DataFrame,SplitDFrameList-method} \alias{stack,DataFrameList-method} % splitting and combining \alias{cbind,DataFrameList-method} \alias{rbind,DataFrameList-method} % show \alias{classNameForDisplay,SimpleDFrameList-method} \alias{classNameForDisplay,CompressedDFrameList-method} \alias{show,SplitDataFrameList-method} % SDFLWrapperForTransform (internal) \alias{colnames,SDFLWrapperForTransform-method} \alias{[[,SDFLWrapperForTransform-method} \alias{[[<-,SDFLWrapperForTransform-method} \alias{as.env,SDFLWrapperForTransform-method} % deprecated & defunt \alias{nrow,DataFrameList-method} \alias{ncol,DataFrameList-method} \alias{ncol,SimpleSplitDataFrameList-method} \alias{ncol,CompressedSplitDataFrameList-method} \alias{dim,DataFrameList-method} \title{List of DataFrames} \description{Represents a list of \code{\linkS4class{DataFrame}} objects. The \code{SplitDataFrameList} class contains the additional restriction that all the columns be of the same name and type. Internally it is stored as a list of \code{DataFrame} objects and extends \code{\linkS4class{List}}.} \section{Accessors}{ In the following code snippets, \code{x} is a \code{DataFrameList}. \describe{ \item{\code{dims(x)}:}{ Get the two-column matrix indicating the number of rows and columns over the entire dataset.} \item{\code{dimnames(x)}:}{ Get the list of two CharacterLists, the first holding the rownames (possibly \code{NULL}) and the second the column names. } } In the following code snippets, \code{x} is a \code{SplitDataFrameList}. \describe{ \item{\code{commonColnames(x)}:}{ Get the character vector of column names present in the individual DataFrames in \code{x}.} \item{\code{commonColnames(x) <- value}:}{ Set the column names of the DataFrames in \code{x}.} \item{\code{columnMetadata(x)}:}{ Get the \code{DataFrame} of metadata along the columns, i.e., where each column in \code{x} is represented by a row in the metadata. The metadata is common across all elements of \code{x}. Note that calling \code{mcols(x)} returns the metadata on the \code{DataFrame} elements of \code{x}. } \item{\code{columnMetadata(x) <- value}:}{ Set the \code{DataFrame} of metadata for the columns. } } } \section{Subsetting}{ In the following code snippets, \code{x} is a \code{SplitDataFrameList}. In general \code{x} follows the conventions of \code{SimpleList}/\code{CompressedList} with the following addition: \describe{ \item{\code{x[i,j,drop]}:}{ If matrix subsetting is used, \code{i} selects either the list elements or the rows within the list elements as determined by the \code{[} method for \code{SimpleList}/\code{CompressedList}, \code{j} selects the columns, and \code{drop} is used when one column is selected and output can be coerced into an \code{AtomicList} or \code{IntegerRangesList} subclass. } \item{\code{x[i,j] <- value}:}{ If matrix subsetting is used, \code{i} selects either the list elements or the rows within the list elements as determined by the \code{[<-} method for \code{SimpleList}/\code{CompressedList}, \code{j} selects the columns and \code{value} is the replacement value for the selected region. } } } \section{Constructor}{ \describe{ \item{\code{DataFrameList(...)}:}{ Concatenates the \code{DataFrame} objects in \code{...} into a new \code{DataFrameList}.} \item{\code{SplitDataFrameList(..., compress = TRUE, cbindArgs = FALSE)}:}{ If \code{cbindArgs} is \code{FALSE}, the \code{...} arguments are coerced to \code{DataFrame} objects and concatenated to form the result. The arguments must have the same number and names of columns. If \code{cbindArgs} is \code{TRUE}, the arguments are combined as columns. The arguments must then be the same length, with each element of an argument mapping to an element in the result. If \code{compress = TRUE}, returns a \code{CompressedSplitDataFrameList}; else returns a \code{SimpleSplitDataFrameList}.} } } \section{Combining}{ In the following code snippets, objects in \code{...} are of class \code{DataFrameList}. \describe{ \item{\code{rbind(...)}:}{ Creates a new \code{DataFrameList} containing the element-by-element row concatenation of the objects in \code{...}. } \item{\code{cbind(...)}:}{ Creates a new \code{DataFrameList} containing the element-by-element column concatenation of the objects in \code{...}. } } } \section{Transformation}{ \describe{ \item{\code{transform(`_data`, ...)}:}{ Transforms a \code{SplitDataFrame} in a manner analogous to the base \code{\link{transform}}, where the columns are \code{List} objects adhering to the structure of \code{_data}. } } } \section{Coercion}{ In the following code snippets, \code{x} is a \code{DataFrameList}. \describe{ \item{\code{as(from, "DataFrame")}:}{ Coerces a \code{SplitDataFrameList} to a \code{DataFrame}, which has a column for every column in \code{from}, except each column is a \code{List} with the same structure as \code{from}. } \item{\code{as(from, "SplitDataFrameList")}:}{ By default, simply calls the \code{SplitDataFrameList} constructor on \code{from}. If \code{from} is a \code{List}, each element of \code{from} is passed as an argument to \code{SplitDataFrameList}, like calling \code{as.list} on a vector. If \code{from} is a \code{DataFrame}, each row becomes an element in the list.} \item{\code{stack(x, index.var = "name")}:}{ Unlists \code{x} and adds a column named \code{index.var} to the result, indicating the element of \code{x} from which each row was obtained. } \item{\code{as.data.frame(x, row.names = NULL, optional = FALSE, ..., value.name = "value", use.outer.mcols = FALSE, group_name.as.factor = FALSE)}:}{ Coerces \code{x} to a \code{data.frame}. See as.data.frame on the \code{List} man page for details (?\code{List}). } } } \author{Michael Lawrence, with contributions from Aaron Lun} \seealso{ \code{\linkS4class{DataFrame}} } \examples{ # Making a DataFrameList, which has different columns. out <- DataFrameList(DataFrame(X=1, Y=2), DataFrame(A=1:2, B=3:4)) out[[1]] # A more interesting SplitDataFrameList, which is guaranteed # to have the same columns. out <- SplitDataFrameList(DataFrame(X=1, Y=2), DataFrame(X=1:2, Y=3:4)) out[[1]] out[,"X"] out[,"Y"] commonColnames(out) commonColnames(out) <- c("x", "y") out[[1]] # We can also create these split objects using various split() functions: out <- splitAsList(DataFrame(X=runif(100), Y=rpois(100, 5)), sample(letters, 100, replace=TRUE)) out[['a']] } \keyword{methods} \keyword{classes} IRanges/man/Grouping-class.Rd0000644000175100017510000004773214626176651017114 0ustar00biocbuildbiocbuild\name{Grouping-class} \docType{class} % Grouping objects \alias{class:Grouping} \alias{Grouping-class} \alias{Grouping} \alias{nobj} \alias{grouplengths} \alias{grouplengths,Grouping-method} \alias{grouplengths,CompressedGrouping-method} \alias{show,Grouping-method} % ManyToOneGrouping objects \alias{class:ManyToOneGrouping} \alias{ManyToOneGrouping-class} \alias{ManyToOneGrouping} \alias{nobj,ManyToOneGrouping-method} \alias{nobj,CompressedManyToOneGrouping-method} \alias{members} \alias{members,ManyToOneGrouping-method} \alias{vmembers} \alias{vmembers,ManyToOneGrouping-method} \alias{togroup} \alias{togroup,ManyToOneGrouping-method} \alias{togrouplength} \alias{togrouplength,ManyToOneGrouping-method} \alias{coerce,grouping,Grouping-method} \alias{coerce,grouping,ManyToOneGrouping-method} \alias{coerce,vector,Grouping-method} \alias{coerce,vector,ManyToOneGrouping-method} \alias{coerce,ManyToOneGrouping,factor-method} \alias{coerce,DataFrame,Grouping-method} \alias{coerce,FactorList,Grouping-method} \alias{coerce,Hits,Grouping-method} % ManyToManyGrouping objects \alias{nobj,BaseManyToManyGrouping-method} \alias{coerce,vector,ManyToManyGrouping-method} % H2LGrouping and Dups objects \alias{class:H2LGrouping} \alias{H2LGrouping-class} \alias{H2LGrouping} \alias{high2low} \alias{high2low,H2LGrouping-method} \alias{high2low,ANY-method} \alias{low2high} \alias{low2high,H2LGrouping-method} \alias{length,H2LGrouping-method} \alias{nobj,H2LGrouping-method} \alias{grouplengths,H2LGrouping-method} \alias{members,H2LGrouping-method} \alias{vmembers,H2LGrouping-method} \alias{togroup,H2LGrouping-method} \alias{grouprank} \alias{grouprank,H2LGrouping-method} \alias{togrouprank} \alias{togrouprank,H2LGrouping-method} \alias{length<-,H2LGrouping-method} \alias{class:Dups} \alias{Dups-class} \alias{Dups} \alias{duplicated,Dups-method} \alias{show,Dups-method} % ManyToManyGrouping objects \alias{class:ManyToManyGrouping} \alias{ManyToManyGrouping-class} \alias{ManyToManyGrouping} \alias{nobj,ManyToManyGrouping-method} % GroupingRanges objects \alias{class:GroupingRanges} \alias{GroupingRanges-class} \alias{GroupingRanges} \alias{grouplengths,GroupingRanges-method} \alias{class:GroupingIRanges} \alias{GroupingIRanges-class} \alias{GroupingIRanges} % Partitioning objects \alias{class:Partitioning} \alias{Partitioning-class} \alias{Partitioning} \alias{parallel_slot_names,Partitioning-method} \alias{extractROWS,Partitioning-method} \alias{bindROWS,Partitioning-method} \alias{togroup,Partitioning-method} \alias{names,Partitioning-method} \alias{names<-,Partitioning-method} \alias{class:PartitioningByEnd} \alias{PartitioningByEnd-class} \alias{PartitioningByEnd} \alias{parallel_slot_names,PartitioningByEnd-method} \alias{end,PartitioningByEnd-method} \alias{length,PartitioningByEnd-method} \alias{nobj,PartitioningByEnd-method} \alias{start,PartitioningByEnd-method} \alias{width,PartitioningByEnd-method} \alias{coerce,IntegerRanges,PartitioningByEnd-method} \alias{class:PartitioningByWidth} \alias{PartitioningByWidth-class} \alias{PartitioningByWidth} \alias{parallel_slot_names,PartitioningByWidth-method} \alias{width,PartitioningByWidth-method} \alias{length,PartitioningByWidth-method} \alias{end,PartitioningByWidth-method} \alias{start,PartitioningByWidth-method} \alias{coerce,IntegerRanges,PartitioningByWidth-method} % PartitioningMap objects \alias{class:PartitioningMap} \alias{PartitioningMap-class} \alias{PartitioningMap} \alias{mapOrder} \alias{mapOrder,PartitioningMap-method} \alias{show,PartitioningMap-method} % SimpleGrouping & CompressedGrouping objects \alias{class:SimpleGrouping} \alias{SimpleGrouping-class} \alias{class:CompressedGrouping} \alias{CompressedGrouping-class} \alias{class:SimpleManyToOneGrouping} \alias{SimpleManyToOneGrouping-class} \alias{class:CompressedManyToOneGrouping} \alias{CompressedManyToOneGrouping-class} \alias{class:SimpleManyToManyGrouping} \alias{SimpleManyToManyGrouping-class} \alias{class:CompressedManyToManyGrouping} \alias{CompressedManyToManyGrouping-class} % old stuff (deprecated & defunct) \alias{togroup,ANY-method} \title{Grouping objects} \description{ We call \emph{grouping} an arbitrary mapping from a collection of NO objects to a collection of NG groups, or, more formally, a bipartite graph between integer sets [1, NO] and [1, NG]. Objects mapped to a given group are said to belong to, or to be assigned to, or to be in that group. Additionally, the objects in each group are ordered. So for example the 2 following groupings are considered different: \preformatted{ Grouping 1: NG = 3, NO = 5 group objects 1 : 4, 2 2 : 3 : 4 Grouping 2: NG = 3, NO = 5 group objects 1 : 2, 4 2 : 3 : 4 } There are no restriction on the mapping e.g. any object can be mapped to 0, 1, or more groups, and can be mapped twice to the same group. Also some or all the groups can be empty. The Grouping class is a virtual class that formalizes the most general kind of grouping. More specific groupings (e.g. \emph{many-to-one groupings} or \emph{block-groupings}) are formalized via specific Grouping subclasses. This man page documents the core Grouping API, and 3 important Grouping subclasses: ManyToOneGrouping, GroupingRanges, and Partitioning (the last one deriving from the 2 first). } \section{The core Grouping API}{ Let's give a formal description of the core Grouping API: Groups G_i are indexed from 1 to NG (1 <= i <= NG). Objects O_j are indexed from 1 to NO (1 <= j <= NO). Given that empty groups are allowed, NG can be greater than NO. If \code{x} is a Grouping object: \describe{ \item{\code{length(x)}:}{ Returns the number of groups (NG). } \item{\code{names(x)}:}{ Returns the names of the groups. } \item{\code{nobj(x)}:}{ Returns the number of objects (NO). } } Going from groups to objects: \describe{ \item{\code{x[[i]]}:}{ Returns the indices of the objects (the j's) that belong to G_i. This provides the mapping from groups to objects. } \item{\code{grouplengths(x, i=NULL)}:}{ Returns the number of objects in G_i. Works in a vectorized fashion (unlike \code{x[[i]]}). \code{grouplengths(x)} is equivalent to \code{grouplengths(x, seq_len(length(x)))}. If \code{i} is not NULL, \code{grouplengths(x, i)} is equivalent to \code{sapply(i, function(ii) length(x[[ii]]))}. } } Note to developers: Given that \code{length}, \code{names} and \code{[[} are expected to work on any Grouping object, those objects can be seen as \link{List} objects. More precisely, the Grouping class actually extends the \link{IntegerList} class. In particular, many other "list" operations like \code{as.list}, \code{elementNROWS}, and \code{unlist}, etc... should work out-of-the-box on any Grouping object. } \section{ManyToOneGrouping objects}{ The ManyToOneGrouping class is a virtual subclass of Grouping for representing \emph{many-to-one groupings}, that is, groupings where each object in the original collection of objects belongs to exactly one group. The grouping of an empty collection of objects in an arbitrary number of (necessarily empty) groups is a valid ManyToOneGrouping object. Note that, for a ManyToOneGrouping object, if NG is 0 then NO must also be 0. The ManyToOneGrouping API extends the core Grouping API by adding a couple more operations for going from groups to objects: \describe{ \item{\code{members(x, i)}:}{ Equivalent to \code{x[[i]]} if \code{i} is a single integer. Otherwise, if \code{i} is an integer vector of arbitrary length, it's equivalent to \code{sort(unlist(sapply(i, function(ii) x[[ii]])))}. } \item{\code{vmembers(x, L)}:}{ A version of \code{members} that works in a vectorized fashion with respect to the \code{L} argument (\code{L} must be a list of integer vectors). Returns \code{lapply(L, function(i) members(x, i))}. } } And also by adding operations for going from objects to groups: \describe{ \item{\code{togroup(x, j=NULL)}:}{ Returns the index i of the group that O_j belongs to. This provides the mapping from objects to groups (many-to-one mapping). Works in a vectorized fashion. \code{togroup(x)} is equivalent to \code{togroup(x, seq_len(nobj(x)))}: both return the entire mapping in an integer vector of length NO. If \code{j} is not NULL, \code{togroup(x, j)} is equivalent to \code{y <- togroup(x); y[j]}. } \item{\code{togrouplength(x, j=NULL)}:}{ Returns the number of objects that belong to the same group as O_j (including O_j itself). Equivalent to \code{grouplengths(x, togroup(x, j))}. } } One important property of any ManyToOneGrouping object \code{x} is that \code{unlist(as.list(x))} is always a permutation of \code{seq_len(nobj(x))}. This is a direct consequence of the fact that every object in the grouping belongs to one group and only one. } \section{2 ManyToOneGrouping concrete subclasses: H2LGrouping, Dups and SimpleManyToOneGrouping}{ [DOCUMENT ME] Constructors: \describe{ \item{\code{H2LGrouping(high2low=integer())}:}{ [DOCUMENT ME] } \item{\code{Dups(high2low=integer())}:}{ [DOCUMENT ME] } \item{\code{ManyToOneGrouping(..., compress=TRUE)}:}{ Collect \code{\dots} into a \code{ManyToOneGrouping}. The arguments will be coerced to integer vectors and combined into a list, unless there is a single list argument, which is taken to be an integer list. The resulting integer list should have a structure analogous to that of \code{Grouping} itself: each element represents a group in terms of the subscripts of the members. If \code{compress} is \code{TRUE}, the representation uses a \code{CompressedList}, otherwise a \code{SimpleList}. } } } \section{ManyToManyGrouping objects}{ The ManyToManyGrouping class is a virtual subclass of Grouping for representing \emph{many-to-many groupings}, that is, groupings where each object in the original collection of objects belongs to any number of groups. Constructors: \describe{ \item{\code{ManyToManyGrouping(x, compress=TRUE)}:}{ Collect \code{\dots} into a \code{ManyToManyGrouping}. The arguments will be coerced to integer vectors and combined into a list, unless there is a single list argument, which is taken to be an integer list. The resulting integer list should have a structure analogous to that of \code{Grouping} itself: each element represents a group in terms of the subscripts of the members. If \code{compress} is \code{TRUE}, the representation uses a \code{CompressedList}, otherwise a \code{SimpleList}. } } } \section{GroupingRanges objects}{ The GroupingRanges class is a virtual subclass of Grouping for representing \emph{block-groupings}, that is, groupings where each group is a block of adjacent elements in the original collection of objects. GroupingRanges objects support the IntegerRanges API (e.g. \code{\link{start}}, \code{\link{end}}, \code{\link{width}}, etc...) in addition to the Grouping API. See \code{?\link{IntegerRanges}} for a description of the \link{IntegerRanges} API. } \section{Partitioning objects}{ The Partitioning class is a virtual subclass of GroupingRanges for representing \emph{block-groupings} where the blocks fully cover the original collection of objects and don't overlap. Since this makes them \emph{many-to-one groupings}, the Partitioning class is also a subclass of ManyToOneGrouping. An additional constraint of Partitioning objects is that the blocks must be ordered by ascending position with respect to the original collection of objects. The Partitioning virtual class itself has 3 concrete subclasses: PartitioningByEnd (only stores the end of the groups, allowing fast mapping from groups to objects), and PartitioningByWidth (only stores the width of the groups), and PartitioningMap which contains PartitioningByEnd and two additional slots to re-order and re-list the object to a related mapping. Constructors: \describe{ \item{\code{PartitioningByEnd(x=integer(), NG=NULL, names=NULL)}:}{ \code{x} must be either a list-like object or a sorted integer vector. \code{NG} must be either \code{NULL} or a single integer. \code{names} must be either \code{NULL} or a character vector of length \code{NG} (if supplied) or \code{length(x)} (if \code{NG} is not supplied). Returns the following PartitioningByEnd object \code{y}: \itemize{ \item If \code{x} is a list-like object, then the returned object \code{y} has the same length as \code{x} and is such that \code{width(y)} is identical to \code{elementNROWS(x)}. \item If \code{x} is an integer vector and \code{NG} is not supplied, then \code{x} must be sorted (checked) and contain non-NA non-negative values (NOT checked). The returned object \code{y} has the same length as \code{x} and is such that \code{end(y)} is identical to \code{x}. \item If \code{x} is an integer vector and \code{NG} is supplied, then \code{x} must be sorted (checked) and contain values >= 1 and <= \code{NG} (checked). The returned object \code{y} is of length \code{NG} and is such that \code{togroup(y)} is identical to \code{x}. } If the \code{names} argument is supplied, it is used to name the partitions. } \item{\code{PartitioningByWidth(x=integer(), NG=NULL, names=NULL)}:}{ \code{x} must be either a list-like object or an integer vector. \code{NG} must be either \code{NULL} or a single integer. \code{names} must be either \code{NULL} or a character vector of length \code{NG} (if supplied) or \code{length(x)} (if \code{NG} is not supplied). Returns the following PartitioningByWidth object \code{y}: \itemize{ \item If \code{x} is a list-like object, then the returned object \code{y} has the same length as \code{x} and is such that \code{width(y)} is identical to \code{elementNROWS(x)}. \item If \code{x} is an integer vector and \code{NG} is not supplied, then \code{x} must contain non-NA non-negative values (NOT checked). The returned object \code{y} has the same length as \code{x} and is such that \code{width(y)} is identical to \code{x}. \item If \code{x} is an integer vector and \code{NG} is supplied, then \code{x} must be sorted (checked) and contain values >= 1 and <= \code{NG} (checked). The returned object \code{y} is of length \code{NG} and is such that \code{togroup(y)} is identical to \code{x}. } If the \code{names} argument is supplied, it is used to name the partitions. } \item{\code{PartitioningMap(x=integer(), mapOrder=integer())}:}{ \code{x} is a list-like object or a sorted integer vector used to construct a PartitioningByEnd object. \code{mapOrder} numeric vector of the mapped order. Returns a PartitioningMap object. } } Note that these constructors don't recycle their \code{names} argument (to remain consistent with what \code{`names<-`} does on standard vectors). } \section{Coercions to Grouping objects}{ These types can be coerced to different derivatives of Grouping objects: \describe{ \item{factor}{ Analogous to calling \code{split} with the factor. Returns a ManyToOneGrouping if there are no NAs, otherwise a ManyToManyGrouping. If a factor is explicitly converted to a ManytoOneGrouping, then any NAs are placed in the last group. } \item{vector}{ A vector is effectively treated as a factor, but more efficiently. The order of the groups is not defined. } \item{FactorList}{ Same as the factor coercion, except using the interaction of every factor in the list. The interaction has an NA wherever any of the elements has one. Every element must have the same length. } \item{DataFrame}{ Effectively converted via a FactorList by coercing each column to a factor. } \item{grouping}{ Equivalent Grouping representation of the base R \code{\link{grouping}} object. } \item{Hits}{ Returns roughly the same object as \code{as(x, "List")}, except it is a ManyToManyGrouping, i.e., it knows the number of right nodes. } } } \author{Hervé Pagès, Michael Lawrence} \seealso{ \link{IntegerList-class}, \link{IntegerRanges-class}, \link{IRanges-class}, \link{successiveIRanges}, \link[base]{cumsum}, \link[base]{diff} } \examples{ showClass("Grouping") # shows (some of) the known subclasses ## --------------------------------------------------------------------- ## A. H2LGrouping OBJECTS ## --------------------------------------------------------------------- high2low <- c(NA, NA, 2, 2, NA, NA, NA, 6, NA, 1, 2, NA, 6, NA, NA, 2) h2l <- H2LGrouping(high2low) h2l ## The core Grouping API: length(h2l) nobj(h2l) # same as 'length(h2l)' for H2LGrouping objects h2l[[1]] h2l[[2]] h2l[[3]] h2l[[4]] h2l[[5]] grouplengths(h2l) # same as 'unname(sapply(h2l, length))' grouplengths(h2l, 5:2) members(h2l, 5:2) # all the members are put together and sorted togroup(h2l) togroup(h2l, 5:2) togrouplength(h2l) # same as 'grouplengths(h2l, togroup(h2l))' togrouplength(h2l, 5:2) ## The List API: as.list(h2l) sapply(h2l, length) ## --------------------------------------------------------------------- ## B. Dups OBJECTS ## --------------------------------------------------------------------- dups1 <- as(h2l, "Dups") dups1 duplicated(dups1) # same as 'duplicated(togroup(dups1))' ### The purpose of a Dups object is to describe the groups of duplicated ### elements in a vector-like object: x <- c(2, 77, 4, 4, 7, 2, 8, 8, 4, 99) x_high2low <- high2low(x) x_high2low # same length as 'x' dups2 <- Dups(x_high2low) dups2 togroup(dups2) duplicated(dups2) togrouplength(dups2) # frequency for each element table(x) ## --------------------------------------------------------------------- ## C. Partitioning OBJECTS ## --------------------------------------------------------------------- pbe1 <- PartitioningByEnd(c(4, 7, 7, 8, 15), names=LETTERS[1:5]) pbe1 # the 3rd partition is empty ## The core Grouping API: length(pbe1) nobj(pbe1) pbe1[[1]] pbe1[[2]] pbe1[[3]] grouplengths(pbe1) # same as 'unname(sapply(pbe1, length))' # and 'width(pbe1)' togroup(pbe1) togrouplength(pbe1) # same as 'grouplengths(pbe1, togroup(pbe1))' names(pbe1) ## The IntegerRanges core API: start(pbe1) end(pbe1) width(pbe1) ## The List API: as.list(pbe1) sapply(pbe1, length) ## Replacing the names: names(pbe1)[3] <- "empty partition" pbe1 ## Coercion to an IRanges object: as(pbe1, "IRanges") ## Other examples: PartitioningByEnd(c(0, 0, 19), names=LETTERS[1:3]) PartitioningByEnd() # no partition PartitioningByEnd(integer(9)) # all partitions are empty x <- c(1L, 5L, 5L, 6L, 8L) pbe2 <- PartitioningByEnd(x, NG=10L) stopifnot(identical(togroup(pbe2), x)) pbw2 <- PartitioningByWidth(x, NG=10L) stopifnot(identical(togroup(pbw2), x)) ## --------------------------------------------------------------------- ## D. RELATIONSHIP BETWEEN Partitioning OBJECTS AND successiveIRanges() ## --------------------------------------------------------------------- mywidths <- c(4, 3, 0, 1, 7) ## The 3 following calls produce the same ranges: ir <- successiveIRanges(mywidths) # IRanges instance. pbe <- PartitioningByEnd(cumsum(mywidths)) # PartitioningByEnd instance. pbw <- PartitioningByWidth(mywidths) # PartitioningByWidth instance. stopifnot(identical(as(ir, "PartitioningByEnd"), pbe)) stopifnot(identical(as(ir, "PartitioningByWidth"), pbw)) } \keyword{methods} \keyword{classes} IRanges/man/Hits-class-leftovers.Rd0000644000175100017510000000560414626176651020230 0ustar00biocbuildbiocbuild\name{Hits-class-leftovers} \docType{class} \alias{Hits-examples} % coercion \alias{coerce,SortedByQueryHits,PartitioningByEnd-method} \alias{coerce,SortedByQueryHits,Partitioning-method} \alias{coerce,SortedByQueryHits,IntegerRanges-method} \alias{coerce,SortedByQueryHits,IRanges-method} \alias{coerce,SortedByQueryHits,CompressedIntegerList-method} \alias{coerce,SortedByQueryHits,IntegerList-method} \alias{coerce,SortedByQueryHits,List-method} \alias{as.list,SortedByQueryHits-method} \alias{coerce,Hits,CompressedIntegerList-method} \alias{coerce,Hits,IntegerList-method} \alias{coerce,Hits,List-method} \alias{as.list,Hits-method} \alias{coerce,Hits,Grouping} \title{Examples of basic manipulation of Hits objects} \description{ IMPORTANT NOTE - 4/29/2014: This man page is being refactored. Most of the things that used to be documented here have been moved to the man page for \link[S4Vectors]{Hits} objects located in the \pkg{S4Vectors} package. } \details{ The \code{as.data.frame} method coerces a \code{Hits} object to a two column \code{data.frame} with one row for each hit, where the value in the first column is the index of an element in the query and the value in the second column is the index of an element in the subject. } \section{Coercion}{ In the code snippets below, \code{x} is a \code{Hits} object. \describe{ \item{\code{as.list(x)}:}{ Coerces \code{x} to a list of integers, grouping the the right node hits for each left node. } \item{\code{as(x, "List")}:}{ Analogous to \code{as.list(x)}. } \item{\code{as(x, "Grouping")}:}{ Returns roughly the same object as \code{as(x, "List")}, except it is a ManyToManyGrouping, i.e., it knows the number of right nodes. } } } \seealso{ The \link[S4Vectors]{Hits} class defined and documented in the \pkg{S4Vectors} package. } \examples{ query <- IRanges(c(1, 4, 9), c(5, 7, 10)) subject <- IRanges(c(2, 2, 10), c(2, 3, 12)) hits <- findOverlaps(query, subject) as.matrix(hits) as.data.frame(hits) as.table(hits) # hits per query as.table(t(hits)) # hits per subject ## Turn a Hits object into an IntegerList object with one list element ## per element in the original query. as(hits, "IntegerList") as(hits, "List") # same as as(hits, "IntegerList") ## Turn a Hits object into a PartitioningByEnd object that describes ## the grouping of hits by query. as(hits, "PartitioningByEnd") as(hits, "Partitioning") # same as as(hits, "PartitioningByEnd") ## --------------------------------------------------------------------- ## remapHits() ## --------------------------------------------------------------------- hits2 <- remapHits(hits, Rnodes.remapping=factor(c("e", "e", "d"), letters[1:5])) hits2 hits3 <- remapHits(hits, Rnodes.remapping=c(5, 5, 4), new.nRnode=5) hits3 stopifnot(identical(hits2, hits3)) } \keyword{methods} \keyword{classes} IRanges/man/IPos-class.Rd0000644000175100017510000002755314626176651016173 0ustar00biocbuildbiocbuild\name{IPos-class} \docType{class} % Classes \alias{class:IPos} \alias{IPos-class} \alias{IPos} \alias{parallel_slot_names,IPos-method} \alias{class:UnstitchedIPos} \alias{UnstitchedIPos-class} \alias{UnstitchedIPos} \alias{parallel_slot_names,UnstitchedIPos-method} \alias{class:StitchedIPos} \alias{StitchedIPos-class} \alias{StitchedIPos} \alias{updateObject,IPos-method} % Accessors \alias{pos} \alias{pos,UnstitchedIPos-method} \alias{pos,IPos-method} \alias{length,UnstitchedIPos-method} \alias{length,IPos-method} \alias{names,IPos-method} \alias{names<-,IPos-method} % Coercion \alias{coerce,UnstitchedIPos,StitchedIPos-method} \alias{coerce,StitchedIPos,UnstitchedIPos-method} \alias{coerce,IntegerRanges,UnstitchedIPos-method} \alias{coerce,IntegerRanges,StitchedIPos-method} \alias{coerce,IntegerRanges,IPos-method} \alias{coerce,ANY,UnstitchedIPos-method} \alias{coerce,ANY,StitchedIPos-method} \alias{coerce,ANY,IPos-method} \alias{as.data.frame.IPos} \alias{as.data.frame,IPos-method} % Other \alias{extractROWS,IPos-method} \alias{summary.IPos} \alias{summary,IPos-method} \alias{show,IPos-method} \alias{bindROWS,IPos-method} \title{IPos objects} \description{ The IPos class is a container for storing a set of \emph{integer positions}. It exists in 2 flavors: UnstitchedIPos and StitchedIPos. Each flavor uses a particular internal representation: \itemize{ \item In an UnstitchedIPos instance the positions are stored as an integer vector. \item In a StitchedIPos instance the positions are stored as an \link{IRanges} object where each range represents a run of \emph{consecutive positions} (i.e. a run of positions that are adjacent and in \emph{ascending order}). This storage is particularly memory-efficient when the vector of positions contains long runs of consecutive positions. } Because integer positions can be seen as integer ranges of width 1, the IPos class extends the \link{IntegerRanges} virtual class. } \usage{ IPos(pos=integer(0), names=NULL, ..., stitch=NA) # constructor function } \arguments{ \item{pos}{ An integer or numeric vector, or an \link{IRanges} object (or other \link{IntegerRanges} derivative). If \code{pos} is anything else, \code{IPos()} will first try to coerce it to an IRanges object with \code{as(pos, "IRanges")}. When \code{pos} is an \link{IRanges} object (or other \link{IntegerRanges} derivative), each range in it is interpreted as a run of consecutive positions. } \item{names}{ A character vector or \code{NULL}. } \item{...}{ Metadata columns to set on the IPos object. All the metadata columns must be vector-like objects of the same length as the object to construct. } \item{stitch}{ \code{TRUE}, \code{FALSE}, or \code{NA} (the default). Controls which internal representation should be used: StitchedIPos (when \code{stitch} is \code{TRUE}) or UnstitchedIPos (when \code{stitch} is \code{FALSE}). When \code{stitch} is \code{NA} (the default), which internal representation will be used depends on the type of \code{pos}: UnstitchedIPos if \code{pos} is an integer or numeric vector, and StitchedIPos otherwise. } } \details{ Even though an \link{IRanges} object can be used for storing integer positions, using an IPos object is more efficient. In particular the memory footprint of an UnstitchedIPos object is half that of an \link{IRanges} object. OTOH the memory footprint of a StitchedIPos object can vary a lot but will never be worse than that of an \link{IRanges} object. However it will reduce dramatically if the vector of positions contains long runs of consecutive positions. In the worst case scenario (i.e. when the object contains no consecutive positions) its memory footprint will be the same as that of an \link{IRanges} object. Like for any \link[S4Vectors]{Vector} derivative, the length of an IPos object cannot exceed \code{.Machine$integer.max} (i.e. 2^31 on most platforms). \code{IPos()} will return an error if \code{pos} contains too many positions. } \value{ An UnstitchedIPos or StitchedIPos object. If the input object \code{pos} is itself an IPos derivative, its metadata columns are propagated. } \section{Accessors}{ \subsection{Getters}{ IPos objects support the same set of getters as other \link{IntegerRanges} derivatives (i.e. \code{length()}, \code{start()}, \code{end()}, \code{names()}, \code{mcols()}, etc...), plus the \code{pos()} getter which is equivalent to \code{start()} and \code{end()}. See \code{?\link{IntegerRanges}} for the list of getters supported by \link{IntegerRanges} derivatives. } \subsection{Setters}{ IPos derivatives support the \code{names()}, \code{mcols()} and \code{metadata()} setters only. In particular there is no \code{pos()} setter for IPos derivatives at the moment (although one might be added in the future). } } \section{Coercion}{ From UnstitchedIPos to StitchedIPos and vice-versa: coercion back and forth between UnstitchedIPos and StitchedIPos is supported via \code{as(x, "StitchedIPos")} and \code{as(x, "UnstitchedIPos")}. This is the most efficient and recommended way to switch between the 2 internal representations. Note that this switch can have dramatic consequences on memory usage so is for advanced users only. End users should almost never need to do this switch when following a typical workflow. From \link{IntegerRanges} to UnstitchedIPos, StitchedIPos, or IPos: An \link{IntegerRanges} derivative \code{x} in which all the ranges have a width of 1 can be coerced to an UnstitchedIPos or StitchedIPos object with \code{as(x, "UnstitchedIPos")} or \code{as(x, "StitchedIPos")}, respectively. For convenience \code{as(x, "IPos")} is supported and is equivalent to \code{as(x, "UnstitchedIPos")}. From IPos to \link{IRanges}: An IPos derivative \code{x} can be coerced to an \link{IRanges} object with \code{as(x, "IRanges")}. However be aware that if \code{x} is a StitchedIPos instance, the memory footprint of the resulting object can be thousands times (or more) than that of \code{x}! See "MEMORY USAGE" in the Examples section below. From IPos to ordinary R objects: Like with any other \link{IntegerRanges} derivative, \code{as.character()}, \code{as.factor()}, and \code{as.data.frame()} work on an IPos derivative \code{x}. Note however that \code{as.data.frame(x)} returns a data frame with a \code{pos} column (containing \code{pos(x)}) instead of the \code{start}, \code{end}, and \code{width} columns that one gets with other \link{IntegerRanges} derivatives. } \section{Subsetting}{ An IPos derivative can be subsetted exactly like an \link{IRanges} object. } \section{Concatenation}{ IPos derivatives can be concatenated with \code{c()} or \code{append()}. See \code{?\link[S4Vectors]{c}} in the \pkg{S4Vectors} package for more information about concatenating Vector derivatives. } \section{Splitting and Relisting}{ Like with an \link{IRanges} object, \code{split()} and \code{relist()} work on an IPos derivative. } \author{ Hervé Pagès; based on ideas borrowed from Georg Stricker \email{georg.stricker@in.tum.de} and Julien Gagneur \email{gagneur@in.tum.de} } \seealso{ \itemize{ \item The \link[GenomicRanges]{GPos} class in the \pkg{GenomicRanges} package for representing a set of \emph{genomic positions} (i.e. genomic ranges of width 1, a.k.a. \emph{genomic loci}). \item The \link{IRanges} class for storing a set of \emph{integer ranges} of arbitrary width. \item \link{IPosRanges-comparison} for comparing and ordering integer ranges and/or positions. \item \link{findOverlaps-methods} for finding overlapping integer ranges and/or positions. \item \link{intra-range-methods} and \link{inter-range-methods} for \emph{intra range} and \emph{inter range} transformations. \item \link{coverage-methods} for computing the coverage of a set of ranges and/or positions. \item \link{nearest-methods} for finding the nearest integer range/position neighbor. } } \examples{ showClass("IPos") # shows the known subclasses ## --------------------------------------------------------------------- ## BASIC EXAMPLES ## --------------------------------------------------------------------- ## Example 1: ipos1a <- IPos(c(44:53, 5:10, 2:5)) ipos1a # unstitched length(ipos1a) pos(ipos1a) # same as 'start(ipos1a)' and 'end(ipos1a)' as.character(ipos1a) as.data.frame(ipos1a) as(ipos1a, "IRanges") as.data.frame(as(ipos1a, "IRanges")) ipos1a[9:17] ipos1b <- IPos(c(44:53, 5:10, 2:5), stitch=TRUE) ipos1b # stitched ## 'ipos1a' and 'ipos1b' are semantically equivalent, only their ## internal representations differ: all(ipos1a == ipos1b) ipos1c <- IPos(c("44-53", "5-10", "2-5")) ipos1c # stitched identical(ipos1b, ipos1c) ## Example 2: my_pos <- IRanges(c(1, 6, 12, 17), c(5, 10, 16, 20)) ipos2 <- IPos(my_pos) ipos2 # stitched ## Example 3: ipos3A <- ipos3B <- IPos(c("1-15000", "15400-88700")) npos <- length(ipos3A) mcols(ipos3A)$sample <- Rle("sA") sA_counts <- sample(10, npos, replace=TRUE) mcols(ipos3A)$counts <- sA_counts mcols(ipos3B)$sample <- Rle("sB") sB_counts <- sample(10, npos, replace=TRUE) mcols(ipos3B)$counts <- sB_counts ipos3 <- c(ipos3A, ipos3B) ipos3 ## --------------------------------------------------------------------- ## MEMORY USAGE ## --------------------------------------------------------------------- ## Coercion to IRanges works on a StitchedIPos object... ipos4 <- IPos(c("1-125000", "135000-575000")) ir4 <- as(ipos4, "IRanges") ir4 ## ... but is generally not a good idea: object.size(ipos4) object.size(ir4) # 1652 times bigger than the StitchedIPos object! ## Shuffling the order of the positions impacts memory usage: ipos4r <- rev(ipos4) object.size(ipos4r) ipos4s <- sample(ipos4) object.size(ipos4s) ## If one anticipates a lot of shuffling of the positions, ## then an UnstitchedIPos object should be used instead: ipos4b <- as(ipos4, "UnstitchedIPos") object.size(ipos4b) # initial size is bigger than stitched version object.size(rev(ipos4b)) # size didn't change object.size(sample(ipos4b)) # size didn't change ## AN IMPORTANT NOTE: In the worst situations, IPos still performs ## as good as an IRanges object. object.size(as(ipos4r, "IRanges")) # same size as 'ipos4r' object.size(as(ipos4s, "IRanges")) # same size as 'ipos4s' ## Best case scenario is when the object is strictly sorted (i.e. ## positions are in strict ascending order). ## This can be checked with: is.unsorted(ipos4, strict=TRUE) # 'ipos4' is strictly sorted ## --------------------------------------------------------------------- ## USING MEMORY-EFFICIENT METADATA COLUMNS ## --------------------------------------------------------------------- ## In order to keep memory usage as low as possible, it is recommended ## to use a memory-efficient representation of the metadata columns that ## we want to set on the object. Rle's are particularly well suited for ## this, especially if the metadata columns contain long runs of ## identical values. This is the case for example if we want to use an ## IPos object to represent the coverage of sequencing reads along a ## chromosome. ## Example 5: library(pasillaBamSubset) library(Rsamtools) # for the BamFile() constructor function bamfile1 <- BamFile(untreated1_chr4()) bamfile2 <- BamFile(untreated3_chr4()) ipos5 <- IPos(IRanges(1, seqlengths(bamfile1)[["chr4"]])) library(GenomicAlignments) # for "coverage" method for BamFile objects cvg1 <- coverage(bamfile1)$chr4 cvg2 <- coverage(bamfile2)$chr4 mcols(ipos5) <- DataFrame(cvg1, cvg2) ipos5 object.size(ipos5) # lightweight ## Keep only the positions where coverage is at least 10 in one of the ## 2 samples: ipos5[mcols(ipos5)$cvg1 >= 10 | mcols(ipos5)$cvg2 >= 10] } \keyword{methods} \keyword{classes} IRanges/man/IPosRanges-class.Rd0000644000175100017510000003157614626176651017333 0ustar00biocbuildbiocbuild\name{IPosRanges-class} \docType{class} % Classes: \alias{class:IPosRanges} \alias{IPosRanges-class} \alias{IPosRanges} % Generics and methods: \alias{width} \alias{start,Ranges-method} \alias{end,Ranges-method} \alias{width,Ranges-method} \alias{length,Ranges-method} \alias{start,Pos-method} \alias{end,Pos-method} \alias{width,Pos-method} \alias{elementNROWS,Ranges-method} \alias{mid} \alias{mid,Ranges-method} \alias{isEmpty,Ranges-method} \alias{isNormal} \alias{isNormal,Ranges-method} \alias{whichFirstNotNormal} \alias{whichFirstNotNormal,Ranges-method} \alias{start<-} \alias{width<-} \alias{end<-} \alias{as.character,IPosRanges-method} \alias{as.factor,IPosRanges-method} \alias{as.matrix,IPosRanges-method} \alias{as.data.frame.IPosRanges} \alias{as.data.frame,IPosRanges-method} \alias{summary.IPosRanges} \alias{summary,IPosRanges-method} \alias{show,IPosRanges-method} \alias{getListElement,IPosRanges-method} \alias{tile} \alias{tile,IPosRanges-method} \alias{slidingWindows} \alias{slidingWindows,IPosRanges-method} \title{IPosRanges objects} \description{ The IPosRanges \emph{virtual} class is a general container for storing a vector of ranges of integer positions. } \details{ An IPosRanges object is a vector-like object where each element describes a "range of integer positions". A "range of integer values" is a finite set of consecutive integer values. Each range can be fully described with exactly 2 integer values which can be arbitrarily picked up among the 3 following values: its "start" i.e. its smallest (or first, or leftmost) value; its "end" i.e. its greatest (or last, or rightmost) value; and its "width" i.e. the number of integer values in the range. For example the set of integer values that are greater than or equal to -20 and less than or equal to 400 is the range that starts at -20 and has a width of 421. In other words, a range is a closed, one-dimensional interval with integer end points and on the domain of integers. The starting point (or "start") of a range can be any integer (see \code{start} below) but its "width" must be a non-negative integer (see \code{width} below). The ending point (or "end") of a range is equal to its "start" plus its "width" minus one (see \code{end} below). An "empty" range is a range that contains no value i.e. a range that has a null width. Depending on the context, it can be interpreted either as just the empty \emph{set} of integers or, more precisely, as the position \emph{between} its "end" and its "start" (note that for an empty range, the "end" equals the "start" minus one). The length of an IPosRanges object is the number of ranges in it, not the number of integer values in its ranges. An IPosRanges object is considered empty iff all its ranges are empty. IPosRanges objects have a vector-like semantic i.e. they only support single subscript subsetting (unlike, for example, standard R data frames which can be subsetted by row and by column). The IPosRanges class itself is a virtual class. The following classes derive directly from it: \link{IRanges}, \link{IPos}, \link{NCList}, and \link{GroupingRanges}. } \section{Methods}{ In the code snippets below, \code{x}, \code{y} and \code{object} are IPosRanges objects. Not all the functions described below will necessarily work with all kinds of IPosRanges derivatives but they should work at least for \link{IRanges} objects. Note that many more operations on IPosRanges objects are described in other man pages of the \pkg{IRanges} package. See for example the man page for \emph{intra range transformations} (e.g. \code{shift()}, see \code{?`\link{intra-range-methods}`}), or the man page for inter range transformations (e.g. \code{reduce()}, see \code{?`\link{inter-range-methods}`}), or the man page for \code{findOverlaps} methods (see \code{?`\link{findOverlaps-methods}`}), or the man page for \link{IntegerRangesList} objects where the \code{split} method for \link{IntegerRanges} derivatives is documented. \describe{ \item{\code{length(x)}:}{ The number of ranges in \code{x}. } \item{\code{start(x)}:}{ The start values of the ranges. This is an integer vector of the same length as \code{x}. } \item{\code{width(x)}:}{ The number of integer values in each range. This is a vector of non-negative integers of the same length as \code{x}. } \item{\code{end(x)}:}{ \code{start(x) + width(x) - 1L} } \item{\code{mid(x)}:}{ returns the midpoint of the range, \code{start(x) + floor((width(x) - 1)/2)}. } \item{\code{names(x)}:}{ \code{NULL} or a character vector of the same length as \code{x}. } \item{\code{tile(x, n, width, ...)}:}{ Splits each range in \code{x} into subranges as specified by \code{n} (number of ranges) or \code{width}. Only one of \code{n} or \code{width} can be specified. The return value is a \code{IRangesList} the same length as \code{x}. IPosRanges with a width less than the \code{width} argument are returned unchanged. } \item{\code{slidingWindows(x, width, step=1L)}:}{ Generates sliding windows within each range of \code{x}, of width \code{width}, and starting every \code{step} positions. The return value is a \code{IRangesList} the same length as \code{x}. IPosRanges with a width less than the \code{width} argument are returned unchanged. If the sliding windows do not exactly cover \code{x}, the last window is partial. } \item{\code{isEmpty(x)}:}{ Return a logical value indicating whether \code{x} is empty or not. } \item{\code{as.matrix(x, ...)}:}{ Convert \code{x} into a 2-column integer matrix containing \code{start(x)} and \code{width(x)}. Extra arguments (\code{...}) are ignored. } \item{\code{as.data.frame(x, row.names=NULL, optional=FALSE)}:}{ Convert \code{x} into a standard R data frame object. \code{row.names} must be \code{NULL} or a character vector giving the row names for the data frame, and \code{optional} is ignored. See \code{?\link{as.data.frame}} for more information about these arguments. } \item{\code{x[[i]]}:}{ Return integer vector \code{start(x[i]):end(x[i])} denoted by \code{i}. Subscript \code{i} can be a single integer or a character string. } \item{\code{x[i]}:}{ Return a new IPosRanges object (of the same type as \code{x}) made of the selected ranges. \code{i} can be a numeric vector, a logical vector, \code{NULL} or missing. If \code{x} is a \link{NormalIRanges} object and \code{i} a positive numeric subscript (i.e. a numeric vector of positive values), then \code{i} must be strictly increasing. } \item{\code{rep(x, times, length.out, each)}:}{ Repeats the values in \code{x} through one of the following conventions: \describe{ \item{\code{times}}{Vector giving the number of times to repeat each element if of length \code{length(x)}, or to repeat the IPosRanges elements if of length 1.} \item{\code{length.out}}{Non-negative integer. The desired length of the output vector.} \item{\code{each}}{Non-negative integer. Each element of \code{x} is repeated \code{each} times.} } } \item{\code{c(x, ..., ignore.mcols=FALSE)}:}{ Concatenate IPosRanges object \code{x} and the IPosRanges objects in \code{...} together. See \code{?\link[S4Vectors]{c}} in the \pkg{S4Vectors} package for more information about concatenating Vector derivatives. } \item{\code{x * y}:}{ The arithmetic operation \code{x * y} is for centered zooming. It symmetrically scales the width of \code{x} by \code{1/y}, where \code{y} is a numeric vector that is recycled as necessary. For example, \code{x * 2} results in ranges with half their previous width but with approximately the same midpoint. The ranges have been \dQuote{zoomed in}. If \code{y} is negative, it is equivalent to \code{x * (1/abs(y))}. Thus, \code{x * -2} would double the widths in \code{x}. In other words, \code{x} has been \dQuote{zoomed out}. } \item{\code{x + y}:}{ Expands the ranges in \code{x} on either side by the corresponding value in the numeric vector \code{y}. } \item{\code{show(x)}:}{ By default the \code{show} method displays 5 head and 5 tail lines. The number of lines can be altered by setting the global options \code{showHeadLines} and \code{showTailLines}. If the object length is less than the sum of the options, the full object is displayed. These options affect display of \link{IRanges}, \link{IPos}, \link[S4Vectors]{Hits}, \link[GenomicRanges]{GRanges}, \link[GenomicRanges]{GPos}, \link[GenomicAlignments]{GAlignments}, \link[Biostrings]{XStringSet} objects, and more... } } } \section{Normality}{ An IPosRanges object \code{x} is implicitly representing an arbitrary finite set of integers (that are not necessarily consecutive). This set is the set obtained by taking the union of all the values in all the ranges in \code{x}. This representation is clearly not unique: many different IPosRanges objects can be used to represent the same set of integers. However one and only one of them is guaranteed to be \emph{normal}. By definition an IPosRanges object is said to be \emph{normal} when its ranges are: (a) not empty (i.e. they have a non-null width); (b) not overlapping; (c) ordered from left to right; (d) not even adjacent (i.e. there must be a non empty gap between 2 consecutive ranges). Here is a simple algorithm to determine whether \code{x} is \emph{normal}: (1) if \code{length(x) == 0}, then \code{x} is normal; (2) if \code{length(x) == 1}, then \code{x} is normal iff \code{width(x) >= 1}; (3) if \code{length(x) >= 2}, then \code{x} is normal iff: \preformatted{ start(x)[i] <= end(x)[i] < start(x)[i+1] <= end(x)[i+1]} for every 1 <= \code{i} < \code{length(x)}. The obvious advantage of using a \emph{normal} IPosRanges object to represent a given finite set of integers is that it is the smallest in terms of number of ranges and therefore in terms of storage space. Also the fact that we impose its ranges to be ordered from left to right makes it unique for this representation. A special container (\link{NormalIRanges}) is provided for holding a \emph{normal} \link{IRanges} object: a \link{NormalIRanges} object is just an \link{IRanges} object that is guaranteed to be \emph{normal}. Here are some methods related to the notion of \emph{normal} IPosRanges: \describe{ \item{\code{isNormal(x)}:}{ Return TRUE or FALSE indicating whether \code{x} is \emph{normal} or not. } \item{\code{whichFirstNotNormal(x)}:}{ Return \code{NA} if \code{x} is \emph{normal}, or the smallest valid indice \code{i} in \code{x} for which \code{x[1:i]} is not \emph{normal}. } } } \author{H. Pagès and M. Lawrence} \seealso{ \itemize{ \item The \link{IRanges} class, a concrete IPosRanges direct subclass for storing a set of \emph{integer ranges}. \item The \link{IPos} class, an IPosRanges direct subclass for representing a set of \emph{integer positions} (i.e. \emph{integer ranges} of width 1). \item \link{IPosRanges-comparison} for comparing and ordering ranges. \item \link{findOverlaps-methods} for finding/counting overlapping ranges. \item \link{intra-range-methods} and \link{inter-range-methods} for \emph{intra range} and \emph{inter range} transformations of \link{IntegerRanges} derivatives. \item \link{coverage-methods} for computing the coverage of a set of ranges. \item \link{setops-methods} for set operations on ranges. \item \link{nearest-methods} for finding the nearest range neighbor. } } \examples{ ## --------------------------------------------------------------------- ## Basic manipulation ## --------------------------------------------------------------------- x <- IRanges(start=c(2:-1, 13:15), width=c(0:3, 2:0)) x length(x) start(x) width(x) end(x) isEmpty(x) as.matrix(x) as.data.frame(x) ## Subsetting: x[4:2] # 3 ranges x[-1] # 6 ranges x[FALSE] # 0 range x0 <- x[width(x) == 0] # 2 ranges isEmpty(x0) ## Use the replacement methods to resize the ranges: width(x) <- width(x) * 2 + 1 x end(x) <- start(x) # equivalent to width(x) <- 0 x width(x) <- c(2, 0, 4) x start(x)[3] <- end(x)[3] - 2 # resize the 3rd range x ## Name the elements: names(x) names(x) <- c("range1", "range2") x x[is.na(names(x))] # 5 ranges x[!is.na(names(x))] # 2 ranges ir <- IRanges(c(1,5), c(3,10)) ir*1 # no change ir*c(1,2) # zoom second range by 2X ir*-2 # zoom out 2X } \keyword{methods} \keyword{classes} IRanges/man/IPosRanges-comparison.Rd0000644000175100017510000002735014626176651020373 0ustar00biocbuildbiocbuild\name{IPosRanges-comparison} \alias{IPosRanges-comparison} \alias{pcompare} \alias{pcompare,IPosRanges,IPosRanges-method} \alias{rangeComparisonCodeToLetter} \alias{match,IPosRanges,IPosRanges-method} \alias{selfmatch,IPosRanges-method} \alias{is.unsorted,IPosRanges-method} \alias{order,IPosRanges-method} \title{Comparing and ordering ranges} \description{ Methods for comparing and/or ordering the ranges in \link{IPosRanges} derivatives (e.g. \link{IRanges}, \link{IPos}, or \link{NCList} objects). } \usage{ ## match() & selfmatch() ## --------------------- \S4method{match}{IPosRanges,IPosRanges}(x, table, nomatch=NA_integer_, incomparables=NULL, method=c("auto", "quick", "hash")) \S4method{selfmatch}{IPosRanges}(x, method=c("auto", "quick", "hash")) ## order() and related methods ## ---------------------------- \S4method{is.unsorted}{IPosRanges}(x, na.rm=FALSE, strictly=FALSE) \S4method{order}{IPosRanges}(..., na.last=TRUE, decreasing=FALSE, method=c("auto", "shell", "radix")) ## Generalized parallel comparison of 2 IPosRanges derivatives ## ----------------------------------------------------------- \S4method{pcompare}{IPosRanges,IPosRanges}(x, y) rangeComparisonCodeToLetter(code) } \arguments{ \item{x, table, y}{ \link{IPosRanges} derivatives e.g. \link{IRanges}, \link{IPos}, or \link{NCList} objects. } \item{nomatch}{ The value to be returned in the case when no match is found. It is coerced to an \code{integer}. } \item{incomparables}{ Not supported. } \item{method}{ For \code{match} and \code{selfmatch}: Use a Quicksort-based (\code{method="quick"}) or a hash-based (\code{method="hash"}) algorithm. The latter tends to give better performance, except maybe for some pathological input that we've not encountered so far. When \code{method="auto"} is specified, the most efficient algorithm will be used, that is, the hash-based algorithm if \code{length(x) <= 2^29}, otherwise the Quicksort-based algorithm. For \code{order}: The \code{method} argument is ignored. } \item{na.rm}{ Ignored. } \item{strictly}{ Logical indicating if the check should be for \emph{strictly} increasing values. } \item{...}{ One or more \link{IPosRanges} derivatives. The 2nd and following objects are used to break ties. } \item{na.last}{ Ignored. } \item{decreasing}{ \code{TRUE} or \code{FALSE}. } \item{code}{ A vector of codes as returned by \code{pcompare}. } } \details{ Two ranges of an \link{IPosRanges} derivative are considered equal iff they share the same start and width. \code{duplicated()} and \code{unique()} on an \link{IPosRanges} derivative are conforming to this. Note that with this definition, 2 empty ranges are generally not equal (they need to share the same start to be considered equal). This means that, when it comes to comparing ranges, an empty range is interpreted as a position between its end and start. For example, a typical usecase is comparison of insertion points defined along a string (like a DNA sequence) and represented as empty ranges. The "natural order" for the elements of an \link{IPosRanges} derivative is to order them (a) first by start and (b) then by width. This way, the space of integer ranges is totally ordered. \code{pcompare()}, \code{==}, \code{!=}, \code{<=}, \code{>=}, \code{<} and \code{>} on \link{IPosRanges} derivatives behave accordingly to this "natural order". \code{is.unsorted()}, \code{order()}, \code{sort()}, \code{rank()} on \link{IPosRanges} derivatives also behave accordingly to this "natural order". Finally, note that some \emph{inter range transformations} like \code{\link{reduce}} or \code{\link{disjoin}} also use this "natural order" implicitly when operating on \link{IPosRanges} derivatives. \describe{ \item{\code{pcompare(x, y)}:}{ Performs element-wise (aka "parallel") comparison of 2 \link{IPosRanges} objects of \code{x} and \code{y}, that is, returns an integer vector where the i-th element is a code describing how \code{x[i]} is qualitatively positioned with respect to \code{y[i]}. Here is a summary of the 13 predefined codes (and their letter equivalents) and their meanings: \preformatted{ -6 a: x[i]: .oooo....... 6 m: x[i]: .......oooo. y[i]: .......oooo. y[i]: .oooo....... -5 b: x[i]: ..oooo...... 5 l: x[i]: ......oooo.. y[i]: ......oooo.. y[i]: ..oooo...... -4 c: x[i]: ...oooo..... 4 k: x[i]: .....oooo... y[i]: .....oooo... y[i]: ...oooo..... -3 d: x[i]: ...oooooo... 3 j: x[i]: .....oooo... y[i]: .....oooo... y[i]: ...oooooo... -2 e: x[i]: ..oooooooo.. 2 i: x[i]: ....oooo.... y[i]: ....oooo.... y[i]: ..oooooooo.. -1 f: x[i]: ...oooo..... 1 h: x[i]: ...oooooo... y[i]: ...oooooo... y[i]: ...oooo..... 0 g: x[i]: ...oooooo... y[i]: ...oooooo... } Note that this way of comparing ranges is a refinement over the standard ranges comparison defined by the \code{==}, \code{!=}, \code{<=}, \code{>=}, \code{<} and \code{>} operators. In particular a code that is \code{< 0}, \code{= 0}, or \code{> 0}, corresponds to \code{x[i] < y[i]}, \code{x[i] == y[i]}, or \code{x[i] > y[i]}, respectively. The \code{pcompare} method for \link{IPosRanges} derivatives is guaranteed to return predefined codes only but methods for other objects (e.g. for \link[GenomicRanges]{GenomicRanges} objects) can return non-predefined codes. Like for the predefined codes, the sign of any non-predefined code must tell whether \code{x[i]} is less than, or greater than \code{y[i]}. } \item{\code{rangeComparisonCodeToLetter(x)}:}{ Translate the codes returned by \code{pcompare}. The 13 predefined codes are translated as follow: -6 -> a; -5 -> b; -4 -> c; -3 -> d; -2 -> e; -1 -> f; 0 -> g; 1 -> h; 2 -> i; 3 -> j; 4 -> k; 5-> l; 6 -> m. Any non-predefined code is translated to X. The translated codes are returned in a factor with 14 levels: a, b, ..., l, m, X. } \item{\code{match(x, table, nomatch=NA_integer_, method=c("auto", "quick", "hash"))}:}{ Returns an integer vector of the length of \code{x}, containing the index of the first matching range in \code{table} (or \code{nomatch} if there is no matching range) for each range in \code{x}. } \item{\code{selfmatch(x, method=c("auto", "quick", "hash"))}:}{ Equivalent to, but more efficient than, \code{match(x, x, method=method)}. } \item{\code{duplicated(x, fromLast=FALSE, method=c("auto", "quick", "hash"))}:}{ Determines which elements of \code{x} are equal to elements with smaller subscripts, and returns a logical vector indicating which elements are duplicates. \code{duplicated(x)} is equivalent to, but more efficient than, \code{duplicated(as.data.frame(x))} on an \link{IPosRanges} derivative. See \code{\link[base]{duplicated}} in the \pkg{base} package for more details. } \item{\code{unique(x, fromLast=FALSE, method=c("auto", "quick", "hash"))}:}{ Removes duplicate ranges from \code{x}. \code{unique(x)} is equivalent to, but more efficient than, \code{unique(as.data.frame(x))} on an \link{IPosRanges} derivative. See \code{\link[base]{unique}} in the \pkg{base} package for more details. } \item{\code{x \%in\% table}:}{ A shortcut for finding the ranges in \code{x} that match any of the ranges in \code{table}. Returns a logical vector of length equal to the number of ranges in \code{x}. } \item{\code{findMatches(x, table, method=c("auto", "quick", "hash"))}:}{ An enhanced version of \code{match} that returns all the matches in a \link{Hits} object. } \item{\code{countMatches(x, table, method=c("auto", "quick", "hash"))}:}{ Returns an integer vector of the length of \code{x} containing the number of matches in \code{table} for each element in \code{x}. } \item{\code{order(...)}:}{ Returns a permutation which rearranges its first argument (an \link{IPosRanges} derivative) into ascending order, breaking ties by further arguments (also \link{IPosRanges} derivatives). } \item{\code{sort(x)}:}{ Sorts \code{x}. See \code{\link[base]{sort}} in the \pkg{base} package for more details. } \item{\code{rank(x, na.last=TRUE, ties.method=c("average", "first", "random", "max", "min"))}:}{ Returns the sample ranks of the ranges in \code{x}. See \code{\link[base]{rank}} in the \pkg{base} package for more details. } } } \author{Hervé Pagès} \seealso{ \itemize{ \item The \link{IPosRanges} class. \item \link[S4Vectors]{Vector-comparison} in the \pkg{S4Vectors} package for general information about comparing, ordering, and tabulating vector-like objects. \item \link[GenomicRanges]{GenomicRanges-comparison} in the \pkg{GenomicRanges} package for comparing and ordering genomic ranges. \item \code{\link{findOverlaps}} for finding overlapping ranges. \item \link{intra-range-methods} and \link{inter-range-methods} for \emph{intra range} and \emph{inter range} transformations. \item \link{setops-methods} for set operations on \link{IRanges} objects. } } \examples{ ## --------------------------------------------------------------------- ## A. ELEMENT-WISE (AKA "PARALLEL") COMPARISON OF 2 IPosRanges ## DERIVATIVES ## --------------------------------------------------------------------- x0 <- IRanges(1:11, width=4) x0 y0 <- IRanges(6, 9) pcompare(x0, y0) pcompare(IRanges(4:6, width=6), y0) pcompare(IRanges(6:8, width=2), y0) pcompare(x0, y0) < 0 # equivalent to 'x0 < y0' pcompare(x0, y0) == 0 # equivalent to 'x0 == y0' pcompare(x0, y0) > 0 # equivalent to 'x0 > y0' rangeComparisonCodeToLetter(-10:10) rangeComparisonCodeToLetter(pcompare(x0, y0)) ## Handling of zero-width ranges (a.k.a. empty ranges): x1 <- IRanges(11:17, width=0) x1 pcompare(x1, x1[4]) pcompare(x1, IRanges(12, 15)) ## Note that x1[2] and x1[6] are empty ranges on the edge of non-empty ## range IRanges(12, 15). Even though -1 and 3 could also be considered ## valid codes for describing these configurations, pcompare() ## considers x1[2] and x1[6] to be *adjacent* to IRanges(12, 15), and ## thus returns codes -5 and 5: pcompare(x1[2], IRanges(12, 15)) # -5 pcompare(x1[6], IRanges(12, 15)) # 5 x2 <- IRanges(start=c(20L, 8L, 20L, 22L, 25L, 20L, 22L, 22L), width=c( 4L, 0L, 11L, 5L, 0L, 9L, 5L, 0L)) x2 which(width(x2) == 0) # 3 empty ranges x2[2] == x2[2] # TRUE x2[2] == x2[5] # FALSE x2 == x2[4] x2 >= x2[3] ## --------------------------------------------------------------------- ## B. match(), selfmatch(), %in%, duplicated(), unique() ## --------------------------------------------------------------------- table <- x2[c(2:4, 7:8)] match(x2, table) x2 \%in\% table duplicated(x2) unique(x2) ## --------------------------------------------------------------------- ## C. findMatches(), countMatches() ## --------------------------------------------------------------------- findMatches(x2, table) countMatches(x2, table) x2_levels <- unique(x2) countMatches(x2_levels, x2) ## --------------------------------------------------------------------- ## D. order() AND RELATED METHODS ## --------------------------------------------------------------------- is.unsorted(x2) order(x2) sort(x2) rank(x2, ties.method="first") } \keyword{methods} IRanges/man/IRanges-class.Rd0000644000175100017510000001427114626176651016642 0ustar00biocbuildbiocbuild\name{IRanges-class} \docType{class} % IRanges objects: \alias{class:IRanges} \alias{IRanges-class} \alias{parallel_slot_names,IRanges-method} % Accessors \alias{start,IRanges-method} \alias{width,IRanges-method} \alias{names,IRanges-method} \alias{start<-,IRanges-method} \alias{width<-,IRanges-method} \alias{end<-,IRanges-method} \alias{names<-,IRanges-method} \alias{ranges,IntegerRanges-method} \alias{isNormal,IRanges-method} % NormalIRanges objects: \alias{class:NormalIRanges} \alias{NormalIRanges-class} \alias{NormalIRanges} \alias{isEmpty,NormalIRanges-method} \alias{isNormal,NormalIRanges-method} \alias{max,NormalIRanges-method} \alias{min,NormalIRanges-method} % Coercion: \alias{coerce,IntegerRanges,IRanges-method} \alias{coerce,logical,IRanges-method} \alias{coerce,logical,NormalIRanges-method} \alias{coerce,integer,IRanges-method} \alias{coerce,integer,NormalIRanges-method} \alias{coerce,numeric,IRanges-method} \alias{coerce,numeric,NormalIRanges-method} \alias{coerce,character,IRanges-method} \alias{coerce,factor,IRanges-method} \alias{coerce,ANY,IntegerRanges-method} \title{IRanges and NormalIRanges objects} \description{ The IRanges class is a simple implementation of the \link{IntegerRanges} container where 2 integer vectors of the same length are used to store the start and width values. See the \link{IntegerRanges} virtual class for a formal definition of \link{IntegerRanges} objects and for their methods (all of them should work for IRanges objects). Some subclasses of the IRanges class are: NormalIRanges, \link{Views}, etc... A NormalIRanges object is just an IRanges object that is guaranteed to be "normal". See the Normality section in the man page for \link{IntegerRanges} objects for the definition and properties of "normal" \link{IntegerRanges} objects. } \section{Constructor}{ See \code{?`\link{IRanges-constructor}`}. } \section{Coercion}{ \describe{ \item{\code{ranges(x, use.names=FALSE, use.mcols=FALSE)}:}{ Squeeze the ranges out of \link{IntegerRanges} object \code{x} and return them in an IRanges object \emph{parallel} to \code{x} (i.e. same length as \code{x}). } \item{\code{as(from, "IRanges")}:}{ Creates an IRanges instance from an \link{IntegerRanges} derivative, or from a logical or integer vector. When \code{from} is a logical vector, the resulting IRanges object contains the indices for the runs of \code{TRUE} values. When \code{from} is an integer vector, the elements are either singletons or "increase by 1" sequences. } \item{\code{as(from, "NormalIRanges")}:}{ Creates a NormalIRanges instance from a logical or integer vector. When \code{from} is an integer vector, the elements must be strictly increasing. } } } \section{Concatenation}{ \describe{ \item{\code{c(x, ..., ignore.mcols=FALSE)}:}{ Concatenate IRanges object \code{x} and the IRanges objects in \code{...} together. See \code{?\link[S4Vectors]{c}} in the \pkg{S4Vectors} package for more information about concatenating Vector derivatives. } } } \section{Methods for NormalIRanges objects}{ \describe{ \item{\code{max(x)}:}{ The maximum value in the finite set of integers represented by \code{x}. } \item{\code{min(x)}:}{ The minimum value in the finite set of integers represented by \code{x}. } } } \author{Hervé Pagès} \seealso{ \itemize{ \item The \link[GenomicRanges]{GRanges} class in the \pkg{GenomicRanges} package for storing a set of \emph{genomic ranges}. \item The \link{IPos} class for representing a set of \emph{integer positions} (i.e. \emph{integer ranges} of width 1). \item \link{IPosRanges-comparison} for comparing and ordering integer ranges and/or positions. \item \link{IRanges-utils} for some utility functions for creating or modifying IRanges objects. \item \link{findOverlaps-methods} for finding overlapping integer ranges and/or positions. \item \link{intra-range-methods} and \link{inter-range-methods} for \emph{intra range} and \emph{inter range} transformations. \item \link{coverage-methods} for computing the coverage of a set of ranges and/or positions. \item \link{setops-methods} for set operations on IRanges objects. \item \link{nearest-methods} for finding the nearest integer range/position neighbor. } } \examples{ showClass("IRanges") # shows the known subclasses ## --------------------------------------------------------------------- ## A. MANIPULATING IRanges OBJECTS ## --------------------------------------------------------------------- ## All the methods defined for IntegerRanges objects work on IRanges ## objects. ## See ?IntegerRanges for some examples. ## Also see ?`IRanges-utils` and ?`setops-methods` for additional ## operations on IRanges objects. ## Concatenating IRanges objects ir1 <- IRanges(c(1, 10, 20), width=5) mcols(ir1) <- DataFrame(score=runif(3)) ir2 <- IRanges(c(101, 110, 120), width=10) mcols(ir2) <- DataFrame(score=runif(3)) ir3 <- IRanges(c(1001, 1010, 1020), width=20) mcols(ir3) <- DataFrame(value=runif(3)) some.iranges <- c(ir1, ir2) ## all.iranges <- c(ir1, ir2, ir3) ## This will raise an error all.iranges <- c(ir1, ir2, ir3, ignore.mcols=TRUE) stopifnot(is.null(mcols(all.iranges))) ## --------------------------------------------------------------------- ## B. A NOTE ABOUT PERFORMANCE ## --------------------------------------------------------------------- ## Using an IRanges object for storing a big set of ranges is more ## efficient than using a standard R data frame: N <- 2000000L # nb of ranges W <- 180L # width of each range start <- 1L end <- 50000000L set.seed(777) range_starts <- sort(sample(end-W+1L, N)) range_widths <- rep.int(W, N) ## Instantiation is faster system.time(x <- IRanges(start=range_starts, width=range_widths)) system.time(y <- data.frame(start=range_starts, width=range_widths)) ## Subsetting is faster system.time(x16 <- x[c(TRUE, rep.int(FALSE, 15))]) system.time(y16 <- y[c(TRUE, rep.int(FALSE, 15)), ]) ## Internal representation is more compact object.size(x16) object.size(y16) } \keyword{methods} \keyword{classes} IRanges/man/IRanges-constructor.Rd0000644000175100017510000001560514626176651020124 0ustar00biocbuildbiocbuild\name{IRanges-constructor} \alias{IRanges-constructor} \alias{IRanges} \alias{solveUserSEW} \title{The IRanges constructor and supporting functions} \description{ The \code{IRanges} function is a constructor that can be used to create IRanges instances. \code{solveUserSEW} is a low-level utility function for solving a set of user-supplied start/end/width triplets. } \usage{ ## IRanges constructor: IRanges(start=NULL, end=NULL, width=NULL, names=NULL, ...) ## Supporting functions (not for the end user): solveUserSEW(refwidths, start=NA, end=NA, width=NA, rep.refwidths=FALSE, translate.negative.coord=TRUE, allow.nonnarrowing=FALSE) } \arguments{ \item{start, end, width}{ For \code{IRanges}: \code{NULL} or vector of integers. For \code{solveUserSEW}: vector of integers (eventually with NAs). } \item{names}{ A character vector or \code{NULL}. } \item{...}{ Metadata columns to set on the IRanges object. All the metadata columns must be vector-like objects of the same length as the object to construct. } \item{refwidths}{ Vector of non-NA non-negative integers containing the reference widths. } \item{rep.refwidths}{ \code{TRUE} or \code{FALSE}. Use of \code{rep.refwidths=TRUE} is supported only when \code{refwidths} is of length 1. } \item{translate.negative.coord, allow.nonnarrowing}{ \code{TRUE} or \code{FALSE}. } } \section{IRanges constructor}{ Return the IRanges object containing the ranges specified by \code{start}, \code{end} and \code{width}. Input falls into one of two categories: \describe{ \item{Category 1}{ \code{start}, \code{end} and \code{width} are numeric vectors (or NULLs). If necessary they are recycled to the length of the longest (NULL arguments are filled with NAs). After this recycling, each row in the 3-column matrix obtained by binding those 3 vectors together is "solved" i.e. NAs are treated as unknown in the equation \code{end = start + width - 1}. Finally, the solved matrix is returned as an \link{IRanges} instance. } \item{Category 2}{ The \code{start} argument is a logical vector or logical Rle object and \code{IRanges(start)} produces the same result as \code{as(start, "IRanges")}. Note that, in that case, the returned IRanges instance is guaranteed to be normal. } } Note that the \code{names} argument is never recycled (to remain consistent with what \code{`names<-`} does on standard vectors). } \section{Supporting functions}{ \describe{ \item{\code{solveUserSEW(refwidths, start=NA, end=NA, width=NA, rep.refwidths=FALSE, translate.negative.coord=TRUE, allow.nonnarrowing=FALSE)}:}{ Use of \code{rep.refwidths=TRUE} is supported only when \code{refwidths} is of length 1. If \code{rep.refwidths=FALSE} (the default) then \code{start}, \code{end} and \code{width} are recycled to the length of \code{refwidths} (it's an error if one of them is longer than \code{refwidths}, or is of zero length while \code{refwidths} is not). If \code{rep.refwidths=TRUE} then \code{refwidths} is first replicated L times where L is the length of the longest of \code{start}, \code{end} and \code{width}. After this replication, \code{start}, \code{end} and \code{width} are recycled to the new length of \code{refwidths} (L) (it's an error if one of them is of zero length while L is != 0). From now, \code{refwidths}, \code{start}, \code{end} and \code{width} are integer vectors of equal lengths. Each row in the 3-column matrix obtained by binding those 3 vectors together must contain at least one NA (otherwise an error is returned). Then each row is "solved" i.e. the 2 following transformations are performed (\code{i} is the indice of the row): (1) if \code{translate.negative.coord} is TRUE then a negative value of \code{start[i]} or \code{end[i]} is considered to be a \code{-refwidths[i]}-based coordinate so \code{refwidths[i]+1} is added to it to make it 1-based; (2) the NAs in the row are treated as unknowns which values are deduced from the known values in the row and from \code{refwidths[i]}. The exact rules for (2) are the following. Rule (2a): if the row contains at least 2 NAs, then \code{width[i]} must be one of them (otherwise an error is returned), and if \code{start[i]} is one of them it is replaced by 1, and if \code{end[i]} is one of them it is replaced by \code{refwidths[i]}, and finally \code{width[i]} is replaced by \code{end[i] - start[i] + 1}. Rule (2b): if the row contains only 1 NA, then it is replaced by the solution of the \code{width[i] == end[i] - start[i] + 1} equation. Finally, the set of solved rows is returned as an \link{IRanges} object of the same length as \code{refwidths} (after replication if \code{rep.refwidths=TRUE}). Note that an error is raised if either (1) the set of user-supplied start/end/width values is invalid or (2) \code{allow.nonnarrowing} is FALSE and the ranges represented by the solved start/end/width values are not narrowing the ranges represented by the user-supplied start/end/width values. } } } \author{Hervé Pagès} \seealso{ \itemize{ \item \link{IRanges-class} for the IRanges class. \item \code{\link{narrow}} } } \examples{ ## --------------------------------------------------------------------- ## A. USING THE IRanges() CONSTRUCTOR ## --------------------------------------------------------------------- IRanges(start=11, end=rep.int(20, 5)) IRanges(start=11, width=rep.int(20, 5)) IRanges(-2, 20) # only one range IRanges(start=c(2, 0, NA), end=c(NA, NA, 14), width=11:0) IRanges() # IRanges instance of length zero IRanges(names=character()) ## With ranges specified as strings: IRanges(c("11-20", "15-14", "-4--2")) ## With logical input: x <- IRanges(c(FALSE, TRUE, TRUE, FALSE, TRUE)) # logical vector input isNormal(x) # TRUE x <- IRanges(Rle(1:30) \%\% 5 <= 2) # logical Rle input isNormal(x) # TRUE ## --------------------------------------------------------------------- ## B. USING solveUserSEW() ## --------------------------------------------------------------------- refwidths <- c(5:3, 6:7) refwidths solveUserSEW(refwidths) solveUserSEW(refwidths, start=4) solveUserSEW(refwidths, end=3, width=2) solveUserSEW(refwidths, start=-3) solveUserSEW(refwidths, start=-3, width=2) solveUserSEW(refwidths, end=-4) ## The start/end/width arguments are recycled: solveUserSEW(refwidths, start=c(3, -4, NA), end=c(-2, NA)) ## Using 'rep.refwidths=TRUE': solveUserSEW(10, start=-(1:6), rep.refwidths=TRUE) solveUserSEW(10, end=-(1:6), width=3, rep.refwidths=TRUE) } \keyword{utilities} IRanges/man/IRanges-internals.Rd0000644000175100017510000000042114626176651017524 0ustar00biocbuildbiocbuild\name{IRanges internals} \alias{coerce,ANY,vector-method} \title{IRanges internals} \description{ Objects, classes and methods defined in the \pkg{IRanges} package that are not intended to be used directly. } \keyword{internal} \keyword{classes} \keyword{methods} IRanges/man/IRanges-utils.Rd0000644000175100017510000001050514626176651016671 0ustar00biocbuildbiocbuild\name{IRanges-utils} \alias{IRanges-utils} \alias{successiveIRanges} \alias{breakInChunks} \alias{whichAsIRanges} % Coercion: \alias{asNormalIRanges} \alias{coerce,IRanges,NormalIRanges-method} \title{IRanges utility functions} \description{ Utility functions for creating or modifying \link{IRanges} objects. } \usage{ ## Create an IRanges instance: successiveIRanges(width, gapwidth=0, from=1) breakInChunks(totalsize, nchunk, chunksize) ## Turn a logical vector into a set of ranges: whichAsIRanges(x) ## Coercion: asNormalIRanges(x, force=TRUE) } \arguments{ \item{width}{ A vector of non-negative integers (with no NAs) specifying the widths of the ranges to create. } \item{gapwidth}{ A single integer or an integer vector with one less element than the \code{width} vector specifying the widths of the gaps separating one range from the next one. } \item{from}{ A single integer specifying the starting position of the first range. } \item{totalsize}{ A single non-negative integer. The total size of the object to break. } \item{nchunk}{ A single positive integer. The number of chunks. } \item{chunksize}{ A single positive integer. The size of the chunks (last chunk might be smaller). } \item{x}{ A logical vector for \code{whichAsIRanges}. An \link{IRanges} object for \code{asNormalIRanges}. } \item{force}{ \code{TRUE} or \code{FALSE}. Should \code{x} be turned into a \link{NormalIRanges} object even if \code{isNormal(x)} is \code{FALSE}? } } \details{ \code{successiveIRanges} returns an \link{IRanges} instance containing the ranges that have the widths specified in the \code{width} vector and are separated by the gaps specified in \code{gapwidth}. The first range starts at position \code{from}. When \code{gapwidth=0} and \code{from=1} (the defaults), the returned IRanges can be seen as a partitioning of the 1:sum(width) interval. See \code{?Partitioning} for more details on this. \code{breakInChunks} returns a \link{PartitioningByEnd} object describing the "chunks" that result from breaking a vector-like object of length \code{totalsize} in the chunks described by \code{nchunk} or \code{chunksize}. \code{whichAsIRanges} returns an \link{IRanges} instance containing all of the ranges where \code{x} is \code{TRUE}. If \code{force=TRUE} (the default), then \code{asNormalIRanges} will turn \code{x} into a \link{NormalIRanges} instance by reordering and reducing the set of ranges if necessary (i.e. only if \code{isNormal(x)} is \code{FALSE}, otherwise the set of ranges will be untouched). If \code{force=FALSE}, then \code{asNormalIRanges} will turn \code{x} into a \link{NormalIRanges} instance only if \code{isNormal(x)} is \code{TRUE}, otherwise it will raise an error. Note that when \code{force=FALSE}, the returned object is guaranteed to contain exactly the same set of ranges than \code{x}. \code{as(x, "NormalIRanges")} is equivalent to \code{asNormalIRanges(x, force=TRUE)}. } \author{Hervé Pagès} \seealso{ \itemize{ \item \link{IRanges} objects. \item \link{Partitioning} objects. \item \code{\link{equisplit}} for splitting a list-like object into a specified number of partitions. \item \link{intra-range-methods} and \link{inter-range-methods} for intra range and inter range transformations. \item \link{setops-methods} for performing set operations on \link{IRanges} objects. \item \code{\link{solveUserSEW}} \item \code{\link{successiveViews}} } } \examples{ vec <- as.integer(c(19, 5, 0, 8, 5)) successiveIRanges(vec) breakInChunks(600999, chunksize=50000) # chunks of size 50000 (last # chunk is smaller) whichAsIRanges(vec >= 5) x <- IRanges(start=c(-2L, 6L, 9L, -4L, 1L, 0L, -6L, 10L), width=c( 5L, 0L, 6L, 1L, 4L, 3L, 2L, 3L)) asNormalIRanges(x) # 3 non-empty ranges ordered from left to right and # separated by gaps of width >= 1. ## More on normality: example(`IRanges-class`) isNormal(x16) # FALSE if (interactive()) x16 <- asNormalIRanges(x16) # Error! whichFirstNotNormal(x16) # 57 isNormal(x16[1:56]) # TRUE xx <- asNormalIRanges(x16[1:56]) class(xx) max(xx) min(xx) } \keyword{utilities} IRanges/man/IRangesList-class.Rd0000644000175100017510000001452014626176651017473 0ustar00biocbuildbiocbuild\name{IRangesList-class} \docType{class} % IRangesList objects: \alias{class:IRangesList} \alias{class:CompressedIRangesList} \alias{class:SimpleIRangesList} \alias{IRangesList-class} \alias{CompressedIRangesList-class} \alias{SimpleIRangesList-class} \alias{IRangesList} \alias{CompressedIRangesList} \alias{SimpleIRangesList} % accessors \alias{end,CompressedRangesList-method} \alias{width,CompressedRangesList-method} \alias{start,CompressedRangesList-method} \alias{pos,CompressedPosList-method} % coercion \alias{coerce,list,CompressedIRangesList-method} \alias{coerce,list,SimpleIRangesList-method} \alias{coerce,list,IRangesList-method} \alias{coerce,List,CompressedIRangesList-method} \alias{coerce,IntegerRanges,CompressedIRangesList-method} \alias{coerce,List,SimpleIRangesList-method} \alias{coerce,SimpleList,SimpleIRangesList-method} \alias{coerce,IntegerRangesList,SimpleIRangesList-method} \alias{coerce,SimpleIntegerRangesList,SimpleIRangesList-method} \alias{coerce,List,IRangesList-method} \alias{coerce,CompressedRleList,CompressedIRangesList-method} % NormalIRangesList objects: \alias{class:NormalIRangesList} \alias{class:SimpleNormalIRangesList} \alias{class:CompressedNormalIRangesList} \alias{NormalIRangesList-class} \alias{SimpleNormalIRangesList-class} \alias{CompressedNormalIRangesList-class} \alias{NormalIRangesList} \alias{SimpleNormalIRangesList} \alias{CompressedNormalIRangesList} \alias{isNormal,SimpleIRangesList-method} \alias{isNormal,CompressedIRangesList-method} % general \alias{min,SimpleNormalIRangesList-method} \alias{min,CompressedNormalIRangesList-method} \alias{max,SimpleNormalIRangesList-method} \alias{max,CompressedNormalIRangesList-method} \alias{summary,CompressedIRangesList-method} % more coercions \alias{as.list,CompressedNormalIRangesList-method} \alias{unlist,SimpleNormalIRangesList-method} \alias{coerce,IntegerRangesList,SimpleNormalIRangesList-method} \alias{coerce,SimpleIRangesList,SimpleNormalIRangesList-method} \alias{coerce,NormalIRangesList,CompressedNormalIRangesList-method} \alias{coerce,CompressedIRangesList,CompressedNormalIRangesList-method} \alias{coerce,IntegerRangesList,CompressedNormalIRangesList-method} \alias{coerce,IntegerRangesList,NormalIRangesList-method} \alias{coerce,LogicalList,NormalIRangesList-method} \alias{coerce,LogicalList,SimpleNormalIRangesList-method} \alias{coerce,LogicalList,CompressedNormalIRangesList-method} \alias{coerce,RleList,NormalIRangesList-method} \alias{coerce,RleList,SimpleNormalIRangesList-method} \alias{coerce,RleList,CompressedNormalIRangesList-method} \title{List of IRanges and NormalIRanges} \description{\code{\linkS4class{IRangesList}} and \code{\linkS4class{NormalIRangesList}} objects for storing \code{\linkS4class{IRanges}} and \code{\linkS4class{NormalIRanges}} objects respectively.} \section{Constructor}{ \describe{ \item{\code{IRangesList(..., compress=TRUE)}:}{ The \code{...} argument accepts either a comma-separated list of \code{IRanges} objects, or a single \code{LogicalList} / logical \code{RleList} object, or 2 elements named \code{start} and \code{end} each of them being either a list of integer vectors or an IntegerList object. When \code{IRanges} objects are supplied, each of them becomes an element in the new \code{IRangesList}, in the same order, which is analogous to the \code{\link{list}} constructor. If \code{compress}, the internal storage of the data is compressed. } } } \section{Coercion}{ In the code snippets below, \code{from} is a \emph{list-like} object. \describe{ \item{\code{as(from, "SimpleIRangesList")}:}{ Coerces \code{from}, to a \code{\linkS4class{SimpleIRangesList}}, requiring that all \code{IntegerRanges} elements are coerced to internal \code{IRanges} elements. This is a convenient way to ensure that all \code{IntegerRanges} have been imported into R (and that there is no unwanted overhead when accessing them). } \item{\code{as(from, "CompressedIRangesList")}:}{ Coerces \code{from}, to a \code{\linkS4class{CompressedIRangesList}}, requiring that all \code{IntegerRanges} elements are coerced to internal \code{IRanges} elements. This is a convenient way to ensure that all \code{IntegerRanges} have been imported into R (and that there is no unwanted overhead when accessing them). } \item{\code{as(from, "SimpleNormalIRangesList")}:}{ Coerces \code{from}, to a \code{\linkS4class{SimpleNormalIRangesList}}, requiring that all \code{IntegerRanges} elements are coerced to internal \code{NormalIRanges} elements. } \item{\code{as(from, "CompressedNormalIRangesList")}:}{ Coerces \code{from}, to a \code{\linkS4class{CompressedNormalIRangesList}}, requiring that all \code{IntegerRanges} elements are coerced to internal \code{NormalIRanges} elements. } } In the code snippet below, \code{x} is an \code{IRangesList} object. \describe{ \item{\code{unlist(x)}:}{ Unlists \code{x}, an \code{IRangesList}, by concatenating all of the ranges into a single \code{IRanges} instance. If the length of \code{x} is zero, an empty \code{IRanges} is returned. } } } \section{Methods for NormalIRangesList objects}{ \describe{ \item{\code{max(x)}:}{ An integer vector containing the maximum values of each of the elements of \code{x}. } \item{\code{min(x)}:}{ An integer vector containing the minimum values of each of the elements of \code{x}. } } } \author{Michael Lawrence and Hervé Pagès} \seealso{ \itemize{ \item \code{\link{IntegerRangesList}}, the parent of this class, for more functionality. \item \link{intra-range-methods} and \link{inter-range-methods} for \emph{intra range} and \emph{inter range} transformations. \item \link{setops-methods} for set operations on IRangesList objects. } } \examples{ range1 <- IRanges(start=c(1,2,3), end=c(5,2,8)) range2 <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5)) named <- IRangesList(one = range1, two = range2) length(named) # 2 names(named) # "one" and "two" named[[1]] # range1 unnamed <- IRangesList(range1, range2) names(unnamed) # NULL x <- IRangesList(start=list(c(1,2,3), c(15,45,20,1)), end=list(c(5,2,8), c(15,100,80,5))) as.list(x) } \keyword{classes} \keyword{methods} IRanges/man/IntegerRanges-class.Rd0000644000175100017510000000053514626176651020045 0ustar00biocbuildbiocbuild\name{IntegerRanges-class} \docType{class} % Classes: \alias{class:IntegerRanges} \alias{IntegerRanges-class} \alias{IntegerRanges} \title{IntegerRanges objects} \description{ The IntegerRanges \emph{virtual} class is a general container for storing ranges on the space of integers. } \details{ TODO } \keyword{methods} \keyword{classes} IRanges/man/IntegerRangesList-class.Rd0000644000175100017510000001350014626176651020675 0ustar00biocbuildbiocbuild\name{IntegerRangesList-class} \docType{class} % Classes \alias{class:IntegerRangesList} \alias{IntegerRangesList-class} \alias{IntegerRangesList} \alias{class:SimpleIntegerRangesList} \alias{SimpleIntegerRangesList-class} \alias{SimpleIntegerRangesList} % Accessors \alias{start,RangesList-method} \alias{start<-,IntegerRangesList-method} \alias{end,RangesList-method} \alias{end<-,IntegerRangesList-method} \alias{width,RangesList-method} \alias{width<-,IntegerRangesList-method} \alias{pos,PosList-method} \alias{space} \alias{space,IntegerRangesList-method} \alias{isNormal,IntegerRangesList-method} \alias{whichFirstNotNormal,IntegerRangesList-method} % Show \alias{show,IntegerRangesList-method} % Merge \alias{merge,IntegerRangesList,missing-method} \alias{merge,missing,IntegerRangesList-method} \alias{merge,IntegerRangesList,IntegerRangesList-method} \title{IntegerRangesList objects} \description{ The IntegerRangesList \emph{virtual} class is a general container for storing a list of \link{IntegerRanges} objects. Most users are probably more interested in the \link{IRangesList} container, an IntegerRangesList derivative for storing a list of \link{IRanges} objects. } \details{ The place of IntegerRangesList in the \emph{Vector class hierarchy}: \preformatted{ Vector ^ | List ^ | RangesList ^ ^ / \ / \ / \ / \ / \ / \ IntegerRangesList GenomicRangesList ^ ^ | | IRangesList GRangesList ^ ^ ^ ^ / \ / \ / \ / \ / \ / \ SimpleIRangesList \ SimpleGRangesList \ CompressedIRangesList CompressedGRangesList } Note that the \emph{Vector class hierarchy} has many more classes. In particular \link[S4Vectors]{Vector}, \link[S4Vectors]{List}, \link[IRanges]{RangesList}, and \link[IRanges]{IntegerRangesList} have other subclasses not shown here. } \section{Accessors}{ In the code snippets below, \code{x} is a \code{IntegerRangesList} object. All of these accessors collapse over the spaces: \describe{ \item{\code{start(x), start(x) <- value}:}{ Get or set the starts of the ranges. When setting the starts, \code{value} can be an integer vector of length \code{sum(elementNROWS(x))} or an IntegerList object of length \code{length(x)} and names \code{names(x)}.} \item{\code{end(x), end(x) <- value}:}{ Get or set the ends of the ranges. When setting the ends, \code{value} can be an integer vector of length \code{sum(elementNROWS(x))} or an IntegerList object of length \code{length(x)} and names \code{names(x)}.} \item{\code{width(x), width(x) <- value}:}{ Get or set the widths of the ranges. When setting the widths, \code{value} can be an integer vector of length \code{sum(elementNROWS(x))} or an IntegerList object of length \code{length(x)} and names \code{names(x)}.} \item{\code{space(x)}:}{ Gets the spaces of the ranges as a character vector. This is equivalent to \code{names(x)}, except each name is repeated according to the length of its element. } } } \section{Coercion}{ In the code snippet below, \code{x} is an \code{IntegerRangesList} object. \describe{ \item{\code{as.data.frame(x, row.names = NULL, optional = FALSE, ..., value.name = "value", use.outer.mcols = FALSE, group_name.as.factor = FALSE)}:}{ Coerces \code{x} to a \code{data.frame}. See as.data.frame on the \code{List} man page for details (?\code{List}). } } In the following code snippet, \code{from} is something other than a \code{IntegerRangesList}: \describe{ \item{\code{as(from, "IntegerRangesList")}:}{ When \code{from} is a \code{IntegerRanges}, analogous to \code{as.list} on a vector. } } } \section{Arithmetic Operations}{ Any arithmetic operation, such as \code{x + y}, \code{x * y}, etc, where \code{x} is a \code{IntegerRangesList}, is performed identically on each element. Currently, \code{IntegerRanges} supports only the \code{*} operator, which zooms the ranges by a numeric factor. } \author{M. Lawrence & H. Pagès} \seealso{ \itemize{ \item \link{IRangesList} objects. \item \link{IntegerRanges} and \link{IRanges} objects. } } \examples{ ## --------------------------------------------------------------------- ## Basic manipulation ## --------------------------------------------------------------------- range1 <- IRanges(start=c(1, 2, 3), end=c(5, 2, 8)) range2 <- IRanges(start=c(15, 45, 20, 1), end=c(15, 100, 80, 5)) named <- IRangesList(one = range1, two = range2) length(named) # 2 start(named) # same as start(c(range1, range2)) names(named) # "one" and "two" named[[1]] # range1 unnamed <- IRangesList(range1, range2) names(unnamed) # NULL # edit the width of the ranges in the list edited <- named width(edited) <- rep(c(3,2), elementNROWS(named)) edited # same as list(range1, range2) as.list(IRangesList(range1, range2)) # coerce to data.frame as.data.frame(named) IRangesList(range1, range2) ## zoom in 2X collection <- IRangesList(one = range1, range2) collection * 2 } \keyword{methods} \keyword{classes} IRanges/man/MaskCollection-class.Rd0000644000175100017510000001600014626176651020211 0ustar00biocbuildbiocbuild\name{MaskCollection-class} \docType{class} % Classes: \alias{class:MaskCollection} \alias{MaskCollection-class} \alias{MaskCollection} % Basic accessor methods: \alias{nir_list} \alias{nir_list,MaskCollection-method} \alias{length,MaskCollection-method} \alias{width,MaskCollection-method} \alias{active} \alias{active,MaskCollection-method} \alias{active<-} \alias{active<-,MaskCollection-method} \alias{names,MaskCollection-method} \alias{names<-,MaskCollection-method} \alias{desc} \alias{desc,MaskCollection-method} \alias{desc<-} \alias{desc<-,MaskCollection-method} % Constructor: \alias{Mask} % Other methods: \alias{max,MaskCollection-method} \alias{min,MaskCollection-method} \alias{maskedwidth} \alias{maskedwidth,MaskCollection-method} \alias{maskedratio} \alias{maskedratio,MaskCollection-method} % Subsetting and appending: \alias{append,MaskCollection,MaskCollection-method} % Endomorphisms: \alias{collapse} \alias{collapse,MaskCollection-method} % Coercion: \alias{coerce,MaskCollection,NormalIRanges-method} % "show" method: \alias{MaskCollection.show_frame} \alias{show,MaskCollection-method} \title{MaskCollection objects} \description{ The MaskCollection class is a container for storing a collection of masks that can be used to mask regions in a sequence. } \details{ In the context of the Biostrings package, a mask is a set of regions in a sequence that need to be excluded from some computation. For example, when calling \code{\link[Biostrings:letterFrequency]{alphabetFrequency}} or \code{\link[Biostrings]{matchPattern}} on a chromosome sequence, you might want to exclude some regions like the centromere or the repeat regions. This can be achieved by putting one or several masks on the sequence before calling \code{\link[Biostrings:letterFrequency]{alphabetFrequency}} on it. A MaskCollection object is a vector-like object that represents such set of masks. Like standard R vectors, it has a "length" which is the number of masks contained in it. But unlike standard R vectors, it also has a "width" which determines the length of the sequences it can be "put on". For example, a MaskCollection object of width 20000 can only be put on an \link[Biostrings:XString-class]{XString} object of 20000 letters. Each mask in a MaskCollection object \code{x} is just a finite set of integers that are >= 1 and <= \code{width(x)}. When "put on" a sequence, these integers indicate the positions of the letters to mask. Internally, each mask is represented by a \link{NormalIRanges} object. } \section{Basic accessor methods}{ In the code snippets below, \code{x} is a MaskCollection object. \describe{ \item{\code{length(x)}:}{ The number of masks in \code{x}. } \item{\code{width(x)}:}{ The common with of all the masks in \code{x}. This determines the length of the sequences that \code{x} can be "put on". } \item{\code{active(x)}:}{ A logical vector of the same length as \code{x} where each element indicates whether the corresponding mask is active or not. } \item{\code{names(x)}:}{ \code{NULL} or a character vector of the same length as \code{x}. } \item{\code{desc(x)}:}{ \code{NULL} or a character vector of the same length as \code{x}. } \item{\code{nir_list(x)}:}{ A list of the same length as \code{x}, where each element is a \link{NormalIRanges} object representing a mask in \code{x}. } } } \section{Constructor}{ \describe{ \item{\code{Mask(mask.width, start=NULL, end=NULL, width=NULL)}:}{ Return a single mask (i.e. a MaskCollection object of length 1) of width \code{mask.width} (a single integer >= 1) and masking the ranges of positions specified by \code{start}, \code{end} and \code{width}. See the \code{\link{IRanges}} constructor (\code{?\link{IRanges}}) for how \code{start}, \code{end} and \code{width} can be specified. Note that the returned mask is active and unnamed. } } } \section{Other methods}{ In the code snippets below, \code{x} is a MaskCollection object. \describe{ \item{\code{isEmpty(x)}:}{ Return a logical vector of the same length as \code{x}, indicating, for each mask in \code{x}, whether it's empty or not. } \item{\code{max(x)}:}{ The greatest (or last, or rightmost) masked position for each mask. This is a numeric vector of the same length as \code{x}. } \item{\code{min(x)}:}{ The smallest (or first, or leftmost) masked position for each mask. This is a numeric vector of the same length as \code{x}. } \item{\code{maskedwidth(x)}:}{ The number of masked position for each mask. This is an integer vector of the same length as \code{x} where all values are >= 0 and <= \code{width(x)}. } \item{\code{maskedratio(x)}:}{ \code{maskedwidth(x) / width(x)} } } } \section{Subsetting and appending}{ In the code snippets below, \code{x} and \code{values} are MaskCollection objects. \describe{ \item{\code{x[i]}:}{ Return a new MaskCollection object made of the selected masks. Subscript \code{i} can be a numeric, logical or character vector. } \item{\code{x[[i, exact=TRUE]]}:}{ Extract the mask selected by \code{i} as a \link{NormalIRanges} object. Subscript \code{i} can be a single integer or a character string. } \item{\code{append(x, values, after=length(x))}:}{ Add masks in \code{values} to \code{x}. } } } \section{Other methods}{ In the code snippets below, \code{x} is a MaskCollection object. \describe{ \item{\code{collapse(x)}:}{ Return a MaskCollection object of length 1 obtained by collapsing all the active masks in \code{x}. } } } \author{Hervé Pagès} \seealso{ \link{NormalIRanges-class}, \link{read.Mask}, \link[Biostrings]{MaskedXString-class}, \code{\link{reverse}}, \code{\link[Biostrings]{alphabetFrequency}}, \code{\link[Biostrings]{matchPattern}} } \examples{ ## Making a MaskCollection object: mask1 <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2)) mask2 <- Mask(mask.width=29, start=c(3, 10, 27), width=c(5, 8, 1)) mask3 <- Mask(mask.width=29, start=c(7, 12), width=c(2, 4)) mymasks <- append(append(mask1, mask2), mask3) mymasks length(mymasks) width(mymasks) collapse(mymasks) ## Names and descriptions: names(mymasks) <- c("A", "B", "C") # names should be short and unique... mymasks mymasks[c("C", "A")] # ...to make subsetting by names easier desc(mymasks) <- c("you can be", "more verbose", "here") mymasks[-2] ## Activate/deactivate masks: active(mymasks)["B"] <- FALSE mymasks collapse(mymasks) active(mymasks) <- FALSE # deactivate all masks mymasks active(mymasks)[-1] <- TRUE # reactivate all masks except mask 1 active(mymasks) <- !active(mymasks) # toggle all masks ## Other advanced operations: mymasks[[2]] length(mymasks[[2]]) mymasks[[2]][-3] append(mymasks[-2], gaps(mymasks[2])) } \keyword{methods} \keyword{classes} IRanges/man/NCList-class.Rd0000644000175100017510000001665014626176651016451 0ustar00biocbuildbiocbuild\name{NCList-class} \docType{class} % NCList objects: \alias{class:NCList} \alias{NCList-class} \alias{NCList} \alias{length,NCList-method} \alias{names,NCList-method} \alias{start,NCList-method} \alias{end,NCList-method} \alias{width,NCList-method} \alias{coerce,IntegerRanges,NCList-method} \alias{extractROWS,NCList,ANY-method} \alias{bindROWS,NCList-method} % NCLists objects: \alias{class:NCLists} \alias{NCLists-class} \alias{NCLists} \alias{parallel_slot_names,NCLists-method} \alias{ranges,NCLists-method} \alias{length,NCLists-method} \alias{names,NCLists-method} \alias{start,NCLists-method} \alias{end,NCLists-method} \alias{width,NCLists-method} \alias{elementNROWS,NCLists-method} \alias{coerce,NCLists,CompressedIRangesList-method} \alias{coerce,NCLists,IRangesList-method} \alias{coerce,IntegerRangesList,NCLists-method} \title{Nested Containment List objects} \description{ The NCList class is a container for storing the Nested Containment List representation of a \link{IntegerRanges} object. Preprocessing a \link{IntegerRanges} object as a Nested Containment List allows efficient overlap-based operations like \code{\link{findOverlaps}}. The NCLists class is a container for storing a collection of NCList objects. An NCLists object is typically the result of preprocessing each list element of a \link{IntegerRangesList} object as a Nested Containment List. Like with NCList, the NCLists object can then be used for efficient overlap-based operations. To preprocess a \link{IntegerRanges} or \link{IntegerRangesList} object, simply call the \code{NCList} or \code{NCLists} constructor function on it. } \usage{ NCList(x, circle.length=NA_integer_) NCLists(x, circle.length=NA_integer_) } \arguments{ \item{x}{ The \link{IntegerRanges} or \link{IntegerRangesList} object to preprocess. } \item{circle.length}{ Use only if the space (or spaces if \code{x} is a \link{IntegerRangesList} object) on top of which the ranges in \code{x} are defined needs (need) to be considered circular. If that's the case, then use \code{circle.length} to specify the length(s) of the circular space(s). For \code{NCList}, \code{circle.length} must be a single positive integer (or NA if the space is linear). For \code{NCLists}, it must be an integer vector parallel to \code{x} (i.e. same length) and with positive or NA values (NAs indicate linear spaces). } } \details{ The \pkg{GenomicRanges} package also defines the \code{\link[GenomicRanges]{GNCList}} constructor and class for preprocessing and representing a vector of genomic ranges as a data structure based on Nested Containment Lists. Some important differences between the new findOverlaps/countOverlaps implementation based on Nested Containment Lists (BioC >= 3.1) and the old implementation based on Interval Trees (BioC < 3.1): \itemize{ \item With the new implementation, the hits returned by \code{\link{findOverlaps}} are not \emph{fully} ordered (i.e. ordered by queryHits and subject Hits) anymore, but only \emph{partially} ordered (i.e. ordered by queryHits only). Other than that, and except for the 2 particular situations mentioned below, the 2 implementations produce the same output. However, the new implementation is faster and more memory efficient. \item With the new implementation, either the query or the subject can be preprocessed with \code{NCList} for a \link{IntegerRanges} object (replacement for \code{IntervalTree}), \code{NCLists} for a \link{IntegerRangesList} object (replacement for \code{IntervalForest}), and \code{\link[GenomicRanges]{GNCList}} for a \link[GenomicRanges]{GenomicRanges} object (replacement for \code{GIntervalTree}). However, for a one-time use, it is NOT advised to explicitely preprocess the input. This is because \code{\link{findOverlaps}} or \code{\link{countOverlaps}} will take care of it and do a better job at it (by preprocessing only what's needed when it's needed, and releasing memory as they go). \item With the new implementation, \code{\link{countOverlaps}} on \link{IntegerRanges} or \link[GenomicRanges]{GenomicRanges} objects doesn't call \code{\link{findOverlaps}} in order to collect all the hits in a growing \link{Hits} object and count them only at the end. Instead, the counting happens at the C level and the hits are not kept. This reduces memory usage considerably when there is a lot of hits. \item When \code{minoverlap=0}, zero-width ranges are now interpreted as insertion points and considered to overlap with ranges that contain them. With the old alogrithm, zero-width ranges were always ignored. This is the 1st situation where the new and old implementations produce different outputs. \item When using \code{select="arbitrary"}, the new implementation will generally not select the same hits as the old implementation. This is the 2nd situation where the new and old implementations produce different outputs. \item The new implementation supports preprocessing of a \link[GenomicRanges]{GenomicRanges} object with ranges defined on circular sequences (e.g. on the mitochnodrial chromosome). See \link[GenomicRanges]{GNCList} in the \pkg{GenomicRanges} package for some examples. \item Objects preprocessed with \code{NCList}, \code{NCLists}, and \code{\link[GenomicRanges]{GNCList}} are serializable (with \code{save}) for later use. Not a typical thing to do though, because preprocessing is very cheap (i.e. very fast and memory efficient). } } \value{ An NCList object for the \code{NCList} constructor and an NCLists object for the \code{NCLists} constructor. } \author{Hervé Pagès} \references{ Alexander V. Alekseyenko and Christopher J. Lee -- Nested Containment List (NCList): a new algorithm for accelerating interval query of genome alignment and interval databases. Bioinformatics (2007) 23 (11): 1386-1393. doi: 10.1093/bioinformatics/btl647 } \seealso{ \itemize{ \item The \code{\link[GenomicRanges]{GNCList}} constructor and class defined in the \pkg{GenomicRanges} package. \item \code{\link{findOverlaps}} for finding/counting interval overlaps between two \emph{range-based} objects. \item \link{IntegerRanges} and \link{IntegerRangesList} objects. } } \examples{ ## The example below is for illustration purpose only and does NOT ## reflect typical usage. This is because, for a one-time use, it is ## NOT advised to explicitely preprocess the input for findOverlaps() ## or countOverlaps(). These functions will take care of it and do a ## better job at it (by preprocessing only what's needed when it's ## needed, and release memory as they go). query <- IRanges(c(1, 4, 9), c(5, 7, 10)) subject <- IRanges(c(2, 2, 10), c(2, 3, 12)) ## Either the query or the subject of findOverlaps() can be preprocessed: ppsubject <- NCList(subject) hits1 <- findOverlaps(query, ppsubject) hits1 ppquery <- NCList(query) hits2 <- findOverlaps(ppquery, subject) hits2 ## Note that 'hits1' and 'hits2' contain the same hits but not in the ## same order. stopifnot(identical(sort(hits1), sort(hits2))) } \keyword{classes} \keyword{methods} IRanges/man/RangedSelection-class.Rd0000644000175100017510000000465314626176651020363 0ustar00biocbuildbiocbuild\name{RangedSelection-class} \docType{class} \alias{RangedSelection-class} % accessors \alias{ranges,RangedSelection-method} \alias{colnames,RangedSelection-method} \alias{ranges<-,RangedSelection-method} \alias{colnames<-,RangedSelection-method} % coercion \alias{coerce,IntegerRangesList,RangedSelection-method} % constructor \alias{RangedSelection} \title{Selection of ranges and columns} \description{A \code{RangedSelection} represents a query against a table of interval data in terms of ranges and column names. The ranges select any table row with an overlapping interval. Note that the intervals are always returned, even if no columns are selected.} \details{ Traditionally, tabular data structures have supported the \code{\link{subset}} function, which allows one to select a subset of the rows and columns from the table. In that case, the rows and columns are specified by two separate arguments. As querying interval data sources, especially those external to R, such as binary indexed files and databases, is increasingly common, there is a need to encapsulate the row and column specifications into a single data structure, mostly for the sake of interface cleanliness. The \code{RangedSelection} class fills that role. } \section{Constructor}{ \describe{ \item{\code{RangedSelection(ranges=IRangesList(), colnames = character())}:}{ Constructors a \code{RangedSelection} with the given \code{ranges} and \code{colnames}. } } } \section{Coercion}{ \describe{ \item{\code{as(from, "RangedSelection")}:}{ Coerces \code{from} to a \code{RangedSelection} object. Typically, \code{from} is a \code{\linkS4class{IntegerRangesList}}, the ranges of which become the ranges in the new \code{RangedSelection}. } } } \section{Accessors}{ In the code snippets below, \code{x} is always a \code{RangedSelection}. \describe{ \item{\code{ranges(x), ranges(x) <- value}:}{ Gets or sets the ranges, a \code{\linkS4class{IntegerRangesList}}, that select rows with overlapping intervals. } \item{\code{colnames(x), colnames(x) <- value}:}{ Gets the names, a \code{character} vector, indicating the columns. } } } \author{ Michael Lawrence } \examples{ rl <- IRangesList(chr1 = IRanges(c(1, 5), c(3, 6))) RangedSelection(rl) as(rl, "RangedSelection") # same as above RangedSelection(rl, "score") } \keyword{methods} \keyword{classes} IRanges/man/Rle-class-leftovers.Rd0000644000175100017510000000417314626176651020043 0ustar00biocbuildbiocbuild\name{Rle-class-leftovers} \docType{class} \alias{ranges,Rle-method} \alias{coerce,Rle,IRanges-method} \alias{coerce,Rle,NormalIRanges-method} \alias{findRange} \alias{findRange,Rle-method} \alias{splitRanges} \alias{splitRanges,Rle-method} \alias{splitRanges,vector_OR_factor-method} \title{Rle objects (old man page)} \description{ IMPORTANT NOTE - 7/3/2014: This man page is being refactored. Most of the things that used to be documented here have been moved to the man page for \link[S4Vectors]{Rle} objects located in the \pkg{S4Vectors} package. } \section{Coercion}{ In the code snippets below, \code{from} is an Rle object: \describe{ \item{\code{as(from, "IRanges")}:}{ Creates an \link{IRanges} instance from a logical Rle. Note that this instance is guaranteed to be normal. } \item{\code{as(from, "NormalIRanges")}:}{ Creates a \link{NormalIRanges} instance from a logical Rle. } } } \section{General Methods}{ In the code snippets below, \code{x} is an Rle object: \describe{ \item{\code{split(x, f, drop=FALSE)}:}{ Splits \code{x} according to \code{f} to create a \link{CompressedRleList} object. If \code{f} is a list-like object then \code{drop} is ignored and \code{f} is treated as if it was \code{rep(seq_len(length(f)), sapply(f, length))}, so the returned object has the same shape as \code{f} (it also receives the names of \code{f}). Otherwise, if \code{f} is not a list-like object, empty list elements are removed from the returned object if \code{drop} is \code{TRUE}. } \item{\code{findRange(x, vec)}:}{ Returns an \link{IRanges} object representing the ranges in Rle \code{vec} that are referenced by the indices in the integer vector \code{x}. } \item{\code{splitRanges(x)}:}{ Returns a \linkS4class{CompressedIRangesList} object that contains the ranges for each of the unique run values. } } } \seealso{ The \link[S4Vectors]{Rle} class defined and documented in the \pkg{S4Vectors} package. } \examples{ x <- Rle(10:1, 1:10) x } \keyword{methods} \keyword{classes} IRanges/man/RleViews-class.Rd0000644000175100017510000000270414626176651017050 0ustar00biocbuildbiocbuild\name{RleViews-class} \docType{class} % Classes: \alias{class:RleViews} \alias{RleViews-class} \alias{RleViews} % Constructors: \alias{Views,Rle-method} % Methods: \alias{show,RleViews-method} \title{The RleViews class} \description{ The RleViews class is the basic container for storing a set of views (start/end locations) on the same Rle object. } \details{ An RleViews object contains a set of views (start/end locations) on the same \link{Rle} object called "the subject vector" or simply "the subject". Each view is defined by its start and end locations: both are integers such that start <= end. An RleViews object is in fact a particular case of a \link{Views} object (the RleViews class contains the \link{Views} class) so it can be manipulated in a similar manner: see \code{?\link{Views}} for more information. Note that two views can overlap and that a view can be "out of limits" i.e. it can start before the first element of the subject or/and end after its last element. } \author{P. Aboyoun} \seealso{ \link{Views-class}, \link{Rle-class}, \link{view-summarization-methods} } \examples{ subject <- Rle(rep(c(3L, 2L, 18L, 0L), c(3,2,1,5))) myViews <- Views(subject, 3:0, 5:8) myViews subject(myViews) length(myViews) start(myViews) end(myViews) width(myViews) myViews[[2]] set.seed(0) vec <- Rle(sample(0:2, 20, replace = TRUE)) vec Views(vec, vec > 0) } \keyword{methods} \keyword{classes} IRanges/man/RleViewsList-class.Rd0000644000175100017510000000453114626176651017704 0ustar00biocbuildbiocbuild\name{RleViewsList-class} \docType{class} \alias{RleViewsList-class} \alias{SimpleRleViewsList-class} % accessor \alias{subject,SimpleRleViewsList-method} % constructor \alias{Views,RleList-method} \alias{RleViewsList} % coercion \alias{coerce,RleViewsList,SimpleIRangesList-method} \alias{coerce,RleViewsList,IRangesList-method} \title{List of RleViews} \description{An extension of \linkS4class{ViewsList} that holds only \linkS4class{RleViews} objects. Useful for storing coverage vectors over a set of spaces (e.g. chromosomes), each of which requires a separate \linkS4class{RleViews} object. } \details{ For more information on methods available for RleViewsList objects consult the man pages for \link{ViewsList-class} and \link{view-summarization-methods}. } \section{Constructor}{ \describe{ \item{\code{RleViewsList(..., rleList, rangesList)}:}{ Either \code{...} or the \code{rleList}/\code{rangesList} couplet provide the RleViews for the list. If \code{...} is provided, each of these arguments must be RleViews objects. Alternatively, \code{rleList} and \code{rangesList} accept Rle and IntegerRanges objects respectively that are meshed together for form the RleViewsList. } \item{\code{Views(subject, start=NULL, end=NULL, width=NULL, names=NULL)}:}{ Same as \code{RleViewsList(rleList = subject, rangesList = start)}. } } } \section{Coercion}{ In the code snippet below, \code{from} is an RleViewsList object: \describe{ \item{\code{as(from, "IRangesList")}:}{ Creates an \code{IRangesList} object containing the view locations in \code{from}. } } } \author{P. Aboyoun} \seealso{ \link{ViewsList-class}, \link{view-summarization-methods} } \examples{ ## Rle objects subject1 <- Rle(c(3L,2L,18L,0L), c(3,2,1,5)) set.seed(0) subject2 <- Rle(c(0L,5L,2L,0L,3L), c(8,5,2,7,4)) ## Views rleViews1 <- Views(subject1, 3:0, 5:8) rleViews2 <- Views(subject2, subject2 > 0) ## RleList and IntegerRangesList objects rleList <- RleList(subject1, subject2) rangesList <- IRangesList(IRanges(3:0, 5:8), IRanges(subject2 > 0)) ## methods for construction method1 <- RleViewsList(rleViews1, rleViews2) method2 <- RleViewsList(rleList = rleList, rangesList = rangesList) identical(method1, method2) ## calculation over the views viewSums(method1) } \keyword{methods} \keyword{classes} IRanges/man/Vector-class-leftovers.Rd0000644000175100017510000000567714626176651020575 0ustar00biocbuildbiocbuild\name{Vector-class-leftovers} \docType{class} \alias{window<-,Vector-method} \alias{window<-.Vector} \alias{window<-,vector-method} \alias{window<-.vector} \alias{window<-,factor-method} \alias{window<-.factor} \alias{tapply,ANY,Vector-method} \alias{tapply,Vector,ANY-method} \alias{tapply,Vector,Vector-method} \alias{with,Vector-method} \alias{eval} \alias{eval,expression,Vector-method} \alias{eval,language,Vector-method} \title{Vector objects (old man page)} \description{ IMPORTANT NOTE - 4/29/2014: This man page is being refactored. Most of the things that used to be documented here have been moved to the man page for \link[S4Vectors]{Vector} objects located in the \pkg{S4Vectors} package. } \section{Evaluation}{ In the following code snippets, \code{x} is a Vector object. \describe{ \item{\code{with(x, expr)}:}{ Evaluates \code{expr} within \code{as.env(x)} via \code{eval(x)}. } \item{\code{eval(expr, envir, enclos=parent.frame())}:}{ Evaluates \code{expr} within \code{envir}, where \code{envir} is coerced to an environment with \code{as.env(envir, enclos)}. The \code{expr} is first processed with \code{\link{bquote}}, such that any escaped symbols are directly resolved in the calling frame. } } } \section{Convenience wrappers for common subsetting operations}{ In the code snippets below, \code{x} is a Vector object or regular R vector object. The R vector object methods for \code{window} are defined in this package and the remaining methods are defined in base R. \describe{ \item{\code{window(x, start=NA, end=NA, width=NA) <- value}:}{ Replace the subsequence window specified on the left (i.e. the subsequence in \code{x} specified by \code{start}, \code{end} and \code{width}) by \code{value}. \code{value} must either be of class \code{class(x)}, belong to a subclass of \code{class(x)}, or be coercible to \code{class(x)} or a subclass of \code{class(x)}. The elements of \code{value} are repeated to create a Vector with the same number of elements as the width of the subsequence window it is replacing. } } } \section{Looping}{ In the code snippets below, \code{x} is a Vector object. \describe{ \item{\code{tapply(X, INDEX, FUN = NULL, ..., simplify = TRUE)}:}{ Like the standard \code{\link[base]{tapply}} function defined in the base package, the \code{tapply} method for Vector objects applies a function to each cell of a ragged array, that is to each (non-empty) group of values given by a unique combination of the levels of certain factors. } } } \section{Coercion}{ \describe{ \item{\code{as.list(x)}:}{ coerce a Vector to a list, where the \code{i}th element of the result corresponds to \code{x[i]}. } } } \seealso{ The \link[S4Vectors]{Vector} class defined and documented in the \pkg{S4Vectors} package. } \keyword{methods} \keyword{classes} IRanges/man/Views-class.Rd0000644000175100017510000001241214626176651016402 0ustar00biocbuildbiocbuild\name{Views-class} \docType{class} \alias{class:Views} \alias{Views-class} \alias{parallel_slot_names,Views-method} \alias{subject} \alias{subject,Views-method} \alias{ranges,Views-method} \alias{ranges<-} \alias{ranges<-,Views-method} \alias{start,Views-method} \alias{width,Views-method} \alias{names,Views-method} \alias{start<-,Views-method} \alias{end<-,Views-method} \alias{width<-,Views-method} \alias{names<-,Views-method} \alias{Views} \alias{unlist,Views-method} \alias{coerce,Vector,Views-method} \alias{coerce,Views,IntegerRanges-method} \alias{coerce,Views,IRanges-method} \alias{coerce,Views,NormalIRanges-method} \alias{as.matrix,Views-method} \alias{bindROWS,Views-method} \alias{trim} \alias{trim,Views-method} \alias{subviews} \alias{subviews,Views-method} \alias{successiveViews} \title{Views objects} \description{ The Views virtual class is a general container for storing a set of views on an arbitrary \link{Vector} object, called the "subject". Its primary purpose is to introduce concepts and provide some facilities that can be shared by the concrete classes that derive from it. Some direct subclasses of the Views class are: \link{RleViews}, \link[XVector]{XIntegerViews} (defined in the XVector package), \link[Biostrings]{XStringViews} (defined in the Biostrings package), etc... } \section{Constructor}{ \describe{ \item{\code{Views(subject, start=NULL, end=NULL, width=NULL, names=NULL)}:}{ This constructor is a generic function with dispatch on argument \code{subject}. Specific methods must be defined for the subclasses of the Views class. For example a method for \link[Biostrings:XString-class]{XString} subjects is defined in the Biostrings package that returns an \link[Biostrings:XStringViews-class]{XStringViews} object. There is no default method. The treatment of the \code{start}, \code{end} and \code{width} arguments is the same as with the \code{\link{IRanges}} constructor, except that, in addition, \code{Views} allows \code{start} to be an \link{IntegerRanges} object. With this feature, \code{Views(subject, IRanges(my_starts, my_ends, my_widths, my_names))} and \code{Views(subject, my_starts, my_ends, my_widths, my_names)} are equivalent (except when \code{my_starts} is itself a \link{IntegerRanges} object). } } } \section{Coercion}{ In the code snippets below, \code{from} is a Views object: \describe{ \item{\code{as(from, "IRanges")}:}{ Creates an \code{IRanges} object containing the view locations in \code{from}. } } } \section{Accessor-like methods}{ All the accessor-like methods defined for \code{IRanges} objects work on Views objects. In addition, the following accessors are defined for Views objects: \describe{ \item{\code{subject(x)}:}{ Return the subject of the views. } } } \section{Subsetting}{ \describe{ \item{\code{x[i]}:}{ Select the views specified by \code{i}. } \item{\code{x[[i]]}:}{ Extracts the view selected by \code{i} as an object of the same class as \code{subject(x)}. Subscript \code{i} can be a single integer or a character string. The result is the subsequence of \code{subject(x)} defined by \code{window(subject(x), start=start(x)[i], end=end(x)[i])} or an error if the view is "out of limits" (i.e. \code{start(x)[i] < 1} or \code{end(x)[i] > length(subject(x))}). } } } \section{Concatenation}{ \describe{ \item{\code{c(x, ..., ignore.mcols=FALSE)}:}{ Concatenate \code{Views} objects. They must have the same subject. } } } \section{Other methods}{ \describe{ \item{\code{trim(x, use.names=TRUE)}:}{ Equivalent to \code{restrict(x, start=1L, end=length(subject(x)), keep.all.ranges=TRUE, use.names=use.names)}. } \item{\code{subviews(x, start=NA, end=NA, width=NA, use.names=TRUE)}:}{ \code{start}, \code{end}, and \code{width} arguments must be vectors of integers, eventually with NAs, that contain coordinates relative to the current ranges. Equivalent to \code{trim(narrow(x, start=start, end=end, width=width, use.names=use.names))}. } \item{\code{successiveViews(subject, width, gapwidth=0, from=1)}:}{ Equivalent to \code{Views(subject, successiveIRanges(width, gapwidth, from))}. See \code{?successiveIRanges} for a description of the \code{width}, \code{gapwidth} and \code{from} arguments. } } } \author{Hervé Pagès} \seealso{ \link{IRanges-class}, \link{Vector-class}, \link{IRanges-utils}, \link[XVector]{XVector}. Some direct subclasses of the Views class: \link{RleViews-class}, \link[XVector]{XIntegerViews-class}, \link[XVector]{XDoubleViews-class}, \link[Biostrings]{XStringViews-class}. \code{\link{findOverlaps}}. } \examples{ showClass("Views") # shows (some of) the known subclasses ## Create a set of 4 views on an XInteger subject of length 10: subject <- Rle(3:-6) v1 <- Views(subject, start=4:1, end=4:7) ## Extract the 2nd view: v1[[2]] ## Some views can be "out of limits" v2 <- Views(subject, start=4:-1, end=6) trim(v2) subviews(v2, end=-2) ## See ?`XIntegerViews-class` in the XVector package for more examples. } \keyword{methods} \keyword{classes} IRanges/man/ViewsList-class.Rd0000644000175100017510000000234514626176651017242 0ustar00biocbuildbiocbuild\name{ViewsList-class} \docType{class} \alias{class:ViewsList} \alias{ViewsList-class} \alias{ViewsList} \alias{class:SimpleViewsList} \alias{SimpleViewsList-class} \alias{SimpleViewsList} % accessors \alias{ranges,SimpleViewsList-method} \alias{start,SimpleViewsList-method} \alias{end,SimpleViewsList-method} \alias{width,SimpleViewsList-method} % coercion \alias{as.matrix,ViewsList-method} \title{List of Views} \description{An extension of \linkS4class{List} that holds only \linkS4class{Views} objects. } \details{ ViewsList is a virtual class. Specialized subclasses like e.g. \linkS4class{RleViewsList} are useful for storing coverage vectors over a set of spaces (e.g. chromosomes), each of which requires a separate \linkS4class{RleViews} object. As a \linkS4class{List} subclass, ViewsList inherits all the methods available for \linkS4class{List} objects. It also presents an API that is very similar to that of \linkS4class{Views}, where operations are vectorized over the elements and generally return lists. } \author{P. Aboyoun and H. Pagès} \seealso{ \link{List-class}, \link{RleViewsList-class}. \code{\link{findOverlaps}}. } \examples{ showClass("ViewsList") } \keyword{methods} \keyword{classes} IRanges/man/coverage-methods.Rd0000644000175100017510000003633414626176651017447 0ustar00biocbuildbiocbuild\name{coverage-methods} \alias{coverage-methods} \alias{coverage} \alias{coverage,IntegerRanges-method} \alias{coverage,StitchedIPos-method} \alias{coverage,Views-method} \alias{coverage,IntegerRangesList-method} \title{Coverage of a set of ranges} \description{ For each position in the space underlying a set of ranges, counts the number of ranges that cover it. } \usage{ coverage(x, shift=0L, width=NULL, weight=1L, ...) \S4method{coverage}{IntegerRanges}(x, shift=0L, width=NULL, weight=1L, method=c("auto", "sort", "hash", "naive")) \S4method{coverage}{IntegerRangesList}(x, shift=0L, width=NULL, weight=1L, method=c("auto", "sort", "hash", "naive")) } \arguments{ \item{x}{ A \link{IntegerRanges}, \link{Views}, or \link{IntegerRangesList} object. See \code{?`\link[GenomicRanges]{coverage-methods}`} in the \pkg{GenomicRanges} package for \code{coverage} methods for other objects. } \item{shift, weight}{ \code{shift} specifies how much each range in \code{x} should be shifted before the coverage is computed. A positive shift value will shift the corresponding range in \code{x} to the right, and a negative value to the left. NAs are not allowed. \code{weight} assigns a weight to each range in \code{x}. \itemize{ \item If \code{x} is an \link{IntegerRanges} or \link{Views} object: each of these arguments must be an integer or numeric vector parallel to \code{x} (will get recycled if necessary). Alternatively, each of these arguments can also be specified as a single string naming a metadata column in \code{x} (i.e. a column in \code{mcols(x)}) to be used as the \code{shift} (or \code{weight}) vector. Note that when \code{x} is an \link{IPos} object, each of these arguments can only be a single number. \item If \code{x} is an \link{IntegerRangesList} object: each of these arguments must be a numeric vector or list-like object of the same length as \code{x} (will get recycled if necessary). If it's a numeric vector, it's first turned into a list with \code{as.list}. After recycling, each list element \code{shift[[i]]} (or \code{weight[[i]]}) must be an integer or numeric vector parallel to \code{x[[i]]} (will get recycled if necessary). } If \code{weight} is an integer vector or list-like object of integer vectors, the coverage vector(s) will be returned as integer-\link{Rle} object(s). If it's a numeric vector or list-like object of numeric vectors, the coverage vector(s) will be returned as numeric-\link{Rle} object(s). } \item{width}{ Specifies the length of the returned coverage vector(s). \itemize{ \item If \code{x} is an \link{IntegerRanges} object: \code{width} must be \code{NULL} (the default), an NA, or a single non-negative integer. After being shifted, the ranges in \code{x} are always clipped on the left to keep only their positive portion i.e. their intersection with the [1, +inf) interval. If \code{width} is a single non-negative integer, then they're also clipped on the right to keep only their intersection with the [1, width] interval. In that case \code{coverage} returns a vector of length \code{width}. Otherwise, it returns a vector that extends to the last position in the underlying space covered by the shifted ranges. \item If \code{x} is a \link{Views} object: Same as for a \link{IntegerRanges} object, except that, if \code{width} is \code{NULL} then it's treated as if it was \code{length(subject(x))}. \item If \code{x} is a \link{IntegerRangesList} object: \code{width} must be \code{NULL} or an integer vector parallel to \code{x} (i.e. with one element per list element in \code{x}). If not \code{NULL}, the vector must contain NAs or non-negative integers and it will get recycled to the length of \code{x} if necessary. If \code{NULL}, it is replaced with \code{NA} and recycled to the length of \code{x}. Finally \code{width[i]} is used to compute the coverage vector for \code{x[[i]]} and is therefore treated like explained above (when \code{x} is a \link{IntegerRanges} object). } } \item{method}{ If \code{method} is set to \code{"sort"}, then \code{x} is sorted previous to the calculation of the coverage. If \code{method} is set to \code{"hash"} or \code{"naive"}, then \code{x} is hashed directly to a vector of length \code{width} without previous sorting. The \code{"hash"} method is faster than the \code{"sort"} method when \code{x} is large (i.e. contains a lot of ranges). When \code{x} is small and \code{width} is big (e.g. \code{x} represents a small set of reads aligned to a big chromosome), then \code{method="sort"} is faster and uses less memory than \code{method="hash"}. The \code{"naive"} method is a slower version of the \code{"hash"} method that has the advantage of avoiding floating point artefacts in the no-coverage regions of the numeric-Rle object returned by \code{coverage()} when the weights are supplied as a numeric vector of type \code{double}. See "FLOATING POINT ARITHMETIC CAN BRING A SURPRISE" section in the Examples below for more information. Using \code{method="auto"} selects between the \code{"sort"} and \code{"hash"} methods, picking the one that is predicted to be faster based on \code{length(x)} and \code{width}. } \item{...}{ Further arguments to be passed to or from other methods. } } \value{ If \code{x} is a \link{IntegerRanges} or \link{Views} object: An integer- or numeric-\link{Rle} object depending on whether \code{weight} is an integer or numeric vector. If \code{x} is a \link{IntegerRangesList} object: An \link{RleList} object with one coverage vector per list element in \code{x}, and with \code{x} names propagated to it. The i-th coverage vector can be either an integer- or numeric-\link{Rle} object, depending on the type of \code{weight[[i]]} (after \code{weight} has gone thru \code{as.list} and recycling, like described previously). } \author{H. Pagès and P. Aboyoun} \seealso{ \itemize{ \item \link[GenomicRanges]{coverage-methods} in the \pkg{GenomicRanges} package for more \code{coverage} methods. \item The \code{\link{slice}} function for slicing the \link{Rle} or \link{RleList} object returned by \code{coverage}. \item \link{IntegerRanges}, \link{IPos}, \link{IntegerRangesList}, \link{Rle}, and \link{RleList} objects. } } \examples{ ## --------------------------------------------------------------------- ## A. COVERAGE OF AN IRanges OBJECT ## --------------------------------------------------------------------- x <- IRanges(start=c(-2L, 6L, 9L, -4L, 1L, 0L, -6L, 10L), width=c( 5L, 0L, 6L, 1L, 4L, 3L, 2L, 3L)) coverage(x) coverage(x, shift=7) coverage(x, shift=7, width=27) coverage(x, shift=c(-4, 2)) # 'shift' gets recycled coverage(x, shift=c(-4, 2), width=12) coverage(x, shift=-max(end(x))) coverage(restrict(x, 1, 10)) coverage(reduce(x), shift=7) coverage(gaps(shift(x, 7), start=1, end=27)) ## With weights: coverage(x, weight=as.integer(10^(0:7))) # integer-Rle coverage(x, weight=c(2.8, -10)) # numeric-Rle, 'shift' gets recycled ## --------------------------------------------------------------------- ## B. FLOATING POINT ARITHMETIC CAN BRING A SURPRISE ## --------------------------------------------------------------------- ## Please be aware that rounding errors in floating point arithmetic can ## lead to some surprising results when computing a weighted coverage: y <- IRanges(c(4, 10), c(18, 15)) w1 <- 0.958 w2 <- 1e4 cvg <- coverage(y, width=100, weight=c(w1, w2)) cvg # non-zero coverage at positions 19 to 100! ## This is an artefact of floating point arithmetic and the algorithm ## used to compute the weighted coverage. It can be observed with basic ## floating point arithmetic: w1 + w2 - w2 - w1 # very small non-zero value! ## Note that this only happens with the "sort" and "hash" methods but ## not with the "naive" method: coverage(y, width=100, weight=c(w1, w2), method="sort") coverage(y, width=100, weight=c(w1, w2), method="hash") coverage(y, width=100, weight=c(w1, w2), method="naive") ## These very small non-zero coverage values in the no-coverage regions ## of the numeric-Rle object returned by coverage() are not always ## present. But when they are, they can cause problems downstream or ## in unit tests. For example downstream code that relies on things ## like 'cvg != 0' to find regions with coverage won't work properly. ## This can be mitigated either by selecting the "naive" method (be aware ## that this can slow down things significantly) or by "cleaning" 'cvg' ## first e.g. with something like 'cvg <- round(cvg, digits)' where ## 'digits' is a carefully chosen number of digits: cvg <- round(cvg, digits=3) ## Note that this rounding will also have the interesting side effect of ## reducing the memory footprint of the Rle object in general (because ## some runs might get merged into a single run as a consequence of the ## rounding). ## --------------------------------------------------------------------- ## C. COVERAGE OF AN IPos OBJECT ## --------------------------------------------------------------------- pos_runs <- IRanges(c(1, 5, 9), c(10, 8, 15)) ipos <- IPos(pos_runs) coverage(ipos) ## --------------------------------------------------------------------- ## D. COVERAGE OF AN IRangesList OBJECT ## --------------------------------------------------------------------- x <- IRangesList(A=IRanges(3*(4:-1), width=1:3), B=IRanges(2:10, width=5)) cvg <- coverage(x) cvg stopifnot(identical(cvg[[1]], coverage(x[[1]]))) stopifnot(identical(cvg[[2]], coverage(x[[2]]))) coverage(x, width=c(50, 9)) coverage(x, width=c(NA, 9)) coverage(x, width=9) # 'width' gets recycled ## Each list element in 'shift' and 'weight' gets recycled to the length ## of the corresponding element in 'x'. weight <- list(as.integer(10^(0:5)), -0.77) cvg2 <- coverage(x, weight=weight) cvg2 # 1st coverage vector is an integer-Rle, 2nd is a numeric-Rle identical(mapply(coverage, x=x, weight=weight), as.list(cvg2)) ## --------------------------------------------------------------------- ## E. SOME MATHEMATICAL PROPERTIES OF THE coverage() FUNCTION ## --------------------------------------------------------------------- ## PROPERTY 1: The coverage vector is not affected by reordering the ## input ranges: set.seed(24) x <- IRanges(sample(1000, 40, replace=TRUE), width=17:10) cvg0 <- coverage(x) stopifnot(identical(coverage(sample(x)), cvg0)) ## Of course, if the ranges are shifted and/or assigned weights, then ## this doesn't hold anymore, unless the 'shift' and/or 'weight' ## arguments are reordered accordingly. ## PROPERTY 2: The coverage of the concatenation of 2 IntegerRanges ## objects 'x' and 'y' is the sum of the 2 individual coverage vectors: y <- IRanges(sample(-20:280, 36, replace=TRUE), width=28) stopifnot(identical(coverage(c(x, y), width=100), coverage(x, width=100) + coverage(y, width=100))) ## Note that, because adding 2 vectors in R recycles the shortest to ## the length of the longest, the following is generally FALSE: identical(coverage(c(x, y)), coverage(x) + coverage(y)) # FALSE ## It would only be TRUE if the 2 coverage vectors that we add had the ## same length, which would only happen by chance. By using the same ## 'width' value when we computed the 2 coverages previously, we made ## sure they had the same length. ## Because of properties 1 & 2, we have: x1 <- x[c(TRUE, FALSE)] # pick up 1st, 3rd, 5th, etc... ranges x2 <- x[c(FALSE, TRUE)] # pick up 2nd, 4th, 6th, etc... ranges cvg1 <- coverage(x1, width=100) cvg2 <- coverage(x2, width=100) stopifnot(identical(coverage(x, width=100), cvg1 + cvg2)) ## PROPERTY 3: Multiplying the weights by a scalar has the effect of ## multiplying the coverage vector by the same scalar: weight <- runif(40) cvg3 <- coverage(x, weight=weight) stopifnot(all.equal(coverage(x, weight=-2.68 * weight), -2.68 * cvg3)) ## Because of properties 1 & 2 & 3, we have: stopifnot(identical(coverage(x, width=100, weight=c(5L, -11L)), 5L * cvg1 - 11L * cvg2)) ## PROPERTY 4: Using the sum of 2 weight vectors produces the same ## result as using the 2 weight vectors separately and summing the ## 2 results: weight2 <- 10 * runif(40) + 3.7 stopifnot(all.equal(coverage(x, weight=weight + weight2), cvg3 + coverage(x, weight=weight2))) ## PROPERTY 5: Repeating any input range N number of times is ## equivalent to multiplying its assigned weight by N: times <- sample(0:10L, length(x), replace=TRUE) stopifnot(all.equal(coverage(rep(x, times), weight=rep(weight, times)), coverage(x, weight=weight * times))) ## In particular, if 'weight' is not supplied: stopifnot(identical(coverage(rep(x, times)), coverage(x, weight=times))) ## PROPERTY 6: If none of the input range actually gets clipped during ## the "shift and clip" process, then: ## ## sum(cvg) = sum(width(x) * weight) ## stopifnot(sum(cvg3) == sum(width(x) * weight)) ## In particular, if 'weight' is not supplied: stopifnot(sum(cvg0) == sum(width(x))) ## Note that this property is sometimes used in the context of a ## ChIP-Seq analysis to estimate "the number of reads in a peak", that ## is, the number of short reads that belong to a peak in the coverage ## vector computed from the genomic locations (a.k.a. genomic ranges) ## of the aligned reads. Because of property 6, the number of reads in ## a peak is approximately the area under the peak divided by the short ## read length. ## PROPERTY 7: If 'weight' is not supplied, then disjoining or reducing ## the ranges before calling coverage() has the effect of "shaving" the ## coverage vector at elevation 1: table(cvg0) shaved_cvg0 <- cvg0 runValue(shaved_cvg0) <- pmin(runValue(cvg0), 1L) table(shaved_cvg0) stopifnot(identical(coverage(disjoin(x)), shaved_cvg0)) stopifnot(identical(coverage(reduce(x)), shaved_cvg0)) ## --------------------------------------------------------------------- ## F. SOME SANITY CHECKS ## --------------------------------------------------------------------- dummy_coverage <- function(x, shift=0L, width=NULL) { y <- IRanges:::unlist_as_integer(shift(x, shift)) if (is.null(width)) width <- max(c(0L, y)) Rle(tabulate(y, nbins=width)) } check_real_vs_dummy <- function(x, shift=0L, width=NULL) { res1 <- coverage(x, shift=shift, width=width) res2 <- dummy_coverage(x, shift=shift, width=width) stopifnot(identical(res1, res2)) } check_real_vs_dummy(x) check_real_vs_dummy(x, shift=7) check_real_vs_dummy(x, shift=7, width=27) check_real_vs_dummy(x, shift=c(-4, 2)) check_real_vs_dummy(x, shift=c(-4, 2), width=12) check_real_vs_dummy(x, shift=-max(end(x))) ## With a set of distinct single positions: x3 <- IRanges(sample(50000, 20000), width=1) stopifnot(identical(sort(start(x3)), which(coverage(x3) != 0L))) } \keyword{methods} \keyword{utilities} IRanges/man/extractList.Rd0000644000175100017510000001310614626176651016511 0ustar00biocbuildbiocbuild\name{extractList} \alias{relist} \alias{relist,ANY,PartitioningByEnd-method} \alias{relist,ANY,List-method} \alias{relist,Vector,list-method} \alias{extractList} \alias{extractList,ANY,ANY-method} \alias{extractList,ANY-method} \alias{regroup} \title{Group elements of a vector-like object into a list-like object} \description{ \code{relist} and \code{split} are 2 common ways of grouping the elements of a vector-like object into a list-like object. The \pkg{IRanges} and \pkg{S4Vectors} packages define \code{relist} and \code{split} methods that operate on a \link{Vector} object and return a \link{List} object. Because \code{relist} and \code{split} both impose restrictions on the kind of grouping that they support (e.g. every element in the input object needs to go in a group and can only go in one group), the \pkg{IRanges} package introduces the \code{extractList} generic function for performing \emph{arbitrary} groupings. } \usage{ ## relist() ## -------- \S4method{relist}{ANY,List}(flesh, skeleton) \S4method{relist}{Vector,list}(flesh, skeleton) ## extractList() ## ------------- extractList(x, i) ## regroup() ## --------- regroup(x, g) } \arguments{ \item{flesh, x}{ A vector-like object. } \item{skeleton}{ A list-like object. Only the "shape" (i.e. element lengths) of \code{skeleton} matters. Its exact content is ignored. } \item{i}{ A list-like object. Unlike for \code{skeleton}, the content here matters (see Details section below). Note that \code{i} can be a \link{IntegerRanges} object (a particular type of list-like object), and, in that case, \code{extractList} is particularly fast (this is a common use case). } \item{g}{ A \linkS4class{Grouping} or an object coercible to one. For \code{regroup}, \code{g} groups the elements of \code{x}. } } \details{ Like \code{split}, \code{relist} and \code{extractList} have in common that they return a list-like object where all the list elements have the same class as the original vector-like object. Methods that return a \link{List} derivative return an object of class \code{\link[S4Vectors]{relistToClass}(x)}. By default, \code{extractList(x, i)} is equivalent to: \preformatted{ relist(x[unlist(i)], i) } An exception is made when \code{x} is a data-frame-like object. In that case \code{x} is subsetted along the rows, that is, \code{extractList(x, i)} is equivalent to: \preformatted{ relist(x[unlist(i), ], i) } This is more or less how the default method is implemented, except for some optimizations when \code{i} is a \link{IntegerRanges} object. \code{relist} and \code{split} can be seen as special cases of \code{extractList}: \preformatted{ relist(flesh, skeleton) is equivalent to extractList(flesh, PartitioningByEnd(skeleton)) split(x, f) is equivalent to extractList(x, split(seq_along(f), f)) } It is good practise to use \code{extractList} only for cases not covered by \code{relist} or \code{split}. Whenever possible, using \code{relist} or \code{split} is preferred as they will always perform more efficiently. In addition their names carry meaning and are familiar to most R users/developers so they'll make your code easier to read/understand. Note that the transformation performed by \code{relist} or \code{split} is always reversible (via \code{unlist} and \code{unsplit}, respectively), but not the transformation performed by \code{extractList} (in general). The \code{regroup} function splits the elements of \code{unlist(x)} into a list according to the grouping \code{g}. Each element of \code{unlist(x)} inherits its group from its parent element of \code{x}. \code{regroup} is different from \code{relist} and \code{split}, because \code{x} is already grouped, and the goal is to combine groups. } \value{ The \code{relist} methods behave like \code{utils::relist} except that they return a \link{List} object. If \code{skeleton} has names, then they are propagated to the returned value. \code{extractList} returns a list-like object parallel to \code{i} and with the same "shape" as \code{i} (i.e. same element lengths). If \code{i} has names, then they are propagated to the returned value. All these functions return a list-like object where the list elements have the same class as \code{x}. \code{\link[S4Vectors]{relistToClass}} gives the exact class of the returned object. } \author{Hervé Pagès} \seealso{ \itemize{ \item The \code{\link[S4Vectors]{relistToClass}} function and \code{\link[S4Vectors]{split}} methods defined in the \pkg{S4Vectors} package. \item The \code{\link[base]{unlist}} and \code{\link[utils]{relist}} functions in the \pkg{base} and \pkg{utils} packages, respectively. \item The \code{\link[base]{split}} and \code{\link[base]{unsplit}} functions in the \pkg{base} package. \item \link{PartitioningByEnd} objects. These objects are used inside \link{CompressedList} derivatives to keep track of the \emph{partitioning} of the single vector-like object made of all the list elements concatenated together. \item \link[S4Vectors]{Vector}, \link[S4Vectors]{List}, \link[S4Vectors]{Rle}, and \link[S4Vectors]{DataFrame} objects implemented in the \pkg{S4Vectors} package. \item \link{IntegerRanges} objects. } } \examples{ ## On an Rle object: x <- Rle(101:105, 6:2) i <- IRanges(6:10, 16:12, names=letters[1:5]) extractList(x, i) ## On a DataFrame object: df <- DataFrame(X=x, Y=LETTERS[1:20]) extractList(df, i) } \keyword{manip} IRanges/man/extractListFragments.Rd0000644000175100017510000001651014626176651020362 0ustar00biocbuildbiocbuild\name{extractListFragments} \alias{INCOMPATIBLE_ARANGES_MSG} \alias{extractListFragments} \alias{equisplit} \title{Extract list fragments from a list-like object} \description{ Utilities for extracting \emph{list fragments} from a list-like object. } \usage{ extractListFragments(x, aranges, use.mcols=FALSE, msg.if.incompatible=INCOMPATIBLE_ARANGES_MSG) equisplit(x, nchunk, chunksize, use.mcols=FALSE) } \arguments{ \item{x}{ The list-like object from which to extract the list fragments. Can be any \link{List} derivative for \code{extractListFragments}. Can also be an ordinary list if \code{extractListFragments} is called with \code{use.mcols=TRUE}. Can be any \link{List} derivative that supports \code{relist()} for \code{equisplit}. } \item{aranges}{ An \link{IntegerRanges} derivative containing the \emph{absolute ranges} (i.e. the ranges \emph{along \code{unlist(x)}}) of the list fragments to extract. The ranges in \code{aranges} must be compatible with the \emph{cumulated length} of all the list elements in \code{x}, that is, \code{start(aranges)} and \code{end(aranges)} must be >= 1 and <= \code{sum(elementNROWS(x))}, respectively. Also please note that only \link{IntegerRanges} objects that are disjoint and sorted are supported at the moment. } \item{use.mcols}{ Whether to propagate the metadata columns on \code{x} (if any) or not. Must be \code{TRUE} or \code{FALSE} (the default). If set to \code{FALSE}, instead of having the metadata columns propagated from \code{x}, the object returned by \code{extractListFragments} has metadata columns \code{revmap} and \code{revmap2}, and the object returned by \code{equisplit} has metadata column \code{revmap}. Note that this is the default. } \item{msg.if.incompatible}{ The error message to use if \code{aranges} is not compatible with the \emph{cumulated length} of all the list elements in \code{x}. } \item{nchunk}{ The number of chunks. Must be a single positive integer. } \item{chunksize}{ The size of the chunks (last chunk might be smaller). Must be a single positive integer. } } \details{ A \emph{list fragment} of list-like object \code{x} is a window in one of its list elements. \code{extractListFragments} is a low-level utility that extracts list fragments from list-like object \code{x} according to the absolute ranges in \code{aranges}. \code{equisplit} fragments and splits list-like object \code{x} into a specified number of partitions with equal (total) width. This is useful for instance to ensure balanced loading of workers in parallel evaluation. For example, if \code{x} is a \link[GenomicRanges]{GRanges} object, each partition is also a \link[GenomicRanges]{GRanges} object and the set of all partitions is returned as a \link[GenomicRanges]{GRangesList} object. } \value{ An object of the same class as \code{x} for \code{extractListFragments}. An object of class \code{\link[S4Vectors]{relistToClass}(x)} for \code{equisplit}. } \author{Hervé Pagès} \seealso{ \itemize{ \item \link{IRanges} and \link{IRangesList} objects. \item \link{Partitioning} objects. \item \link{IntegerList} objects. \item \code{\link{breakInChunks}} from breaking a vector-like object in chunks. \item \link[GenomicRanges]{GRanges} and \link[GenomicRanges]{GRangesList} objects defined in the \pkg{GenomicRanges} package. \item \link[S4Vectors]{List} objects defined in the \pkg{S4Vectors} package. \item \link{intra-range-methods} and \link{inter-range-methods} for \emph{intra range} and \emph{inter range} transformations. } } \examples{ ## --------------------------------------------------------------------- ## A. extractListFragments() ## --------------------------------------------------------------------- x <- IntegerList(a=101:109, b=5:-5) x aranges <- IRanges(start=c(2, 4, 8, 17, 17), end=c(3, 6, 14, 16, 19)) aranges extractListFragments(x, aranges) x2 <- IRanges(c(1, 101, 1001, 10001), width=c(10, 5, 0, 12), names=letters[1:4]) mcols(x2)$label <- LETTERS[1:4] x2 aranges <- IRanges(start=13, end=20) extractListFragments(x2, aranges) extractListFragments(x2, aranges, use.mcols=TRUE) aranges2 <- PartitioningByWidth(c(3, 9, 13, 0, 2)) extractListFragments(x2, aranges2) extractListFragments(x2, aranges2, use.mcols=TRUE) x2b <- as(x2, "IntegerList") extractListFragments(x2b, aranges2) x2c <- as.list(x2b) extractListFragments(x2c, aranges2, use.mcols=TRUE) ## --------------------------------------------------------------------- ## B. equisplit() ## --------------------------------------------------------------------- ## equisplit() first calls breakInChunks() internally to create a ## PartitioningByWidth object that contains the absolute ranges of the ## chunks, then calls extractListFragments() on it 'x' to extract the ## fragments of 'x' that correspond to these absolute ranges. Finally ## the IRanges object returned by extractListFragments() is split into ## an IRangesList object where each list element corresponds to a chunk. equisplit(x2, nchunk=2) equisplit(x2, nchunk=2, use.mcols=TRUE) equisplit(x2, chunksize=5) library(GenomicRanges) gr <- GRanges(c("chr1", "chr2"), IRanges(1, c(100, 1e5))) equisplit(gr, nchunk=2) equisplit(gr, nchunk=1000) ## --------------------------------------------------------------------- ## C. ADVANCED extractListFragments() EXAMPLES ## --------------------------------------------------------------------- ## === D1. Fragment list-like object into length 1 fragments === ## First we construct a Partitioning object where all the partitions ## have a width of 1: x2_cumlen <- nobj(PartitioningByWidth(x2)) # Equivalent to # length(unlist(x2)) except # that it doesn't unlist 'x2' # so is much more efficient. aranges1 <- PartitioningByEnd(seq_len(x2_cumlen)) aranges1 ## Then we use it to fragment 'x2': extractListFragments(x2, aranges1) extractListFragments(x2b, aranges1) extractListFragments(x2c, aranges1, use.mcols=TRUE) ## === D2. Fragment a Partitioning object === partitioning2 <- PartitioningByEnd(x2b) # same as PartitioningByEnd(x2) extractListFragments(partitioning2, aranges2) ## Note that when the 1st arg is a Partitioning derivative, then ## swapping the 1st and 2nd elements in the call to extractListFragments() ## doesn't change the returned partitioning: extractListFragments(aranges2, partitioning2) ## --------------------------------------------------------------------- ## D. SANITY CHECKS ## --------------------------------------------------------------------- ## If 'aranges' is 'PartitioningByEnd(x)' or 'PartitioningByWidth(x)' ## and 'x' has no zero-length list elements, then ## 'extractListFragments(x, aranges, use.mcols=TRUE)' is a no-op. check_no_ops <- function(x) { aranges <- PartitioningByEnd(x) stopifnot(identical( extractListFragments(x, aranges, use.mcols=TRUE), x )) aranges <- PartitioningByWidth(x) stopifnot(identical( extractListFragments(x, aranges, use.mcols=TRUE), x )) } check_no_ops(x2[lengths(x2) != 0]) check_no_ops(x2b[lengths(x2b) != 0]) check_no_ops(x2c[lengths(x2c) != 0]) check_no_ops(gr) } \keyword{utilities} IRanges/man/findOverlaps-methods.Rd0000644000175100017510000003664714626176651020317 0ustar00biocbuildbiocbuild\name{findOverlaps-methods} \alias{findOverlaps-methods} \alias{findOverlaps} \alias{findOverlaps,IntegerRanges,IntegerRanges-method} \alias{findOverlaps,integer,IntegerRanges-method} \alias{findOverlaps,Vector,missing-method} \alias{findOverlaps,IntegerRangesList,IntegerRangesList-method} \alias{findOverlaps,Pairs,missing-method} \alias{findOverlaps,Pairs,ANY-method} \alias{findOverlaps,ANY,Pairs-method} \alias{findOverlaps,Pairs,Pairs-method} \alias{countOverlaps} \alias{countOverlaps,Vector,Vector-method} \alias{countOverlaps,integer,Vector-method} \alias{countOverlaps,Vector,missing-method} \alias{countOverlaps,IntegerRanges,IntegerRanges-method} \alias{countOverlaps,IntegerRangesList,IntegerRangesList-method} \alias{overlapsAny} \alias{overlapsAny,Vector,Vector-method} \alias{overlapsAny,integer,Vector-method} \alias{overlapsAny,Vector,missing-method} \alias{overlapsAny,IntegerRangesList,IntegerRangesList-method} \alias{\%over\%} \alias{\%within\%} \alias{\%outside\%} \alias{subsetByOverlaps} \alias{subsetByOverlaps,Vector,Vector-method} \alias{overlapsRanges} \alias{overlapsRanges,IntegerRanges,IntegerRanges-method} \alias{overlapsRanges,IntegerRangesList,IntegerRangesList-method} \alias{poverlaps} \alias{poverlaps,IntegerRanges,IntegerRanges-method} \alias{poverlaps,IntegerRanges,integer-method} \alias{poverlaps,integer,IntegerRanges-method} \alias{mergeByOverlaps} \alias{findOverlapPairs} \title{Finding overlapping ranges} \description{ Various methods for finding/counting interval overlaps between two "range-based" objects: a query and a subject. NOTE: This man page describes the methods that operate on \link{IntegerRanges} and \link{IntegerRangesList} derivatives. See \code{?`\link[GenomicRanges]{findOverlaps,GenomicRanges,GenomicRanges-method}`} in the \pkg{GenomicRanges} package for methods that operate on \link[GenomicRanges]{GenomicRanges} or \link[GenomicRanges]{GRangesList} objects. } \usage{ findOverlaps(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), select=c("all", "first", "last", "arbitrary"), ...) countOverlaps(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), ...) overlapsAny(query, subject, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), ...) query \%over\% subject query \%within\% subject query \%outside\% subject subsetByOverlaps(x, ranges, maxgap=-1L, minoverlap=0L, type=c("any", "start", "end", "within", "equal"), invert=FALSE, ...) overlapsRanges(query, subject, hits=NULL, ...) poverlaps(query, subject, maxgap = 0L, minoverlap = 1L, type = c("any", "start", "end", "within", "equal"), ...) mergeByOverlaps(query, subject, ...) findOverlapPairs(query, subject, ...) } \arguments{ \item{query, subject, x, ranges}{ Each of them can be an \link{IntegerRanges} (e.g. \link{IRanges}, \link{Views}) or \link{IntegerRangesList} (e.g. \link{IRangesList}, \link{ViewsList}) derivative. In addition, if \code{subject} or \code{ranges} is an \link{IntegerRanges} object, \code{query} or \code{x} can be an integer vector to be converted to length-one ranges. If \code{query} (or \code{x}) is an \link{IntegerRangesList} object, then \code{subject} (or \code{ranges}) must also be an \link{IntegerRangesList} object. If both arguments are list-like objects with names, each list element from the 2nd argument is paired with the list element from the 1st argument with the matching name, if any. Otherwise, list elements are paired by position. The overlap is then computed between the pairs as described below. If \code{subject} is omitted, \code{query} is queried against itself. In this case, and only this case, the \code{drop.self} and \code{drop.redundant} arguments are allowed. By default, the result will contain hits for each range against itself, and if there is a hit from A to B, there is also a hit for B to A. If \code{drop.self} is \code{TRUE}, all self matches are dropped. If \code{drop.redundant} is \code{TRUE}, only one of A->B and B->A is returned. } \item{maxgap}{ A single integer >= -1. If \code{type} is set to \code{"any"}, \code{maxgap} is interpreted as the maximum \emph{gap} that is allowed between 2 ranges for the ranges to be considered as overlapping. The \emph{gap} between 2 ranges is the number of positions that separate them. The \emph{gap} between 2 adjacent ranges is 0. By convention when one range has its start or end strictly inside the other (i.e. non-disjoint ranges), the \emph{gap} is considered to be -1. If \code{type} is set to anything else, \code{maxgap} has a special meaning that depends on the particular \code{type}. See \code{type} below for more information. } \item{minoverlap}{ A single non-negative integer. Only ranges with a minimum of \code{minoverlap} overlapping positions are considered to be overlapping. When \code{type} is \code{"any"}, at least one of \code{maxgap} and \code{minoverlap} must be set to its default value. } \item{type}{ By default, any overlap is accepted. By specifying the \code{type} parameter, one can select for specific types of overlap. The types correspond to operations in Allen's Interval Algebra (see references). If \code{type} is \code{start} or \code{end}, the intervals are required to have matching starts or ends, respectively. Specifying \code{equal} as the type returns the intersection of the \code{start} and \code{end} matches. If \code{type} is \code{within}, the query interval must be wholly contained within the subject interval. Note that all matches must additionally satisfy the \code{minoverlap} constraint described above. The \code{maxgap} parameter has special meaning with the special overlap types. For \code{start}, \code{end}, and \code{equal}, it specifies the maximum difference in the starts, ends or both, respectively. For \code{within}, it is the maximum amount by which the subject may be wider than the query. If \code{maxgap} is set to -1 (the default), it's replaced internally by 0. } \item{select}{ If \code{query} is an \link{IntegerRanges} derivative: When \code{select} is \code{"all"} (the default), the results are returned as a \link[S4Vectors]{Hits} object. Otherwise the returned value is an integer vector \emph{parallel} to \code{query} (i.e. same length) containing the first, last, or arbitrary overlapping interval in \code{subject}, with \code{NA} indicating intervals that did not overlap any intervals in \code{subject}. If \code{query} is an \link{IntegerRangesList} derivative: When \code{select} is \code{"all"} (the default), the results are returned as a \link[S4Vectors]{HitsList} object. Otherwise the returned value depends on the \code{drop} argument. When \code{select != "all" && !drop}, an \link{IntegerList} is returned, where each element of the result corresponds to a space in \code{query}. When \code{select != "all" && drop}, an integer vector is returned containing indices that are offset to align with the unlisted \code{query}. } \item{invert}{ If \code{TRUE}, keep only the ranges in \code{x} that do \emph{not} overlap \code{ranges}. } \item{hits}{ The \link[S4Vectors]{Hits} or \link[S4Vectors]{HitsList} object returned by \code{findOverlaps}, or \code{NULL}. If \code{NULL} then \code{hits} is computed by calling \code{findOverlaps(query, subject, ...)} internally (the extra arguments passed to \code{overlapsRanges} are passed to \code{findOverlaps}). } \item{...}{ Further arguments to be passed to or from other methods: \itemize{ \item \code{drop}: Supported only when \code{query} is an \link{IntegerRangesList} derivative. \code{FALSE} by default. See \code{select} argument above for the details. \item \code{drop.self}, \code{drop.redundant}: When \code{subject} is omitted, the \code{drop.self} and \code{drop.redundant} arguments (both \code{FALSE} by default) are allowed. See \code{query} and \code{subject} arguments above for the details. } } } \details{ A common type of query that arises when working with intervals is finding which intervals in one set overlap those in another. The simplest approach is to call the \code{findOverlaps} function on a \link{IntegerRanges} or other object with range information (aka "range-based object"). } \value{ For \code{findOverlaps}: see \code{select} argument above. For \code{countOverlaps}: the overlap hit count for each range in \code{query} using the specified \code{findOverlaps} parameters. For \link{IntegerRangesList} objects, it returns an \link{IntegerList} object. \code{overlapsAny} finds the ranges in \code{query} that overlap any of the ranges in \code{subject}. For \link{IntegerRanges} derivatives, it returns a logical vector of length equal to the number of ranges in \code{query}. For \link{IntegerRangesList} derivatives, it returns a \link{LogicalList} object where each element of the result corresponds to a space in \code{query}. \code{\%over\%} and \code{\%within\%} are convenience wrappers for the 2 most common use cases. Currently defined as \code{`\%over\%` <- function(query, subject) overlapsAny(query, subject)} and \code{`\%within\%` <- function(query, subject) overlapsAny(query, subject, type="within")}. \code{\%outside\%} is simply the inverse of \code{\%over\%}. \code{subsetByOverlaps} returns the subset of \code{x} that has an overlap hit with a range in \code{ranges} using the specified \code{findOverlaps} parameters. When \code{hits} is a \link[S4Vectors]{Hits} (or \link[S4Vectors]{HitsList}) object, \code{overlapsRanges(query, subject, hits)} returns a \link{IntegerRanges} (or \link{IntegerRangesList}) object of the \emph{same shape} as \code{hits} holding the regions of intersection between the overlapping ranges in objects \code{query} and \code{subject}, which should be the same query and subject used in the call to \code{findOverlaps} that generated \code{hits}. \emph{Same shape} means same length when \code{hits} is a \link[S4Vectors]{Hits} object, and same length and same elementNROWS when \code{hits} is a \link[S4Vectors]{HitsList} object. \code{poverlaps} compares \code{query} and \code{subject} in parallel (like e.g., \code{pmin}) and returns a logical vector indicating whether each pair of ranges overlaps. Integer vectors are treated as width-one ranges. \code{mergeByOverlaps} computes the overlap between query and subject according to the arguments in \code{\dots}. It then extracts the corresponding hits from each object and returns a \code{DataFrame} containing one column for the query and one for the subject, as well as any \code{mcols} that were present on either object. The query and subject columns are named by quoting and deparsing the corresponding argument. \code{findOverlapPairs} is like \code{mergeByOverlaps}, except it returns a formal \code{\link[S4Vectors:Pairs-class]{Pairs}} object that provides useful downstream conveniences, such as finding the intersection of the overlapping ranges with \code{\link{pintersect}}. } \references{ Allen's Interval Algebra: James F. Allen: Maintaining knowledge about temporal intervals. In: Communications of the ACM. 26/11/1983. ACM Press. S. 832-843, ISSN 0001-0782 } \author{Michael Lawrence and Hervé Pagès} \seealso{ \itemize{ \item \link[S4Vectors]{Hits} and \link[S4Vectors]{HitsList} objects in the \pkg{S4Vectors} package for representing a set of hits between 2 vector-like or list-like objects. \item \link[GenomicRanges]{findOverlaps,GenomicRanges,GenomicRanges-method} in the \pkg{GenomicRanges} package for methods that operate on \link[GenomicRanges]{GRanges} or \link[GenomicRanges]{GRangesList} objects. \item The \link{NCList} class and constructor. \item The \link{IntegerRanges}, \link{Views}, \link{IntegerRangesList}, and \link{ViewsList} classes. \item The \link{IntegerList} and \link{LogicalList} classes. } } \examples{ ## --------------------------------------------------------------------- ## findOverlaps() ## --------------------------------------------------------------------- query <- IRanges(c(1, 4, 9), c(5, 7, 10)) subject <- IRanges(c(2, 2, 10), c(2, 3, 12)) findOverlaps(query, subject) ## at most one hit per query findOverlaps(query, subject, select="first") findOverlaps(query, subject, select="last") findOverlaps(query, subject, select="arbitrary") ## including adjacent ranges in the result findOverlaps(query, subject, maxgap=0L) query <- IRanges(c(1, 4, 9), c(5, 7, 10)) subject <- IRanges(c(2, 2), c(5, 4)) ## one IRanges object with itself findOverlaps(query) ## single points as query subject <- IRanges(c(1, 6, 13), c(4, 9, 14)) findOverlaps(c(3L, 7L, 10L), subject, select="first") ## special overlap types query <- IRanges(c(1, 5, 3, 4), width=c(2, 2, 4, 6)) subject <- IRanges(c(1, 3, 5, 6), width=c(4, 4, 5, 4)) findOverlaps(query, subject, type="start") findOverlaps(query, subject, type="start", maxgap=1L) findOverlaps(query, subject, type="end", select="first") ov <- findOverlaps(query, subject, type="within", maxgap=1L) ov ## Using pairs to find intersection of overlapping ranges hits <- findOverlaps(query, subject) p <- Pairs(query, subject, hits=hits) pintersect(p) ## Shortcut p <- findOverlapPairs(query, subject) pintersect(p) ## --------------------------------------------------------------------- ## overlapsAny() ## --------------------------------------------------------------------- overlapsAny(query, subject, type="start") overlapsAny(query, subject, type="end") query \%over\% subject # same as overlapsAny(query, subject) query \%within\% subject # same as overlapsAny(query, subject, # type="within") ## --------------------------------------------------------------------- ## overlapsRanges() ## --------------------------------------------------------------------- ## Extract the regions of intersection between the overlapping ranges: overlapsRanges(query, subject, ov) ## --------------------------------------------------------------------- ## Using IntegerRangesList objects ## --------------------------------------------------------------------- query <- IRanges(c(1, 4, 9), c(5, 7, 10)) qpartition <- factor(c("a","a","b")) qlist <- split(query, qpartition) subject <- IRanges(c(2, 2, 10), c(2, 3, 12)) spartition <- factor(c("a","a","b")) slist <- split(subject, spartition) ## at most one hit per query findOverlaps(qlist, slist, select="first") findOverlaps(qlist, slist, select="last") findOverlaps(qlist, slist, select="arbitrary") query <- IRanges(c(1, 5, 3, 4), width=c(2, 2, 4, 6)) qpartition <- factor(c("a","a","b","b")) qlist <- split(query, qpartition) subject <- IRanges(c(1, 3, 5, 6), width=c(4, 4, 5, 4)) spartition <- factor(c("a","a","b","b")) slist <- split(subject, spartition) overlapsAny(qlist, slist, type="start") overlapsAny(qlist, slist, type="end") qlist %over% slist subsetByOverlaps(qlist, slist) countOverlaps(qlist, slist) } \keyword{methods} IRanges/man/inter-range-methods.Rd0000644000175100017510000003420314626176651020060 0ustar00biocbuildbiocbuild\name{inter-range-methods} \alias{inter-range-methods} \alias{range} \alias{range,IntegerRanges-method} \alias{range,StitchedIPos-method} \alias{range,IntegerRangesList-method} \alias{range,CompressedIRangesList-method} \alias{reduce} \alias{reduce,IntegerRanges-method} \alias{reduce,Views-method} \alias{reduce,IntegerRangesList-method} \alias{reduce,CompressedIRangesList-method} \alias{gaps} \alias{gaps,IntegerRanges-method} \alias{gaps,Views-method} \alias{gaps,IntegerRangesList-method} \alias{gaps,CompressedIRangesList-method} \alias{gaps,MaskCollection-method} \alias{disjoin} \alias{disjoin,IntegerRanges-method} \alias{disjoin,NormalIRanges-method} \alias{disjoin,IntegerRangesList-method} \alias{disjoin,CompressedIRangesList-method} \alias{isDisjoint} \alias{isDisjoint,IntegerRanges-method} \alias{isDisjoint,StitchedIPos-method} \alias{isDisjoint,NormalIRanges-method} \alias{isDisjoint,IntegerRangesList-method} \alias{disjointBins} \alias{disjointBins,IntegerRanges-method} \alias{disjointBins,NormalIRanges-method} \alias{disjointBins,IntegerRangesList-method} \title{Inter range transformations of an IntegerRanges, Views, IntegerRangesList, or MaskCollection object} \description{ Range-based transformations are grouped in 2 categories: \enumerate{ \item \emph{Intra range transformations} (e.g. \code{\link{shift}()}) transform each range individually (and independently of the other ranges). They return an object \emph{parallel} to the input object, that is, where the i-th range corresponds to the i-th range in the input. Those transformations are described in the \link{intra-range-methods} man page (see \code{?`\link{intra-range-methods}`}). \item \emph{Inter range transformations} (e.g. \code{reduce()}) transform all the ranges together as a set to produce a new set of ranges. They return an object that is generally \emph{NOT} parallel to the input object. Those transformations are described below. } } \usage{ ## range() ## ------- \S4method{range}{IntegerRanges}(x, ..., with.revmap=FALSE, na.rm=FALSE) \S4method{range}{IntegerRangesList}(x, ..., with.revmap=FALSE, na.rm=FALSE) ## reduce() ## -------- reduce(x, drop.empty.ranges=FALSE, ...) \S4method{reduce}{IntegerRanges}(x, drop.empty.ranges=FALSE, min.gapwidth=1L, with.revmap=FALSE, with.inframe.attrib=FALSE) \S4method{reduce}{Views}(x, drop.empty.ranges=FALSE, min.gapwidth=1L, with.revmap=FALSE, with.inframe.attrib=FALSE) \S4method{reduce}{IntegerRangesList}(x, drop.empty.ranges=FALSE, min.gapwidth=1L, with.revmap=FALSE, with.inframe.attrib=FALSE) ## gaps() ## ------ gaps(x, start=NA, end=NA, ...) ## disjoin(), isDisjoint(), and disjointBins() ## ------------------------------------------- disjoin(x, ...) \S4method{disjoin}{IntegerRanges}(x, with.revmap=FALSE) \S4method{disjoin}{IntegerRangesList}(x, with.revmap=FALSE) isDisjoint(x, ...) disjointBins(x, ...) } \arguments{ \item{x}{ A \link{IntegerRanges} or \link{IntegerRangesList} object for \code{range}, \code{disjoin}, \code{isDisjoint}, and \code{disjointBins}. A \link{IntegerRanges}, \link{Views}, or \link{IntegerRangesList} object for \code{reduce} and \code{gaps}. } \item{...}{ For \code{range}, additional \link{IntegerRanges} or \link{IntegerRangesList} object to consider. } \item{na.rm}{ Ignored. } \item{drop.empty.ranges}{ \code{TRUE} or \code{FALSE}. Should empty ranges be dropped? } \item{min.gapwidth}{ Ranges separated by a gap of at least \code{min.gapwidth} positions are not merged. } \item{with.revmap}{ \code{TRUE} or \code{FALSE}. Should the mapping from output to input ranges be stored in the returned object? If yes, then it is stored as metadata column \code{revmap} of type \link{IntegerList}. } \item{with.inframe.attrib}{ \code{TRUE} or \code{FALSE}. For internal use. } \item{start, end}{ \itemize{ \item If \code{x} is a \link{IntegerRanges} or \link{Views} object: A single integer or \code{NA}. Use these arguments to specify the interval of reference i.e. which interval the returned gaps should be relative to. \item If \code{x} is a \link{IntegerRangesList} object: Integer vectors containing the coordinate bounds for each \link{IntegerRangesList} top-level element. } } } \details{ Unless specified otherwise, when \code{x} is a \link{IntegerRangesList} object, any transformation described here is equivalent to applying the transformation to each \link{IntegerRangesList} top-level element separately. \subsection{reduce}{ \code{reduce} first orders the ranges in \code{x} from left to right, then merges the overlapping or adjacent ones. }\subsection{range}{ \code{range} first concatenates \code{x} and the objects in \code{...} together. If the \link{IRanges} object resulting from this concatenation contains at least 1 range, then \code{range} returns an \link{IRanges} instance with a single range, from the minimum start to the maximum end of the concatenated object. Otherwise (i.e. if the concatenated object contains no range), \code{IRanges()} is returned (i.e. an \link{IRanges} instance of length 0). When passing more than 1 \link{IntegerRangesList} object to \code{range()}, they are first merged into a single \link{IntegerRangesList} object: by name if all objects have names, otherwise, if they are all of the same length, by position. Else, an exception is thrown. }\subsection{gaps}{ \code{gaps} returns the "normal" \link{IRanges} object representing the set of integers that remain after the set of integers represented by \code{x} has been removed from the interval specified by the \code{start} and \code{end} arguments. If \code{x} is a \link{Views} object, then \code{start=NA} and \code{end=NA} are interpreted as \code{start=1} and \code{end=length(subject(x))}, respectively, so, if \code{start} and \code{end} are not specified, then gaps are extracted with respect to the entire subject. }\subsection{isDisjoint}{ An \link{IntegerRanges} object \code{x} is considered to be "disjoint" if its ranges are non-overlapping. \code{isDisjoint} tests whether the object is "disjoint" or not. Note that a "normal" \link{IntegerRanges} object is always "disjoint" but the opposite is not true. See \code{?isNormal} for more information about normal \link{IntegerRanges} objects. About empty ranges. \code{isDisjoint} handles empty ranges (a.k.a. zero-width ranges) as follow: single empty range A is considered to overlap with single range B iff it's contained in B without being on the edge of B (in which case it would be ambiguous whether A is contained in or adjacent to B). More precisely, single empty range A is considered to overlap with single range B iff \preformatted{ start(B) < start(A) and end(A) < end(B)} Because A is an empty range it verifies \code{end(A) = start(A) - 1} so the above is equivalent to: \preformatted{ start(B) < start(A) <= end(B)} and also equivalent to: \preformatted{ start(B) <= end(A) < end(B)} Finally, it is also equivalent to: \preformatted{ pcompare(A, B) == 2} See \code{?`\link{IPosRanges-comparison}`} for the meaning of the codes returned by the \code{\link{pcompare}} function. }\subsection{disjoin}{ \code{disjoin} returns a disjoint object, by finding the union of the end points in \code{x}. In other words, the result consists of a range for every interval, of maximal length, over which the set of overlapping ranges in \code{x} is the same and at least of size 1. }\subsection{disjointBins}{ \code{disjointBins} segregates \code{x} into a set of bins so that the ranges in each bin are disjoint. Lower-indexed bins are filled first. The method returns an integer vector indicating the bin index for each range. } } \value{ If \code{x} is an \link{IntegerRanges} object: \itemize{ \item \code{range}, \code{reduce}, \code{gaps}, and \code{disjoin} return an \link{IRanges} instance. \item \code{isDisjoint} returns \code{TRUE} or \code{FALSE}. \item \code{disjointBins} returns an integer vector \emph{parallel} to \code{x}, that is, where the i-th element corresponds to the i-th element in \code{x}. } If \code{x} is a \link{Views} object: \code{reduce} and \code{gaps} return a \link{Views} object on the same subject as \code{x} but with modified views. If \code{x} is a \link{IntegerRangesList} object: \itemize{ \item \code{range}, \code{reduce}, \code{gaps}, and \code{disjoin} return a \link{IntegerRangesList} object \emph{parallel} to \code{x}. \item \code{isDisjoint} returns a logical vector \emph{parallel} to \code{x}. \item \code{disjointBins} returns an \link{IntegerList} object \emph{parallel} to \code{x}. } } \author{H. Pagès, M. Lawrence, and P. Aboyoun} \seealso{ \itemize{ \item \link{intra-range-methods} for intra range transformations. \item The \link{IntegerRanges}, \link{Views}, \link{IntegerRangesList}, and \link{MaskCollection} classes. \item The \link[GenomicRanges]{inter-range-methods} man page in the \pkg{GenomicRanges} package for \emph{inter range transformations} of genomic ranges. \item \link{setops-methods} for set operations on \link{IRanges} objects. \item \code{\link[S4Vectors]{endoapply}} in the \pkg{S4Vectors} package. } } \examples{ ## --------------------------------------------------------------------- ## range() ## --------------------------------------------------------------------- ## On an IntegerRanges object: x <- IRanges(start=c(-2, 6, 9, -4, 1, 0, -6, 3, 10), width=c( 5, 0, 6, 1, 4, 3, 2, 0, 3)) range(x) ## On an IntegerRangesList object (XVector package required): range1 <- IRanges(start=c(1, 2, 3), end=c(5, 2, 8)) range2 <- IRanges(start=c(15, 45, 20, 1), end=c(15, 100, 80, 5)) range3 <- IRanges(start=c(-2, 6, 7), width=c(8, 0, 0)) # with empty ranges collection <- IRangesList(one=range1, range2, range3) if (require(XVector)) { range(collection) } irl1 <- IRangesList(a=IRanges(c(1, 2),c(4, 3)), b=IRanges(c(4, 6),c(10, 7))) irl2 <- IRangesList(c=IRanges(c(0, 2),c(4, 5)), a=IRanges(c(4, 5),c(6, 7))) range(irl1, irl2) # matched by names names(irl2) <- NULL range(irl1, irl2) # now by position ## --------------------------------------------------------------------- ## reduce() ## --------------------------------------------------------------------- ## On an IntegerRanges object: reduce(x) y <- reduce(x, with.revmap=TRUE) mcols(y)$revmap # an IntegerList reduce(x, drop.empty.ranges=TRUE) y <- reduce(x, drop.empty.ranges=TRUE, with.revmap=TRUE) mcols(y)$revmap ## Use the mapping from reduced to original ranges to split the DataFrame ## of original metadata columns by reduced range: ir0 <- IRanges(c(11:13, 2, 7:6), width=3) mcols(ir0) <- DataFrame(id=letters[1:6], score=1:6) ir <- reduce(ir0, with.revmap=TRUE) ir revmap <- mcols(ir)$revmap revmap relist(mcols(ir0)[unlist(revmap), ], revmap) # a SplitDataFrameList ## On an IntegerRangesList object. These 4 are the same: res1 <- reduce(collection) res2 <- IRangesList(one=reduce(range1), reduce(range2), reduce(range3)) res3 <- do.call(IRangesList, lapply(collection, reduce)) res4 <- endoapply(collection, reduce) stopifnot(identical(res2, res1)) stopifnot(identical(res3, res1)) stopifnot(identical(res4, res1)) reduce(collection, drop.empty.ranges=TRUE) ## --------------------------------------------------------------------- ## gaps() ## --------------------------------------------------------------------- ## On an IntegerRanges object: x0 <- IRanges(start=c(-2, 6, 9, -4, 1, 0, -6, 10), width=c( 5, 0, 6, 1, 4, 3, 2, 3)) gaps(x0) gaps(x0, start=-6, end=20) ## On a Views object: subject <- Rle(1:-3, 6:2) v <- Views(subject, start=c(8, 3), end=c(14, 4)) gaps(v) ## On an IntegerRangesList object. These 4 are the same: res1 <- gaps(collection) res2 <- IRangesList(one=gaps(range1), gaps(range2), gaps(range3)) res3 <- do.call(IRangesList, lapply(collection, gaps)) res4 <- endoapply(collection, gaps) stopifnot(identical(res2, res1)) stopifnot(identical(res3, res1)) stopifnot(identical(res4, res1)) ## On a MaskCollection object: mask1 <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2)) mask2 <- Mask(mask.width=29, start=c(3, 10, 27), width=c(5, 8, 1)) mask3 <- Mask(mask.width=29, start=c(7, 12), width=c(2, 4)) mymasks <- append(append(mask1, mask2), mask3) mymasks gaps(mymasks) ## --------------------------------------------------------------------- ## disjoin() ## --------------------------------------------------------------------- ## On an IntegerRanges object: ir <- IRanges(c(1, 1, 4, 10), c(6, 3, 8, 10)) disjoin(ir) # IRanges(c(1, 4, 7, 10), c(3, 6, 8, 10)) disjoin(ir, with.revmap=TRUE) ## On an IntegerRangesList object: disjoin(collection) disjoin(collection, with.revmap=TRUE) ## --------------------------------------------------------------------- ## isDisjoint() ## --------------------------------------------------------------------- ## On an IntegerRanges object: isDisjoint(IRanges(c(2,5,1), c(3,7,3))) # FALSE isDisjoint(IRanges(c(2,9,5), c(3,9,6))) # TRUE isDisjoint(IRanges(1, 5)) # TRUE ## Handling of empty ranges: x <- IRanges(c(11, 16, 11, -2, 11), c(15, 29, 10, 10, 10)) stopifnot(isDisjoint(x)) ## Sliding an empty range along a non-empty range: sapply(11:17, function(i) pcompare(IRanges(i, width=0), IRanges(12, 15))) sapply(11:17, function(i) isDisjoint(c(IRanges(i, width=0), IRanges(12, 15)))) ## On an IntegerRangesList object: isDisjoint(collection) ## --------------------------------------------------------------------- ## disjointBins() ## --------------------------------------------------------------------- ## On an IntegerRanges object: disjointBins(IRanges(1, 5)) # 1L disjointBins(IRanges(c(3, 1, 10), c(5, 12, 13))) # c(2L, 1L, 2L) ## On an IntegerRangesList object: disjointBins(collection) } \keyword{utilities} IRanges/man/intra-range-methods.Rd0000644000175100017510000004051314626176651020055 0ustar00biocbuildbiocbuild\name{intra-range-methods} \alias{intra-range-methods} \alias{update_ranges} \alias{update_ranges,IRanges-method} \alias{update_ranges,Views-method} \alias{shift} \alias{shift,Ranges-method} \alias{shift,IPos-method} \alias{shift,RangesList-method} \alias{narrow} \alias{narrow,ANY-method} \alias{windows,Ranges-method} \alias{narrow,MaskCollection-method} \alias{resize} \alias{resize,Ranges-method} \alias{resize,RangesList-method} \alias{flank} \alias{flank,Ranges-method} \alias{flank,RangesList-method} \alias{promoters} \alias{terminators} \alias{promoters,IntegerRanges-method} \alias{terminators,IntegerRanges-method} \alias{promoters,RangesList-method} \alias{terminators,RangesList-method} \alias{reflect} \alias{reflect,IntegerRanges-method} \alias{restrict} \alias{restrict,IntegerRanges-method} \alias{restrict,Views-method} \alias{restrict,RangesList-method} \alias{threebands} \alias{threebands,IRanges-method} \alias{Ops,Ranges,numeric-method} \alias{Ops,RangesList,numeric-method} \alias{Ops,CompressedRangesList,numeric-method} \title{Intra range transformations of an IRanges, IPos, Views, RangesList, or MaskCollection object} \description{ Range-based transformations are grouped in 2 categories: \enumerate{ \item \emph{Intra range transformations} (e.g. \code{shift()}) transform each range individually (and independently of the other ranges). They return an object \emph{parallel} to the input object, that is, where the i-th range corresponds to the i-th range in the input. Those transformations are described below. \item \emph{Inter range transformations} (e.g. \code{\link{reduce}()}) transform all the ranges together as a set to produce a new set of ranges. They return an object that is generally \emph{NOT} parallel to the input object. Those transformations are described in the \link{inter-range-methods} man page (see \code{?`\link{inter-range-methods}`}). } Except for \code{threebands()}, all the transformations described in this man page are \emph{endomorphisms} that operate on a single "range-based" object, that is, they transform the ranges contained in the input object and return them in an object of the \emph{same class} as the input object. } \usage{ shift(x, shift=0L, use.names=TRUE) narrow(x, start=NA, end=NA, width=NA, use.names=TRUE) resize(x, width, fix="start", use.names=TRUE, ...) flank(x, width, start=TRUE, both=FALSE, use.names=TRUE, ...) promoters(x, upstream=2000, downstream=200, use.names=TRUE, ...) terminators(x, upstream=2000, downstream=200, use.names=TRUE, ...) reflect(x, bounds, use.names=TRUE) restrict(x, start=NA, end=NA, keep.all.ranges=FALSE, use.names=TRUE) threebands(x, start=NA, end=NA, width=NA) } \arguments{ \item{x}{ An \link{IRanges}, \link{IPos}, \link{Views}, \link{RangesList}, or \link{MaskCollection} object. } \item{shift}{ An integer vector containing the shift information. Recycled as necessary so that each element corresponds to a range in \code{x}. Can also be a list-like object \emph{parallel} to \code{x} if \code{x} is a \link{RangesList} object. } \item{use.names}{ \code{TRUE} or \code{FALSE}. Should names be preserved? } \item{start, end}{ If \code{x} is an \link{IRanges}, \link{IPos} or \link{Views} object: A vector of integers for all functions except for \code{flank}. For \code{restrict}, the supplied \code{start} and \code{end} arguments must be vectors of integers, eventually with NAs, that specify the restriction interval(s). Recycled as necessary so that each element corresponds to a range in \code{x}. Same thing for \code{narrow} and \code{threebands}, except that here \code{start} and \code{end} must contain coordinates relative to the ranges in \code{x}. See the Details section below. For \code{flank}, \code{start} is a logical indicating whether \code{x} should be flanked at the start (\code{TRUE}) or the end (\code{FALSE}). Recycled as necessary so that each element corresponds to a range in \code{x}. Can also be list-like objects \emph{parallel} to \code{x} if \code{x} is a \link{RangesList} object. } \item{width}{ If \code{x} is an \link{IRanges}, \link{IPos} or \link{Views} object: For \code{narrow} and \code{threebands}, a vector of integers, eventually with NAs. See the SEW (Start/End/Width) interface for the details (\code{?solveUserSEW}). For \code{resize} and \code{flank}, the width of the resized or flanking regions. Note that if \code{both} is \code{TRUE}, this is effectively doubled. Recycled as necessary so that each element corresponds to a range in \code{x}. Can also be a list-like object \emph{parallel} to \code{x} if \code{x} is a \link{RangesList} object. } \item{fix}{ If \code{x} is an \link{IRanges}, \link{IPos} or \link{Views} object: A character vector or character-Rle of length 1 or \code{length(x)} containing the values \code{"start"}, \code{"end"}, and \code{"center"} denoting what to use as an anchor for each element in \code{x}. Can also be a list-like object \emph{parallel} to \code{x} if \code{x} is a \link{RangesList} object. } \item{...}{ Additional arguments for methods. } \item{both}{ If \code{TRUE}, extends the flanking region \code{width} positions \emph{into} the range. The resulting range thus straddles the end point, with \code{width} positions on either side. } \item{upstream, downstream}{ Vectors of non-NA non-negative integers. Recycled as necessary so that each element corresponds to a range in \code{x}. Can also be list-like objects \emph{parallel} to \code{x} if \code{x} is a \link{RangesList} object. \code{upstream} defines the number of nucleotides toward the 5' end and \code{downstream} defines the number toward the 3' end, relative to the transcription start site. Promoter regions are formed by merging the upstream and downstream ranges. Default values for the \code{upstream} and \code{downstream} arguments of \code{promoters()} were chosen based on our current understanding of gene regulation. On average, promoter regions in the mammalian genome are 5000 bp upstream and downstream of the transcription start site. Note that the same default values are used in \code{terminators()} at the moment. However this could be revisited if the case is made to use values that reflect more closely the biology of terminator regions in mammalian genomes. } \item{bounds}{ An \link{IRanges} object to serve as the reference bounds for the reflection, see below. } \item{keep.all.ranges}{ \code{TRUE} or \code{FALSE}. Should ranges that don't overlap with the restriction interval(s) be kept? Note that "don't overlap" means that they end strictly before \code{start - 1} or start strictly after \code{end + 1}. Ranges that end at \code{start - 1} or start at \code{end + 1} are always kept and their width is set to zero in the returned \link{IRanges} object. } } \details{ Unless specified otherwise, when \code{x} is a \link{RangesList} object, any transformation described here is equivalent to applying the transformation to each list element in \code{x}. \subsection{shift}{ \code{shift} shifts all the ranges in \code{x} by the amount specified by the \code{shift} argument. }\subsection{narrow}{ \code{narrow} narrows the ranges in \code{x} i.e. each range in the returned \link{IntegerRanges} object is a subrange of the corresponding range in \code{x}. The supplied start/end/width values are solved by a call to \code{solveUserSEW(width(x), start=start, end=end, width=width)} and therefore must be compliant with the rules of the SEW (Start/End/Width) interface (see \code{?\link{solveUserSEW}} for the details). Then each subrange is derived from the original range according to the solved start/end/width values for this range. Note that those solved values are interpreted relatively to the original range. }\subsection{resize}{ \code{resize} resizes the ranges to the specified width where either the start, end, or center is used as an anchor. }\subsection{flank}{ \code{flank} generates flanking ranges for each range in \code{x}. If \code{start} is \code{TRUE} for a given range, the flanking occurs at the start, otherwise the end. The widths of the flanks are given by the \code{width} parameter. The widths can be negative, in which case the flanking region is reversed so that it represents a prefix or suffix of the range in \code{x}. The \code{flank} operation is illustrated below for a call of the form \code{flank(x, 3, TRUE)}, where \code{x} indicates a range in \code{x} and \code{-} indicates the resulting flanking region: \preformatted{ ---xxxxxxx} If \code{start} were \code{FALSE}: \preformatted{ xxxxxxx---} For negative width, i.e. \code{flank(x, -3, FALSE)}, where \code{*} indicates the overlap between \code{x} and the result: \preformatted{ xxxx***} If \code{both} is \code{TRUE}, then, for all ranges in \code{x}, the flanking regions are extended \emph{into} (or out of, if width is negative) the range, so that the result straddles the given endpoint and has twice the width given by \code{width}. This is illustrated below for \code{flank(x, 3, both=TRUE)}: \preformatted{ ---***xxxx} }\subsection{promoters and terminators}{ \code{promoters} generates promoter ranges for each range in \code{x} relative to the transcription start site (TSS), where TSS is \code{start(x)}. The promoter range is expanded around the TSS according to the \code{upstream} and \code{downstream} arguments. \code{upstream} represents the number of nucleotides in the 5' direction and \code{downstream} the number in the 3' direction. The full range is defined as, (start(x) - upstream) to (start(x) + downstream - 1). For documentation for using \code{promoters} on a \link[GenomicRanges]{GRanges} object see \code{?`\link[GenomicRanges]{promoters,GenomicRanges-method}`} in the \pkg{GenomicRanges} package. \code{terminators} is similar to \code{promoters} except that the generated ranges are relative to the transcription end sites (TES) returned by \code{end(x)}. }\subsection{reflect}{ \code{reflect} "reflects" or reverses each range in \code{x} relative to the corresponding range in \code{bounds}, which is recycled as necessary. Reflection preserves the width of a range, but shifts it such the distance from the left bound to the start of the range becomes the distance from the end of the range to the right bound. This is illustrated below, where \code{x} represents a range in \code{x} and \code{[} and \code{]} indicate the bounds: \preformatted{ [..xxx.....] becomes [.....xxx..]} }\subsection{restrict}{ \code{restrict} restricts the ranges in \code{x} to the interval(s) specified by the \code{start} and \code{end} arguments. }\subsection{threebands}{ \code{threebands} extends the capability of \code{narrow} by returning the 3 ranges objects associated to the narrowing operation. The returned value \code{y} is a list of 3 ranges objects named \code{"left"}, \code{"middle"} and \code{"right"}. The middle component is obtained by calling \code{narrow} with the same arguments (except that names are dropped). The left and right components are also instances of the same class as \code{x} and they contain what has been removed on the left and right sides (respectively) of the original ranges during the narrowing. Note that original object \code{x} can be reconstructed from the left and right bands with \code{punion(y$left, y$right, fill.gap=TRUE)}. } } \author{H. Pagès, M. Lawrence, and P. Aboyoun} \seealso{ \itemize{ \item \link{inter-range-methods} for inter range transformations. \item The \link{IRanges}, \link{IPos}, \link{Views}, \link{RangesList}, and \link{MaskCollection} classes. \item The \link[GenomicRanges]{intra-range-methods} man page in the \pkg{GenomicRanges} package for \emph{intra range transformations} of genomic ranges. \item \link{setops-methods} for set operations on \link{IRanges} objects. \item \code{\link[S4Vectors]{endoapply}} in the \pkg{S4Vectors} package. } } \examples{ ## --------------------------------------------------------------------- ## shift() ## --------------------------------------------------------------------- ## On an IRanges object: ir1 <- successiveIRanges(c(19, 5, 0, 8, 5)) ir1 shift(ir1, shift=-3) ## On an IRangesList object: range1 <- IRanges(start=c(1, 2, 3), end=c(5, 2, 8)) range2 <- IRanges(start=c(15, 45, 20, 1), end=c(15, 100, 80, 5)) range3 <- IRanges(start=c(-2, 6, 7), width=c(8, 0, 0)) # with empty ranges collection <- IRangesList(one=range1, range2, range3) shift(collection, shift=5) # same as endoapply(collection, shift, shift=5) ## Sanity check: res1 <- shift(collection, shift=5) res2 <- endoapply(collection, shift, shift=5) stopifnot(identical(res1, res2)) ## --------------------------------------------------------------------- ## narrow() ## --------------------------------------------------------------------- ## On an IRanges object: ir2 <- ir1[width(ir1) != 0] narrow(ir2, start=4, end=-2) narrow(ir2, start=-4, end=-2) narrow(ir2, end=5, width=3) narrow(ir2, start=c(3, 4, 2, 3), end=c(12, 5, 7, 4)) ## On an IRangesList object: narrow(collection[-3], start=2) narrow(collection[-3], end=-2) ## On a MaskCollection object: mask1 <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2)) mask2 <- Mask(mask.width=29, start=c(3, 10, 27), width=c(5, 8, 1)) mask3 <- Mask(mask.width=29, start=c(7, 12), width=c(2, 4)) mymasks <- append(append(mask1, mask2), mask3) mymasks narrow(mymasks, start=8) ## --------------------------------------------------------------------- ## resize() ## --------------------------------------------------------------------- ## On an IRanges object: resize(ir2, 200) resize(ir2, 2, fix="end") ## On an IRangesList object: resize(collection, width=200) ## --------------------------------------------------------------------- ## flank() ## --------------------------------------------------------------------- ## On an IRanges object: ir3 <- IRanges(start=c(2,5,1), end=c(3,7,3)) flank(ir3, 2) flank(ir3, 2, start=FALSE) flank(ir3, 2, start=c(FALSE, TRUE, FALSE)) flank(ir3, c(2, -2, 2)) flank(ir3, 2, both = TRUE) flank(ir3, 2, start=FALSE, both=TRUE) flank(ir3, -2, start=FALSE, both=TRUE) ## On an IRangesList object: flank(collection, width=10) ## --------------------------------------------------------------------- ## promoters() ## --------------------------------------------------------------------- ## On an IRanges object: ir4 <- IRanges(start=10001:10004, end=12000) promoters(ir4, upstream=800, downstream=0) promoters(ir4, upstream=0, downstream=50) promoters(ir4, upstream=800, downstream=50) promoters(ir4, upstream=0, downstream=1) # TSS ## On an IRangesList object: promoters(collection, upstream=5, downstream=2) ## --------------------------------------------------------------------- ## reflect() ## --------------------------------------------------------------------- ## On an IRanges object: bounds <- IRanges(start=c(0, 5, 3), end=c(10, 6, 9)) reflect(ir3, bounds) ## reflect() does not yet support IRangesList objects! ## --------------------------------------------------------------------- ## restrict() ## --------------------------------------------------------------------- ## On an IRanges object: restrict(ir1, start=12, end=34) restrict(ir1, start=20) restrict(ir1, start=21) restrict(ir1, start=21, keep.all.ranges=TRUE) ## On an IRangesList object: restrict(collection, start=2, end=8) restrict(collection, start=2, end=8, keep.all.ranges=TRUE) ## --------------------------------------------------------------------- ## threebands() ## --------------------------------------------------------------------- ## On an IRanges object: z <- threebands(ir2, start=4, end=-2) ir2b <- punion(z$left, z$right, fill.gap=TRUE) stopifnot(identical(ir2, ir2b)) threebands(ir2, start=-5) ## threebands() does not support IRangesList objects. } \keyword{utilities} IRanges/man/multisplit.Rd0000644000175100017510000000160014626176651016405 0ustar00biocbuildbiocbuild\name{multisplit} \alias{multisplit} \title{ Split elements belonging to multiple groups } \description{ This is like \code{\link{split}}, except elements can belong to multiple groups, in which case they are repeated to appear in multiple elements of the return value. } \usage{ multisplit(x, f) } \arguments{ \item{x}{ The object to split, like a vector. } \item{f}{ A list-like object of vectors, the same length as \code{x}, where each element indicates the groups to which each element of \code{x} belongs. } } \value{ A list-like object, with an element for each unique value in the unlisted \code{f}, containing the elements in \code{x} where the corresponding element in \code{f} contained that value. Just try it. } \author{ Michael Lawrence } \examples{ multisplit(1:3, list(letters[1:2], letters[2:3], letters[2:4])) } \keyword{ manip } IRanges/man/nearest-methods.Rd0000644000175100017510000002047714626176651017316 0ustar00biocbuildbiocbuild\name{nearest-methods} \alias{nearest-methods} \alias{class:IntegerRanges_OR_missing} \alias{IntegerRanges_OR_missing-class} \alias{IntegerRanges_OR_missing} \alias{nearest} \alias{precede} \alias{follow} \alias{distance} \alias{distanceToNearest} \alias{selectNearest} \alias{nearest,IntegerRanges,IntegerRanges_OR_missing-method} \alias{precede,IntegerRanges,IntegerRanges_OR_missing-method} \alias{follow,IntegerRanges,IntegerRanges_OR_missing-method} \alias{distance,IntegerRanges,IntegerRanges-method} \alias{distance,Pairs,missing-method} \alias{distanceToNearest,IntegerRanges,IntegerRanges_OR_missing-method} \title{Finding the nearest range/position neighbor} \description{ The \code{nearest()}, \code{precede()}, \code{follow()}, \code{distance()} and \code{distanceToNearest()} methods for \link{IntegerRanges} derivatives (e.g. \link{IRanges} objects). } \usage{ \S4method{nearest}{IntegerRanges,IntegerRanges_OR_missing}(x, subject, select=c("arbitrary", "all")) \S4method{precede}{IntegerRanges,IntegerRanges_OR_missing}(x, subject, select=c("first", "all")) \S4method{follow}{IntegerRanges,IntegerRanges_OR_missing}(x, subject, select=c("last", "all")) \S4method{distance}{IntegerRanges,IntegerRanges}(x, y) \S4method{distance}{Pairs,missing}(x, y) \S4method{distanceToNearest}{IntegerRanges,IntegerRanges_OR_missing}(x, subject, select=c("arbitrary", "all")) } \arguments{ \item{x}{The query \link{IntegerRanges} derivative, or (for \code{distance()}) a \link[S4Vectors]{Pairs} object containing both the query (first) and subject (second). } \item{subject}{The subject \link{IntegerRanges} object, within which the nearest neighbors are found. Can be missing, in which case \code{x} is also the subject. } \item{select}{Logic for handling ties. By default, all the methods select a single interval (arbitrary for \code{nearest},the first by order in \code{subject} for \code{precede}, and the last for \code{follow}). To get all matchings, as a \link[S4Vectors]{Hits} object, use \code{"all"}. } \item{y}{For the \code{distance} method, a \link{IntegerRanges} derivative. Cannot be missing. If \code{x} and \code{y} are not the same length, the shortest will be recycled to match the length of the longest. } \item{hits}{The hits between \code{x} and \code{subject}} \item{...}{Additional arguments for methods} } \details{ \itemize{ \item{\code{nearest(x, subject, select=c("arbitrary", "all"))}:}{ The conventional nearest neighbor finder. Returns an integer vector containing the index of the nearest neighbor range in \code{subject} for each range in \code{x}. If there is no nearest neighbor (if \code{subject} is empty), NA's are returned. Here is roughly how it proceeds, for a range \code{xi} in \code{x}: \enumerate{ \item Find the ranges in \code{subject} that overlap \code{xi}. If a single range \code{si} in \code{subject} overlaps \code{xi}, \code{si} is returned as the nearest neighbor of \code{xi}. If there are multiple overlaps, one of the overlapping ranges is chosen arbitrarily. \item If no ranges in \code{subject} overlap with \code{xi}, then the range in \code{subject} with the shortest distance from its end to the start \code{xi} or its start to the end of \code{xi} is returned. } } \item{\code{precede(x, subject, select=c("first", "all"))}:}{ For each range in \code{x}, \code{precede} returns the index of the interval in \code{subject} that is directly preceded by the query range. Overlapping ranges are excluded. \code{NA} is returned when there are no qualifying ranges in \code{subject}. } \item{\code{follow(x, subject, select=c("last", "all"))}:}{ The opposite of \code{precede}, this function returns the index of the range in \code{subject} that a query range in \code{x} directly follows. Overlapping ranges are excluded. \code{NA} is returned when there are no qualifying ranges in \code{subject}. } \item{\code{distance(x, y)}:}{ Returns the distance for each range in \code{x} to the range in \code{y}. The \code{distance} method differs from others documented on this page in that it is symmetric; \code{y} cannot be missing. If \code{x} and \code{y} are not the same length, the shortest will be recycled to match the length of the longest. The \code{select} argument is not available for \code{distance} because comparisons are made in a pair-wise fashion. The return value is the length of the longest of \code{x} and \code{y}. The \code{distance} calculation changed in BioC 2.12 to accommodate zero-width ranges in a consistent and intuitive manner. The new distance can be explained by a \emph{block} model where a range is represented by a series of blocks of size 1. Blocks are adjacent to each other and there is no gap between them. A visual representation of \code{IRanges(4,7)} would be \preformatted{ +-----+-----+-----+-----+ 4 5 6 7 } The distance between two consecutive blocks is 0L (prior to Bioconductor 2.12 it was 1L). The new distance calculation now returns the size of the gap between two ranges. This change to distance affects the notion of overlaps in that we no longer say: x and y overlap <=> distance(x, y) == 0 Instead we say x and y overlap => distance(x, y) == 0 or x and y overlap or are adjacent <=> distance(x, y) == 0 } \item{\code{distanceToNearest(x, subject, select=c("arbitrary", "all"))}:}{ Returns the distance for each range in \code{x} to its nearest neighbor in \code{subject}. } \item{\code{selectNearest(hits, x, subject)}:}{ Selects the hits that have the minimum distance within those for each query range. Ties are possible and can be broken with \code{\link[S4Vectors]{breakTies}}. } } } \value{ For \code{nearest()}, \code{precede()} and \code{follow()}, an integer vector of indices in \code{subject}, or a \link[S4Vectors]{Hits} object if \code{select="all"}. For \code{distance()}, an integer vector of distances between the ranges in \code{x} and \code{y}. For \code{distanceToNearest()}, a \link[S4Vectors]{Hits} object with a metadata column reporting the \code{distance} between the pair. Access the \code{distance} metadata column with the \code{\link[S4Vectors]{mcols}()} accessor. For \code{selectNearest()}, a \link[S4Vectors]{Hits} object, sorted by query. } \author{M. Lawrence} \seealso{ \itemize{ \item \link[S4Vectors]{Hits} objects implemented in the \pkg{S4Vectors} package. \item \code{\link{findOverlaps}} for finding just the overlapping ranges. \item The \link{IntegerRanges} class. \item \link[GenomicRanges]{nearest-methods} in the \pkg{GenomicRanges} package for the \code{nearest()}, \code{precede()}, \code{follow()}, \code{distance()}, and \code{distanceToNearest()} methods for \link[GenomicRanges]{GenomicRanges} objects. } } \examples{ ## ------------------------------------------ ## precede() and follow() ## ------------------------------------------ query <- IRanges(c(1, 3, 9), c(3, 7, 10)) subject <- IRanges(c(3, 2, 10), c(3, 13, 12)) precede(query, subject) # c(3L, 3L, NA) precede(IRanges(), subject) # integer() precede(query, IRanges()) # rep(NA_integer_, 3) precede(query) # c(3L, 3L, NA) follow(query, subject) # c(NA, NA, 1L) follow(IRanges(), subject) # integer() follow(query, IRanges()) # rep(NA_integer_, 3) follow(query) # c(NA, NA, 2L) ## ------------------------------------------ ## nearest() ## ------------------------------------------ query <- IRanges(c(1, 3, 9), c(2, 7, 10)) subject <- IRanges(c(3, 5, 12), c(3, 6, 12)) nearest(query, subject) # c(1L, 1L, 3L) nearest(query) # c(2L, 1L, 2L) ## ------------------------------------------ ## distance() ## ------------------------------------------ ## adjacent distance(IRanges(1,5), IRanges(6,10)) # 0L ## overlap distance(IRanges(1,5), IRanges(3,7)) # 0L ## zero-width sapply(-3:3, function(i) distance(shift(IRanges(4,3), i), IRanges(4,3))) } \keyword{utilities} \keyword{methods} IRanges/man/range-squeezers.Rd0000644000175100017510000000702514626176651017326 0ustar00biocbuildbiocbuild\name{range-squeezers} \alias{range-squeezers} \alias{ranges} \alias{rglist} \alias{rglist,Pairs-method} \title{Squeeze the ranges out of a range-based object} \description{ S4 generic functions for squeezing the ranges out of a range-based object. These are analog to range squeezers \code{\link[GenomicRanges]{granges}} and \code{\link[GenomicRanges]{grglist}} defined in the \pkg{GenomicRanges} package, except that \code{ranges} returns the ranges in an \link{IRanges} object (instead of a \link[GenomicRanges]{GRanges} object for \code{\link[GenomicRanges]{granges}}), and \code{rglist} returns them in an \link{IRangesList} object (instead of a \link[GenomicRanges]{GRangesList} object for \code{\link[GenomicRanges]{grglist}}). } \usage{ ranges(x, use.names=TRUE, use.mcols=FALSE, ...) rglist(x, use.names=TRUE, use.mcols=FALSE, ...) } \arguments{ \item{x}{ An object containing ranges e.g. a \link{IntegerRanges}, \link[GenomicRanges]{GenomicRanges}, \link[SummarizedExperiment]{RangedSummarizedExperiment}, \link[GenomicAlignments]{GAlignments}, \link[GenomicAlignments]{GAlignmentPairs}, or \link[GenomicAlignments]{GAlignmentsList} object, or a \link[S4Vectors]{Pairs} object containing ranges. } \item{use.names}{ \code{TRUE} (the default) or \code{FALSE}. Whether or not the names on \code{x} (accessible with \code{names(x)}) should be propagated to the returned object. } \item{use.mcols}{ \code{TRUE} or \code{FALSE} (the default). Whether or not the metadata columns on \code{x} (accessible with \code{mcols(x)}) should be propagated to the returned object. } \item{...}{ Additional arguments, for use in specific methods. } } \details{ Various packages (e.g. \pkg{IRanges}, \pkg{GenomicRanges}, \pkg{SummarizedExperiment}, \pkg{GenomicAlignments}, etc...) define and document various range squeezing methods for various types of objects. Note that these functions can be seen as \emph{object getters} or as functions performing coercion. For some objects (e.g. \link[GenomicAlignments]{GAlignments} and \link[GenomicAlignments]{GAlignmentPairs} objects defined in the \pkg{GenomicAlignments} package), \code{as(x, "IRanges")} and \code{as(x, "IRangesList")}, are equivalent to \code{ranges(x, use.names=TRUE, use.mcols=TRUE)} and \code{rglist(x, use.names=TRUE, use.mcols=TRUE)}, respectively. } \value{ An \link{IRanges} object for \code{ranges}. An \link{IRangesList} object for \code{rglist}. If \code{x} is a vector-like object (e.g. \link[GenomicAlignments]{GAlignments}), the returned object is expected to be \emph{parallel} to \code{x}, that is, the i-th element in the output corresponds to the i-th element in the input. If \code{use.names} is TRUE, then the names on \code{x} (if any) are propagated to the returned object. If \code{use.mcols} is TRUE, then the metadata columns on \code{x} (if any) are propagated to the returned object. } \author{H. Pagès} \seealso{ \itemize{ \item \link{IRanges} and \link{IRangesList} objects. \item \link[SummarizedExperiment]{RangedSummarizedExperiment} objects in the \pkg{SummarizedExperiment} packages. \item \link[GenomicAlignments]{GAlignments}, \link[GenomicAlignments]{GAlignmentPairs}, and \link[GenomicAlignments]{GAlignmentsList} objects in the \pkg{GenomicAlignments} package. } } \examples{ ## See ?GAlignments in the GenomicAlignments package for examples of ## "ranges" and "rglist" methods. } \keyword{methods} IRanges/man/read.Mask.Rd0000644000175100017510000001450314626176651016012 0ustar00biocbuildbiocbuild\name{read.Mask} \alias{read.Mask} \alias{read.agpMask} \alias{read.gapMask} \alias{read.liftMask} \alias{read.rmMask} \alias{read.trfMask} \title{Read a mask from a file} \description{ \code{read.agpMask} and \code{read.gapMask} extract the AGAPS mask from an NCBI "agp" file or a UCSC "gap" file, respectively. \code{read.liftMask} extracts the AGAPS mask from a UCSC "lift" file (i.e. a file containing offsets of contigs within sequences). \code{read.rmMask} extracts the RM mask from a RepeatMasker .out file. \code{read.trfMask} extracts the TRF mask from a Tandem Repeats Finder .bed file. } \usage{ read.agpMask(file, seqname="?", mask.width=NA, gap.types=NULL, use.gap.types=FALSE) read.gapMask(file, seqname="?", mask.width=NA, gap.types=NULL, use.gap.types=FALSE) read.liftMask(file, seqname="?", mask.width=NA) read.rmMask(file, seqname="?", mask.width=NA, use.IDs=FALSE) read.trfMask(file, seqname="?", mask.width=NA) } \arguments{ \item{file}{ Either a character string naming a file or a connection open for reading. } \item{seqname}{ The name of the sequence for which the mask must be extracted. If no sequence is specified (i.e. \code{seqname="?"}) then an error is raised and the sequence names found in the file are displayed. If the file doesn't contain any information for the specified sequence, then a warning is issued and an empty mask of width \code{mask.width} is returned. } \item{mask.width}{ The width of the mask to return i.e. the length of the sequence this mask will be put on. See \code{?`\link{MaskCollection-class}`} for more information about the width of a \link{MaskCollection} object. } \item{gap.types}{ \code{NULL} or a character vector containing gap types. Use this argument to filter the assembly gaps that are to be extracted from the "agp" or "gap" file based on their type. Most common gap types are \code{"contig"}, \code{"clone"}, \code{"centromere"}, \code{"telomere"}, \code{"heterochromatin"}, \code{"short_arm"} and \code{"fragment"}. With \code{gap.types=NULL}, all the assembly gaps described in the file are extracted. With \code{gap.types="?"}, an error is raised and the gap types found in the file for the specified sequence are displayed. } \item{use.gap.types}{ Whether or not the gap types provided in the "agp" or "gap" file should be used to name the ranges constituing the returned mask. See \code{?`\link{IRanges-class}`} for more information about the names of an \link{IRanges} object. } \item{use.IDs}{ Whether or not the repeat IDs provided in the RepeatMasker .out file should be used to name the ranges constituing the returned mask. See \code{?`\link{IRanges-class}`} for more information about the names of an \link{IRanges} object. } } \seealso{ \link{MaskCollection-class}, \link{IRanges-class} } \examples{ ## --------------------------------------------------------------------- ## A. Extract a mask of assembly gaps ("AGAPS" mask) with read.agpMask() ## --------------------------------------------------------------------- ## Note: The hs_b36v3_chrY.agp file was obtained by downloading, ## extracting and renaming the hs_ref_chrY.agp.gz file from ## ## ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/Assembled_chromosomes/ ## hs_ref_chrY.agp.gz 5 KB 24/03/08 04:33:00 PM ## ## on May 9, 2008. chrY_length <- 57772954 file1 <- system.file("extdata", "hs_b36v3_chrY.agp", package="IRanges") mask1 <- read.agpMask(file1, seqname="chrY", mask.width=chrY_length, use.gap.types=TRUE) mask1 mask1[[1]] mask11 <- read.agpMask(file1, seqname="chrY", mask.width=chrY_length, gap.types=c("centromere", "heterochromatin")) mask11[[1]] ## --------------------------------------------------------------------- ## B. Extract a mask of assembly gaps ("AGAPS" mask) with read.liftMask() ## --------------------------------------------------------------------- ## Note: The hg18liftAll.lft file was obtained by downloading, ## extracting and renaming the liftAll.zip file from ## ## http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/ ## liftAll.zip 03-Feb-2006 11:35 5.5K ## ## on May 8, 2008. file2 <- system.file("extdata", "hg18liftAll.lft", package="IRanges") mask2 <- read.liftMask(file2, seqname="chr1") mask2 if (interactive()) { ## contigs 7 and 8 for chrY are adjacent read.liftMask(file2, seqname="chrY") ## displays the sequence names found in the file read.liftMask(file2) ## specify an unknown sequence name read.liftMask(file2, seqname="chrZ", mask.width=300) } ## --------------------------------------------------------------------- ## C. Extract a RepeatMasker ("RM") or Tandem Repeats Finder ("TRF") ## mask with read.rmMask() or read.trfMask() ## --------------------------------------------------------------------- ## Note: The ce2chrM.fa.out and ce2chrM.bed files were obtained by ## downloading, extracting and renaming the chromOut.zip and ## chromTrf.zip files from ## ## http://hgdownload.cse.ucsc.edu/goldenPath/ce2/bigZips/ ## chromOut.zip 21-Apr-2004 09:05 2.6M ## chromTrf.zip 21-Apr-2004 09:07 182K ## ## on May 7, 2008. ## Before you can extract a mask with read.rmMask() or read.trfMask(), you ## need to know the length of the sequence that you're going to put the ## mask on: if (interactive()) { library(BSgenome.Celegans.UCSC.ce2) chrM_length <- seqlengths(Celegans)[["chrM"]] ## Read the RepeatMasker .out file for chrM in ce2: file3 <- system.file("extdata", "ce2chrM.fa.out", package="IRanges") RMmask <- read.rmMask(file3, seqname="chrM", mask.width=chrM_length) RMmask ## Read the Tandem Repeats Finder .bed file for chrM in ce2: file4 <- system.file("extdata", "ce2chrM.bed", package="IRanges") TRFmask <- read.trfMask(file4, seqname="chrM", mask.width=chrM_length) TRFmask desc(TRFmask) <- paste(desc(TRFmask), "[period<=12]") TRFmask ## Put the 2 masks on chrM: chrM <- Celegans$chrM masks(chrM) <- RMmask # this would drop all current masks, if any masks(chrM) <- append(masks(chrM), TRFmask) chrM } } \keyword{manip} IRanges/man/reverse-methods.Rd0000644000175100017510000000357314626176651017326 0ustar00biocbuildbiocbuild\name{reverse} \alias{reverse} \alias{reverse,character-method} \alias{reverse,IRanges-method} \alias{reverse,NormalIRanges-method} \alias{reverse,Views-method} \alias{reverse,MaskCollection-method} \title{reverse} \description{ A generic function for reversing vector-like or list-like objects. This man page describes methods for reversing a character vector, a \link{Views} object, or a \link{MaskCollection} object. Note that \code{reverse} is similar to but not the same as \code{\link[base]{rev}}. } \usage{ reverse(x, ...) } \arguments{ \item{x}{ A vector-like or list-like object. } \item{...}{ Additional arguments to be passed to or from methods. } } \details{ On a character vector or a \link{Views} object, \code{reverse} reverses each element individually, without modifying the top-level order of the elements. More precisely, each individual string of a character vector is reversed. } \value{ An object of the same class and length as the original object. } \seealso{ \itemize{ \item \code{\link[base]{rev}} in base R. \item \code{\link{revElements}} in the \pkg{S4Vectors} package to reverse all or some of the list elements of a list-like object. \item \link{Views} objects. \item \link{MaskCollection} objects. \item \link[XVector]{reverse-methods} in the \pkg{XVector} package. } } \examples{ ## On a character vector: reverse(c("Hi!", "How are you?")) rev(c("Hi!", "How are you?")) ## On a Views object: v <- successiveViews(Rle(c(-0.5, 12.3, 4.88), 4:2), 1:4) v reverse(v) rev(v) ## On a MaskCollection object: mask1 <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2)) mask2 <- Mask(mask.width=29, start=c(3, 10, 27), width=c(5, 8, 1)) mask3 <- Mask(mask.width=29, start=c(7, 12), width=c(2, 4)) mymasks <- append(append(mask1, mask2), mask3) reverse(mymasks) } \keyword{methods} \keyword{manip} IRanges/man/seqapply.Rd0000644000175100017510000000236014626176651016041 0ustar00biocbuildbiocbuild\name{seqapply} \alias{unsplit,List-method} \alias{split<-,Vector-method} \title{ 2 methods that should be documented somewhere else } \description{ \code{unsplit} method for \link{List} object and \code{split<-} method for \link{Vector} object. } \usage{ \S4method{unsplit}{List}(value, f, drop = FALSE) \S4method{split}{Vector}(x, f, drop = FALSE, ...) <- value } \arguments{ \item{value}{ The \link{List} object to unsplit. } \item{f}{ A \code{factor} or \code{list} of factors } \item{drop}{ Whether to drop empty elements from the returned list } \item{x}{ Like \code{X} } \item{\dots}{ Extra arguments to pass to \code{FUN} } } \details{ \code{unsplit} unlists \code{value}, where the order of the returned vector is as if \code{value} were originally created by splitting that vector on the factor \code{f}. \code{split(x, f, drop = FALSE) <- value}: Virtually splits \code{x} by the factor \code{f}, replaces the elements of the resulting list with the elements from the list \code{value}, and restores \code{x} to its original form. Note that this works for any \code{Vector}, even though \code{split} itself is not universally supported. } \author{ Michael Lawrence } \keyword{manip} IRanges/man/setops-methods.Rd0000644000175100017510000001652514626176651017171 0ustar00biocbuildbiocbuild\name{setops-methods} \alias{setops-methods} \alias{union} \alias{union,IntegerRanges,IntegerRanges-method} \alias{union,IntegerRangesList,IntegerRangesList-method} \alias{union,CompressedIRangesList,CompressedIRangesList-method} \alias{union,Pairs,missing-method} \alias{intersect} \alias{intersect,IntegerRanges,IntegerRanges-method} \alias{intersect,IntegerRangesList,IntegerRangesList-method} \alias{intersect,CompressedIRangesList,CompressedIRangesList-method} \alias{intersect,Pairs,missing-method} \alias{setdiff} \alias{setdiff,IntegerRanges,IntegerRanges-method} \alias{setdiff,IntegerRangesList,IntegerRangesList-method} \alias{setdiff,CompressedIRangesList,CompressedIRangesList-method} \alias{setdiff,Pairs,missing-method} \alias{punion} \alias{punion,IntegerRanges,IntegerRanges-method} \alias{punion,Pairs,missing-method} \alias{pintersect} \alias{pintersect,IntegerRanges,IntegerRanges-method} \alias{pintersect,Pairs,missing-method} \alias{psetdiff} \alias{psetdiff,IntegerRanges,IntegerRanges-method} \alias{psetdiff,Pairs,missing-method} \alias{pgap} \alias{pgap,IntegerRanges,IntegerRanges-method} \title{Set operations on IntegerRanges and IntegerRangesList objects} \description{ Performs set operations on \link{IntegerRanges} and \link{IntegerRangesList} objects. } \usage{ ## Vector-wise set operations ## -------------------------- \S4method{union}{IntegerRanges,IntegerRanges}(x, y) \S4method{union}{Pairs,missing}(x, y, ...) \S4method{intersect}{IntegerRanges,IntegerRanges}(x, y) \S4method{intersect}{Pairs,missing}(x, y, ...) \S4method{setdiff}{IntegerRanges,IntegerRanges}(x, y) \S4method{setdiff}{Pairs,missing}(x, y, ...) ## Element-wise (aka "parallel") set operations ## -------------------------------------------- \S4method{punion}{IntegerRanges,IntegerRanges}(x, y, fill.gap=FALSE) \S4method{punion}{Pairs,missing}(x, y, ...) \S4method{pintersect}{IntegerRanges,IntegerRanges}(x, y, resolve.empty=c("none", "max.start", "start.x")) \S4method{pintersect}{Pairs,missing}(x, y, ...) \S4method{psetdiff}{IntegerRanges,IntegerRanges}(x, y) \S4method{psetdiff}{Pairs,missing}(x, y, ...) \S4method{pgap}{IntegerRanges,IntegerRanges}(x, y) } \arguments{ \item{x, y}{ Objects representing ranges. } \item{fill.gap}{ Logical indicating whether or not to force a union by using the rule \code{start = min(start(x), start(y)), end = max(end(x), end(y))}. } \item{resolve.empty}{ One of \code{"none"}, \code{"max.start"}, or \code{"start.x"} denoting how to handle ambiguous empty ranges formed by intersections. \code{"none"} - throw an error if an ambiguous empty range is formed, \code{"max.start"} - associate the maximum start value with any ambiguous empty range, and \code{"start.x"} - associate the start value of \code{x} with any ambiguous empty range. (See Details section below for the definition of an ambiguous range.) } \item{...}{ The methods for \link[S4Vectors]{Pairs} objects pass any extra argument to the internal call to \code{punion(first(x), last(x), ...)}, \code{pintersect(first(x), last(x), ...)}, etc... } } \details{ The \code{union}, \code{intersect} and \code{setdiff} methods for \link{IntegerRanges} objects return a "normal" \link{IntegerRanges} object representing the union, intersection and (asymmetric!) difference of the sets of integers represented by \code{x} and \code{y}. \code{punion}, \code{pintersect}, \code{psetdiff} and \code{pgap} are generic functions that compute the element-wise (aka "parallel") union, intersection, (asymmetric!) difference and gap between each element in \code{x} and its corresponding element in \code{y}. Methods for \link{IntegerRanges} objects are defined. For these methods, \code{x} and \code{y} must have the same length (i.e. same number of ranges). They return a \link{IntegerRanges} object \emph{parallel} to \code{x} and \code{y} i.e. where the i-th range corresponds to the i-th range in \code{x} and in\code{y}) and represents the union/intersection/difference/gap of/between the corresponding \code{x[i]} and \code{y[i]}. If \code{x} is a \code{\link[S4Vectors]{Pairs}} object, then \code{y} should be missing, and the operation is performed between the members of each pair. By default, \code{pintersect} will throw an error when an "ambiguous empty range" is formed. An ambiguous empty range can occur three different ways: 1) when corresponding non-empty ranges elements \code{x} and \code{y} have an empty intersection, 2) if the position of an empty range element does not fall within the corresponding limits of a non-empty range element, or 3) if two corresponding empty range elements do not have the same position. For example if empty range element [22,21] is intersected with non-empty range element [1,10], an error will be produced; but if it is intersected with the range [22,28], it will produce [22,21]. As mentioned in the Arguments section above, this behavior can be changed using the \code{resolve.empty} argument. } \value{ On \link{IntegerRanges} objects, \code{union}, \code{intersect}, and \code{setdiff} return an \link{IRanges} \emph{instance} that is guaranteed to be \emph{normal} (see \code{\link{isNormal}}) but is NOT promoted to \link{NormalIRanges}. On \link{IntegerRanges} objects, \code{punion}, \code{pintersect}, \code{psetdiff}, and \code{pgap} return an object of the same class and length as their first argument. } \author{H. Pagès and M. Lawrence} \seealso{ \itemize{ \item \code{pintersect} is similar to \code{\link{narrow}}, except the end points are absolute, not relative. \code{pintersect} is also similar to \code{\link{restrict}}, except ranges outside of the restriction become empty and are not discarded. \item \link[GenomicRanges]{setops-methods} in the \pkg{GenomicRanges} package for set operations on genomic ranges. \item \link{findOverlaps-methods} for finding/counting overlapping ranges. \item \link{intra-range-methods} and \link{inter-range-methods} for \emph{intra range} and \emph{inter range} transformations. \item \link{IntegerRanges} and \link{IntegerRangesList} objects. In particular, \emph{normality} of an \link{IntegerRanges} object is discussed in the man page for \link{IntegerRanges} objects. \item \code{\link[S4Vectors]{mendoapply}} in the \pkg{S4Vectors} package. } } \examples{ x <- IRanges(c(1, 5, -2, 0, 14), c(10, 9, 3, 11, 17)) subject <- Rle(1:-3, 6:2) y <- Views(subject, start=c(14, 0, -5, 6, 18), end=c(20, 2, 2, 8, 20)) ## Vector-wise operations: union(x, ranges(y)) union(ranges(y), x) intersect(x, ranges(y)) intersect(ranges(y), x) setdiff(x, ranges(y)) setdiff(ranges(y), x) ## Element-wise (aka "parallel") operations: try(punion(x, ranges(y))) punion(x[3:5], ranges(y)[3:5]) punion(x, ranges(y), fill.gap=TRUE) try(pintersect(x, ranges(y))) pintersect(x[3:4], ranges(y)[3:4]) pintersect(x, ranges(y), resolve.empty="max.start") psetdiff(ranges(y), x) try(psetdiff(x, ranges(y))) start(x)[4] <- -99 end(y)[4] <- 99 psetdiff(x, ranges(y)) pgap(x, ranges(y)) ## On IntegerRangesList objects: irl1 <- IRangesList(a=IRanges(c(1,2),c(4,3)), b=IRanges(c(4,6),c(10,7))) irl2 <- IRangesList(c=IRanges(c(0,2),c(4,5)), a=IRanges(c(4,5),c(6,7))) union(irl1, irl2) intersect(irl1, irl2) setdiff(irl1, irl2) } \keyword{utilities} IRanges/man/slice-methods.Rd0000644000175100017510000000464214626176651016750 0ustar00biocbuildbiocbuild\name{slice-methods} \alias{slice-methods} \alias{slice} \alias{slice,ANY-method} \alias{slice,Rle-method} \alias{slice,RleList-method} \title{Slice a vector-like or list-like object} \description{ \code{slice} is a generic function that creates views on a vector-like or list-like object that contain the elements that are within the specified bounds. } \usage{ slice(x, lower=-Inf, upper=Inf, ...) \S4method{slice}{Rle}(x, lower=-Inf, upper=Inf, includeLower=TRUE, includeUpper=TRUE, rangesOnly=FALSE) \S4method{slice}{RleList}(x, lower=-Inf, upper=Inf, includeLower=TRUE, includeUpper=TRUE, rangesOnly=FALSE) } \arguments{ \item{x}{ An \link{Rle} or \link{RleList} object, or any object coercible to an Rle object. } \item{lower, upper}{ The lower and upper bounds for the slice. } \item{includeLower, includeUpper}{ Logical indicating whether or not the specified boundary is open or closed. } \item{rangesOnly}{ A logical indicating whether or not to drop the original data from the output. } \item{...}{ Additional arguments to be passed to specific methods. } } \details{ \code{slice} is useful for finding areas of absolute maxima (peaks), absolute minima (troughs), or fluctuations within specified limits. One or more view summarization methods can be used on the result of \code{slice}. See \code{?`link{view-summarization-methods}`} } \value{ The method for \link{Rle} objects returns an \link{RleViews} object if \code{rangesOnly=FALSE} or an \link{IRanges} object if \code{rangesOnly=TRUE}. The method for \link{RleList} objects returns an \link{RleViewsList} object if \code{rangesOnly=FALSE} or an \link{IRangesList} object if \code{rangesOnly=TRUE}. } \author{P. Aboyoun} \seealso{ \itemize{ \item \link{view-summarization-methods} for summarizing the views returned by \code{slice}. \item \link[XVector]{slice-methods} in the \pkg{XVector} package for more \code{slice} methods. \item \code{\link{coverage}} for computing the coverage across a set of ranges. \item The \link{Rle}, \link{RleList}, \link{RleViews}, and \link{RleViewsList} classes. } } \examples{ ## Views derived from coverage x <- IRanges(start=c(1L, 9L, 4L, 1L, 5L, 10L), width=c(5L, 6L, 3L, 4L, 3L, 3L)) cvg <- coverage(x) slice(cvg, lower=2) slice(cvg, lower=2, rangesOnly=TRUE) } \keyword{methods} IRanges/man/view-summarization-methods.Rd0000644000175100017510000001067014626176651021521 0ustar00biocbuildbiocbuild\name{view-summarization-methods} \alias{view-summarization-methods} \alias{viewApply} \alias{viewApply,Views-method} \alias{viewApply,RleViews-method} \alias{viewApply,RleViewsList-method} \alias{viewMins} \alias{viewMins,RleViews-method} \alias{viewMins,RleViewsList-method} \alias{viewMaxs} \alias{viewMaxs,RleViews-method} \alias{viewMaxs,RleViewsList-method} \alias{viewSums} \alias{viewSums,RleViews-method} \alias{viewSums,RleViewsList-method} \alias{viewMeans} \alias{viewMeans,RleViews-method} \alias{viewMeans,RleViewsList-method} \alias{viewWhichMins} \alias{viewWhichMins,RleViews-method} \alias{viewWhichMins,RleViewsList-method} \alias{viewWhichMaxs} \alias{viewWhichMaxs,RleViews-method} \alias{viewWhichMaxs,RleViewsList-method} \alias{viewRangeMins} \alias{viewRangeMins,RleViews-method} \alias{viewRangeMins,RleViewsList-method} \alias{viewRangeMaxs} \alias{viewRangeMaxs,RleViews-method} \alias{viewRangeMaxs,RleViewsList-method} \alias{Summary,Views-method} \alias{mean,Views-method} \alias{max,Views-method} \alias{min,Views-method} \alias{sum,Views-method} \alias{which.min,Views-method} \alias{which.max,Views-method} \title{Summarize views on a vector-like object with numeric values} \description{ \code{viewApply} applies a function on each view of a \link{Views} or \link{ViewsList} object. \code{viewMins}, \code{viewMaxs}, \code{viewSums}, \code{viewMeans} calculate respectively the minima, maxima, sums, and means of the views in a \link{Views} or \link{ViewsList} object. } \usage{ viewApply(X, FUN, ..., simplify = TRUE) viewMins(x, na.rm=FALSE) \S4method{min}{Views}(x, ..., na.rm = FALSE) viewMaxs(x, na.rm=FALSE) \S4method{max}{Views}(x, ..., na.rm = FALSE) viewSums(x, na.rm=FALSE) \S4method{sum}{Views}(x, ..., na.rm = FALSE) viewMeans(x, na.rm=FALSE) \S4method{mean}{Views}(x, ...) viewWhichMins(x, na.rm=FALSE) \S4method{which.min}{Views}(x) viewWhichMaxs(x, na.rm=FALSE) \S4method{which.max}{Views}(x) viewRangeMins(x, na.rm=FALSE) viewRangeMaxs(x, na.rm=FALSE) } \arguments{ \item{X}{ A Views object. } \item{FUN}{ The function to be applied to each view in \code{X}. } \item{...}{ Additional arguments to be passed on. } \item{simplify}{ A logical value specifying whether or not the result should be simplified to a vector or matrix if possible. } \item{x}{ An \link{RleViews} or \link{RleViewsList} object. } \item{na.rm}{ Logical indicating whether or not to include missing values in the results. } } \details{ The \code{viewMins}, \code{viewMaxs}, \code{viewSums}, and \code{viewMeans} functions provide efficient methods for calculating the specified numeric summary by performing the looping in compiled code. The \code{viewWhichMins}, \code{viewWhichMaxs}, \code{viewRangeMins}, and \code{viewRangeMaxs} functions provide efficient methods for finding the locations of the minima and maxima. } \value{ For all the functions in this man page (except \code{viewRangeMins} and \code{viewRangeMaxs}): A numeric vector of the length of \code{x} if \code{x} is an \link{RleViews} object, or a \link{List} object of the length of \code{x} if it's an \link{RleViewsList} object. For \code{viewRangeMins} and \code{viewRangeMaxs}: An \link{IRanges} object if \code{x} is an \link{RleViews} object, or an \link{IRangesList} object if it's an \link{RleViewsList} object. } \note{ For convenience, methods for \code{min}, \code{max}, \code{sum}, \code{mean}, \code{which.min} and \code{which.max} are provided as wrappers around the corresponding \code{view*} functions (which might be deprecated at some point). } \author{P. Aboyoun} \seealso{ \itemize{ \item The \code{\link{slice}} function for slicing an \link{Rle} or \link{RleList} object. \item \link[XVector]{view-summarization-methods} in the \pkg{XVector} package for more view summarization methods. \item The \link{RleViews} and \link{RleViewsList} classes. \item The \code{\link{which.min}} and \code{\link{colSums}} functions. } } \examples{ ## Views derived from coverage x <- IRanges(start=c(1L, 9L, 4L, 1L, 5L, 10L), width=c(5L, 6L, 3L, 4L, 3L, 3L)) cvg <- coverage(x) cvg_views <- slice(cvg, lower=2) viewApply(cvg_views, diff) viewMins(cvg_views) viewMaxs(cvg_views) viewSums(cvg_views) viewMeans(cvg_views) viewWhichMins(cvg_views) viewWhichMaxs(cvg_views) viewRangeMins(cvg_views) viewRangeMaxs(cvg_views) } \keyword{methods} \keyword{arith} IRanges/src/0000755000175100017510000000000014641351314013713 5ustar00biocbuildbiocbuildIRanges/src/CompressedAtomicList_utils.c0000644000175100017510000002412514626176651021414 0ustar00biocbuildbiocbuild/**************************************************************************** * Utilities for CompressedAtomicList objects * ****************************************************************************/ #include "IRanges.h" #define R_INT_MIN (1+INT_MIN) #define PARTITIONED_AGG(C_TYPE, ACCESSOR, ANS_TYPE, ANS_ACCESSOR, \ NA_CHECK, INIT, UPDATE, EXTRA_INIT) \ { \ SEXP unlistData = _get_CompressedList_unlistData(x); \ SEXP partitioning = _get_CompressedList_partitioning(x); \ SEXP ends = _get_PartitioningByEnd_end(partitioning); \ Rboolean _na_rm = asLogical(na_rm); \ int prev_end = 0; \ SEXP ans = allocVector(ANS_TYPE, length(ends)); \ for (int i = 0; i < length(ends); i++) { \ int end = INTEGER(ends)[i]; \ C_TYPE summary = INIT; \ EXTRA_INIT; \ for (int j = prev_end; j < end; j++) { \ C_TYPE val = ACCESSOR(unlistData)[j]; \ if (NA_CHECK) { \ if (_na_rm) { \ continue; \ } else { \ summary = NA_ ## ANS_ACCESSOR; \ break; \ } \ } \ UPDATE; \ } \ ANS_ACCESSOR(ans)[i] = summary; \ prev_end = end; \ } \ SET_NAMES(ans, _get_CompressedList_names(x)); \ return ans; \ } #define PARTITIONED_SUM(C_TYPE, ACCESSOR, ANS_TYPE, ANS_ACCESSOR, NA_CHECK) \ { \ PARTITIONED_AGG(C_TYPE, ACCESSOR, ANS_TYPE, ANS_ACCESSOR, \ NA_CHECK, 0, summary += val, ); \ } #define PARTITIONED_PROD(ACCESSOR, NA_CHECK) \ { \ PARTITIONED_AGG(double, ACCESSOR, REALSXP, REAL, \ NA_CHECK, 1, summary *= val, ); \ } /* --- .Call ENTRY POINT --- */ SEXP C_sum_CompressedLogicalList(SEXP x, SEXP na_rm) { PARTITIONED_SUM(Rboolean, LOGICAL, INTSXP, INTEGER, val == NA_LOGICAL); } /* --- .Call ENTRY POINT --- */ SEXP C_sum_CompressedIntegerList(SEXP x, SEXP na_rm) { PARTITIONED_SUM(int, INTEGER, INTSXP, INTEGER, val == NA_INTEGER); } /* --- .Call ENTRY POINT --- */ SEXP C_sum_CompressedNumericList(SEXP x, SEXP na_rm) { PARTITIONED_SUM(double, REAL, REALSXP, REAL, ISNA(val)); } /* --- .Call ENTRY POINT --- */ SEXP C_prod_CompressedLogicalList(SEXP x, SEXP na_rm) { PARTITIONED_PROD(LOGICAL, val == NA_LOGICAL); } /* --- .Call ENTRY POINT --- */ SEXP C_prod_CompressedIntegerList(SEXP x, SEXP na_rm) { PARTITIONED_PROD(INTEGER, val == NA_INTEGER); } /* --- .Call ENTRY POINT --- */ SEXP C_prod_CompressedNumericList(SEXP x, SEXP na_rm) { PARTITIONED_PROD(REAL, ISNA(val)); } #define PARTITIONED_EX(C_TYPE, ACCESSOR, ANS_TYPE, NA_CHECK, INIT, RELOP) \ { \ PARTITIONED_AGG(C_TYPE, ACCESSOR, ANS_TYPE, ACCESSOR, \ NA_CHECK, INIT, \ if (val RELOP summary) summary = val, ); \ } #define PARTITIONED_MIN(C_TYPE, ACCESSOR, ANS_TYPE, NA_CHECK, INIT) \ { \ PARTITIONED_EX(C_TYPE, ACCESSOR, ANS_TYPE, NA_CHECK, INIT, <); \ } #define PARTITIONED_MAX(C_TYPE, ACCESSOR, ANS_TYPE, NA_CHECK, INIT) \ { \ PARTITIONED_EX(C_TYPE, ACCESSOR, ANS_TYPE, NA_CHECK, INIT, >); \ } /* --- .Call ENTRY POINT --- */ SEXP C_min_CompressedLogicalList(SEXP x, SEXP na_rm) { PARTITIONED_MIN(Rboolean, LOGICAL, LGLSXP, val == NA_LOGICAL, TRUE); } /* --- .Call ENTRY POINT --- */ SEXP C_min_CompressedIntegerList(SEXP x, SEXP na_rm) { PARTITIONED_MIN(int, INTEGER, INTSXP, val == NA_INTEGER, INT_MAX); } /* --- .Call ENTRY POINT --- */ SEXP C_min_CompressedNumericList(SEXP x, SEXP na_rm) { PARTITIONED_MIN(double, REAL, REALSXP, ISNA(val), R_PosInf); } /* --- .Call ENTRY POINT --- */ SEXP C_max_CompressedLogicalList(SEXP x, SEXP na_rm) { PARTITIONED_MAX(Rboolean, LOGICAL, LGLSXP, val == NA_LOGICAL, TRUE); } /* --- .Call ENTRY POINT --- */ SEXP C_max_CompressedIntegerList(SEXP x, SEXP na_rm) { PARTITIONED_MAX(int, INTEGER, INTSXP, val == NA_INTEGER, R_INT_MIN); } /* --- .Call ENTRY POINT --- */ SEXP C_max_CompressedNumericList(SEXP x, SEXP na_rm) { PARTITIONED_MAX(double, REAL, REALSXP, ISNA(val), R_NegInf); } #define PARTITIONED_WHICH_AGG(C_TYPE, ACCESSOR, NA_CHECK, INIT, RELOP) \ { \ SEXP na_rm = ScalarLogical(TRUE); \ PARTITIONED_AGG(C_TYPE, ACCESSOR, INTSXP, INTEGER, \ NA_CHECK, NA_INTEGER, \ if (val RELOP summary_val) \ (summary_val = val, summary = j - prev_end + 1), \ C_TYPE summary_val = INIT) \ } #define PARTITIONED_WHICH_MIN(C_TYPE, ACCESSOR, NA_CHECK, INIT) \ { \ PARTITIONED_WHICH_AGG(C_TYPE, ACCESSOR, NA_CHECK, INIT, <) \ } #define PARTITIONED_WHICH_MAX(C_TYPE, ACCESSOR, NA_CHECK, INIT) \ { \ PARTITIONED_WHICH_AGG(C_TYPE, ACCESSOR, NA_CHECK, INIT, >) \ } /* --- .Call ENTRY POINT --- */ SEXP C_which_min_CompressedLogicalList(SEXP x) { PARTITIONED_WHICH_MIN(Rboolean, LOGICAL, val == NA_LOGICAL, TRUE); } /* --- .Call ENTRY POINT --- */ SEXP C_which_min_CompressedIntegerList(SEXP x) { PARTITIONED_WHICH_MIN(int, INTEGER, val == NA_INTEGER, INT_MAX); } /* --- .Call ENTRY POINT --- */ SEXP C_which_min_CompressedNumericList(SEXP x) { PARTITIONED_WHICH_MIN(double, REAL, ISNA(val), R_PosInf); } /* --- .Call ENTRY POINT --- */ SEXP C_which_max_CompressedLogicalList(SEXP x) { PARTITIONED_WHICH_MAX(Rboolean, LOGICAL, val == NA_LOGICAL, TRUE); } /* --- .Call ENTRY POINT --- */ SEXP C_which_max_CompressedIntegerList(SEXP x) { PARTITIONED_WHICH_MAX(int, INTEGER, val == NA_INTEGER, R_INT_MIN); } /* --- .Call ENTRY POINT --- */ SEXP C_which_max_CompressedNumericList(SEXP x) { PARTITIONED_WHICH_MAX(double, REAL, ISNA(val), R_NegInf); } #define PARTITIONED_BREAK(C_TYPE, ACCESSOR, NA_CHECK, BREAK_CHECK, OFFSET) \ { \ SEXP unlistData = _get_CompressedList_unlistData(x); \ SEXP partition = _get_CompressedList_partitioning(x); \ SEXP ends = _get_PartitioningByEnd_end(partition); \ Rboolean _na_rm = asLogical(na_rm); \ int prev_end = 0; \ SEXP ans = allocVector(LGLSXP, length(ends)); \ for (int i = 0; i < length(ends); i++) { \ int end = INTEGER(ends)[i]; \ Rboolean summary = FALSE; \ for (int j = prev_end + OFFSET; j < end; j++) { \ C_TYPE val = ACCESSOR(unlistData)[j]; \ if (NA_CHECK) { \ if (_na_rm) { \ continue; \ } else { \ summary = NA_LOGICAL; \ break; \ } \ } \ if (BREAK_CHECK) { \ summary = TRUE; \ break; \ } \ } \ LOGICAL(ans)[i] = summary; \ prev_end = end; \ } \ SET_NAMES(ans, _get_CompressedList_names(x)); \ return ans; \ } #define PARTITIONED_IS_UNSORTED(C_TYPE, ACCESSOR, NA_CHECK) \ { \ if (asLogical(strictly)) { \ PARTITIONED_BREAK(C_TYPE, ACCESSOR, NA_CHECK, \ val <= ACCESSOR(unlistData)[j-1], 1); \ } else { \ PARTITIONED_BREAK(C_TYPE, ACCESSOR, NA_CHECK, \ val < ACCESSOR(unlistData)[j-1], 1); \ } \ } /* --- .Call ENTRY POINT --- */ SEXP C_is_unsorted_CompressedLogicalList(SEXP x, SEXP na_rm, SEXP strictly) { PARTITIONED_IS_UNSORTED(Rboolean, LOGICAL, val == NA_LOGICAL); } /* --- .Call ENTRY POINT --- */ SEXP C_is_unsorted_CompressedIntegerList(SEXP x, SEXP na_rm, SEXP strictly) { PARTITIONED_IS_UNSORTED(int, INTEGER, val == NA_INTEGER); } /* --- .Call ENTRY POINT --- */ SEXP C_is_unsorted_CompressedNumericList(SEXP x, SEXP na_rm, SEXP strictly) { PARTITIONED_IS_UNSORTED(double, REAL, ISNA(val)); } IRanges/src/CompressedIRangesList_class.c0000644000175100017510000001310414626176651021470 0ustar00biocbuildbiocbuild/**************************************************************************** * Low-level manipulation of CompressedIRangesList objects * ****************************************************************************/ #include "IRanges.h" #include "S4Vectors_interface.h" #include #define R_INT_MIN (1+INT_MIN) /**************************************************************************** * C-level abstract getters. */ CompressedIRangesList_holder _hold_CompressedIRangesList(SEXP x) { CompressedIRangesList_holder x_holder; SEXP x_end; x_holder.classname = get_classname(x); x_end = _get_PartitioningByEnd_end( _get_CompressedList_partitioning(x)); x_holder.length = LENGTH(x_end); x_holder.end = INTEGER(x_end); x_holder.unlistData_holder = _hold_IRanges( _get_CompressedList_unlistData(x)); return x_holder; } int _get_length_from_CompressedIRangesList_holder( const CompressedIRangesList_holder *x_holder) { return x_holder->length; } IRanges_holder _get_elt_from_CompressedIRangesList_holder( const CompressedIRangesList_holder *x_holder, int i) { int offset, length; offset = i == 0 ? 0 : x_holder->end[i - 1]; length = x_holder->end[i] - offset; return _get_linear_subset_from_IRanges_holder( &(x_holder->unlistData_holder), offset, length); } int _get_eltNROWS_from_CompressedIRangesList_holder( const CompressedIRangesList_holder *x_holder, int i) { /* IRanges_holder ir_holder; ir_holder = _get_elt_from_CompressedIRangesList_holder(x_holder, i); return _get_length_from_IRanges_holder(&ir_holder); */ int offset; offset = i == 0 ? 0 : x_holder->end[i - 1]; return x_holder->end[i] - offset; /* faster than the above */ } /**************************************************************************** * CompressedIRangesList methods. */ /* --- .Call ENTRY POINT --- */ SEXP C_isNormal_CompressedIRangesList(SEXP x, SEXP use_names) { SEXP ans, ans_names; CompressedIRangesList_holder x_holder; IRanges_holder ir_holder; int x_len, i; x_holder = _hold_CompressedIRangesList(x); x_len = _get_length_from_CompressedIRangesList_holder(&x_holder); PROTECT(ans = NEW_LOGICAL(x_len)); for (i = 0; i < x_len; i++) { ir_holder = _get_elt_from_CompressedIRangesList_holder( &x_holder, i); LOGICAL(ans)[i] = _is_normal_IRanges_holder(&ir_holder); } if (LOGICAL(use_names)[0]) { PROTECT(ans_names = duplicate(_get_CompressedList_names(x))); SET_NAMES(ans, ans_names); UNPROTECT(1); } UNPROTECT(1); return ans; } /* --- .Call ENTRY POINT --- */ SEXP C_summary_CompressedIRangesList(SEXP object) { int ans_len; SEXP part_end; SEXP ans, ans_names, col_names; part_end = _get_PartitioningByEnd_end( _get_CompressedList_partitioning(object)); ans_len = LENGTH(part_end); PROTECT(ans = allocMatrix(INTSXP, ans_len, 2)); memset(INTEGER(ans), 0, 2 * ans_len * sizeof(int)); if (ans_len > 0) { int i, j, prev_end = 0; int *ans1_elt, *ans2_elt; const int *part_end_elt, *ranges_width; SEXP unlistData = _get_CompressedList_unlistData(object); ranges_width = INTEGER(_get_IRanges_width(unlistData)); for (i = 0, ans1_elt = INTEGER(ans), ans2_elt = (INTEGER(ans) + ans_len), part_end_elt = INTEGER(part_end); i < ans_len; i++, ans1_elt++, ans2_elt++, part_end_elt++) { *ans1_elt = *part_end_elt - prev_end; for (j = 0; j < *ans1_elt; j++) { *ans2_elt += *ranges_width; ranges_width++; } prev_end = *part_end_elt; } } PROTECT(ans_names = NEW_LIST(2)); PROTECT(col_names = NEW_CHARACTER(2)); SET_STRING_ELT(col_names, 0, mkChar("Length")); SET_STRING_ELT(col_names, 1, mkChar("WidthSum")); SET_VECTOR_ELT(ans_names, 0, duplicate(_get_CompressedList_names(object))); SET_VECTOR_ELT(ans_names, 1, col_names); SET_DIMNAMES(ans, ans_names); UNPROTECT(3); return ans; } /**************************************************************************** * CompressedNormalIRangesList methods. */ /* --- .Call ENTRY POINT --- */ SEXP C_min_CompressedNormalIRangesList(SEXP x, SEXP use_names) { SEXP ans, ans_names; CompressedIRangesList_holder x_holder; IRanges_holder ir_holder; int x_len, ir_len, i; int *ans_elt; x_holder = _hold_CompressedIRangesList(x); x_len = _get_length_from_CompressedIRangesList_holder(&x_holder); PROTECT(ans = NEW_INTEGER(x_len)); for (i = 0, ans_elt = INTEGER(ans); i < x_len; i++, ans_elt++) { ir_holder = _get_elt_from_CompressedIRangesList_holder(&x_holder, i); ir_len = _get_length_from_IRanges_holder(&ir_holder); if (ir_len == 0) { *ans_elt = INT_MAX; } else { *ans_elt = _get_start_elt_from_IRanges_holder(&ir_holder, 0); } } if (LOGICAL(use_names)[0]) { PROTECT(ans_names = duplicate(_get_CompressedList_names(x))); SET_NAMES(ans, ans_names); UNPROTECT(1); } UNPROTECT(1); return ans; } /* --- .Call ENTRY POINT --- */ SEXP C_max_CompressedNormalIRangesList(SEXP x, SEXP use_names) { SEXP ans, ans_names; CompressedIRangesList_holder x_holder; IRanges_holder ir_holder; int x_len, ir_len, i; int *ans_elt; x_holder = _hold_CompressedIRangesList(x); x_len = _get_length_from_CompressedIRangesList_holder(&x_holder); PROTECT(ans = NEW_INTEGER(x_len)); for (i = 0, ans_elt = INTEGER(ans); i < x_len; i++, ans_elt++) { ir_holder = _get_elt_from_CompressedIRangesList_holder(&x_holder, i); ir_len = _get_length_from_IRanges_holder(&ir_holder); if (ir_len == 0) { *ans_elt = R_INT_MIN; } else { *ans_elt = _get_end_elt_from_IRanges_holder(&ir_holder, ir_len - 1); } } if (LOGICAL(use_names)[0]) { PROTECT(ans_names = duplicate(_get_CompressedList_names(x))); SET_NAMES(ans, ans_names); UNPROTECT(1); } UNPROTECT(1); return ans; } IRanges/src/CompressedList_class.c0000644000175100017510000000625614626176651020231 0ustar00biocbuildbiocbuild/**************************************************************************** * Low-level manipulation of CompressedList objects * ****************************************************************************/ #include "IRanges.h" /**************************************************************************** * C-level slot getters. * * Be careful that these functions do NOT duplicate the returned slot. * Thus they cannot be made .Call entry points! */ static SEXP unlistData_symbol = NULL, partitioning_symbol = NULL; SEXP _get_CompressedList_unlistData(SEXP x) { INIT_STATIC_SYMBOL(unlistData) return GET_SLOT(x, unlistData_symbol); } SEXP _get_CompressedList_partitioning(SEXP x) { INIT_STATIC_SYMBOL(partitioning) return GET_SLOT(x, partitioning_symbol); } /* Not strict "slot getters" but very much like. */ int _get_CompressedList_length(SEXP x) { return LENGTH(_get_PartitioningByEnd_end( _get_CompressedList_partitioning(x))); } SEXP _get_CompressedList_names(SEXP x) { return _get_Partitioning_names( _get_CompressedList_partitioning(x)); } /**************************************************************************** * C-level slot setters. * * Be careful that these functions do NOT duplicate the assigned value! */ static void set_CompressedList_unlistData(SEXP x, SEXP value) { INIT_STATIC_SYMBOL(unlistData) SET_SLOT(x, unlistData_symbol, value); return; } static void set_CompressedList_partitioning(SEXP x, SEXP value) { INIT_STATIC_SYMBOL(partitioning) SET_SLOT(x, partitioning_symbol, value); return; } /**************************************************************************** * C-level constructor. */ /* Be careful that this constructor does NOT duplicate its arguments before putting them in the slots of the returned object. So don't try to make it a .Call entry point! */ SEXP _new_CompressedList(const char *classname, SEXP unlistData, SEXP partitioning) { SEXP classdef, ans; PROTECT(classdef = MAKE_CLASS(classname)); PROTECT(ans = NEW_OBJECT(classdef)); set_CompressedList_unlistData(ans, unlistData); set_CompressedList_partitioning(ans, partitioning); UNPROTECT(2); return ans; } /**************************************************************************** * C-level abstract getters for CompressedIntegerList objects. */ CompressedIntsList_holder _hold_CompressedIntegerList(SEXP x) { SEXP partitioning_end; CompressedIntsList_holder x_holder; partitioning_end = _get_PartitioningByEnd_end( _get_CompressedList_partitioning(x)); x_holder.length = LENGTH(partitioning_end); x_holder.breakpoints = INTEGER(partitioning_end); x_holder.unlisted = INTEGER(_get_CompressedList_unlistData(x)); return x_holder; } int _get_length_from_CompressedIntsList_holder( const CompressedIntsList_holder *x_holder) { return x_holder->length; } Ints_holder _get_elt_from_CompressedIntsList_holder( const CompressedIntsList_holder *x_holder, int i) { Ints_holder x_elt_holder; int offset; if (i == 0) { offset = 0; } else { offset = x_holder->breakpoints[i - 1]; } x_elt_holder.ptr = x_holder->unlisted + offset; x_elt_holder.length = x_holder->breakpoints[i] - offset; return x_elt_holder; } IRanges/src/Grouping_class.c0000644000175100017510000001040714626176651017054 0ustar00biocbuildbiocbuild/**************************************************************************** * Low-level manipulation of Grouping objects * * Author: H. Pag\`es * ****************************************************************************/ #include "IRanges.h" #include "S4Vectors_interface.h" /**************************************************************************** * C-level slot getters. * * Be careful that these functions do NOT duplicate the returned slot. * Thus they cannot be made .Call entry points! */ static SEXP high2low_symbol = NULL, low2high_symbol = NULL, end_symbol = NULL, NAMES_symbol = NULL; SEXP _get_H2LGrouping_high2low(SEXP x) { INIT_STATIC_SYMBOL(high2low) return GET_SLOT(x, high2low_symbol); } SEXP _get_H2LGrouping_low2high(SEXP x) { INIT_STATIC_SYMBOL(low2high) return GET_SLOT(x, low2high_symbol); } SEXP _get_Partitioning_names(SEXP x) { INIT_STATIC_SYMBOL(NAMES) return GET_SLOT(x, NAMES_symbol); } SEXP _get_PartitioningByEnd_end(SEXP x) { INIT_STATIC_SYMBOL(end) return GET_SLOT(x, end_symbol); } /**************************************************************************** * C-level slot setters. * * Be careful that these functions do NOT duplicate the assigned value! */ static void set_Partitioning_names(SEXP x, SEXP value) { INIT_STATIC_SYMBOL(NAMES) SET_SLOT(x, NAMES_symbol, value); return; } static void set_PartitioningByEnd_end(SEXP x, SEXP value) { INIT_STATIC_SYMBOL(end) SET_SLOT(x, end_symbol, value); return; } /**************************************************************************** * C-level constructor. */ /* Be careful that this constructor does NOT duplicate its arguments before putting them in the slots of the returned object. So don't try to make it a .Call entry point! */ SEXP _new_PartitioningByEnd(const char *classname, SEXP end, SEXP names) { SEXP classdef, ans; PROTECT(classdef = MAKE_CLASS(classname)); PROTECT(ans = NEW_OBJECT(classdef)); set_PartitioningByEnd_end(ans, end); if (names == NULL) names = R_NilValue; set_Partitioning_names(ans, names); UNPROTECT(2); return ans; } /**************************************************************************** * --- .Call ENTRY POINTS --- * ****************************************************************************/ SEXP C_members_H2LGrouping(SEXP x, SEXP group_ids) { SEXP ans, x_high2low, x_low2high, x_low2high_elt; int x_length, nids, ans_length, i, j, group_id, *ans_elt; if (TYPEOF(group_ids) != INTSXP) error("the group ids must be integers"); x_high2low = _get_H2LGrouping_high2low(x); x_low2high = _get_H2LGrouping_low2high(x); x_length = LENGTH(x_low2high); /* same as LENGTH(x_high2low) */ nids = LENGTH(group_ids); /* 1st pass: determine 'ans_length' */ ans_length = 0; for (j = 0; j < nids; j++) { group_id = INTEGER(group_ids)[j]; if (group_id == NA_INTEGER) error("some group ids are NAs"); i = group_id - 1; if (i < 0 || i >= x_length) error("subscript out of bounds"); if (INTEGER(x_high2low)[i] != NA_INTEGER) continue; ans_length++; x_low2high_elt = VECTOR_ELT(x_low2high, i); if (x_low2high_elt == R_NilValue) continue; ans_length += LENGTH(x_low2high_elt); } PROTECT(ans = NEW_INTEGER(ans_length)); /* 2nd pass: fill 'ans' */ ans_elt = INTEGER(ans); for (j = 0; j < nids; j++) { group_id = INTEGER(group_ids)[j]; i = group_id - 1; if (INTEGER(x_high2low)[i] != NA_INTEGER) continue; *(ans_elt++) = i + 1; x_low2high_elt = VECTOR_ELT(x_low2high, i); if (x_low2high_elt == R_NilValue) continue; memcpy(ans_elt, INTEGER(x_low2high_elt), sizeof(int) * LENGTH(x_low2high_elt)); ans_elt += LENGTH(x_low2high_elt); } sort_int_array(INTEGER(ans), ans_length, 0); UNPROTECT(1); return ans; } SEXP C_vmembers_H2LGrouping(SEXP x, SEXP group_ids_list) { SEXP ans, group_ids; int ans_length, i; ans_length = LENGTH(group_ids_list); PROTECT(ans = NEW_LIST(ans_length)); for (i = 0; i < ans_length; i++) { group_ids = VECTOR_ELT(group_ids_list, i); if (TYPEOF(group_ids) != INTSXP) error("'L' must be a list of integer vectors"); SET_VECTOR_ELT(ans, i, C_members_H2LGrouping(x, group_ids)); } UNPROTECT(1); return ans; } IRanges/src/IPosRanges_comparison.c0000644000175100017510000001552714626176651020351 0ustar00biocbuildbiocbuild/**************************************************************************** * Range-wise comparison of 2 IPosRanges derivatives * * Author: H. Pag\`es * ****************************************************************************/ #include "IRanges.h" #include "S4Vectors_interface.h" /**************************************************************************** * Generalized comparison of 2 integer ranges. * * There are 13 different ways 2 integer ranges x and y can be positioned * with respect to each other. They are summarized in the following table * together with the codes we assign them: * * numeric code & | numeric code & * 1-letter code & | 1-letter code & * long code | long code * --------------- --------------- | --------------- --------------- * x: .oooo....... -6 'a' "x y" | x: .......oooo. 6 'm' "y x" * y: .......oooo. | y: .oooo....... * --------------- --------------- | --------------- --------------- * x: ..oooo...... -5 'b' "xy" | x: ......oooo.. 5 'l' "yx" * y: ......oooo.. | y: ..oooo...... * --------------- --------------- | --------------- --------------- * x: ...oooo..... -4 'c' "x=y" | x: .....oooo... 4 'k' "y=x" * y: .....oooo... | y: ...oooo..... * --------------- --------------- | --------------- --------------- * x: ...oooooo... -3 'd' "x=" | x: .....oooo... 3 'j' "y=" * y: .....oooo... | y: ...oooooo... * --------------- --------------- | --------------- --------------- * x: ..oooooooo.. -2 'e' "x=x" | x: ....oooo.... 2 'i' "y=y" * y: ....oooo.... | y: ..oooooooo.. * --------------- --------------- | --------------- --------------- * x: ...oooo..... -1 'f' "=y" | x: ...oooooo... 1 'h' "=x" * y: ...oooooo... | y: ...oooo..... * --------------- --------------------------------- --------------- * \ x: ...oooooo... 0 'g' "=" / * \ y: ...oooooo... / * \---------------------------------/ * Notes: * o This way of comparing ranges is a refinement over the standard ranges * comparison defined by the ==, !=, <=, >=, < and > operators. In * particular a numeric code that is < 0, = 0, or > 0 corresponds to * x < y, x == y, or x > y, respectively. * o In this file we use the term "overlap" in a loose way even when there * is actually no overlap between ranges x and y. Real overlaps correspond * to numeric codes >= -4 and <= 4, and to long codes that contain an * equal ("="). * o Long codes are designed to be user-friendly whereas numeric and * 1-letter codes are designed to be more compact and memory efficient. * Typically the formers will be exposed to the end user and translated * internally into the latters. * o Swapping x and y changes the sign of the corresponding numeric code and * substitutes "x" by "y" and "y" by "x" in the corresponding long code. * o Reflecting ranges x and y relative to an arbitrary position (i.e. doing * a symetry with respect to a vertical axis) has the effect of reversing * the associated long code e.g. "x=y" becomes "y=x". The effect on the * numeric code is implemented by the _invert_overlap_code() function. * * 'x_start', 'x_width', 'y_start' and 'y_width' are assumed to be non NA (not * checked). 'x_start' and 'y_start' must be 1-based. 'x_width' and 'y_width' * are assumed to be >= 0 (not checked). */ int _overlap_code(int x_start, int x_width, int y_start, int y_width) { int x_end_plus1, y_end_plus1; x_end_plus1 = x_start + x_width; if (x_end_plus1 < y_start) return -6; if (x_end_plus1 == y_start) { if (x_width == 0 && y_width == 0) return 0; return -5; } y_end_plus1 = y_start + y_width; if (y_end_plus1 < x_start) return 6; if (y_end_plus1 == x_start) return 5; if (x_start < y_start) { if (x_end_plus1 < y_end_plus1) return -4; if (x_end_plus1 == y_end_plus1) return -3; return -2; } if (x_start == y_start) { if (x_end_plus1 < y_end_plus1) return -1; if (x_end_plus1 == y_end_plus1) return 0; return 1; } if (x_end_plus1 < y_end_plus1) return 2; if (x_end_plus1 == y_end_plus1) return 3; return 4; } int _invert_overlap_code(int code) { if (code == -2 || code == 0 || code == 2) return code; if (code <= -4 || code >= 4) return - code; /* Only possible values left: -3, -1, 1, 3 */ return code < 0 ? code + 4 : code - 4; } /* Vectorized comparison of 2 vectors of ranges. */ static void pcompare_ranges( const int *x_start, const int *x_width, int x_len, const int *y_start, const int *y_width, int y_len, int *out, int out_len, int with_warning) { int i, j, k; for (i = j = k = 0; k < out_len; i++, j++, k++) { if (i >= x_len) i = 0; /* recycle i */ if (j >= y_len) j = 0; /* recycle j */ out[k] = _overlap_code(x_start[i], x_width[i], y_start[j], y_width[j]); } /* This warning message is meaningful only when 'out_len' is 'max(x_len, y_len)' and is consistent with the warning we get from binary arithmetic/comparison operations on numeric vectors. */ if (with_warning && out_len != 0 && (i != x_len || j != y_len)) warning("longer object length is not a multiple " "of shorter object length"); return; } /* --- .Call ENTRY POINT --- * 'x_start' and 'x_width': integer vectors of the same length M. * 'y_start' and 'y_width': integer vectors of the same length N. * The 4 integer vectors are assumed to be NA free and 'x_width' and * 'y_width' are assumed to contain non-negative values. For efficiency * reasons, those assumptions are not checked. * If M != N then the shorter object is recycled to the length of the longer * object, except if M or N is 0 in which case the object with length != 0 is * truncated to length 0. */ SEXP C_pcompare_IPosRanges(SEXP x_start, SEXP x_width, SEXP y_start, SEXP y_width) { int x_len, y_len, ans_len; const int *x_start_p, *x_width_p, *y_start_p, *y_width_p; SEXP ans; x_len = check_integer_pairs(x_start, x_width, &x_start_p, &x_width_p, "start(x)", "width(x)"); y_len = check_integer_pairs(y_start, y_width, &y_start_p, &y_width_p, "start(y)", "width(y)"); if (x_len == 0 || y_len == 0) ans_len = 0; else ans_len = x_len >= y_len ? x_len : y_len; PROTECT(ans = NEW_INTEGER(ans_len)); pcompare_ranges(x_start_p, x_width_p, x_len, y_start_p, y_width_p, y_len, INTEGER(ans), ans_len, 1); UNPROTECT(1); return ans; } IRanges/src/IRanges.h0000644000175100017510000001613714626176651015440 0ustar00biocbuildbiocbuild#include "../inst/include/IRanges_defines.h" #include #define INIT_STATIC_SYMBOL(NAME) \ { \ if (NAME ## _symbol == NULL) \ NAME ## _symbol = install(# NAME); \ } /* Ranges_class.c */ SEXP C_validate_Ranges( SEXP x_start, SEXP x_end, SEXP x_width ); /* IPosRanges_comparison.c */ int _overlap_code( int x_start, int x_width, int y_start, int y_width ); int _invert_overlap_code( int code ); SEXP C_pcompare_IPosRanges( SEXP x_start, SEXP x_width, SEXP y_start, SEXP y_width ); /* IRanges_class.c */ SEXP _get_IRanges_start(SEXP x); SEXP _get_IRanges_width(SEXP x); SEXP _get_IRanges_names(SEXP x); int _get_IRanges_length(SEXP x); IRanges_holder _hold_IRanges(SEXP x); int _get_length_from_IRanges_holder(const IRanges_holder *x_holder); int _get_width_elt_from_IRanges_holder( const IRanges_holder *x_holder, int i ); int _get_start_elt_from_IRanges_holder( const IRanges_holder *x_holder, int i ); int _get_end_elt_from_IRanges_holder( const IRanges_holder *x_holder, int i ); SEXP _get_names_elt_from_IRanges_holder( const IRanges_holder *x_holder, int i ); IRanges_holder _get_linear_subset_from_IRanges_holder( const IRanges_holder *x_holder, int offset, int length ); void _set_IRanges_names( SEXP x, SEXP names ); void _copy_IRanges_slots( SEXP x, SEXP x0 ); SEXP _new_IRanges( const char *classname, SEXP start, SEXP width, SEXP names ); SEXP _new_IRanges_from_IntPairAE( const char *classname, const IntPairAE *intpair_ae ); SEXP _new_list_of_IRanges_from_IntPairAEAE( const char *element_type, const IntPairAEAE *intpair_aeae ); SEXP _alloc_IRanges( const char *classname, int length ); int _is_normal_IRanges_holder(const IRanges_holder *x_holder); SEXP C_isNormal_IRanges(SEXP x); SEXP C_from_integer_to_IRanges(SEXP x); SEXP C_from_logical_to_NormalIRanges(SEXP x); /* IRanges_constructor.c */ SEXP C_solve_start_end_width( SEXP start, SEXP end, SEXP width ); SEXP C_solve_user_SEW( SEXP refwidths, SEXP start, SEXP end, SEXP width, SEXP translate_negative_coord, SEXP allow_nonnarrowing ); /* Grouping_class.c */ SEXP _get_H2LGrouping_high2low(SEXP x); SEXP _get_H2LGrouping_low2high(SEXP x); SEXP _get_Partitioning_names(SEXP x); SEXP _get_PartitioningByEnd_end(SEXP x); SEXP _new_PartitioningByEnd( const char *classname, SEXP end, SEXP names ); SEXP C_members_H2LGrouping( SEXP x, SEXP group_ids ); SEXP C_vmembers_H2LGrouping( SEXP x, SEXP group_ids_list ); /* RleViews_utils.c */ SEXP C_viewMins_RleViews( SEXP x, SEXP na_rm ); SEXP C_viewMaxs_RleViews( SEXP x, SEXP na_rm ); SEXP C_viewSums_RleViews( SEXP x, SEXP na_rm ); SEXP C_viewMeans_RleViews( SEXP x, SEXP na_rm ); SEXP C_viewWhichMins_RleViews( SEXP x, SEXP na_rm ); SEXP C_viewWhichMaxs_RleViews( SEXP x, SEXP na_rm ); /* SimpleIRangesList_class.c */ SEXP C_isNormal_SimpleIRangesList(SEXP x, SEXP use_names); SEXP C_min_SimpleNormalIRangesList(SEXP x); SEXP C_max_SimpleNormalIRangesList(SEXP x); /* CompressedList_class.c */ SEXP _get_CompressedList_unlistData(SEXP x); SEXP _get_CompressedList_partitioning(SEXP x); int _get_CompressedList_length(SEXP x); SEXP _get_CompressedList_names(SEXP x); SEXP _new_CompressedList( const char *classname, SEXP unlistData, SEXP partitioning ); CompressedIntsList_holder _hold_CompressedIntegerList( SEXP x ); int _get_length_from_CompressedIntsList_holder( const CompressedIntsList_holder *x_holder ); Ints_holder _get_elt_from_CompressedIntsList_holder( const CompressedIntsList_holder *x_holder, int i ); /* CompressedIRangesList_class.c */ CompressedIRangesList_holder _hold_CompressedIRangesList(SEXP x); int _get_length_from_CompressedIRangesList_holder( const CompressedIRangesList_holder *x_holder ); IRanges_holder _get_elt_from_CompressedIRangesList_holder( const CompressedIRangesList_holder *x_holder, int i ); int _get_eltNROWS_from_CompressedIRangesList_holder( const CompressedIRangesList_holder *x_holder, int i ); SEXP C_isNormal_CompressedIRangesList( SEXP x, SEXP use_names ); SEXP C_summary_CompressedIRangesList( SEXP object ); SEXP C_min_CompressedNormalIRangesList( SEXP x, SEXP use_names ); SEXP C_max_CompressedNormalIRangesList( SEXP x, SEXP use_names ); /* inter_range_methods.c */ SEXP C_range_IRanges(SEXP x); SEXP C_reduce_IntegerRanges( SEXP x_start, SEXP x_width, SEXP drop_empty_ranges, SEXP min_gapwidth, SEXP with_revmap, SEXP with_inframe_start ); SEXP C_reduce_CompressedIRangesList( SEXP x, SEXP drop_empty_ranges, SEXP min_gapwidth, SEXP with_revmap ); SEXP C_gaps_IntegerRanges( SEXP x_start, SEXP x_width, SEXP start, SEXP end ); SEXP C_gaps_CompressedIRangesList( SEXP x, SEXP start, SEXP end ); SEXP C_disjointBins_IntegerRanges( SEXP x_start, SEXP x_width ); /* coverage_methods.c */ SEXP C_coverage_IRanges( SEXP x, SEXP shift, SEXP width, SEXP weight, SEXP circle_len, SEXP method ); SEXP C_coverage_CompressedIRangesList( SEXP x, SEXP shift, SEXP width, SEXP weight, SEXP circle_lens, SEXP method ); /* NCList.c */ SEXP C_new_NCList(); SEXP C_free_NCList(SEXP nclist_xp); SEXP C_build_NCList( SEXP nclist_xp, SEXP x_start, SEXP x_end, SEXP x_subset ); SEXP C_new_NCListAsINTSXP_from_NCList(SEXP nclist_xp); SEXP C_print_NCListAsINTSXP( SEXP x_nclist, SEXP x_start, SEXP x_end ); SEXP C_find_overlaps_NCList( SEXP q_start, SEXP q_end, SEXP s_start, SEXP s_end, SEXP nclist, SEXP nclist_is_q, SEXP maxgap, SEXP minoverlap, SEXP type, SEXP select, SEXP circle_length ); SEXP C_find_overlaps_in_groups_NCList( SEXP q_start, SEXP q_end, SEXP q_space, SEXP q_groups, SEXP s_start, SEXP s_end, SEXP s_space, SEXP s_groups, SEXP nclists, SEXP nclist_is_q, SEXP maxgap, SEXP minoverlap, SEXP type, SEXP select, SEXP circle_length ); /* CompressedAtomicList_utils.c */ SEXP C_sum_CompressedLogicalList( SEXP x, SEXP na_rm ); SEXP C_sum_CompressedIntegerList( SEXP x, SEXP na_rm ); SEXP C_sum_CompressedNumericList( SEXP x, SEXP na_rm ); SEXP C_prod_CompressedLogicalList( SEXP x, SEXP na_rm ); SEXP C_prod_CompressedIntegerList( SEXP x, SEXP na_rm ); SEXP C_prod_CompressedNumericList( SEXP x, SEXP na_rm ); SEXP C_min_CompressedLogicalList( SEXP x, SEXP na_rm ); SEXP C_min_CompressedIntegerList( SEXP x, SEXP na_rm ); SEXP C_min_CompressedNumericList( SEXP x, SEXP na_rm ); SEXP C_max_CompressedLogicalList( SEXP x, SEXP na_rm ); SEXP C_max_CompressedIntegerList( SEXP x, SEXP na_rm ); SEXP C_max_CompressedNumericList( SEXP x, SEXP na_rm ); SEXP C_which_min_CompressedLogicalList(SEXP x); SEXP C_which_min_CompressedIntegerList(SEXP x); SEXP C_which_min_CompressedNumericList(SEXP x); SEXP C_which_max_CompressedLogicalList(SEXP x); SEXP C_which_max_CompressedIntegerList(SEXP x); SEXP C_which_max_CompressedNumericList(SEXP x); SEXP C_is_unsorted_CompressedLogicalList( SEXP x, SEXP na_rm, SEXP strictly ); SEXP C_is_unsorted_CompressedIntegerList( SEXP x, SEXP na_rm, SEXP strictly ); SEXP C_is_unsorted_CompressedNumericList( SEXP x, SEXP na_rm, SEXP strictly ); /* extractListFragments.c */ SEXP C_find_partition_overlaps( SEXP q_end, SEXP s_end, SEXP with_split_partitions ); IRanges/src/IRanges_class.c0000644000175100017510000002375414626176651016623 0ustar00biocbuildbiocbuild/**************************************************************************** * Low-level manipulation of IRanges objects * * Author: H. Pag\`es * ****************************************************************************/ #include "IRanges.h" #include "S4Vectors_interface.h" /**************************************************************************** * C-level slot getters. * * Be careful that these functions do NOT duplicate the returned slot. * Thus they cannot be made .Call entry points! */ static SEXP start_symbol = NULL, width_symbol = NULL, NAMES_symbol = NULL; SEXP _get_IRanges_start(SEXP x) { INIT_STATIC_SYMBOL(start) return GET_SLOT(x, start_symbol); } SEXP _get_IRanges_width(SEXP x) { INIT_STATIC_SYMBOL(width) return GET_SLOT(x, width_symbol); } SEXP _get_IRanges_names(SEXP x) { INIT_STATIC_SYMBOL(NAMES) return GET_SLOT(x, NAMES_symbol); } /* Not a strict "slot getter" but very much like. */ int _get_IRanges_length(SEXP x) { return LENGTH(_get_IRanges_start(x)); } /**************************************************************************** * C-level abstract getters. */ IRanges_holder _hold_IRanges(SEXP x) { IRanges_holder x_holder; x_holder.classname = get_classname(x); x_holder.is_constant_width = 0; x_holder.length = _get_IRanges_length(x); x_holder.width = INTEGER(_get_IRanges_width(x)); x_holder.start = INTEGER(_get_IRanges_start(x)); x_holder.end = NULL; x_holder.SEXP_offset = 0; x_holder.names = _get_IRanges_names(x); return x_holder; } int _get_length_from_IRanges_holder(const IRanges_holder *x_holder) { return x_holder->length; } int _get_width_elt_from_IRanges_holder(const IRanges_holder *x_holder, int i) { return x_holder->is_constant_width ? x_holder->width[0] : x_holder->width[i]; } int _get_start_elt_from_IRanges_holder(const IRanges_holder *x_holder, int i) { if (x_holder->start) return x_holder->start[i]; return x_holder->end[i] - _get_width_elt_from_IRanges_holder(x_holder, i) + 1; } int _get_end_elt_from_IRanges_holder(const IRanges_holder *x_holder, int i) { if (x_holder->end) return x_holder->end[i]; return x_holder->start[i] + _get_width_elt_from_IRanges_holder(x_holder, i) - 1; } SEXP _get_names_elt_from_IRanges_holder(const IRanges_holder *x_holder, int i) { return STRING_ELT(x_holder->names, x_holder->SEXP_offset + i); } IRanges_holder _get_linear_subset_from_IRanges_holder( const IRanges_holder *x_holder, int offset, int length) { IRanges_holder y_holder; y_holder = *x_holder; y_holder.length = length; y_holder.start += offset; if (!y_holder.is_constant_width) y_holder.width += offset; y_holder.SEXP_offset += offset; return y_holder; } /**************************************************************************** * C-level slot setters. * * Be careful that these functions do NOT duplicate the assigned value! */ static void set_IRanges_start(SEXP x, SEXP value) { INIT_STATIC_SYMBOL(start) SET_SLOT(x, start_symbol, value); return; } static void set_IRanges_width(SEXP x, SEXP value) { INIT_STATIC_SYMBOL(width) SET_SLOT(x, width_symbol, value); /* Rprintf("set_IRanges_width(): value=%p _get_IRanges_width(x)=%p\n", value, _get_IRanges_width(x)); */ return; } static void set_IRanges_names(SEXP x, SEXP value) { INIT_STATIC_SYMBOL(NAMES) SET_SLOT(x, NAMES_symbol, value); return; } /* WARNING: Use only AFTER 'x@start' has been set! Because this setter is trying to figure out what the length of 'x' is. */ void _set_IRanges_names(SEXP x, SEXP names) { if (names == NULL) names = R_NilValue; else if (names != R_NilValue && LENGTH(names) != _get_IRanges_length(x)) error("_set_IRanges_names(): " "number of names and number of elements differ"); set_IRanges_names(x, names); return; } /* Note that 'start' and 'width' must NOT contain NAs. set_IRanges_slots() trusts the caller and does NOT check this! */ static void set_IRanges_slots(SEXP x, SEXP start, SEXP width, SEXP names) { if (LENGTH(width) != LENGTH(start)) error("set_IRanges_slots(): " "number of starts and number of widths differ"); set_IRanges_start(x, start); set_IRanges_width(x, width); _set_IRanges_names(x, names); return; } void _copy_IRanges_slots(SEXP x, SEXP x0) { SEXP slot; PROTECT(slot = duplicate(_get_IRanges_start(x0))); set_IRanges_start(x, slot); UNPROTECT(1); PROTECT(slot = duplicate(_get_IRanges_width(x0))); set_IRanges_width(x, slot); UNPROTECT(1); PROTECT(slot = duplicate(_get_IRanges_names(x0))); set_IRanges_names(x, slot); UNPROTECT(1); return; } /**************************************************************************** * C-level constructors. */ /* Be careful that this constructor does NOT duplicate its arguments before putting them in the slots of the returned object. So don't try to make it a .Call entry point! */ SEXP _new_IRanges(const char *classname, SEXP start, SEXP width, SEXP names) { SEXP classdef, ans; PROTECT(classdef = MAKE_CLASS(classname)); PROTECT(ans = NEW_OBJECT(classdef)); set_IRanges_slots(ans, start, width, names); UNPROTECT(2); return ans; } SEXP _new_IRanges_from_IntPairAE(const char *classname, const IntPairAE *intpair_ae) { SEXP ans, start, width; PROTECT(start = new_INTEGER_from_IntAE(intpair_ae->a)); PROTECT(width = new_INTEGER_from_IntAE(intpair_ae->b)); PROTECT(ans = _new_IRanges(classname, start, width, R_NilValue)); UNPROTECT(3); return ans; } /* TODO: Try to make this faster by making only 1 call to _new_IRanges() (or _alloc_IRanges()) and cloning and modifying this initial object inside the for loop. */ SEXP _new_list_of_IRanges_from_IntPairAEAE(const char *element_type, const IntPairAEAE *intpair_aeae) { SEXP ans, ans_elt; int nelt, i; const IntPairAE *ae; nelt = IntPairAEAE_get_nelt(intpair_aeae); PROTECT(ans = NEW_LIST(nelt)); for (i = 0; i < nelt; i++) { ae = intpair_aeae->elts[i]; PROTECT(ans_elt = _new_IRanges_from_IntPairAE(element_type, ae)); SET_VECTOR_ELT(ans, i, ans_elt); UNPROTECT(1); } UNPROTECT(1); return ans; } /* Allocation WITHOUT initialization. The 'start' and 'width' slots are not initialized (they contain junk). */ SEXP _alloc_IRanges(const char *classname, int length) { SEXP start, width, ans; PROTECT(start = NEW_INTEGER(length)); PROTECT(width = NEW_INTEGER(length)); PROTECT(ans = _new_IRanges(classname, start, width, R_NilValue)); UNPROTECT(3); return ans; } /**************************************************************************** * Validity functions. */ int _is_normal_IRanges_holder(const IRanges_holder *x_holder) { int x_len, i; x_len = _get_length_from_IRanges_holder(x_holder); if (x_len == 0) return 1; if (_get_width_elt_from_IRanges_holder(x_holder, 0) <= 0) return 0; for (i = 1; i < x_len; i++) { if (_get_width_elt_from_IRanges_holder(x_holder, i) <= 0) return 0; if (_get_start_elt_from_IRanges_holder(x_holder, i) <= _get_end_elt_from_IRanges_holder(x_holder, i - 1) + 1) return 0; } return 1; } /* --- .Call ENTRY POINT --- */ SEXP C_isNormal_IRanges(SEXP x) { IRanges_holder ir_holder; ir_holder = _hold_IRanges(x); return ScalarLogical(_is_normal_IRanges_holder(&ir_holder)); } /**************************************************************************** * Coercion functions. */ /* --- .Call ENTRY POINT --- */ SEXP C_from_integer_to_IRanges(SEXP x) { SEXP ans, ans_start, ans_width; int i, x_length, ans_length; int *start_buf, *width_buf; int *x_elt, *start_elt, *width_elt, prev_elt_plus1; x_length = LENGTH(x); if (x_length == 0) { PROTECT(ans_start = NEW_INTEGER(0)); PROTECT(ans_width = NEW_INTEGER(0)); } else { ans_length = 1; start_buf = (int *) R_alloc((long) x_length, sizeof(int)); width_buf = (int *) R_alloc((long) x_length, sizeof(int)); start_buf[0] = INTEGER(x)[0]; width_buf[0] = 1; prev_elt_plus1 = start_buf[0] + 1; start_elt = start_buf; width_elt = width_buf; for (i = 1, x_elt = (INTEGER(x)+1); i < x_length; i++, x_elt++) { if (*x_elt == NA_INTEGER) error("cannot create an IRanges object from an integer vector with missing values"); if (*x_elt == prev_elt_plus1) { *width_elt += 1; } else { ans_length++; start_elt++; width_elt++; *start_elt = *x_elt; *width_elt = 1; prev_elt_plus1 = *x_elt; } prev_elt_plus1++; } PROTECT(ans_start = NEW_INTEGER(ans_length)); PROTECT(ans_width = NEW_INTEGER(ans_length)); memcpy(INTEGER(ans_start), start_buf, sizeof(int) * ans_length); memcpy(INTEGER(ans_width), width_buf, sizeof(int) * ans_length); } PROTECT(ans = _new_IRanges("IRanges", ans_start, ans_width, R_NilValue)); UNPROTECT(3); return ans; } /* --- .Call ENTRY POINT --- */ SEXP C_from_logical_to_NormalIRanges(SEXP x) { SEXP ans, ans_start, ans_width; int i, x_length, buf_length, ans_length; int *start_buf, *width_buf; int *x_elt, *start_elt, *width_elt, prev_elt; x_length = LENGTH(x); if (x_length == 0) { PROTECT(ans_start = NEW_INTEGER(0)); PROTECT(ans_width = NEW_INTEGER(0)); } else { buf_length = x_length / 2 + 1; ans_length = 0; start_buf = (int *) R_alloc((long) buf_length, sizeof(int)); width_buf = (int *) R_alloc((long) buf_length, sizeof(int)); prev_elt = 0; start_elt = start_buf - 1; width_elt = width_buf - 1; for (i = 1, x_elt = LOGICAL(x); i <= x_length; i++, x_elt++) { if (*x_elt == NA_LOGICAL) error("cannot create an IRanges object from a logical vector with missing values"); if (*x_elt == 1) { if (prev_elt) { *width_elt += 1; } else { ans_length++; start_elt++; width_elt++; *start_elt = i; *width_elt = 1; } } prev_elt = *x_elt; } PROTECT(ans_start = NEW_INTEGER(ans_length)); PROTECT(ans_width = NEW_INTEGER(ans_length)); memcpy(INTEGER(ans_start), start_buf, sizeof(int) * ans_length); memcpy(INTEGER(ans_width), width_buf, sizeof(int) * ans_length); } PROTECT(ans = _new_IRanges("NormalIRanges", ans_start, ans_width, R_NilValue)); UNPROTECT(3); return ans; } IRanges/src/IRanges_constructor.c0000644000175100017510000002347214626176651020100 0ustar00biocbuildbiocbuild/**************************************************************************** * Support functions for the IRanges constructor * ****************************************************************************/ #include "IRanges.h" #define R_INT_MIN (INT_MIN + 1) #define R_INT_MAX INT_MAX static char errmsg_buf[200]; /**************************************************************************** * C_solve_start_end_width() */ /* Return -1 if the range specified by 'start', 'end', and 'width' is invalid, and 0 otherwise. */ static int solve_range(int start, int end, int width, int *solved_start, int *solved_width) { long long int tmp; static const char *too_many_NAs = "at least two out of 'start', " "'end', and 'width', must\n be " "supplied"; *solved_start = start; *solved_width = width; if (width == NA_INTEGER) { if (start == NA_INTEGER || end == NA_INTEGER) { snprintf(errmsg_buf, sizeof(errmsg_buf), "%s", too_many_NAs); return -1; } /* Compute and check 'width'. */ tmp = (long long int) end - start + 1; if (tmp < 0) { snprintf(errmsg_buf, sizeof(errmsg_buf), "'end' must be >= 'start' - 1"); return -1; } if (tmp > R_INT_MAX) { snprintf(errmsg_buf, sizeof(errmsg_buf), "the 'width' (%lld) inferred from the " "supplied 'start'\n and 'end' is too big " "(>= 2^31)", tmp); return -1; } *solved_width = (int) tmp; return 0; } if (width < 0) { snprintf(errmsg_buf, sizeof(errmsg_buf), "negative widths are not allowed"); return -1; } if (start == NA_INTEGER) { if (end == NA_INTEGER) { snprintf(errmsg_buf, sizeof(errmsg_buf), "%s", too_many_NAs); return -1; } /* Compute and check 'start'. */ tmp = (long long int) end - width + 1; if (tmp < R_INT_MIN || tmp > R_INT_MAX) { snprintf(errmsg_buf, sizeof(errmsg_buf), "the 'start' (%lld) inferred from the " "supplied 'end'\n and 'width' is beyond " "the limits of what is currently supported " "(must\n be > -2^31 and < 2^31 for now)", tmp); return -1; } *solved_start = (int) tmp; return 0; } if (end == NA_INTEGER) { if (start == NA_INTEGER) { snprintf(errmsg_buf, sizeof(errmsg_buf), "%s", too_many_NAs); return -1; } /* Compute and check 'end'. */ tmp = (long long int) start + width - 1; if (tmp < R_INT_MIN || tmp > R_INT_MAX) { snprintf(errmsg_buf, sizeof(errmsg_buf), "the 'end' (%lld) inferred from the " "supplied 'start'\n and 'width' is beyond " "the limits of what is currently supported " "(must\n be > -2^31 and < 2^31 for now)", tmp); return -1; } return 0; } tmp = (long long int) end - start + 1; if (width != tmp) { snprintf(errmsg_buf, sizeof(errmsg_buf), "the supplied 'width' (%d) doesn't match " "the width\n inferred from the supplied " "'start' and 'end' (%lld)", width, tmp); return -1; } return 0; } /* --- .Call ENTRY POINT --- 'start' and 'width' can be used **as-is** to construct the IRanges object to return if they satisfy at least both criteria: (a) They don't have a "dim" or "names" attribute on them. (b) They don't contain NAs. Note that this just reflects what validObject() expects to see in the "start" and "width" slots of an IRanges object. If they can't be used **as-is** then they need to be modified (i.e. the names need to be removed and/or the NAs in them need to be resolved). This requires duplicating them first. Of course they also must define valid ranges, that is, after resolution of the NAs, the width must be >= 0 and < 2^31, the start must be > -2^31 and < 2^31, and the implicit end must be > -2^31 and < 2^31. This is checked early and an error is raised on the first invalid range (see 1st pass below). */ SEXP C_solve_start_end_width(SEXP start, SEXP end, SEXP width) { int ans_len, use_start_as_is, use_width_as_is, i, solved_start, solved_width; const int *start_p, *end_p, *width_p; SEXP ans, ans_start, ans_width; if (!(IS_INTEGER(start) && IS_INTEGER(end) && IS_INTEGER(width))) error("the supplied 'start', 'end', and 'width', " "must be integer vectors"); ans_len = LENGTH(start); if (LENGTH(end) != ans_len || LENGTH(width) != ans_len) error("'start', 'end', and 'width' must have the same length"); use_start_as_is = GET_DIM(start) == R_NilValue && GET_NAMES(start) == R_NilValue; use_width_as_is = GET_DIM(width) == R_NilValue && GET_NAMES(width) == R_NilValue; /* 1st pass: Solve and check the supplied ranges and determine whether 'start' and/or 'width' can be used as-is or not. */ start_p = INTEGER(start); end_p = INTEGER(end); width_p = INTEGER(width); for (i = 0; i < ans_len; i++) { if (solve_range(*start_p, *end_p, *width_p, &solved_start, &solved_width) != 0) error("In range %d: %s.", i + 1, errmsg_buf); if (use_start_as_is && *start_p == NA_INTEGER) use_start_as_is = 0; if (use_width_as_is && *width_p == NA_INTEGER) use_width_as_is = 0; start_p++; end_p++; width_p++; } ans_start = start; ans_width = width; if (!(use_start_as_is && use_width_as_is)) { /* 2nd pass: Allocate and populate 'ans_start' and/or 'ans_width'. */ if (!use_start_as_is) PROTECT(ans_start = NEW_INTEGER(ans_len)); if (!use_width_as_is) PROTECT(ans_width = NEW_INTEGER(ans_len)); start_p = INTEGER(start); end_p = INTEGER(end); width_p = INTEGER(width); for (i = 0; i < ans_len; i++) { /* All ranges got validated during the 1st pass so we don't need to check the returned value again. */ solve_range(*start_p, *end_p, *width_p, &solved_start, &solved_width); if (!use_start_as_is) INTEGER(ans_start)[i] = solved_start; if (!use_width_as_is) INTEGER(ans_width)[i] = solved_width; start_p++; end_p++; width_p++; } } PROTECT(ans = _new_IRanges("IRanges", ans_start, ans_width, R_NilValue)); UNPROTECT(1 + !use_start_as_is + !use_width_as_is); return ans; } /**************************************************************************** * C_solve_user_SEW() */ static int translate_negative_coord0; static int nonnarrowing_is_OK; static int translate_negative_startorend(int refwidth, int startorend) { if (startorend < 0) startorend += refwidth + 1; return startorend; } static int check_start(int refwidth, int start, const char *what) { if (nonnarrowing_is_OK) return 0; if (start < 1) { snprintf(errmsg_buf, sizeof(errmsg_buf), "'allow.nonnarrowing' is FALSE and the %s start " "(%d) is < 1", what, start); return -1; } if (start > refwidth + 1) { snprintf(errmsg_buf, sizeof(errmsg_buf), "'allow.nonnarrowing' is FALSE and the %s start " "(%d) is > refwidth + 1", what, start); return -1; } return 0; } static int check_end(int refwidth, int end, const char *what) { if (nonnarrowing_is_OK) return 0; if (end < 0) { snprintf(errmsg_buf, sizeof(errmsg_buf), "'allow.nonnarrowing' is FALSE and the %s end " "(%d) is < 0", what, end); return -1; } if (end > refwidth) { snprintf(errmsg_buf, sizeof(errmsg_buf), "'allow.nonnarrowing' is FALSE and the %s end " "(%d) is > refwidth", what, end); return -1; } return 0; } static int solve_user_SEW_row(int refwidth, int start, int end, int width, int *solved_start, int *solved_width) { if (refwidth == NA_INTEGER || refwidth < 0) { snprintf(errmsg_buf, sizeof(errmsg_buf), "negative values or NAs are not allowed in 'refwidths'"); return -1; } if (start != NA_INTEGER) { if (translate_negative_coord0) start = translate_negative_startorend(refwidth, start); if (check_start(refwidth, start, "supplied") != 0) return -1; } if (end != NA_INTEGER) { if (translate_negative_coord0) end = translate_negative_startorend(refwidth, end); if (check_end(refwidth, end, "supplied") != 0) return -1; } if (width == NA_INTEGER) { if (start == NA_INTEGER) start = 1; if (end == NA_INTEGER) end = refwidth; width = end - start + 1; if (width < 0) { snprintf(errmsg_buf, sizeof(errmsg_buf), "the supplied start/end lead to a " "negative width"); return -1; } } else if (width < 0) { snprintf(errmsg_buf, sizeof(errmsg_buf), "negative values are not allowed in 'width'"); return -1; } else if ((start == NA_INTEGER) == (end == NA_INTEGER)) { snprintf(errmsg_buf, sizeof(errmsg_buf), "either the supplied start or the supplied end " "(but not both) must be NA when the supplied width " "is not NA"); return -1; } else { // Either 'start' or 'end' is NA if (start == NA_INTEGER) { start = end - width + 1; if (check_start(refwidth, start, "solved") != 0) return -1; } else { end = start + width - 1; if (check_end(refwidth, end, "solved") != 0) return -1; } } *solved_start = start; *solved_width = width; return 0; } /* * --- .Call ENTRY POINT --- */ SEXP C_solve_user_SEW(SEXP refwidths, SEXP start, SEXP end, SEXP width, SEXP translate_negative_coord, SEXP allow_nonnarrowing) { SEXP ans, ans_start, ans_width; int ans_len, i0, i1, i2, i3; translate_negative_coord0 = LOGICAL(translate_negative_coord)[0]; nonnarrowing_is_OK = LOGICAL(allow_nonnarrowing)[0]; ans_len = LENGTH(refwidths); PROTECT(ans_start = NEW_INTEGER(ans_len)); PROTECT(ans_width = NEW_INTEGER(ans_len)); for (i0 = i1 = i2 = i3 = 0; i0 < ans_len; i0++, i1++, i2++, i3++) { /* recycling */ if (i1 >= LENGTH(start)) i1 = 0; if (i2 >= LENGTH(end)) i2 = 0; if (i3 >= LENGTH(width)) i3 = 0; if (solve_user_SEW_row(INTEGER(refwidths)[i0], INTEGER(start)[i1], INTEGER(end)[i2], INTEGER(width)[i3], INTEGER(ans_start) + i0, INTEGER(ans_width) + i0) != 0) { UNPROTECT(2); error("solving row %d: %s", i0 + 1, errmsg_buf); } } PROTECT(ans = _new_IRanges("IRanges", ans_start, ans_width, R_NilValue)); UNPROTECT(3); return ans; } IRanges/src/NCList.c0000644000175100017510000013063314626176651015235 0ustar00biocbuildbiocbuild/**************************************************************************** * A Nested Containment List implementation * * Author: H. Pag\`es * ****************************************************************************/ #include "IRanges.h" #include "S4Vectors_interface.h" #include /* for malloc, realloc, free, abs */ #include /* for log10 */ /* #include static double cumulated_time = 0.0; static clock_t clock0; static void start_clock() { clock0 = clock(); } static void stop_clock() { cumulated_time += ((double) clock() - clock0) / CLOCKS_PER_SEC; } static void init_clock(const char *msg) { printf("%s", msg); cumulated_time = 0.0; clock0 = clock(); } static void print_elapsed_time() { stop_clock(); printf("%8.6f s\n", cumulated_time); } */ /**************************************************************************** * A simple wrapper to realloc() */ /* 'new_nmemb' must be > 'old_nmemb'. */ static void *realloc2(void *ptr, int new_nmemb, int old_nmemb, size_t size) { void *new_ptr; if (new_nmemb <= old_nmemb) error("IRanges internal error in realloc2(): " "'new_nmemb' <= 'old_nmemb'"); size *= new_nmemb; if (old_nmemb == 0) { new_ptr = malloc(size); } else { new_ptr = realloc(ptr, size); } if (new_ptr == NULL) error("IRanges internal error in realloc2(): " "memory (re)allocation failed"); return new_ptr; } static int get_new_maxdepth(int maxdepth) { return maxdepth == 0 ? 16384 : 4 * maxdepth; } /**************************************************************************** * NCList structure */ /* sizeof(NCList) is 24 bytes (0x18 bytes) */ typedef struct nclist_t { int buflength; /* >= 0 */ int nchildren; /* >= 0 and <= buflength */ struct nclist_t *childrenbuf; /* Of length 'buflength'. */ int *rgidbuf; /* Of length 'nchildren'. The IDs of the ranges asso- ciated with the children. The ID of a range is just its 0-based position in original IntegerRanges ob- ject 'x'. Allows reverse mapping of the children in- to 'x' (e.g. to find their start, end, or width). */ } NCList; static void init_NCList(NCList *nclist) { nclist->buflength = nclist->nchildren = 0; return; } /**************************************************************************** * Utilities to walk on an NCList structure non-recursively */ typedef struct NCList_walking_stack_elt_t { const NCList *parent_nclist; int n; /* point to n-th child of 'parent_nclist' */ } NCListWalkingStackElt; #define GET_NCLIST(stack_elt) \ ((stack_elt)->parent_nclist->childrenbuf + (stack_elt)->n) #define GET_RGID(stack_elt) \ ((stack_elt)->parent_nclist->rgidbuf[(stack_elt)->n]) static NCListWalkingStackElt *NCList_walking_stack = NULL; static int NCList_walking_stack_maxdepth = 0; static int NCList_walking_stack_depth = 0; #define RESET_NCLIST_WALKING_STACK() NCList_walking_stack_depth = 0 /* Must NOT be called when 'NCList_walking_stack_depth' is 0 (i.e. when stack in empty). */ static NCListWalkingStackElt *pop_NCListWalkingStackElt() { NCList_walking_stack_depth--; return NCList_walking_stack + NCList_walking_stack_depth; } /* Must NOT be called when 'NCList_walking_stack_depth' is 0 (i.e. when stack in empty). */ static NCListWalkingStackElt *peek_NCListWalkingStackElt() { return NCList_walking_stack + NCList_walking_stack_depth - 1; } static void extend_NCList_walking_stack() { int new_maxdepth; new_maxdepth = get_new_maxdepth(NCList_walking_stack_maxdepth); NCList_walking_stack = (NCListWalkingStackElt *) realloc2(NCList_walking_stack, new_maxdepth, NCList_walking_stack_maxdepth, sizeof(NCListWalkingStackElt)); NCList_walking_stack_maxdepth = new_maxdepth; return; } /* Return a pointer to n-th child. */ static const NCList *move_to_child(const NCList *parent_nclist, int n) { NCListWalkingStackElt *stack_elt; if (NCList_walking_stack_depth == NCList_walking_stack_maxdepth) extend_NCList_walking_stack(); stack_elt = NCList_walking_stack + NCList_walking_stack_depth++; stack_elt->parent_nclist = parent_nclist; stack_elt->n = n; return GET_NCLIST(stack_elt); } /* Must NOT be called when 'NCList_walking_stack_depth' is 0 (i.e. when stack in empty). */ static const NCList *move_to_right_sibling_or_uncle(const NCList *nclist) { NCListWalkingStackElt *stack_elt; stack_elt = NCList_walking_stack + NCList_walking_stack_depth; do { stack_elt--; if (++(stack_elt->n) < stack_elt->parent_nclist->nchildren) return ++nclist; nclist = stack_elt->parent_nclist; } while (--NCList_walking_stack_depth != 0); return NULL; } /* Must NOT be called when 'NCList_walking_stack_depth' is 0 (i.e. when stack in empty). */ static const NCList *move_to_right_uncle() { const NCList *parent_nclist; parent_nclist = pop_NCListWalkingStackElt()->parent_nclist; if (NCList_walking_stack_depth == 0) return NULL; return move_to_right_sibling_or_uncle(parent_nclist); } static const NCList *move_down(const NCList *nclist) { while (nclist->nchildren != 0) nclist = move_to_child(nclist, 0); return nclist; } /* Top-down walk: parent is treated before children and children are treated from left to right. For a top-down walk that visits the entire tree (i.e. "complete walk") do: RESET_NCLIST_WALKING_STACK(); for (nclist = top_nclist; nclist != NULL; nclist = next_top_down(nclist)) { treat nclist } */ static const NCList *next_top_down(const NCList *nclist) { /* Try to move to first child, if any. */ if (nclist->nchildren != 0) return move_to_child(nclist, 0); if (NCList_walking_stack_depth == 0) return NULL; return move_to_right_sibling_or_uncle(nclist); } /* Bottom-up walk: children are treated from left to right and before parent. For a bottom-up walk that visits the entire tree (i.e. "complete walk"), do: RESET_NCLIST_WALKING_STACK(); for (nclist = move_down(top_nclist); nclist != NULL; nclist = next_bottom_up()) { treat nclist } */ static const NCList *next_bottom_up() { NCListWalkingStackElt *stack_elt; const NCList *parent_nclist; if (NCList_walking_stack_depth == 0) return NULL; stack_elt = peek_NCListWalkingStackElt(); stack_elt->n++; parent_nclist = stack_elt->parent_nclist; if (stack_elt->n < parent_nclist->nchildren) { /* Move down thru the next children. */ return move_down(GET_NCLIST(stack_elt)); } /* All children have been treated --> move 1 level up. */ NCList_walking_stack_depth--; return parent_nclist; } /**************************************************************************** * Test the top-down and bottom-up non-recursive walks on an NCList structure */ /* static void print_NCList_walking_stack() { int d; printf("NCList_walking_stack:"); for (d = 0; d < NCList_walking_stack_depth; d++) printf(" %d", NCList_walking_stack[d].n); printf("\n"); return; } static void print_NCList_node(const NCList *nclist, int depth) { int d, n; for (d = 0; d < depth; d++) printf("-"); printf(" "); printf("NCList node at address %p:\n", nclist); for (d = 0; d < depth; d++) printf("-"); printf(" "); printf(" buflength=%d; nchildren=%d\n", nclist->buflength, nclist->nchildren); for (d = 0; d < depth; d++) printf("-"); printf(" "); printf(" rgidbuf:"); for (n = 0; n < nclist->nchildren; n++) printf(" %d", nclist->rgidbuf[n]); printf("\n"); return; } static void print_NCList_rec(const NCList *nclist, int depth) { int n; print_NCList_node(nclist, depth); for (n = 0; n < nclist->nchildren; n++) print_NCList_rec(nclist->childrenbuf + n, depth + 1); return; } static void test_complete_top_down_walk(const NCList *top_nclist) { const NCList *nclist; printf("======= START complete top-down walk ========\n"); RESET_NCLIST_WALKING_STACK(); for (nclist = top_nclist; nclist != NULL; nclist = next_top_down(nclist)) { print_NCList_walking_stack(); print_NCList_node(nclist, NCList_walking_stack_depth); printf("\n"); fflush(stdout); } printf("======== END complete top-down walk =========\n"); return; } static void test_complete_bottom_up_walk(const NCList *top_nclist) { const NCList *nclist; printf("======= START complete bottom-up walk =======\n"); RESET_NCLIST_WALKING_STACK(); for (nclist = move_down(top_nclist); nclist != NULL; nclist = next_bottom_up()) { print_NCList_walking_stack(); print_NCList_node(nclist, NCList_walking_stack_depth); printf("\n"); fflush(stdout); } printf("======== END complete bottom-up walk ========\n"); return; } */ /**************************************************************************** * free_NCList() */ static void free_NCList(const NCList *top_nclist) { const NCList *nclist; /* Complete bottom-up walk. */ RESET_NCLIST_WALKING_STACK(); for (nclist = move_down(top_nclist); nclist != NULL; nclist = next_bottom_up()) { if (nclist->buflength != 0) { free(nclist->childrenbuf); free(nclist->rgidbuf); } } return; } /**************************************************************************** * C_new_NCList() and C_free_NCList() */ /* --- .Call ENTRY POINT --- */ SEXP C_new_NCList() { NCList *top_nclist; //init_clock("preprocessing: T1 = "); top_nclist = (NCList *) malloc(sizeof(NCList)); if (top_nclist == NULL) error("C_new_NCList: memory allocation failed"); init_NCList(top_nclist); return R_MakeExternalPtr(top_nclist, R_NilValue, R_NilValue); } /* --- .Call ENTRY POINT --- */ SEXP C_free_NCList(SEXP nclist_xp) { NCList *top_nclist; top_nclist = (NCList *) R_ExternalPtrAddr(nclist_xp); if (top_nclist == NULL) error("C_free_NCList: pointer to NCList struct is NULL"); free_NCList(top_nclist); free(top_nclist); R_SetExternalPtrAddr(nclist_xp, NULL); return R_NilValue; } /**************************************************************************** * C_build_NCList() */ static void extend_NCList(NCList *nclist) { int old_buflength, new_buflength; NCList *new_childrenbuf; int *new_rgidbuf; old_buflength = nclist->buflength; if (old_buflength == 0) { new_buflength = 1; } else { if (old_buflength < 256) new_buflength = 16 * old_buflength; else if (old_buflength < 131072) new_buflength = 8 * old_buflength; else if (old_buflength < 8388608) new_buflength = 4 * old_buflength; else if (old_buflength < 134217728) new_buflength = 2 * old_buflength; else new_buflength = old_buflength + 67108864; } new_childrenbuf = (NCList *) realloc2(nclist->childrenbuf, new_buflength, old_buflength, sizeof(NCList)); new_rgidbuf = (int *) realloc2(nclist->rgidbuf, new_buflength, old_buflength, sizeof(int)); nclist->buflength = new_buflength; nclist->childrenbuf = new_childrenbuf; nclist->rgidbuf = new_rgidbuf; return; } typedef struct NCList_building_stack_elt_t { NCList *nclist; int rgid; /* range ID */ } NCListBuildingStackElt; static NCListBuildingStackElt *NCList_building_stack = NULL; static int NCList_building_stack_maxdepth = 0; static NCListBuildingStackElt append_NCList_elt(NCList *landing_nclist, int rgid) { int nchildren; NCListBuildingStackElt stack_elt; nchildren = landing_nclist->nchildren; if (nchildren == landing_nclist->buflength) extend_NCList(landing_nclist); stack_elt.nclist = landing_nclist->childrenbuf + nchildren; stack_elt.rgid = landing_nclist->rgidbuf[nchildren] = rgid; init_NCList(stack_elt.nclist); landing_nclist->nchildren++; return stack_elt; } static void extend_NCList_building_stack() { int new_maxdepth; new_maxdepth = get_new_maxdepth(NCList_building_stack_maxdepth); NCList_building_stack = (NCListBuildingStackElt *) realloc2(NCList_building_stack, new_maxdepth, NCList_building_stack_maxdepth, sizeof(NCListBuildingStackElt)); NCList_building_stack_maxdepth = new_maxdepth; return; } static void build_NCList(NCList *top_nclist, const int *x_start_p, const int *x_end_p, const int *x_subset_p, int x_len) { int *base, rgid, retcode, i, d, current_end; NCList *landing_nclist; NCListBuildingStackElt stack_elt; /* Compute the order of 'x' (or its subset) in 'base'. The sorting is first by ascending start then by descending end. */ base = (int *) malloc(sizeof(int) * x_len); if (base == NULL) error("build_NCList: memory allocation failed"); if (x_subset_p == NULL) { for (rgid = 0; rgid < x_len; rgid++) base[rgid] = rgid; } else { memcpy(base, x_subset_p, sizeof(int) * x_len); } retcode = sort_int_pairs(base, x_len, x_start_p, x_end_p, 0, 1, 1, NULL, NULL); if (retcode != 0) { free(base); error("build_NCList: memory allocation failed"); } init_NCList(top_nclist); for (i = 0, d = -1; i < x_len; i++) { rgid = base[i]; current_end = x_end_p[rgid]; while (d >= 0 && x_end_p[NCList_building_stack[d].rgid] < current_end) d--; // unstack landing_nclist = d == -1 ? top_nclist : NCList_building_stack[d].nclist; // append 'rgid' to landing_nclist stack_elt = append_NCList_elt(landing_nclist, rgid); // put stack_elt on stack if (++d == NCList_building_stack_maxdepth) extend_NCList_building_stack(); NCList_building_stack[d] = stack_elt; } free(base); return; } /* --- .Call ENTRY POINT --- */ SEXP C_build_NCList(SEXP nclist_xp, SEXP x_start, SEXP x_end, SEXP x_subset) { NCList *top_nclist; int x_len; const int *x_start_p, *x_end_p, *x_subset_p; top_nclist = (NCList *) R_ExternalPtrAddr(nclist_xp); if (top_nclist == NULL) error("C_build_NCList: pointer to NCList struct is NULL"); x_len = check_integer_pairs(x_start, x_end, &x_start_p, &x_end_p, "start(x)", "end(x)"); if (x_subset == R_NilValue) { x_subset_p = NULL; } else { x_subset_p = INTEGER(x_subset); x_len = LENGTH(x_subset); } build_NCList(top_nclist, x_start_p, x_end_p, x_subset_p, x_len); return nclist_xp; } /**************************************************************************** * C_new_NCListAsINTSXP_from_NCList() */ /* * Setting an arbitrary hard limit on the max depth of NCListAsINTSXP objects * to prevent C stack overflows when walking on them recursively (e.g. with * print_NCListAsINTSXP_rec() or NCListAsINTSXP_get_y_overlaps_rec()). * A better solution would be to not use recursive code at all when traversing * an NCListAsINTSXP object. Then NCListAsINTSXP objects of arbitrary depth * could be supported and it wouldn't be necessary to set the limit below. */ #define NCListAsINTSXP_MAX_DEPTH 100000 #define NCListAsINTSXP_NCHILDREN(nclist) ((nclist)[0]) #define NCListAsINTSXP_RGIDS(nclist) ((nclist) + 1) #define NCListAsINTSXP_OFFSETS(nclist) \ ((nclist) + 1 + NCListAsINTSXP_NCHILDREN(nclist)) static int compute_NCListAsINTSXP_length(const NCList *top_nclist) { unsigned int ans_len; const NCList *nclist; int nchildren; ans_len = 0U; /* Complete bottom-up walk (top-down walk would also work). */ RESET_NCLIST_WALKING_STACK(); for (nclist = move_down(top_nclist); nclist != NULL; nclist = next_bottom_up()) { if (NCList_walking_stack_depth > NCListAsINTSXP_MAX_DEPTH) error("compute_NCListAsINTSXP_length: " "NCList object is too deep (has more " "than\n %d levels of nested ranges)", NCListAsINTSXP_MAX_DEPTH); nchildren = nclist->nchildren; if (nchildren == 0) continue; ans_len += 1U + 2U * (unsigned int) nchildren; if (ans_len > INT_MAX) error("compute_NCListAsINTSXP_length: " "NCList object is too big to fit in " "an integer vector"); } return (int) ans_len; } /* Recursive! */ static int dump_NCList_to_int_array_rec(const NCList *nclist, int *out) { int nchildren, offset, dump_len, n; const NCList *child_nclist; const int *rgid_p; nchildren = nclist->nchildren; if (nchildren == 0) return 0; offset = 1 + 2 * nchildren; NCListAsINTSXP_NCHILDREN(out) = nchildren; for (n = 0, child_nclist = nclist->childrenbuf, rgid_p = nclist->rgidbuf; n < nchildren; n++, child_nclist++, rgid_p++) { NCListAsINTSXP_RGIDS(out)[n] = *rgid_p; dump_len = dump_NCList_to_int_array_rec(child_nclist, out + offset); NCListAsINTSXP_OFFSETS(out)[n] = dump_len != 0 ? offset : -1; offset += dump_len; } return offset; } /* --- .Call ENTRY POINT --- */ SEXP C_new_NCListAsINTSXP_from_NCList(SEXP nclist_xp) { SEXP ans; const NCList *top_nclist; int ans_len; top_nclist = (NCList *) R_ExternalPtrAddr(nclist_xp); if (top_nclist == NULL) error("C_new_NCListAsINTSXP_from_NCList: " "pointer to NCList struct is NULL"); ans_len = compute_NCListAsINTSXP_length(top_nclist); PROTECT(ans = NEW_INTEGER(ans_len)); dump_NCList_to_int_array_rec(top_nclist, INTEGER(ans)); UNPROTECT(1); //print_elapsed_time(); return ans; } /**************************************************************************** * C_print_NCListAsINTSXP() */ /* Recursive! Print 1 line per range in 'nclist'. Return max depth. */ static int print_NCListAsINTSXP_rec(const int *nclist, const int *x_start_p, const int *x_end_p, int depth, const char *format) { int maxdepth, nchildren, n, d, rgid, offset, tmp; maxdepth = depth; nchildren = NCListAsINTSXP_NCHILDREN(nclist); for (n = 0; n < nchildren; n++) { for (d = 1; d < depth; d++) Rprintf("|"); rgid = NCListAsINTSXP_RGIDS(nclist)[n]; Rprintf(format, rgid + 1); Rprintf(": [%d, %d]\n", x_start_p[rgid], x_end_p[rgid]); offset = NCListAsINTSXP_OFFSETS(nclist)[n]; if (offset != -1) { tmp = print_NCListAsINTSXP_rec(nclist + offset, x_start_p, x_end_p, depth + 1, format); if (tmp > maxdepth) maxdepth = tmp; } } return maxdepth; } /* --- .Call ENTRY POINT --- */ SEXP C_print_NCListAsINTSXP(SEXP x_nclist, SEXP x_start, SEXP x_end) { const int *top_nclist; int x_len, max_digits, maxdepth; const int *x_start_p, *x_end_p; char format[15]; top_nclist = INTEGER(x_nclist); x_len = check_integer_pairs(x_start, x_end, &x_start_p, &x_end_p, "start(x)", "end(x)"); if (x_len == 0) { maxdepth = 0; } else { max_digits = (int) log10((double) x_len) + 1; sprintf(format, "%c0%d%c", '%', max_digits, 'd'); maxdepth = print_NCListAsINTSXP_rec(top_nclist, x_start_p, x_end_p, 1, format); } Rprintf("max depth = %d\n", maxdepth); return R_NilValue; } /**************************************************************************** * pp_find_overlaps() */ /* 6 supported types of overlap. */ #define TYPE_ANY 1 #define TYPE_START 2 #define TYPE_END 3 #define TYPE_WITHIN 4 #define TYPE_EXTEND 5 #define TYPE_EQUAL 6 typedef struct backpack_t { /* Members set by prepare_backpack(). */ const int *x_start_p; const int *x_end_p; const int *x_space_p; int maxgap; int minoverlap; int overlap_type; int min_overlap_score0; int (*is_hit_fun)(int rgid, const struct backpack_t *backpack); int select_mode; int circle_len; int pp_is_q; IntAE *hits; int *direct_out; /* Members set by update_backpack(). */ int y_rgid; int y_start; int y_end; int y_space; int min_x_end; int max_x_start; } Backpack; static int overlap_score0(int x_start, int x_end, int y_start, int y_end) { return (x_end <= y_end ? x_end : y_end) - (x_start >= y_start ? x_start : y_start); } static int is_TYPE_ANY_hit(int rgid, const Backpack *backpack) { int x_start, x_end; if (backpack->minoverlap == 0) return 1; /* Check the score */ x_start = backpack->x_start_p[rgid]; x_end = backpack->x_end_p[rgid]; return x_end - x_start >= backpack->min_overlap_score0; } static int is_TYPE_START_hit(int rgid, const Backpack *backpack) { int x_start, x_end, d, score0; /* Check the distance between the starts. */ x_start = backpack->x_start_p[rgid]; d = abs(backpack->y_start - x_start); if (d > backpack->maxgap) return 0; /* Check the score, but only if minoverlap != 0. */ if (backpack->minoverlap == 0) return 1; x_end = backpack->x_end_p[rgid]; score0 = overlap_score0(x_start, x_end, backpack->y_start, backpack->y_end); return score0 >= backpack->min_overlap_score0; } static int is_TYPE_END_hit(int rgid, const Backpack *backpack) { int x_start, x_end, d, score0; /* Check the distance between the ends. */ x_end = backpack->x_end_p[rgid]; d = abs(backpack->y_end - x_end); if (backpack->circle_len != NA_INTEGER) d %= backpack->circle_len; if (d > backpack->maxgap) return 0; /* Check the score, but only if minoverlap != 0. */ if (backpack->minoverlap == 0) return 1; x_start = backpack->x_start_p[rgid]; score0 = overlap_score0(x_start, x_end, backpack->y_start, backpack->y_end); return score0 >= backpack->min_overlap_score0; } static int is_TYPE_WITHIN_hit(int rgid, const Backpack *backpack) { int x_start, x_end, d; if (backpack->maxgap == 0) return 1; x_start = backpack->x_start_p[rgid]; x_end = backpack->x_end_p[rgid]; d = backpack->y_start - x_start + x_end - backpack->y_end; return d <= backpack->maxgap; } static int is_TYPE_EXTEND_hit(int rgid, const Backpack *backpack) { int x_start, x_end, d1, d2; x_start = backpack->x_start_p[rgid]; d1 = x_start - backpack->y_start; if (d1 < 0) return 0; x_end = backpack->x_end_p[rgid]; d2 = backpack->y_end - x_end; if (d2 < 0) return 0; if (x_end - x_start < backpack->min_overlap_score0) return 0; if (backpack->maxgap == 0) return 1; return d1 + d2 <= backpack->maxgap; } static int is_TYPE_EQUAL_hit(int rgid, const Backpack *backpack) { int x_start, x_end, d, score0; /* Check the distance between the starts. */ x_start = backpack->x_start_p[rgid]; d = abs(backpack->y_start - x_start); if (d > backpack->maxgap) return 0; /* Check the distance between the ends. */ x_end = backpack->x_end_p[rgid]; d = abs(backpack->y_end - x_end); if (backpack->circle_len != NA_INTEGER) d %= backpack->circle_len; if (d > backpack->maxgap) return 0; /* Check the score, but only if minoverlap != 0. */ if (backpack->minoverlap == 0) return 1; score0 = overlap_score0(x_start, x_end, backpack->y_start, backpack->y_end); return score0 >= backpack->min_overlap_score0; } static int is_hit(int rgid, const Backpack *backpack) { int x_space; /* 1st: perform checks common to all types of overlaps */ if (backpack->x_space_p != NULL && backpack->y_space != 0) { x_space = backpack->x_space_p[rgid]; if (x_space != 0 && x_space != backpack->y_space) return 0; } /* 2nd: perform checks specific to the current type of overlaps (by calling the callback function for this type) */ return backpack->is_hit_fun(rgid, backpack); } static void report_hit(int rgid, const Backpack *backpack) { int rgid1, q_rgid, s_rgid1, *selection_p; rgid1 = rgid + 1; /* 1-based */ if (backpack->select_mode == ALL_HITS) { /* Report the hit. */ IntAE_insert_at(backpack->hits, IntAE_get_nelt(backpack->hits), rgid1); return; } /* Update current selection if necessary. */ if (backpack->pp_is_q) { q_rgid = rgid; s_rgid1 = backpack->y_rgid + 1; } else { q_rgid = backpack->y_rgid; s_rgid1 = rgid1; } selection_p = backpack->direct_out + q_rgid; if (backpack->select_mode == COUNT_HITS) { (*selection_p)++; return; } if (*selection_p == NA_INTEGER || (backpack->select_mode == FIRST_HIT) == (s_rgid1 < *selection_p)) *selection_p = s_rgid1; return; } static Backpack prepare_backpack(const int *x_start_p, const int *x_end_p, const int *x_space_p, int maxgap, int minoverlap, int overlap_type, int select_mode, int circle_len, int pp_is_q, IntAE *hits, int *direct_out) { Backpack backpack; backpack.x_start_p = x_start_p; backpack.x_end_p = x_end_p; backpack.x_space_p = x_space_p; backpack.maxgap = maxgap; backpack.minoverlap = minoverlap; backpack.overlap_type = overlap_type; if (overlap_type == TYPE_ANY) backpack.min_overlap_score0 = minoverlap - maxgap - 2; else backpack.min_overlap_score0 = minoverlap - 1; /* set callback function for the current type of overlaps */ switch (overlap_type) { case TYPE_ANY: backpack.is_hit_fun = is_TYPE_ANY_hit; break; case TYPE_START: backpack.is_hit_fun = is_TYPE_START_hit; break; case TYPE_END: backpack.is_hit_fun = is_TYPE_END_hit; break; case TYPE_WITHIN: backpack.is_hit_fun = is_TYPE_WITHIN_hit; break; case TYPE_EXTEND: backpack.is_hit_fun = is_TYPE_EXTEND_hit; break; case TYPE_EQUAL: backpack.is_hit_fun = is_TYPE_EQUAL_hit; break; } backpack.select_mode = select_mode; backpack.circle_len = circle_len; backpack.pp_is_q = pp_is_q; backpack.hits = hits; backpack.direct_out = direct_out; return backpack; } static void update_backpack(Backpack *backpack, int y_rgid, int y_start, int y_end, int y_space) { int slack, min_x_end, max_x_start, min_overlap_score0; backpack->y_rgid = y_rgid; backpack->y_start = y_start; backpack->y_end = y_end; backpack->y_space = y_space; /* set 'min_x_end' and 'max_x_start' */ if (backpack->overlap_type == TYPE_ANY) { if (backpack->minoverlap == 0) { slack = backpack->maxgap + 1; } else { slack = 1 - backpack->minoverlap; } backpack->min_x_end = y_start - slack; backpack->max_x_start = y_end + slack; return; } if (backpack->overlap_type == TYPE_WITHIN) { backpack->min_x_end = backpack->y_end; backpack->max_x_start = backpack->y_start; return; } if (backpack->overlap_type == TYPE_EXTEND || backpack->minoverlap != 0 || backpack->circle_len != NA_INTEGER) { min_overlap_score0 = backpack->min_overlap_score0; backpack->min_x_end = y_start + min_overlap_score0; backpack->max_x_start = y_end - min_overlap_score0; if (backpack->overlap_type == TYPE_EXTEND) return; } /* TYPE_START, TYPE_END, or TYPE_EQUAL */ /* min_x_end */ if (backpack->overlap_type == TYPE_START) { /* TYPE_START */ if (backpack->minoverlap == 0) backpack->min_x_end = y_start - backpack->maxgap - 1; } else if (backpack->circle_len == NA_INTEGER) { /* TYPE_END or TYPE_EQUAL */ min_x_end = y_end - backpack->maxgap; if (backpack->minoverlap == 0 || min_x_end > backpack->min_x_end) backpack->min_x_end = min_x_end; } /* max_x_start */ if (backpack->overlap_type == TYPE_END) { /* TYPE_END */ if (backpack->minoverlap == 0) backpack->max_x_start = y_end + backpack->maxgap + 1; //} else if (backpack->circle_len == NA_INTEGER) { } else { /* TYPE_START or TYPE_EQUAL */ max_x_start = y_start + backpack->maxgap; if (backpack->minoverlap == 0 || max_x_start < backpack->max_x_start) backpack->max_x_start = max_x_start; } //printf("y_start=%d y_end=%d min_x_end=%d max_x_start=%d\n", // y_start, y_end, backpack->min_x_end, backpack->max_x_start); return; } static void shift_y(Backpack *backpack, int shift) { backpack->y_start += shift; backpack->y_end += shift; backpack->min_x_end += shift; backpack->max_x_start += shift; return; } typedef void (*GetYOverlapsFunType)(const void *x_nclist, const Backpack *backpack); static void pp_find_overlaps( const int *q_start_p, const int *q_end_p, const int *q_space_p, const int *q_subset_p, int q_len, const int *s_start_p, const int *s_end_p, const int *s_space_p, const int *s_subset_p, int s_len, int maxgap, int minoverlap, int overlap_type, int select_mode, int circle_len, const void *pp, int pp_is_q, GetYOverlapsFunType get_y_overlaps_fun, IntAE *qh_buf, IntAE *sh_buf, int *direct_out) { const int *x_start_p, *x_end_p, *x_space_p, *y_start_p, *y_end_p, *y_space_p, *y_subset_p; int y_len, backpack_select_mode, i, j, y_start, y_end, old_nhit, new_nhit, k; IntAE *xh_buf, *yh_buf; Backpack backpack; if (q_len == 0 || s_len == 0) return; if (pp_is_q) { x_start_p = q_start_p; x_end_p = q_end_p; x_space_p = q_space_p; xh_buf = qh_buf; y_start_p = s_start_p; y_end_p = s_end_p; y_space_p = s_space_p; y_subset_p = s_subset_p; y_len = s_len; yh_buf = sh_buf; if (overlap_type == TYPE_WITHIN) overlap_type = TYPE_EXTEND; else if (overlap_type == TYPE_EXTEND) overlap_type = TYPE_WITHIN; } else { x_start_p = s_start_p; x_end_p = s_end_p; x_space_p = s_space_p; xh_buf = sh_buf; y_start_p = q_start_p; y_end_p = q_end_p; y_space_p = q_space_p; y_subset_p = q_subset_p; y_len = q_len; yh_buf = qh_buf; } if (circle_len != NA_INTEGER && select_mode == COUNT_HITS) backpack_select_mode = ALL_HITS; else backpack_select_mode = select_mode; backpack = prepare_backpack(x_start_p, x_end_p, x_space_p, maxgap, minoverlap, overlap_type, backpack_select_mode, circle_len, pp_is_q, xh_buf, direct_out); for (i = 0; i < y_len; i++) { j = y_subset_p == NULL ? i : y_subset_p[i]; y_start = y_start_p[j]; y_end = y_end_p[j]; if (y_end - y_start < backpack.min_overlap_score0) continue; update_backpack(&backpack, j, y_start, y_end, y_space_p == NULL ? 0 : y_space_p[j]); /* pass 0 */ get_y_overlaps_fun(pp, &backpack); if (circle_len == NA_INTEGER) goto life_is_good; if (select_mode == ARBITRARY_HIT && !pp_is_q && direct_out[j] != NA_INTEGER) goto life_is_good; /* pass 1 */ shift_y(&backpack, - circle_len); get_y_overlaps_fun(pp, &backpack); if (select_mode == ARBITRARY_HIT && !pp_is_q && direct_out[j] != NA_INTEGER) goto life_is_good; /* pass 2 */ shift_y(&backpack, 2 * circle_len); get_y_overlaps_fun(pp, &backpack); life_is_good: if (backpack_select_mode != ALL_HITS) continue; old_nhit = IntAE_get_nelt(yh_buf); if (circle_len != NA_INTEGER) { /* delete duplicates */ IntAE_qsort(xh_buf, old_nhit, 0); IntAE_uniq(xh_buf, old_nhit); } new_nhit = IntAE_get_nelt(xh_buf); if (select_mode != COUNT_HITS) { j++; /* 1-based */ for (k = old_nhit; k < new_nhit; k++) IntAE_insert_at(yh_buf, k, j); continue; } if (pp_is_q) { for (k = old_nhit; k < new_nhit; k++) direct_out[xh_buf->elts[k] - 1]++; } else { direct_out[j] += new_nhit - old_nhit; } IntAE_set_nelt(xh_buf, old_nhit); } return; } /**************************************************************************** * int_bsearch() */ /* * 'subset_len' is assumed to be > 0. * Return the first index 'n' for which 'base[subset[n]] >= min', or * 'subset_len' if there is no such index. * TODO: Maybe move this to int_utils.c or sort_utils.c in S4Vectors/src/ */ static int int_bsearch(const int *subset, int subset_len, const int *base, int min) { int n1, n2, n, b; /* Check first element. */ n1 = 0; b = base[subset[n1]]; if (b >= min) return n1; /* Check last element. */ n2 = subset_len - 1; b = base[subset[n2]]; if (b < min) return subset_len; if (b == min) return n2; /* Binary search. Seems that using >> 1 instead of / 2 is faster, even when compiling with 'gcc -O2' (one would hope that the optimizer is able to do that kind of optimization). */ while ((n = (n1 + n2) >> 1) != n1) { b = base[subset[n]]; if (b == min) return n; if (b < min) n1 = n; else n2 = n; } return n2; } /**************************************************************************** * NCList_get_y_overlaps() */ /* Recursive! */ static void NCList_get_y_overlaps_rec(const NCList *x_nclist, const Backpack *backpack) { const int *rgidbuf; int nchildren, n, rgid; const NCList *child_nclist; rgidbuf = x_nclist->rgidbuf; nchildren = x_nclist->nchildren; n = int_bsearch(rgidbuf, nchildren, backpack->x_end_p, backpack->min_x_end); for (child_nclist = x_nclist->childrenbuf + n, rgidbuf = rgidbuf + n; n < nchildren; n++, child_nclist++, rgidbuf++) { rgid = *rgidbuf; if (backpack->x_start_p[rgid] > backpack->max_x_start) break; if (is_hit(rgid, backpack)) { report_hit(rgid, backpack); if (backpack->select_mode == ARBITRARY_HIT && !backpack->pp_is_q) break; } if (child_nclist->nchildren != 0) NCList_get_y_overlaps_rec(child_nclist, backpack); } return; } static int find_landing_child(const NCList *nclist, const Backpack *backpack) { int nchildren, n; nchildren = nclist->nchildren; if (nchildren == 0) return -1; n = int_bsearch(nclist->rgidbuf, nchildren, backpack->x_end_p, backpack->min_x_end); if (n >= nchildren) return -1; return n; } /* Non-recursive version of NCList_get_y_overlaps_rec(). */ static void NCList_get_y_overlaps(const NCList *top_nclist, const Backpack *backpack) { int n, rgid; const NCList *nclist; NCListWalkingStackElt *stack_elt; /* Incomplete top-down walk: only a pruned version of the full tree (i.e. a subtree starting at the same top node) will be visited. */ RESET_NCLIST_WALKING_STACK(); n = find_landing_child(top_nclist, backpack); if (n < 0) return; nclist = move_to_child(top_nclist, n); while (nclist != NULL) { stack_elt = peek_NCListWalkingStackElt(); rgid = GET_RGID(stack_elt); if (backpack->x_start_p[rgid] > backpack->max_x_start) { /* Skip all further siblings of 'nclist'. */ nclist = move_to_right_uncle(); continue; } if (is_hit(rgid, backpack)) { report_hit(rgid, backpack); if (backpack->select_mode == ARBITRARY_HIT && !backpack->pp_is_q) return; /* we're done! */ } n = find_landing_child(nclist, backpack); /* Skip first 'n' or all children of 'nclist'. */ nclist = n >= 0 ? move_to_child(nclist, n) : move_to_right_sibling_or_uncle(nclist); } return; } /**************************************************************************** * NCListAsINTSXP_get_y_overlaps() */ /* Recursive! */ static void NCListAsINTSXP_get_y_overlaps_rec(const int *x_nclist, const Backpack *backpack) { const int *rgid_p, *offset_p; int nchildren, n, rgid, offset; rgid_p = NCListAsINTSXP_RGIDS(x_nclist); nchildren = NCListAsINTSXP_NCHILDREN(x_nclist); n = int_bsearch(rgid_p, nchildren, backpack->x_end_p, backpack->min_x_end); for (rgid_p = rgid_p + n, offset_p = NCListAsINTSXP_OFFSETS(x_nclist) + n; n < nchildren; n++, rgid_p++, offset_p++) { rgid = *rgid_p; if (backpack->x_start_p[rgid] > backpack->max_x_start) break; if (is_hit(rgid, backpack)) { report_hit(rgid, backpack); if (backpack->select_mode == ARBITRARY_HIT && !backpack->pp_is_q) break; } offset = *offset_p; if (offset != -1) NCListAsINTSXP_get_y_overlaps_rec(x_nclist + offset, backpack); } return; } /**************************************************************************** * find_overlaps() */ static int find_overlaps( const int *q_start_p, const int *q_end_p, const int *q_space_p, const int *q_subset_p, int q_len, const int *s_start_p, const int *s_end_p, const int *s_space_p, const int *s_subset_p, int s_len, int maxgap, int minoverlap, int overlap_type, int select_mode, int circle_len, SEXP nclist_sxp, int pp_is_q, IntAE *qh_buf, IntAE *sh_buf, int *direct_out) { NCList nclist; const void *pp; GetYOverlapsFunType get_y_overlaps_fun; if (q_len == 0 || s_len == 0) return 0; if (nclist_sxp == R_NilValue) { /* On-the-fly preprocessing. */ pp_is_q = q_len < s_len; if (pp_is_q) build_NCList(&nclist, q_start_p, q_end_p, q_subset_p, q_len); else build_NCList(&nclist, s_start_p, s_end_p, s_subset_p, s_len); pp = &nclist; get_y_overlaps_fun = (GetYOverlapsFunType) NCList_get_y_overlaps; } else { pp = INTEGER(nclist_sxp); get_y_overlaps_fun = (GetYOverlapsFunType) NCListAsINTSXP_get_y_overlaps_rec; } pp_find_overlaps( q_start_p, q_end_p, q_space_p, q_subset_p, q_len, s_start_p, s_end_p, s_space_p, s_subset_p, s_len, maxgap, minoverlap, overlap_type, select_mode, circle_len, pp, pp_is_q, get_y_overlaps_fun, qh_buf, sh_buf, direct_out); if (nclist_sxp == R_NilValue) free_NCList(&nclist); return pp_is_q; } /**************************************************************************** * Helper functions shared by C_find_overlaps_NCList() and * C_find_overlaps_in_groups_NCList() */ static int get_overlap_type(SEXP type) { const char *type0; if (!IS_CHARACTER(type) || LENGTH(type) != 1) error("'type' must be a single string"); type = STRING_ELT(type, 0); if (type == NA_STRING) error("'type' cannot be NA"); type0 = CHAR(type); if (strcmp(type0, "any") == 0) return TYPE_ANY; if (strcmp(type0, "start") == 0) return TYPE_START; if (strcmp(type0, "end") == 0) return TYPE_END; if (strcmp(type0, "within") == 0) return TYPE_WITHIN; if (strcmp(type0, "extend") == 0) return TYPE_EXTEND; if (strcmp(type0, "equal") == 0) return TYPE_EQUAL; error("'type' must be \"any\", \"start\", \"end\", " "\"within\", \"extend\", or \"equal\""); return 0; } static int get_maxgap0(SEXP maxgap, int overlap_type) { int maxgap0; if (!IS_INTEGER(maxgap) || LENGTH(maxgap) != 1) error("'maxgap' must be a single integer"); maxgap0 = INTEGER(maxgap)[0]; if (maxgap0 == NA_INTEGER) error("'maxgap' cannot be NA"); if (maxgap0 < -1) error("'maxgap' must be >= -1"); if (maxgap0 == -1 && overlap_type != TYPE_ANY) maxgap0 = 0; return maxgap0; } static int get_minoverlap0(SEXP minoverlap, int maxgap, int overlap_type) { int minoverlap0; if (!IS_INTEGER(minoverlap) || LENGTH(minoverlap) != 1) error("'minoverlap' must be a single integer"); minoverlap0 = INTEGER(minoverlap)[0]; if (minoverlap0 == NA_INTEGER) error("'minoverlap' cannot be NA"); if (minoverlap0 < 0) error("'minoverlap' cannot be negative"); if (overlap_type == TYPE_ANY && maxgap != -1 && minoverlap0 != 0) error("when 'type' is \"any\", at least one of 'maxgap' " "and 'minoverlap' must be set to its default value"); return minoverlap0; } static int get_circle_length(SEXP circle_length) { int circle_len; if (!IS_INTEGER(circle_length) || LENGTH(circle_length) != 1) error("'circle_length' must be a single integer"); circle_len = INTEGER(circle_length)[0]; if (circle_len != NA_INTEGER && circle_len <= 0) error("'circle_length' must be a single " "positive integer or NA"); return circle_len; } static SEXP new_direct_out(int q_len, int select_mode) { SEXP ans; int init_val, i, *ans_elt; PROTECT(ans = NEW_INTEGER(q_len)); init_val = select_mode == COUNT_HITS ? 0 : NA_INTEGER; for (i = 0, ans_elt = INTEGER(ans); i < q_len; i++, ans_elt++) *ans_elt = init_val; UNPROTECT(1); return ans; } /**************************************************************************** * C_find_overlaps_NCList() * * --- .Call ENTRY POINT --- * Args: * q_start, q_end: Integer vectors of same length. * s_start, s_end: Integer vectors of same length. * nclist: An integer vector representing the Nested Containment * List for 'y'. * nclist_is_q: TRUE or FALSE. * maxgap: See get_maxgap0() C function. * minoverlap: See get_minoverlap0() C function. * type: See get_overlap_type() C function. * select: See _get_select_mode() C function in S4Vectors. * circle_length: A single positive integer or NA_INTEGER. */ SEXP C_find_overlaps_NCList( SEXP q_start, SEXP q_end, SEXP s_start, SEXP s_end, SEXP nclist, SEXP nclist_is_q, SEXP maxgap, SEXP minoverlap, SEXP type, SEXP select, SEXP circle_length) { int q_len, s_len, maxgap0, minoverlap0, overlap_type, select_mode, circle_len, *direct_out, pp_is_q; const int *q_start_p, *q_end_p, *s_start_p, *s_end_p; IntAE *qh_buf, *sh_buf; SEXP ans; q_len = check_integer_pairs(q_start, q_end, &q_start_p, &q_end_p, "start(q)", "end(q)"); s_len = check_integer_pairs(s_start, s_end, &s_start_p, &s_end_p, "start(s)", "end(s)"); overlap_type = get_overlap_type(type); maxgap0 = get_maxgap0(maxgap, overlap_type); minoverlap0 = get_minoverlap0(minoverlap, maxgap0, overlap_type); select_mode = get_select_mode(select); circle_len = get_circle_length(circle_length); qh_buf = new_IntAE(0, 0, 0); sh_buf = new_IntAE(0, 0, 0); direct_out = NULL; if (select_mode != ALL_HITS) { PROTECT(ans = new_direct_out(q_len, select_mode)); direct_out = INTEGER(ans); } //init_clock("find_overlaps: T2 = "); pp_is_q = find_overlaps( q_start_p, q_end_p, NULL, NULL, q_len, s_start_p, s_end_p, NULL, NULL, s_len, maxgap0, minoverlap0, overlap_type, select_mode, circle_len, nclist, LOGICAL(nclist_is_q)[0], qh_buf, sh_buf, direct_out); //print_elapsed_time(); if (select_mode != ALL_HITS) { UNPROTECT(1); return ans; } return new_Hits("SortedByQueryHits", qh_buf->elts, sh_buf->elts, IntAE_get_nelt(qh_buf), q_len, s_len, !pp_is_q); } /**************************************************************************** * C_find_overlaps_in_groups_NCList() * * --- .Call ENTRY POINT --- * Args: * q_start, q_end, q_space: Integer vectors of same length (or NULL for * 'q_space'). * q_groups: A CompressedIntegerList object of length NG1. Each list * element (integer vector) represents a group of 0-based * indices into 'q_start', 'q_end', and 'q_space'. * s_start, s_end, s_space: Integer vectors of same length (or NULL for * 's_space'). * s_groups: A CompressedIntegerList object of length NG2. Each list * element (integer vector) represents a group of 0-based * indices into 's_start', 's_end', and 's_space'. * nclists: A list of length >= min(NG1, NG2). Each list element must * be NULL or an integer vector representing a Nested * Containment List. * nclist_is_q: A logical vector parallel to 'nclists'. * maxgap: See get_maxgap0() C function. * minoverlap: See get_minoverlap0() C function. * type: See get_overlap_type() C function. * select: See _get_select_mode() C function in S4Vectors. * circle_length: An integer vector of length >= min(NG1, NG2) with positive * or NA values. */ SEXP C_find_overlaps_in_groups_NCList( SEXP q_start, SEXP q_end, SEXP q_space, SEXP q_groups, SEXP s_start, SEXP s_end, SEXP s_space, SEXP s_groups, SEXP nclists, SEXP nclist_is_q, SEXP maxgap, SEXP minoverlap, SEXP type, SEXP select, SEXP circle_length) { int q_len, s_len, NG1, NG2, maxgap0, minoverlap0, overlap_type, select_mode, NG, i, qi_len, si_len, *direct_out; const int *q_start_p, *q_end_p, *q_space_p, *s_start_p, *s_end_p, *s_space_p; CompressedIntsList_holder q_groups_holder, s_groups_holder; Ints_holder qi_group_holder, si_group_holder; IntAE *qh_buf, *sh_buf; SEXP ans; /* Check query. */ q_len = check_integer_pairs(q_start, q_end, &q_start_p, &q_end_p, "q_start", "q_end"); if (q_space == R_NilValue) { q_space_p = NULL; } else { if (LENGTH(q_space) != q_len) error("'q_space' must have the length of 'q_start'"); q_space_p = INTEGER(q_space); } q_groups_holder = _hold_CompressedIntegerList(q_groups); NG1 = _get_length_from_CompressedIntsList_holder(&q_groups_holder); /* Check subject. */ s_len = check_integer_pairs(s_start, s_end, &s_start_p, &s_end_p, "s_start", "s_end"); if (s_space == R_NilValue) { s_space_p = NULL; } else { if (LENGTH(s_space) != s_len) error("'s_space' must have the length of 's_start'"); s_space_p = INTEGER(s_space); } s_groups_holder = _hold_CompressedIntegerList(s_groups); NG2 = _get_length_from_CompressedIntsList_holder(&s_groups_holder); overlap_type = get_overlap_type(type); maxgap0 = get_maxgap0(maxgap, overlap_type); minoverlap0 = get_minoverlap0(minoverlap, maxgap0, overlap_type); select_mode = get_select_mode(select); qh_buf = new_IntAE(0, 0, 0); sh_buf = new_IntAE(0, 0, 0); direct_out = NULL; if (select_mode != ALL_HITS) { PROTECT(ans = new_direct_out(q_len, select_mode)); direct_out = INTEGER(ans); } NG = NG1 <= NG2 ? NG1 : NG2; for (i = 0; i < NG; i++) { qi_group_holder = _get_elt_from_CompressedIntsList_holder( &q_groups_holder, i); qi_len = qi_group_holder.length; si_group_holder = _get_elt_from_CompressedIntsList_holder( &s_groups_holder, i); si_len = si_group_holder.length; find_overlaps( q_start_p, q_end_p, q_space_p, qi_group_holder.ptr, qi_len, s_start_p, s_end_p, s_space_p, si_group_holder.ptr, si_len, maxgap0, minoverlap0, overlap_type, select_mode, INTEGER(circle_length)[i], VECTOR_ELT(nclists, i), LOGICAL(nclist_is_q)[i], qh_buf, sh_buf, direct_out); } if (select_mode != ALL_HITS) { UNPROTECT(1); return ans; } return new_Hits("SortedByQueryHits", qh_buf->elts, sh_buf->elts, IntAE_get_nelt(qh_buf), q_len, s_len, 0); } /**************************************************************************** Algorithm complexity ==================== X: length of object to preprocess Y: length of other object H: nb of hits (upper bound is X * Y) Time of preprocessing: T1 = a * X * log(X) Time of find_overlaps(..., select="all"): T2 = b * Y * log(X) + c * H Total time T is T1 + T2. ****************************************************************************/ IRanges/src/R_init_IRanges.c0000644000175100017510000001256414626176651016737 0ustar00biocbuildbiocbuild#include "IRanges.h" #define CALLMETHOD_DEF(fun, numArgs) {#fun, (DL_FUNC) &fun, numArgs} #define REGISTER_CCALLABLE(fun) \ R_RegisterCCallable("IRanges", #fun, (DL_FUNC) &fun) static const R_CallMethodDef callMethods[] = { /* Ranges_class.c */ CALLMETHOD_DEF(C_validate_Ranges, 3), /* IPosRanges_comparison.c */ CALLMETHOD_DEF(C_pcompare_IPosRanges, 4), /* IRanges_class.c */ CALLMETHOD_DEF(C_isNormal_IRanges, 1), CALLMETHOD_DEF(C_from_integer_to_IRanges, 1), CALLMETHOD_DEF(C_from_logical_to_NormalIRanges, 1), /* IRanges_constructor.c */ CALLMETHOD_DEF(C_solve_start_end_width, 3), CALLMETHOD_DEF(C_solve_user_SEW, 6), /* Grouping_class.c */ CALLMETHOD_DEF(C_members_H2LGrouping, 2), CALLMETHOD_DEF(C_vmembers_H2LGrouping, 2), /* RleViews_utils.c */ CALLMETHOD_DEF(C_viewMins_RleViews, 2), CALLMETHOD_DEF(C_viewMaxs_RleViews, 2), CALLMETHOD_DEF(C_viewSums_RleViews, 2), CALLMETHOD_DEF(C_viewMeans_RleViews, 2), CALLMETHOD_DEF(C_viewWhichMins_RleViews, 2), CALLMETHOD_DEF(C_viewWhichMaxs_RleViews, 2), /* SimpleIRangesList_class.c */ CALLMETHOD_DEF(C_isNormal_SimpleIRangesList, 2), CALLMETHOD_DEF(C_min_SimpleNormalIRangesList, 1), CALLMETHOD_DEF(C_max_SimpleNormalIRangesList, 1), /* CompressedIRangesList_class.c */ CALLMETHOD_DEF(C_isNormal_CompressedIRangesList, 2), CALLMETHOD_DEF(C_summary_CompressedIRangesList, 1), CALLMETHOD_DEF(C_min_CompressedNormalIRangesList, 2), CALLMETHOD_DEF(C_max_CompressedNormalIRangesList, 2), /* inter_range_methods.c */ CALLMETHOD_DEF(C_range_IRanges, 1), CALLMETHOD_DEF(C_reduce_IntegerRanges, 6), CALLMETHOD_DEF(C_reduce_CompressedIRangesList, 4), CALLMETHOD_DEF(C_gaps_IntegerRanges, 4), CALLMETHOD_DEF(C_gaps_CompressedIRangesList, 3), CALLMETHOD_DEF(C_disjointBins_IntegerRanges, 2), /* coverage_methods.c */ CALLMETHOD_DEF(C_coverage_IRanges, 6), CALLMETHOD_DEF(C_coverage_CompressedIRangesList, 6), /* NCList.c */ CALLMETHOD_DEF(C_new_NCList, 0), CALLMETHOD_DEF(C_free_NCList, 1), CALLMETHOD_DEF(C_build_NCList, 4), CALLMETHOD_DEF(C_new_NCListAsINTSXP_from_NCList, 1), CALLMETHOD_DEF(C_print_NCListAsINTSXP, 3), CALLMETHOD_DEF(C_find_overlaps_NCList, 11), CALLMETHOD_DEF(C_find_overlaps_in_groups_NCList, 15), /* CompressedAtomicList_utils.c */ CALLMETHOD_DEF(C_sum_CompressedLogicalList, 2), CALLMETHOD_DEF(C_sum_CompressedIntegerList, 2), CALLMETHOD_DEF(C_sum_CompressedNumericList, 2), CALLMETHOD_DEF(C_prod_CompressedLogicalList, 2), CALLMETHOD_DEF(C_prod_CompressedIntegerList, 2), CALLMETHOD_DEF(C_prod_CompressedNumericList, 2), CALLMETHOD_DEF(C_min_CompressedLogicalList, 2), CALLMETHOD_DEF(C_min_CompressedIntegerList, 2), CALLMETHOD_DEF(C_min_CompressedNumericList, 2), CALLMETHOD_DEF(C_max_CompressedLogicalList, 2), CALLMETHOD_DEF(C_max_CompressedIntegerList, 2), CALLMETHOD_DEF(C_max_CompressedNumericList, 2), CALLMETHOD_DEF(C_which_min_CompressedLogicalList, 1), CALLMETHOD_DEF(C_which_min_CompressedIntegerList, 1), CALLMETHOD_DEF(C_which_min_CompressedNumericList, 1), CALLMETHOD_DEF(C_which_max_CompressedLogicalList, 1), CALLMETHOD_DEF(C_which_max_CompressedIntegerList, 1), CALLMETHOD_DEF(C_which_max_CompressedNumericList, 1), CALLMETHOD_DEF(C_is_unsorted_CompressedLogicalList, 3), CALLMETHOD_DEF(C_is_unsorted_CompressedIntegerList, 3), CALLMETHOD_DEF(C_is_unsorted_CompressedNumericList, 3), /* extractListFragments.c */ CALLMETHOD_DEF(C_find_partition_overlaps, 3), {NULL, NULL, 0} }; void R_init_IRanges(DllInfo *info) { R_registerRoutines(info, NULL, callMethods, NULL, NULL); /* IPosRanges_comparison.c */ REGISTER_CCALLABLE(_overlap_code); REGISTER_CCALLABLE(_invert_overlap_code); /* IRanges_class.c */ REGISTER_CCALLABLE(_get_IRanges_start); REGISTER_CCALLABLE(_get_IRanges_width); REGISTER_CCALLABLE(_get_IRanges_names); REGISTER_CCALLABLE(_get_IRanges_length); REGISTER_CCALLABLE(_hold_IRanges); REGISTER_CCALLABLE(_get_length_from_IRanges_holder); REGISTER_CCALLABLE(_get_width_elt_from_IRanges_holder); REGISTER_CCALLABLE(_get_start_elt_from_IRanges_holder); REGISTER_CCALLABLE(_get_end_elt_from_IRanges_holder); REGISTER_CCALLABLE(_get_names_elt_from_IRanges_holder); REGISTER_CCALLABLE(_get_linear_subset_from_IRanges_holder); REGISTER_CCALLABLE(_set_IRanges_names); REGISTER_CCALLABLE(_copy_IRanges_slots); REGISTER_CCALLABLE(_new_IRanges); REGISTER_CCALLABLE(_new_IRanges_from_IntPairAE); REGISTER_CCALLABLE(_new_list_of_IRanges_from_IntPairAEAE); REGISTER_CCALLABLE(_alloc_IRanges); /* Grouping_class.c */ REGISTER_CCALLABLE(_get_H2LGrouping_high2low); REGISTER_CCALLABLE(_get_H2LGrouping_low2high); REGISTER_CCALLABLE(_get_Partitioning_names); REGISTER_CCALLABLE(_get_PartitioningByEnd_end); REGISTER_CCALLABLE(_new_PartitioningByEnd); /* CompressedList_class.c */ REGISTER_CCALLABLE(_get_CompressedList_unlistData); REGISTER_CCALLABLE(_get_CompressedList_partitioning); REGISTER_CCALLABLE(_get_CompressedList_length); REGISTER_CCALLABLE(_get_CompressedList_names); REGISTER_CCALLABLE(_new_CompressedList); REGISTER_CCALLABLE(_hold_CompressedIntegerList); REGISTER_CCALLABLE(_get_length_from_CompressedIntsList_holder); REGISTER_CCALLABLE(_get_elt_from_CompressedIntsList_holder); /* CompressedIRangesList_class.c */ REGISTER_CCALLABLE(_hold_CompressedIRangesList); REGISTER_CCALLABLE(_get_length_from_CompressedIRangesList_holder); REGISTER_CCALLABLE(_get_elt_from_CompressedIRangesList_holder); REGISTER_CCALLABLE(_get_eltNROWS_from_CompressedIRangesList_holder); return; } IRanges/src/Ranges_class.c0000644000175100017510000000452714626176651016507 0ustar00biocbuildbiocbuild/**************************************************************************** * Low-level manipulation of IntegerRanges objects * * Author: H. Pag\`es * ****************************************************************************/ #include "IRanges.h" #include "S4Vectors_interface.h" /* * --- .Call ENTRY POINT --- * Doesn't raise an error but returns NULL or a single string describing the * first encountered validity failure. */ SEXP C_validate_Ranges(SEXP x_start, SEXP x_end, SEXP x_width) { static char validity_failures[200]; int x_len, i, tmp; const int *x_start_p, *x_end_p, *x_width_p; if (!IS_INTEGER(x_start) || !IS_INTEGER(x_end) || !IS_INTEGER(x_width)) { snprintf(validity_failures, sizeof(validity_failures), "'%s', '%s', and '%s' must be integer vectors", "start(x)", "end(x)", "width(x)"); goto failure; } x_len = LENGTH(x_start); if (LENGTH(x_end) != x_len || LENGTH(x_width) != x_len) { snprintf(validity_failures, sizeof(validity_failures), "'%s', '%s', and '%s' must have the same length", "start(x)", "end(x)", "width(x)"); goto failure; } for (i = 0, x_start_p = INTEGER(x_start), x_end_p = INTEGER(x_end), x_width_p = INTEGER(x_width); i < x_len; i++, x_start_p++, x_end_p++, x_width_p++) { if (*x_start_p == NA_INTEGER || *x_end_p == NA_INTEGER || *x_width_p == NA_INTEGER) { snprintf(validity_failures, sizeof(validity_failures), "'%s', '%s', and '%s' cannot contain NAs", "start(x)", "end(x)", "width(x)"); goto failure; } if (*x_width_p < 0) { snprintf(validity_failures, sizeof(validity_failures), "'%s' cannot contain negative integers", "width(x)"); goto failure; } /* Safe because NA_INTEGER == INT_MIN (see R_ext/Arith.h) */ tmp = *x_start_p - 1; /* The purpose of the 1st part of the test (the part before ||) is to avoid an integer overflow during the 2nd part of the test (the part after ||). */ if (tmp > INT_MAX - *x_width_p || tmp + *x_width_p != *x_end_p) { snprintf(validity_failures, sizeof(validity_failures), "'%s[i] - %s[i] != %s[i] + 1' for i = %d", "end(x)", "start(x)", "width(x)", i + 1); goto failure; } } return R_NilValue; failure: return mkString(validity_failures); } IRanges/src/RleViews_utils.c0000644000175100017510000005167414626176651017070 0ustar00biocbuildbiocbuild#include "IRanges.h" #include #include #include #define R_INT_MIN (1+INT_MIN) /* --- .Call ENTRY POINT --- */ SEXP C_viewMins_RleViews(SEXP x, SEXP na_rm) { char type = '?'; int i, start, width, ans_len, index, lower_run, upper_run, upper_bound; int max_index, *lengths_elt; SEXP ans, subject, values, lengths, ranges, names; IRanges_holder ranges_holder; subject = GET_SLOT(x, install("subject")); values = GET_SLOT(subject, install("values")); lengths = GET_SLOT(subject, install("lengths")); ranges = GET_SLOT(x, install("ranges")); ranges_holder = _hold_IRanges(ranges); ans_len = _get_length_from_IRanges_holder(&ranges_holder); ans = R_NilValue; switch (TYPEOF(values)) { case LGLSXP: case INTSXP: type = 'i'; PROTECT(ans = NEW_INTEGER(ans_len)); break; case REALSXP: type = 'r'; PROTECT(ans = NEW_NUMERIC(ans_len)); break; default: error("Rle must contain either 'integer' or 'numeric' values"); } if (!IS_LOGICAL(na_rm) || LENGTH(na_rm) != 1 || LOGICAL(na_rm)[0] == NA_LOGICAL) error("'na.rm' must be TRUE or FALSE"); lengths_elt = INTEGER(lengths); max_index = LENGTH(lengths) - 1; index = 0; upper_run = *lengths_elt; for (i = 0; i < ans_len; i++) { if (i % 100000 == 99999) R_CheckUserInterrupt(); start = _get_start_elt_from_IRanges_holder(&ranges_holder, i); width = _get_width_elt_from_IRanges_holder(&ranges_holder, i); if (type == 'i') { INTEGER(ans)[i] = INT_MAX; } else if (type == 'r') { REAL(ans)[i] = R_PosInf; } if (width > 0) { while (index > 0 && upper_run > start) { upper_run -= *lengths_elt; lengths_elt--; index--; } while (upper_run < start) { lengths_elt++; index++; upper_run += *lengths_elt; } lower_run = upper_run - *lengths_elt + 1; upper_bound = start + width - 1; if (type == 'i') { while (lower_run <= upper_bound) { if (INTEGER(values)[index] == NA_INTEGER) { if (!LOGICAL(na_rm)[0]) { INTEGER(ans)[i] = NA_INTEGER; break; } } else if (INTEGER(values)[index] < INTEGER(ans)[i]) { INTEGER(ans)[i] = INTEGER(values)[index]; } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; upper_run += *lengths_elt; } else { break; } } } else if (type == 'r') { while (lower_run <= upper_bound) { if (ISNAN(REAL(values)[index])) { if (!LOGICAL(na_rm)[0]) { REAL(ans)[i] = NA_REAL; break; } } else if (REAL(values)[index] < REAL(ans)[i]) { REAL(ans)[i] = REAL(values)[index]; } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; upper_run += *lengths_elt; } else { break; } } } } } PROTECT(names = duplicate(_get_IRanges_names(ranges))); SET_NAMES(ans, names); UNPROTECT(2); return ans; } /* --- .Call ENTRY POINT --- */ SEXP C_viewMaxs_RleViews(SEXP x, SEXP na_rm) { char type = '?'; int i, start, width, ans_len, index, lower_run, upper_run, upper_bound; int max_index, *lengths_elt; SEXP ans, subject, values, lengths, ranges, names; IRanges_holder ranges_holder; subject = GET_SLOT(x, install("subject")); values = GET_SLOT(subject, install("values")); lengths = GET_SLOT(subject, install("lengths")); ranges = GET_SLOT(x, install("ranges")); ranges_holder = _hold_IRanges(ranges); ans_len = _get_length_from_IRanges_holder(&ranges_holder); ans = R_NilValue; switch (TYPEOF(values)) { case LGLSXP: case INTSXP: type = 'i'; PROTECT(ans = NEW_INTEGER(ans_len)); break; case REALSXP: type = 'r'; PROTECT(ans = NEW_NUMERIC(ans_len)); break; default: error("Rle must contain either 'integer' or 'numeric' values"); } if (!IS_LOGICAL(na_rm) || LENGTH(na_rm) != 1 || LOGICAL(na_rm)[0] == NA_LOGICAL) error("'na.rm' must be TRUE or FALSE"); lengths_elt = INTEGER(lengths); max_index = LENGTH(lengths) - 1; index = 0; upper_run = *lengths_elt; for (i = 0; i < ans_len; i++) { if (i % 100000 == 99999) R_CheckUserInterrupt(); start = _get_start_elt_from_IRanges_holder(&ranges_holder, i); width = _get_width_elt_from_IRanges_holder(&ranges_holder, i); if (type == 'i') { INTEGER(ans)[i] = R_INT_MIN; } else if (type == 'r') { REAL(ans)[i] = R_NegInf; } if (width > 0) { while (index > 0 && upper_run > start) { upper_run -= *lengths_elt; lengths_elt--; index--; } while (upper_run < start) { lengths_elt++; index++; upper_run += *lengths_elt; } lower_run = upper_run - *lengths_elt + 1; upper_bound = start + width - 1; if (type == 'i') { while (lower_run <= upper_bound) { if (INTEGER(values)[index] == NA_INTEGER) { if (!LOGICAL(na_rm)[0]) { INTEGER(ans)[i] = NA_INTEGER; break; } } else if (INTEGER(values)[index] > INTEGER(ans)[i]) { INTEGER(ans)[i] = INTEGER(values)[index]; } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; upper_run += *lengths_elt; } else { break; } } } else if (type == 'r') { while (lower_run <= upper_bound) { if (ISNAN(REAL(values)[index])) { if (!LOGICAL(na_rm)[0]) { REAL(ans)[i] = NA_REAL; break; } } else if (REAL(values)[index] > REAL(ans)[i]) { REAL(ans)[i] = REAL(values)[index]; } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; upper_run += *lengths_elt; } else { break; } } } } } PROTECT(names = duplicate(_get_IRanges_names(ranges))); SET_NAMES(ans, names); UNPROTECT(2); return ans; } /* --- .Call ENTRY POINT --- */ SEXP C_viewSums_RleViews(SEXP x, SEXP na_rm) { char type = '?'; int i, start, width, ans_len, index, lower_run, upper_run, lower_bound, upper_bound; int max_index, *lengths_elt; SEXP ans, subject, values, lengths, ranges, names; IRanges_holder ranges_holder; subject = GET_SLOT(x, install("subject")); values = GET_SLOT(subject, install("values")); lengths = GET_SLOT(subject, install("lengths")); ranges = GET_SLOT(x, install("ranges")); ranges_holder = _hold_IRanges(ranges); ans_len = _get_length_from_IRanges_holder(&ranges_holder); ans = R_NilValue; switch (TYPEOF(values)) { case LGLSXP: case INTSXP: type = 'i'; PROTECT(ans = NEW_INTEGER(ans_len)); break; case REALSXP: type = 'r'; PROTECT(ans = NEW_NUMERIC(ans_len)); break; case CPLXSXP: type = 'c'; PROTECT(ans = NEW_COMPLEX(ans_len)); break; default: error("Rle must contain either 'integer', 'numeric', or 'complex' values"); } if (!IS_LOGICAL(na_rm) || LENGTH(na_rm) != 1 || LOGICAL(na_rm)[0] == NA_LOGICAL) error("'na.rm' must be TRUE or FALSE"); lengths_elt = INTEGER(lengths); max_index = LENGTH(lengths) - 1; index = 0; upper_run = *lengths_elt; for (i = 0; i < ans_len; i++) { if (i % 100000 == 99999) R_CheckUserInterrupt(); start = _get_start_elt_from_IRanges_holder(&ranges_holder, i); width = _get_width_elt_from_IRanges_holder(&ranges_holder, i); if (type == 'i') { INTEGER(ans)[i] = 0; } else if (type == 'r') { REAL(ans)[i] = 0; } else if (type == 'c') { COMPLEX(ans)[i].r = 0; COMPLEX(ans)[i].i = 0; } if (width > 0) { while (index > 0 && upper_run > start) { upper_run -= *lengths_elt; lengths_elt--; index--; } while (upper_run < start) { lengths_elt++; index++; upper_run += *lengths_elt; } lower_run = upper_run - *lengths_elt + 1; lower_bound = start; upper_bound = start + width - 1; if (type == 'i') { while (lower_run <= upper_bound) { if (INTEGER(values)[index] == NA_INTEGER) { if (!LOGICAL(na_rm)[0]) { INTEGER(ans)[i] = NA_INTEGER; break; } } else { INTEGER(ans)[i] += INTEGER(values)[index] * (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; lower_bound = lower_run; upper_run += *lengths_elt; } else { break; } } if (INTEGER(ans)[i] != NA_INTEGER && (INTEGER(ans)[i] > INT_MAX || INTEGER(ans)[i] < R_INT_MIN)) error("Integer overflow"); } else if (type == 'r') { while (lower_run <= upper_bound) { if (ISNAN(REAL(values)[index])) { if (!LOGICAL(na_rm)[0]) { REAL(ans)[i] = NA_REAL; break; } } else { REAL(ans)[i] += REAL(values)[index] * (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; lower_bound = lower_run; upper_run += *lengths_elt; } else { break; } } } else if (type == 'c') { while (lower_run <= upper_bound) { if (ISNAN(COMPLEX(values)[index].r) || ISNAN(COMPLEX(values)[index].i)) { if (!LOGICAL(na_rm)[0]) { COMPLEX(ans)[i].r = NA_REAL; COMPLEX(ans)[i].i = NA_REAL; break; } } else { COMPLEX(ans)[i].r += COMPLEX(values)[index].r * (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); COMPLEX(ans)[i].i += COMPLEX(values)[index].i * (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; lower_bound = lower_run; upper_run += *lengths_elt; } else { break; } } } } } PROTECT(names = duplicate(_get_IRanges_names(ranges))); SET_NAMES(ans, names); UNPROTECT(2); return ans; } /* --- .Call ENTRY POINT --- */ SEXP C_viewMeans_RleViews(SEXP x, SEXP na_rm) { char type = '?'; int i, n, start, width, ans_len, index, lower_run, upper_run, lower_bound, upper_bound; int max_index, *lengths_elt; SEXP ans, subject, values, lengths, ranges, names; IRanges_holder ranges_holder; subject = GET_SLOT(x, install("subject")); values = GET_SLOT(subject, install("values")); lengths = GET_SLOT(subject, install("lengths")); ranges = GET_SLOT(x, install("ranges")); ranges_holder = _hold_IRanges(ranges); ans_len = _get_length_from_IRanges_holder(&ranges_holder); ans = R_NilValue; switch (TYPEOF(values)) { case LGLSXP: case INTSXP: type = 'i'; PROTECT(ans = NEW_NUMERIC(ans_len)); break; case REALSXP: type = 'r'; PROTECT(ans = NEW_NUMERIC(ans_len)); break; case CPLXSXP: type = 'c'; PROTECT(ans = NEW_COMPLEX(ans_len)); break; default: error("Rle must contain either 'integer', 'numeric', or 'complex' values"); } if (!IS_LOGICAL(na_rm) || LENGTH(na_rm) != 1 || LOGICAL(na_rm)[0] == NA_LOGICAL) error("'na.rm' must be TRUE or FALSE"); lengths_elt = INTEGER(lengths); max_index = LENGTH(lengths) - 1; index = 0; upper_run = *lengths_elt; for (i = 0; i < ans_len; i++) { if (i % 100000 == 99999) R_CheckUserInterrupt(); start = _get_start_elt_from_IRanges_holder(&ranges_holder, i); width = _get_width_elt_from_IRanges_holder(&ranges_holder, i); if (width <= 0) { if (type == 'i') { REAL(ans)[i] = R_NaN; } else if (type == 'r') { REAL(ans)[i] = R_NaN; } else if (type == 'c') { COMPLEX(ans)[i].r = R_NaN; COMPLEX(ans)[i].i = R_NaN; } } else { n = width; if (type == 'i') { REAL(ans)[i] = 0; } else if (type == 'r') { REAL(ans)[i] = 0; } else if (type == 'c') { COMPLEX(ans)[i].r = 0; COMPLEX(ans)[i].i = 0; } while (index > 0 && upper_run > start) { upper_run -= *lengths_elt; lengths_elt--; index--; } while (upper_run < start) { lengths_elt++; index++; upper_run += *lengths_elt; } lower_run = upper_run - *lengths_elt + 1; lower_bound = start; upper_bound = start + width - 1; if (type == 'i') { while (lower_run <= upper_bound) { if (INTEGER(values)[index] == NA_INTEGER) { if (!LOGICAL(na_rm)[0]) { REAL(ans)[i] = NA_REAL; break; } n -= (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); } else { REAL(ans)[i] += INTEGER(values)[index] * (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; lower_bound = lower_run; upper_run += *lengths_elt; } else { break; } } if (n == 0) { REAL(ans)[i] = R_NaN; } else if (REAL(ans)[i] != NA_REAL) { REAL(ans)[i] /= n; } } else if (type == 'r') { while (lower_run <= upper_bound) { if (ISNAN(REAL(values)[index])) { if (!LOGICAL(na_rm)[0]) { REAL(ans)[i] = NA_REAL; break; } n -= (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); } else { REAL(ans)[i] += REAL(values)[index] * (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; lower_bound = lower_run; upper_run += *lengths_elt; } else { break; } } if (n == 0) { REAL(ans)[i] = R_NaN; } else if (REAL(ans)[i] != NA_REAL) { REAL(ans)[i] /= n; } } else if (type == 'c') { while (lower_run <= upper_bound) { if (ISNAN(COMPLEX(values)[index].r) || ISNAN(COMPLEX(values)[index].i)) { if (!LOGICAL(na_rm)[0]) { COMPLEX(ans)[i].r = NA_REAL; COMPLEX(ans)[i].i = NA_REAL; break; } n -= (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); } else { COMPLEX(ans)[i].r += COMPLEX(values)[index].r * (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); COMPLEX(ans)[i].i += COMPLEX(values)[index].i * (1 + (upper_bound < upper_run ? upper_bound : upper_run) - (lower_bound > lower_run ? lower_bound : lower_run)); } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; lower_bound = lower_run; upper_run += *lengths_elt; } else { break; } } if (n == 0) { COMPLEX(ans)[i].r = R_NaN; COMPLEX(ans)[i].i = R_NaN; } else if (COMPLEX(ans)[i].r != NA_REAL) { COMPLEX(ans)[i].r /= n; COMPLEX(ans)[i].i /= n; } } } } PROTECT(names = duplicate(_get_IRanges_names(ranges))); SET_NAMES(ans, names); UNPROTECT(2); return ans; } /* --- .Call ENTRY POINT --- */ SEXP C_viewWhichMins_RleViews(SEXP x, SEXP na_rm) { char type = '?'; int i, start, width, ans_len, index, lower_run, upper_run, lower_bound, upper_bound; int max_index, *ans_elt, *lengths_elt; SEXP curr, ans, subject, values, lengths, ranges, names; IRanges_holder ranges_holder; subject = GET_SLOT(x, install("subject")); values = GET_SLOT(subject, install("values")); lengths = GET_SLOT(subject, install("lengths")); ranges = GET_SLOT(x, install("ranges")); ranges_holder = _hold_IRanges(ranges); ans_len = _get_length_from_IRanges_holder(&ranges_holder); curr = R_NilValue; switch (TYPEOF(values)) { case LGLSXP: case INTSXP: type = 'i'; PROTECT(curr = NEW_INTEGER(1)); break; case REALSXP: type = 'r'; PROTECT(curr = NEW_NUMERIC(1)); break; default: error("Rle must contain either 'integer' or 'numeric' values"); } if (!IS_LOGICAL(na_rm) || LENGTH(na_rm) != 1 || LOGICAL(na_rm)[0] == NA_LOGICAL) error("'na.rm' must be TRUE or FALSE"); PROTECT(ans = NEW_INTEGER(ans_len)); lengths_elt = INTEGER(lengths); max_index = LENGTH(lengths) - 1; index = 0; upper_run = *lengths_elt; for (i = 0, ans_elt = INTEGER(ans); i < ans_len; i++, ans_elt++) { if (i % 100000 == 99999) R_CheckUserInterrupt(); start = _get_start_elt_from_IRanges_holder(&ranges_holder, i); width = _get_width_elt_from_IRanges_holder(&ranges_holder, i); *ans_elt = NA_INTEGER; if (width > 0) { if (type == 'i') { INTEGER(curr)[0] = INT_MAX; } else if (type == 'r') { REAL(curr)[0] = R_PosInf; } while (index > 0 && upper_run > start) { upper_run -= *lengths_elt; lengths_elt--; index--; } while (upper_run < start) { lengths_elt++; index++; upper_run += *lengths_elt; } lower_run = upper_run - *lengths_elt + 1; lower_bound = start; upper_bound = start + width - 1; if (type == 'i') { while (lower_run <= upper_bound) { if (INTEGER(values)[index] == NA_INTEGER) { if (!LOGICAL(na_rm)[0]) { break; } } else if (INTEGER(values)[index] < INTEGER(curr)[0]) { *ans_elt = lower_bound; INTEGER(curr)[0] = INTEGER(values)[index]; } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; lower_bound = lower_run; upper_run += *lengths_elt; } else { break; } } } else if (type == 'r') { while (lower_run <= upper_bound) { if (ISNAN(REAL(values)[index])) { if (!LOGICAL(na_rm)[0]) { break; } } else if (REAL(values)[index] < REAL(curr)[0]) { *ans_elt = lower_bound; REAL(curr)[0] = REAL(values)[index]; } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; lower_bound = lower_run; upper_run += *lengths_elt; } else { break; } } } } } PROTECT(names = duplicate(_get_IRanges_names(ranges))); SET_NAMES(ans, names); UNPROTECT(3); return ans; } /* --- .Call ENTRY POINT --- */ SEXP C_viewWhichMaxs_RleViews(SEXP x, SEXP na_rm) { char type = '?'; int i, start, width, ans_len, index, lower_run, upper_run, lower_bound, upper_bound; int max_index, *ans_elt, *lengths_elt; SEXP curr, ans, subject, values, lengths, ranges, names; IRanges_holder ranges_holder; subject = GET_SLOT(x, install("subject")); values = GET_SLOT(subject, install("values")); lengths = GET_SLOT(subject, install("lengths")); ranges = GET_SLOT(x, install("ranges")); ranges_holder = _hold_IRanges(ranges); ans_len = _get_length_from_IRanges_holder(&ranges_holder); curr = R_NilValue; switch (TYPEOF(values)) { case LGLSXP: case INTSXP: type = 'i'; PROTECT(curr = NEW_INTEGER(1)); break; case REALSXP: type = 'r'; PROTECT(curr = NEW_NUMERIC(1)); break; default: error("Rle must contain either 'integer' or 'numeric' values"); } if (!IS_LOGICAL(na_rm) || LENGTH(na_rm) != 1 || LOGICAL(na_rm)[0] == NA_LOGICAL) error("'na.rm' must be TRUE or FALSE"); PROTECT(ans = NEW_INTEGER(ans_len)); lengths_elt = INTEGER(lengths); max_index = LENGTH(lengths) - 1; index = 0; upper_run = *lengths_elt; for (i = 0, ans_elt = INTEGER(ans); i < ans_len; i++, ans_elt++) { if (i % 100000 == 99999) R_CheckUserInterrupt(); start = _get_start_elt_from_IRanges_holder(&ranges_holder, i); width = _get_width_elt_from_IRanges_holder(&ranges_holder, i); *ans_elt = NA_INTEGER; if (width > 0) { if (type == 'i') { INTEGER(curr)[0] = R_INT_MIN; } else if (type == 'r') { REAL(curr)[0] = R_NegInf; } while (index > 0 && upper_run > start) { upper_run -= *lengths_elt; lengths_elt--; index--; } while (upper_run < start) { lengths_elt++; index++; upper_run += *lengths_elt; } lower_run = upper_run - *lengths_elt + 1; lower_bound = start; upper_bound = start + width - 1; if (type == 'i') { while (lower_run <= upper_bound) { if (INTEGER(values)[index] == NA_INTEGER) { if (!LOGICAL(na_rm)[0]) { break; } } else if (INTEGER(values)[index] > INTEGER(curr)[0]) { *ans_elt = lower_bound; INTEGER(curr)[0] = INTEGER(values)[index]; } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; lower_bound = lower_run; upper_run += *lengths_elt; } else { break; } } } else if (type == 'r') { while (lower_run <= upper_bound) { if (ISNAN(REAL(values)[index])) { if (!LOGICAL(na_rm)[0]) { break; } } else if (REAL(values)[index] > REAL(curr)[0]) { *ans_elt = lower_bound; REAL(curr)[0] = REAL(values)[index]; } if (index < max_index) { lengths_elt++; index++; lower_run = upper_run + 1; lower_bound = lower_run; upper_run += *lengths_elt; } else { break; } } } } } PROTECT(names = duplicate(_get_IRanges_names(ranges))); SET_NAMES(ans, names); UNPROTECT(3); return ans; } IRanges/src/S4Vectors_stubs.c0000644000175100017510000000003614626176651017146 0ustar00biocbuildbiocbuild#include "_S4Vectors_stubs.c" IRanges/src/SimpleIRangesList_class.c0000644000175100017510000000437414626176651020626 0ustar00biocbuildbiocbuild/**************************************************************************** * Low-level manipulation of SimpleIRangesList objects * ****************************************************************************/ #include "IRanges.h" #include #define R_INT_MIN (1+INT_MIN) /* * --- .Call ENTRY POINT --- */ SEXP C_isNormal_SimpleIRangesList(SEXP x, SEXP use_names) { SEXP list_ir, ans, ans_names; IRanges_holder ir_holder; int x_len, i; list_ir = GET_SLOT(x, install("listData")); x_len = LENGTH(list_ir); PROTECT(ans = NEW_LOGICAL(x_len)); for (i = 0; i < x_len; i++) { ir_holder = _hold_IRanges(VECTOR_ELT(list_ir, i)); LOGICAL(ans)[i] = _is_normal_IRanges_holder(&ir_holder); } if (LOGICAL(use_names)[0]) { PROTECT(ans_names = duplicate(GET_NAMES(list_ir))); SET_NAMES(ans, ans_names); UNPROTECT(1); } UNPROTECT(1); return ans; } /* * --- .Call ENTRY POINT --- */ SEXP C_min_SimpleNormalIRangesList(SEXP x) { SEXP list_ir, ans, ans_names; IRanges_holder ir_holder; int x_len, ir_len, i; int *ans_elt; list_ir = GET_SLOT(x, install("listData")); x_len = LENGTH(list_ir); PROTECT(ans = NEW_INTEGER(x_len)); for (i = 0, ans_elt = INTEGER(ans); i < x_len; i++, ans_elt++) { ir_holder = _hold_IRanges(VECTOR_ELT(list_ir, i)); ir_len = _get_length_from_IRanges_holder(&ir_holder); if (ir_len == 0) { *ans_elt = INT_MAX; } else { *ans_elt = _get_start_elt_from_IRanges_holder(&ir_holder, 0); } } PROTECT(ans_names = duplicate(GET_NAMES(list_ir))); SET_NAMES(ans, ans_names); UNPROTECT(2); return ans; } /* * --- .Call ENTRY POINT --- */ SEXP C_max_SimpleNormalIRangesList(SEXP x) { SEXP list_ir, ans, ans_names; IRanges_holder ir_holder; int x_len, ir_len, i; int *ans_elt; list_ir = GET_SLOT(x, install("listData")); x_len = LENGTH(list_ir); PROTECT(ans = NEW_INTEGER(x_len)); for (i = 0, ans_elt = INTEGER(ans); i < x_len; i++, ans_elt++) { ir_holder = _hold_IRanges(VECTOR_ELT(list_ir, i)); ir_len = _get_length_from_IRanges_holder(&ir_holder); if (ir_len == 0) { *ans_elt = R_INT_MIN; } else { *ans_elt = _get_end_elt_from_IRanges_holder(&ir_holder, ir_len - 1); } } PROTECT(ans_names = duplicate(GET_NAMES(list_ir))); SET_NAMES(ans, ans_names); UNPROTECT(2); return ans; } IRanges/src/coverage_methods.c0000644000175100017510000006401714626176651017421 0ustar00biocbuildbiocbuild/**************************************************************************** * * * Weighted coverage of a set of integer ranges * * -------------------------------------------- * * * * Authors: H. Pag\`es and P. Aboyoun * * Code for "sort" method based on timing enhancements * * by Charles C. Berry * * * ****************************************************************************/ #include "IRanges.h" #include "S4Vectors_interface.h" #include /* for qsort() */ #include /* for R_CheckUserInterrupt() */ static const char *x_label, *shift_label, *width_label, *weight_label; static void check_recycling_was_round(int last_pos_in_current, int current_len, const char *current_label, const char *target_label) { if (current_len >= 2 && last_pos_in_current < current_len) warning("'%s' length is not a divisor of '%s' length", current_label, target_label); return; } /**************************************************************************** * "sort" method * ****************************************************************************/ /**************************************************************************** * Basic manipulation of the SEids buffer (Start/End ids). */ #define SEid_TO_1BASED_INDEX(SEid) ((SEid) >= 0 ? (SEid) : -(SEid)) #define SEid_IS_END(SEid) ((SEid) >= 0) static const int *base_start; static const int *base_width; static int compar_SEids_for_asc_order(const void *p1, const void *p2) { int SEid1, SEid2, index1, index2, s1, s2; SEid1 = *((const int *) p1); SEid2 = *((const int *) p2); index1 = SEid_TO_1BASED_INDEX(SEid1); index2 = SEid_TO_1BASED_INDEX(SEid2); /* If SEid is a Start id, then s = start If SEid is an End id, then s = end + 1 */ s1 = base_start[index1]; if (SEid_IS_END(SEid1)) s1 += base_width[index1]; s2 = base_start[index2]; if (SEid_IS_END(SEid2)) s2 += base_width[index2]; return s1 - s2; } /* Initialize the SEids buffer (integer weights). */ static int init_SEids_int_weight(int *SEids, const int *x_width, int x_len, const int *weight, int weight_len) { int SEids_len, i, j, index; SEids_len = 0; for (i = j = 0, index = 1; i < x_len; i++, j++, index++) { if (j >= weight_len) j = 0; /* recycle j */ if (x_width[i] == 0 || weight[j] == 0) continue; *(SEids++) = index; /* Start id */ *(SEids++) = - index; /* End id */ SEids_len += 2; } check_recycling_was_round(j, weight_len, weight_label, x_label); return SEids_len; } /* Initialize the SEids buffer (numeric weights). */ static int init_SEids_double_weight(int *SEids, const int *x_width, int x_len, const double *weight, int weight_len) { int SEids_len, i, j, index; SEids_len = 0; for (i = j = 0, index = 1; i < x_len; i++, j++, index++) { if (j >= weight_len) j = 0; /* recycle j */ if (x_width[i] == 0 || weight[j] == 0.0) continue; *(SEids++) = index; /* Start id */ *(SEids++) = - index; /* End id */ SEids_len += 2; } check_recycling_was_round(j, weight_len, weight_label, x_label); return SEids_len; } /* Sort the SEids buffer. */ static void sort_SEids(int *SEids, int SEids_len, const int *x_start, const int *x_width) { base_start = x_start - 1; base_width = x_width - 1; qsort(SEids, SEids_len, sizeof(int), compar_SEids_for_asc_order); return; } /**************************************************************************** * int_coverage_sort(), double_coverage_sort() */ /* 'values_buf' and 'lengths_buf' must have a length >= SEids_len + 1 */ static void compute_int_coverage_in_bufs(const int *SEids, int SEids_len, const int *x_start, const int *x_width, const int *weight, int weight_len, int cvg_len, int *values_buf, int *lengths_buf) { int curr_val, curr_weight, curr_pos, i, prev_pos, index; *(values_buf++) = curr_val = 0; curr_pos = 1; reset_ovflow_flag(); /* we use safe_int_add() in loop below */ for (i = 0; i < SEids_len; i++, SEids++) { if (i % 500000 == 499999) R_CheckUserInterrupt(); prev_pos = curr_pos; index = SEid_TO_1BASED_INDEX(*SEids) - 1; curr_pos = x_start[index]; curr_weight = weight[index % weight_len]; if (SEid_IS_END(*SEids)) { curr_weight = - curr_weight; curr_pos += x_width[index]; } curr_val = safe_int_add(curr_val, curr_weight); *(values_buf++) = curr_val; *(lengths_buf++) = curr_pos - prev_pos; } if (get_ovflow_flag()) warning("NAs produced by integer overflow"); *lengths_buf = cvg_len + 1 - curr_pos; return; } static void compute_double_coverage_in_bufs(const int *SEids, int SEids_len, const int *x_start, const int *x_width, const double *weight, int weight_len, int cvg_len, double *values_buf, int *lengths_buf) { double curr_val, curr_weight; int curr_pos, i, prev_pos, index; *(values_buf++) = curr_val = 0.0; curr_pos = 1; for (i = 0; i < SEids_len; i++, SEids++) { if (i % 500000 == 499999) R_CheckUserInterrupt(); prev_pos = curr_pos; index = SEid_TO_1BASED_INDEX(*SEids) - 1; curr_pos = x_start[index]; curr_weight = weight[index % weight_len]; if (SEid_IS_END(*SEids)) { curr_weight = - curr_weight; curr_pos += x_width[index]; } curr_val += curr_weight; *(values_buf++) = curr_val; *(lengths_buf++) = curr_pos - prev_pos; } *lengths_buf = cvg_len + 1 - curr_pos; return; } static SEXP int_coverage_sort(const int *x_start, const int *x_width, int x_len, const int *weight, int weight_len, int cvg_len) { int *SEids, SEids_len, zero, buf_len, *values_buf, *lengths_buf; SEids = (int *) R_alloc((long) 2 * x_len, sizeof(int)); SEids_len = init_SEids_int_weight(SEids, x_width, x_len, weight, weight_len); if (SEids_len == 0) { //return an Rle with one run of 0's zero = 0; return construct_integer_Rle(1, &zero, &cvg_len, 0); } sort_SEids(SEids, SEids_len, x_start, x_width); buf_len = SEids_len + 1; values_buf = (int *) R_alloc((long) buf_len, sizeof(int)); lengths_buf = (int *) R_alloc((long) buf_len, sizeof(int)); compute_int_coverage_in_bufs(SEids, SEids_len, x_start, x_width, weight, weight_len, cvg_len, values_buf, lengths_buf); return construct_integer_Rle(buf_len, values_buf, lengths_buf, 0); } static SEXP double_coverage_sort(const int *x_start, const int *x_width, int x_len, const double *weight, int weight_len, int cvg_len) { int *SEids, SEids_len, buf_len, *lengths_buf; double zero, *values_buf; SEids = (int *) R_alloc((long) 2 * x_len, sizeof(int)); SEids_len = init_SEids_double_weight(SEids, x_width, x_len, weight, weight_len); if (SEids_len == 0) { //return an Rle with one run of 0's zero = 0.0; return construct_numeric_Rle(1, &zero, &cvg_len, 0); } sort_SEids(SEids, SEids_len, x_start, x_width); buf_len = SEids_len + 1; values_buf = (double *) R_alloc((long) buf_len, sizeof(double)); lengths_buf = (int *) R_alloc((long) buf_len, sizeof(int)); compute_double_coverage_in_bufs(SEids, SEids_len, x_start, x_width, weight, weight_len, cvg_len, values_buf, lengths_buf); return construct_numeric_Rle(buf_len, values_buf, lengths_buf, 0); } static SEXP coverage_sort(const int *x_start, const int *x_width, int x_len, SEXP weight, int cvg_len) { int weight_len; weight_len = LENGTH(weight); return IS_INTEGER(weight) ? int_coverage_sort(x_start, x_width, x_len, INTEGER(weight), weight_len, cvg_len) : double_coverage_sort(x_start, x_width, x_len, REAL(weight), weight_len, cvg_len); } /**************************************************************************** * "hash" method * ****************************************************************************/ static SEXP int_coverage_hash( const int *x_start, const int *x_width, int x_len, const int *weight, int weight_len, int cvg_len) { int *cvg_buf, w, *cvg_p, cumsum, i, j; cvg_buf = (int *) R_alloc((long) cvg_len + 1, sizeof(int)); memset(cvg_buf, 0, cvg_len * sizeof(int)); reset_ovflow_flag(); /* we use safe_int_add() in loop below */ for (i = j = 0; i < x_len; i++, j++, x_start++, x_width++) { if (i % 500000 == 499999) R_CheckUserInterrupt(); if (j >= weight_len) j = 0; /* recycle j */ w = weight[j]; cvg_p = cvg_buf + *x_start - 1; *cvg_p = safe_int_add(*cvg_p, w); cvg_p += *x_width; *cvg_p = safe_int_add(*cvg_p, - w); } check_recycling_was_round(j, weight_len, weight_label, x_label); cumsum = 0; for (i = 0, cvg_p = cvg_buf; i < cvg_len; i++, cvg_p++) { cumsum = safe_int_add(*cvg_p, cumsum); *cvg_p = cumsum; } if (get_ovflow_flag()) warning("NAs produced by integer overflow"); return construct_integer_Rle(cvg_len, cvg_buf, NULL, 0); } static SEXP double_coverage_hash( const int *x_start, const int *x_width, int x_len, const double *weight, int weight_len, int cvg_len) { double *cvg_buf, w, *cvg_p, cumsum; int i, j; cvg_buf = (double *) R_alloc((long) cvg_len + 1, sizeof(double)); for (i = 0, cvg_p = cvg_buf; i < cvg_len; i++, cvg_p++) *cvg_p = 0.0; for (i = j = 0; i < x_len; i++, j++, x_start++, x_width++) { if (i % 500000 == 499999) R_CheckUserInterrupt(); if (j >= weight_len) j = 0; /* recycle j */ w = weight[j]; cvg_p = cvg_buf + *x_start - 1; *cvg_p += w; cvg_p += *x_width; *cvg_p -= w; } check_recycling_was_round(j, weight_len, weight_label, x_label); cumsum = 0.0; for (i = 0, cvg_p = cvg_buf; i < cvg_len; i++, cvg_p++) { cumsum += *cvg_p; *cvg_p = cumsum; } return construct_numeric_Rle(cvg_len, cvg_buf, NULL, 0); } static SEXP coverage_hash(const int *x_start, const int *x_width, int x_len, SEXP weight, int cvg_len) { int weight_len; weight_len = LENGTH(weight); return IS_INTEGER(weight) ? int_coverage_hash(x_start, x_width, x_len, INTEGER(weight), weight_len, cvg_len) : double_coverage_hash(x_start, x_width, x_len, REAL(weight), weight_len, cvg_len); } /**************************************************************************** * "naive" method * ****************************************************************************/ static SEXP int_coverage_naive( const int *x_start, const int *x_width, int x_len, const int *weight, int weight_len, int cvg_len) { int *cvg_buf, w, *cvg_p, i, j, k; cvg_buf = (int *) R_alloc((long) cvg_len + 1, sizeof(int)); memset(cvg_buf, 0, cvg_len * sizeof(int)); reset_ovflow_flag(); /* we use safe_int_add() in loop below */ for (i = j = 0; i < x_len; i++, j++, x_start++, x_width++) { if (i % 500000 == 499999) R_CheckUserInterrupt(); if (j >= weight_len) j = 0; /* recycle j */ w = weight[j]; for (k = 0, cvg_p = cvg_buf + *x_start - 1; k < *x_width; k++, cvg_p++) { *cvg_p = safe_int_add(*cvg_p, w); } } check_recycling_was_round(j, weight_len, weight_label, x_label); if (get_ovflow_flag()) warning("NAs produced by integer overflow"); return construct_integer_Rle(cvg_len, cvg_buf, NULL, 0); } static SEXP double_coverage_naive( const int *x_start, const int *x_width, int x_len, const double *weight, int weight_len, int cvg_len) { double *cvg_buf, w, *cvg_p; int i, j, k; cvg_buf = (double *) R_alloc((long) cvg_len + 1, sizeof(double)); for (i = 0, cvg_p = cvg_buf; i < cvg_len; i++, cvg_p++) *cvg_p = 0.0; for (i = j = 0; i < x_len; i++, j++, x_start++, x_width++) { if (i % 500000 == 499999) R_CheckUserInterrupt(); if (j >= weight_len) j = 0; /* recycle j */ w = weight[j]; for (k = 0, cvg_p = cvg_buf + *x_start - 1; k < *x_width; k++, cvg_p++) { *cvg_p += w; } } check_recycling_was_round(j, weight_len, weight_label, x_label); return construct_numeric_Rle(cvg_len, cvg_buf, NULL, 0); } static SEXP coverage_naive(const int *x_start, const int *x_width, int x_len, SEXP weight, int cvg_len) { int weight_len; weight_len = LENGTH(weight); return IS_INTEGER(weight) ? int_coverage_naive(x_start, x_width, x_len, INTEGER(weight), weight_len, cvg_len) : double_coverage_naive(x_start, x_width, x_len, REAL(weight), weight_len, cvg_len); } /**************************************************************************** * Helper functions for checking args of type SEXP. * * They either pass (and return nothing) or raise an error with an * * informative message. * ****************************************************************************/ static void check_arg_is_integer(SEXP arg, const char *arg_label) { if (!IS_INTEGER(arg)) error("'%s' must be an integer vector", arg_label); return; } static void check_arg_is_numeric(SEXP arg, const char *arg_label) { if (!(IS_INTEGER(arg) || IS_NUMERIC(arg))) error("'%s' must be an integer or numeric vector", arg_label); return; } static void check_arg_is_list(SEXP arg, const char *arg_label) { if (!isVectorList(arg)) error("'%s' must be a list", arg_label); return; } /* * Check that 'arg_len' is equal to 'x_len', or that it's the length of an * argument that can be recycled to the length of 'x'. * Assumes that 'arg_len' and 'x_len' are >= 0. */ static void check_arg_is_recyclable(int arg_len, int x_len, const char *arg_label, const char *x_label) { if (arg_len < x_len) { if (arg_len == 0) error("cannot recycle zero-length '%s' " "to the length of '%s'", arg_label, x_label); } else if (arg_len > x_len) { if (arg_len >= 2) error("'%s' is longer than '%s'", arg_label, x_label); } return; } /**************************************************************************** * compute_coverage_from_IRanges_holder() * ****************************************************************************/ /* * This is probably overly cautious. Could be that the cast from double to int * with (int) already does exactly this (i.e. produces an NA_INTEGER for all * the cases explicitely handled here) and is portable. */ static int double2int(double x) { if (x == R_PosInf || x == R_NegInf || ISNAN(x) /* NA or NaN */ || x >= (double) INT_MAX + 1.00 || x <= (double) INT_MIN) return NA_INTEGER; return (int) x; } /* * Args: * x_holder: An IRanges_holder struct holding the input ranges, those * ranges being those of a fictive IRanges object 'x'. * shift: A numeric (integer or double) vector parallel to 'x' (will * get recycled if necessary) with no NAs. * width: A single integer. NA or >= 0. * circle_len: A single integer. NA or > 0. * After the input ranges are shifted: * - If 'width' is a non-negative integer, then the ranges are clipped with * respect to the [1, width] interval and the function returns 'width'. * - If 'width' is NA, then the ranges are clipped with respect to the * [1, +inf) interval (i.e. they're only clipped on the left) and the * function returns 'max(end(x))' or 0 if 'x' is empty. * The shifted and clipped ranges are returned in 'out_ranges'. * Let's call 'cvg_len' the value returned by the function. If the output * ranges are in a tiling configuration with respect to the [1, cvg_len] * interval (i.e. they're non-overlapping, ordered from left to right, and * they fully cover the interval), then '*out_ranges_are_tiles' is set to 1. * Otherwise, it's set to 0. */ static int shift_and_clip_ranges(const IRanges_holder *x_holder, SEXP shift, int width, int circle_len, IntPairAE *out_ranges, int *out_ranges_are_tiles) { int x_len, shift_len, cvg_len, auto_cvg_len, prev_end, i, j, x_start, x_end, shift_elt, tmp; x_len = _get_length_from_IRanges_holder(x_holder); /* Check 'shift'. */ check_arg_is_numeric(shift, shift_label); shift_len = LENGTH(shift); check_arg_is_recyclable(shift_len, x_len, shift_label, x_label); /* Infer 'cvg_len' from 'width' and 'circle_len'. */ *out_ranges_are_tiles = 1; if (width == NA_INTEGER) { auto_cvg_len = 1; } else if (width < 0) { error("'%s' cannot be negative", width_label); } else if (width == 0) { return width; } else if (circle_len == NA_INTEGER) { auto_cvg_len = 0; } else if (circle_len <= 0) { error("length of underlying circular sequence is <= 0"); } else if (width > circle_len) { error("'%s' cannot be greater than length of " "underlying circular sequence", width_label); } else { auto_cvg_len = 1; } cvg_len = auto_cvg_len ? 0 : width; if (x_len == 0) { if (cvg_len != 0) *out_ranges_are_tiles = 0; return cvg_len; } IntPairAE_set_nelt(out_ranges, 0); prev_end = 0; for (i = j = 0; i < x_len; i++, j++) { if (j >= shift_len) j = 0; /* recycle j */ x_start = _get_start_elt_from_IRanges_holder(x_holder, i); x_end = _get_end_elt_from_IRanges_holder(x_holder, i); if (IS_INTEGER(shift)) { shift_elt = INTEGER(shift)[j]; if (shift_elt == NA_INTEGER) error("'%s' contains NAs", shift_label); } else { shift_elt = double2int(REAL(shift)[j]); if (shift_elt == NA_INTEGER) error("'%s' contains NAs, NaNs, or numbers " "that cannot be turned into integers", shift_label); } /* Risk of integer overflow! */ x_start += shift_elt; x_end += shift_elt; if (circle_len != NA_INTEGER) { tmp = x_start % circle_len; if (tmp <= 0) tmp += circle_len; x_end += tmp - x_start; x_start = tmp; } if (x_end < 0) { x_end = 0; } else if (x_end > cvg_len) { if (auto_cvg_len) cvg_len = x_end; else x_end = cvg_len; } if (x_start < 1) x_start = 1; else if (x_start > (tmp = cvg_len + 1)) x_start = tmp; if (*out_ranges_are_tiles) { if (x_start == prev_end + 1) prev_end = x_end; else *out_ranges_are_tiles = 0; } IntPairAE_insert_at(out_ranges, i, x_start, x_end - x_start + 1); } check_recycling_was_round(j, shift_len, shift_label, x_label); if (*out_ranges_are_tiles && x_end != cvg_len) *out_ranges_are_tiles = 0; return cvg_len; } /* * Args: * x_holder: An IRanges_holder struct holding the input ranges, those * ranges being those of a fictive IRanges object 'x'. * shift: A numeric (integer or double) vector parallel to 'x' (will * get recycled if necessary) with no NAs. * width: A single integer. NA or >= 0. * weight: A numeric (integer or double) vector parallel to 'x' (will * get recycled if necessary). * circle_len: A single integer. NA or > 0. * method: Either "auto", "sort", "hash", or "naive". * Returns an Rle object. */ static SEXP compute_coverage_from_IRanges_holder( const IRanges_holder *x_holder, SEXP shift, int width, SEXP weight, int circle_len, SEXP method, IntPairAE *ranges_buf) { int x_len, cvg_len, out_ranges_are_tiles, weight_len, effective_method, take_short_path; const int *x_start, *x_width; const char *method0; SEXP ans; x_len = _get_length_from_IRanges_holder(x_holder); cvg_len = shift_and_clip_ranges(x_holder, shift, width, circle_len, ranges_buf, &out_ranges_are_tiles); x_start = ranges_buf->a->elts; x_width = ranges_buf->b->elts; /* Check 'weight'. */ check_arg_is_numeric(weight, weight_label); weight_len = LENGTH(weight); check_arg_is_recyclable(weight_len, x_len, weight_label, x_label); /* Infer 'effective_method' from 'method' and 'cvg_len'. */ if (!IS_CHARACTER(method) || LENGTH(method) != 1) error("'method' must be a single string"); method = STRING_ELT(method, 0); if (method == NA_STRING) error("'method' cannot be NA"); method0 = CHAR(method); if (strcmp(method0, "auto") == 0) { /* Based on empirical observation. */ effective_method = x_len <= 0.25 * cvg_len ? 1 : 2; } else if (strcmp(method0, "sort") == 0) { effective_method = 1; } else if (strcmp(method0, "hash") == 0) { effective_method = 2; } else if (strcmp(method0, "naive") == 0) { effective_method = 3; } else { error("'method' must be \"auto\", \"sort\", \"hash\", " "or \"naive\""); } //Rprintf("out_ranges_are_tiles = %d\n", out_ranges_are_tiles); //Rprintf("x_len = %d\n", x_len); //Rprintf("cvg_len = %d\n", cvg_len); if (out_ranges_are_tiles) { if (cvg_len == 0) { take_short_path = 1; x_len = 0; } else if (weight_len == 1) { take_short_path = 1; x_len = 1; x_width = &cvg_len; } else if (weight_len == x_len) { take_short_path = 1; } else { take_short_path = 0; } if (take_short_path) { /* Short path for the tiling case. */ //Rprintf("taking short path\n"); return IS_INTEGER(weight) ? construct_integer_Rle(x_len, INTEGER(weight), x_width, 0) : construct_numeric_Rle(x_len, REAL(weight), x_width, 0); } } //Rprintf("taking normal path\n"); switch (effective_method) { case 1: ans = coverage_sort(x_start, x_width, x_len, weight, cvg_len); break; case 2: ans = coverage_hash(x_start, x_width, x_len, weight, cvg_len); break; default: ans = coverage_naive(x_start, x_width, x_len, weight, cvg_len); break; } return ans; } /* --- .Call ENTRY POINT --- * Args: * x: An IRanges object. * shift: A numeric (integer or double) vector parallel to 'x' (will * get recycled if necessary) with no NAs. * width: A single integer. NA or >= 0. * weight: A numeric (integer or double) vector parallel to 'x' (will * get recycled if necessary). * circle_len: A single integer. NA or > 0. * method: Either "auto", "sort", "hash", or "naive". * Returns an Rle object. */ SEXP C_coverage_IRanges(SEXP x, SEXP shift, SEXP width, SEXP weight, SEXP circle_len, SEXP method) { IRanges_holder x_holder; int x_len; IntPairAE *ranges_buf; x_holder = _hold_IRanges(x); x_len = _get_length_from_IRanges_holder(&x_holder); /* Check 'width'. */ check_arg_is_integer(width, "width"); if (LENGTH(width) != 1) error("'%s' must be a single integer", "width"); /* Check 'circle_len'. */ check_arg_is_integer(circle_len, "circle.length"); if (LENGTH(circle_len) != 1) error("'%s' must be a single integer", "circle.length"); ranges_buf = new_IntPairAE(x_len, 0); x_label = "x"; shift_label = "shift"; width_label = "width"; weight_label = "weight"; return compute_coverage_from_IRanges_holder(&x_holder, shift, INTEGER(width)[0], weight, INTEGER(circle_len)[0], method, ranges_buf); } /* --- .Call ENTRY POINT --- * Args: * x: A CompressedIRangesList object of length N. * shift: A list of length N (will get recycled if necessary). After * recycling, each list element must be a numeric (integer or * double) vector parallel to x[[i]] that will itself get * recycled if necessary, and with no NAs. * width: An integer vector of length N (will get recycled if * necessary). Values must be NAs or >= 0. * or a single non-negative number. * weight: A list of length N (will get recycled if necessary). After * recycling, each list element must be a numeric (integer or * double) vector parallel to x[[i]] that will itself get * recycled if necessary. * circle_lens: An integer vector of length N (will get recycled if * necessary). Values must be NAs or > 0. * method: Either "auto", "sort", "hash", or "naive". * Returns a list of N RleList objects. */ SEXP C_coverage_CompressedIRangesList(SEXP x, SEXP shift, SEXP width, SEXP weight, SEXP circle_lens, SEXP method) { CompressedIRangesList_holder x_holder; int x_len, shift_len, width_len, weight_len, circle_lens_len, i, j, k, l, m; IntPairAE *ranges_buf; SEXP ans, ans_elt, shift_elt, weight_elt; IRanges_holder x_elt_holder; char x_label_buf[40], shift_label_buf[40], width_label_buf[40], weight_label_buf[40]; x_holder = _hold_CompressedIRangesList(x); x_len = _get_length_from_CompressedIRangesList_holder(&x_holder); /* Check 'shift'. */ check_arg_is_list(shift, "shift"); shift_len = LENGTH(shift); check_arg_is_recyclable(shift_len, x_len, "shift", "x"); /* Check 'width'. */ check_arg_is_integer(width, "width"); width_len = LENGTH(width); check_arg_is_recyclable(width_len, x_len, "width", "x"); /* Check 'weight'. */ check_arg_is_list(weight, "weight"); weight_len = LENGTH(weight); check_arg_is_recyclable(weight_len, x_len, "weight", "x"); /* Check 'circle_lens'. */ check_arg_is_integer(circle_lens, "circle.length"); circle_lens_len = LENGTH(circle_lens); check_arg_is_recyclable(circle_lens_len, x_len, "circle.length", "x"); ranges_buf = new_IntPairAE(0, 0); x_label = x_label_buf; shift_label = shift_label_buf; width_label = width_label_buf; weight_label = weight_label_buf; PROTECT(ans = NEW_LIST(x_len)); for (i = j = k = l = m = 0; i < x_len; i++, j++, k++, l++, m++) { if (j >= shift_len) j = 0; /* recycle j */ if (k >= width_len) k = 0; /* recycle k */ if (l >= weight_len) l = 0; /* recycle l */ if (m >= circle_lens_len) m = 0; /* recycle m */ snprintf(x_label_buf, sizeof(x_label_buf), "x[[%d]]", i + 1); snprintf(shift_label_buf, sizeof(shift_label_buf), "shift[[%d]]", j + 1); snprintf(width_label_buf, sizeof(width_label_buf), "width[%d]", k + 1); snprintf(weight_label_buf, sizeof(weight_label_buf), "weight[[%d]]", l + 1); x_elt_holder = _get_elt_from_CompressedIRangesList_holder( &x_holder, i); shift_elt = VECTOR_ELT(shift, j); weight_elt = VECTOR_ELT(weight, l); PROTECT(ans_elt = compute_coverage_from_IRanges_holder( &x_elt_holder, shift_elt, INTEGER(width)[k], weight_elt, INTEGER(circle_lens)[m], method, ranges_buf)); SET_VECTOR_ELT(ans, i, ans_elt); UNPROTECT(1); } check_recycling_was_round(j, shift_len, "shift", "x"); check_recycling_was_round(k, width_len, "width", "x"); check_recycling_was_round(l, weight_len, "weight", "x"); check_recycling_was_round(m, circle_lens_len, "circle.length", "x"); UNPROTECT(1); return ans; } IRanges/src/extractListFragments.c0000644000175100017510000000707114626176651020255 0ustar00biocbuildbiocbuild/**************************************************************************** * A Nested Containment List implementation * * Author: H. Pag\`es * ****************************************************************************/ #include "IRanges.h" #include "S4Vectors_interface.h" /* * --- .Call ENTRY POINT --- * Args: * q_end, s_end: Integer vectors containing the breakpoints of 2 * compatible Partitioning objects. * with_split_partitions: TRUE or FALSE. * Find the overlaps between 2 Partitioning objects in linear time. Note that, * more generally speaking, the overlaps between 2 IntegerRanges derivatives * that are both disjoint and sorted can be found in linear time. However, the * algorithm implemented below is only for Partitioning objects (which are * a particular type of such objects). Also note that, although findOverlaps() * could be used for this, it isn't as efficient as the algorithm below * because of the cost of building a Nested Containment List object and using * a binary search on it. * If 'with_split_partitions' is FALSE, return a list of 2 sorted integer * vectors of the same length, the 1st one for the query hits and the 2nd one * for the subject hits. If 'with_split_partitions' is TRUE, a 3rd list * element that is also a sorted integer vector parallel to the first 2 * vectors is added. This vector contains the breakpoints of the Partitioning * object obtained by splitting the query by the subject (or vice-versa, this * split is commutative). */ SEXP C_find_partition_overlaps(SEXP q_end, SEXP s_end, SEXP with_split_partitions) { int q_len, s_len, q_prev_end, s_prev_end, i, j; IntPairAE *hits_buf; IntAE *split_partitions_buf; const int *q_end_p, *s_end_p; SEXP ans, ans_elt; q_len = LENGTH(q_end); s_len = LENGTH(s_end); hits_buf = new_IntPairAE(0, 0); if (LOGICAL(with_split_partitions)[0]) split_partitions_buf = new_IntAE(0, 0, 0); q_end_p = INTEGER(q_end); s_end_p = INTEGER(s_end); q_prev_end = s_prev_end = 0; i = j = 1; while (i <= q_len && j <= s_len) { if (q_prev_end == s_prev_end) { if (*q_end_p == q_prev_end) { i++; q_end_p++; continue; } if (*s_end_p == s_prev_end) { j++; s_end_p++; continue; } } IntPairAE_insert_at(hits_buf, IntPairAE_get_nelt(hits_buf), i, j); if (*q_end_p < *s_end_p) { q_prev_end = *q_end_p; if (LOGICAL(with_split_partitions)[0]) IntAE_insert_at(split_partitions_buf, IntAE_get_nelt(split_partitions_buf), q_prev_end); i++; q_end_p++; continue; } if (*s_end_p < *q_end_p) { s_prev_end = *s_end_p; if (LOGICAL(with_split_partitions)[0]) IntAE_insert_at(split_partitions_buf, IntAE_get_nelt(split_partitions_buf), s_prev_end); j++; s_end_p++; continue; } q_prev_end = *q_end_p; if (LOGICAL(with_split_partitions)[0]) IntAE_insert_at(split_partitions_buf, IntAE_get_nelt(split_partitions_buf), q_prev_end); i++; q_end_p++; s_prev_end = *s_end_p; j++; s_end_p++; } ans = PROTECT(NEW_LIST(LOGICAL(with_split_partitions)[0] ? 3 : 2)); ans_elt = PROTECT(new_INTEGER_from_IntAE(hits_buf->a)); SET_VECTOR_ELT(ans, 0, ans_elt); UNPROTECT(1); ans_elt = PROTECT(new_INTEGER_from_IntAE(hits_buf->b)); SET_VECTOR_ELT(ans, 1, ans_elt); UNPROTECT(1); if (LOGICAL(with_split_partitions)[0]) { ans_elt = PROTECT(new_INTEGER_from_IntAE(split_partitions_buf)); SET_VECTOR_ELT(ans, 2, ans_elt); UNPROTECT(1); } UNPROTECT(1); return ans; } IRanges/src/inter_range_methods.c0000644000175100017510000003415614626176651020124 0ustar00biocbuildbiocbuild/**************************************************************************** * Fast inter-range methods * * Author: H. Pag\`es * ****************************************************************************/ #include "IRanges.h" #include "S4Vectors_interface.h" #include #define R_INT_MIN (1+INT_MIN) /**************************************************************************** * Low-level helper functions. */ static int get_maxNROWS_from_CompressedIRangesList_holder( const CompressedIRangesList_holder *x_holder) { int x_len, ir_len_max, i, ir_len; x_len = _get_length_from_CompressedIRangesList_holder(x_holder); ir_len_max = 0; for (i = 0; i < x_len; i++) { ir_len = _get_eltNROWS_from_CompressedIRangesList_holder( x_holder, i); if (ir_len > ir_len_max) ir_len_max = ir_len; } return ir_len_max; } static int append_IRanges_holder_to_IntPairAE(IntPairAE *intpair_ae, const IRanges_holder *ir_holder) { int ir_len, j, start, width; ir_len = _get_length_from_IRanges_holder(ir_holder); for (j = 0; j < ir_len; j++) { start = _get_start_elt_from_IRanges_holder(ir_holder, j); width = _get_width_elt_from_IRanges_holder(ir_holder, j); IntPairAE_insert_at(intpair_ae, IntPairAE_get_nelt(intpair_ae), start, width); } return ir_len; } /**************************************************************************** * range() method for IRanges objects */ /* --- .Call ENTRY POINT --- */ SEXP C_range_IRanges(SEXP x) { int x_len, min, max, i, end; const int *start_p, *width_p; SEXP ans, ans_start, ans_width; x_len = _get_IRanges_length(x); if (x_len == 0) { PROTECT(ans_start = NEW_INTEGER(0)); PROTECT(ans_width = NEW_INTEGER(0)); PROTECT(ans = _new_IRanges("IRanges", ans_start, ans_width, R_NilValue)); UNPROTECT(3); return ans; } start_p = INTEGER(_get_IRanges_start(x)); width_p = INTEGER(_get_IRanges_width(x)); min = *(start_p++); max = min + *(width_p++) - 1; for (i = 1; i < x_len; i++, start_p++, width_p++) { if (*start_p < min) min = *start_p; end = *start_p + *width_p - 1; if (end > max) max = end; } PROTECT(ans_start = ScalarInteger(min)); PROTECT(ans_width = ScalarInteger(max - min + 1)); PROTECT(ans = _new_IRanges("IRanges", ans_start, ans_width, R_NilValue)); UNPROTECT(3); return ans; } /**************************************************************************** * reduce() methods for IntegerRanges and CompressedIRangesList objects */ /* WARNING: The reduced ranges are *appended* to 'out_ranges'! Returns the number of ranges that were appended. */ static int reduce_ranges(const int *x_start, const int *x_width, int x_len, int drop_empty_ranges, int min_gapwidth, int *order_buf, IntPairAE *out_ranges, IntAEAE *revmap, int *out_inframe_start) { int out_len, out_len0, i, j, start_j, width_j, end_j, append_or_drop, max_end, gapwidth, delta, width_inc; IntAE *tmp, *revmap_elt; if (min_gapwidth < 0) error("IRanges internal error in reduce_ranges(): " "negative min_gapwidth not supported"); get_order_of_int_pairs(x_start, x_width, x_len, 0, 0, order_buf, 0); out_len = out_len0 = IntPairAE_get_nelt(out_ranges); for (i = 0; i < x_len; i++) { j = order_buf[i]; start_j = x_start[j]; width_j = x_width[j]; end_j = start_j + width_j - 1; if (i == 0) { /* 'append_or_drop' is a toggle that indicates how the current input range should be added to 'out_ranges': 1 for appended (or dropped), 0 for merged. */ append_or_drop = 1; max_end = end_j; delta = start_j - 1; } else { /* If 'i' != 0 and 'append_or_drop' is 1 then the previous range was empty so 'gapwidth' will be >= 0. */ gapwidth = start_j - max_end - 1; if (gapwidth >= min_gapwidth) append_or_drop = 1; } if (append_or_drop) { if (width_j != 0 || (!drop_empty_ranges && (out_len == out_len0 || start_j != out_ranges->a->elts[ out_len - 1]))) { /* Append to 'out_ranges'. */ IntPairAE_insert_at(out_ranges, out_len, start_j, width_j); if (revmap != NULL) { /* Append to 'revmap'. */ tmp = new_IntAE(1, 1, j + 1); IntAEAE_insert_at(revmap, out_len, tmp); revmap_elt = revmap->elts[out_len]; } out_len++; append_or_drop = 0; } max_end = end_j; if (i != 0) delta += gapwidth; } else { width_inc = end_j - max_end; if (width_inc > 0) { /* Merge with last range in 'out_ranges'. */ out_ranges->b->elts[out_len - 1] += width_inc; max_end = end_j; } if (!(width_j == 0 && drop_empty_ranges) && revmap != NULL) { /* Append to 'revmap'. */ IntAE_insert_at(revmap_elt, IntAE_get_nelt(revmap_elt), j + 1); } } if (out_inframe_start != NULL) out_inframe_start[j] = start_j - delta; } return out_len - out_len0; } /* --- .Call ENTRY POINT --- */ SEXP C_reduce_IntegerRanges(SEXP x_start, SEXP x_width, SEXP drop_empty_ranges, SEXP min_gapwidth, SEXP with_revmap, SEXP with_inframe_start) { int x_len, *inframe_start; const int *x_start_p, *x_width_p; SEXP ans, ans_names, ans_revmap, ans_inframe_start; IntPairAE *out_ranges; IntAE *order_buf; IntAEAE *revmap; x_len = check_integer_pairs(x_start, x_width, &x_start_p, &x_width_p, "start(x)", "width(x)"); if (LOGICAL(with_revmap)[0]) { revmap = new_IntAEAE(0, 0); } else { revmap = NULL; } if (LOGICAL(with_inframe_start)[0]) { PROTECT(ans_inframe_start = NEW_INTEGER(x_len)); inframe_start = INTEGER(ans_inframe_start); } else { inframe_start = NULL; } out_ranges = new_IntPairAE(0, 0); order_buf = new_IntAE(x_len, 0, 0); reduce_ranges(x_start_p, x_width_p, x_len, LOGICAL(drop_empty_ranges)[0], INTEGER(min_gapwidth)[0], order_buf->elts, out_ranges, revmap, inframe_start); /* Make 'ans' */ PROTECT(ans = NEW_LIST(4)); PROTECT(ans_names = NEW_CHARACTER(4)); SET_STRING_ELT(ans_names, 0, mkChar("start")); SET_STRING_ELT(ans_names, 1, mkChar("width")); SET_STRING_ELT(ans_names, 2, mkChar("revmap")); SET_STRING_ELT(ans_names, 3, mkChar("inframe.start")); SET_NAMES(ans, ans_names); UNPROTECT(1); SET_VECTOR_ELT(ans, 0, new_INTEGER_from_IntAE(out_ranges->a)); SET_VECTOR_ELT(ans, 1, new_INTEGER_from_IntAE(out_ranges->b)); if (revmap != NULL) { PROTECT(ans_revmap = new_LIST_from_IntAEAE(revmap, 0)); SET_VECTOR_ELT(ans, 2, ans_revmap); UNPROTECT(1); } if (inframe_start != NULL) { SET_VECTOR_ELT(ans, 3, ans_inframe_start); UNPROTECT(1); } UNPROTECT(1); return ans; } /* --- .Call ENTRY POINT --- */ SEXP C_reduce_CompressedIRangesList(SEXP x, SEXP drop_empty_ranges, SEXP min_gapwidth, SEXP with_revmap) { SEXP ans, ans_names, ans_revmap, ans_breakpoints; //ans_unlistData, ans_partitioning; CompressedIRangesList_holder x_holder; IRanges_holder ir_holder; int x_len, in_len_max, i; IntAE *order_buf; IntPairAE *in_ranges, *out_ranges; IntAEAE *revmap; x_holder = _hold_CompressedIRangesList(x); x_len = _get_length_from_CompressedIRangesList_holder(&x_holder); if (LOGICAL(with_revmap)[0]) { revmap = new_IntAEAE(0, 0); } else { revmap = NULL; } in_len_max = get_maxNROWS_from_CompressedIRangesList_holder(&x_holder); order_buf = new_IntAE(in_len_max, 0, 0); in_ranges = new_IntPairAE(0, 0); out_ranges = new_IntPairAE(0, 0); PROTECT(ans_breakpoints = NEW_INTEGER(x_len)); for (i = 0; i < x_len; i++) { ir_holder = _get_elt_from_CompressedIRangesList_holder(&x_holder, i); IntPairAE_set_nelt(in_ranges, 0); append_IRanges_holder_to_IntPairAE(in_ranges, &ir_holder); reduce_ranges(in_ranges->a->elts, in_ranges->b->elts, IntPairAE_get_nelt(in_ranges), LOGICAL(drop_empty_ranges)[0], INTEGER(min_gapwidth)[0], order_buf->elts, out_ranges, revmap, NULL); INTEGER(ans_breakpoints)[i] = IntPairAE_get_nelt(out_ranges); } /* Make 'ans' */ PROTECT(ans = NEW_LIST(4)); PROTECT(ans_names = NEW_CHARACTER(4)); SET_STRING_ELT(ans_names, 0, mkChar("start")); SET_STRING_ELT(ans_names, 1, mkChar("width")); SET_STRING_ELT(ans_names, 2, mkChar("revmap")); SET_STRING_ELT(ans_names, 3, mkChar("breakpoints")); SET_NAMES(ans, ans_names); UNPROTECT(1); SET_VECTOR_ELT(ans, 0, new_INTEGER_from_IntAE(out_ranges->a)); SET_VECTOR_ELT(ans, 1, new_INTEGER_from_IntAE(out_ranges->b)); if (revmap != NULL) { PROTECT(ans_revmap = new_LIST_from_IntAEAE(revmap, 0)); SET_VECTOR_ELT(ans, 2, ans_revmap); UNPROTECT(1); } SET_VECTOR_ELT(ans, 3, ans_breakpoints); UNPROTECT(2); /* PROTECT(ans_unlistData = _new_IRanges_from_IntPairAE("IRanges", out_ranges)); PROTECT(ans_names = duplicate(_get_CompressedList_names(x))); PROTECT(ans_partitioning = _new_PartitioningByEnd( "PartitioningByEnd", ans_breakpoints, ans_names)); PROTECT(ans = _new_CompressedList(get_classname(x), ans_unlistData, ans_partitioning)); UNPROTECT(5); */ return ans; } /**************************************************************************** * gaps() methods for IntegerRanges and CompressedIRangesList objects */ /* WARNING: The ranges representing the gaps are *appended* to 'out_ranges'! Returns the number of ranges that were appended. */ static int gaps_ranges(const int *x_start, const int *x_width, int x_len, int restrict_start, int restrict_end, int *order_buf, IntPairAE *out_ranges) { int out_len, out_len0, i, j, start_j, width_j, end_j, max_end, gapstart, gapwidth; if (restrict_start != NA_INTEGER) max_end = restrict_start - 1; else max_end = NA_INTEGER; get_order_of_int_pairs(x_start, x_width, x_len, 0, 0, order_buf, 0); out_len = out_len0 = IntPairAE_get_nelt(out_ranges); for (i = 0; i < x_len; i++) { j = order_buf[i]; width_j = x_width[j]; if (width_j == 0) continue; start_j = x_start[j]; end_j = start_j + width_j - 1; if (max_end == NA_INTEGER) { max_end = end_j; } else { gapstart = max_end + 1; if (restrict_end != NA_INTEGER && start_j > restrict_end + 1) start_j = restrict_end + 1; gapwidth = start_j - gapstart; if (gapwidth >= 1) { /* Append to 'out_ranges'. */ IntPairAE_insert_at(out_ranges, out_len, gapstart, gapwidth); out_len++; max_end = end_j; } else if (end_j > max_end) { max_end = end_j; } } if (restrict_end != NA_INTEGER && max_end >= restrict_end) break; } if (restrict_end != NA_INTEGER && max_end != NA_INTEGER && max_end < restrict_end) { gapstart = max_end + 1; gapwidth = restrict_end - max_end; /* Append to 'out_ranges'. */ IntPairAE_insert_at(out_ranges, out_len, gapstart, gapwidth); out_len++; } return out_len - out_len0; } /* --- .Call ENTRY POINT --- */ SEXP C_gaps_IntegerRanges(SEXP x_start, SEXP x_width, SEXP start, SEXP end) { int x_len; const int *x_start_p, *x_width_p; SEXP ans, ans_names; IntPairAE *out_ranges; IntAE *order_buf; x_len = check_integer_pairs(x_start, x_width, &x_start_p, &x_width_p, "start(x)", "width(x)"); out_ranges = new_IntPairAE(0, 0); order_buf = new_IntAE(x_len, 0, 0); gaps_ranges(x_start_p, x_width_p, x_len, INTEGER(start)[0], INTEGER(end)[0], order_buf->elts, out_ranges); PROTECT(ans = NEW_LIST(2)); PROTECT(ans_names = NEW_CHARACTER(2)); SET_STRING_ELT(ans_names, 0, mkChar("start")); SET_STRING_ELT(ans_names, 1, mkChar("width")); SET_NAMES(ans, ans_names); UNPROTECT(1); SET_VECTOR_ELT(ans, 0, new_INTEGER_from_IntAE(out_ranges->a)); SET_VECTOR_ELT(ans, 1, new_INTEGER_from_IntAE(out_ranges->b)); UNPROTECT(1); return ans; } /* --- .Call ENTRY POINT --- */ SEXP C_gaps_CompressedIRangesList(SEXP x, SEXP start, SEXP end) { SEXP ans, ans_names, ans_unlistData, ans_breakpoints, ans_partitioning; CompressedIRangesList_holder x_holder; IRanges_holder ir_holder; int x_len, in_len_max, start_len, end_len, *start_elt, *end_elt, i; IntAE *order_buf; IntPairAE *in_ranges, *out_ranges; x_holder = _hold_CompressedIRangesList(x); x_len = _get_length_from_CompressedIRangesList_holder(&x_holder); in_len_max = get_maxNROWS_from_CompressedIRangesList_holder(&x_holder); order_buf = new_IntAE(in_len_max, 0, 0); in_ranges = new_IntPairAE(0, 0); out_ranges = new_IntPairAE(0, 0); start_len = LENGTH(start); end_len = LENGTH(end); if (start_len != 1 && start_len != x_len) error("'start' must have length 1 or the length of 'x'"); if (end_len != 1 && end_len != x_len) error("'end' must have length 1 or the length of 'x'"); PROTECT(ans_breakpoints = NEW_INTEGER(x_len)); start_elt = INTEGER(start); end_elt = INTEGER(end); for (i = 0; i < x_len; i++) { ir_holder = _get_elt_from_CompressedIRangesList_holder(&x_holder, i); IntPairAE_set_nelt(in_ranges, 0); append_IRanges_holder_to_IntPairAE(in_ranges, &ir_holder); gaps_ranges(in_ranges->a->elts, in_ranges->b->elts, IntPairAE_get_nelt(in_ranges), *start_elt, *end_elt, order_buf->elts, out_ranges); INTEGER(ans_breakpoints)[i] = IntPairAE_get_nelt(out_ranges); if (start_len != 1) start_elt++; if (end_len != 1) end_elt++; } PROTECT(ans_unlistData = _new_IRanges_from_IntPairAE("IRanges", out_ranges)); PROTECT(ans_names = duplicate(_get_CompressedList_names(x))); PROTECT(ans_partitioning = _new_PartitioningByEnd( "PartitioningByEnd", ans_breakpoints, ans_names)); PROTECT(ans = _new_CompressedList(get_classname(x), ans_unlistData, ans_partitioning)); UNPROTECT(5); return ans; } /**************************************************************************** * disjointBins() method for IntegerRanges objects */ /* --- .Call ENTRY POINT --- * Worst case complexity of O(n^2) :(, but in practice very fast. */ SEXP C_disjointBins_IntegerRanges(SEXP x_start, SEXP x_width) { SEXP ans; IntAE *bin_ends = new_IntAE(128, 0, 0); PROTECT(ans = NEW_INTEGER(length(x_start))); for (int i = 0; i < length(x_start); i++) { // find a bin, starting at first int j = 0, end = INTEGER(x_start)[i] + INTEGER(x_width)[i] - 1; for (; j < IntAE_get_nelt(bin_ends) && bin_ends->elts[j] >= INTEGER(x_start)[i]; j++); // remember when this bin will be open if (j == IntAE_get_nelt(bin_ends)) IntAE_append(bin_ends, &end, 1); else bin_ends->elts[j] = end; // store the bin for this range INTEGER(ans)[i] = j + 1; } UNPROTECT(1); return ans; } IRanges/tests/0000755000175100017510000000000014626176651014302 5ustar00biocbuildbiocbuildIRanges/tests/run_unitTests.R0000644000175100017510000000011714626176651017312 0ustar00biocbuildbiocbuildrequire("IRanges") || stop("unable to load IRanges package") IRanges:::.test() IRanges/vignettes/0000755000175100017510000000000014641351314015134 5ustar00biocbuildbiocbuildIRanges/vignettes/IRangesOverview.Rnw0000644000175100017510000005176214626176651020732 0ustar00biocbuildbiocbuild%\VignetteIndexEntry{An Overview of the IRanges package} %\VignetteDepends{IRanges} %\VignetteKeywords{Ranges,IntegerRanges,IRanges,IRangesList,Views,AtomicList} %\VignettePackage{IRanges} \documentclass{article} \usepackage[authoryear,round]{natbib} <>= BiocStyle::latex(use.unsrturl=FALSE) @ \title{An Overview of the \Biocpkg{IRanges} package} \author{Patrick Aboyoun, Michael Lawrence, Herv\'e Pag\`es} \date{Edited: February 2018; Compiled: \today} \begin{document} \maketitle \tableofcontents <>= options(width=72) @ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Introduction} When analyzing sequences, we are often interested in particular consecutive subsequences. For example, the \Sexpr{letters} vector could be considered a sequence of lower-case letters, in alphabetical order. We would call the first five letters (\textit{a} to \textit{e}) a consecutive subsequence, while the subsequence containing only the vowels would not be consecutive. It is not uncommon for an analysis task to focus only on the geometry of the regions, while ignoring the underlying sequence values. A list of indices would be a simple way to select a subsequence. However, a sparser representation for consecutive subsequences would be a range, a pairing of a start position and a width, as used when extracting sequences with \Rfunction{window}. Two central classes are available in Bioconductor for representing ranges: the \Rclass{IRanges} class defined in the \Biocpkg{IRanges} package for representing ranges defined on a single space, and the \Rclass{GRanges} class defined in the \Biocpkg{GenomicRanges} package for representing ranges defined on multiple spaces. In this vignette, we will focus on \Rclass{IRanges} objects. We will rely on simple, illustrative example datasets, rather than large, real-world data, so that each data structure and algorithm can be explained in an intuitive, graphical manner. We expect that packages that apply \Biocpkg{IRanges} to a particular problem domain will provide vignettes with relevant, realistic examples. The \Biocpkg{IRanges} package is available at bioconductor.org and can be downloaded via \Rfunction{BiocManager::install}: <>= if (!require("BiocManager")) install.packages("BiocManager") BiocManager::install("IRanges") @ <>= library(IRanges) @ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{\Rclass{IRanges} objects} To construct an \Rclass{IRanges} object, we call the \Rfunction{IRanges} constructor. Ranges are normally specified by passing two out of the three parameters: start, end and width (see \Rcode{help(IRanges)} for more information). % <>= ir1 <- IRanges(start=1:10, width=10:1) ir1 ir2 <- IRanges(start=1:10, end=11) ir3 <- IRanges(end=11, width=10:1) identical(ir1, ir2) && identical(ir1, ir3) ir <- IRanges(c(1, 8, 14, 15, 19, 34, 40), width=c(12, 6, 6, 15, 6, 2, 7)) ir @ % All of the above calls construct the same \Rclass{IRanges} object, using different combinations of the \Rcode{start}, \Rcode{end} and \Rcode{width} parameters. Accessing the starts, ends and widths is supported via the \Rfunction{start}, \Rfunction{end} and \Rfunction{width} getters: <>= start(ir) @ <>= end(ir) @ <>= width(ir) @ Subsetting an \Rclass{IRanges} object is supported, by numeric and logical indices: <>= ir[1:4] @ <>= ir[start(ir) <= 15] @ In order to illustrate range operations, we'll create a function to plot ranges. <>= plotRanges <- function(x, xlim=x, main=deparse(substitute(x)), col="black", sep=0.5, ...) { height <- 1 if (is(xlim, "IntegerRanges")) xlim <- c(min(start(xlim)), max(end(xlim))) bins <- disjointBins(IRanges(start(x), end(x) + 1)) plot.new() plot.window(xlim, c(0, max(bins)*(height + sep))) ybottom <- bins * (sep + height) - height rect(start(x)-0.5, ybottom, end(x)+0.5, ybottom + height, col=col, ...) title(main) axis(1) } @ <>= plotRanges(ir) @ \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-ir-plotRanges} \caption{\label{fig-ir-plotRanges}% Plot of original ranges.} \end{center} \end{figure} \subsection{Normality} Sometimes, it is necessary to formally represent a subsequence, where no elements are repeated and order is preserved. Also, it is occasionally useful to think of an \Rclass{IRanges} object as a set of integers, where no elements are repeated and order does not matter. The \Rclass{NormalIRanges} class formally represents a set of integers. By definition an \Rclass{IRanges} object is said to be \textit{normal} when its ranges are: (a) not empty (i.e. they have a non-null width); (b) not overlapping; (c) ordered from left to right; (d) not even adjacent (i.e. there must be a non empty gap between 2 consecutive ranges). There are three main advantages of using a \textit{normal} \Rclass{IRanges} object: (1) it guarantees a subsequence encoding or set of integers, (2) it is compact in terms of the number of ranges, and (3) it uniquely identifies its information, which simplifies comparisons. The \Rfunction{reduce} function reduces any \Rclass{IRanges} object to a \Rclass{NormalIRanges} by merging redundant ranges. <>= reduce(ir) plotRanges(reduce(ir)) @ \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-ranges-reduce} \caption{\label{fig-ranges-reduce}% Plot of reduced ranges.} \end{center} \end{figure} \subsection{Lists of \Rclass{IRanges} objects} It is common to manipulate collections of \Rclass{IRanges} objects during an analysis. Thus, the \Biocpkg{IRanges} package defines some specific classes for working with multiple \Rclass{IRanges} objects. The \Rclass{IRangesList} class asserts that each element is an \Rclass{IRanges} object and provides convenience methods, such as \Rfunction{start}, \Rfunction{end} and \Rfunction{width} accessors that return \Rclass{IntegerList} objects, aligning with the \Rclass{IRangesList} object. Note that \Rclass{IntegerList} objects will be covered later in more details in the ``Lists of Atomic Vectors'' section of this document. To explicitly construct an \Rclass{IRangesList}, use the \Rfunction{IRangesList} function. <>= rl <- IRangesList(ir, rev(ir)) @ % <>= start(rl) @ \subsection{Vector Extraction} As the elements of an \Rclass{IRanges} object encode consecutive subsequences, they may be used directly in sequence extraction. Note that when a \textit{normal} \Rclass{IRanges} is given as the index, the result is a subsequence, as no elements are repeated or reordered. If the sequence is a \Rclass{Vector} subclass (i.e. not an ordinary \Rclass{vector}), the canonical \Rfunction{[} function accepts an \Rclass{IRanges} object. % <>= set.seed(0) lambda <- c(rep(0.001, 4500), seq(0.001, 10, length=500), seq(10, 0.001, length=500)) xVector <- rpois(1e7, lambda) yVector <- rpois(1e7, lambda[c(251:length(lambda), 1:250)]) xRle <- Rle(xVector) yRle <- Rle(yVector) irextract <- IRanges(start=c(4501, 4901) , width=100) xRle[irextract] @ % \subsection{Finding Overlapping Ranges} The function \Rfunction{findOverlaps} detects overlaps between two \Rclass{IRanges} objects. <>= ol <- findOverlaps(ir, reduce(ir)) as.matrix(ol) @ \subsection{Counting Overlapping Ranges} The function \Rfunction{coverage} counts the number of ranges over each position. <>= cov <- coverage(ir) plotRanges(ir) cov <- as.vector(cov) mat <- cbind(seq_along(cov)-0.5, cov) d <- diff(cov) != 0 mat <- rbind(cbind(mat[d,1]+1, mat[d,2]), mat) mat <- mat[order(mat[,1]),] lines(mat, col="red", lwd=4) axis(2) @ \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-ranges-coverage} \caption{\label{fig-ranges-coverage}% Plot of ranges with accumulated coverage.} \end{center} \end{figure} \subsection{Finding Neighboring Ranges} The \Rfunction{nearest} function finds the nearest neighbor ranges (overlapping is zero distance). The \Rfunction{precede} and \Rfunction{follow} functions find the non-overlapping nearest neighbors on a specific side. \subsection{Transforming Ranges} Utilities are available for transforming an \Rclass{IRanges} object in a variety of ways. Some transformations, like \Rfunction{reduce} introduced above, can be dramatic, while others are simple per-range adjustments of the starts, ends or widths. \subsubsection{Adjusting starts, ends and widths} Perhaps the simplest transformation is to adjust the start values by a scalar offset, as performed by the \Rfunction{shift} function. Below, we shift all ranges forward 10 positions. % <>= shift(ir, 10) @ There are several other ways to transform ranges. These include \Rfunction{narrow}, \Rfunction{resize}, \Rfunction{flank}, \Rfunction{reflect}, \Rfunction{restrict}, and \Rfunction{threebands}. For example \Rfunction{narrow} supports the adjustment of start, end and width values, which should be relative to each range. These adjustments are vectorized over the ranges. As its name suggests, the ranges can only be narrowed. % <>= narrow(ir, start=1:5, width=2) @ The \Rfunction{restrict} function ensures every range falls within a set of bounds. Ranges are contracted as necessary, and the ranges that fall completely outside of but not adjacent to the bounds are dropped, by default. % <>= restrict(ir, start=2, end=3) @ The \Rfunction{threebands} function extends \Rfunction{narrow} so that the remaining left and right regions adjacent to the narrowed region are also returned in separate \Rclass{IRanges} objects. % <>= threebands(ir, start=1:5, width=2) @ The arithmetic operators \Rfunction{+}, \Rfunction{-} and \Rfunction{*} change both the start and the end/width by symmetrically expanding or contracting each range. Adding or subtracting a numeric (integer) vector to an \Rclass{IRanges} causes each range to be expanded or contracted on each side by the corresponding value in the numeric vector. <>= ir + seq_len(length(ir)) @ % The \Rfunction{*} operator symmetrically magnifies an \Rclass{IRanges} object by a factor, where positive contracts (zooms in) and negative expands (zooms out). % <>= ir * -2 # double the width @ WARNING: The semantic of these arithmetic operators might be revisited at some point. Please restrict their use to the context of interactive visualization (where they arguably provide some convenience) but avoid to use them programmatically. \subsubsection{Making ranges disjoint} A more complex type of operation is making a set of ranges disjoint, \textit{i.e.} non-overlapping. For example, \Rfunction{threebands} returns a disjoint set of three ranges for each input range. The \Rfunction{disjoin} function makes an \Rclass{IRanges} object disjoint by fragmenting it into the widest ranges where the set of overlapping ranges is the same. <>= disjoin(ir) plotRanges(disjoin(ir)) @ \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-ranges-disjoin} \caption{\label{fig-ranges-disjoin}% Plot of disjoined ranges.} \end{center} \end{figure} A variant of \Rfunction{disjoin} is \Rfunction{disjointBins}, which divides the ranges into bins, such that the ranges in each bin are disjoint. The return value is an integer vector of the bins. <>= disjointBins(ir) @ \subsubsection{Other transformations} Other transformations include \Rfunction{reflect} and \Rfunction{flank}. The former ``flips'' each range within a set of common reference bounds. <>= reflect(ir, IRanges(start(ir), width=width(ir)*2)) @ % The \Rfunction{flank} returns ranges of a specified width that flank, to the left (default) or right, each input range. One use case of this is forming promoter regions for a set of genes. <>= flank(ir, width=seq_len(length(ir))) @ % \subsection{Set Operations} Sometimes, it is useful to consider an \Rclass{IRanges} object as a set of integers, although there is always an implicit ordering. This is formalized by \Rclass{NormalIRanges}, above, and we now present versions of the traditional mathematical set operations \textit{complement}, \textit{union}, \textit{intersect}, and \textit{difference} for \Rclass{IRanges} objects. There are two variants for each operation. The first treats each \Rclass{IRanges} object as a set and returns a \textit{normal} value, while the other has a ``parallel'' semantic like \Rfunction{pmin}/\Rfunction{pmax} and performs the operation for each range pairing separately. The \textit{complement} operation is implemented by the \Rfunction{gaps} and \Rfunction{pgap} functions. By default, \Rfunction{gaps} will return the ranges that fall between the ranges in the (normalized) input. It is possible to specify a set of bounds, so that flanking ranges are included. <>= gaps(ir, start=1, end=50) plotRanges(gaps(ir, start=1, end=50), c(1,50)) @ \begin{figure}[tb] \begin{center} \includegraphics[width=0.5\textwidth]{IRangesOverview-ranges-gaps} \caption{\label{fig-ranges-gap}% Plot of gaps from ranges.} \end{center} \end{figure} \Rfunction{pgap} considers each parallel pairing between two \Rclass{IRanges} objects and finds the range, if any, between them. Note that the function name is singular, suggesting that only one range is returned per range in the input. <>= @ The remaining operations, \textit{union}, \textit{intersect} and \textit{difference} are implemented by the \Rfunction{[p]union}, \Rfunction{[p]intersect} and \Rfunction{[p]setdiff} functions, respectively. These are relatively self-explanatory. <>= @ <>= @ <>= @ <>= @ <>= @ <>= @ % \subsection{Mapping Ranges Between Vectors} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Vector Views} The \Biocpkg{IRanges} package provides the virtual \Rclass{Views} class, which stores a vector-like object, referred to as the ``subject'', together with an \Rclass{IRanges} object defining ranges on the subject. Each range is said to represent a \textit{view} onto the subject. Here, we will demonstrate the \Rclass{RleViews} class, where the subject is of class \Rclass{Rle}. Other \Rclass{Views} implementations exist, such as \Rclass{XStringViews} in the \Biocpkg{Biostrings} package. \subsection{Creating Views} There are two basic constructors for creating views: the \Rfunction{Views} function based on indicators and the \Rfunction{slice} based on numeric boundaries. <>= xViews <- Views(xRle, xRle >= 1) xViews <- slice(xRle, 1) xRleList <- RleList(xRle, 2L * rev(xRle)) xViewsList <- slice(xRleList, 1) @ Note that \Rclass{RleList} objects will be covered later in more details in the ``Lists of Atomic Vectors'' section of this document. \subsection{Aggregating Views} While \Rfunction{sapply} can be used to loop over each window, the native functions \Rfunction{viewMaxs}, \Rfunction{viewMins}, \Rfunction{viewSums}, and \Rfunction{viewMeans} provide fast looping to calculate their respective statistical summaries. <>= head(viewSums(xViews)) viewSums(xViewsList) head(viewMaxs(xViews)) viewMaxs(xViewsList) @ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Lists of Atomic Vectors} In addition to the range-based objects described in the previous sections, the \Biocpkg{IRanges} package provides containers for storing lists of atomic vectors such as \Rclass{integer} or \Rclass{Rle} objects. The \Rclass{IntegerList} and \Rclass{RleList} classes represent lists of \Rclass{integer} vectors and \Rclass{Rle} objects respectively. They are subclasses of the \Rclass{AtomicList} virtual class which is itself a subclass of the \Rclass{List} virtual class defined in the \Biocpkg{S4Vectors} package. <>= showClass("RleList") @ As the class definition above shows, the \Rclass{RleList} class is virtual with subclasses \Rclass{SimpleRleList} and \Rclass{CompressedRleList}. A \Rclass{SimpleRleList} class uses an ordinary \R{} list to store the underlying elements and the \Rclass{CompressedRleList} class stores the elements in an unlisted form and keeps track of where the element breaks are. The former ``simple list" class is useful when the \Rclass{Rle} elements are long and the latter ``compressed list" class is useful when the list is long and/or sparse (i.e. a number of the list elements have length 0). In fact, all of the atomic vector types (logical, integer, numeric, complex, character, raw, and factor) have similar list classes that derive from the \Rclass{List} virtual class. For example, there is an \Rclass{IntegerList} virtual class with subclasses \Rclass{SimpleIntegerList} and \Rclass{CompressedIntegerList}. Each of the list classes for atomic sequences, be they stored as vectors or \Rclass{Rle} objects, have a constructor function with a name of the appropriate list virtual class, such as \Rclass{IntegerList}, and an optional argument \Rcode{compress} that takes an argument to specify whether or not to create the simple list object type or the compressed list object type. The default is to create the compressed list object type. <>= args(IntegerList) cIntList1 <- IntegerList(x=xVector, y=yVector) cIntList1 sIntList2 <- IntegerList(x=xVector, y=yVector, compress=FALSE) sIntList2 ## sparse integer list xExploded <- lapply(xVector[1:5000], function(x) seq_len(x)) cIntList2 <- IntegerList(xExploded) sIntList2 <- IntegerList(xExploded, compress=FALSE) object.size(cIntList2) object.size(sIntList2) @ The \Rfunction{length} function returns the number of elements in a \Rclass{Vector}-derived object and, for a \Rclass{List}-derived object like ``simple list" or ``compressed list", the \Rfunction{lengths} function returns an integer vector containing the lengths of each of the elements: <>= length(cIntList2) Rle(lengths(cIntList2)) @ Just as with ordinary \R{} \Rclass{list} objects, \Rclass{List}-derived object support the \Rfunction{[[} for element extraction, \Rfunction{c} for concatenating, and \Rfunction{lapply}/\Rfunction{sapply} for looping. When looping over sparse lists, the ``compressed list" classes can be much faster during computations since only the non-empty elements are looped over during the \Rfunction{lapply}/\Rfunction{sapply} computation and all the empty elements are assigned the appropriate value based on their status. <>= system.time(sapply(xExploded, mean)) system.time(sapply(sIntList2, mean)) system.time(sapply(cIntList2, mean)) identical(sapply(xExploded, mean), sapply(sIntList2, mean)) identical(sapply(xExploded, mean), sapply(cIntList2, mean)) @ Unlike ordinary \R{} \Rclass{list} objects, \Rclass{AtomicList} objects support the \Rfunction{Ops} (e.g. \Rfunction{+}, \Rfunction{==}, \Rfunction{\&}), \Rfunction{Math} (e.g. \Rfunction{log}, \Rfunction{sqrt}), \Rfunction{Math2} (e.g. \Rfunction{round}, \Rfunction{signif}), \Rfunction{Summary} (e.g. \Rfunction{min}, \Rfunction{max}, \Rfunction{sum}), and \Rfunction{Complex} (e.g. \Rfunction{Re}, \Rfunction{Im}) group generics. <>= xRleList > 0 yRleList <- RleList(yRle, 2L * rev(yRle)) xRleList + yRleList sum(xRleList > 0 | yRleList > 0) @ Since these atomic lists inherit from \Rclass{List}, they can also use the looping function \Rfunction{endoapply} to perform endomorphisms. <>= safe.max <- function(x) { if(length(x)) max(x) else integer(0) } endoapply(sIntList2, safe.max) endoapply(cIntList2, safe.max) endoapply(sIntList2, safe.max)[[1]] @ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Session Information} Here is the output of \Rcode{sessionInfo()} on the system on which this document was compiled: <>= sessionInfo() @ \end{document}