Skip to contents

Bin a GRanges and apply a summary method (e.g: 'mean', 'median', 'sum', 'max, 'min' ...) to a chosen numerical variable of ranges in the same bin.

Usage

BinGRanges(
  gRange = NULL,
  chromSizes = NULL,
  binSize = NULL,
  method = "mean",
  metadataColName = NULL,
  na.rm = TRUE,
  cores = 1,
  reduceRanges = TRUE,
  verbose = FALSE
)

Arguments

gRange

: A GRanges to bin.

chromSizes

<data.frame>: A data.frame where first colum corresponds to the chromosomes names, and the second column corresponds to the chromosomes lengths in base pairs.

binSize

: Width of the bins.

method

: Name of a summary method as 'mean', 'median', 'sum', 'max, 'min'. (Default 'mean')

metadataColName

: A character vector that specify the metadata columns of GRanges on which to apply the summary method.

na.rm

: A logical value indicating whether NA values should be stripped before the computation proceeds. (Default TRUE)

cores

: The number of cores. (Default 1)

reduceRanges

: Whether duplicated Bins must be reduced with the summary method (method arg). (Default TRUE)

verbose

: If TRUE, show the progression in console. (Default FALSE)

Value

A binned GRanges.

Details

BinGRanges

Examples

GRange.gnr <- GenomicRanges::GRanges(
    seqnames = S4Vectors::Rle(c("chr1", "chr2"), c(3, 1)),
    ranges = IRanges::IRanges(
        start = c(1, 201, 251, 1),
        end = c(200, 250, 330, 100),
        names = letters[seq_len(4)]
    ),
    strand = S4Vectors::Rle(BiocGenerics::strand(c("*")), 4),
    score = c(50, NA, 100, 30)
)
GRange.gnr
#> GRanges object with 4 ranges and 1 metadata column:
#>     seqnames    ranges strand |     score
#>        <Rle> <IRanges>  <Rle> | <numeric>
#>   a     chr1     1-200      * |        50
#>   b     chr1   201-250      * |        NA
#>   c     chr1   251-330      * |       100
#>   d     chr2     1-100      * |        30
#>   -------
#>   seqinfo: 2 sequences from an unspecified genome; no seqlengths
BinGRanges(
    gRange = GRange.gnr,
    chromSizes = data.frame(c("chr1", "chr2"), c(350, 100)),
    binSize = 100,
    method = "mean",
    metadataColName = "score",
    na.rm = TRUE
)
#> GRanges object with 5 ranges and 2 metadata columns:
#>       seqnames    ranges strand |     score         bin
#>          <Rle> <IRanges>  <Rle> | <numeric> <character>
#>   [1]     chr1     1-100      * |        50      chr1:1
#>   [2]     chr1   101-200      * |        50      chr1:2
#>   [3]     chr1   201-300      * |       100      chr1:3
#>   [4]     chr1   301-350      * |       100      chr1:4
#>   [5]     chr2     1-100      * |        30      chr2:1
#>   -------
#>   seqinfo: 2 sequences from an unspecified genome