fastRanges: A Practical Introduction to Genomic Interval Analysis

0.1 Overview
0.2 A Mental Model
0.3 Which Function Should You Use?
0.4 The Example Data
0.5 Core Overlap Operations
- 0.5.1 Return all hit pairs
- 0.5.2 Count matches per query
- 0.5.3 Ask only whether a match exists
0.6 Important Arguments
- 0.6.1 type
- 0.6.2 max_gap
- 0.6.3 min_overlap
- 0.6.4 ignore_strand
- 0.6.5 threads and deterministic
0.7 Bioconductor Compatibility
- 0.7.1 What is currently supported?
- 0.7.2 What is not currently supported?
- 0.7.3 Empty ranges
- 0.7.4 Direct vs indexed usage
- 0.7.5 deterministic
0.8 Return Types You Will See Most Often
0.9 Reusing the Same Subject with an Index
0.10 Overlap Joins
0.11 Nearest and Directional Queries
0.12 Grouped Overlap Summaries
- 0.12.1 Count overlaps by group
- 0.12.2 Aggregate a subject score over overlaps
0.13 Self-Overlaps, Clusters, and Sliding Windows
0.14 Range Algebra
0.15 Coverage and Binned Coverage
0.16 Iterating over Large Query Sets
0.17 Saving and Loading an Index
0.18 A Small Benchmark Example
0.19 Benchmark Resources
0.20 Package Example Files
0.21 Session Info

0.1 Overview

fastRanges provides high-throughput interval operations for Bioconductor workflows built on IRanges and GRanges.

For genomic data, the central data type is GRanges. Each row is one genomic interval with:

a sequence name such as "chr1"
a genomic range (start and end coordinates, which together fix the width)
a strand ("+", "-", or "*")

Typical analyses ask questions such as:

Which peaks overlap genes or promoters?
How many query ranges overlap each subject annotation?
Which subject is nearest to each query?
How can the same subject annotation be reused across many batches of query intervals?

fastRanges is designed to answer those questions while keeping its grammar close to the Bioconductor overlap APIs in IRanges and GenomicRanges.

The package is best thought of as a high-throughput interval engine for supported Bioconductor range classes, not as a full drop-in replacement for every findOverlaps() input type.

library(fastRanges)
library(GenomicRanges)
#> Loading required package: stats4
#> Loading required package: BiocGenerics
#> Loading required package: generics
#> 
#> Attaching package: 'generics'
#> The following objects are masked from 'package:base':
#> 
#>     as.difftime, as.factor, as.ordered, intersect, is.element, setdiff,
#>     setequal, union
#> 
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:stats':
#> 
#>     IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#> 
#>     Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
#>     as.data.frame, basename, cbind, colnames, dirname, do.call,
#>     duplicated, eval, evalq, get, grep, grepl, is.unsorted, lapply,
#>     mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
#>     rank, rbind, rownames, sapply, saveRDS, table, tapply, unique,
#>     unsplit, which.max, which.min
#> Loading required package: S4Vectors
#> 
#> Attaching package: 'S4Vectors'
#> The following object is masked from 'package:utils':
#> 
#>     findMatches
#> The following objects are masked from 'package:base':
#> 
#>     I, expand.grid, unname
#> Loading required package: IRanges
#> Loading required package: Seqinfo
library(S4Vectors)

data("fast_ranges_example", package = "fastRanges")
query <- fast_ranges_example$query
subject <- fast_ranges_example$subject

0.2 A Mental Model

The package uses the same two-object language as Bioconductor:

query: the intervals you are asking about
subject: the intervals you compare the query against

If a query peak overlaps three subject genes, then:

fast_find_overlaps() returns three hit pairs
fast_count_overlaps() returns 3 for that query row
fast_overlaps_any() returns TRUE for that query row

That distinction is the main decision point when choosing a function.
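The three result shapes can be seen side by side on a toy case. This sketch uses Bioconductor's own findOverlaps(), countOverlaps(), and overlapsAny() from IRanges as reference counterparts. The toy ranges are invented for illustration, and it is an assumption (based on the compatibility section of this vignette) that the fast_*() functions return the same shapes.

```r
library(IRanges)

# Toy data (hypothetical): the first query overlaps three subjects,
# the second query overlaps none.
query_toy <- IRanges(start = c(10, 100), end = c(50, 120))
subject_toy <- IRanges(start = c(5, 20, 45, 200), end = c(15, 30, 60, 210))

toy_hits <- findOverlaps(query_toy, subject_toy)     # one row per matching pair
toy_counts <- countOverlaps(query_toy, subject_toy)  # one integer per query
toy_any <- overlapsAny(query_toy, subject_toy)       # one logical per query

length(toy_hits)
toy_counts
toy_any
```

The pair list has three rows, all for the first query; the count vector holds 3 and 0; the logical vector holds TRUE and FALSE.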

0.3 Which Function Should You Use?

| Goal | Function | Main output |
| --- | --- | --- |
| Get every matching pair | `fast_find_overlaps()` | `Hits` |
| Count matches per query | `fast_count_overlaps()` | integer vector |
| Ask only yes/no | `fast_overlaps_any()` | logical vector |
| Build a tabular overlap join | `fast_overlap_join()` | `data.frame` |
| Keep only matching query rows | `fast_semi_overlap_join()` | `data.frame` |
| Keep only non-matching query rows | `fast_anti_overlap_join()` | `data.frame` |
| Reuse the same subject many times | `fast_build_index()` | `fast_ranges_index` |
| Find nearest subject | `fast_nearest()` | `DataFrame` |
| Summarize overlap counts by subject metadata | `fast_count_overlaps_by_group()` | matrix |
| Summarize subject scores over overlaps | `fast_overlap_aggregate()` | numeric vector |
| Merge or transform intervals themselves | `fast_reduce()`, `fast_disjoin()`, `fast_gaps()` | ranges |
| Summarize coverage in bins | `fast_tile_coverage()` | `data.frame` |

0.4 The Example Data

The package ships with a small reproducible example:

names(fast_ranges_example)
#> [1] "query"   "subject"
query
#> GRanges object with 6 ranges and 2 metadata columns:
#>       seqnames    ranges strand |    query_id     score
#>          <Rle> <IRanges>  <Rle> | <character> <integer>
#>   [1]     chr1   100-159      + |          q1         8
#>   [2]     chr1   180-269      + |          q2        10
#>   [3]     chr1   350-419      - |          q3         6
#>   [4]     chr2     40-79      * |          q4         3
#>   [5]     chr2   300-379      + |          q5         9
#>   [6]     chr3     10-34      - |          q6         5
#>   -------
#>   seqinfo: 3 sequences from an unspecified genome; no seqlengths
subject
#> GRanges object with 7 ranges and 2 metadata columns:
#>       seqnames    ranges strand |     gene_id     biotype
#>          <Rle> <IRanges>  <Rle> | <character> <character>
#>   [1]     chr1    90-169      + |          gA    promoter
#>   [2]     chr1   210-309      - |          gB    enhancer
#>   [3]     chr1   320-369      - |          gC    enhancer
#>   [4]     chr2     20-89      + |          gD    promoter
#>   [5]     chr2   330-419      * |          gE   gene_body
#>   [6]     chr3      1-40      - |          gF    promoter
#>   [7]     chr3    80-114      + |          gG    enhancer
#>   -------
#>   seqinfo: 3 sequences from an unspecified genome; no seqlengths

In a real analysis, query might be ChIP-seq peaks and subject might be genes, exons, enhancers, promoters, or other genomic annotations.

This example is intentionally small so every function in the vignette can run quickly. The structure is:

query: 6 genomic intervals with metadata columns query_id and score
subject: 7 genomic intervals with metadata columns gene_id and biotype

If you want file-based examples instead of in-memory objects, the same records are also installed as BED files:

system.file("extdata", "query_peaks.bed", package = "fastRanges")
#> [1] "/tmp/RtmpOKui21/Rinst25ddf22361d5e/fastRanges/extdata/query_peaks.bed"
system.file("extdata", "subject_genes.bed", package = "fastRanges")
#> [1] "/tmp/RtmpOKui21/Rinst25ddf22361d5e/fastRanges/extdata/subject_genes.bed"

That split is deliberate:

use fast_ranges_example for package examples, tutorials, and tests
use the BED files when demonstrating import pipelines or command-line inputs

0.5 Core Overlap Operations

0.5.1 Return all hit pairs

hits <- fast_find_overlaps(query, subject, threads = 2)
hits
#> Hits object with 5 hits and 0 metadata columns:
#>            from        to
#>       <integer> <integer>
#>   [1]         1         1
#>   [2]         3         3
#>   [3]         4         4
#>   [4]         5         5
#>   [5]         6         6
#>   -------
#>   nLnode: 6 / nRnode: 7

This result is a Hits object. The important parts are:

queryHits(hits): row indices from query
subjectHits(hits): matching row indices from subject

cbind(
  query_id = S4Vectors::queryHits(hits),
  subject_id = S4Vectors::subjectHits(hits)
)
#>      query_id subject_id
#> [1,]        1          1
#> [2,]        3          3
#> [3,]        4          4
#> [4,]        5          5
#> [5,]        6          6

0.5.2 Count matches per query

counts <- fast_count_overlaps(query, subject, threads = 2)
counts
#> [1] 1 0 1 1 1 1

There is one value per query row:

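0 means no subject matched (query row 2 here)
1 means exactly one subject matched (every other row here)
a larger value such as 2 would mean that query matched two subject ranges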

0.5.3 Ask only whether a match exists

any_hits <- fast_overlaps_any(query, subject, threads = 2)
any_hits
#> [1]  TRUE FALSE  TRUE  TRUE  TRUE  TRUE

This is the lightest-weight summary when you need only a logical answer.

0.6 Important Arguments

Most overlap-style functions use the same argument grammar:

query
subject
max_gap
min_overlap
type
ignore_strand
threads
deterministic

0.6.1 `type`

type controls what “match” means.

type = "any"
query  :    |------|
subject: |------|
=> any qualifying overlap is a hit

type = "within"
subject: |--------------|
query  :    |------|
=> query must lie inside subject

type = "start"
query  :    |------|
subject:    |------------|
=> start coordinates must match or be within tolerance

type = "end"
query  :       |------|
subject: |------------|
=> end coordinates must match or be within tolerance

type = "equal"
query  :    |------|
subject:    |------|
=> start and end coordinates must match, or be within tolerance
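The diagrams above can be checked against Bioconductor's reference implementation. This is a minimal sketch using IRanges::findOverlaps() and its `type` argument with invented toy coordinates. It assumes, per the compatibility goal stated in this vignette, that fast_find_overlaps() treats the same modes the same way.

```r
library(IRanges)

# One toy query and five toy subjects (hypothetical coordinates).
query_toy <- IRanges(start = 10, end = 20)
subject_toy <- IRanges(
  start = c(5, 5, 10, 1, 10),
  end   = c(15, 30, 40, 20, 20)
)

# Subject indices matched under each overlap type.
overlap_types <- c(any = "any", within = "within", start = "start",
                   end = "end", equal = "equal")
matched <- lapply(overlap_types, function(tp) {
  sort(subjectHits(findOverlaps(query_toy, subject_toy, type = tp)))
})
matched
```

With these coordinates, "any" matches all five subjects, "within" matches subjects 2 to 5, "start" matches 3 and 5, "end" matches 4 and 5, and "equal" matches only subject 5.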

0.6.2 `max_gap`

max_gap controls how far apart two ranges may be and still count as a hit.

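-1 means true overlap is required
0 follows Bioconductor's zero-gap tolerance behavior; in Bioconductor, for type = "any", this also accepts ranges that are directly adjacent, with no bases between them
positive values allow nearby, non-overlapping ranges separated by at most that many bases to count in the modes that support that interpretation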
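As a reference point, the sketch below shows how Bioconductor's findOverlaps() interprets its `maxgap` argument for type = "any". The toy ranges are invented, and it is an assumption that max_gap in fastRanges maps onto the same rule.

```r
library(IRanges)

query_toy <- IRanges(start = 10, end = 20)
# Subject 1 overlaps the query, subject 2 starts right after it (gap 0),
# subject 3 is separated from it by 5 bases (positions 21 to 25).
subject_toy <- IRanges(start = c(15, 21, 26), end = c(18, 30, 30))

gaps_to_try <- c(strict = -1L, adjacent = 0L, nearby = 5L)
hits_by_gap <- lapply(gaps_to_try, function(g) {
  sort(subjectHits(findOverlaps(query_toy, subject_toy, maxgap = g)))
})
hits_by_gap
```

Each step up in maxgap admits one more subject: only subject 1 at -1, subjects 1 and 2 at 0, and all three at 5.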

0.6.3 `min_overlap`

min_overlap is the minimum overlap width, in bases.

0 is the least strict setting
larger values require wider shared overlap
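A small reference example of the width threshold, again using Bioconductor's findOverlaps() (argument `minoverlap`) with invented toy ranges. That min_overlap in fastRanges applies the same threshold is an assumption.

```r
library(IRanges)

query_toy <- IRanges(start = 10, end = 20)
# Subject 1 shares 3 bases with the query (18 to 20),
# subject 2 shares 5 bases (12 to 16).
subject_toy <- IRanges(start = c(18, 12), end = c(30, 16))

# Width of the shared part for each query/subject pair.
shared_width <- width(pintersect(rep(query_toy, 2), subject_toy))

loose <- sort(subjectHits(findOverlaps(query_toy, subject_toy, minoverlap = 1L)))
strict <- sort(subjectHits(findOverlaps(query_toy, subject_toy, minoverlap = 4L)))

shared_width
loose
strict
```

Requiring at least 4 shared bases drops subject 1 and keeps subject 2.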

0.6.4 `ignore_strand`

For GRanges:

FALSE uses strand-aware Bioconductor semantics
TRUE ignores strand and compares only genomic coordinates

For IRanges, strand does not exist, so this argument has no effect.
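The effect of strand is visible in the example data: query q2 (chr1:180-269, +) and subject gB (chr1:210-309, -) share coordinates but sit on opposite strands. The sketch below re-types those two ranges and checks them with Bioconductor's countOverlaps() as a reference; the fastRanges argument is spelled ignore_strand, while the Bioconductor one is ignore.strand.

```r
library(GenomicRanges)

# q2 and gB, copied from the example data listing.
q2 <- GRanges("chr1", IRanges(start = 180, end = 269), strand = "+")
gB <- GRanges("chr1", IRanges(start = 210, end = 309), strand = "-")

strand_aware <- countOverlaps(q2, gB, ignore.strand = FALSE)
strand_blind <- countOverlaps(q2, gB, ignore.strand = TRUE)

c(strand_aware = strand_aware, strand_blind = strand_blind)
```

This is why query row 2 reports no overlap in the strand-aware results shown earlier.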

0.6.5 `threads` and `deterministic`

threads controls parallel execution
deterministic = TRUE keeps stable ordering of hit pairs
deterministic = FALSE can be slightly faster for large jobs when order is unimportant

0.7 Bioconductor Compatibility

fastRanges is intended to stay close to Bioconductor overlap semantics.

The easiest way to see that is to compare outputs directly:

ref <- GenomicRanges::findOverlaps(query, subject, ignore.strand = FALSE)

identical(
  cbind(S4Vectors::queryHits(hits), S4Vectors::subjectHits(hits)),
  cbind(S4Vectors::queryHits(ref), S4Vectors::subjectHits(ref))
)
#> [1] TRUE

0.7.1 What is currently supported?

IRanges
GRanges
overlap modes such as "any", "start", "end", "within", and "equal"
strand-aware or strand-ignored comparison for GRanges
select = "all", "first", "last", and "arbitrary"

0.7.2 What is not currently supported?

GRangesList
circular genomic sequences

For these unsupported cases, fastRanges now raises an explicit error rather than returning silently different answers.

0.7.3 Empty ranges

Width-0 / empty-range cases are handled using Bioconductor-compatible reference behavior. This is a correctness choice: empty ranges are uncommon in production overlap workloads, so exact semantics are more important than forcing these cases through the multithreaded C++ path.

0.7.4 Direct vs indexed usage

There are two intended usage modes:

Direct mode:
- fast_find_overlaps(query, subject, ...)
- best for one-off calls or exploratory work
Indexed mode:
- idx <- fast_build_index(subject)
- fast_find_overlaps(query, idx, ...)
- best when the same subject is queried repeatedly

This distinction matters for performance much more than for semantics.

0.7.5 `deterministic`

deterministic = TRUE keeps output ordering stable across thread counts and is useful for tests, reports, and exact reproducibility.

deterministic = FALSE avoids that extra ordering step and is usually the right choice when you care about maximum throughput on large multithreaded jobs.

Overall, compatibility with Bioconductor semantics matters because fastRanges is designed to fit into existing IRanges / GRanges workflows rather than to replace those classes.

0.8 Return Types You Will See Most Often

ret_classes <- c(
  hits = class(fast_find_overlaps(query, subject, threads = 1))[1],
  count = class(fast_count_overlaps(query, subject, threads = 1))[1],
  any = class(fast_overlaps_any(query, subject, threads = 1))[1],
  nearest = class(fast_nearest(query, subject))[1],
  join = class(fast_overlap_join(query, subject, threads = 1))[1],
  coverage = class(fast_coverage(query))[1]
)

ret_classes
#>            hits           count             any         nearest            join 
#>          "Hits"       "integer"       "logical"        "DFrame"    "data.frame" 
#>        coverage 
#> "SimpleRleList"

Common return shapes:

Hits: pairwise overlap relationships
integer vector: one summary value per query
logical vector: one yes/no value per query
DataFrame: nearest-neighbor tables
data.frame: join or coverage summary tables
GRanges / IRanges: transformed intervals
RleList or Rle: base-resolution coverage

0.9 Reusing the Same Subject with an Index

If you will query the same subject repeatedly, build an index once and reuse it.

subject_index <- fast_build_index(subject)
subject_index
#> <fast_ranges_index>
#>   subject class: GRanges 
#>   subject ranges: 7 
#>   partitions: 3

Now use that index in later overlap calls:

hits_from_index <- fast_find_overlaps(query, subject_index, threads = 2)

identical(
  cbind(S4Vectors::queryHits(hits_from_index), S4Vectors::subjectHits(hits_from_index)),
  cbind(S4Vectors::queryHits(hits), S4Vectors::subjectHits(hits))
)
#> [1] TRUE

When should you build an index?

yes: many query batches against the same subject
yes: benchmarking repeated overlap workloads
no: one small one-off overlap call
no: when you need subject metadata directly in a join output

0.10 Overlap Joins

The join family turns overlap hits into beginner-friendly tables.

joined_inner <- fast_overlap_join(query, subject, join = "inner", threads = 2)
head(joined_inner)
#>   query_id subject_id query_seqnames query_start query_end query_width
#> 1        1          1           chr1         100       159          60
#> 3        3          3           chr1         350       419          70
#> 4        4          4           chr2          40        79          40
#> 5        5          5           chr2         300       379          80
#> 6        6          6           chr3          10        34          25
#>   query_strand query_query_id query_score subject_seqnames subject_start
#> 1            +             q1           8             chr1            90
#> 3            -             q3           6             chr1           320
#> 4            *             q4           3             chr2            20
#> 5            +             q5           9             chr2           330
#> 6            -             q6           5             chr3             1
#>   subject_end subject_width subject_strand subject_gene_id subject_biotype
#> 1         169            80              +              gA        promoter
#> 3         369            50              -              gC        enhancer
#> 4          89            70              +              gD        promoter
#> 5         419            90              *              gE       gene_body
#> 6          40            40              -              gF        promoter

Important columns:

query_id: row index from query
subject_id: matching row index from subject
query_*: columns copied from query
subject_*: columns copied from subject
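Conceptually, a join table is the hit-pair list with metadata looked up by row index. The sketch below rebuilds a reduced version of the table by hand from Bioconductor's findOverlaps(). It is not the package's implementation. The objects ex_query and ex_subject are re-typed from the example data listing, and the column selection is a hypothetical subset of what fast_overlap_join() returns.

```r
library(GenomicRanges)

# Example data re-typed from the listing in "The Example Data".
ex_query <- GRanges(
  seqnames = c("chr1", "chr1", "chr1", "chr2", "chr2", "chr3"),
  ranges = IRanges(start = c(100, 180, 350, 40, 300, 10),
                   end = c(159, 269, 419, 79, 379, 34)),
  strand = c("+", "+", "-", "*", "+", "-"),
  query_id = paste0("q", 1:6),
  score = c(8L, 10L, 6L, 3L, 9L, 5L)
)
ex_subject <- GRanges(
  seqnames = c("chr1", "chr1", "chr1", "chr2", "chr2", "chr3", "chr3"),
  ranges = IRanges(start = c(90, 210, 320, 20, 330, 1, 80),
                   end = c(169, 309, 369, 89, 419, 40, 114)),
  strand = c("+", "-", "-", "+", "*", "-", "+"),
  gene_id = paste0("g", LETTERS[1:7]),
  biotype = c("promoter", "enhancer", "enhancer", "promoter",
              "gene_body", "promoter", "enhancer")
)

hits <- findOverlaps(ex_query, ex_subject)  # strand-aware reference hits
qh <- queryHits(hits)
sh <- subjectHits(hits)

# Inner join: one row per hit pair, metadata looked up by row index.
manual_inner <- data.frame(
  query_id = qh,
  subject_id = sh,
  query_score = mcols(ex_query)$score[qh],
  subject_gene_id = mcols(ex_subject)$gene_id[sh]
)

# Left join: append query rows without a hit, with NA on the subject side.
unmatched <- which(!(seq_along(ex_query) %in% qh))
manual_left <- rbind(
  manual_inner,
  data.frame(
    query_id = unmatched,
    subject_id = NA_integer_,
    query_score = mcols(ex_query)$score[unmatched],
    subject_gene_id = NA_character_
  )
)
manual_left
```

The inner table has five rows, and the left table adds query row 2 with NA in the subject columns.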

Left join keeps all query rows:

joined_left <- fast_left_overlap_join(query, subject, threads = 2)
head(joined_left)
#>   query_id subject_id query_seqnames query_start query_end query_width
#> 1        1          1           chr1         100       159          60
#> 3        3          3           chr1         350       419          70
#> 4        4          4           chr2          40        79          40
#> 5        5          5           chr2         300       379          80
#> 6        6          6           chr3          10        34          25
#> 2        2         NA           chr1         180       269          90
#>   query_strand query_query_id query_score subject_seqnames subject_start
#> 1            +             q1           8             chr1            90
#> 3            -             q3           6             chr1           320
#> 4            *             q4           3             chr2            20
#> 5            +             q5           9             chr2           330
#> 6            -             q6           5             chr3             1
#> 2            +             q2          10             <NA>            NA
#>   subject_end subject_width subject_strand subject_gene_id subject_biotype
#> 1         169            80              +              gA        promoter
#> 3         369            50              -              gC        enhancer
#> 4          89            70              +              gD        promoter
#> 5         419            90              *              gE       gene_body
#> 6          40            40              -              gF        promoter
#> 2          NA            NA           <NA>            <NA>            <NA>

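Semi and anti joins return only query columns. The semi join keeps query rows with at least one overlap; the anti join keeps query rows with none: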

semi_tbl <- fast_semi_overlap_join(query, subject, threads = 2)
anti_tbl <- fast_anti_overlap_join(query, subject, threads = 2)

head(semi_tbl)
#>   query_id overlap_count query_seqnames query_start query_end query_width
#> 1        1             1           chr1         100       159          60
#> 3        3             1           chr1         350       419          70
#> 4        4             1           chr2          40        79          40
#> 5        5             1           chr2         300       379          80
#> 6        6             1           chr3          10        34          25
#>   query_strand query_query_id query_score
#> 1            +             q1           8
#> 3            -             q3           6
#> 4            *             q4           3
#> 5            +             q5           9
#> 6            -             q6           5
head(anti_tbl)
#>   query_id overlap_count query_seqnames query_start query_end query_width
#> 2        2             0           chr1         180       269          90
#>   query_strand query_query_id query_score
#> 2            +             q2          10

0.11 Nearest and Directional Queries

These functions are useful when direct overlap is not the only biological question.

nearest_tbl <- fast_nearest(query, subject)
head(as.data.frame(nearest_tbl))
#>   query_id subject_id distance
#> 1        1          1        0
#> 2        2          1       10
#> 3        3          3        0
#> 4        4          4        0
#> 5        5          5        0
#> 6        6          6        0

Interpretation:

subject_id is the row index of the nearest subject for each query row
distance = 0 means overlap
positive distance means the nearest subject is separated by a gap
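The distance of 10 reported for query row 2 can be reproduced with Bioconductor's distance(), which counts the positions lying between two ranges. This sketch re-types q1, q2, and gA from the example data. One caveat applies to Bioconductor's distance(): directly adjacent ranges also report 0. Whether fast_nearest() follows that convention is not stated in this vignette.

```r
library(GenomicRanges)

q1 <- GRanges("chr1", IRanges(start = 100, end = 159), strand = "+")
q2 <- GRanges("chr1", IRanges(start = 180, end = 269), strand = "+")
gA <- GRanges("chr1", IRanges(start = 90, end = 169), strand = "+")

d_overlap <- distance(q1, gA)  # ranges overlap
d_gap <- distance(q2, gA)      # positions 170 to 179 lie between them

# Adjacent toy ranges (hypothetical): no positions between them.
d_adjacent <- distance(IRanges(start = 1, end = 5), IRanges(start = 6, end = 10))

c(overlap = d_overlap, gap = d_gap, adjacent = d_adjacent)
```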

Directional helpers return subject row indices:

fast_precede(query, subject)
#> [1] NA NA  2  5 NA NA
fast_follow(query, subject)
#> [1] NA  1 NA NA  4 NA

0.12 Grouped Overlap Summaries

Many analyses need more than raw hits. They need counts or scores summarized by subject annotation.

set.seed(1)
S4Vectors::mcols(subject)$group <- sample(
  c("promoter", "enhancer"),
  length(subject),
  replace = TRUE
)
S4Vectors::mcols(subject)$score <- seq_len(length(subject))

0.12.1 Count overlaps by group

by_group <- fast_count_overlaps_by_group(
  query,
  subject,
  group_col = "group",
  threads = 2
)

by_group
#>      enhancer promoter
#> [1,]        0        1
#> [2,]        0        0
#> [3,]        0        1
#> [4,]        0        1
#> [5,]        1        0
#> [6,]        0        1

Rows are query ranges. Columns are subject groups.

0.12.2 Aggregate a subject score over overlaps

sum_score <- fast_overlap_aggregate(
  query,
  subject,
  value_col = "score",
  fun = "sum",
  threads = 2
)

sum_score
#> [1]  1 NA  3  4  5  6

This pattern is useful when subject intervals carry signal, annotation scores, or other numeric metadata. In the output above, query row 2 has no overlapping subject and is reported as NA rather than 0.
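The aggregation can be pictured as a group-by over hit pairs. The base R sketch below starts from the hit pairs and the 1 to 7 subject scores shown earlier and reproduces the same vector. It is an illustration of the idea, not the package's implementation.

```r
# Hit pairs as printed by fast_find_overlaps() above.
query_hits <- c(1L, 3L, 4L, 5L, 6L)
subject_hits <- c(1L, 3L, 4L, 5L, 6L)

# Subject scores as assigned above: seq_len(length(subject)).
subject_score <- seq_len(7)
n_query <- 6L

# Sum the subject scores within each query that has at least one hit.
sums <- tapply(subject_score[subject_hits], query_hits, sum)

# Queries without hits stay NA.
sum_by_query <- rep(NA_real_, n_query)
sum_by_query[as.integer(names(sums))] <- sums
sum_by_query
```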

0.13 Self-Overlaps, Clusters, and Sliding Windows

Self-overlaps compare a range set to itself:

self_hits <- fast_self_overlaps(query, threads = 2)
self_hits
#> Hits object with 0 hits and 0 metadata columns:
#>         from        to
#>    <integer> <integer>
#>   -------
#>   nLnode: 6 / nRnode: 6

Clusters identify connected components in the self-overlap graph:

clusters <- fast_cluster_overlaps(query, return = "data.frame")
head(clusters)
#>   range_id cluster_id cluster_size
#> 1        1          1            1
#> 2        2          2            1
#> 3        3          3            1
#> 4        4          4            1
#> 5        5          5            1
#> 6        6          6            1

Sliding-window summaries convert interval data into a regular table:

window_counts <- fast_window_count_overlaps(
  query,
  subject,
  window_width = 1000L,
  step_width = 500L,
  threads = 2
)
#> Warning in .merge_two_Seqinfo_objects(x, y): The 2 combined objects have no sequence levels in common. (Use
#>   suppressWarnings() to suppress this warning.)
#> Warning in .merge_two_Seqinfo_objects(x, y): The 2 combined objects have no sequence levels in common. (Use
#>   suppressWarnings() to suppress this warning.)

head(window_counts)
#>   window_id seqnames start  end overlap_count
#> 1         1     chr1   100 1099             3
#> 2         2     chr2    40 1039             2
#> 3         3     chr3    10 1009             2

0.14 Range Algebra

These functions work on the ranges themselves rather than overlap hit pairs.

query_reduced <- fast_reduce(query)
query_disjoint <- fast_disjoin(query)
query_gaps <- fast_gaps(query)

c(
  original = length(query),
  reduced = length(query_reduced),
  disjoint = length(query_disjoint),
  gaps = length(query_gaps)
)
#> original  reduced disjoint     gaps 
#>        6        6        6        6
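In the example data the reduced and disjoint counts equal the original count because none of the six query ranges overlap one another, as the empty self-overlap result showed. A toy set with overlaps makes the three operations easier to tell apart. This sketch uses Bioconductor's reduce(), disjoin(), and gaps() on invented IRanges as reference behavior.

```r
library(IRanges)

# Toy ranges (hypothetical): the first two overlap, the third is separate.
x <- IRanges(start = c(1, 5, 20), end = c(10, 15, 30))

x_reduced <- reduce(x)    # merge overlapping ranges: 1-15, 20-30
x_disjoint <- disjoin(x)  # split at every boundary: 1-4, 5-10, 11-15, 20-30
x_gaps <- gaps(x)         # uncovered stretch between ranges: 16-19

c(
  original = length(x),
  reduced = length(x_reduced),
  disjoint = length(x_disjoint),
  gaps = length(x_gaps)
)
```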

You can also combine two range sets directly:

union_ranges <- fast_range_union(query, subject)
intersect_ranges <- fast_range_intersect(query, subject)
setdiff_ranges <- fast_range_setdiff(query, subject)

c(
  union = length(union_ranges),
  intersect = length(intersect_ranges),
  setdiff = length(setdiff_ranges)
)
#>     union intersect   setdiff 
#>        10         3         4

0.15 Coverage and Binned Coverage

Coverage answers: “how many ranges cover each base position?”

query_cov <- fast_coverage(query)
query_cov
#> RleList of length 3
#> $chr1
#> integer-Rle of length 419 with 6 runs
#>   Lengths: 99 60 20 90 80 70
#>   Values :  0  1  0  1  0  1
#> 
#> $chr2
#> integer-Rle of length 379 with 4 runs
#>   Lengths:  39  40 220  80
#>   Values :   0   1   0   1
#> 
#> $chr3
#> integer-Rle of length 34 with 2 runs
#>   Lengths:  9 25
#>   Values :  0  1

For plotting and downstream summaries, tile-based coverage is often easier to work with:

query_tiles <- fast_tile_coverage(
  query,
  tile_width = 1000L,
  step_width = 1000L
)

head(query_tiles)
#>   seqnames start end coverage_sum
#> 1     chr1     1 419          220
#> 2     chr2     1 379          120
#> 3     chr3     1  34           25
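The coverage_sum column is consistent with summing per-base coverage inside each tile. The base R sketch below expands the chr1 run lengths printed above and recovers the value 220. The narrower 100 bp tiles are a hypothetical variation, and treating a tile sum as a plain sum of per-base coverage is an assumption about fast_tile_coverage(), not a documented guarantee.

```r
# Per-base coverage for chr1, expanded from the run lengths printed above.
chr1_cov <- rep(c(0, 1, 0, 1, 0, 1), times = c(99, 60, 20, 90, 80, 70))

# A single 1000 bp tile spans all 419 positions, so its sum is the total.
total_cov <- sum(chr1_cov)

# Hypothetical narrower tiles of width 100 (the last tile is shorter).
tile_width <- 100L
tile_id <- (seq_along(chr1_cov) - 1L) %/% tile_width + 1L
tile_sums <- tapply(chr1_cov, tile_id, sum)

total_cov
tile_sums
```

The narrower tile sums add up to the same total, which is a quick sanity check when changing tile_width.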

0.16 Iterating over Large Query Sets

Use the iterator API when you do not want to materialize all hits at once.

iter <- fast_find_overlaps_iter(query, subject_index, chunk_size = 2L, threads = 2)

fast_iter_has_next(iter)
#> [1] TRUE
chunk1 <- fast_iter_next(iter)
chunk1
#> Hits object with 1 hit and 0 metadata columns:
#>            from        to
#>       <integer> <integer>
#>   [1]         1         1
#>   -------
#>   nLnode: 6 / nRnode: 7
fast_iter_has_next(iter)
#> [1] TRUE

To start over:

fast_iter_reset(iter)
all_iter_hits <- fast_iter_collect(iter)
length(all_iter_hits)
#> [1] 5

fast_iter_collect() gathers the remaining chunks from the current iterator position. If you already consumed some chunks and want everything again, reset the iterator first.
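The idea behind chunked iteration can be sketched with plain Bioconductor calls: process the query in slices and accumulate a summary, so that the full hit list never has to be held at once. This is a conceptual illustration with invented toy ranges, not the iterator's implementation. In this sketch the chunk-local query indices must be mapped back by hand, whereas the iterator output above already reports them against the full query.

```r
library(IRanges)

# Toy data (hypothetical): every query overlaps exactly one subject.
query_toy <- IRanges(start = c(1, 20, 40, 60, 80), end = c(10, 30, 50, 70, 90))
subject_toy <- IRanges(start = c(5, 45, 85), end = c(25, 65, 95))

chunk_size <- 2L
chunk_id <- ceiling(seq_along(query_toy) / chunk_size)

n_hits <- 0L
hit_queries <- integer(0)
for (idx in split(seq_along(query_toy), chunk_id)) {
  chunk_hits <- findOverlaps(query_toy[idx], subject_toy)
  # queryHits() is relative to the chunk; idx maps it back to the full query.
  hit_queries <- c(hit_queries, idx[queryHits(chunk_hits)])
  n_hits <- n_hits + length(chunk_hits)
}

n_hits
hit_queries
```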

0.17 Saving and Loading an Index

tmp_index <- tempfile(fileext = ".rds")

fast_save_index(subject_index, tmp_index)
loaded_index <- fast_load_index(tmp_index)
fast_index_stats(loaded_index)
#> DataFrame with 1 row and 6 columns
#>   subject_n partition_n   block_n seqlevel_n mean_partition_size index_size_mb
#>   <integer>   <integer> <integer>  <integer>           <numeric>     <numeric>
#> 1         7           3         3          3             2.33333     0.0043869

unlink(tmp_index)

This is useful when index construction is expensive and the same subject set will be queried again in a later session.

0.18 A Small Benchmark Example

The chunk below is only a small demonstration. Publication-scale benchmarking should use the Quarto workflow in inst/benchmarks.

base_time <- system.time({
  h_base <- GenomicRanges::findOverlaps(query, subject, ignore.strand = FALSE)
})

fast_time <- system.time({
  h_fast <- fast_find_overlaps(query, subject_index, ignore_strand = FALSE, threads = 2)
})

rbind(
  genomic_ranges = base_time[1:3],
  fast_ranges = fast_time[1:3]
)
#>                user.self sys.self elapsed
#> genomic_ranges     0.015        0   0.015
#> fast_ranges        0.004        0   0.003

c(base_hits = length(h_base), fast_hits = length(h_fast))
#> base_hits fast_hits 
#>         5         5

0.19 Benchmark Resources

The vignette example above is intentionally small. For large benchmark studies, repeated-query experiments, and figure-rich interpretation, use the benchmark resources maintained in the repository under inst/benchmarks.

This separation is deliberate:

the vignette stays fast and reproducible on ordinary machines
the benchmark runner can be executed on larger servers
the interpretation report can be rendered later from saved result tables

0.20 Package Example Files

The package also ships example BED files and benchmarking helpers:

c(
  list.files(system.file("extdata", package = "fastRanges")),
  list.files(system.file("benchmarks", package = "fastRanges"))
)
#> [1] "query_peaks.bed"    "subject_genes.bed"  "01"                
#> [4] "README.md"          "validate_outputs.R"

0.21 Session Info

sessionInfo()
#> R version 4.6.0 alpha (2026-04-05 r89794)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#> 
#> Matrix products: default
#> BLAS:   /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB              LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: America/New_York
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats4    stats     graphics  grDevices utils     datasets  methods  
#> [8] base     
#> 
#> other attached packages:
#> [1] GenomicRanges_1.63.2 Seqinfo_1.1.0        IRanges_2.45.0      
#> [4] S4Vectors_0.49.2     BiocGenerics_0.57.1  generics_0.1.4      
#> [7] fastRanges_0.99.2    BiocStyle_2.39.0    
#> 
#> loaded via a namespace (and not attached):
#>  [1] httr_1.4.8          cli_3.6.6           knitr_1.51         
#>  [4] rlang_1.2.0         xfun_0.57           otel_0.2.0         
#>  [7] UCSC.utils_1.7.1    jsonlite_2.0.0      htmltools_0.5.9    
#> [10] sass_0.4.10         rmarkdown_2.31      evaluate_1.0.5     
#> [13] jquerylib_0.1.4     fastmap_1.2.0       GenomeInfoDb_1.47.2
#> [16] yaml_2.3.12         lifecycle_1.0.5     bookdown_0.46      
#> [19] BiocManager_1.30.27 compiler_4.6.0      Rcpp_1.1.1-1       
#> [22] XVector_0.51.0      digest_0.6.39       R6_2.6.1           
#> [25] bslib_0.10.0        tools_4.6.0         cachem_1.1.0

17 April 2026
