Transforms and scales chromatin interaction data to prepare it for visualization. Applies user-defined scaling functions (e.g., log transformation) to interaction scores and handles missing values.
Arguments
- data
Input data in one of these formats:
ChromatinContacts object with imported interactions
GInteractions object with score metadata
data.frame or tibble with columns: seqnames1, start1, end1, seqnames2, start2, end2, plus a score column
- scale_column
Character. Name of column containing values to scale. Common options:
"balanced": ICE-normalized counts (recommended)
"count": raw contact counts
Any other numeric metadata column
- scale_method
Function to apply for transformation. Common options:
log10: log10 transformation (default for most Hi-C data)
log2: log2 transformation
log1p: log(x + 1) transformation (handles zeros)
identity or function(x) x: no transformation
Custom function: any function that takes a numeric vector and returns a numeric vector
- remove_na
Logical. Whether to remove rows with NA or infinite values after scaling (default: FALSE). Set to TRUE to drop missing or infinite values that could cause visualization issues.
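The interaction between scale_method and remove_na is easiest to see on a small numeric vector. The sketch below uses only base R; the vector counts and the helper log10_pseudo are hypothetical names introduced for illustration and are not part of the package.

```r
# Toy contact values, including a zero and a missing value
counts <- c(0, 1, 9, 99, NA)

log10(counts)     # zero becomes -Inf, NA stays NA
log1p(counts)     # log(x + 1): zero maps to 0, NA stays NA
identity(counts)  # values returned unchanged

# Hypothetical custom transformation: log10 with a pseudocount of 1
log10_pseudo <- function(x) log10(x + 1)
scaled <- log10_pseudo(counts)

# Positions where the scaled value is NA or infinite; these are the
# kind of rows that remove_na = TRUE is documented to drop
not_finite <- !is.finite(log10(counts))
```

A function such as log10_pseudo can be passed as scale_method in the same way as log10 or log2, since it takes a numeric vector and returns a numeric vector.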
Value
Tibble (data frame) with standardized columns:
seqnames1, start1, end1: First anchor coordinates
seqnames2, start2, end2: Second anchor coordinates
score: Transformed and scaled values
Details
Processing steps
Convert input to tibble format
Apply scale_method function to scale_column
Create new score column with transformed values
Optionally remove NA/infinite values
Squish extreme outliers to prevent visualization artifacts
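The processing steps can be approximated in base R as follows. This is a minimal sketch, not the package's implementation: scale_sketch and its squish_probs argument are hypothetical, it works on a plain data.frame rather than a tibble, and the choice of the 1st and 99th percentiles as squish limits is an assumption, since the limits actually used are not stated here.

```r
# Hypothetical stand-in for the documented steps; not the package's code
scale_sketch <- function(df, scale_column, scale_method = log10,
                         remove_na = FALSE, squish_probs = c(0.01, 0.99)) {
  # Apply the transformation and store the result as `score`
  df$score <- scale_method(df[[scale_column]])

  # Optionally drop rows whose score is NA, NaN, or infinite
  if (remove_na) {
    df <- df[is.finite(df$score), , drop = FALSE]
  }

  # Clamp finite scores to assumed quantile limits; NA and
  # infinite values are left untouched
  ok <- is.finite(df$score)
  if (any(ok)) {
    limits <- quantile(df$score[ok], probs = squish_probs, names = FALSE)
    df$score[ok] <- pmin(pmax(df$score[ok], limits[1]), limits[2])
  }

  # Return only the standardized columns
  df[, c("seqnames1", "start1", "end1",
         "seqnames2", "start2", "end2", "score")]
}

# Toy input with a zero and a missing value in the `balanced` column
contacts <- data.frame(
  seqnames1 = "chr1",
  start1 = c(1, 1, 10001, 20001),
  end1 = c(10000, 10000, 20000, 30000),
  seqnames2 = "chr1",
  start2 = c(1, 10001, 10001, 20001),
  end2 = c(10000, 20000, 20000, 30000),
  balanced = c(0.5, 0.01, 0, NA)
)

res <- scale_sketch(contacts, "balanced", log10, remove_na = TRUE)
```

With remove_na = TRUE, the rows where log10 yields -Inf (the zero) or NA are dropped before squishing, so only rows with finite scores remain in res.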
Examples
if (FALSE) { # \dontrun{
# Load Hi-C data
cc <- ChromatinContacts("file.cool") |> import()
# Standard log10 scaling of balanced data
scaled_data <- scaleData(cc, "balanced", log10)
# Raw counts without transformation
scaled_raw <- scaleData(cc, "count", function(x) x)
# Log2 scaling with NA removal
scaled_clean <- scaleData(cc, "balanced", log2, remove_na = TRUE)
# Use with plotting
library(ggplot2)
ggplot() +
  geom_hic(data = scaleData(cc, "balanced", log10),
           aes(seqnames1 = seqnames1, start1 = start1, end1 = end1,
               seqnames2 = seqnames2, start2 = start2, end2 = end2,
               fill = score))
} # }
