geom_manhattan creates a Manhattan plot for visualizing GWAS or QTL data. The function automatically detects whether to use regional mode (single chromosome) or genome-wide mode (multiple chromosomes) based on the data.

In regional mode (single chromosome): Uses scale_x_genome_region() for x-axis formatting consistent with other track functions like ez_coverage and ez_gene. This mode is suitable for LocusZoom-style plots and can be stacked with other tracks.

In genome-wide mode (multiple chromosomes): Uses cumulative base pair positions with chromosome labels on the x-axis and alternating colors per chromosome.
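The cumulative-position idea behind genome-wide mode can be sketched in base R. This is an illustration of the general technique, not the package's internal code: the column names CHR and BP, the use of the largest observed position as a stand-in for chromosome length, and the helper variables chr_len, chr_offset, and label_pos are all assumptions made for the example.

```r
# Toy GWAS table; CHR and BP follow the GWAS-style naming convention.
gwas <- data.frame(
  CHR = c(1, 1, 2, 2, 3),
  BP  = c(100, 500, 50, 300, 200)
)

# Assumption: approximate each chromosome's length by its largest observed position.
chr_len <- tapply(gwas$BP, gwas$CHR, max)

# Offset of a chromosome = summed length of all chromosomes before it.
chr_offset <- setNames(
  c(0, head(cumsum(as.vector(chr_len)), -1)),
  names(chr_len)
)

# Cumulative position = within-chromosome position + chromosome offset.
gwas$BP_cum <- gwas$BP + unname(chr_offset[as.character(gwas$CHR)])

# Place each chromosome label at the midpoint of that chromosome's span.
label_pos <- tapply(gwas$BP_cum, gwas$CHR, function(x) mean(range(x)))

# Alternate two colors by chromosome index (odd: first color, even: second).
palette <- c("grey", "skyblue")
gwas$col <- palette[(as.integer(factor(gwas$CHR)) - 1) %% 2 + 1]

print(gwas)
```

Using real chromosome lengths from a genome build instead of the observed maximum would keep chromosome widths stable across data sets.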

Usage

geom_manhattan(
  mapping = NULL,
  data = NULL,
  region = NULL,
  mode = c("auto", "regional", "genome_wide"),
  stat = "identity",
  position = "identity",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  chr = NULL,
  bp = NULL,
  p = NULL,
  snp = NULL,
  logp = TRUE,
  size = 0.5,
  color = "grey50",
  lead_snp = NULL,
  r2 = NULL,
  colors = NULL,
  highlight_snps = NULL,
  highlight_color = "purple",
  highlight_shape = 18,
  threshold_p = NULL,
  threshold_color = "red",
  threshold_linetype = 2,
  color_by = "auto",
  x_axis_label = NULL,
  y_axis_label = NULL,
  ...
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If not specified, the default mappings are used.

data

A data.frame containing the data to be plotted. Must include columns for chromosome, base pair position, and p-value. May also include an optional SNP identifier column.

region

Optional genomic region string (e.g., "chr1:1000000-2000000") to force regional mode and set x-axis limits. When provided, data is NOT filtered (use ez_manhattan for filtering).
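Because region only sets the x-axis limits, points outside the window remain in the data. If pre-filtering is wanted without ez_manhattan, a region string can be split by hand. The sketch below is a minimal base R illustration; parse_region is a hypothetical helper written for this example, not a function from the package, and it assumes plain digits without thousands separators.

```r
# Hypothetical helper: split "chr:start-end" into its three parts.
parse_region <- function(region) {
  m <- regmatches(region, regexec("^([^:]+):([0-9]+)-([0-9]+)$", region))[[1]]
  if (length(m) != 4) {
    stop("region must look like 'chr1:1000000-2000000'")
  }
  list(chr = m[2], start = as.numeric(m[3]), end = as.numeric(m[4]))
}

reg <- parse_region("chr1:1000000-2000000")

# Toy positions on chr1; keep only those inside the window.
bp <- c(900000, 1500000, 2100000)
inside <- bp >= reg$start & bp <= reg$end
print(bp[inside])
```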

mode

Plot mode: "auto" (default, detect from data), "regional", or "genome_wide".

stat

The statistical transformation to apply to the data (default: "identity").

position

Position adjustment, either as a string naming a position adjustment function, or the result of a call to a position adjustment function.

na.rm

If FALSE (default), missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

Logical. Should this layer be displayed in the legend? NA for automatic, TRUE always, FALSE never.

inherit.aes

If FALSE, overrides the default aesthetics rather than combining with them; if TRUE (default), inherits them.

chr

Name of the chromosome column in data. Supports both GWAS-style ("CHR") and GRanges-style ("seqnames") conventions. Default: auto-detect.

bp

Name of the base pair position column in data. Supports both GWAS-style ("BP") and GRanges-style ("start") conventions. Default: auto-detect.

p

Name of the p-value column in data. Supports "P", "pvalue", "p.value", etc. Default: auto-detect.
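Auto-detection of the chr, bp, and p columns can be pictured as a lookup over candidate names. The sketch below is only an illustration under that assumption: detect_column is a hypothetical helper, the candidate lists contain just the names mentioned above, and the package's actual matching rules (for example case handling) may differ.

```r
# Hypothetical helper: return the first candidate name present in the data.
detect_column <- function(data, candidates) {
  hit <- candidates[candidates %in% names(data)]
  if (length(hit) == 0) {
    stop("none of these columns found: ", paste(candidates, collapse = ", "))
  }
  hit[1]
}

# GRanges-style chromosome/position columns with a "pvalue" column.
df <- data.frame(seqnames = "chr1", start = 12345, pvalue = 1e-4)

chr_col <- detect_column(df, c("CHR", "seqnames"))
bp_col  <- detect_column(df, c("BP", "start"))
p_col   <- detect_column(df, c("P", "pvalue", "p.value"))

print(c(chr = chr_col, bp = bp_col, p = p_col))
```

Passing chr, bp, and p explicitly avoids any ambiguity when a data frame contains more than one plausible column.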

snp

Name of the SNP identifier column in data. Default: auto-detect ("SNP" or "snp").

logp

Logical. If TRUE (default), -log10() transformation is applied to p-values.
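As a quick illustration of the transformation, the base R snippet below applies -log10() to a few toy p-values. It also shows that a p-value of exactly 0 maps to Inf, so such values are worth checking before plotting; the variable names are made up for the example.

```r
# Toy p-values, including an exact zero (for example from numeric underflow).
p <- c(0.05, 1e-3, 5e-8, 0)

# Smaller p-values become larger plotted values.
logp_values <- -log10(p)
print(logp_values)

# Flag values that cannot be placed on a finite axis.
print(which(!is.finite(logp_values)))
```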

size

Point size (default: 0.5).

color

Default point color for regional mode when color_by is not "r2" (default: "grey50").

lead_snp

Vector of SNP IDs to highlight (also accepts lead.snp for backward compatibility).

r2

Vector of R-squared values for linkage disequilibrium (LD) coloring. Must be in the same order as the rows of data.
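Since r2 is matched to data by row order, LD values that come from a separate table usually need to be aligned first. The sketch below shows one way to do that in base R with match(); the tables gwas and ld and their column names are invented for the example.

```r
# Association results in plotting order.
gwas <- data.frame(SNP = c("rs1", "rs2", "rs3"), P = c(1e-9, 1e-4, 0.2))

# LD table in a different order and missing one SNP.
ld <- data.frame(SNP = c("rs3", "rs1"), R2 = c(0.15, 1.00))

# Look up each GWAS row's R-squared; SNPs absent from the LD table become NA.
r2_aligned <- ld$R2[match(gwas$SNP, ld$SNP)]
print(r2_aligned)
```

The aligned vector has one entry per row of the association table and could then be supplied as r2.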

colors

Vector of colors for coloring points. Usage depends on color_by:

  • For discrete columns: colors are recycled/mapped to factor levels

  • For continuous columns: colors define a gradient (default: viridis-like palette)

  • When colors is NULL (the default in the usage above), c("grey", "skyblue") is used, which suits alternating chromosome colors

highlight_snps

Data frame of SNPs to highlight, with columns matching chr, bp, and p.

highlight_color

Color for highlighted SNPs (default: "purple").

highlight_shape

Shape for highlighted SNPs (default: 18).

threshold_p

A numeric value for the p-value threshold to draw a horizontal line (e.g., 5e-8).
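On a -log10 axis, a threshold given as a p-value sits at -log10(threshold_p). The base R sketch below assumes logp = TRUE and uses made-up SNP names; it shows where a 5e-8 line would fall and which toy SNPs would plot above it.

```r
threshold_p <- 5e-8

# Toy p-values named by SNP ID.
p <- c(rs1 = 3e-9, rs2 = 2e-6, rs3 = 4.9e-8)

# Height of the horizontal line on the transformed axis.
y_line <- -log10(threshold_p)
print(y_line)

# SNPs whose transformed value lies above the line (p below the threshold).
above <- names(p)[-log10(p) > y_line]
print(above)
```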

threshold_color

Color for the threshold line (default: "red").

threshold_linetype

Linetype for the threshold line (default: 2).

color_by

How points should be colored. Can be:

  • A column name in data (e.g., "CHR", "gene", "maf"): Colors by that column's values. Discrete columns use colors as a manual palette; continuous columns use a gradient.

  • "r2": Special mode using LD-based gradient coloring (requires r2 parameter)

  • "none": Single color specified by color parameter

  • "auto" (default): Uses "r2" if r2 is provided, otherwise "none"
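The "auto" rule described above can be written as a small decision function. This is a sketch of the documented behavior only; resolve_color_by is a hypothetical name and the package may implement the rule differently.

```r
# Hypothetical helper mirroring the documented "auto" rule.
resolve_color_by <- function(color_by = "auto", r2 = NULL) {
  if (identical(color_by, "auto")) {
    # "auto": LD coloring when r2 is supplied, otherwise a single color.
    if (is.null(r2)) "none" else "r2"
  } else {
    # Any other value ("r2", "none", or a column name) is used as given.
    color_by
  }
}

print(resolve_color_by("auto"))
print(resolve_color_by("auto", r2 = c(0.1, 0.9)))
print(resolve_color_by("CHR"))
```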

x_axis_label

X-axis label (default: NULL, auto-generated).

y_axis_label

Y-axis label (default: NULL, which uses an expression for -log10(P)).

...

Additional arguments passed to ggplot2::geom_point().

Value

A list of ggplot2 layers and scales.