Skip to contents

This article collects notes on Bioconductor packages, made available here to faciliate their use and extensions.

pkgs <- c("AnnotationDbi", "AnnotationFilter", "ComplexHeatmap", "DESeq2", "EnsDb.Hsapiens.v86",
          "FlowSorted.DLPFC.450k", "GeneNet", "GenomicFeatures", "IlluminaHumanMethylation450kmanifest",
          "OUTRIDER","RColorBrewer", "RMariaDB", "Rgraphviz", "S4Vectors", "SummarizedExperiment",
          "TxDb.Hsapiens.UCSC.hg38.knownGene", "bladderbatch", "clusterProfiler",
          "corpcor", "doParallel", "ensembldb", "fdrtool", "graph", "graphite", "heatmaply",
          "minfi", "org.Hs.eg.db", "plyr", "quantro", "recount3", "sva")
es_pkgs <- c("Biobase", "arrayQualityMetrics", "dplyr", "knitr", "mclust", "pQTLtools", "rgl", "scatterplot3d")
se_pkgs <- c("BiocGenerics", "GenomicRanges", "IRanges", "MsCoreUtils", "SummarizedExperiment", "impute")
sp_pkgs <- c("Biostrings", "CAMERA", "MSnbase", "MSstats", "Spectra", "mzR", "protViz", "rawrr")
pkgs <- c(pkgs, es_pkgs,se_pkgs,sp_pkgs)
for (p in pkgs) if (length(grep(paste("^package:", p, "$", sep=""), search())) == 0) {
    if (!requireNamespace(p)) warning(paste0("This vignette needs package `", p, "'; please install"))
}
invisible(suppressMessages(lapply(pkgs, require, character.only = TRUE)))

1 liftover

See inst/turbomanin the source, https://github.com/jinghuazhao/pQTLtools/tree/master/inst/turboman, or turboman/ directory in the installed package.

2 ExpressionSet

We start with Bioconductor/Biobase’s ExpressionSet example and finish with a real application.

dataDirectory <- system.file("extdata", package="Biobase")
exprsFile <- file.path(dataDirectory, "exprsData.txt")
exprs <- as.matrix(read.table(exprsFile, header=TRUE, sep="\t", row.names=1, as.is=TRUE))
pDataFile <- file.path(dataDirectory, "pData.txt")
pData <- read.table(pDataFile, row.names=1, header=TRUE, sep="\t")
all(rownames(pData)==colnames(exprs))
metadata <- data.frame(labelDescription=c("Patient gender",
                                          "Case/control status",
                                          "Tumor progress on XYZ scale"),
                       row.names=c("gender", "type", "score"))
phenoData <- Biobase::AnnotatedDataFrame(data=pData, varMetadata=metadata)
experimentData <- Biobase::MIAME(name="Pierre Fermat",
                                 lab="Francis Galton Lab",
                                 contact="pfermat@lab.not.exist",
                                 title="Smoking-Cancer Experiment",
                                 abstract="An example ExpressionSet",
                                 url="www.lab.not.exist",
                                 other=list(notes="Created from text files"))
exampleSet <- pQTLtools::make_ExpressionSet(exprs,phenoData,experimentData=experimentData,
                                            annotation="hgu95av2")
data(sample.ExpressionSet, package="Biobase")
identical(exampleSet,sample.ExpressionSet)

2.1 data.frame

The great benefit is to use the object directly as a data.frame.

lm.result <- Biobase::esApply(exampleSet,1,function(x) lm(score~gender+x))
beta.x <- unlist(lapply(lapply(lm.result,coef),"[",3))
beta.x[1]
#> AFFX-MurIL2_at.x 
#>    -0.0001907472
lm(score~gender+AFFX.MurIL2_at,data=exampleSet)
#> 
#> Call:
#> lm(formula = score ~ gender + AFFX.MurIL2_at, data = exampleSet)
#> 
#> Coefficients:
#>    (Intercept)      genderMale  AFFX.MurIL2_at  
#>      0.5531725       0.0098932      -0.0001907

2.2 Composite plots

We wish to examine the distribution of each feature via histogram, scatter and boxplot. One could resort to esApply() for its simplicity as before

invisible(Biobase::esApply(exampleSet[1:2],1,function(x)
                           {par(mfrow=c(3,1));boxplot(x);hist(x);plot(x)}
))

but it is nicer to add feature name in the title.

par(mfrow=c(1,3))
f <- Biobase::featureNames(exampleSet[1:2])
invisible(sapply(f,function(x) {
                     d <- t(Biobase::exprs(exampleSet[x]))
                     fn <- Biobase::featureNames(exampleSet[x])
                     hist(d,main="",xlab=fn); plot(d, ylab=fn); boxplot(d,ylab=fn)
                   }
          )
)
Histogram, scatter & boxplots

Figure 2.1: Histogram, scatter & boxplots

Histogram, scatter & boxplots

Figure 2.2: Histogram, scatter & boxplots

where the expression set is indexed using feature name(s).

2.3 Outlier detections

This illustrates one mechanism,

list_outliers <- function(es, method="upperquartile")
                 arrayQualityMetrics::outliers(exprs(es),method=method)
for (method in c("KS","sum","upperquartile"))
{
  ZWK_outliers <- list_outliers(protein_ZWK,method=method)
  print(ZWK_outliers@statistic[ZWK_outliers@which])
}

2.4 Clustering

We employ model-based clustering absed on principal compoents to see potential groupings in the data,

  pca <- prcomp(na.omit(t(Biobase::exprs(exampleSet))), rank=10, scale=TRUE)
  pc1pc2pc3 <- with(pca,x)[,1:3]
  mc <- mclust::Mclust(pc1pc2pc3,G=3)
  with(mc, {
      cols <- c("blue","red", "purple")
      s3d <- scatterplot3d::scatterplot3d(with(pca,x[,c(2,1,3)]),
                                          color=cols[classification],
                                          pch=16,
                                          type="h",
                                          main="Plot of the PC1, PC2 and PC3")
      s3d.coords <- s3d$xyz.convert(with(pca,x[,c(2,1,3)]))
      text(s3d.coords$x, 
           s3d.coords$y,   
           cex = 1.2,
           col = cols[classification],
           labels = row.names(pc1pc2pc3),
           pos = 4)
      legend("right", legend=levels(as.factor(classification)), col=cols[classification], pch=16)
      rgl::open3d(width = 500, height = 500)
      rgl::plot3d(with(pca,x[,c(2,1,3)]),cex=1.2,col=cols[classification],size=5)
      rgl::text3d(with(pca,x[,c(2,1,3)]),cex=1.2,col=cols[classification],texts=row.names(pc1pc2pc3))
      htmlwidgets::saveWidget(rgl::rglwidget(), file = "mcpca3d.html")
  })
Three-group Clustering

Figure 2.3: Three-group Clustering

#> Warning in par3d(userMatrix = structure(c(1, 0, 0, 0, 0, 0.342020143325668, :
#> parameter "width" cannot be set
#> Warning in par3d(userMatrix = structure(c(1, 0, 0, 0, 0, 0.342020143325668, :
#> parameter "height" cannot be set

An interactive version is also available,

2.5 Data transformation

Suppose we wish use log2 for those greater than zero but set those negative values to be missing.

log2.na <- function(x) log2(ifelse(x>0, x, NA))
Biobase::exprs(exampleSet) <- log2.na(Biobase::exprs(exampleSet))

2.6 Limit of detection (LOD)

We generate a lod.max ~ U[0,1] variable to experiment

Biobase::fData(exampleSet)
#> data frame with 0 columns and 500 rows
Biobase::fData(exampleSet)$lod.max <- apply(Biobase::exprs(exampleSet),1,quantile,runif(nrow(exampleSet)))
lod <- pQTLtools::get.prop.below.LLOD(exampleSet)
x <- dplyr::arrange(Biobase::fData(lod),desc(pc.belowLOD.new))
knitr::kable(head(lod))
AFFX.MurIL2_at AFFX.MurIL10_at AFFX.MurIL4_at AFFX.MurFAS_at AFFX.BioB.5_at AFFX.BioB.M_at gender type score
A 192.7420 97.13700 45.81920 22.54450 96.7875 89.0730 Female Control 0.75
B 85.7533 126.19600 8.83135 3.60093 30.4380 25.8461 Male Case 0.40
C 176.7570 77.92160 33.06320 14.68830 46.1271 57.2033 Male Control 0.73
D 135.5750 93.37130 28.70720 12.33970 70.9319 69.9766 Male Case 0.42
E 64.4939 24.39860 5.94492 36.86630 56.1744 49.5822 Female Case 0.93
F 76.3569 85.50880 28.29250 11.25680 42.6756 26.1262 Male Control 0.22
G 160.5050 98.90860 30.96940 23.00340 86.5156 75.0083 Male Case 0.96
H 65.9631 81.69320 14.79230 16.21340 30.7927 42.3352 Male Case 0.79
I 56.9039 97.80150 14.23990 12.03750 19.7183 41.1207 Female Case 0.37
J 135.6080 90.48380 34.48740 4.54978 46.3520 91.5307 Male Control 0.63
K 63.4432 70.57330 20.35210 8.51782 39.1326 39.9136 Male Case 0.26
L 78.2126 94.54180 14.15540 27.28520 41.7698 49.8397 Female Control 0.36
M 83.0943 75.34550 20.62510 10.16160 80.2197 63.4794 Male Case 0.41
N 89.3372 68.58270 15.92310 20.24880 36.4903 24.7007 Male Case 0.80
O 91.0615 87.40500 20.15790 15.78490 36.4021 47.4641 Female Case 0.10
P 95.9377 84.45810 27.81390 14.32760 35.3054 47.3578 Female Control 0.41
Q 179.8450 87.68060 32.79110 15.94880 58.6239 58.1331 Female Case 0.16
R 152.4670 108.03200 33.52920 14.67530 114.0620 104.1220 Male Control 0.72
S 180.8340 134.26300 19.81720 -7.91911 93.4402 115.8310 Male Case 0.17
T 85.4146 91.40310 20.41900 12.88750 22.5168 58.1224 Female Case 0.74
U 157.9890 -8.68811 26.87200 11.91860 48.6462 73.4221 Male Control 0.35
V 146.8000 85.02120 31.14880 12.83240 90.2215 64.6066 Female Control 0.77
W 93.8829 79.29980 22.34200 11.13900 42.0053 40.3068 Male Control 0.27
X 103.8550 71.65520 19.01350 7.55564 57.5738 41.8209 Male Control 0.98
Y 64.4340 64.23690 12.16860 19.98490 44.8216 46.1087 Female Case 0.94
Z 175.6150 78.70680 17.37800 8.96849 61.7044 49.4122 Female Case 0.32
plot(x[,2], main="Random quantile cutoff", ylab="<lod%")
LOD based on a random cutoff

Figure 2.4: LOD based on a random cutoff

The quantity has been shown to have a big impact on protein abundance and therefore pQTL detection as is shown with a real example.

rm(list=ls())
dir <- "~/rds/post_qc_data/interval/phenotype/olink_proteomics/post-qc/"
eset <- readRDS(paste0(dir,"eset.inf1.flag.out.outlier.in.rds"))
x <- pQTLtools::get.prop.below.LLOD(eset)
annot <- Biobase::fData(x)
annot$MissDataProp <- as.numeric(gsub("\\%$", "", annot$MissDataProp))
plot(annot$MissDataProp, annot$pc.belowLOD.new, xlab="% <LLOD in Original",
     ylab="% <LLOD in post QC dataset", pch=19)
INF <- Sys.getenv("INF")
np <- read.table(paste(INF, "work", "INF1.merge.nosig", sep="/"), header=FALSE,
                 col.names = c("prot", "uniprot"))
kable(np, caption="Proteins with no pQTL")
annot$pQTL <- rep(NA, nrow(annot))
no.pQTL.ind <- which(annot$uniprot.id %in% np$uniprot)
annot$pQTL[no.pQTL.ind] <- "red"
annot$pQTL[-no.pQTL.ind] <- "blue"
annot <- annot[order(annot$pc.belowLOD.new, decreasing = TRUE),]
annot <- annot[-grep("^BDNF$", annot$ID),]
saveRDS(annot,file=file.path("~","pQTLtools","tests","annot.RDS"))
annot <- readRDS(file.path(find.package("pQTLtools"),"tests","annot.RDS")) %>%
         dplyr::left_join(pQTLdata::inf1[c("prot","target.short","alt_name")],by=c("ID"="prot")) %>%
         dplyr::mutate(prot=if_else(is.na(alt_name),target.short,alt_name),order=1:n()) %>%
         dplyr::arrange(desc(order))
xtick <- seq(1, nrow(annot))
attach(annot)
par(mar=c(10,5,1,1))
plot(100-pc.belowLOD.new,cex=2,pch=19,col=pQTL,xaxt="n",xlab="",ylab="",cex.axis=0.8)
text(66,16,"IL-17C",offset=0,pos=2,cex=1.5,font=2,srt=0)
arrows(67,16,71,16,lwd=2)
axis(1, at=xtick, labels=prot, lwd.tick=0.5, lwd=0, las=2, hadj=1, cex.axis=0.8)
mtext("% samples above LLOD",side=2,line=2.5,cex=1.2)
mtext("Ordered protein",side=1,line=6.5,cex=1.2,font=1)
legend(x=1,y=25,c("without pQTL","with pQTL"),box.lwd=0,cex=2,col=c("red","blue"),pch=19)
LOD in SCALLOP-INF/INTERVAL

Figure 2.5: LOD in SCALLOP-INF/INTERVAL

detach(annot)
knitr::kable(annot,caption="Summary statistics",row.names=FALSE)
Table 2.1: Summary statistics
ID dichot olink.id uniprot.id lod.max MissDataProp pc.belowLOD.new pQTL target.short alt_name prot order
CSF.1 FALSE 196_CSF-1 P09603 1.02 0.02 0.0000000 blue CSF-1 NA CSF-1 91
TNFB FALSE 195_TNFB P01374 1.10 0.04 0.0000000 blue TNFB NA TNFB 90
ADA FALSE 194_ADA P00813 1.80 0.04 0.0000000 blue ADA NA ADA 89
STAMPB FALSE 192_STAMPB O95630 1.40 0.06 0.0000000 red STAMPB NA STAMPB 88
CCL20 FALSE 190_CCL20 P78556 1.42 0.02 0.0000000 blue CCL20 NA CCL20 87
TWEAK FALSE 189_TWEAK O43508 1.78 0.02 0.0000000 blue TWEAK NA TWEAK 86
TNFRSF9 FALSE 187_TNFRSF9 Q07011 1.86 0.02 0.0000000 blue TNFRSF9 NA TNFRSF9 85
CX3CL1 FALSE 186_CX3CL1 P78423 1.76 0.02 0.0000000 blue CX3CL1 NA CX3CL1 84
CCL25 FALSE 185_CCL25 O15444 1.32 0.02 0.0000000 blue CCL25 NA CCL25 83
CASP.8 FALSE 184_CASP-8 Q14790 1.38 0.04 0.0000000 blue CASP-8 NA CASP-8 82
MCP.2 FALSE 183_MCP-2 P80075 2.10 0.02 0.0000000 blue MCP-2 CCL8 CCL8 81
FGF.19 FALSE 179_FGF-19 O95750 1.25 0.02 0.0000000 blue FGF-19 NA FGF-19 80
CD40 FALSE 176_CD40 P25942 1.91 0.04 0.0000000 blue CD40 NA CD40 79
DNER FALSE 174_DNER Q8NFT8  1.63 0.02 0.0000000 blue DNER NA DNER 78
4E.BP1 FALSE 170_4E-BP1 Q13541 1.04 0.10 0.0000000 blue 4EBP1 NA 4EBP1 77
CXCL10 FALSE 169_CXCL10 P02778 2.03 0.02 0.0000000 blue CXCL10 NA CXCL10 76
CXCL6 FALSE 168_CXCL6 P80162 1.80 0.02 0.0000000 blue CXCL6 CXCL6 CXCL6 75
Flt3L FALSE 167_Flt3L P49771 2.12 0.02 0.0000000 blue FIt3L NA FIt3L 74
CD5 FALSE 165_CD5 P06127 1.42 0.02 0.0000000 blue CD5 NA CD5 73
CCL23 FALSE 164_CCL23 P55773 2.10 0.02 0.0000000 blue CCL23 NA CCL23 72
MMP.10 FALSE 161_MMP-10 P09238 1.56 0.02 0.0000000 blue MMP-10 NA MMP-10 71
IL.12B FALSE 157_IL-12B P29460 0.85 0.02 0.0000000 blue IL-12B NA IL-12B 70
HGF FALSE 156_HGF P14210 1.77 0.02 0.0000000 blue HGF NA HGF 69
TRANCE FALSE 155_TRANCE O14788 1.51 0.04 0.0000000 blue TRANCE NA TRANCE 68
CXCL5 FALSE 154_CXCL5 P42830 1.92 0.02 0.0000000 blue CXCL5 NA CXCL5 67
PD.L1 FALSE 152_PD-L1 Q9NZQ7 2.14 0.04 0.0000000 blue PD-L1 NA PD-L1 66
IL.18R1 FALSE 151_IL-18R1 Q13478 1.21 0.02 0.0000000 blue IL-18R1 NA IL-18R1 65
IL.10RB FALSE 149_IL-10RB Q08334 1.72 0.02 0.0000000 blue IL10RB NA IL10RB 64
CCL19 FALSE 145_CCL19 Q99731 2.13 0.02 0.0000000 blue CCL19 NA CCL19 63
FGF.21 FALSE 144_FGF-21 Q9NSA1 1.14 0.02 0.0000000 blue FGF-21 NA FGF-21 62
LIF.R FALSE 143_LIF-R P42702 2.08 0.04 0.0000000 blue LIF-R NA LIF-R 61
FGF.23 FALSE 139_FGF-23 Q9GZV9 0.66 0.04 0.0000000 blue FGF-23 NA FGF-23 60
TNFSF14 FALSE 138_TNFSF14 O43557 1.81 0.02 0.0000000 blue TNFSF14 NA TNFSF14 59
CCL11 FALSE 137_CCL11 P51671 2.09 0.02 0.0000000 blue CCL11 CCL11 CCL11 58
MCP.4 FALSE 136_MCP-4 Q99616 0.56 0.04 0.0000000 blue MCP-4 CCL13 CCL13 57
TGF.alpha FALSE 135_TGF-alpha P01135 0.07 0.02 0.0000000 blue TGF-alpha NA TGF-alpha 56
IL.18 FALSE 133_IL-18 Q14116 2.79 0.02 0.0000000 blue IL-18 NA IL-18 55
SCF FALSE 132_SCF P21583 2.30 0.02 0.0000000 blue SCF NA SCF 54
CD6 FALSE 131_CD6 Q8WWJ7 1.78 0.04 0.0000000 blue CD6 NA CD6 53
CCL4 FALSE 130_CCL4 P13236 2.15 0.02 0.0000000 blue CCL4 NA CCL4 52
CXCL1 FALSE 128_CXCL1 P09341 3.45 0.02 0.0000000 blue CXCL1 NA CXCL1 51
OSM FALSE 126_OSM P13725 1.32 0.06 0.0000000 blue OSM NA OSM 50
CST5 FALSE 123_CST5 P28325 4.34 0.04 0.0000000 blue CST5 NA CST5 49
CXCL9 FALSE 122_CXCL9 Q07325 1.04 0.02 0.0000000 blue CXCL9 NA CXCL9 48
TRAIL FALSE 120_TRAIL P50591 1.22 0.02 0.0000000 blue TRAIL NA TRAIL 47
CXCL11 FALSE 117_CXCL11 O14625 1.40 0.02 0.0000000 blue CXCL11 NA CXCL11 46
MCP.1 FALSE 115_MCP-1 P13500 1.90 0.02 0.0000000 blue MCP-1 CCL2 CCL2 45
uPA FALSE 112_uPA P00749 1.75 0.02 0.0000000 blue uPA NA uPA 44
LAP.TGF.beta.1 FALSE 111_LAP TGF-beta-1 P01137 0.86 0.02 0.0000000 blue LAP TGF-beta-1 NA LAP TGF-beta-1 43
OPG FALSE 110_OPG O00300 1.81 0.02 0.0000000 blue OPG NA OPG 42
CD244 FALSE 108_CD244 Q9BZW8 1.09 0.02 0.0000000 blue CD244 NA CD244 41
CDCP1 FALSE 107_CDCP1 Q9H5V8 0.21 0.04 0.0000000 blue CDCP1 NA CDCP1 40
VEGF.A FALSE 102_VEGF-A P15692 3.15 0.02 0.0000000 blue VEGF_A NA VEGF_A 39
MIP.1.alpha FALSE 166_MIP-1 alpha P10147 1.92 0.06 0.0203998 blue MIP-1 alpha NA MIP-1 alpha 38
MMP.1 FALSE 142_MMP-1 P03956 1.89 0.06 0.0407997 blue MMP-1 NA MMP-1 37
SIRT2 FALSE 172_SIRT2 Q8IXJ6 1.52 1.00 0.0815993 blue SIRT2 NA SIRT2 36
IL.8 FALSE 101_IL-8 P10145 3.17 0.25 0.2447980 blue IL-8 NA IL-8 35
IL.7 FALSE 109_IL-7 P13232 1.39 1.43 0.7547940 blue IL-7 NA IL-7 34
SLAMF1 FALSE 134_SLAMF1 Q13291 1.55 2.13 1.5299878 blue SLAMF1 NA SLAMF1 33
EN.RAGE FALSE 175_EN-RAGE P80511 1.57 3.28 3.5291718 blue EN-RAGE NA EN-RAGE 32
Beta.NGF FALSE 153_Beta-NGF P01138 1.23 3.96 4.2227662 blue Beta-NGF NA Beta-NGF 31
CCL28 FALSE 173_CCL28 Q9NRJ3 0.92 5.49 5.2835577 red CCL28 NA CCL28 30
IL.10 FALSE 162_IL-10 P22301 1.50 7.71 7.0583435 blue IL-10 NA IL-10 29
AXIN1 FALSE 118_AXIN1 O15169 1.30 13.04 10.8731130 red AXIN1 NA AXIN1 28
NT.3 FALSE 188_NT-3 P20783 0.85 12.77 13.2394941 blue NT-3 NA NT-3 27
FGF.5 FALSE 141_FGF-5 Q8NF90 1.00 31.71 30.3957568 blue FGF-5 NA FGF-5 26
ST1A1 FALSE 191_ST1A1 P50225 1.34 32.68 31.2525500 blue ST1A1 NA ST1A1 25
GDNF FALSE 106_GDNF P39905 1.13 43.26 42.4520604 blue hGDNF NA hGDNF 24
IL.10RA FALSE 140_IL-10RA Q13651 1.14 53.30 53.9779682 red IL-10RA NA IL-10RA 23
IL.6 FALSE 113_IL-6 P05231 1.97 69.06 68.6250510 blue IL-6 NA IL-6 22
IL.5 FALSE 193_IL-5 P05113 1.55 78.21 77.8457772 red IL-5 NA IL-5 21
MCP.3 FALSE 105_MCP-3 P80098 1.31 79.30 79.0697674 blue MCP-3 CCL7 CCL7 20
IL.17C FALSE 114_IL-17C Q9P0M4 1.28 83.07 83.3741330 blue IL-17C NA IL-17C 19
IL.17A FALSE 116_IL-17A Q16552 0.97 84.73 84.7001224 red IL-17A NA IL-17A 18
IL.4 FALSE 180_IL-4 P05112 1.11 93.91 94.4104447 red IL-4 NA IL-4 17
LIF FALSE 181_LIF P15018 1.45 94.90 94.8184415 red LIF NA LIF 16
TNF FALSE 163_TNF P01375 1.19 95.22 95.1244390 red TNF NA TNF 15
IL.20RA FALSE 121_IL-20RA Q9UHF4 0.79 95.16 95.7772338 red IL-20RA NA IL-20RA 14
ARTN FALSE 160_ARTN Q5T4W7 0.73 95.84 95.9404325 red ARTN NA ARTN 13
IL.15RA FALSE 148_IL-15RA Q13261 1.18 96.10 96.1648307 blue IL-15RA NA IL-15RA 12
IL.13 FALSE 159_IL-13 P35225 1.71 96.51 96.5524276 red IL-13 NA IL-13 11
IL.20 FALSE 171_IL-20 Q9NYY1 1.30 96.72 96.7156263 red IL-20 NA IL-20 10
IL.24 FALSE 158_IL-24 Q13007 1.95 97.13 97.0828233 red IL-24 NA IL-24 9
IL.2RB FALSE 124_IL-2RB P14784 1.47 97.11 97.1032232 red IL-2RB NA IL-2RB 8
NRTN FALSE 182_NRTN Q99748 1.39 97.11 97.1236230 red NRTN NA NRTN 7
IL.33 FALSE 177_IL-33 O95760 1.17 98.54 98.5516116 red IL-33 NA IL-33 6
TSLP FALSE 129_TSLP Q969D9 1.80 99.49 99.4900041 red TSLP NA TSLP 5
IFN.gamma FALSE 178_IFN-gamma P01579 1.40 99.53 99.5308038 red IFN-gamma NA IFN-gamma 4
IL.22.RA1 FALSE 150_IL-22 RA1 Q8N6P7 1.79 99.61 99.6124031 red IL-22RA1 NA IL-22RA1 3
IL.2 FALSE 127_IL-2 P60568 0.93 99.86 99.8776010 red IL-2 NA IL-2 2
IL.1.alpha FALSE 125_IL-1 alpha P01583 4.76 99.88 99.8776010 blue IL-1 alpha NA IL-1 alpha 1

2.7 maEndtoEnd

Web: https://bioconductor.org/packages/release/workflows/html/maEndToEnd.html.

Examples can be found on PCA, heatmap, normalisation, linear models, and enrichment analysis from this Bioconductor package.

3 SummarizedExperiment

This is a more modern construct. Based on the documentation example, ranged summarized experiment (rse) and imputation are shown below.

set.seed(123)
nrows <- 20
ncols <- 4
counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
missing_indices <- sample(length(counts), size = 5, replace = FALSE)
counts[missing_indices] <- NA
rowRanges <- GenomicRanges::GRanges(rep(c("chr1", "chr2"), c(1, 3) * nrows / 4),
                            IRanges::IRanges(floor(runif(nrows, 1e5, 1e6)), width=ncols),
                            strand=sample(c("+", "-"), nrows, TRUE),
                            feature_id=sprintf("ID%03d", 1:nrows))
colData <- S4Vectors::DataFrame(Treatment=rep(c("ChIP", "Input"), ncols/2),
                                row.names=LETTERS[1:ncols])
rse <- SummarizedExperiment::SummarizedExperiment(assays=S4Vectors::SimpleList(counts=counts),
                                                  rowRanges=rowRanges, colData=colData)
SummarizedExperiment::assay(rse)
#>               A         B         C           D
#>  [1,] 2876.4876 8895.5036 1428.8574 6651.486831
#>  [2,] 7883.2630 6928.3413 4146.0488  949.311769
#>  [3,] 4090.3602 6405.4276 4137.8295 3840.312408
#>  [4,] 8830.2910 9942.7035 3689.0857 2744.562062
#>  [5,] 9404.7324 6557.4023 1525.2950 8146.585749
#>  [6,]        NA 7085.5962 1388.9218 4485.714898
#>  [7,] 5281.5268 5441.1162 2331.1080 8100.833466
#>  [8,] 8924.2980 5941.8261 4660.1585 8124.082706
#>  [9,] 5514.7987 2892.3082 2660.4604 7943.628869
#> [10,] 4566.6907 1471.9894        NA 4398.877044
#> [11,] 9568.3766        NA  459.2658 7544.997111
#> [12,] 4533.8882 9023.0882 4422.5585          NA
#> [13,] 6776.0288 6907.3621 7989.4495 7102.113831
#> [14,] 5726.7614 7954.8787 1219.8707    7.247108
#> [15,] 1030.1439  247.1122 5609.9189 4753.690424
#> [16,] 8998.3499 4778.4819 2066.1074 2201.968733
#> [17,] 2461.6313 7584.8369 1276.1890 3798.785561
#> [18,]  421.5533 2164.8630 7533.3253 6128.097262
#> [19,] 3279.8793        NA 8950.5585 3518.627295
#> [20,] 9545.0820 2317.0262 3745.2533 1112.243108
imputed <- MsCoreUtils::impute_knn(as.matrix(SummarizedExperiment::assay(rse)),2)
imputed_counts <- MsCoreUtils::impute_RF(as.matrix(SummarizedExperiment::assay(rse)),2)
imputed-imputed_counts
#>              A         B        C        D
#>  [1,]    0.000    0.0000    0.000    0.000
#>  [2,]    0.000    0.0000    0.000    0.000
#>  [3,]    0.000    0.0000    0.000    0.000
#>  [4,]    0.000    0.0000    0.000    0.000
#>  [5,]    0.000    0.0000    0.000    0.000
#>  [6,] 2559.339    0.0000    0.000    0.000
#>  [7,]    0.000    0.0000    0.000    0.000
#>  [8,]    0.000    0.0000    0.000    0.000
#>  [9,]    0.000    0.0000    0.000    0.000
#> [10,]    0.000    0.0000 1791.909    0.000
#> [11,]    0.000 -709.2786    0.000    0.000
#> [12,]    0.000    0.0000    0.000 2467.614
#> [13,]    0.000    0.0000    0.000    0.000
#> [14,]    0.000    0.0000    0.000    0.000
#> [15,]    0.000    0.0000    0.000    0.000
#> [16,]    0.000    0.0000    0.000    0.000
#> [17,]    0.000    0.0000    0.000    0.000
#> [18,]    0.000    0.0000    0.000    0.000
#> [19,]    0.000 1298.3456    0.000    0.000
#> [20,]    0.000    0.0000    0.000    0.000
SummarizedExperiment::assays(rse) <- S4Vectors::SimpleList(counts=imputed_counts)
SummarizedExperiment::assay(rse)
#>               A         B         C           D
#>  [1,] 2876.4876 8895.5036 1428.8574 6651.486831
#>  [2,] 7883.2630 6928.3413 4146.0488  949.311769
#>  [3,] 4090.3602 6405.4276 4137.8295 3840.312408
#>  [4,] 8830.2910 9942.7035 3689.0857 2744.562062
#>  [5,] 9404.7324 6557.4023 1525.2950 8146.585749
#>  [6,] 4439.2947 7085.5962 1388.9218 4485.714898
#>  [7,] 5281.5268 5441.1162 2331.1080 8100.833466
#>  [8,] 8924.2980 5941.8261 4660.1585 8124.082706
#>  [9,] 5514.7987 2892.3082 2660.4604 7943.628869
#> [10,] 4566.6907 1471.9894 4812.3872 4398.877044
#> [11,] 9568.3766 5545.7565  459.2658 7544.997111
#> [12,] 4533.8882 9023.0882 4422.5585 4611.837102
#> [13,] 6776.0288 6907.3621 7989.4495 7102.113831
#> [14,] 5726.7614 7954.8787 1219.8707    7.247108
#> [15,] 1030.1439  247.1122 5609.9189 4753.690424
#> [16,] 8998.3499 4778.4819 2066.1074 2201.968733
#> [17,] 2461.6313 7584.8369 1276.1890 3798.785561
#> [18,]  421.5533 2164.8630 7533.3253 6128.097262
#> [19,] 3279.8793 4566.2289 8950.5585 3518.627295
#> [20,] 9545.0820 2317.0262 3745.2533 1112.243108
SummarizedExperiment::assays(rse) <- S4Vectors::endoapply(SummarizedExperiment::assays(rse), asinh)
SummarizedExperiment::assay(rse)
#>              A        B        C        D
#>  [1,] 8.657472 9.786448 7.957778 9.495743
#>  [2,] 9.665644 9.536523 9.023058 7.548885
#>  [3,] 9.009536 9.458048 9.021074 8.946456
#>  [4,] 9.779090 9.897741 8.906281 8.610524
#>  [5,] 9.842115 9.481497 8.023090 9.698501
#>  [6,] 9.091398 9.558966 7.929430 9.101800
#>  [7,] 9.265118 9.294887 8.447246 9.692869
#>  [8,] 9.789680 9.382919 9.139952 9.695735
#>  [9,] 9.308338 8.662957 8.579402 9.673273
#> [10,] 9.119691 7.987517 9.172096 9.082252
#> [11,] 9.859366 9.313936 6.822778 9.621787
#> [12,] 9.112482 9.800689 9.087621 9.129529
#> [13,] 9.514294 9.533490 9.679024 9.561295
#> [14,] 9.346053 9.674688 7.799647 2.678476
#> [15,] 7.630601 6.202994 9.325439 9.159824
#> [16,] 9.797944 9.165025 8.326569 8.390254
#> [17,] 8.501727 9.627054 7.844781 8.935584
#> [18,] 6.737095 8.373260 9.620239 9.413787
#> [19,] 8.788709 9.119590 9.792618 8.858973
#> [20,] 9.856929 8.441187 8.921392 7.707281

SummarizedExperiment::rowRanges(rse)
#> GRanges object with 20 ranges and 1 metadata column:
#>        seqnames        ranges strand |  feature_id
#>           <Rle>     <IRanges>  <Rle> | <character>
#>    [1]     chr1 409164-409167      - |       ID001
#>    [2]     chr1 691082-691085      + |       ID002
#>    [3]     chr1 388335-388338      + |       ID003
#>    [4]     chr1 268922-268925      - |       ID004
#>    [5]     chr1 804064-804067      - |       ID005
#>    ...      ...           ...    ... .         ...
#>   [16]     chr2 647861-647864      + |       ID016
#>   [17]     chr2 469620-469623      + |       ID017
#>   [18]     chr2 232385-232388      + |       ID018
#>   [19]     chr2 941769-941772      - |       ID019
#>   [20]     chr2 371106-371109      + |       ID020
#>   -------
#>   seqinfo: 2 sequences from an unspecified genome; no seqlengths
SummarizedExperiment::rowData(rse)
#> DataFrame with 20 rows and 1 column
#>      feature_id
#>     <character>
#> 1         ID001
#> 2         ID002
#> 3         ID003
#> 4         ID004
#> 5         ID005
#> ...         ...
#> 16        ID016
#> 17        ID017
#> 18        ID018
#> 19        ID019
#> 20        ID020
SummarizedExperiment::colData(rse)
#> DataFrame with 4 rows and 1 column
#>     Treatment
#>   <character>
#> A        ChIP
#> B       Input
#> C        ChIP
#> D       Input

4 Normalisation

4.1 ComBat

This is the documentation example, based on Bioconductor 3.14.

data(bladderdata, package="bladderbatch")
edat <- bladderEset[1:50]

pheno <- Biobase::pData(edat)
batch <- pheno$batch
table(batch)
#> batch
#>  1  2  3  4  5 
#> 11 18  4  5 19
quantro::matboxplot(edat,batch,cex.axis=0.6,notch=TRUE,pch=19,ylab="Expression")
ComBat example

Figure 4.1: ComBat example

quantro::matdensity(edat,batch,xlab=" ",ylab="density")
legend("topleft",legend=1:5,col=1:5,lty=1)
ComBat example

Figure 4.2: ComBat example


# 1. parametric adjustment
combat_edata1 <- sva::ComBat(dat=edat, batch=batch, par.prior=TRUE, prior.plots=TRUE)
#> Found5batches
#> Adjusting for0covariate(s) or covariate level(s)
#> Standardizing Data across genes
#> Fitting L/S model and finding priors
ComBat example

Figure 4.3: ComBat example

#> Finding parametric adjustments
#> Adjusting the Data

# 2. non-parametric adjustment, mean-only version
combat_edata2 <- sva::ComBat(dat=edat, batch=batch, par.prior=FALSE, mean.only=TRUE)
#> Using the 'mean only' version of ComBat
#> Found5batches
#> Adjusting for0covariate(s) or covariate level(s)
#> Standardizing Data across genes
#> Fitting L/S model and finding priors
#> Finding nonparametric adjustments
#> Adjusting the Data

# 3. reference-batch version, with covariates
mod <- model.matrix(~as.factor(cancer), data=pheno)
combat_edata3 <- sva::ComBat(dat=edat, batch=batch, mod=mod, par.prior=TRUE, ref.batch=3, prior.plots=TRUE)
#> Using batch =3as a reference batch (this batch won't change)
#> Found5batches
#> Adjusting for2covariate(s) or covariate level(s)
#> Standardizing Data across genes
#> Fitting L/S model and finding priors
ComBat example

Figure 4.4: ComBat example

#> Finding parametric adjustments
#> Adjusting the Data

4.2 quantro

This is also adapted from the package vignette but with FlowSorted.DLPFC.450k in place of FlowSorted.

data(FlowSorted.DLPFC.450k,package="FlowSorted.DLPFC.450k")
p <- getBeta(FlowSorted.DLPFC.450k,offset=100)
pd <- Biobase::pData(FlowSorted.DLPFC.450k)
quantro::matboxplot(p, groupFactor = pd$CellType, xaxt = "n", main = "Beta Values", pch=19)
quantro example

Figure 4.5: quantro example

quantro::matdensity(p, groupFactor = pd$CellType, xlab = " ", ylab = "density",
                    main = "Beta Values", brewer.n = 8, brewer.name = "Dark2")
legend('top', c("NeuN_neg", "NeuN_pos"), col = c(1, 2), lty = 1, lwd = 3)
quantro example

Figure 4.6: quantro example

qtest <- quantro::quantro(object = p, groupFactor = pd$CellType)
#> [quantro] Average medians of the distributions are 
#>                         not equal across groups.
#> [quantro] Calculating the quantro test statistic.
#> [quantro] No permutation testing performed. 
#>                          Use B > 0 for permutation testing.
if (FALSE)
{
  doParallel::registerDoParallel(cores=10)
  qtestPerm <- quantro::quantro(p, groupFactor = pd$CellType, B = 1000)
  quantro::quantroPlot(qtestPerm)
}

5 Outlier detection in RNA-Seq

The following is adapted from package OUTRIDER, noting its plotQQ() has issues

ctsFile <- system.file('extdata', 'KremerNBaderSmall.tsv', package='OUTRIDER')
ctsTable <- read.table(ctsFile, check.names=FALSE)
ods <- OUTRIDER::OutriderDataSet(countData=ctsTable)
ods <- OUTRIDER::filterExpression(ods, minCounts=TRUE, filterGenes=TRUE)
#> 229 genes did not pass the filter due to zero counts. This is 22.9% of the genes.
ods <- OUTRIDER::OUTRIDER(ods)
#> Tue Mar 25 10:41:03 2025: SizeFactor estimation ...
#> Tue Mar 25 10:41:03 2025: Controlling for confounders ...
#> Using estimated q with: 23
#> Tue Mar 25 10:41:03 2025: Using the autoencoder implementation for controlling.
#> [1] "Tue Mar 25 10:41:07 2025: Initial PCA loss: 4.73997327486604"
#> [1] "Tue Mar 25 10:41:11 2025: Iteration: 1 loss: 4.19416269506454"
#> [1] "Tue Mar 25 10:41:14 2025: Iteration: 2 loss: 4.17550752619036"
#> [1] "Tue Mar 25 10:41:16 2025: Iteration: 3 loss: 4.16639365666912"
#> [1] "Tue Mar 25 10:41:18 2025: Iteration: 4 loss: 4.16142359470334"
#> [1] "Tue Mar 25 10:41:20 2025: Iteration: 5 loss: 4.15785341106832"
#> [1] "Tue Mar 25 10:41:22 2025: Iteration: 6 loss: 4.15533343090454"
#> [1] "Tue Mar 25 10:41:23 2025: Iteration: 7 loss: 4.15339892434562"
#> [1] "Tue Mar 25 10:41:25 2025: Iteration: 8 loss: 4.15175378925737"
#> [1] "Tue Mar 25 10:41:26 2025: Iteration: 9 loss: 4.15069201289976"
#> [1] "Tue Mar 25 10:41:27 2025: Iteration: 10 loss: 4.1501222741797"
#> [1] "Tue Mar 25 10:41:29 2025: Iteration: 11 loss: 4.14904801948777"
#> [1] "Tue Mar 25 10:41:30 2025: Iteration: 12 loss: 4.14805452270911"
#> [1] "Tue Mar 25 10:41:31 2025: Iteration: 13 loss: 4.14796461892655"
#> [1] "Tue Mar 25 10:41:32 2025: Iteration: 14 loss: 4.14722109314569"
#> [1] "Tue Mar 25 10:41:33 2025: Iteration: 15 loss: 4.14696284053289"
#> Time difference of 25.25854 secs
#> [1] "Tue Mar 25 10:41:33 2025: 15 Final nb-AE loss: 4.14696284053289"
#> Tue Mar 25 10:41:33 2025: Used the autoencoder implementation for controlling.
#> Tue Mar 25 10:41:33 2025: P-value calculation ...
#> Tue Mar 25 10:41:37 2025: Zscore calculation ...
res <- OUTRIDER::results(ods)
knitr::kable(res,caption="A check list of outliers")
Table 5.1: A check list of outliers
geneID sampleID pValue padjust zScore l2fc rawcounts meanRawcounts normcounts meanCorrected theta aberrant AberrantBySample AberrantByGene padj_rank
ATAD3C MUC1360 0.0e+00 0.0000001 5.28 1.88 948 82.29 248.14 67.33 16.48 TRUE 1 1 1
NBPF15 MUC1351 0.0e+00 0.0000035 5.80 0.77 7591 4224.88 7028.35 4121.14 112.78 TRUE 2 1 1
MSTO1 MUC1367 0.0e+00 0.0000177 -6.26 -0.81 761 1327.87 727.74 1276.02 153.26 TRUE 1 1 1
HDAC1 MUC1350 0.0e+00 0.0001133 -5.88 -0.78 2215 3805.56 2128.19 3649.04 135.60 TRUE 1 1 1
DCAF6 MUC1374 1.0e-07 0.0004182 -5.67 -0.61 2348 4869.53 3085.40 4724.25 196.31 TRUE 1 1 1
NBPF16 MUC1351 2.0e-07 0.0006791 4.83 0.67 4014 2459.90 3822.18 2402.36 107.85 TRUE 2 1 2
FAM102B MUC1363 1.2e-06 0.0067682 -5.37 -1.29 455 1138.75 440.62 1076.81 41.98 TRUE 1 1 1
LOC100288142 MUC1361 3.0e-06 0.0167927 4.25 0.85 637 356.12 622.40 345.76 57.44 TRUE 1 1 1
TARDBP MUC0486 7.3e-06 0.0407384 -4.56 -0.32 5911 5780.34 4449.25 5565.44 464.81 TRUE 1 1 1
ZMPSTE24 MUC1370 7.7e-06 0.0426990 4.28 0.38 6180 4026.77 5088.19 3905.38 273.05 TRUE 1 1 1
if ("geneID" %in% colnames(res) & FALSE)
  OUTRIDER::plotQQ(ods, res$geneID, global=TRUE)

6 Differential expression

ex <- DESeq2::makeExampleDESeqDataSet(m=4)
dds <- DESeq2::DESeq(ex)
#> estimating size factors
#> estimating dispersions
#> gene-wise dispersion estimates
#> mean-dispersion relationship
#> final dispersion estimates
#> fitting model and testing
res <- DESeq2::results(dds, contrast=c("condition","B","A"))
rld <- DESeq2::rlogTransformation(ex, blind=TRUE)
dat <- DESeq2::plotPCA(rld, intgroup=c("condition"),returnData=TRUE)
#> using ntop=500 top features by variance
percentVar <- round(100 * attr(dat,"percentVar"))
ggplot2::ggplot(dat, ggplot2::aes(PC1, PC2, color=condition, shape=condition)) +
ggplot2::geom_point(size=3) +
ggplot2::xlab(paste0("PC1:",percentVar[1],"% variance")) +
ggplot2::ylab(paste0("PC2:",percentVar[2],"% variance"))
DESeq2 example

Figure 6.1: DESeq2 example

ex$condition <- relevel(ex$condition, ref="B")
dds2 <- DESeq2::DESeq(dds)
#> using pre-existing size factors
#> estimating dispersions
#> found already estimated dispersions, replacing these
#> gene-wise dispersion estimates
#> mean-dispersion relationship
#> final dispersion estimates
#> fitting model and testing
res <- DESeq2::results(dds2)
knitr::kable(head(as.data.frame(res)))
baseMean log2FoldChange lfcSE stat pvalue padj
gene1 8.884414 0.2449020 1.3370638 0.1831640 0.8546693 0.9896044
gene2 18.554877 -0.9545927 1.0945708 -0.8721161 0.3831451 0.9896044
gene3 10.398828 -0.0389598 1.2211301 -0.0319047 0.9745481 0.9987191
gene4 13.075617 1.1475896 1.1265255 1.0186983 0.3083462 0.9896044
gene5 10.629715 -0.3700499 1.2156482 -0.3044054 0.7608190 0.9896044
gene6 22.092276 -1.4125098 0.9883968 -1.4290918 0.1529778 0.9896044

See the package in action from a snakemake workflow1.

7 Gene co-expression and network analysis

A simple network is furnished with the GeneNet documentation example,

## A random network with 40 nodes 
# it contains 780=40*39/2 edges of which 5 percent (=39) are non-zero
true.pcor <- GeneNet::ggm.simulate.pcor(40)
  
# A data set with 40 observations
m.sim <- GeneNet::ggm.simulate.data(40, true.pcor)

# A simple estimate of partial correlations
estimated.pcor <- corpcor::cor2pcor( cor(m.sim) )

# A comparison of estimated and true values
sum((true.pcor-estimated.pcor)^2)
#> [1] 346.6915

# A slightly better estimate ...
estimated.pcor.2 <- GeneNet::ggm.estimate.pcor(m.sim)
#> Estimating optimal shrinkage intensity lambda (correlation matrix): 0.2942
sum((true.pcor-estimated.pcor.2)^2)
#> [1] 11.19183

## ecoli data 
data(ecoli, package="GeneNet")

# partial correlation matrix 
inferred.pcor <- GeneNet::ggm.estimate.pcor(ecoli)
#> Estimating optimal shrinkage intensity lambda (correlation matrix): 0.1804

# p-values, q-values and posterior probabilities for each potential edge 
test.results <- GeneNet::network.test.edges(inferred.pcor)
#> Estimate (local) false discovery rates (partial correlations):
#> Step 1... determine cutoff point
#> Step 2... estimate parameters of null distribution and eta0
#> Step 3... compute p-values and estimate empirical PDF/CDF
#> Step 4... compute q-values and local fdr
#> Step 5... prepare for plotting
GeneNet example

Figure 7.1: GeneNet example


# best 20 edges (strongest correlation)
test.results[1:20,]
#>           pcor node1 node2         pval         qval      prob
#> 1   0.23185664    51    53 2.220446e-16 3.612205e-13 1.0000000
#> 2   0.22405545    52    53 2.220446e-16 3.612205e-13 1.0000000
#> 3   0.21507824    51    52 2.220446e-16 3.612205e-13 1.0000000
#> 4   0.17328863     7    93 3.108624e-15 3.792816e-12 0.9999945
#> 5  -0.13418892    29    86 1.120812e-09 1.093997e-06 0.9999516
#> 6   0.12594697    21    72 1.103836e-08 8.978563e-06 0.9998400
#> 7   0.11956105    28    86 5.890924e-08 3.853590e-05 0.9998400
#> 8  -0.11723897    26    80 1.060526e-07 5.816172e-05 0.9998400
#> 9  -0.11711625    72    89 1.093655e-07 5.930499e-05 0.9972804
#> 10  0.10658013    20    21 1.366610e-06 5.925275e-04 0.9972804
#> 11  0.10589778    21    73 1.596859e-06 6.678429e-04 0.9972804
#> 12  0.10478689    20    91 2.053403e-06 8.024425e-04 0.9972804
#> 13  0.10420836     7    52 2.338382e-06 8.778605e-04 0.9944557
#> 14  0.10236077    87    95 3.525186e-06 1.224964e-03 0.9944557
#> 15  0.10113550    27    95 4.610444e-06 1.500047e-03 0.9920084
#> 16  0.09928954    21    51 6.868357e-06 2.046549e-03 0.9920084
#> 17  0.09791914    21    88 9.192373e-06 2.520616e-03 0.9920084
#> 18  0.09719685    18    95 1.070232e-05 2.790102e-03 0.9920084
#> 19  0.09621791    28    90 1.313007e-05 3.171817e-03 0.9920084
#> 20  0.09619099    12    80 1.320374e-05 3.182526e-03 0.9920084

# network containing edges with prob > 0.9 (i.e. local fdr < 0.1)
net <- GeneNet::extract.network(test.results, cutoff.ggm=0.9)
#> 
#> Significant edges:  65 
#>     Corresponding to  1.26 %  of possible edges
net
#>           pcor node1 node2         pval         qval      prob
#> 1   0.23185664    51    53 2.220446e-16 3.612205e-13 1.0000000
#> 2   0.22405545    52    53 2.220446e-16 3.612205e-13 1.0000000
#> 3   0.21507824    51    52 2.220446e-16 3.612205e-13 1.0000000
#> 4   0.17328863     7    93 3.108624e-15 3.792816e-12 0.9999945
#> 5  -0.13418892    29    86 1.120812e-09 1.093997e-06 0.9999516
#> 6   0.12594697    21    72 1.103836e-08 8.978563e-06 0.9998400
#> 7   0.11956105    28    86 5.890924e-08 3.853590e-05 0.9998400
#> 8  -0.11723897    26    80 1.060526e-07 5.816172e-05 0.9998400
#> 9  -0.11711625    72    89 1.093655e-07 5.930499e-05 0.9972804
#> 10  0.10658013    20    21 1.366610e-06 5.925275e-04 0.9972804
#> 11  0.10589778    21    73 1.596859e-06 6.678429e-04 0.9972804
#> 12  0.10478689    20    91 2.053403e-06 8.024425e-04 0.9972804
#> 13  0.10420836     7    52 2.338382e-06 8.778605e-04 0.9944557
#> 14  0.10236077    87    95 3.525186e-06 1.224964e-03 0.9944557
#> 15  0.10113550    27    95 4.610444e-06 1.500047e-03 0.9920084
#> 16  0.09928954    21    51 6.868357e-06 2.046549e-03 0.9920084
#> 17  0.09791914    21    88 9.192373e-06 2.520616e-03 0.9920084
#> 18  0.09719685    18    95 1.070232e-05 2.790102e-03 0.9920084
#> 19  0.09621791    28    90 1.313007e-05 3.171817e-03 0.9920084
#> 20  0.09619099    12    80 1.320374e-05 3.182526e-03 0.9920084
#> 21  0.09576091    89    95 1.443542e-05 3.354777e-03 0.9891317
#> 22  0.09473210     7    51 1.784126e-05 3.864825e-03 0.9891317
#> 23 -0.09386896    53    58 2.127622e-05 4.313590e-03 0.9891317
#> 24 -0.09366615    29    83 2.217013e-05 4.421099e-03 0.9891317
#> 25 -0.09341148    21    89 2.334321e-05 4.556947e-03 0.9810727
#> 26 -0.09156391    49    93 3.380043e-05 5.955972e-03 0.9810727
#> 27 -0.09150710    80    90 3.418363e-05 6.002083e-03 0.9810727
#> 28  0.09101505     7    53 3.767966e-05 6.408102e-03 0.9810727
#> 29  0.09050688    21    84 4.164471e-05 6.838782e-03 0.9810727
#> 30  0.08965490    72    73 4.919365e-05 7.581866e-03 0.9810727
#> 31 -0.08934025    29    99 5.229604e-05 7.861416e-03 0.9810727
#> 32 -0.08906819     9    95 5.512708e-05 8.104759e-03 0.9810727
#> 33  0.08888345     2    49 5.713144e-05 8.270673e-03 0.9810727
#> 34  0.08850681    86    90 6.143363e-05 8.610161e-03 0.9810727
#> 35  0.08805868    17    53 6.695170e-05 9.015175e-03 0.9810727
#> 36  0.08790809    28    48 6.890884e-05 9.151291e-03 0.9810727
#> 37  0.08783471    33    58 6.988211e-05 9.217597e-03 0.9682377
#> 38 -0.08705796     7    49 8.101244e-05 1.021362e-02 0.9682377
#> 39  0.08645033    20    46 9.086547e-05 1.102466e-02 0.9682377
#> 40  0.08609950    48    86 9.705862e-05 1.150392e-02 0.9682377
#> 41  0.08598769    21    52 9.911458e-05 1.165816e-02 0.9682377
#> 42  0.08555275    32    95 1.075099e-04 1.226435e-02 0.9682377
#> 43  0.08548231    17    51 1.089311e-04 1.236337e-02 0.9424721
#> 44  0.08470370    80    83 1.258659e-04 1.382356e-02 0.9424721
#> 45  0.08442510    80    82 1.325062e-04 1.437068e-02 0.9174573
#> 46  0.08271606    80    93 1.810275e-04 1.845632e-02 0.9174573
#> 47  0.08235175    46    91 1.933329e-04 1.941579e-02 0.9174573
#> 48  0.08217787    25    95 1.994788e-04 1.988432e-02 0.9174573
#> 49 -0.08170331    29    87 2.171999e-04 2.119715e-02 0.9174573
#> 50  0.08123632    19    29 2.360716e-04 2.253606e-02 0.9174573
#> 51  0.08101702    51    84 2.454547e-04 2.318024e-02 0.9174573
#> 52  0.08030748    16    93 2.782643e-04 2.532796e-02 0.9174573
#> 53  0.08006503    28    52 2.903870e-04 2.608271e-02 0.9174573
#> 54 -0.07941656    41    80 3.252833e-04 2.814824e-02 0.9174573
#> 55  0.07941410    54    89 3.254229e-04 2.815620e-02 0.9174573
#> 56 -0.07934653    28    80 3.292784e-04 2.837511e-02 0.9174573
#> 57  0.07916783    29    92 3.396802e-04 2.895702e-02 0.9174573
#> 58 -0.07866905    17    86 3.703635e-04 3.060293e-02 0.9174573
#> 59  0.07827749    16    29 3.962446e-04 3.191462e-02 0.9174573
#> 60 -0.07808262    73    89 4.097452e-04 3.257290e-02 0.9174573
#> 61  0.07766261    52    67 4.403165e-04 3.400207e-02 0.9174573
#> 62  0.07762917    25    87 4.428396e-04 3.411637e-02 0.9174573
#> 63 -0.07739378     9    93 4.609872e-04 3.492295e-02 0.9174573
#> 64  0.07738885    31    80 4.613747e-04 3.493988e-02 0.9174573
#> 65 -0.07718681    80    94 4.775136e-04 3.563444e-02 0.9174573

# significant based on FDR cutoff Q=0.05?
num.significant.1 <- sum(test.results$qval <= 0.05)
test.results[1:num.significant.1,]
#>           pcor node1 node2         pval         qval      prob
#> 1   0.23185664    51    53 2.220446e-16 3.612205e-13 1.0000000
#> 2   0.22405545    52    53 2.220446e-16 3.612205e-13 1.0000000
#> 3   0.21507824    51    52 2.220446e-16 3.612205e-13 1.0000000
#> 4   0.17328863     7    93 3.108624e-15 3.792816e-12 0.9999945
#> 5  -0.13418892    29    86 1.120812e-09 1.093997e-06 0.9999516
#> 6   0.12594697    21    72 1.103836e-08 8.978563e-06 0.9998400
#> 7   0.11956105    28    86 5.890924e-08 3.853590e-05 0.9998400
#> 8  -0.11723897    26    80 1.060526e-07 5.816172e-05 0.9998400
#> 9  -0.11711625    72    89 1.093655e-07 5.930499e-05 0.9972804
#> 10  0.10658013    20    21 1.366610e-06 5.925275e-04 0.9972804
#> 11  0.10589778    21    73 1.596859e-06 6.678429e-04 0.9972804
#> 12  0.10478689    20    91 2.053403e-06 8.024425e-04 0.9972804
#> 13  0.10420836     7    52 2.338382e-06 8.778605e-04 0.9944557
#> 14  0.10236077    87    95 3.525186e-06 1.224964e-03 0.9944557
#> 15  0.10113550    27    95 4.610444e-06 1.500047e-03 0.9920084
#> 16  0.09928954    21    51 6.868357e-06 2.046549e-03 0.9920084
#> 17  0.09791914    21    88 9.192373e-06 2.520616e-03 0.9920084
#> 18  0.09719685    18    95 1.070232e-05 2.790102e-03 0.9920084
#> 19  0.09621791    28    90 1.313007e-05 3.171817e-03 0.9920084
#> 20  0.09619099    12    80 1.320374e-05 3.182526e-03 0.9920084
#> 21  0.09576091    89    95 1.443542e-05 3.354777e-03 0.9891317
#> 22  0.09473210     7    51 1.784126e-05 3.864825e-03 0.9891317
#> 23 -0.09386896    53    58 2.127622e-05 4.313590e-03 0.9891317
#> 24 -0.09366615    29    83 2.217013e-05 4.421099e-03 0.9891317
#> 25 -0.09341148    21    89 2.334321e-05 4.556947e-03 0.9810727
#> 26 -0.09156391    49    93 3.380043e-05 5.955972e-03 0.9810727
#> 27 -0.09150710    80    90 3.418363e-05 6.002083e-03 0.9810727
#> 28  0.09101505     7    53 3.767966e-05 6.408102e-03 0.9810727
#> 29  0.09050688    21    84 4.164471e-05 6.838782e-03 0.9810727
#> 30  0.08965490    72    73 4.919365e-05 7.581866e-03 0.9810727
#> 31 -0.08934025    29    99 5.229604e-05 7.861416e-03 0.9810727
#> 32 -0.08906819     9    95 5.512708e-05 8.104759e-03 0.9810727
#> 33  0.08888345     2    49 5.713144e-05 8.270673e-03 0.9810727
#> 34  0.08850681    86    90 6.143363e-05 8.610161e-03 0.9810727
#> 35  0.08805868    17    53 6.695170e-05 9.015175e-03 0.9810727
#> 36  0.08790809    28    48 6.890884e-05 9.151291e-03 0.9810727
#> 37  0.08783471    33    58 6.988211e-05 9.217597e-03 0.9682377
#> 38 -0.08705796     7    49 8.101244e-05 1.021362e-02 0.9682377
#> 39  0.08645033    20    46 9.086547e-05 1.102466e-02 0.9682377
#> 40  0.08609950    48    86 9.705862e-05 1.150392e-02 0.9682377
#> 41  0.08598769    21    52 9.911458e-05 1.165816e-02 0.9682377
#> 42  0.08555275    32    95 1.075099e-04 1.226435e-02 0.9682377
#> 43  0.08548231    17    51 1.089311e-04 1.236337e-02 0.9424721
#> 44  0.08470370    80    83 1.258659e-04 1.382356e-02 0.9424721
#> 45  0.08442510    80    82 1.325062e-04 1.437068e-02 0.9174573
#> 46  0.08271606    80    93 1.810275e-04 1.845632e-02 0.9174573
#> 47  0.08235175    46    91 1.933329e-04 1.941579e-02 0.9174573
#> 48  0.08217787    25    95 1.994788e-04 1.988432e-02 0.9174573
#> 49 -0.08170331    29    87 2.171999e-04 2.119715e-02 0.9174573
#> 50  0.08123632    19    29 2.360716e-04 2.253606e-02 0.9174573
#> 51  0.08101702    51    84 2.454547e-04 2.318024e-02 0.9174573
#> 52  0.08030748    16    93 2.782643e-04 2.532796e-02 0.9174573
#> 53  0.08006503    28    52 2.903870e-04 2.608271e-02 0.9174573
#> 54 -0.07941656    41    80 3.252833e-04 2.814824e-02 0.9174573
#> 55  0.07941410    54    89 3.254229e-04 2.815620e-02 0.9174573
#> 56 -0.07934653    28    80 3.292784e-04 2.837511e-02 0.9174573
#> 57  0.07916783    29    92 3.396802e-04 2.895702e-02 0.9174573
#> 58 -0.07866905    17    86 3.703635e-04 3.060293e-02 0.9174573
#> 59  0.07827749    16    29 3.962446e-04 3.191462e-02 0.9174573
#> 60 -0.07808262    73    89 4.097452e-04 3.257290e-02 0.9174573
#> 61  0.07766261    52    67 4.403165e-04 3.400207e-02 0.9174573
#> 62  0.07762917    25    87 4.428396e-04 3.411637e-02 0.9174573
#> 63 -0.07739378     9    93 4.609872e-04 3.492295e-02 0.9174573
#> 64  0.07738885    31    80 4.613747e-04 3.493988e-02 0.9174573
#> 65 -0.07718681    80    94 4.775136e-04 3.563444e-02 0.9174573
#> 66  0.07706275    27    58 4.876831e-04 3.606179e-02 0.8297811
#> 67 -0.07610709    16    83 5.730532e-04 4.085920e-02 0.8297811
#> 68  0.07550557    53    84 6.337143e-04 4.406472e-02 0.8297811

# significant based on "local fdr" cutoff (prob > 0.9)?
num.significant.2 <- sum(test.results$prob > 0.9)
test.results[test.results$prob > 0.9,]
#>           pcor node1 node2         pval         qval      prob
#> 1   0.23185664    51    53 2.220446e-16 3.612205e-13 1.0000000
#> 2   0.22405545    52    53 2.220446e-16 3.612205e-13 1.0000000
#> 3   0.21507824    51    52 2.220446e-16 3.612205e-13 1.0000000
#> 4   0.17328863     7    93 3.108624e-15 3.792816e-12 0.9999945
#> 5  -0.13418892    29    86 1.120812e-09 1.093997e-06 0.9999516
#> 6   0.12594697    21    72 1.103836e-08 8.978563e-06 0.9998400
#> 7   0.11956105    28    86 5.890924e-08 3.853590e-05 0.9998400
#> 8  -0.11723897    26    80 1.060526e-07 5.816172e-05 0.9998400
#> 9  -0.11711625    72    89 1.093655e-07 5.930499e-05 0.9972804
#> 10  0.10658013    20    21 1.366610e-06 5.925275e-04 0.9972804
#> 11  0.10589778    21    73 1.596859e-06 6.678429e-04 0.9972804
#> 12  0.10478689    20    91 2.053403e-06 8.024425e-04 0.9972804
#> 13  0.10420836     7    52 2.338382e-06 8.778605e-04 0.9944557
#> 14  0.10236077    87    95 3.525186e-06 1.224964e-03 0.9944557
#> 15  0.10113550    27    95 4.610444e-06 1.500047e-03 0.9920084
#> 16  0.09928954    21    51 6.868357e-06 2.046549e-03 0.9920084
#> 17  0.09791914    21    88 9.192373e-06 2.520616e-03 0.9920084
#> 18  0.09719685    18    95 1.070232e-05 2.790102e-03 0.9920084
#> 19  0.09621791    28    90 1.313007e-05 3.171817e-03 0.9920084
#> 20  0.09619099    12    80 1.320374e-05 3.182526e-03 0.9920084
#> 21  0.09576091    89    95 1.443542e-05 3.354777e-03 0.9891317
#> 22  0.09473210     7    51 1.784126e-05 3.864825e-03 0.9891317
#> 23 -0.09386896    53    58 2.127622e-05 4.313590e-03 0.9891317
#> 24 -0.09366615    29    83 2.217013e-05 4.421099e-03 0.9891317
#> 25 -0.09341148    21    89 2.334321e-05 4.556947e-03 0.9810727
#> 26 -0.09156391    49    93 3.380043e-05 5.955972e-03 0.9810727
#> 27 -0.09150710    80    90 3.418363e-05 6.002083e-03 0.9810727
#> 28  0.09101505     7    53 3.767966e-05 6.408102e-03 0.9810727
#> 29  0.09050688    21    84 4.164471e-05 6.838782e-03 0.9810727
#> 30  0.08965490    72    73 4.919365e-05 7.581866e-03 0.9810727
#> 31 -0.08934025    29    99 5.229604e-05 7.861416e-03 0.9810727
#> 32 -0.08906819     9    95 5.512708e-05 8.104759e-03 0.9810727
#> 33  0.08888345     2    49 5.713144e-05 8.270673e-03 0.9810727
#> 34  0.08850681    86    90 6.143363e-05 8.610161e-03 0.9810727
#> 35  0.08805868    17    53 6.695170e-05 9.015175e-03 0.9810727
#> 36  0.08790809    28    48 6.890884e-05 9.151291e-03 0.9810727
#> 37  0.08783471    33    58 6.988211e-05 9.217597e-03 0.9682377
#> 38 -0.08705796     7    49 8.101244e-05 1.021362e-02 0.9682377
#> 39  0.08645033    20    46 9.086547e-05 1.102466e-02 0.9682377
#> 40  0.08609950    48    86 9.705862e-05 1.150392e-02 0.9682377
#> 41  0.08598769    21    52 9.911458e-05 1.165816e-02 0.9682377
#> 42  0.08555275    32    95 1.075099e-04 1.226435e-02 0.9682377
#> 43  0.08548231    17    51 1.089311e-04 1.236337e-02 0.9424721
#> 44  0.08470370    80    83 1.258659e-04 1.382356e-02 0.9424721
#> 45  0.08442510    80    82 1.325062e-04 1.437068e-02 0.9174573
#> 46  0.08271606    80    93 1.810275e-04 1.845632e-02 0.9174573
#> 47  0.08235175    46    91 1.933329e-04 1.941579e-02 0.9174573
#> 48  0.08217787    25    95 1.994788e-04 1.988432e-02 0.9174573
#> 49 -0.08170331    29    87 2.171999e-04 2.119715e-02 0.9174573
#> 50  0.08123632    19    29 2.360716e-04 2.253606e-02 0.9174573
#> 51  0.08101702    51    84 2.454547e-04 2.318024e-02 0.9174573
#> 52  0.08030748    16    93 2.782643e-04 2.532796e-02 0.9174573
#> 53  0.08006503    28    52 2.903870e-04 2.608271e-02 0.9174573
#> 54 -0.07941656    41    80 3.252833e-04 2.814824e-02 0.9174573
#> 55  0.07941410    54    89 3.254229e-04 2.815620e-02 0.9174573
#> 56 -0.07934653    28    80 3.292784e-04 2.837511e-02 0.9174573
#> 57  0.07916783    29    92 3.396802e-04 2.895702e-02 0.9174573
#> 58 -0.07866905    17    86 3.703635e-04 3.060293e-02 0.9174573
#> 59  0.07827749    16    29 3.962446e-04 3.191462e-02 0.9174573
#> 60 -0.07808262    73    89 4.097452e-04 3.257290e-02 0.9174573
#> 61  0.07766261    52    67 4.403165e-04 3.400207e-02 0.9174573
#> 62  0.07762917    25    87 4.428396e-04 3.411637e-02 0.9174573
#> 63 -0.07739378     9    93 4.609872e-04 3.492295e-02 0.9174573
#> 64  0.07738885    31    80 4.613747e-04 3.493988e-02 0.9174573
#> 65 -0.07718681    80    94 4.775136e-04 3.563444e-02 0.9174573

# parameters of the mixture distribution used to compute p-values etc.
c <- fdrtool::fdrtool(corpcor::sm2vec(inferred.pcor), statistic="correlation")
#> Step 1... determine cutoff point
#> Step 2... estimate parameters of null distribution and eta0
#> Step 3... compute p-values and estimate empirical PDF/CDF
#> Step 4... compute q-values and local fdr
#> Step 5... prepare for plotting
c$param
#>          cutoff N.cens      eta0     eta0.SE    kappa kappa.SE
#> [1,] 0.03553068   4352 0.9474623 0.005656465 2043.377 94.72267

## A random network with 20 nodes and 10 percent (=19) edges
true.pcor <- GeneNet::ggm.simulate.pcor(20, 0.1)

# convert to edge list
test.results <- GeneNet::ggm.list.edges(true.pcor)[1:19,]
nlab <- LETTERS[1:20]

# graphviz
# network.make.dot(filename="test.dot", test.results, nlab, main = "A graph")
# system("fdp -T svg -o test.svg test.dot")

# Rgraphviz
gr <- GeneNet::network.make.graph( test.results, nlab)
gr
#> A graphNEL graph with directed edges
#> Number of Nodes = 20 
#> Number of Edges = 38
num.nodes(gr)
#> [1] 20
edge.info(gr)
#> $weight
#>      A~H      D~F      D~O      E~R      F~L      F~G      G~R      G~O 
#> -0.76325 -0.47042 -0.53865 -0.52631 -0.05494  0.42383 -0.19385  0.32270 
#>      H~T      I~J      J~S      J~M      K~R      K~L      L~N      N~R 
#> -0.34139  0.54040  0.41751  0.62051  0.33471 -0.64564 -0.05633  0.55074 
#>      O~R      O~P      S~T 
#> -0.04365 -0.37113  0.57545 
#> 
#> $dir
#>    A~H    D~F    D~O    E~R    F~L    F~G    G~R    G~O    H~T    I~J    J~S 
#> "none" "none" "none" "none" "none" "none" "none" "none" "none" "none" "none" 
#>    J~M    K~R    K~L    L~N    N~R    O~R    O~P    S~T 
#> "none" "none" "none" "none" "none" "none" "none" "none"
gr2 <- GeneNet::network.make.graph( test.results, nlab, drop.singles=TRUE)
gr2
#> A graphNEL graph with directed edges
#> Number of Nodes = 17 
#> Number of Edges = 38
GeneNet::num.nodes(gr2)
#> [1] 17
GeneNet::edge.info(gr2)
#> $weight
#>      A~H      D~F      D~O      E~R      F~L      F~G      G~R      G~O 
#> -0.76325 -0.47042 -0.53865 -0.52631 -0.05494  0.42383 -0.19385  0.32270 
#>      H~T      I~J      J~S      J~M      K~R      K~L      L~N      N~R 
#> -0.34139  0.54040  0.41751  0.62051  0.33471 -0.64564 -0.05633  0.55074 
#>      O~R      O~P      S~T 
#> -0.04365 -0.37113  0.57545 
#> 
#> $dir
#>    A~H    D~F    D~O    E~R    F~L    F~G    G~R    G~O    H~T    I~J    J~S 
#> "none" "none" "none" "none" "none" "none" "none" "none" "none" "none" "none" 
#>    J~M    K~R    K~L    L~N    N~R    O~R    O~P    S~T 
#> "none" "none" "none" "none" "none" "none" "none" "none"

# plot network
plot(gr, "fdp")
#> Warning in arrows(tail_from[1], tail_from[2], tail_to[1], tail_to[2], col =
#> edgeColor, : zero-length arrow is of indeterminate angle and so skipped
#> Warning in arrows(head_from[1], head_from[2], head_to[1], head_to[2], col =
#> edgeColor, : zero-length arrow is of indeterminate angle and so skipped
#> Warning in arrows(head_from[1], head_from[2], head_to[1], head_to[2], col =
#> edgeColor, : zero-length arrow is of indeterminate angle and so skipped
GeneNet example

Figure 7.2: GeneNet example

plot(gr2, "fdp")
#> Warning in arrows(tail_from[1], tail_from[2], tail_to[1], tail_to[2], col =
#> edgeColor, : zero-length arrow is of indeterminate angle and so skipped
#> Warning in arrows(tail_from[1], tail_from[2], tail_to[1], tail_to[2], col =
#> edgeColor, : zero-length arrow is of indeterminate angle and so skipped
#> Warning in arrows(tail_from[1], tail_from[2], tail_to[1], tail_to[2], col =
#> edgeColor, : zero-length arrow is of indeterminate angle and so skipped
GeneNet example

Figure 7.3: GeneNet example

A side-by-side heatmaps

set.seed(123454321)
m <- matrix(runif(2500),50)
r <- cor(m)
g <- as.matrix(r>=0.7)+0
f1 <- ComplexHeatmap::Heatmap(r)
f2 <- ComplexHeatmap::Heatmap(g)
f <- f1+f2
ComplexHeatmap::draw(f)
Heatmaps

Figure 7.4: Heatmaps


df <- heatmaply::normalize(mtcars)
hm <- heatmaply::heatmaply(df,k_col=5,k_row=5,
                           colors = grDevices::colorRampPalette(RColorBrewer::brewer.pal(3, "RdBu"))(256))
htmlwidgets::saveWidget(hm,file="heatmaply.html")
htmltools::tags$iframe(src = "heatmaply.html", width = "100%", height = "550px")

so we have heatmaply.html and a module analysis with WGCNA,

pwr <- c(1:10, seq(from=12, to=30, by=2))
sft <- WGCNA::pickSoftThreshold(dat, powerVector=pwr, verbose=5)
ADJ <- abs(cor(dat, method="pearson", use="pairwise.complete.obs"))^6
dissADJ <- 1-ADJ
dissTOM <- WGCNA::TOMdist(ADJ)
TOM <- WGCNA::TOMsimilarityFromExpr(dat)
Tree <- hclust(as.dist(1-TOM), method="average")
for(j in pwr)
{
  pam_name <- paste0("pam",j)
  assign(pam_name, cluster::pam(as.dist(dissADJ),j))
  pamTOM_name <- paste0("pamTOM",j)
  assign(pamTOM_name,cluster::pam(as.dist(dissTOM),j))
  tc <- table(get(pam_name)$clustering,get(pamTOM_name)$clustering)
  print(tc)
  print(diag(tc))
}
colorStaticTOM <- as.character(WGCNA::cutreeStaticColor(Tree,cutHeight=.99,minSize=5))
colorDynamicTOM <- WGCNA::labels2colors(cutreeDynamic(Tree,method="tree",minClusterSize=5))
Colors <- data.frame(pamTOM6$clustering,colorStaticTOM,colorDynamicTOM)
WGCNA::plotDendroAndColors(Tree, Colors, dendroLabels=FALSE, hang=0.03, addGuide=TRUE, guideHang=0.05)
meg <- WGCNA::moduleEigengenes(dat, color=1:ncol(dat), softPower=6)

8 Metadata

This section is based on package recount3.

hs <- recount3::available_projects()
dim(subset(hs,file_source=="gtex"))
recount3::annotation_options("human")
blood_rse <- recount3::create_rse(subset(hs,project=="BLOOD"))
S4Vectors::metadata(blood_rse)
SummarizedExperiment::rowRanges(blood_rse)
colnames(SummarizedExperiment::colData(blood_rse))[1:20]
recount3::expand_sra_attributes(blood_rse)

9 Pathway and enrichment analysis

reactome <- graphite::pathways("hsapiens", "reactome")
kegg <- graphite::pathways("hsapiens","kegg")
pharmgkb <- graphite::pathways("hsapiens","pharmgkb")
nodes(kegg[[21]])
#>  [1] "ENTREZID:102724560" "ENTREZID:10993"     "ENTREZID:113675"   
#>  [4] "ENTREZID:132158"    "ENTREZID:1610"      "ENTREZID:1738"     
#>  [7] "ENTREZID:1757"      "ENTREZID:189"       "ENTREZID:211"      
#> [10] "ENTREZID:212"       "ENTREZID:23464"     "ENTREZID:2593"     
#> [13] "ENTREZID:26227"     "ENTREZID:2628"      "ENTREZID:27232"    
#> [16] "ENTREZID:2731"      "ENTREZID:275"       "ENTREZID:29958"    
#> [19] "ENTREZID:29968"     "ENTREZID:441531"    "ENTREZID:501"      
#> [22] "ENTREZID:51268"     "ENTREZID:5223"      "ENTREZID:5224"     
#> [25] "ENTREZID:55349"     "ENTREZID:5723"      "ENTREZID:635"      
#> [28] "ENTREZID:63826"     "ENTREZID:6470"      "ENTREZID:6472"     
#> [31] "ENTREZID:64902"     "ENTREZID:669"       "ENTREZID:875"      
#> [34] "ENTREZID:9380"      "ENTREZID:1491"
kegg_t2g <- ldply(lapply(kegg, nodes), data.frame)
names(kegg_t2g) <- c("gs_name", "gene_symbol")
VEGF <- subset(kegg_t2g,gs_name=="VEGF signaling pathway")[[2]]
eKEGG <- clusterProfiler::enricher(gene=VEGF, TERM2GENE = kegg_t2g,
                                   universe=,
                                   pAdjustMethod = "BH",
                                   pvalueCutoff = 0.1, qvalueCutoff = 0.05,
                                   minGSSize = 10, maxGSSize = 500)

10 Transcript databases

An overview of annotation is available2.

# columns(org.Hs.eg.db)
# keyref <- keys(org.Hs.eg.db, keytype="ENTREZID")
# symbol_uniprot <- select(org.Hs.eg.db,keys=keyref,columns = c("SYMBOL","UNIPROT"))
# subset(symbol_uniprot,SYMBOL=="MC4R")

x <- EnsDb.Hsapiens.v86
ensembldb::listColumns(x, "protein", skip.keys=TRUE)
#> [1] "tx_id"            "protein_id"       "protein_sequence"
ensembldb::listGenebiotypes(x)
#>  [1] "protein_coding"                     "unitary_pseudogene"                
#>  [3] "unprocessed_pseudogene"             "processed_pseudogene"              
#>  [5] "processed_transcript"               "transcribed_unprocessed_pseudogene"
#>  [7] "antisense"                          "transcribed_unitary_pseudogene"    
#>  [9] "polymorphic_pseudogene"             "lincRNA"                           
#> [11] "sense_intronic"                     "transcribed_processed_pseudogene"  
#> [13] "sense_overlapping"                  "IG_V_pseudogene"                   
#> [15] "pseudogene"                         "TR_V_gene"                         
#> [17] "3prime_overlapping_ncRNA"           "IG_V_gene"                         
#> [19] "bidirectional_promoter_lncRNA"      "snRNA"                             
#> [21] "miRNA"                              "misc_RNA"                          
#> [23] "snoRNA"                             "rRNA"                              
#> [25] "Mt_tRNA"                            "Mt_rRNA"                           
#> [27] "IG_C_gene"                          "IG_J_gene"                         
#> [29] "TR_J_gene"                          "TR_C_gene"                         
#> [31] "TR_V_pseudogene"                    "TR_J_pseudogene"                   
#> [33] "IG_D_gene"                          "ribozyme"                          
#> [35] "IG_C_pseudogene"                    "TR_D_gene"                         
#> [37] "TEC"                                "IG_J_pseudogene"                   
#> [39] "scRNA"                              "scaRNA"                            
#> [41] "vaultRNA"                           "sRNA"                              
#> [43] "macro_lncRNA"                       "non_coding"                        
#> [45] "IG_pseudogene"                      "LRG_gene"
ensembldb::listTxbiotypes(x)
#>  [1] "protein_coding"                     "processed_transcript"              
#>  [3] "nonsense_mediated_decay"            "retained_intron"                   
#>  [5] "unitary_pseudogene"                 "TEC"                               
#>  [7] "miRNA"                              "misc_RNA"                          
#>  [9] "non_stop_decay"                     "unprocessed_pseudogene"            
#> [11] "processed_pseudogene"               "transcribed_unprocessed_pseudogene"
#> [13] "lincRNA"                            "antisense"                         
#> [15] "transcribed_unitary_pseudogene"     "polymorphic_pseudogene"            
#> [17] "sense_intronic"                     "transcribed_processed_pseudogene"  
#> [19] "sense_overlapping"                  "IG_V_pseudogene"                   
#> [21] "pseudogene"                         "TR_V_gene"                         
#> [23] "3prime_overlapping_ncRNA"           "IG_V_gene"                         
#> [25] "bidirectional_promoter_lncRNA"      "snRNA"                             
#> [27] "snoRNA"                             "rRNA"                              
#> [29] "Mt_tRNA"                            "Mt_rRNA"                           
#> [31] "IG_C_gene"                          "IG_J_gene"                         
#> [33] "TR_J_gene"                          "TR_C_gene"                         
#> [35] "TR_V_pseudogene"                    "TR_J_pseudogene"                   
#> [37] "IG_D_gene"                          "ribozyme"                          
#> [39] "IG_C_pseudogene"                    "TR_D_gene"                         
#> [41] "IG_J_pseudogene"                    "scRNA"                             
#> [43] "scaRNA"                             "vaultRNA"                          
#> [45] "sRNA"                               "macro_lncRNA"                      
#> [47] "non_coding"                         "IG_pseudogene"                     
#> [49] "LRG_gene"
ensembldb::listTables(x)
#> $gene
#> [1] "gene_id"          "gene_name"        "gene_biotype"     "gene_seq_start"  
#> [5] "gene_seq_end"     "seq_name"         "seq_strand"       "seq_coord_system"
#> [9] "symbol"          
#> 
#> $tx
#> [1] "tx_id"            "tx_biotype"       "tx_seq_start"     "tx_seq_end"      
#> [5] "tx_cds_seq_start" "tx_cds_seq_end"   "gene_id"          "tx_name"         
#> 
#> $tx2exon
#> [1] "tx_id"    "exon_id"  "exon_idx"
#> 
#> $exon
#> [1] "exon_id"        "exon_seq_start" "exon_seq_end"  
#> 
#> $chromosome
#> [1] "seq_name"    "seq_length"  "is_circular"
#> 
#> $protein
#> [1] "tx_id"            "protein_id"       "protein_sequence"
#> 
#> $uniprot
#> [1] "protein_id"           "uniprot_id"           "uniprot_db"          
#> [4] "uniprot_mapping_type"
#> 
#> $protein_domain
#> [1] "protein_id"            "protein_domain_id"     "protein_domain_source"
#> [4] "interpro_accession"    "prot_dom_start"        "prot_dom_end"         
#> 
#> $entrezgene
#> [1] "gene_id"  "entrezid"
#> 
#> $metadata
#> [1] "name"  "value"
ensembldb::metadata(x)
#>                  name                               value
#> 1             Db type                               EnsDb
#> 2     Type of Gene ID                     Ensembl Gene ID
#> 3  Supporting package                           ensembldb
#> 4       Db created by ensembldb package from Bioconductor
#> 5      script_version                               0.3.0
#> 6       Creation time            Thu May 18 16:32:27 2017
#> 7     ensembl_version                                  86
#> 8        ensembl_host                           localhost
#> 9            Organism                        homo_sapiens
#> 10        taxonomy_id                                9606
#> 11       genome_build                              GRCh38
#> 12    DBSCHEMAVERSION                                 2.0
ensembldb::organism(x)
#> [1] "Homo sapiens"
ensembldb::returnFilterColumns(x)
#> [1] TRUE
ensembldb::seqinfo(x)
#> Seqinfo object with 357 sequences (1 circular) from GRCh38 genome:
#>   seqnames seqlengths isCircular genome
#>   X         156040895      FALSE GRCh38
#>   20         64444167      FALSE GRCh38
#>   1         248956422      FALSE GRCh38
#>   6         170805979      FALSE GRCh38
#>   3         198295559      FALSE GRCh38
#>   ...             ...        ...    ...
#>   LRG_239      114904      FALSE GRCh38
#>   LRG_311      115492      FALSE GRCh38
#>   LRG_721       33396      FALSE GRCh38
#>   LRG_741      231167      FALSE GRCh38
#>   LRG_93        22459      FALSE GRCh38
ensembldb::seqlevels(x)
#>   [1] "1"                                     
#>   [2] "10"                                    
#>   [3] "11"                                    
#>   [4] "12"                                    
#>   [5] "13"                                    
#>   [6] "14"                                    
#>   [7] "15"                                    
#>   [8] "16"                                    
#>   [9] "17"                                    
#>  [10] "18"                                    
#>  [11] "19"                                    
#>  [12] "2"                                     
#>  [13] "20"                                    
#>  [14] "21"                                    
#>  [15] "22"                                    
#>  [16] "3"                                     
#>  [17] "4"                                     
#>  [18] "5"                                     
#>  [19] "6"                                     
#>  [20] "7"                                     
#>  [21] "8"                                     
#>  [22] "9"                                     
#>  [23] "CHR_HG107_PATCH"                       
#>  [24] "CHR_HG126_PATCH"                       
#>  [25] "CHR_HG1311_PATCH"                      
#>  [26] "CHR_HG1342_HG2282_PATCH"               
#>  [27] "CHR_HG1362_PATCH"                      
#>  [28] "CHR_HG142_HG150_NOVEL_TEST"            
#>  [29] "CHR_HG151_NOVEL_TEST"                  
#>  [30] "CHR_HG1651_PATCH"                      
#>  [31] "CHR_HG1832_PATCH"                      
#>  [32] "CHR_HG2021_PATCH"                      
#>  [33] "CHR_HG2022_PATCH"                      
#>  [34] "CHR_HG2023_PATCH"                      
#>  [35] "CHR_HG2030_PATCH"                      
#>  [36] "CHR_HG2058_PATCH"                      
#>  [37] "CHR_HG2062_PATCH"                      
#>  [38] "CHR_HG2063_PATCH"                      
#>  [39] "CHR_HG2066_PATCH"                      
#>  [40] "CHR_HG2072_PATCH"                      
#>  [41] "CHR_HG2095_PATCH"                      
#>  [42] "CHR_HG2104_PATCH"                      
#>  [43] "CHR_HG2116_PATCH"                      
#>  [44] "CHR_HG2128_PATCH"                      
#>  [45] "CHR_HG2191_PATCH"                      
#>  [46] "CHR_HG2213_PATCH"                      
#>  [47] "CHR_HG2217_PATCH"                      
#>  [48] "CHR_HG2232_PATCH"                      
#>  [49] "CHR_HG2233_PATCH"                      
#>  [50] "CHR_HG2235_PATCH"                      
#>  [51] "CHR_HG2239_PATCH"                      
#>  [52] "CHR_HG2247_PATCH"                      
#>  [53] "CHR_HG2249_PATCH"                      
#>  [54] "CHR_HG2288_HG2289_PATCH"               
#>  [55] "CHR_HG2290_PATCH"                      
#>  [56] "CHR_HG2291_PATCH"                      
#>  [57] "CHR_HG2334_PATCH"                      
#>  [58] "CHR_HG26_PATCH"                        
#>  [59] "CHR_HG986_PATCH"                       
#>  [60] "CHR_HSCHR10_1_CTG1"                    
#>  [61] "CHR_HSCHR10_1_CTG2"                    
#>  [62] "CHR_HSCHR10_1_CTG3"                    
#>  [63] "CHR_HSCHR10_1_CTG4"                    
#>  [64] "CHR_HSCHR10_1_CTG6"                    
#>  [65] "CHR_HSCHR11_1_CTG1_2"                  
#>  [66] "CHR_HSCHR11_1_CTG5"                    
#>  [67] "CHR_HSCHR11_1_CTG6"                    
#>  [68] "CHR_HSCHR11_1_CTG7"                    
#>  [69] "CHR_HSCHR11_1_CTG8"                    
#>  [70] "CHR_HSCHR11_2_CTG1"                    
#>  [71] "CHR_HSCHR11_2_CTG1_1"                  
#>  [72] "CHR_HSCHR11_3_CTG1"                    
#>  [73] "CHR_HSCHR12_1_CTG1"                    
#>  [74] "CHR_HSCHR12_1_CTG2_1"                  
#>  [75] "CHR_HSCHR12_2_CTG1"                    
#>  [76] "CHR_HSCHR12_2_CTG2"                    
#>  [77] "CHR_HSCHR12_2_CTG2_1"                  
#>  [78] "CHR_HSCHR12_3_CTG2"                    
#>  [79] "CHR_HSCHR12_3_CTG2_1"                  
#>  [80] "CHR_HSCHR12_4_CTG2"                    
#>  [81] "CHR_HSCHR12_4_CTG2_1"                  
#>  [82] "CHR_HSCHR12_5_CTG2"                    
#>  [83] "CHR_HSCHR12_5_CTG2_1"                  
#>  [84] "CHR_HSCHR12_6_CTG2_1"                  
#>  [85] "CHR_HSCHR13_1_CTG1"                    
#>  [86] "CHR_HSCHR13_1_CTG3"                    
#>  [87] "CHR_HSCHR13_1_CTG5"                    
#>  [88] "CHR_HSCHR13_1_CTG8"                    
#>  [89] "CHR_HSCHR14_1_CTG1"                    
#>  [90] "CHR_HSCHR14_2_CTG1"                    
#>  [91] "CHR_HSCHR14_3_CTG1"                    
#>  [92] "CHR_HSCHR14_7_CTG1"                    
#>  [93] "CHR_HSCHR15_1_CTG1"                    
#>  [94] "CHR_HSCHR15_1_CTG3"                    
#>  [95] "CHR_HSCHR15_1_CTG8"                    
#>  [96] "CHR_HSCHR15_2_CTG3"                    
#>  [97] "CHR_HSCHR15_2_CTG8"                    
#>  [98] "CHR_HSCHR15_3_CTG3"                    
#>  [99] "CHR_HSCHR15_3_CTG8"                    
#> [100] "CHR_HSCHR15_4_CTG8"                    
#> [101] "CHR_HSCHR15_5_CTG8"                    
#> [102] "CHR_HSCHR15_6_CTG8"                    
#> [103] "CHR_HSCHR16_1_CTG1"                    
#> [104] "CHR_HSCHR16_1_CTG3_1"                  
#> [105] "CHR_HSCHR16_2_CTG3_1"                  
#> [106] "CHR_HSCHR16_3_CTG1"                    
#> [107] "CHR_HSCHR16_4_CTG1"                    
#> [108] "CHR_HSCHR16_4_CTG3_1"                  
#> [109] "CHR_HSCHR16_5_CTG1"                    
#> [110] "CHR_HSCHR16_CTG2"                      
#> [111] "CHR_HSCHR17_10_CTG4"                   
#> [112] "CHR_HSCHR17_1_CTG1"                    
#> [113] "CHR_HSCHR17_1_CTG2"                    
#> [114] "CHR_HSCHR17_1_CTG4"                    
#> [115] "CHR_HSCHR17_1_CTG5"                    
#> [116] "CHR_HSCHR17_1_CTG9"                    
#> [117] "CHR_HSCHR17_2_CTG1"                    
#> [118] "CHR_HSCHR17_2_CTG2"                    
#> [119] "CHR_HSCHR17_2_CTG4"                    
#> [120] "CHR_HSCHR17_2_CTG5"                    
#> [121] "CHR_HSCHR17_3_CTG2"                    
#> [122] "CHR_HSCHR17_3_CTG4"                    
#> [123] "CHR_HSCHR17_4_CTG4"                    
#> [124] "CHR_HSCHR17_5_CTG4"                    
#> [125] "CHR_HSCHR17_6_CTG4"                    
#> [126] "CHR_HSCHR17_7_CTG4"                    
#> [127] "CHR_HSCHR17_8_CTG4"                    
#> [128] "CHR_HSCHR17_9_CTG4"                    
#> [129] "CHR_HSCHR18_1_CTG1_1"                  
#> [130] "CHR_HSCHR18_1_CTG2_1"                  
#> [131] "CHR_HSCHR18_2_CTG1_1"                  
#> [132] "CHR_HSCHR18_2_CTG2"                    
#> [133] "CHR_HSCHR18_2_CTG2_1"                  
#> [134] "CHR_HSCHR18_3_CTG2_1"                  
#> [135] "CHR_HSCHR18_5_CTG1_1"                  
#> [136] "CHR_HSCHR18_ALT21_CTG2_1"              
#> [137] "CHR_HSCHR18_ALT2_CTG2_1"               
#> [138] "CHR_HSCHR19KIR_ABC08_A1_HAP_CTG3_1"    
#> [139] "CHR_HSCHR19KIR_ABC08_AB_HAP_C_P_CTG3_1"
#> [140] "CHR_HSCHR19KIR_ABC08_AB_HAP_T_P_CTG3_1"
#> [141] "CHR_HSCHR19KIR_FH05_A_HAP_CTG3_1"      
#> [142] "CHR_HSCHR19KIR_FH05_B_HAP_CTG3_1"      
#> [143] "CHR_HSCHR19KIR_FH06_A_HAP_CTG3_1"      
#> [144] "CHR_HSCHR19KIR_FH06_BA1_HAP_CTG3_1"    
#> [145] "CHR_HSCHR19KIR_FH08_A_HAP_CTG3_1"      
#> [146] "CHR_HSCHR19KIR_FH08_BAX_HAP_CTG3_1"    
#> [147] "CHR_HSCHR19KIR_FH13_A_HAP_CTG3_1"      
#> [148] "CHR_HSCHR19KIR_FH13_BA2_HAP_CTG3_1"    
#> [149] "CHR_HSCHR19KIR_FH15_A_HAP_CTG3_1"      
#> [150] "CHR_HSCHR19KIR_FH15_B_HAP_CTG3_1"      
#> [151] "CHR_HSCHR19KIR_G085_A_HAP_CTG3_1"      
#> [152] "CHR_HSCHR19KIR_G085_BA1_HAP_CTG3_1"    
#> [153] "CHR_HSCHR19KIR_G248_A_HAP_CTG3_1"      
#> [154] "CHR_HSCHR19KIR_G248_BA2_HAP_CTG3_1"    
#> [155] "CHR_HSCHR19KIR_GRC212_AB_HAP_CTG3_1"   
#> [156] "CHR_HSCHR19KIR_GRC212_BA1_HAP_CTG3_1"  
#> [157] "CHR_HSCHR19KIR_LUCE_A_HAP_CTG3_1"      
#> [158] "CHR_HSCHR19KIR_LUCE_BDEL_HAP_CTG3_1"   
#> [159] "CHR_HSCHR19KIR_RP5_B_HAP_CTG3_1"       
#> [160] "CHR_HSCHR19KIR_RSH_A_HAP_CTG3_1"       
#> [161] "CHR_HSCHR19KIR_RSH_BA2_HAP_CTG3_1"     
#> [162] "CHR_HSCHR19KIR_T7526_A_HAP_CTG3_1"     
#> [163] "CHR_HSCHR19KIR_T7526_BDEL_HAP_CTG3_1"  
#> [164] "CHR_HSCHR19LRC_COX1_CTG3_1"            
#> [165] "CHR_HSCHR19LRC_COX2_CTG3_1"            
#> [166] "CHR_HSCHR19LRC_LRC_I_CTG3_1"           
#> [167] "CHR_HSCHR19LRC_LRC_J_CTG3_1"           
#> [168] "CHR_HSCHR19LRC_LRC_S_CTG3_1"           
#> [169] "CHR_HSCHR19LRC_LRC_T_CTG3_1"           
#> [170] "CHR_HSCHR19LRC_PGF1_CTG3_1"            
#> [171] "CHR_HSCHR19LRC_PGF2_CTG3_1"            
#> [172] "CHR_HSCHR19_1_CTG2"                    
#> [173] "CHR_HSCHR19_1_CTG3_1"                  
#> [174] "CHR_HSCHR19_2_CTG2"                    
#> [175] "CHR_HSCHR19_2_CTG3_1"                  
#> [176] "CHR_HSCHR19_3_CTG2"                    
#> [177] "CHR_HSCHR19_3_CTG3_1"                  
#> [178] "CHR_HSCHR19_4_CTG2"                    
#> [179] "CHR_HSCHR19_4_CTG3_1"                  
#> [180] "CHR_HSCHR19_5_CTG2"                    
#> [181] "CHR_HSCHR1_1_CTG11"                    
#> [182] "CHR_HSCHR1_1_CTG3"                     
#> [183] "CHR_HSCHR1_1_CTG31"                    
#> [184] "CHR_HSCHR1_1_CTG32_1"                  
#> [185] "CHR_HSCHR1_2_CTG3"                     
#> [186] "CHR_HSCHR1_2_CTG31"                    
#> [187] "CHR_HSCHR1_2_CTG32_1"                  
#> [188] "CHR_HSCHR1_3_CTG3"                     
#> [189] "CHR_HSCHR1_3_CTG31"                    
#> [190] "CHR_HSCHR1_3_CTG32_1"                  
#> [191] "CHR_HSCHR1_4_CTG3"                     
#> [192] "CHR_HSCHR1_4_CTG31"                    
#> [193] "CHR_HSCHR1_5_CTG3"                     
#> [194] "CHR_HSCHR1_5_CTG32_1"                  
#> [195] "CHR_HSCHR1_ALT2_1_CTG32_1"             
#> [196] "CHR_HSCHR20_1_CTG1"                    
#> [197] "CHR_HSCHR20_1_CTG2"                    
#> [198] "CHR_HSCHR20_1_CTG3"                    
#> [199] "CHR_HSCHR20_1_CTG4"                    
#> [200] "CHR_HSCHR21_2_CTG1_1"                  
#> [201] "CHR_HSCHR21_3_CTG1_1"                  
#> [202] "CHR_HSCHR21_4_CTG1_1"                  
#> [203] "CHR_HSCHR21_5_CTG2"                    
#> [204] "CHR_HSCHR21_6_CTG1_1"                  
#> [205] "CHR_HSCHR21_8_CTG1_1"                  
#> [206] "CHR_HSCHR22_1_CTG1"                    
#> [207] "CHR_HSCHR22_1_CTG2"                    
#> [208] "CHR_HSCHR22_1_CTG3"                    
#> [209] "CHR_HSCHR22_1_CTG4"                    
#> [210] "CHR_HSCHR22_1_CTG5"                    
#> [211] "CHR_HSCHR22_1_CTG6"                    
#> [212] "CHR_HSCHR22_1_CTG7"                    
#> [213] "CHR_HSCHR22_2_CTG1"                    
#> [214] "CHR_HSCHR22_3_CTG1"                    
#> [215] "CHR_HSCHR22_4_CTG1"                    
#> [216] "CHR_HSCHR22_5_CTG1"                    
#> [217] "CHR_HSCHR22_6_CTG1"                    
#> [218] "CHR_HSCHR22_7_CTG1"                    
#> [219] "CHR_HSCHR22_8_CTG1"                    
#> [220] "CHR_HSCHR2_1_CTG1"                     
#> [221] "CHR_HSCHR2_1_CTG15"                    
#> [222] "CHR_HSCHR2_1_CTG5"                     
#> [223] "CHR_HSCHR2_1_CTG7"                     
#> [224] "CHR_HSCHR2_1_CTG7_2"                   
#> [225] "CHR_HSCHR2_2_CTG1"                     
#> [226] "CHR_HSCHR2_2_CTG15"                    
#> [227] "CHR_HSCHR2_2_CTG7"                     
#> [228] "CHR_HSCHR2_2_CTG7_2"                   
#> [229] "CHR_HSCHR2_3_CTG1"                     
#> [230] "CHR_HSCHR2_3_CTG15"                    
#> [231] "CHR_HSCHR2_3_CTG7_2"                   
#> [232] "CHR_HSCHR2_4_CTG1"                     
#> [233] "CHR_HSCHR2_6_CTG7_2"                   
#> [234] "CHR_HSCHR3_1_CTG1"                     
#> [235] "CHR_HSCHR3_1_CTG2_1"                   
#> [236] "CHR_HSCHR3_1_CTG3"                     
#> [237] "CHR_HSCHR3_2_CTG2_1"                   
#> [238] "CHR_HSCHR3_2_CTG3"                     
#> [239] "CHR_HSCHR3_3_CTG1"                     
#> [240] "CHR_HSCHR3_3_CTG3"                     
#> [241] "CHR_HSCHR3_4_CTG2_1"                   
#> [242] "CHR_HSCHR3_4_CTG3"                     
#> [243] "CHR_HSCHR3_5_CTG2_1"                   
#> [244] "CHR_HSCHR3_5_CTG3"                     
#> [245] "CHR_HSCHR3_6_CTG3"                     
#> [246] "CHR_HSCHR3_7_CTG3"                     
#> [247] "CHR_HSCHR3_8_CTG3"                     
#> [248] "CHR_HSCHR3_9_CTG3"                     
#> [249] "CHR_HSCHR4_11_CTG12"                   
#> [250] "CHR_HSCHR4_1_CTG12"                    
#> [251] "CHR_HSCHR4_1_CTG4"                     
#> [252] "CHR_HSCHR4_1_CTG6"                     
#> [253] "CHR_HSCHR4_1_CTG9"                     
#> [254] "CHR_HSCHR4_2_CTG12"                    
#> [255] "CHR_HSCHR4_2_CTG4"                     
#> [256] "CHR_HSCHR4_3_CTG12"                    
#> [257] "CHR_HSCHR4_4_CTG12"                    
#> [258] "CHR_HSCHR4_5_CTG12"                    
#> [259] "CHR_HSCHR4_6_CTG12"                    
#> [260] "CHR_HSCHR4_7_CTG12"                    
#> [261] "CHR_HSCHR4_8_CTG12"                    
#> [262] "CHR_HSCHR4_9_CTG12"                    
#> [263] "CHR_HSCHR5_1_CTG1"                     
#> [264] "CHR_HSCHR5_1_CTG1_1"                   
#> [265] "CHR_HSCHR5_1_CTG5"                     
#> [266] "CHR_HSCHR5_2_CTG1"                     
#> [267] "CHR_HSCHR5_2_CTG1_1"                   
#> [268] "CHR_HSCHR5_2_CTG5"                     
#> [269] "CHR_HSCHR5_3_CTG1"                     
#> [270] "CHR_HSCHR5_3_CTG5"                     
#> [271] "CHR_HSCHR5_4_CTG1"                     
#> [272] "CHR_HSCHR5_4_CTG1_1"                   
#> [273] "CHR_HSCHR5_5_CTG1"                     
#> [274] "CHR_HSCHR5_6_CTG1"                     
#> [275] "CHR_HSCHR5_7_CTG1"                     
#> [276] "CHR_HSCHR6_1_CTG10"                    
#> [277] "CHR_HSCHR6_1_CTG2"                     
#> [278] "CHR_HSCHR6_1_CTG3"                     
#> [279] "CHR_HSCHR6_1_CTG4"                     
#> [280] "CHR_HSCHR6_1_CTG5"                     
#> [281] "CHR_HSCHR6_1_CTG6"                     
#> [282] "CHR_HSCHR6_1_CTG7"                     
#> [283] "CHR_HSCHR6_1_CTG8"                     
#> [284] "CHR_HSCHR6_1_CTG9"                     
#> [285] "CHR_HSCHR6_8_CTG1"                     
#> [286] "CHR_HSCHR6_MHC_APD_CTG1"               
#> [287] "CHR_HSCHR6_MHC_COX_CTG1"               
#> [288] "CHR_HSCHR6_MHC_DBB_CTG1"               
#> [289] "CHR_HSCHR6_MHC_MANN_CTG1"              
#> [290] "CHR_HSCHR6_MHC_MCF_CTG1"               
#> [291] "CHR_HSCHR6_MHC_QBL_CTG1"               
#> [292] "CHR_HSCHR6_MHC_SSTO_CTG1"              
#> [293] "CHR_HSCHR7_1_CTG1"                     
#> [294] "CHR_HSCHR7_1_CTG4_4"                   
#> [295] "CHR_HSCHR7_1_CTG6"                     
#> [296] "CHR_HSCHR7_1_CTG7"                     
#> [297] "CHR_HSCHR7_2_CTG1"                     
#> [298] "CHR_HSCHR7_2_CTG4_4"                   
#> [299] "CHR_HSCHR7_2_CTG6"                     
#> [300] "CHR_HSCHR7_2_CTG7"                     
#> [301] "CHR_HSCHR7_3_CTG6"                     
#> [302] "CHR_HSCHR8_1_CTG1"                     
#> [303] "CHR_HSCHR8_1_CTG6"                     
#> [304] "CHR_HSCHR8_1_CTG7"                     
#> [305] "CHR_HSCHR8_2_CTG1"                     
#> [306] "CHR_HSCHR8_2_CTG7"                     
#> [307] "CHR_HSCHR8_3_CTG1"                     
#> [308] "CHR_HSCHR8_3_CTG7"                     
#> [309] "CHR_HSCHR8_4_CTG1"                     
#> [310] "CHR_HSCHR8_4_CTG7"                     
#> [311] "CHR_HSCHR8_5_CTG1"                     
#> [312] "CHR_HSCHR8_5_CTG7"                     
#> [313] "CHR_HSCHR8_6_CTG1"                     
#> [314] "CHR_HSCHR8_7_CTG1"                     
#> [315] "CHR_HSCHR8_8_CTG1"                     
#> [316] "CHR_HSCHR8_9_CTG1"                     
#> [317] "CHR_HSCHR9_1_CTG1"                     
#> [318] "CHR_HSCHR9_1_CTG2"                     
#> [319] "CHR_HSCHR9_1_CTG3"                     
#> [320] "CHR_HSCHR9_1_CTG4"                     
#> [321] "CHR_HSCHR9_1_CTG5"                     
#> [322] "CHR_HSCHR9_1_CTG6"                     
#> [323] "CHR_HSCHRX_1_CTG3"                     
#> [324] "CHR_HSCHRX_2_CTG12"                    
#> [325] "CHR_HSCHRX_2_CTG3"                     
#> [326] "GL000009.2"                            
#> [327] "GL000194.1"                            
#> [328] "GL000195.1"                            
#> [329] "GL000205.2"                            
#> [330] "GL000213.1"                            
#> [331] "GL000216.2"                            
#> [332] "GL000218.1"                            
#> [333] "GL000219.1"                            
#> [334] "GL000220.1"                            
#> [335] "GL000225.1"                            
#> [336] "KI270442.1"                            
#> [337] "KI270711.1"                            
#> [338] "KI270713.1"                            
#> [339] "KI270721.1"                            
#> [340] "KI270726.1"                            
#> [341] "KI270727.1"                            
#> [342] "KI270728.1"                            
#> [343] "KI270731.1"                            
#> [344] "KI270733.1"                            
#> [345] "KI270734.1"                            
#> [346] "KI270744.1"                            
#> [347] "KI270750.1"                            
#> [348] "LRG_183"                               
#> [349] "LRG_187"                               
#> [350] "LRG_239"                               
#> [351] "LRG_311"                               
#> [352] "LRG_721"                               
#> [353] "LRG_741"                               
#> [354] "LRG_93"                                
#> [355] "MT"                                    
#> [356] "X"                                     
#> [357] "Y"
ensembldb::updateEnsDb(x)
#> EnsDb for Ensembl:
#> |Backend: SQLite
#> |Db type: EnsDb
#> |Type of Gene ID: Ensembl Gene ID
#> |Supporting package: ensembldb
#> |Db created by: ensembldb package from Bioconductor
#> |script_version: 0.3.0
#> |Creation time: Thu May 18 16:32:27 2017
#> |ensembl_version: 86
#> |ensembl_host: localhost
#> |Organism: homo_sapiens
#> |taxonomy_id: 9606
#> |genome_build: GRCh38
#> |DBSCHEMAVERSION: 2.0
#> | No. of genes: 63970.
#> | No. of transcripts: 216741.
#> |Protein data available.

ensembldb::genes(x, columns=c("gene_name"),
             filter=list(SeqNameFilter("X"), AnnotationFilter::GeneBiotypeFilter("protein_coding")))
#> GRanges object with 841 ranges and 3 metadata columns:
#>                   seqnames              ranges strand |   gene_name
#>                      <Rle>           <IRanges>  <Rle> | <character>
#>   ENSG00000182378        X       276322-303356      + |      PLCXD1
#>   ENSG00000178605        X       304529-318819      - |      GTPBP6
#>   ENSG00000167393        X       333963-386955      - |     PPP2R3B
#>   ENSG00000185960        X       624344-659411      + |        SHOX
#>   ENSG00000205755        X     1187549-1212750      - |       CRLF2
#>               ...      ...                 ...    ... .         ...
#>   ENSG00000277745        X 155459415-155460005      - |      H2AFB3
#>   ENSG00000185973        X 155490115-155669944      - |       TMLHE
#>   ENSG00000168939        X 155767812-155782459      + |       SPRY3
#>   ENSG00000124333        X 155881293-155943769      + |       VAMP7
#>   ENSG00000124334        X 155997581-156010817      + |        IL9R
#>                         gene_id  gene_biotype
#>                     <character>   <character>
#>   ENSG00000182378 ENSG000001... protein_co...
#>   ENSG00000178605 ENSG000001... protein_co...
#>   ENSG00000167393 ENSG000001... protein_co...
#>   ENSG00000185960 ENSG000001... protein_co...
#>   ENSG00000205755 ENSG000002... protein_co...
#>               ...           ...           ...
#>   ENSG00000277745 ENSG000002... protein_co...
#>   ENSG00000185973 ENSG000001... protein_co...
#>   ENSG00000168939 ENSG000001... protein_co...
#>   ENSG00000124333 ENSG000001... protein_co...
#>   ENSG00000124334 ENSG000001... protein_co...
#>   -------
#>   seqinfo: 1 sequence from GRCh38 genome
ensembldb ::transcripts(x, columns=ensembldb::listColumns(x, "tx"),
                        filter = AnnotationFilter::AnnotationFilterList(), order.type = "asc", return.type = "GRanges")
#> GRanges object with 216741 ranges and 6 metadata columns:
#>                   seqnames            ranges strand |         tx_id
#>                      <Rle>         <IRanges>  <Rle> |   <character>
#>   ENST00000456328        1       11869-14409      + | ENST000004...
#>   ENST00000450305        1       12010-13670      + | ENST000004...
#>   ENST00000488147        1       14404-29570      - | ENST000004...
#>   ENST00000619216        1       17369-17436      - | ENST000006...
#>   ENST00000473358        1       29554-31097      + | ENST000004...
#>               ...      ...               ...    ... .           ...
#>   ENST00000420810        Y 26549425-26549743      + | ENST000004...
#>   ENST00000456738        Y 26586642-26591601      - | ENST000004...
#>   ENST00000435945        Y 26594851-26634652      - | ENST000004...
#>   ENST00000435741        Y 26626520-26627159      - | ENST000004...
#>   ENST00000431853        Y 56855244-56855488      + | ENST000004...
#>                      tx_biotype tx_cds_seq_start tx_cds_seq_end       gene_id
#>                     <character>        <integer>      <integer>   <character>
#>   ENST00000456328 processed_...             <NA>           <NA> ENSG000002...
#>   ENST00000450305 transcribe...             <NA>           <NA> ENSG000002...
#>   ENST00000488147 unprocesse...             <NA>           <NA> ENSG000002...
#>   ENST00000619216         miRNA             <NA>           <NA> ENSG000002...
#>   ENST00000473358       lincRNA             <NA>           <NA> ENSG000002...
#>               ...           ...              ...            ...           ...
#>   ENST00000420810 processed_...             <NA>           <NA> ENSG000002...
#>   ENST00000456738 unprocesse...             <NA>           <NA> ENSG000002...
#>   ENST00000435945 unprocesse...             <NA>           <NA> ENSG000002...
#>   ENST00000435741 processed_...             <NA>           <NA> ENSG000002...
#>   ENST00000431853 processed_...             <NA>           <NA> ENSG000002...
#>                         tx_name
#>                     <character>
#>   ENST00000456328 ENST000004...
#>   ENST00000450305 ENST000004...
#>   ENST00000488147 ENST000004...
#>   ENST00000619216 ENST000006...
#>   ENST00000473358 ENST000004...
#>               ...           ...
#>   ENST00000420810 ENST000004...
#>   ENST00000456738 ENST000004...
#>   ENST00000435945 ENST000004...
#>   ENST00000435741 ENST000004...
#>   ENST00000431853 ENST000004...
#>   -------
#>   seqinfo: 357 sequences (1 circular) from GRCh38 genome

#txdbEnsemblGRCh38 <- GenomicFeatures::makeTxDbFromEnsembl(organism="Homo sapiens", release=98)
#txdb <- as.list(txdbEnsemblGRCh38)
#lapply(txdb,head)

11 Spectrum data analysis

This section collects notes on peptide/protein analysis, especially with respect to spectrum data.

11.1 Peptide sequence

Here is an example for PROC_HUMAN, which is handled by the Biostrings package,

fasta_file_path <- 'https://rest.uniprot.org/uniprotkb/P04070.fasta'
fasta_sequences <- Biostrings::readAAStringSet(fasta_file_path, format = "fasta")
AA_sequence <- fasta_sequences[[1]]
cat("Sequence:", toString(AA_sequence), "\n")
iso_442688365 <- 'TDGEGALSEPSATVTIEELAAPPPPVLMHHGESSQVLHPGNK'
match_position <- regexpr(iso_442688365, AA_sequence)
match_position
mp <- matchPattern(iso_442688365,AA_sequence)
mp
load(system.file("tests","PROC.rda",package="pQTLtools"))
pQTLtools::peptideAssociationPlot(protein,cistrans)
#> Joining with `by = join_by(Modified.Peptide.Sequence)`
peptide association plot

Figure 11.1: peptide association plot

11.2 Spectrum data analysis

11.2.1 Setup

The .raw files can be handled by rawrr package nevertheless it requires necessary files,

library(rawrr)
if (isFALSE(rawrr::.checkDllInMonoPath())){
   rawrr::installRawFileReaderDLLs()
}
if (isFALSE(file.exists(rawrr:::.rawrrAssembly()))){
   rawrr::installRawrrExe()
}

11.2.2 List of.raw files

Based on a real project, the following is an example of listing/generating multiple .raw from .zip files

# ZWK .raw data
spectra_ZWK <- "~/Caprion/pre_qc_data/spectral_library_ZWK"
raw_files <- list.files(spectra_ZWK, pattern = "\\.raw$", full.names = TRUE)
## collectively
suppressMessages(library(MsBackendRawFileReader))
ZWK <- Spectra::backendInitialize(MsBackendRawFileReader::MsBackendRawFileReader(),
       files = raw_files)
class(ZWK)
methods(class=class(ZWK))
Spectra(ZWK)
spectraData(ZWK)
ZWK
ZWKvars <- ZWK |> Spectra::spectraVariables()
ZWKdata <- ZWK |> Spectra::spectraData()
dim(ZWKdata)
# rows with >=1 non-NA value in the columns with prefix "precursor"
precursor <- apply(ZWKdata[grep("precursor",ZWKvars)], 1, function(x) any(!is.na(x)))
ZWKdata_filtered <- ZWKdata[precursor, ]
save(ZWK,file="~/Caprion/analysis/work/ZWK.rda")

# ZYQ/UDP
library(utils)
spectra <- "~/Caprion/pre_qc_data/spectra"
zip_files <- dir(spectra, recursive = TRUE, full.names=TRUE)
work_dir <- "~/Caprion/analysis/work"
for (zip_file in zip_files) unzip(zip_file, exdir=work_dir)
ZYQ_UDP <- Spectra::backendInitialize(MsBackendRawFileReader::MsBackendRawFileReader(),
           files = dir(work_dir,patt="raw",full.names=TRUE))
class(ZYQ_UDP)
ZYQ_UDP
ZYQ_UDP |> Spectra::spectraVariables()
save(ZYQ_UDP,file="~/Caprion/analysis/work/ZYQ_UDP.rda")

11.2.3 Usage

Various facilities are shown below.

# various files
d <- "/rds/project/rds-zuZwCZMsS0w/Caprion_proteomics/analysis/crux"
f <- file.path(d,"szwk901104i19801xms1.mzML")
x <- file.path(d,"szwk901104i19801xms1.mzXML")
g <- file.path(d,"szwk901104i19801xms1.mgf")
r <- file.path(d,"szwk901104i19801xms1.rda")
z <- file.path(d,"szwk901104i19801xms1.mzML.gz")

# mzML
mz <- mzR::openMSfile(f)
header_info <- mzR::header(mz)
table(header_info$msLevel)
peak_data <- mzR::peaks(mz)
spec <- mzR::spectra(mz)
class(spec)
length(spec)
lapply(spec,head,3)
methods(class="mzRpwiz")
mzR::close(mz)

mz <- mzR::openMSfile(z, backend = "pwiz")
mz
nChrom(mz)
head(tic(mz))
head(chromatogram(mz, 1L)) ## same as tic(x)
str(chromatogram(mz))
head(peaks(mz, scan=4))

# MSnbase
mzXML <- MSnbase::readMSData(x)
mgf <- MSnbase::readMgfData(g)
save(mzXML,mgf,file=r)

MSnbase::extractSpectraData(mzXML)
MSnbase::hasSpectra(z)
MSnbase::hasChromatograms(z)
MSnbase::plot2d(mzXML,z="peaks.count")
MSnbase::plotDensity(mzXML,z="precursor.mz")

MSnbase::extractSpectraData(mgf)
methods(class="MSpectra")
MSnbase::mz(mgf)
MSnbase::intensity(mgf)
MSnbase::rtime(mgf)
MSnbase::precursorMz(mgf)
MSnbase::precursorCharge(mgf)
MSnbase::precScanNum(mgf)
MSnbase::precursorIntensity(mgf)
MSnbase::acquisitionNum(mgf)
MSnbase::scanIndex(mgf)
MSnbase::peaksCount(mgf)
MSnbase::msLevel(mgf)
MSnbase::tic(mgf)
MSnbase::ionCount(mgf)
MSnbase::collisionEnergy(mgf)
MSnbase::fromFile(mgf)
MSnbase::polarity(mgf)
MSnbase::smoothed(mgf)
MSnbase::centroided(mgf)
MSnbase::isCentroided(mgf)
MSnbase::writeMgfData(mgf, con = "spectra.mgf", COM = NULL, TITLE = NULL)
MSnbase::removePeaks(mgf, t, msLevel., ...)
MSnbase::filterMsLevel(mgf, msLevel=2)
MSnbase::as.ExpressionSet(mgf)

# This turned to be really slow!
sp_list <- lapply(seq_along(mgf), function(i) {
  intensity_i <- MSnbase::intensity(mgf)[[i]]
  mz_i <- MSnbase::mz(mgf)[[i]]
  centroided_i <- MSnbase::centroided(mgf)[[i]]
  return(new("Spectrum1", intensity = intensity_i, mz = mz_i, centroided = centroided_i))
})
sp1 <- do.call(rbind, sp_list)
# only the first one is more manageable
sp1 <- new("Spectrum1",intensity=MSnbase::intensity(mgf)[[1]],mz=MSnbase::mz(mgf)[[1]],centroided=MSnbase::centroided(mgf)[[1]])
sp2 <- MSnbase::pickPeaks(sp1)
MSnbase::intensity(sp2)
plot(MSnbase::mz(sp1),MSnbase::intensity(sp1),type="h")
## Without m/z refinement
points(MSnbase::mz(sp2), MSnbase::intensity(sp2), col = "darkgrey")
## Using k = 1, closest signals
sp3 <- MSnbase::pickPeaks(sp1, refineMz = "kNeighbors", k = 1)
points(MSnbase::mz(sp3), MSnbase::intensity(sp3), col = "green", type = "h")
## Using descendPeak requiring at least 50% or the centroid's intensity
sp4 <- MSnbase::pickPeaks(sp1, refineMz = "descendPeak", signalPercentage = 50)
points(MSnbase::mz(sp4), MSnbase::intensity(sp4), col = "red", type = "h")

# CAMERA
xs   <- CAMERA::xcmsSet(f, method="centWave", ppm=30, peakwidth=c(5,10))
an   <- CAMERA::xsAnnotate(xs)
an   <- CAMERA::groupFWHM(an)
#For one group
peaklist <- CAMERA::getpspectra(an, 1)
#For two groups
peaklist <- CAMERA::getpspectra(an, c(1,2))

# Spectra
suppressMessages(library(Spectra))
sp <- Spectra::Spectra(z)
head(sp)
table(sp$msLevel)
d <- Spectra::computeMzDeltas(sp[1:1000])
Spectra::plotMzDelta(d)

# protViz
protViz::fragmentIon("TFVLNFIK")
esd <- MSnbase::extractSpectraData(mgf)
op <- par(mfrow=c(2,1))
ms <- function(i) with(esd[i,],list(title=TITLE,rtinseconds=RTINSECONDS,pepmass=PEPMASS,charge=CHARGE,
                                    mZ=MSnbase::mz(mgf[[i]]),intensity=MSnbase::intensity(mgf[[i]])))
protViz::peakplot("TAFDEAIAELDTLNEESYK", ms(1))
protViz::peakplot("TAFDEAIAELDTLSEESYK", ms(2))
par(op)
load("~/Caprion/pilot/ZWK.rda")
peptides <- subset(mapping_ZWK,Protein=="PROC_HUMAN")[["Modified.Peptide.Sequence"]] |> unique()
pim <- protViz::parentIonMass(peptides)
fi <- protViz::fragmentIon(peptides)
df <- as.data.frame(fi)
op <- par(mfrow=c(3,1))
for (i in 1:length(peptides)){
    plot(0, 0,
    xlab='m/Z',
    ylab='',
    xlim=range(c(fi[[i]]$b,fi[[i]]$y)),
    ylim=c(0,1),
    type='n',
    axes=FALSE,
    sub=paste(peptides[i], "/", pim[i], "Da"));
    box()
    axis(1, fi[[i]]$b, round(fi[[i]]$b,1), las=2)
    axis(1, fi[[i]]$y, round(fi[[i]]$y,1), las=2)

    pepSeq<-strsplit(peptides[i], "")
    axis(3,fi[[i]]$b, paste("b", row.names(fi[[i]]),sep=''),las=2)
    axis(3,fi[[i]]$y, paste("y", row.names(fi[[i]]),sep=''),las=2)

    text(fi[[i]]$b, rep(0.3, nchar(peptides[i])),
    pepSeq[[1]],pos=3,cex=4, lwd=4, col="#aaaaaaaa")

    abline(v=fi[[i]]$b, col='red')
    abline(v=fi[[i]]$y, col='blue',lwd=2)
}
par(op)

# MSstats
head(SRMRawData)
QuantData <- MSstats::dataProcess(SRMRawData, use_log_file = FALSE)
quant <- MSstats::dataProcess(SRMRawData,
                              normalization = "equalizeMedians",
                              summaryMethod = "TMP",
                              censoredInt = "NA",
                              MBimpute = TRUE,
                              maxQuantileforCensored = 0.999,
                              logTrans=2,
                              use_log_file=FALSE,
                              numberOfCores=5)
names(quant)

MSstats3 takes output from other data-processing pipelines.

12 Bionconductor forum

Web: https://support.bioconductor.org/

13 Bioconductor/CRAN packages

Package Description
Bioconductor
AnnotationDbi AnnotationDb objects and their progeny, methods etc.
Biobase Base functions for Bioconductor
Biostrings Efficient manipulation of biological strings
CAMERA Collection of annotation related methods
clusterProfiler Functional profiles for genes and gene clusters
ComplexHeatmap Make complex heatmaps
DESSeq2 Differential gene expression analysis based on the negative binomial distribution
edgeR Empirical analysis of digital gene expression
EnsDb.Hsapiens.v86 Exposes an annotation databases generated from Ensembl
ensembldb Retrieve annotation data from an Ensembl based package
FlowSorted.DLPFC.450k Illumina HumanMethylation data on sorted frontal cortex cell populations
graphite GRAPH Interaction from pathway topological environment
IlluminaHumanMethylation450kmanifest Annotation for Illumina’s 450k methylation arrays
INSPEcT Quantification of the intronic and exonic gene features and the post-transcriptional regulation analysis
MSnbase Base Functions and Classes for Mass Spectrometry and Proteomics
MSstats Protein Significance Analysis in DDA, SRM and DIA Proteomics
mzR parser for netCDF, mzXML and mzML and mzIdentML files
org.Hs.eg.db Conversion of Entrez ID – gene symbols
OUTRIDER OUTlier in RNA-Seq fInDER
Pi Priority index, leveraging genetic evidence to prioritise drug targets at the gene and pathway level
quantro A test for when to use quantile normalisation
rawrr Direct Access to Orbitrap Data and Beyond
recount3 Interface to uniformly processed RNA-seq data
Rgraphiz Interfaces R with the AT&T graphviz library for plotting R graph objects from the graph package
Spectra Spectra Infrastructure for Mass Spectrometry
sva Surrogate Variable Analysis
TxDb.Hsapiens.UCSC.hg38.knownGene Annotation of the human genome
CRAN
doParallel Foreach Parallel Adaptor for the ‘parallel’ Package
GeneNet Modeling and Inferring Gene Networks
ggplot2 Data Visualisations Using the grammar of graphics
heatmaply Interactive Cluster Heat Maps Using plotly and ggplot2
pheatmap results visualisation
plyr Splitting, applying and combining data
protViz Foreach Parallel Adaptor for the ‘parallel’ Package
RColorBrewer ColorBrewer Palettes
WGCNA Weighted correlation network analysis

References

1.
Köster, J., Forster, J., Schmeier, S. & Salazar, V. snakemake-workflows/rna-seq-star-deseq2: Version 1.2.0. (2021) doi:10.5281/zenodo.5245549.
2.
Carlson, M. R. J., Pagès, H., Arora, S., Obenchain, V. & Morgan, M. Genomic annotation resources in R/Bioconductor. in Statistical genomics: Methods and protocols (eds. Mathé, E. & Davis, S.) 67–69 (Springer New York, New York, NY, 2016). doi:10.1007/978-1-4939-3578-9_4.
3.