
This article collects notes on Bioconductor packages, made available here to facilitate their use and extension.

pkgs <- c("AnnotationDbi", "AnnotationFilter", "Biostrings", "ComplexHeatmap", "DESeq2", "EnsDb.Hsapiens.v86",
          "FlowSorted.DLPFC.450k", "GeneNet", "GenomicFeatures", "IlluminaHumanMethylation450kmanifest",
          "OUTRIDER","RColorBrewer", "RMariaDB", "Rgraphviz", "S4Vectors", "SummarizedExperiment",
          "TxDb.Hsapiens.UCSC.hg38.knownGene", "bladderbatch", "clusterProfiler",
          "corpcor", "doParallel", "ensembldb", "fdrtool", "graph", "graphite", "heatmaply",
          "minfi", "org.Hs.eg.db", "plyr", "quantro", "recount3", "sva")
for (p in pkgs) if (length(grep(paste("^package:", p, "$", sep=""), search())) == 0) {
    if (!requireNamespace(p)) warning(paste0("This vignette needs package `", p, "'; please install"))
}
invisible(suppressMessages(lapply(pkgs, require, character.only = TRUE)))
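
The loop above warns about one package at a time. As a minimal sketch, the same check can return all unavailable packages in one vector; `missing_packages` is a hypothetical helper, not part of pQTLtools, and the package names in the call are placeholders chosen so that the example runs with base R only.

```r
# Hypothetical helper (an assumption, not in the vignette):
# return the names of packages whose namespace cannot be loaded.
missing_packages <- function(pkgs) {
  ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unname(pkgs[!ok])
}

# "stats" and "utils" ship with R; the third name is deliberately bogus.
missing_packages(c("stats", "utils", "aPackageThatDoesNotExist"))
```

The resulting vector could then be handed to an installer such as `BiocManager::install()`, assuming BiocManager is available.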

1 liftover

See inst/turboman in the source, https://github.com/jinghuazhao/pQTLtools/tree/master/inst/turboman, or the turboman/ directory in the installed package.

2 Normalisation

2.1 ComBat

This is the documentation example, based on Bioconductor 3.14.

data(bladderdata, package="bladderbatch")
edat <- bladderEset[1:50]

pheno <- Biobase::pData(edat)
batch <- pheno$batch
table(batch)
#> batch
#>  1  2  3  4  5 
#> 11 18  4  5 19
quantro::matboxplot(edat,batch,cex.axis=0.6,notch=TRUE,pch=19,ylab="Expression")

Figure 2.1: ComBat example

quantro::matdensity(edat,batch,xlab=" ",ylab="density")
legend("topleft",legend=1:5,col=1:5,lty=1)

Figure 2.2: ComBat example


# 1. parametric adjustment
combat_edata1 <- sva::ComBat(dat=edat, batch=batch, par.prior=TRUE, prior.plots=TRUE)
#> Found5batches
#> Adjusting for0covariate(s) or covariate level(s)
#> Standardizing Data across genes
#> Fitting L/S model and finding priors
#> Finding parametric adjustments
#> Adjusting the Data

Figure 2.3: ComBat example


# 2. non-parametric adjustment, mean-only version
combat_edata2 <- sva::ComBat(dat=edat, batch=batch, par.prior=FALSE, mean.only=TRUE)
#> Using the 'mean only' version of ComBat
#> Found5batches
#> Adjusting for0covariate(s) or covariate level(s)
#> Standardizing Data across genes
#> Fitting L/S model and finding priors
#> Finding nonparametric adjustments
#> Adjusting the Data

# 3. reference-batch version, with covariates
mod <- model.matrix(~as.factor(cancer), data=pheno)
combat_edata3 <- sva::ComBat(dat=edat, batch=batch, mod=mod, par.prior=TRUE, ref.batch=3, prior.plots=TRUE)
#> Using batch =3as a reference batch (this batch won't change)
#> Found5batches
#> Adjusting for2covariate(s) or covariate level(s)
#> Standardizing Data across genes
#> Fitting L/S model and finding priors
#> Finding parametric adjustments
#> Adjusting the Data

Figure 2.4: ComBat example
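
To give a feel for what a mean-only batch adjustment does, the sketch below centres each gene within each batch and restores the overall gene mean. It is a deliberately simplified stand-in under stated assumptions: it is not ComBat's empirical Bayes procedure, the data are simulated, and `center_by_batch` is a hypothetical helper.

```r
set.seed(1)
n_genes <- 5
batch <- factor(rep(1:3, times = c(4, 3, 5)))

# Simulated expression with an additive shift per batch (toy data).
shift <- c(0, 2, -1)[batch]
x <- matrix(rnorm(n_genes * length(batch)), nrow = n_genes)
x <- sweep(x, 2, shift, "+")

# Hypothetical helper: remove per-gene batch means, then add back the
# per-gene grand mean. No variance adjustment and no shrinkage is done.
center_by_batch <- function(mat, batch) {
  grand <- rowMeans(mat)
  out <- mat
  for (b in levels(batch)) {
    idx <- batch == b
    out[, idx] <- mat[, idx, drop = FALSE] -
      rowMeans(mat[, idx, drop = FALSE]) + grand
  }
  out
}

adj <- center_by_batch(x, batch)

# Per-gene means within each batch after adjustment (genes x batches).
batch_means <- sapply(levels(batch),
                      function(b) rowMeans(adj[, batch == b, drop = FALSE]))
round(batch_means - rowMeans(x), 10)
```

After the adjustment every batch has the same per-gene mean, which is the location part of the batch effect; ComBat additionally shrinks the batch estimates across genes and, unless `mean.only=TRUE`, adjusts the scale.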

2.2 quantro

This is also adapted from the package vignette but with FlowSorted.DLPFC.450k in place of FlowSorted.

data(FlowSorted.DLPFC.450k,package="FlowSorted.DLPFC.450k")
p <- minfi::getBeta(FlowSorted.DLPFC.450k,offset=100)
pd <- Biobase::pData(FlowSorted.DLPFC.450k)
quantro::matboxplot(p, groupFactor = pd$CellType, xaxt = "n", main = "Beta Values", pch=19)

Figure 2.5: quantro example

quantro::matdensity(p, groupFactor = pd$CellType, xlab = " ", ylab = "density",
                    main = "Beta Values", brewer.n = 8, brewer.name = "Dark2")
legend('top', c("NeuN_neg", "NeuN_pos"), col = c(1, 2), lty = 1, lwd = 3)

Figure 2.6: quantro example

qtest <- quantro::quantro(object = p, groupFactor = pd$CellType)
#> [quantro] Average medians of the distributions are 
#>                         not equal across groups.
#> [quantro] Calculating the quantro test statistic.
#> [quantro] No permutation testing performed. 
#>                          Use B > 0 for permutation testing.
if (FALSE)
{
  doParallel::registerDoParallel(cores=10)
  qtestPerm <- quantro::quantro(p, groupFactor = pd$CellType, B = 1000)
  quantro::quantroPlot(qtestPerm)
}
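
quantro is meant to help decide whether quantile normalisation is appropriate for a data set. For reference, the normalisation in question can be sketched in base R as below; this is an illustrative implementation on simulated data with a hypothetical function name, not the routine used by quantro or minfi.

```r
# Hypothetical helper: force every column (sample) to share one
# empirical distribution, namely the mean of the sorted columns.
quantile_normalise <- function(mat) {
  ranks <- apply(mat, 2, rank, ties.method = "first")
  sorted <- apply(mat, 2, sort)
  reference <- rowMeans(sorted)          # shared reference quantiles
  out <- apply(ranks, 2, function(r) reference[r])
  dimnames(out) <- dimnames(mat)
  out
}

set.seed(2)
toy <- cbind(a = rnorm(6), b = rnorm(6, mean = 3), c = rexp(6))
qn <- quantile_normalise(toy)

# After normalisation the sorted values are identical across columns.
apply(qn, 2, sort)
```

Because all samples end up with the same distribution, genuine global differences between groups would be erased, which is the situation the quantro test is designed to flag.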

3 Outlier detection in RNA-Seq

The following is adapted from package OUTRIDER,

ctsFile <- system.file('extdata', 'KremerNBaderSmall.tsv', package='OUTRIDER')
ctsTable <- read.table(ctsFile, check.names=FALSE)
ods <- OUTRIDER::OutriderDataSet(countData=ctsTable)
ods <- OUTRIDER::filterExpression(ods, minCounts=TRUE, filterGenes=TRUE)
#> 229 genes did not pass the filter due to zero counts. This is 22.9% of the genes.
ods <- OUTRIDER::OUTRIDER(ods)
#> Fri Jun  7 10:45:13 2024: SizeFactor estimation ...
#> Fri Jun  7 10:45:14 2024: Controlling for confounders ...
#> Using estimated q with: 23
#> Fri Jun  7 10:45:14 2024: Using the autoencoder implementation for controlling.
#> [1] "Fri Jun  7 10:45:18 2024: Initial PCA loss: 4.73997327486604"
#> [1] "Fri Jun  7 10:45:23 2024: Iteration: 1 loss: 4.19495955308718"
#> [1] "Fri Jun  7 10:45:27 2024: Iteration: 2 loss: 4.175558109168"
#> [1] "Fri Jun  7 10:45:29 2024: Iteration: 3 loss: 4.16627394855922"
#> [1] "Fri Jun  7 10:45:31 2024: Iteration: 4 loss: 4.16188545534749"
#> [1] "Fri Jun  7 10:45:34 2024: Iteration: 5 loss: 4.15785003075681"
#> [1] "Fri Jun  7 10:45:37 2024: Iteration: 6 loss: 4.15514657702473"
#> [1] "Fri Jun  7 10:45:38 2024: Iteration: 7 loss: 4.15361195345535"
#> [1] "Fri Jun  7 10:45:41 2024: Iteration: 8 loss: 4.1519827021534"
#> [1] "Fri Jun  7 10:45:42 2024: Iteration: 9 loss: 4.15111658632385"
#> [1] "Fri Jun  7 10:45:44 2024: Iteration: 10 loss: 4.14989393180947"
#> [1] "Fri Jun  7 10:45:45 2024: Iteration: 11 loss: 4.14980710469775"
#> [1] "Fri Jun  7 10:45:47 2024: Iteration: 12 loss: 4.14920776796145"
#> [1] "Fri Jun  7 10:45:48 2024: Iteration: 13 loss: 4.14899084050906"
#> [1] "Fri Jun  7 10:45:50 2024: Iteration: 14 loss: 4.14820997972724"
#> [1] "Fri Jun  7 10:45:51 2024: Iteration: 15 loss: 4.14814964413254"
#> Time difference of 31.53587 secs
#> [1] "Fri Jun  7 10:45:51 2024: 15 Final nb-AE loss: 4.14814964413254"
#> Fri Jun  7 10:45:51 2024: Used the autoencoder implementation for controlling.
#> Fri Jun  7 10:45:51 2024: P-value calculation ...
#> Fri Jun  7 10:45:55 2024: Zscore calculation ...
res <- OUTRIDER::results(ods)
knitr::kable(res,caption="A check list of outliers")
Table 3.1: A check list of outliers
geneID        sampleID   pValue    padjust  zScore   l2fc  rawcounts  meanRawcounts  normcounts  meanCorrected   theta  aberrant  AberrantBySample  AberrantByGene  padj_rank
ATAD3C        MUC1360   0.0e+00  0.0000002    5.28   1.87        948          82.29      247.03          67.31   16.60  TRUE                     1               1          1
NBPF15        MUC1351   0.0e+00  0.0000037    5.78   0.78       7591        4224.88     7054.78        4121.48  110.75  TRUE                     2               1          1
MSTO1         MUC1367   0.0e+00  0.0000215   -6.23  -0.81        761        1327.87      728.28        1276.06  151.81  TRUE                     1               1          1
HDAC1         MUC1350   0.0e+00  0.0000764   -5.95  -0.78       2215        3805.56     2122.51        3648.82  137.72  TRUE                     1               1          1
DCAF6         MUC1374   1.0e-07  0.0003708   -5.69  -0.62       2348        4869.53     3082.74        4724.20  197.18  TRUE                     1               1          1
NBPF16        MUC1351   2.0e-07  0.0006284    4.84   0.68       4014        2459.90     3836.96        2402.50  106.52  TRUE                     2               1          2
FAM102B       MUC1363   1.2e-06  0.0068396   -5.37  -1.28        455        1138.75      443.55        1076.64   42.53  TRUE                     1               1          1
LOC100288142  MUC1361   4.0e-06  0.0222251    4.20   0.84        637         356.12      618.68         345.77   57.25  TRUE                     1               1          1
SERINC2       MUC1343   7.5e-06  0.0416750   -6.10  -4.91         42        1718.44       45.84        1405.65    4.52  TRUE                     1               1          1
TARDBP        MUC0486   7.6e-06  0.0423348   -4.55  -0.32       5911        5780.34     4452.35        5565.42  466.01  TRUE                     1               1          1
OUTRIDER::plotQQ(ods, res["geneID"],global=TRUE)

Figure 3.1: Q-Q plot for Kremer data

4 Differential expression

ex <- DESeq2::makeExampleDESeqDataSet(m=4)
dds <- DESeq2::DESeq(ex)
#> estimating size factors
#> estimating dispersions
#> gene-wise dispersion estimates
#> mean-dispersion relationship
#> final dispersion estimates
#> fitting model and testing
res <- DESeq2::results(dds, contrast=c("condition","B","A"))
rld <- DESeq2::rlogTransformation(ex, blind=TRUE)
dat <- DESeq2::plotPCA(rld, intgroup=c("condition"),returnData=TRUE)
#> using ntop=500 top features by variance
percentVar <- round(100 * attr(dat,"percentVar"))
ggplot2::ggplot(dat, ggplot2::aes(PC1, PC2, color=condition, shape=condition)) +
ggplot2::geom_point(size=3) +
ggplot2::xlab(paste0("PC1:",percentVar[1],"% variance")) +
ggplot2::ylab(paste0("PC2:",percentVar[2],"% variance"))

Figure 4.1: DESeq2 example
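
The percentages on the PCA axes are the shares of variance carried by each principal component. A base R sketch of that calculation on simulated data follows; the matrix, the variable names and the `ntop` value are illustrative assumptions, and the code only mirrors the general idea of selecting the most variable features before running `prcomp`.

```r
set.seed(3)
# Toy expression matrix: 200 features (rows) by 6 samples (columns).
expr <- matrix(rnorm(200 * 6), nrow = 200,
               dimnames = list(paste0("gene", 1:200), paste0("s", 1:6)))

ntop <- 50                                   # illustrative choice
rv <- apply(expr, 1, var)                    # per-feature variance
top <- order(rv, decreasing = TRUE)[seq_len(min(ntop, length(rv)))]

# PCA on samples, using only the most variable features.
pca <- prcomp(t(expr[top, ]))
percent_var <- 100 * pca$sdev^2 / sum(pca$sdev^2)
round(percent_var[1:2])
```

The first two entries of `percent_var` play the role of the PC1 and PC2 labels in the plot above.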

ex$condition <- relevel(ex$condition, ref="B")
dds2 <- DESeq2::DESeq(dds)
#> using pre-existing size factors
#> estimating dispersions
#> found already estimated dispersions, replacing these
#> gene-wise dispersion estimates
#> mean-dispersion relationship
#> final dispersion estimates
#> fitting model and testing
res <- DESeq2::results(dds2)
knitr::kable(head(as.data.frame(res)))
        baseMean  log2FoldChange      lfcSE        stat     pvalue       padj
gene1  277.46471       0.0164732  0.4477271   0.0367929  0.9706501  0.9998477
gene2   10.51017       0.9062864  1.2748781   0.7108808  0.4771581  0.9998477
gene3   37.01996       0.4181164  0.9271126   0.4509877  0.6519984  0.9998477
gene4   78.76033      -0.9719377  0.6074553  -1.6000153  0.1095952  0.9998477
gene5   14.43047      -0.5318057  1.4614592  -0.3638868  0.7159426  0.9998477
gene6   41.86626       1.2435933  0.7644820   1.6267136  0.1037979  0.9998477
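
The reference level of a factor determines the direction in which an effect is reported. The sketch below illustrates this with an ordinary linear model on simulated data, used here only as a stand-in for the negative binomial model fitted by DESeq2; all object names are illustrative.

```r
set.seed(4)
condition <- factor(rep(c("A", "B"), each = 5))
y <- rnorm(10, mean = ifelse(condition == "B", 1, 0))

# Reference level "A": the coefficient is the B minus A difference.
fit_A <- lm(y ~ condition)

# Reference level "B": the coefficient is the A minus B difference.
condition_B <- relevel(condition, ref = "B")
fit_B <- lm(y ~ condition_B)

coef(fit_A)["conditionB"]
coef(fit_B)["condition_BA"]
```

The two coefficients have the same magnitude and opposite signs, so a releveled design changes the sign of the reported effect but not its size.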

See the package in action in a snakemake workflow (footnote 1).

5 Gene co-expression and network analysis

A simple network is furnished with the GeneNet documentation example,

## A random network with 40 nodes 
# it contains 780=40*39/2 edges of which 5 percent (=39) are non-zero
true.pcor <- GeneNet::ggm.simulate.pcor(40)
  
# A data set with 40 observations
m.sim <- GeneNet::ggm.simulate.data(40, true.pcor)

# A simple estimate of partial correlations
estimated.pcor <- corpcor::cor2pcor( cor(m.sim) )

# A comparison of estimated and true values
sum((true.pcor-estimated.pcor)^2)
#> [1] 593.2851

# A slightly better estimate ...
estimated.pcor.2 <- GeneNet::ggm.estimate.pcor(m.sim)
#> Estimating optimal shrinkage intensity lambda (correlation matrix): 0.3689
sum((true.pcor-estimated.pcor.2)^2)
#> [1] 10.85778
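
As a reminder of what the simple estimate computes, partial correlations can be read off the inverse of the correlation matrix. The sketch below does this in base R on simulated data and checks one entry against the textbook three-variable formula; `pcor_from_cor` is a hypothetical name, and no shrinkage is applied, which is why such an estimate degrades when the number of variables approaches the number of observations.

```r
# Hypothetical helper: partial correlations from a correlation matrix,
# pcor_ij = -P_ij / sqrt(P_ii * P_jj) with P the inverse of R.
pcor_from_cor <- function(R) {
  P <- solve(R)
  pc <- -cov2cor(P)
  diag(pc) <- 1
  pc
}

set.seed(5)
x <- matrix(rnorm(300), ncol = 3)
x[, 1] <- x[, 1] + 0.5 * x[, 3]   # variables 1 and 2 both depend on 3
x[, 2] <- x[, 2] + 0.5 * x[, 3]
R <- cor(x)
pc <- pcor_from_cor(R)

# Textbook formula for the partial correlation of 1 and 2 given 3.
r12.3 <- (R[1, 2] - R[1, 3] * R[2, 3]) /
  sqrt((1 - R[1, 3]^2) * (1 - R[2, 3]^2))
c(matrix_inverse = pc[1, 2], formula = r12.3)

# Edge count quoted in the comment above: 40 nodes give 780 pairs.
choose(40, 2)
```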

## ecoli data 
data(ecoli, package="GeneNet")

# partial correlation matrix 
inferred.pcor <- GeneNet::ggm.estimate.pcor(ecoli)
#> Estimating optimal shrinkage intensity lambda (correlation matrix): 0.1804

# p-values, q-values and posterior probabilities for each potential edge 
test.results <- GeneNet::network.test.edges(inferred.pcor)
#> Estimate (local) false discovery rates (partial correlations):
#> Step 1... determine cutoff point
#> Step 2... estimate parameters of null distribution and eta0
#> Step 3... compute p-values and estimate empirical PDF/CDF
#> Step 4... compute q-values and local fdr
#> Step 5... prepare for plotting

# best 20 edges (strongest correlation)
test.results[1:20,]
#>           pcor node1 node2         pval         qval      prob
#> 1   0.23185664    51    53 2.220446e-16 3.612205e-13 1.0000000
#> 2   0.22405545    52    53 2.220446e-16 3.612205e-13 1.0000000
#> 3   0.21507824    51    52 2.220446e-16 3.612205e-13 1.0000000
#> 4   0.17328863     7    93 3.108624e-15 3.792816e-12 0.9999945
#> 5  -0.13418892    29    86 1.120812e-09 1.093997e-06 0.9999516
#> 6   0.12594697    21    72 1.103836e-08 8.978563e-06 0.9998400
#> 7   0.11956105    28    86 5.890924e-08 3.853590e-05 0.9998400
#> 8  -0.11723897    26    80 1.060526e-07 5.816172e-05 0.9998400
#> 9  -0.11711625    72    89 1.093655e-07 5.930499e-05 0.9972804
#> 10  0.10658013    20    21 1.366610e-06 5.925275e-04 0.9972804
#> 11  0.10589778    21    73 1.596859e-06 6.678429e-04 0.9972804
#> 12  0.10478689    20    91 2.053403e-06 8.024425e-04 0.9972804
#> 13  0.10420836     7    52 2.338382e-06 8.778605e-04 0.9944557
#> 14  0.10236077    87    95 3.525186e-06 1.224964e-03 0.9944557
#> 15  0.10113550    27    95 4.610444e-06 1.500047e-03 0.9920084
#> 16  0.09928954    21    51 6.868357e-06 2.046549e-03 0.9920084
#> 17  0.09791914    21    88 9.192373e-06 2.520616e-03 0.9920084
#> 18  0.09719685    18    95 1.070232e-05 2.790102e-03 0.9920084
#> 19  0.09621791    28    90 1.313007e-05 3.171817e-03 0.9920084
#> 20  0.09619099    12    80 1.320374e-05 3.182526e-03 0.9920084

# network containing edges with prob > 0.9 (i.e. local fdr < 0.1)
net <- GeneNet::extract.network(test.results, cutoff.ggm=0.9)
#> 
#> Significant edges:  65 
#>     Corresponding to  1.26 %  of possible edges
net
#>           pcor node1 node2         pval         qval      prob
#> 1   0.23185664    51    53 2.220446e-16 3.612205e-13 1.0000000
#> 2   0.22405545    52    53 2.220446e-16 3.612205e-13 1.0000000
#> 3   0.21507824    51    52 2.220446e-16 3.612205e-13 1.0000000
#> 4   0.17328863     7    93 3.108624e-15 3.792816e-12 0.9999945
#> 5  -0.13418892    29    86 1.120812e-09 1.093997e-06 0.9999516
#> 6   0.12594697    21    72 1.103836e-08 8.978563e-06 0.9998400
#> 7   0.11956105    28    86 5.890924e-08 3.853590e-05 0.9998400
#> 8  -0.11723897    26    80 1.060526e-07 5.816172e-05 0.9998400
#> 9  -0.11711625    72    89 1.093655e-07 5.930499e-05 0.9972804
#> 10  0.10658013    20    21 1.366610e-06 5.925275e-04 0.9972804
#> 11  0.10589778    21    73 1.596859e-06 6.678429e-04 0.9972804
#> 12  0.10478689    20    91 2.053403e-06 8.024425e-04 0.9972804
#> 13  0.10420836     7    52 2.338382e-06 8.778605e-04 0.9944557
#> 14  0.10236077    87    95 3.525186e-06 1.224964e-03 0.9944557
#> 15  0.10113550    27    95 4.610444e-06 1.500047e-03 0.9920084
#> 16  0.09928954    21    51 6.868357e-06 2.046549e-03 0.9920084
#> 17  0.09791914    21    88 9.192373e-06 2.520616e-03 0.9920084
#> 18  0.09719685    18    95 1.070232e-05 2.790102e-03 0.9920084
#> 19  0.09621791    28    90 1.313007e-05 3.171817e-03 0.9920084
#> 20  0.09619099    12    80 1.320374e-05 3.182526e-03 0.9920084
#> 21  0.09576091    89    95 1.443542e-05 3.354777e-03 0.9891317
#> 22  0.09473210     7    51 1.784126e-05 3.864825e-03 0.9891317
#> 23 -0.09386896    53    58 2.127622e-05 4.313590e-03 0.9891317
#> 24 -0.09366615    29    83 2.217013e-05 4.421099e-03 0.9891317
#> 25 -0.09341148    21    89 2.334321e-05 4.556947e-03 0.9810727
#> 26 -0.09156391    49    93 3.380043e-05 5.955972e-03 0.9810727
#> 27 -0.09150710    80    90 3.418363e-05 6.002083e-03 0.9810727
#> 28  0.09101505     7    53 3.767966e-05 6.408102e-03 0.9810727
#> 29  0.09050688    21    84 4.164471e-05 6.838782e-03 0.9810727
#> 30  0.08965490    72    73 4.919365e-05 7.581866e-03 0.9810727
#> 31 -0.08934025    29    99 5.229604e-05 7.861416e-03 0.9810727
#> 32 -0.08906819     9    95 5.512708e-05 8.104759e-03 0.9810727
#> 33  0.08888345     2    49 5.713144e-05 8.270673e-03 0.9810727
#> 34  0.08850681    86    90 6.143363e-05 8.610161e-03 0.9810727
#> 35  0.08805868    17    53 6.695170e-05 9.015175e-03 0.9810727
#> 36  0.08790809    28    48 6.890884e-05 9.151291e-03 0.9810727
#> 37  0.08783471    33    58 6.988211e-05 9.217597e-03 0.9682377
#> 38 -0.08705796     7    49 8.101244e-05 1.021362e-02 0.9682377
#> 39  0.08645033    20    46 9.086547e-05 1.102466e-02 0.9682377
#> 40  0.08609950    48    86 9.705862e-05 1.150392e-02 0.9682377
#> 41  0.08598769    21    52 9.911458e-05 1.165816e-02 0.9682377
#> 42  0.08555275    32    95 1.075099e-04 1.226435e-02 0.9682377
#> 43  0.08548231    17    51 1.089311e-04 1.236337e-02 0.9424721
#> 44  0.08470370    80    83 1.258659e-04 1.382356e-02 0.9424721
#> 45  0.08442510    80    82 1.325062e-04 1.437068e-02 0.9174573
#> 46  0.08271606    80    93 1.810275e-04 1.845632e-02 0.9174573
#> 47  0.08235175    46    91 1.933329e-04 1.941579e-02 0.9174573
#> 48  0.08217787    25    95 1.994788e-04 1.988432e-02 0.9174573
#> 49 -0.08170331    29    87 2.171999e-04 2.119715e-02 0.9174573
#> 50  0.08123632    19    29 2.360716e-04 2.253606e-02 0.9174573
#> 51  0.08101702    51    84 2.454547e-04 2.318024e-02 0.9174573
#> 52  0.08030748    16    93 2.782643e-04 2.532796e-02 0.9174573
#> 53  0.08006503    28    52 2.903870e-04 2.608271e-02 0.9174573
#> 54 -0.07941656    41    80 3.252833e-04 2.814824e-02 0.9174573
#> 55  0.07941410    54    89 3.254229e-04 2.815620e-02 0.9174573
#> 56 -0.07934653    28    80 3.292784e-04 2.837511e-02 0.9174573
#> 57  0.07916783    29    92 3.396802e-04 2.895702e-02 0.9174573
#> 58 -0.07866905    17    86 3.703635e-04 3.060293e-02 0.9174573
#> 59  0.07827749    16    29 3.962446e-04 3.191462e-02 0.9174573
#> 60 -0.07808262    73    89 4.097452e-04 3.257290e-02 0.9174573
#> 61  0.07766261    52    67 4.403165e-04 3.400207e-02 0.9174573
#> 62  0.07762917    25    87 4.428396e-04 3.411637e-02 0.9174573
#> 63 -0.07739378     9    93 4.609872e-04 3.492295e-02 0.9174573
#> 64  0.07738885    31    80 4.613747e-04 3.493988e-02 0.9174573
#> 65 -0.07718681    80    94 4.775136e-04 3.563444e-02 0.9174573

# significant based on FDR cutoff Q=0.05?
num.significant.1 <- sum(test.results$qval <= 0.05)
test.results[1:num.significant.1,]
#>           pcor node1 node2         pval         qval      prob
#> 1   0.23185664    51    53 2.220446e-16 3.612205e-13 1.0000000
#> 2   0.22405545    52    53 2.220446e-16 3.612205e-13 1.0000000
#> 3   0.21507824    51    52 2.220446e-16 3.612205e-13 1.0000000
#> 4   0.17328863     7    93 3.108624e-15 3.792816e-12 0.9999945
#> 5  -0.13418892    29    86 1.120812e-09 1.093997e-06 0.9999516
#> 6   0.12594697    21    72 1.103836e-08 8.978563e-06 0.9998400
#> 7   0.11956105    28    86 5.890924e-08 3.853590e-05 0.9998400
#> 8  -0.11723897    26    80 1.060526e-07 5.816172e-05 0.9998400
#> 9  -0.11711625    72    89 1.093655e-07 5.930499e-05 0.9972804
#> 10  0.10658013    20    21 1.366610e-06 5.925275e-04 0.9972804
#> 11  0.10589778    21    73 1.596859e-06 6.678429e-04 0.9972804
#> 12  0.10478689    20    91 2.053403e-06 8.024425e-04 0.9972804
#> 13  0.10420836     7    52 2.338382e-06 8.778605e-04 0.9944557
#> 14  0.10236077    87    95 3.525186e-06 1.224964e-03 0.9944557
#> 15  0.10113550    27    95 4.610444e-06 1.500047e-03 0.9920084
#> 16  0.09928954    21    51 6.868357e-06 2.046549e-03 0.9920084
#> 17  0.09791914    21    88 9.192373e-06 2.520616e-03 0.9920084
#> 18  0.09719685    18    95 1.070232e-05 2.790102e-03 0.9920084
#> 19  0.09621791    28    90 1.313007e-05 3.171817e-03 0.9920084
#> 20  0.09619099    12    80 1.320374e-05 3.182526e-03 0.9920084
#> 21  0.09576091    89    95 1.443542e-05 3.354777e-03 0.9891317
#> 22  0.09473210     7    51 1.784126e-05 3.864825e-03 0.9891317
#> 23 -0.09386896    53    58 2.127622e-05 4.313590e-03 0.9891317
#> 24 -0.09366615    29    83 2.217013e-05 4.421099e-03 0.9891317
#> 25 -0.09341148    21    89 2.334321e-05 4.556947e-03 0.9810727
#> 26 -0.09156391    49    93 3.380043e-05 5.955972e-03 0.9810727
#> 27 -0.09150710    80    90 3.418363e-05 6.002083e-03 0.9810727
#> 28  0.09101505     7    53 3.767966e-05 6.408102e-03 0.9810727
#> 29  0.09050688    21    84 4.164471e-05 6.838782e-03 0.9810727
#> 30  0.08965490    72    73 4.919365e-05 7.581866e-03 0.9810727
#> 31 -0.08934025    29    99 5.229604e-05 7.861416e-03 0.9810727
#> 32 -0.08906819     9    95 5.512708e-05 8.104759e-03 0.9810727
#> 33  0.08888345     2    49 5.713144e-05 8.270673e-03 0.9810727
#> 34  0.08850681    86    90 6.143363e-05 8.610161e-03 0.9810727
#> 35  0.08805868    17    53 6.695170e-05 9.015175e-03 0.9810727
#> 36  0.08790809    28    48 6.890884e-05 9.151291e-03 0.9810727
#> 37  0.08783471    33    58 6.988211e-05 9.217597e-03 0.9682377
#> 38 -0.08705796     7    49 8.101244e-05 1.021362e-02 0.9682377
#> 39  0.08645033    20    46 9.086547e-05 1.102466e-02 0.9682377
#> 40  0.08609950    48    86 9.705862e-05 1.150392e-02 0.9682377
#> 41  0.08598769    21    52 9.911458e-05 1.165816e-02 0.9682377
#> 42  0.08555275    32    95 1.075099e-04 1.226435e-02 0.9682377
#> 43  0.08548231    17    51 1.089311e-04 1.236337e-02 0.9424721
#> 44  0.08470370    80    83 1.258659e-04 1.382356e-02 0.9424721
#> 45  0.08442510    80    82 1.325062e-04 1.437068e-02 0.9174573
#> 46  0.08271606    80    93 1.810275e-04 1.845632e-02 0.9174573
#> 47  0.08235175    46    91 1.933329e-04 1.941579e-02 0.9174573
#> 48  0.08217787    25    95 1.994788e-04 1.988432e-02 0.9174573
#> 49 -0.08170331    29    87 2.171999e-04 2.119715e-02 0.9174573
#> 50  0.08123632    19    29 2.360716e-04 2.253606e-02 0.9174573
#> 51  0.08101702    51    84 2.454547e-04 2.318024e-02 0.9174573
#> 52  0.08030748    16    93 2.782643e-04 2.532796e-02 0.9174573
#> 53  0.08006503    28    52 2.903870e-04 2.608271e-02 0.9174573
#> 54 -0.07941656    41    80 3.252833e-04 2.814824e-02 0.9174573
#> 55  0.07941410    54    89 3.254229e-04 2.815620e-02 0.9174573
#> 56 -0.07934653    28    80 3.292784e-04 2.837511e-02 0.9174573
#> 57  0.07916783    29    92 3.396802e-04 2.895702e-02 0.9174573
#> 58 -0.07866905    17    86 3.703635e-04 3.060293e-02 0.9174573
#> 59  0.07827749    16    29 3.962446e-04 3.191462e-02 0.9174573
#> 60 -0.07808262    73    89 4.097452e-04 3.257290e-02 0.9174573
#> 61  0.07766261    52    67 4.403165e-04 3.400207e-02 0.9174573
#> 62  0.07762917    25    87 4.428396e-04 3.411637e-02 0.9174573
#> 63 -0.07739378     9    93 4.609872e-04 3.492295e-02 0.9174573
#> 64  0.07738885    31    80 4.613747e-04 3.493988e-02 0.9174573
#> 65 -0.07718681    80    94 4.775136e-04 3.563444e-02 0.9174573
#> 66  0.07706275    27    58 4.876831e-04 3.606179e-02 0.8297811
#> 67 -0.07610709    16    83 5.730532e-04 4.085920e-02 0.8297811
#> 68  0.07550557    53    84 6.337143e-04 4.406472e-02 0.8297811

# significant based on "local fdr" cutoff (prob > 0.9)?
num.significant.2 <- sum(test.results$prob > 0.9)
test.results[test.results$prob > 0.9,]
#>           pcor node1 node2         pval         qval      prob
#> 1   0.23185664    51    53 2.220446e-16 3.612205e-13 1.0000000
#> 2   0.22405545    52    53 2.220446e-16 3.612205e-13 1.0000000
#> 3   0.21507824    51    52 2.220446e-16 3.612205e-13 1.0000000
#> 4   0.17328863     7    93 3.108624e-15 3.792816e-12 0.9999945
#> 5  -0.13418892    29    86 1.120812e-09 1.093997e-06 0.9999516
#> 6   0.12594697    21    72 1.103836e-08 8.978563e-06 0.9998400
#> 7   0.11956105    28    86 5.890924e-08 3.853590e-05 0.9998400
#> 8  -0.11723897    26    80 1.060526e-07 5.816172e-05 0.9998400
#> 9  -0.11711625    72    89 1.093655e-07 5.930499e-05 0.9972804
#> 10  0.10658013    20    21 1.366610e-06 5.925275e-04 0.9972804
#> 11  0.10589778    21    73 1.596859e-06 6.678429e-04 0.9972804
#> 12  0.10478689    20    91 2.053403e-06 8.024425e-04 0.9972804
#> 13  0.10420836     7    52 2.338382e-06 8.778605e-04 0.9944557
#> 14  0.10236077    87    95 3.525186e-06 1.224964e-03 0.9944557
#> 15  0.10113550    27    95 4.610444e-06 1.500047e-03 0.9920084
#> 16  0.09928954    21    51 6.868357e-06 2.046549e-03 0.9920084
#> 17  0.09791914    21    88 9.192373e-06 2.520616e-03 0.9920084
#> 18  0.09719685    18    95 1.070232e-05 2.790102e-03 0.9920084
#> 19  0.09621791    28    90 1.313007e-05 3.171817e-03 0.9920084
#> 20  0.09619099    12    80 1.320374e-05 3.182526e-03 0.9920084
#> 21  0.09576091    89    95 1.443542e-05 3.354777e-03 0.9891317
#> 22  0.09473210     7    51 1.784126e-05 3.864825e-03 0.9891317
#> 23 -0.09386896    53    58 2.127622e-05 4.313590e-03 0.9891317
#> 24 -0.09366615    29    83 2.217013e-05 4.421099e-03 0.9891317
#> 25 -0.09341148    21    89 2.334321e-05 4.556947e-03 0.9810727
#> 26 -0.09156391    49    93 3.380043e-05 5.955972e-03 0.9810727
#> 27 -0.09150710    80    90 3.418363e-05 6.002083e-03 0.9810727
#> 28  0.09101505     7    53 3.767966e-05 6.408102e-03 0.9810727
#> 29  0.09050688    21    84 4.164471e-05 6.838782e-03 0.9810727
#> 30  0.08965490    72    73 4.919365e-05 7.581866e-03 0.9810727
#> 31 -0.08934025    29    99 5.229604e-05 7.861416e-03 0.9810727
#> 32 -0.08906819     9    95 5.512708e-05 8.104759e-03 0.9810727
#> 33  0.08888345     2    49 5.713144e-05 8.270673e-03 0.9810727
#> 34  0.08850681    86    90 6.143363e-05 8.610161e-03 0.9810727
#> 35  0.08805868    17    53 6.695170e-05 9.015175e-03 0.9810727
#> 36  0.08790809    28    48 6.890884e-05 9.151291e-03 0.9810727
#> 37  0.08783471    33    58 6.988211e-05 9.217597e-03 0.9682377
#> 38 -0.08705796     7    49 8.101244e-05 1.021362e-02 0.9682377
#> 39  0.08645033    20    46 9.086547e-05 1.102466e-02 0.9682377
#> 40  0.08609950    48    86 9.705862e-05 1.150392e-02 0.9682377
#> 41  0.08598769    21    52 9.911458e-05 1.165816e-02 0.9682377
#> 42  0.08555275    32    95 1.075099e-04 1.226435e-02 0.9682377
#> 43  0.08548231    17    51 1.089311e-04 1.236337e-02 0.9424721
#> 44  0.08470370    80    83 1.258659e-04 1.382356e-02 0.9424721
#> 45  0.08442510    80    82 1.325062e-04 1.437068e-02 0.9174573
#> 46  0.08271606    80    93 1.810275e-04 1.845632e-02 0.9174573
#> 47  0.08235175    46    91 1.933329e-04 1.941579e-02 0.9174573
#> 48  0.08217787    25    95 1.994788e-04 1.988432e-02 0.9174573
#> 49 -0.08170331    29    87 2.171999e-04 2.119715e-02 0.9174573
#> 50  0.08123632    19    29 2.360716e-04 2.253606e-02 0.9174573
#> 51  0.08101702    51    84 2.454547e-04 2.318024e-02 0.9174573
#> 52  0.08030748    16    93 2.782643e-04 2.532796e-02 0.9174573
#> 53  0.08006503    28    52 2.903870e-04 2.608271e-02 0.9174573
#> 54 -0.07941656    41    80 3.252833e-04 2.814824e-02 0.9174573
#> 55  0.07941410    54    89 3.254229e-04 2.815620e-02 0.9174573
#> 56 -0.07934653    28    80 3.292784e-04 2.837511e-02 0.9174573
#> 57  0.07916783    29    92 3.396802e-04 2.895702e-02 0.9174573
#> 58 -0.07866905    17    86 3.703635e-04 3.060293e-02 0.9174573
#> 59  0.07827749    16    29 3.962446e-04 3.191462e-02 0.9174573
#> 60 -0.07808262    73    89 4.097452e-04 3.257290e-02 0.9174573
#> 61  0.07766261    52    67 4.403165e-04 3.400207e-02 0.9174573
#> 62  0.07762917    25    87 4.428396e-04 3.411637e-02 0.9174573
#> 63 -0.07739378     9    93 4.609872e-04 3.492295e-02 0.9174573
#> 64  0.07738885    31    80 4.613747e-04 3.493988e-02 0.9174573
#> 65 -0.07718681    80    94 4.775136e-04 3.563444e-02 0.9174573
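
Selecting the first `n` rows only gives the significant edges when the table is sorted by strength of evidence. The sketch below shows the two equivalent selections on simulated p-values; Benjamini-Hochberg adjusted p-values from `p.adjust` are used as a stand-in for the q-values above, which come from a different estimator, and all names are illustrative.

```r
set.seed(6)
# Simulated p-values: mostly null, plus a few very small ones.
pval <- c(runif(90), rbeta(10, 0.1, 10))
edges <- data.frame(edge = seq_along(pval), pval = pval)
edges$qval <- p.adjust(edges$pval, method = "BH")

# Sort by evidence so that "the first n rows" is meaningful.
edges <- edges[order(edges$pval), ]
n_sig <- sum(edges$qval <= 0.05)

by_count <- head(edges, n_sig)            # safe even when n_sig is 0
by_mask  <- edges[edges$qval <= 0.05, ]   # does not rely on ordering

identical(by_count$edge, by_mask$edge)
```

`head(edges, n_sig)` is used rather than `edges[1:n_sig, ]` because `1:0` evaluates to `c(1, 0)` and would return one row when nothing is significant.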

# parameters of the mixture distribution used to compute p-values etc.
c <- fdrtool::fdrtool(corpcor::sm2vec(inferred.pcor), statistic="correlation")
#> Step 1... determine cutoff point
#> Step 2... estimate parameters of null distribution and eta0
#> Step 3... compute p-values and estimate empirical PDF/CDF
#> Step 4... compute q-values and local fdr
#> Step 5... prepare for plotting

Figure 5.1: GeneNet example

c$param
#>          cutoff N.cens      eta0     eta0.SE    kappa kappa.SE
#> [1,] 0.03553068   4352 0.9474623 0.005656465 2043.377 94.72267

## A random network with 20 nodes and 10 percent (=19) edges
true.pcor <- GeneNet::ggm.simulate.pcor(20, 0.1)

# convert to edge list
test.results <- GeneNet::ggm.list.edges(true.pcor)[1:19,]
nlab <- LETTERS[1:20]

# graphviz
# network.make.dot(filename="test.dot", test.results, nlab, main = "A graph")
# system("fdp -T svg -o test.svg test.dot")

# Rgraphviz
gr <- GeneNet::network.make.graph( test.results, nlab)
gr
#> A graphNEL graph with directed edges
#> Number of Nodes = 20 
#> Number of Edges = 38
GeneNet::num.nodes(gr)
#> [1] 20
GeneNet::edge.info(gr)
#> $weight
#>      A~T      A~Q      A~J      A~L      B~N      B~J      C~R      D~N 
#> -0.06270  0.19879  0.40924  0.62571  0.19421  0.52032 -0.69851  0.38015 
#>      D~H      F~Q      F~M      G~K      H~I      J~K      K~M      M~T 
#>  0.61005 -0.23242  0.52489  0.48304  0.67365  0.25697  0.44416  0.18103 
#>      M~S      N~Q      R~T 
#>  0.31499 -0.43478 -0.59535 
#> 
#> $dir
#>    A~T    A~Q    A~J    A~L    B~N    B~J    C~R    D~N    D~H    F~Q    F~M 
#> "none" "none" "none" "none" "none" "none" "none" "none" "none" "none" "none" 
#>    G~K    H~I    J~K    K~M    M~T    M~S    N~Q    R~T 
#> "none" "none" "none" "none" "none" "none" "none" "none"
gr2 <- GeneNet::network.make.graph( test.results, nlab, drop.singles=TRUE)
gr2
#> A graphNEL graph with directed edges
#> Number of Nodes = 17 
#> Number of Edges = 38
GeneNet::num.nodes(gr2)
#> [1] 17
GeneNet::edge.info(gr2)
#> $weight
#>      A~T      A~Q      A~J      A~L      B~N      B~J      C~R      D~N 
#> -0.06270  0.19879  0.40924  0.62571  0.19421  0.52032 -0.69851  0.38015 
#>      D~H      F~Q      F~M      G~K      H~I      J~K      K~M      M~T 
#>  0.61005 -0.23242  0.52489  0.48304  0.67365  0.25697  0.44416  0.18103 
#>      M~S      N~Q      R~T 
#>  0.31499 -0.43478 -0.59535 
#> 
#> $dir
#>    A~T    A~Q    A~J    A~L    B~N    B~J    C~R    D~N    D~H    F~Q    F~M 
#> "none" "none" "none" "none" "none" "none" "none" "none" "none" "none" "none" 
#>    G~K    H~I    J~K    K~M    M~T    M~S    N~Q    R~T 
#> "none" "none" "none" "none" "none" "none" "none" "none"

# plot network
plot(gr, "fdp")
#> Warning in arrows(tail_from[1], tail_from[2], tail_to[1], tail_to[2], col =
#> edgeColor, : zero-length arrow is of indeterminate angle and so skipped
#> Warning in arrows(head_from[1], head_from[2], head_to[1], head_to[2], col =
#> edgeColor, : zero-length arrow is of indeterminate angle and so skipped

Figure 5.2: GeneNet example

plot(gr2, "fdp")
#> Warning in arrows(tail_from[1], tail_from[2], tail_to[1], tail_to[2], col =
#> edgeColor, : zero-length arrow is of indeterminate angle and so skipped

Figure 5.3: GeneNet example

Side-by-side heatmaps

set.seed(123454321)
m <- matrix(runif(2500),50)
r <- cor(m)
g <- as.matrix(r>=0.7)+0
f1 <- ComplexHeatmap::Heatmap(r)
f2 <- ComplexHeatmap::Heatmap(g)
f <- f1+f2
ComplexHeatmap::draw(f)

Figure 5.4: Heatmaps


df <- heatmaply::normalize(mtcars)
hm <- heatmaply::heatmaply(df,k_col=5,k_row=5,
                           colors = grDevices::colorRampPalette(RColorBrewer::brewer.pal(3, "RdBu"))(256))
htmlwidgets::saveWidget(hm,file="heatmaply.html")
htmltools::tags$iframe(src = "heatmaply.html", width = "100%", height = "550px")

This produces heatmaply.html. A module analysis with WGCNA can be done as follows,

pwr <- c(1:10, seq(from=12, to=30, by=2))
sft <- WGCNA::pickSoftThreshold(dat, powerVector=pwr, verbose=5)
ADJ <- abs(cor(dat, method="pearson", use="pairwise.complete.obs"))^6
dissADJ <- 1-ADJ
dissTOM <- WGCNA::TOMdist(ADJ)
TOM <- WGCNA::TOMsimilarityFromExpr(dat)
Tree <- hclust(as.dist(1-TOM), method="average")
for(j in pwr)
{
  pam_name <- paste0("pam",j)
  assign(pam_name, cluster::pam(as.dist(dissADJ),j))
  pamTOM_name <- paste0("pamTOM",j)
  assign(pamTOM_name,cluster::pam(as.dist(dissTOM),j))
  tc <- table(get(pam_name)$clustering,get(pamTOM_name)$clustering)
  print(tc)
  print(diag(tc))
}
colorStaticTOM <- as.character(WGCNA::cutreeStaticColor(Tree,cutHeight=.99,minSize=5))
colorDynamicTOM <- WGCNA::labels2colors(dynamicTreeCut::cutreeDynamic(Tree,method="tree",minClusterSize=5))
Colors <- data.frame(pamTOM6$clustering,colorStaticTOM,colorDynamicTOM)
WGCNA::plotDendroAndColors(Tree, Colors, dendroLabels=FALSE, hang=0.03, addGuide=TRUE, guideHang=0.05)
meg <- WGCNA::moduleEigengenes(dat, color=1:ncol(dat), softPower=6)
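
The code above uses two ways of turning correlations into a network: a hard threshold (`r>=0.7`) and a soft threshold (absolute correlation raised to the power 6). A base R sketch contrasting the two on simulated data is given below; the matrix and object names are illustrative, and the power 6 is simply the value used above rather than one chosen from `pickSoftThreshold` output.

```r
set.seed(7)
# Toy data: 30 observations of 8 variables.
expr <- matrix(rnorm(30 * 8), nrow = 30)
r <- cor(expr)

# Hard threshold: an edge is either present (1) or absent (0).
hard <- (r >= 0.7) + 0

# Soft threshold: keep every edge, but down-weight weak correlations.
soft <- abs(r)^6

range(soft)
table(hard)
```

The soft adjacency keeps a continuous weight in [0, 1] for every pair, so no information is discarded at an arbitrary cut-off, whereas the hard adjacency gives a binary matrix of the kind drawn in the second heatmap.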

6 Meta-data

This section is based on package recount3.

hs <- recount3::available_projects()
dim(subset(hs,file_source=="gtex"))
recount3::annotation_options("human")
blood_rse <- recount3::create_rse(subset(hs,project=="BLOOD"))
S4Vectors::metadata(blood_rse)
SummarizedExperiment::rowRanges(blood_rse)
colnames(SummarizedExperiment::colData(blood_rse))[1:20]
recount3::expand_sra_attributes(blood_rse)

7 Pathway and enrichment analysis

reactome <- graphite::pathways("hsapiens", "reactome")
kegg <- graphite::pathways("hsapiens","kegg")
pharmgkb <- graphite::pathways("hsapiens","pharmgkb")
nodes(kegg[[21]])
#>  [1] "ENTREZID:102724560" "ENTREZID:10993"     "ENTREZID:113675"   
#>  [4] "ENTREZID:132158"    "ENTREZID:1610"      "ENTREZID:1738"     
#>  [7] "ENTREZID:1757"      "ENTREZID:189"       "ENTREZID:211"      
#> [10] "ENTREZID:212"       "ENTREZID:23464"     "ENTREZID:2593"     
#> [13] "ENTREZID:26227"     "ENTREZID:2628"      "ENTREZID:27232"    
#> [16] "ENTREZID:2731"      "ENTREZID:275"       "ENTREZID:29958"    
#> [19] "ENTREZID:29968"     "ENTREZID:441531"    "ENTREZID:501"      
#> [22] "ENTREZID:51268"     "ENTREZID:5223"      "ENTREZID:5224"     
#> [25] "ENTREZID:55349"     "ENTREZID:5723"      "ENTREZID:635"      
#> [28] "ENTREZID:63826"     "ENTREZID:6470"      "ENTREZID:6472"     
#> [31] "ENTREZID:64902"     "ENTREZID:669"       "ENTREZID:875"      
#> [34] "ENTREZID:9380"      "ENTREZID:1491"
kegg_t2g <- ldply(lapply(kegg, nodes), data.frame)
names(kegg_t2g) <- c("gs_name", "gene_symbol")
VEGF <- subset(kegg_t2g,gs_name=="VEGF signaling pathway")[[2]]
eKEGG <- clusterProfiler::enricher(gene=VEGF, TERM2GENE = kegg_t2g,
                                   universe=,
                                   pAdjustMethod = "BH",
                                   pvalueCutoff = 0.1, qvalueCutoff = 0.05,
                                   minGSSize = 10, maxGSSize = 500)
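
As a rough illustration of what the call above computes, the sketch below builds a two-column term-to-gene table from a named list and runs a hypergeometric over-representation test in base R. It is a simplified stand-in, not the clusterProfiler implementation: the gene sets, the query and the object names (`gene_sets`, `t2g`, `ora`) are hypothetical, the background is assumed to be all annotated genes, and the gene-set size filters (`minGSSize`, `maxGSSize`) are not applied.

```r
# Hypothetical toy gene sets (not real pathway content)
gene_sets <- list(
  pathwayA = c("g1", "g2", "g3", "g4", "g5"),
  pathwayB = c("g4", "g5", "g6", "g7", "g8", "g9"),
  pathwayC = c("g10", "g11", "g12")
)

# Long TERM2GENE-style table: first column term, second column gene
t2g <- stack(gene_sets)[, c("ind", "values")]
names(t2g) <- c("gs_name", "gene")
t2g$gs_name <- as.character(t2g$gs_name)

query <- c("g1", "g2", "g3", "g6")        # genes of interest
universe <- unique(t2g$gene)              # assumed background: all annotated genes
N <- length(universe)                     # background size
n <- length(intersect(query, universe))   # query genes within the background

sets <- split(t2g$gene, t2g$gs_name)
ora <- data.frame(
  gs_name  = names(sets),
  set_size = lengths(sets),
  overlap  = vapply(sets, function(s) length(intersect(query, s)), integer(1))
)

# P(X >= overlap) when drawing n genes from N, of which set_size are in the set
ora$pvalue <- phyper(ora$overlap - 1, ora$set_size, N - ora$set_size, n,
                     lower.tail = FALSE)
ora$p.adjust <- p.adjust(ora$pvalue, method = "BH")
print(ora)
```

Here `p.adjust(method = "BH")` plays the role of `pAdjustMethod = "BH"` in the call above.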

8 Peptide sequence

Here is an example for PROC_HUMAN, which is handled with the Biostrings package:

fasta_file_path <- 'https://rest.uniprot.org/uniprotkb/P04070.fasta'
fasta_sequences <- Biostrings::readAAStringSet(fasta_file_path, format = "fasta")
AA_sequence <- fasta_sequences[[1]]
cat("Sequence:", toString(AA_sequence), "\n")
#> Sequence: MWQLTSLLLFVATWGISGTPAPLDSVFSSSERAHQVLRIRKRANSFLEELRHSSLERECIEEICDFEEAKEIFQNVDDTLAFWSKHVDGDQCLVLPLEHPCASLCCGHGTCIDGIGSFSCDCRSGWEGRFCQREVSFLNCSLDNGGCTHYCLEEVGWRRCSCAPGYKLGDDLLQCHPAVKFPCGRPWKRMEKKRSHLKRDTEDQEDQVDPRLIDGKMTRRGDSPWQVVLLDSKKKLACGAVLIHPSWVLTAAHCMDESKKLLVRLGEYDLRRWEKWELDLDIKEVFVHPNYSKSTTDNDIALLHLAQPATLSQTIVPICLPDSGLAERELNQAGQETLVTGWGYHSSREKEAKRNRTFVLNFIKIPVVPHNECSEVMSNMVSENMLCAGILGDRQDACEGDSGGPMVASFHGTWFLVGLVSWGEGCGLLHNYGVYTKVSRYLDWIHGHIRDKEAPQKSWAP
iso_442688365 <- 'TDGEGALSEPSATVTIEELAAPPPPVLMHHGESSQVLHPGNK'
match_position <- regexpr(iso_442688365, AA_sequence)
match_position
#> [1] -1
#> attr(,"match.length")
#> [1] -1
#> attr(,"index.type")
#> [1] "chars"
#> attr(,"useBytes")
#> [1] TRUE
mp <- matchPattern(iso_442688365,AA_sequence)
mp
#> Views on a 461-letter AAString subject
#> subject: MWQLTSLLLFVATWGISGTPAPLDSVFSSSERAH...LLHNYGVYTKVSRYLDWIHGHIRDKEAPQKSWAP
#> views: NONE
protein <- "PROC"
cistrans <- read.csv(paste0("~/pQTLtools/tests","/",protein,".cis.vs.trans"))
load(paste0("~/pQTLtools/tests/",protein,".rda"))
pQTLtools::peptideAssociationPlot(protein,cistrans)
#> Joining with `by = join_by(Modified.Peptide.Sequence)`

Figure 8.1: peptide association plot
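
In the output above, `regexpr()` returns -1 and `matchPattern()` reports no views, which both mean that the queried peptide does not occur in the canonical PROC_HUMAN sequence. The base-R sketch below shows how a hit and a miss can be told apart and converted to start/end coordinates. The helper `locate_peptide()` and the two query strings are hypothetical, introduced only for illustration; `protein_seq` is the first 42 residues of the sequence printed above.

```r
# First 42 residues of the PROC_HUMAN sequence printed above
protein_seq <- "MWQLTSLLLFVATWGISGTPAPLDSVFSSSERAHQVLRIRKR"

# Hypothetical helper: 1-based start/end of the first exact match, or NULL
locate_peptide <- function(peptide, sequence) {
  # fixed = TRUE treats the peptide as a literal string, not a regular expression
  start <- as.integer(regexpr(peptide, sequence, fixed = TRUE))
  if (start == -1L) return(NULL)
  c(start = start, end = start + nchar(peptide) - 1L)
}

hit  <- locate_peptide("TWGISGTPAPLDSVF", protein_seq)  # present in the fragment
miss <- locate_peptide("TDGEGALSEPSATV", protein_seq)   # absent from the fragment
print(hit)
print(is.null(miss))
```

Using `fixed = TRUE` avoids accidental interpretation of characters in a modified-peptide string as regular-expression syntax.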

9 Transcript databases

An overview of annotation is available [2].

options(width=200)

# columns(org.Hs.eg.db)
# keyref <- keys(org.Hs.eg.db, keytype="ENTREZID")
# symbol_uniprot <- select(org.Hs.eg.db,keys=keyref,columns = c("SYMBOL","UNIPROT"))
# subset(symbol_uniprot,SYMBOL=="MC4R")

x <- EnsDb.Hsapiens.v86
ensembldb::listColumns(x, "protein", skip.keys=TRUE)
#> [1] "tx_id"            "protein_id"       "protein_sequence"
ensembldb::listGenebiotypes(x)
#>  [1] "protein_coding"                     "unitary_pseudogene"                 "unprocessed_pseudogene"             "processed_pseudogene"               "processed_transcript"              
#>  [6] "transcribed_unprocessed_pseudogene" "antisense"                          "transcribed_unitary_pseudogene"     "polymorphic_pseudogene"             "lincRNA"                           
#> [11] "sense_intronic"                     "transcribed_processed_pseudogene"   "sense_overlapping"                  "IG_V_pseudogene"                    "pseudogene"                        
#> [16] "TR_V_gene"                          "3prime_overlapping_ncRNA"           "IG_V_gene"                          "bidirectional_promoter_lncRNA"      "snRNA"                             
#> [21] "miRNA"                              "misc_RNA"                           "snoRNA"                             "rRNA"                               "Mt_tRNA"                           
#> [26] "Mt_rRNA"                            "IG_C_gene"                          "IG_J_gene"                          "TR_J_gene"                          "TR_C_gene"                         
#> [31] "TR_V_pseudogene"                    "TR_J_pseudogene"                    "IG_D_gene"                          "ribozyme"                           "IG_C_pseudogene"                   
#> [36] "TR_D_gene"                          "TEC"                                "IG_J_pseudogene"                    "scRNA"                              "scaRNA"                            
#> [41] "vaultRNA"                           "sRNA"                               "macro_lncRNA"                       "non_coding"                         "IG_pseudogene"                     
#> [46] "LRG_gene"
ensembldb::listTxbiotypes(x)
#>  [1] "protein_coding"                     "processed_transcript"               "nonsense_mediated_decay"            "retained_intron"                    "unitary_pseudogene"                
#>  [6] "TEC"                                "miRNA"                              "misc_RNA"                           "non_stop_decay"                     "unprocessed_pseudogene"            
#> [11] "processed_pseudogene"               "transcribed_unprocessed_pseudogene" "lincRNA"                            "antisense"                          "transcribed_unitary_pseudogene"    
#> [16] "polymorphic_pseudogene"             "sense_intronic"                     "transcribed_processed_pseudogene"   "sense_overlapping"                  "IG_V_pseudogene"                   
#> [21] "pseudogene"                         "TR_V_gene"                          "3prime_overlapping_ncRNA"           "IG_V_gene"                          "bidirectional_promoter_lncRNA"     
#> [26] "snRNA"                              "snoRNA"                             "rRNA"                               "Mt_tRNA"                            "Mt_rRNA"                           
#> [31] "IG_C_gene"                          "IG_J_gene"                          "TR_J_gene"                          "TR_C_gene"                          "TR_V_pseudogene"                   
#> [36] "TR_J_pseudogene"                    "IG_D_gene"                          "ribozyme"                           "IG_C_pseudogene"                    "TR_D_gene"                         
#> [41] "IG_J_pseudogene"                    "scRNA"                              "scaRNA"                             "vaultRNA"                           "sRNA"                              
#> [46] "macro_lncRNA"                       "non_coding"                         "IG_pseudogene"                      "LRG_gene"
ensembldb::listTables(x)
#> $gene
#> [1] "gene_id"          "gene_name"        "gene_biotype"     "gene_seq_start"   "gene_seq_end"     "seq_name"         "seq_strand"       "seq_coord_system" "symbol"          
#> 
#> $tx
#> [1] "tx_id"            "tx_biotype"       "tx_seq_start"     "tx_seq_end"       "tx_cds_seq_start" "tx_cds_seq_end"   "gene_id"          "tx_name"         
#> 
#> $tx2exon
#> [1] "tx_id"    "exon_id"  "exon_idx"
#> 
#> $exon
#> [1] "exon_id"        "exon_seq_start" "exon_seq_end"  
#> 
#> $chromosome
#> [1] "seq_name"    "seq_length"  "is_circular"
#> 
#> $protein
#> [1] "tx_id"            "protein_id"       "protein_sequence"
#> 
#> $uniprot
#> [1] "protein_id"           "uniprot_id"           "uniprot_db"           "uniprot_mapping_type"
#> 
#> $protein_domain
#> [1] "protein_id"            "protein_domain_id"     "protein_domain_source" "interpro_accession"    "prot_dom_start"        "prot_dom_end"         
#> 
#> $entrezgene
#> [1] "gene_id"  "entrezid"
#> 
#> $metadata
#> [1] "name"  "value"
ensembldb::metadata(x)
#>                  name                               value
#> 1             Db type                               EnsDb
#> 2     Type of Gene ID                     Ensembl Gene ID
#> 3  Supporting package                           ensembldb
#> 4       Db created by ensembldb package from Bioconductor
#> 5      script_version                               0.3.0
#> 6       Creation time            Thu May 18 16:32:27 2017
#> 7     ensembl_version                                  86
#> 8        ensembl_host                           localhost
#> 9            Organism                        homo_sapiens
#> 10        taxonomy_id                                9606
#> 11       genome_build                              GRCh38
#> 12    DBSCHEMAVERSION                                 2.0
ensembldb::organism(x)
#> [1] "Homo sapiens"
ensembldb::returnFilterColumns(x)
#> [1] TRUE
ensembldb::seqinfo(x)
#> Seqinfo object with 357 sequences (1 circular) from GRCh38 genome:
#>   seqnames seqlengths isCircular genome
#>   X         156040895      FALSE GRCh38
#>   20         64444167      FALSE GRCh38
#>   1         248956422      FALSE GRCh38
#>   6         170805979      FALSE GRCh38
#>   3         198295559      FALSE GRCh38
#>   ...             ...        ...    ...
#>   LRG_239      114904      FALSE GRCh38
#>   LRG_311      115492      FALSE GRCh38
#>   LRG_721       33396      FALSE GRCh38
#>   LRG_741      231167      FALSE GRCh38
#>   LRG_93        22459      FALSE GRCh38
ensembldb::seqlevels(x)
#>   [1] "1"                                      "10"                                     "11"                                     "12"                                    
#>   [5] "13"                                     "14"                                     "15"                                     "16"                                    
#>   [9] "17"                                     "18"                                     "19"                                     "2"                                     
#>  [13] "20"                                     "21"                                     "22"                                     "3"                                     
#>  [17] "4"                                      "5"                                      "6"                                      "7"                                     
#>  [21] "8"                                      "9"                                      "CHR_HG107_PATCH"                        "CHR_HG126_PATCH"                       
#>  [25] "CHR_HG1311_PATCH"                       "CHR_HG1342_HG2282_PATCH"                "CHR_HG1362_PATCH"                       "CHR_HG142_HG150_NOVEL_TEST"            
#>  [29] "CHR_HG151_NOVEL_TEST"                   "CHR_HG1651_PATCH"                       "CHR_HG1832_PATCH"                       "CHR_HG2021_PATCH"                      
#>  [33] "CHR_HG2022_PATCH"                       "CHR_HG2023_PATCH"                       "CHR_HG2030_PATCH"                       "CHR_HG2058_PATCH"                      
#>  [37] "CHR_HG2062_PATCH"                       "CHR_HG2063_PATCH"                       "CHR_HG2066_PATCH"                       "CHR_HG2072_PATCH"                      
#>  [41] "CHR_HG2095_PATCH"                       "CHR_HG2104_PATCH"                       "CHR_HG2116_PATCH"                       "CHR_HG2128_PATCH"                      
#>  [45] "CHR_HG2191_PATCH"                       "CHR_HG2213_PATCH"                       "CHR_HG2217_PATCH"                       "CHR_HG2232_PATCH"                      
#>  [49] "CHR_HG2233_PATCH"                       "CHR_HG2235_PATCH"                       "CHR_HG2239_PATCH"                       "CHR_HG2247_PATCH"                      
#>  [53] "CHR_HG2249_PATCH"                       "CHR_HG2288_HG2289_PATCH"                "CHR_HG2290_PATCH"                       "CHR_HG2291_PATCH"                      
#>  [57] "CHR_HG2334_PATCH"                       "CHR_HG26_PATCH"                         "CHR_HG986_PATCH"                        "CHR_HSCHR10_1_CTG1"                    
#>  [61] "CHR_HSCHR10_1_CTG2"                     "CHR_HSCHR10_1_CTG3"                     "CHR_HSCHR10_1_CTG4"                     "CHR_HSCHR10_1_CTG6"                    
#>  [65] "CHR_HSCHR11_1_CTG1_2"                   "CHR_HSCHR11_1_CTG5"                     "CHR_HSCHR11_1_CTG6"                     "CHR_HSCHR11_1_CTG7"                    
#>  [69] "CHR_HSCHR11_1_CTG8"                     "CHR_HSCHR11_2_CTG1"                     "CHR_HSCHR11_2_CTG1_1"                   "CHR_HSCHR11_3_CTG1"                    
#>  [73] "CHR_HSCHR12_1_CTG1"                     "CHR_HSCHR12_1_CTG2_1"                   "CHR_HSCHR12_2_CTG1"                     "CHR_HSCHR12_2_CTG2"                    
#>  [77] "CHR_HSCHR12_2_CTG2_1"                   "CHR_HSCHR12_3_CTG2"                     "CHR_HSCHR12_3_CTG2_1"                   "CHR_HSCHR12_4_CTG2"                    
#>  [81] "CHR_HSCHR12_4_CTG2_1"                   "CHR_HSCHR12_5_CTG2"                     "CHR_HSCHR12_5_CTG2_1"                   "CHR_HSCHR12_6_CTG2_1"                  
#>  [85] "CHR_HSCHR13_1_CTG1"                     "CHR_HSCHR13_1_CTG3"                     "CHR_HSCHR13_1_CTG5"                     "CHR_HSCHR13_1_CTG8"                    
#>  [89] "CHR_HSCHR14_1_CTG1"                     "CHR_HSCHR14_2_CTG1"                     "CHR_HSCHR14_3_CTG1"                     "CHR_HSCHR14_7_CTG1"                    
#>  [93] "CHR_HSCHR15_1_CTG1"                     "CHR_HSCHR15_1_CTG3"                     "CHR_HSCHR15_1_CTG8"                     "CHR_HSCHR15_2_CTG3"                    
#>  [97] "CHR_HSCHR15_2_CTG8"                     "CHR_HSCHR15_3_CTG3"                     "CHR_HSCHR15_3_CTG8"                     "CHR_HSCHR15_4_CTG8"                    
#> [101] "CHR_HSCHR15_5_CTG8"                     "CHR_HSCHR15_6_CTG8"                     "CHR_HSCHR16_1_CTG1"                     "CHR_HSCHR16_1_CTG3_1"                  
#> [105] "CHR_HSCHR16_2_CTG3_1"                   "CHR_HSCHR16_3_CTG1"                     "CHR_HSCHR16_4_CTG1"                     "CHR_HSCHR16_4_CTG3_1"                  
#> [109] "CHR_HSCHR16_5_CTG1"                     "CHR_HSCHR16_CTG2"                       "CHR_HSCHR17_10_CTG4"                    "CHR_HSCHR17_1_CTG1"                    
#> [113] "CHR_HSCHR17_1_CTG2"                     "CHR_HSCHR17_1_CTG4"                     "CHR_HSCHR17_1_CTG5"                     "CHR_HSCHR17_1_CTG9"                    
#> [117] "CHR_HSCHR17_2_CTG1"                     "CHR_HSCHR17_2_CTG2"                     "CHR_HSCHR17_2_CTG4"                     "CHR_HSCHR17_2_CTG5"                    
#> [121] "CHR_HSCHR17_3_CTG2"                     "CHR_HSCHR17_3_CTG4"                     "CHR_HSCHR17_4_CTG4"                     "CHR_HSCHR17_5_CTG4"                    
#> [125] "CHR_HSCHR17_6_CTG4"                     "CHR_HSCHR17_7_CTG4"                     "CHR_HSCHR17_8_CTG4"                     "CHR_HSCHR17_9_CTG4"                    
#> [129] "CHR_HSCHR18_1_CTG1_1"                   "CHR_HSCHR18_1_CTG2_1"                   "CHR_HSCHR18_2_CTG1_1"                   "CHR_HSCHR18_2_CTG2"                    
#> [133] "CHR_HSCHR18_2_CTG2_1"                   "CHR_HSCHR18_3_CTG2_1"                   "CHR_HSCHR18_5_CTG1_1"                   "CHR_HSCHR18_ALT21_CTG2_1"              
#> [137] "CHR_HSCHR18_ALT2_CTG2_1"                "CHR_HSCHR19KIR_ABC08_A1_HAP_CTG3_1"     "CHR_HSCHR19KIR_ABC08_AB_HAP_C_P_CTG3_1" "CHR_HSCHR19KIR_ABC08_AB_HAP_T_P_CTG3_1"
#> [141] "CHR_HSCHR19KIR_FH05_A_HAP_CTG3_1"       "CHR_HSCHR19KIR_FH05_B_HAP_CTG3_1"       "CHR_HSCHR19KIR_FH06_A_HAP_CTG3_1"       "CHR_HSCHR19KIR_FH06_BA1_HAP_CTG3_1"    
#> [145] "CHR_HSCHR19KIR_FH08_A_HAP_CTG3_1"       "CHR_HSCHR19KIR_FH08_BAX_HAP_CTG3_1"     "CHR_HSCHR19KIR_FH13_A_HAP_CTG3_1"       "CHR_HSCHR19KIR_FH13_BA2_HAP_CTG3_1"    
#> [149] "CHR_HSCHR19KIR_FH15_A_HAP_CTG3_1"       "CHR_HSCHR19KIR_FH15_B_HAP_CTG3_1"       "CHR_HSCHR19KIR_G085_A_HAP_CTG3_1"       "CHR_HSCHR19KIR_G085_BA1_HAP_CTG3_1"    
#> [153] "CHR_HSCHR19KIR_G248_A_HAP_CTG3_1"       "CHR_HSCHR19KIR_G248_BA2_HAP_CTG3_1"     "CHR_HSCHR19KIR_GRC212_AB_HAP_CTG3_1"    "CHR_HSCHR19KIR_GRC212_BA1_HAP_CTG3_1"  
#> [157] "CHR_HSCHR19KIR_LUCE_A_HAP_CTG3_1"       "CHR_HSCHR19KIR_LUCE_BDEL_HAP_CTG3_1"    "CHR_HSCHR19KIR_RP5_B_HAP_CTG3_1"        "CHR_HSCHR19KIR_RSH_A_HAP_CTG3_1"       
#> [161] "CHR_HSCHR19KIR_RSH_BA2_HAP_CTG3_1"      "CHR_HSCHR19KIR_T7526_A_HAP_CTG3_1"      "CHR_HSCHR19KIR_T7526_BDEL_HAP_CTG3_1"   "CHR_HSCHR19LRC_COX1_CTG3_1"            
#> [165] "CHR_HSCHR19LRC_COX2_CTG3_1"             "CHR_HSCHR19LRC_LRC_I_CTG3_1"            "CHR_HSCHR19LRC_LRC_J_CTG3_1"            "CHR_HSCHR19LRC_LRC_S_CTG3_1"           
#> [169] "CHR_HSCHR19LRC_LRC_T_CTG3_1"            "CHR_HSCHR19LRC_PGF1_CTG3_1"             "CHR_HSCHR19LRC_PGF2_CTG3_1"             "CHR_HSCHR19_1_CTG2"                    
#> [173] "CHR_HSCHR19_1_CTG3_1"                   "CHR_HSCHR19_2_CTG2"                     "CHR_HSCHR19_2_CTG3_1"                   "CHR_HSCHR19_3_CTG2"                    
#> [177] "CHR_HSCHR19_3_CTG3_1"                   "CHR_HSCHR19_4_CTG2"                     "CHR_HSCHR19_4_CTG3_1"                   "CHR_HSCHR19_5_CTG2"                    
#> [181] "CHR_HSCHR1_1_CTG11"                     "CHR_HSCHR1_1_CTG3"                      "CHR_HSCHR1_1_CTG31"                     "CHR_HSCHR1_1_CTG32_1"                  
#> [185] "CHR_HSCHR1_2_CTG3"                      "CHR_HSCHR1_2_CTG31"                     "CHR_HSCHR1_2_CTG32_1"                   "CHR_HSCHR1_3_CTG3"                     
#> [189] "CHR_HSCHR1_3_CTG31"                     "CHR_HSCHR1_3_CTG32_1"                   "CHR_HSCHR1_4_CTG3"                      "CHR_HSCHR1_4_CTG31"                    
#> [193] "CHR_HSCHR1_5_CTG3"                      "CHR_HSCHR1_5_CTG32_1"                   "CHR_HSCHR1_ALT2_1_CTG32_1"              "CHR_HSCHR20_1_CTG1"                    
#> [197] "CHR_HSCHR20_1_CTG2"                     "CHR_HSCHR20_1_CTG3"                     "CHR_HSCHR20_1_CTG4"                     "CHR_HSCHR21_2_CTG1_1"                  
#> [201] "CHR_HSCHR21_3_CTG1_1"                   "CHR_HSCHR21_4_CTG1_1"                   "CHR_HSCHR21_5_CTG2"                     "CHR_HSCHR21_6_CTG1_1"                  
#> [205] "CHR_HSCHR21_8_CTG1_1"                   "CHR_HSCHR22_1_CTG1"                     "CHR_HSCHR22_1_CTG2"                     "CHR_HSCHR22_1_CTG3"                    
#> [209] "CHR_HSCHR22_1_CTG4"                     "CHR_HSCHR22_1_CTG5"                     "CHR_HSCHR22_1_CTG6"                     "CHR_HSCHR22_1_CTG7"                    
#> [213] "CHR_HSCHR22_2_CTG1"                     "CHR_HSCHR22_3_CTG1"                     "CHR_HSCHR22_4_CTG1"                     "CHR_HSCHR22_5_CTG1"                    
#> [217] "CHR_HSCHR22_6_CTG1"                     "CHR_HSCHR22_7_CTG1"                     "CHR_HSCHR22_8_CTG1"                     "CHR_HSCHR2_1_CTG1"                     
#> [221] "CHR_HSCHR2_1_CTG15"                     "CHR_HSCHR2_1_CTG5"                      "CHR_HSCHR2_1_CTG7"                      "CHR_HSCHR2_1_CTG7_2"                   
#> [225] "CHR_HSCHR2_2_CTG1"                      "CHR_HSCHR2_2_CTG15"                     "CHR_HSCHR2_2_CTG7"                      "CHR_HSCHR2_2_CTG7_2"                   
#> [229] "CHR_HSCHR2_3_CTG1"                      "CHR_HSCHR2_3_CTG15"                     "CHR_HSCHR2_3_CTG7_2"                    "CHR_HSCHR2_4_CTG1"                     
#> [233] "CHR_HSCHR2_6_CTG7_2"                    "CHR_HSCHR3_1_CTG1"                      "CHR_HSCHR3_1_CTG2_1"                    "CHR_HSCHR3_1_CTG3"                     
#> [237] "CHR_HSCHR3_2_CTG2_1"                    "CHR_HSCHR3_2_CTG3"                      "CHR_HSCHR3_3_CTG1"                      "CHR_HSCHR3_3_CTG3"                     
#> [241] "CHR_HSCHR3_4_CTG2_1"                    "CHR_HSCHR3_4_CTG3"                      "CHR_HSCHR3_5_CTG2_1"                    "CHR_HSCHR3_5_CTG3"                     
#> [245] "CHR_HSCHR3_6_CTG3"                      "CHR_HSCHR3_7_CTG3"                      "CHR_HSCHR3_8_CTG3"                      "CHR_HSCHR3_9_CTG3"                     
#> [249] "CHR_HSCHR4_11_CTG12"                    "CHR_HSCHR4_1_CTG12"                     "CHR_HSCHR4_1_CTG4"                      "CHR_HSCHR4_1_CTG6"                     
#> [253] "CHR_HSCHR4_1_CTG9"                      "CHR_HSCHR4_2_CTG12"                     "CHR_HSCHR4_2_CTG4"                      "CHR_HSCHR4_3_CTG12"                    
#> [257] "CHR_HSCHR4_4_CTG12"                     "CHR_HSCHR4_5_CTG12"                     "CHR_HSCHR4_6_CTG12"                     "CHR_HSCHR4_7_CTG12"                    
#> [261] "CHR_HSCHR4_8_CTG12"                     "CHR_HSCHR4_9_CTG12"                     "CHR_HSCHR5_1_CTG1"                      "CHR_HSCHR5_1_CTG1_1"                   
#> [265] "CHR_HSCHR5_1_CTG5"                      "CHR_HSCHR5_2_CTG1"                      "CHR_HSCHR5_2_CTG1_1"                    "CHR_HSCHR5_2_CTG5"                     
#> [269] "CHR_HSCHR5_3_CTG1"                      "CHR_HSCHR5_3_CTG5"                      "CHR_HSCHR5_4_CTG1"                      "CHR_HSCHR5_4_CTG1_1"                   
#> [273] "CHR_HSCHR5_5_CTG1"                      "CHR_HSCHR5_6_CTG1"                      "CHR_HSCHR5_7_CTG1"                      "CHR_HSCHR6_1_CTG10"                    
#> [277] "CHR_HSCHR6_1_CTG2"                      "CHR_HSCHR6_1_CTG3"                      "CHR_HSCHR6_1_CTG4"                      "CHR_HSCHR6_1_CTG5"                     
#> [281] "CHR_HSCHR6_1_CTG6"                      "CHR_HSCHR6_1_CTG7"                      "CHR_HSCHR6_1_CTG8"                      "CHR_HSCHR6_1_CTG9"                     
#> [285] "CHR_HSCHR6_8_CTG1"                      "CHR_HSCHR6_MHC_APD_CTG1"                "CHR_HSCHR6_MHC_COX_CTG1"                "CHR_HSCHR6_MHC_DBB_CTG1"               
#> [289] "CHR_HSCHR6_MHC_MANN_CTG1"               "CHR_HSCHR6_MHC_MCF_CTG1"                "CHR_HSCHR6_MHC_QBL_CTG1"                "CHR_HSCHR6_MHC_SSTO_CTG1"              
#> [293] "CHR_HSCHR7_1_CTG1"                      "CHR_HSCHR7_1_CTG4_4"                    "CHR_HSCHR7_1_CTG6"                      "CHR_HSCHR7_1_CTG7"                     
#> [297] "CHR_HSCHR7_2_CTG1"                      "CHR_HSCHR7_2_CTG4_4"                    "CHR_HSCHR7_2_CTG6"                      "CHR_HSCHR7_2_CTG7"                     
#> [301] "CHR_HSCHR7_3_CTG6"                      "CHR_HSCHR8_1_CTG1"                      "CHR_HSCHR8_1_CTG6"                      "CHR_HSCHR8_1_CTG7"                     
#> [305] "CHR_HSCHR8_2_CTG1"                      "CHR_HSCHR8_2_CTG7"                      "CHR_HSCHR8_3_CTG1"                      "CHR_HSCHR8_3_CTG7"                     
#> [309] "CHR_HSCHR8_4_CTG1"                      "CHR_HSCHR8_4_CTG7"                      "CHR_HSCHR8_5_CTG1"                      "CHR_HSCHR8_5_CTG7"                     
#> [313] "CHR_HSCHR8_6_CTG1"                      "CHR_HSCHR8_7_CTG1"                      "CHR_HSCHR8_8_CTG1"                      "CHR_HSCHR8_9_CTG1"                     
#> [317] "CHR_HSCHR9_1_CTG1"                      "CHR_HSCHR9_1_CTG2"                      "CHR_HSCHR9_1_CTG3"                      "CHR_HSCHR9_1_CTG4"                     
#> [321] "CHR_HSCHR9_1_CTG5"                      "CHR_HSCHR9_1_CTG6"                      "CHR_HSCHRX_1_CTG3"                      "CHR_HSCHRX_2_CTG12"                    
#> [325] "CHR_HSCHRX_2_CTG3"                      "GL000009.2"                             "GL000194.1"                             "GL000195.1"                            
#> [329] "GL000205.2"                             "GL000213.1"                             "GL000216.2"                             "GL000218.1"                            
#> [333] "GL000219.1"                             "GL000220.1"                             "GL000225.1"                             "KI270442.1"                            
#> [337] "KI270711.1"                             "KI270713.1"                             "KI270721.1"                             "KI270726.1"                            
#> [341] "KI270727.1"                             "KI270728.1"                             "KI270731.1"                             "KI270733.1"                            
#> [345] "KI270734.1"                             "KI270744.1"                             "KI270750.1"                             "LRG_183"                               
#> [349] "LRG_187"                                "LRG_239"                                "LRG_311"                                "LRG_721"                               
#> [353] "LRG_741"                                "LRG_93"                                 "MT"                                     "X"                                     
#> [357] "Y"
ensembldb::updateEnsDb(x)
#> EnsDb for Ensembl:
#> |Backend: SQLite
#> |Db type: EnsDb
#> |Type of Gene ID: Ensembl Gene ID
#> |Supporting package: ensembldb
#> |Db created by: ensembldb package from Bioconductor
#> |script_version: 0.3.0
#> |Creation time: Thu May 18 16:32:27 2017
#> |ensembl_version: 86
#> |ensembl_host: localhost
#> |Organism: homo_sapiens
#> |taxonomy_id: 9606
#> |genome_build: GRCh38
#> |DBSCHEMAVERSION: 2.0
#> | No. of genes: 63970.
#> | No. of transcripts: 216741.
#> |Protein data available.

ensembldb::genes(x, columns=c("gene_name"),
             filter=list(AnnotationFilter::SeqNameFilter("X"), AnnotationFilter::GeneBiotypeFilter("protein_coding")))
#> GRanges object with 841 ranges and 3 metadata columns:
#>                   seqnames              ranges strand |   gene_name         gene_id   gene_biotype
#>                      <Rle>           <IRanges>  <Rle> | <character>     <character>    <character>
#>   ENSG00000182378        X       276322-303356      + |      PLCXD1 ENSG00000182378 protein_coding
#>   ENSG00000178605        X       304529-318819      - |      GTPBP6 ENSG00000178605 protein_coding
#>   ENSG00000167393        X       333963-386955      - |     PPP2R3B ENSG00000167393 protein_coding
#>   ENSG00000185960        X       624344-659411      + |        SHOX ENSG00000185960 protein_coding
#>   ENSG00000205755        X     1187549-1212750      - |       CRLF2 ENSG00000205755 protein_coding
#>               ...      ...                 ...    ... .         ...             ...            ...
#>   ENSG00000277745        X 155459415-155460005      - |      H2AFB3 ENSG00000277745 protein_coding
#>   ENSG00000185973        X 155490115-155669944      - |       TMLHE ENSG00000185973 protein_coding
#>   ENSG00000168939        X 155767812-155782459      + |       SPRY3 ENSG00000168939 protein_coding
#>   ENSG00000124333        X 155881293-155943769      + |       VAMP7 ENSG00000124333 protein_coding
#>   ENSG00000124334        X 155997581-156010817      + |        IL9R ENSG00000124334 protein_coding
#>   -------
#>   seqinfo: 1 sequence from GRCh38 genome
ensembldb::transcripts(x, columns=ensembldb::listColumns(x, "tx"),
                        filter = AnnotationFilter::AnnotationFilterList(), order.type = "asc", return.type = "GRanges")
#> GRanges object with 216741 ranges and 6 metadata columns:
#>                   seqnames            ranges strand |           tx_id             tx_biotype tx_cds_seq_start tx_cds_seq_end         gene_id         tx_name
#>                      <Rle>         <IRanges>  <Rle> |     <character>            <character>        <integer>      <integer>     <character>     <character>
#>   ENST00000456328        1       11869-14409      + | ENST00000456328   processed_transcript             <NA>           <NA> ENSG00000223972 ENST00000456328
#>   ENST00000450305        1       12010-13670      + | ENST00000450305 transcribed_unproces..             <NA>           <NA> ENSG00000223972 ENST00000450305
#>   ENST00000488147        1       14404-29570      - | ENST00000488147 unprocessed_pseudogene             <NA>           <NA> ENSG00000227232 ENST00000488147
#>   ENST00000619216        1       17369-17436      - | ENST00000619216                  miRNA             <NA>           <NA> ENSG00000278267 ENST00000619216
#>   ENST00000473358        1       29554-31097      + | ENST00000473358                lincRNA             <NA>           <NA> ENSG00000243485 ENST00000473358
#>               ...      ...               ...    ... .             ...                    ...              ...            ...             ...             ...
#>   ENST00000420810        Y 26549425-26549743      + | ENST00000420810   processed_pseudogene             <NA>           <NA> ENSG00000224240 ENST00000420810
#>   ENST00000456738        Y 26586642-26591601      - | ENST00000456738 unprocessed_pseudogene             <NA>           <NA> ENSG00000227629 ENST00000456738
#>   ENST00000435945        Y 26594851-26634652      - | ENST00000435945 unprocessed_pseudogene             <NA>           <NA> ENSG00000237917 ENST00000435945
#>   ENST00000435741        Y 26626520-26627159      - | ENST00000435741   processed_pseudogene             <NA>           <NA> ENSG00000231514 ENST00000435741
#>   ENST00000431853        Y 56855244-56855488      + | ENST00000431853   processed_pseudogene             <NA>           <NA> ENSG00000235857 ENST00000431853
#>   -------
#>   seqinfo: 357 sequences (1 circular) from GRCh38 genome

# makeTxDbFromEnsembl() has moved from GenomicFeatures to the txdbmaker package
txdbEnsemblGRCh38 <- txdbmaker::makeTxDbFromEnsembl(organism="Homo sapiens", release=98)
#> Fetch transcripts and genes from Ensembl ... OK
#>   (fetched 250194 transcripts from 67946 genes)
#> Fetch exons and CDS from Ensembl ... OK
#> Fetch chromosome names and lengths from Ensembl ...OK
#> Gather the metadata ... OK
#> Make the TxDb object ... OK
txdb <- as.list(txdbEnsemblGRCh38)
lapply(txdb,head)
#> $transcripts
#>   tx_id         tx_name        tx_type        tx_chrom tx_strand tx_start  tx_end
#> 1     1 ENST00000636745         lncRNA CHR_HG107_PATCH         +  1049876 1055745
#> 2     2 ENST00000636387         lncRNA CHR_HG107_PATCH         +  1052607 1055745
#> 3     3 ENST00000643422 protein_coding CHR_HG107_PATCH         +  1075018 1112365
#> 4     4 ENST00000645631 protein_coding CHR_HG107_PATCH         +  1075018 1112365
#> 5     5 ENST00000636567 protein_coding CHR_HG107_PATCH         +  1159911 1203106
#> 6     6 ENST00000636545 protein_coding CHR_HG107_PATCH         -  1012823 1036718
#> 
#> $splicings
#>   tx_id exon_rank exon_id       exon_name      exon_chrom exon_strand exon_start exon_end cds_id        cds_name cds_start cds_end
#> 1     1         1       1 ENSE00003797146 CHR_HG107_PATCH           +    1049876  1049958     NA            <NA>        NA      NA
#> 2     1         2       2 ENSE00003795151 CHR_HG107_PATCH           +    1051619  1051839     NA            <NA>        NA      NA
#> 3     1         3       4 ENSE00003793692 CHR_HG107_PATCH           +    1054235  1054388     NA            <NA>        NA      NA
#> 4     1         4       5 ENSE00003797325 CHR_HG107_PATCH           +    1055110  1055745     NA            <NA>        NA      NA
#> 5     2         1       3 ENSE00003798310 CHR_HG107_PATCH           +    1052607  1055745     NA            <NA>        NA      NA
#> 6     3         1       6 ENSE00003815958 CHR_HG107_PATCH           +    1075018  1075093      1 ENSP00000494473   1075018 1075093
#> 
#> $genes
#>   tx_id         gene_id
#> 1     1 ENSG00000283640
#> 2     2 ENSG00000283640
#> 3     3 ENSG00000284971
#> 4     4 ENSG00000284971
#> 5     5 ENSG00000283158
#> 6     6 ENSG00000283350
#> 
#> $chrominfo
#>              chrom    length is_circular
#> 1  CHR_HG107_PATCH 135088590          NA
#> 2  CHR_HG109_PATCH  58617934          NA
#> 3  CHR_HG126_PATCH 198295908          NA
#> 4 CHR_HG1277_PATCH 133754853          NA
#> 5 CHR_HG1296_PATCH 190208697          NA
#> 6 CHR_HG1298_PATCH 190196285          NA

txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene

# liverExprs <- quantifyExpressionsFromBWs(txdb = txdb,BWfiles=,experimentalDesign=)

10 Bioconductor forum

Web: https://support.bioconductor.org/

11 Bioconductor/CRAN packages

Package Description
Bioconductor
AnnotationDbi AnnotationDb objects and their progeny, methods etc.
Biobase Base functions for Bioconductor
Biostrings Efficient manipulation of biological strings
clusterProfiler Functional profiles for genes and gene clusters
ComplexHeatmap Make complex heatmaps
DESeq2 Differential gene expression analysis based on the negative binomial distribution
edgeR Empirical analysis of digital gene expression
EnsDb.Hsapiens.v86 Exposes an annotation database generated from Ensembl
ensembldb Retrieve annotation data from an Ensembl based package
FlowSorted.DLPFC.450k Illumina HumanMethylation data on sorted frontal cortex cell populations
graphite GRAPH Interaction from pathway topological environment
IlluminaHumanMethylation450kmanifest Annotation for Illumina’s 450k methylation arrays
INSPEcT Quantification of the intronic and exonic gene features and the post-transcriptional regulation analysis
org.Hs.eg.db Conversion between Entrez IDs and gene symbols
OUTRIDER OUTlier in RNA-Seq fInDER
Pi Priority index, leveraging genetic evidence to prioritise drug targets at the gene and pathway level
quantro A test for when to use quantile normalisation
recount3 Interface to uniformly processed RNA-seq data
Rgraphviz Interfaces R with the AT&T graphviz library for plotting R graph objects from the graph package
sva Surrogate Variable Analysis
TxDb.Hsapiens.UCSC.hg38.knownGene Annotation of the human genome
CRAN
doParallel Foreach Parallel Adaptor for the ‘parallel’ Package
GeneNet Modeling and Inferring Gene Networks
ggplot2 Data Visualisations Using the Grammar of Graphics
heatmaply Interactive Cluster Heat Maps Using plotly and ggplot2
pheatmap Pretty heatmaps for results visualisation
plyr Splitting, applying and combining data
RColorBrewer ColorBrewer Palettes
WGCNA Weighted correlation network analysis

References

1. Köster, J., Forster, J., Schmeier, S. & Salazar, V. snakemake-workflows/rna-seq-star-deseq2: Version 1.2.0. (2021) doi:10.5281/zenodo.5245549.
2. Carlson, M. R. J., Pagès, H., Arora, S., Obenchain, V. & Morgan, M. Genomic annotation resources in R/Bioconductor. in Statistical genomics: Methods and protocols (eds. Mathé, E. & Davis, S.) 67–69 (Springer New York, New York, NY, 2016). doi:10.1007/978-1-4939-3578-9_4.