Linux systems

While macOS is already Unix-based, here is information on how to install Ubuntu on Windows 10. It runs natively on Windows but needs some extra work, e.g., installation of Xming or VcXsrv, to get the Ubuntu desktop. A current drawback of this Windows extension is that it cannot exchange files between the two systems, so it is appealing to create and use virtual machines.

Linux distributions such as Fedora and Ubuntu offer their own ways to set up a system. Oracle’s VirtualBox allows a Linux system to be installed canonically as an application or, once the software is set up, a pre-built virtual disk to be attached to it. Later on, when we remove files or create snapshots, we can follow these steps to shrink the virtual disk.
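The linked steps are not reproduced here. As a minimal sketch of one common approach, the guest's free space is first filled with zeros and the disk image is then compacted from the host with `VBoxManage modifymedium`. The helper names (`run`, `zero_free_space`, `compact_disk`) and the example path are hypothetical, and the sketch assumes a dynamically allocated VDI image and a powered-off guest for the second step. By default the commands are only printed; set `RUN=1` to execute them.

```shell
#!/bin/sh
# Sketch only: helper names and the example path are hypothetical.

# run: print a command; execute it only when RUN=1 is set
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Step 1 (inside the Linux guest): overwrite free space with zeros,
# then delete the filler file. dd stops when the disk is full.
zero_free_space() {
    run sudo dd if=/dev/zero of=/zerofile bs=1M
    run sudo rm -f /zerofile
}

# Step 2 (on the host, guest powered off): compact the disk image.
# $1 is the path to the .vdi file.
compact_disk() {
    run VBoxManage modifymedium disk "$1" --compact
}
```

For example, `compact_disk "$HOME/VirtualBox VMs/ubuntu/ubuntu.vdi"` prints the command that would be run on the host.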

Now a variety of things can be done; for instance, we can install Google Chrome and Sublime Text.


wget -q -O - | sudo apt-key add -
echo 'deb [arch=amd64] stable main' | sudo tee /etc/apt/sources.list.d/google-chrome.list
sudo apt-get update 
sudo apt-get install google-chrome-stable

Now you can start google-chrome.


wget -qO - | sudo apt-key add -
sudo apt-get install apt-transport-https
echo "deb apt/stable/" | sudo tee /etc/apt/sources.list.d/sublime-text.list
sudo apt-get update
sudo apt-get install sublime-text
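On recent Debian and Ubuntu releases `apt-key` is deprecated, so the commands above may print warnings or fail. A hedged alternative is to store each repository key in its own keyring and reference it with `signed-by`. The functions `apt_source_line` and `add_repo` below are hypothetical helpers, and the key and repository URLs are left as arguments because they are not shown in this document.

```shell
#!/bin/sh
# Sketch only: apt_source_line and add_repo are hypothetical helpers.

# apt_source_line NAME REPO_URL SUITE [COMPONENT...]
# Print an apt source line that pins the repository to its own keyring.
apt_source_line() {
    name=$1; url=$2; shift 2
    echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/$name.gpg] $url $*"
}

# add_repo NAME KEY_URL REPO_URL SUITE [COMPONENT...]
# Download the signing key, convert it to a binary keyring, and
# register the repository. Requires sudo, wget and gpg.
add_repo() {
    name=$1; key_url=$2; shift 2
    sudo install -d -m 0755 /etc/apt/keyrings
    wget -qO - "$key_url" | gpg --dearmor |
        sudo tee "/etc/apt/keyrings/$name.gpg" >/dev/null
    apt_source_line "$name" "$@" |
        sudo tee "/etc/apt/sources.list.d/$name.list"
}
```

After calling `add_repo` with the appropriate URLs, `sudo apt-get update` and `sudo apt-get install` proceed as before.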

Remarkable with MathJax

sudo dpkg -i remarkable_1.87_all.deb
sudo apt-get install -f
git clone MathJax

so that MathJax can be invoked from your Markdown document as follows,

<script type="text/javascript" src="/home/physalia/MathJax//MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
<script type="text/javascript" src="/home/physalia/MathJax/MathJax.js"></script>

A relatively easier option is Mango, which can also generate HTML and PDF but does not require MathJax.


It is preferable to have R and RStudio within the Linux system. For instance, on Fedora you can issue

dnf list R R-devel
dnf list rstudio

to check their availability and

sudo dnf install R R-devel
sudo rpm -i rstudio-1.0.153-x86_64.rpm

to install. For Ubuntu, use apt with the corresponding package names, e.g.,

sudo apt install r-base r-base-dev r-mathlib
sudo dpkg -i rstudio-xenial-1.0.153-amd64.deb

It is more involved to download the source and compile it, since this also requires installing various components (LaTeX, readline, Tcl/Tk, etc.).
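For orientation, a source build typically follows the usual configure, make, install sequence. The sketch below assumes the R source tarball has already been downloaded and unpacked, and that the components mentioned above are installed. The helpers `run` and `build_r` and the example directory are hypothetical. Commands are only printed unless `RUN=1` is set.

```shell
#!/bin/sh
# Sketch only: run and build_r are hypothetical helpers.

# run: print a command; execute it only when RUN=1 is set
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# build_r SRC_DIR: configure, compile, test and install R from an
# unpacked source tree located at SRC_DIR.
build_r() {
    run cd "$1" || return 1
    run ./configure
    run make
    run make check
    run sudo make install
}
```

For example, `RUN=1 build_r "$HOME/src/R"` would carry out the build in a hypothetical `$HOME/src/R` directory.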

Pre-course software

In order of relevance, and where possible, they are:

Specific software for genetic data: fcgene, QCTOOL, PLINK, GCTA, BOLT-LMM, (LocusZoom), MAGMA, DEPICT, PASCAL, finemap, TWAS, FUSION.
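A quick way to see which of these programs are already available is to look them up on the `PATH`. The following is a minimal sketch: the executable names in `tools` are assumptions (installed binaries may be named differently, or carry version suffixes), and `check_tool` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: executable names are assumptions; adjust them to
# match how the programs are installed on your system.
tools="plink gcta64 bolt qctool magma finemap"

# check_tool NAME: report whether NAME is on PATH;
# returns non-zero when it is missing.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: MISSING"
        return 1
    fi
}

# Report on every tool without stopping at the first missing one
for t in $tools; do
    check_tool "$t" || true
done
```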

R packages by day

  1. Rcmdr, devtools

  2. dplyr, tidyr, tidyverse, ggplot2, metafor, NCBI2R, genetics, gap, HardyWeinberg, haplo.stats, kinship2, pedigreemm

  3. snpStats, GenABEL

  4. lme4, MCMCglmm, openxlsx, SKAT

  5. R2BGLiMS, MendelianRandomization
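The packages for a given day can be installed from the command line with `Rscript`. This is a sketch under stated assumptions: `day_packages` and `install_day` are hypothetical helpers, the CRAN mirror URL is one possible choice, and `install.packages` only covers packages currently on CRAN. Some entries may need another source (snpStats, for example, is distributed through Bioconductor), in which case R issues a warning for them.

```shell
#!/bin/sh
# Sketch only: day_packages and install_day are hypothetical helpers.

# day_packages N: print the package names listed for day N
day_packages() {
    case "$1" in
        1) echo "Rcmdr devtools" ;;
        2) echo "dplyr tidyr tidyverse ggplot2 metafor NCBI2R genetics gap HardyWeinberg haplo.stats kinship2 pedigreemm" ;;
        3) echo "snpStats GenABEL" ;;
        4) echo "lme4 MCMCglmm openxlsx SKAT" ;;
        5) echo "R2BGLiMS MendelianRandomization" ;;
        *) echo "unknown day: $1" >&2; return 1 ;;
    esac
}

# install_day N: install the day's packages from a CRAN mirror.
# Package names are passed to R as trailing command-line arguments.
install_day() {
    pkgs=$(day_packages "$1") || return 1
    # $pkgs is left unquoted on purpose so each name is one argument
    Rscript -e 'install.packages(commandArgs(trailingOnly = TRUE), repos = "https://cloud.r-project.org")' $pkgs
}
```

For example, `install_day 1` attempts to install Rcmdr and devtools.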

Genetic analysis software

| Software | URL | Notes |
|----------|-----|-------|
| BEAGLE | | Genotype imputation |
| bedtools | | Data manipulation |
| BOLT-LMM | | Mixed modeling |
| DEPICT | Broad, (GitHub) | Pathway analysis |
| finemap | | Fine-mapping on summary statistics |
| GCTA | | Mixed model |
| GenABEL | | GWA analysis on chip data |
| GRAIL | (VIZ-GRAIL) | Obsolete pathway analysis software |
| GTOOL | | Data manipulation |
| IMPUTE2 | | Genotype imputation |
| LDSC | | Heritability from summary statistics |
| LocusZoom | | Regional association plot |
| MaCH | | Genotype imputation |
| MAGENTA | | Pathway analysis |
| MAGMA | | Pathway analysis |
| METAL | | Meta-analysis |
| PASCAL | (source) | Pathway analysis |
| PLINK2 | | Efficient data analysis on directly genotyped data |
| ProbABEL | | GWA analysis for linear, logistic and survival models |
| QCTOOL | | QC summary statistics |
| QUICKTEST | | Gene-environment interaction analysis |
| rvtests | (GitHub) | Rare-variant analysis |
| samtools | | Data manipulation |
| SHAPIT | | Haplotype phasing |
| SNPTEST | | Analysis of imputed genotype data |
| UNPHASED | | Haplotype analysis |
| vcftools | | Data manipulation |
| VEGAS2 | online version, command-line version | Annotation |
| VEP | | Annotation |

This is an extension of the Baylor list. The old Columbia/Rockefeller list may still contain useful links.

Institutional repositories are increasingly available, e.g. Eskin group, Broad Institute.

Generic software

JAGS is Just Another Gibbs Sampler. It is a program for the statistical analysis of Bayesian hierarchical models by Markov Chain Monte Carlo.

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS.

SAS (pronounced "sass") once stood for "statistical analysis system." It began at North Carolina State University as a project to analyze agricultural research. Demand for such software capabilities began to grow, and SAS was founded in 1976 to help customers in all sorts of industries – from pharmaceutical companies and banks to academic and governmental entities. SAS – both the software and the company – thrived throughout the next few decades. Development of the software attained new heights in the industry because it could run across all platforms, using the multivendor architecture for which it is known today. While the scope of the company has spread across the globe, the encouraging and innovative corporate culture has remained the same.

Stan is relied on by thousands of users for statistical modeling, data analysis, and prediction in the social, biological, and physical sciences, engineering, and business.

Stata is a complete, integrated statistical software package that provides everything you need for data analysis, data management, and graphics.