Artificial intelligence

Pyro

https://pyro.ai/examples/index.html

pytorch

The home page is https://pytorch.github.io, and the repository itself https://github.com/pytorch/ with https://github.com/pytorch/examples.

tensorflow

The tensorflow repository is here, https://github.com/tensorflow/tensorflow, and it is relatively easy to install via pip,

pip install tensorflow
python <<END
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
END

Follow https://github.com/aymericdamien/TensorFlow-Examples for readily adaptible examples.

Also

https://github.com/apress/pro-deep-learning-w-tensorflow

tensorQTL

AI-derived implementation.

https://github.com/broadinstitute/tensorqtl, https://github.com/broadinstitute/SignatureAnalyzer-GPU

module load python/3.6
git clone git@github.com:broadinstitute/tensorqtl.git
cd tensorqtl
# set up virtual environment and install
virtualenv venv
source venv/bin/activate
pip install -r install/requirements.txt .
cd example
wget https://personal.broadinstitute.org/francois/geuvadis/GEUVADIS.445_samples.GRCh38.20170504.maf01.filtered.nodup.bed
wget https://personal.broadinstitute.org/francois/geuvadis/GEUVADIS.445_samples.GRCh38.20170504.maf01.filtered.nodup.bim
wget https://personal.broadinstitute.org/francois/geuvadis/GEUVADIS.445_samples.GRCh38.20170504.maf01.filtered.nodup.fam   
wget https://personal.broadinstitute.org/francois/geuvadis/GEUVADIS.445_samples.covariates.txt
wget https://personal.broadinstitute.org/francois/geuvadis/GEUVADIS.445_samples.expression.bed.gz
# Jupyter notebook
sed -i 's/filtered/filtered.nodup/g' tensorqtl_examples.ipynb
# csd3
hostname
jupyter notebook --ip=127.0.0.1 --no-browser --port 8081
# local host
ssh -4 -L 8081:127.0.0.1:8081 -fN hostname.hpc.cam.ac.uk
firefox <generated URL from jupyter notebook command above> &

Note that a Parquet file is generated we use SparkR,

module load spark/2.4.0-bin-hadoop2.7

followed by

library(SparkR)
sparkR.session()
df <- read.parquet("GEUVADIS.445_samples.cis_qtl_pairs.chr18.parquet")
head(df)

to get

> dim(df)
[1] 2927819       9
> head(df)
       phenotype_id          variant_id tss_distance        maf ma_samples
1 ENSG00000263006.6 chr18_10644_C_G_b38       -98421 0.01685393         15
2 ENSG00000263006.6 chr18_10847_C_A_b38       -98218 0.01910112         17
3 ENSG00000263006.6 chr18_11275_G_A_b38       -97790 0.02471910         22
4 ENSG00000263006.6 chr18_11358_G_A_b38       -97707 0.02471910         22
5 ENSG00000263006.6 chr18_11445_G_A_b38       -97620 0.02359551         21
6 ENSG00000263006.6 chr18_13859_G_C_b38       -95206 0.02471910         22
  ma_count pval_nominal       slope  slope_se
1       15    0.5808729 -0.11776078 0.2131254
2       17    0.1428839 -0.29872555 0.2035047
3       22    0.7452308  0.05461900 0.1679810
4       22    0.7452308  0.05461900 0.1679810
5       21    0.6032759  0.08937798 0.1718505
6       22    0.7452308  0.05461900 0.1679810

An alternative is to tweak the R package arrow.

The command-line counterpart is as follows,

export plink_prefix_path=GEUVADIS.445_samples.GRCh38.20170504.maf01.filtered.nodup
export expression_bed=GEUVADIS.445_samples.expression.bed.gz
export covariates_file=GEUVADIS.445_samples.covariates.txt
export prefix=GEUVADIS.445_samples

python3 -m tensorqtl ${plink_prefix_path} ${expression_bed} ${prefix} \
    --covariates ${covariates_file} \
    --mode cis

python3 -m tensorqtl ${plink_prefix_path} ${expression_bed} ${prefix} \
    --covariates ${covariates_file} \
    --mode trans

Again one can read the Parquet format output.

Taylor-Weiner et al (2019). Scaling computational genomics to millions of individuals with GPUs. Genome Biol 20:228, https://doi.org/10.1186/s13059-019-1836-7