NOTES
Installation
singularity pull -F library://hmgu-itg/default/burden_testing:latest
singularity inspect burden_testing_latest.sif
singularity exec burden_testing_latest.sif help
singularity exec burden_testing_latest.sif single_cohort_munge_variantlist -h
singularity exec burden_testing_latest.sif prepare-regions -h
singularity exec burden_testing_latest.sif make-group-file -h
singularity exec --containall burden_testing_latest.sif step2 --help
The tag can be :latest for the latest version or :1.5.4; see the wiki instructions on installation, data and SMMAT.
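To pin a specific release rather than :latest, the pull can be parameterised by tag. The following is a minimal sketch, assuming the default output name follows the burden_testing_<tag>.sif pattern seen in the commands above; sif_name and pull_image are hypothetical helper names, not part of the container.

```shell
#!/usr/bin/bash
# sif_name TAG: file name that singularity pull is assumed to write by default
sif_name() { echo "burden_testing_${1}.sif"; }

# pull_image TAG: pull the given tag (default: latest) and show its metadata
pull_image() {
  local tag=${1:-latest}
  singularity pull -F "library://hmgu-itg/default/burden_testing:${tag}"
  singularity inspect "$(sif_name "${tag}")"
}
```

For example, pull_image 1.5.4 would fetch the 1.5.4 tag and then inspect burden_testing_1.5.4.sif.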
Debugging
Additional information can be obtained with singularity exec --help. For debugging, e.g., of VCF2GDS, one can invoke the singularity shell via
singularity shell --bind ${PWD} --containall --shell /usr/bin/bash burden_testing_latest.sif
The container has its own HOME directory, within which ~ is recognised.
- The embedded call to sbatch requires the --wait option, since the succeeding steps need its full results.
- The array jobs could be altered to accommodate other software and models (BOLT-LMM, GCTA --fastGWA-mlm/--fastGWA-lr for mixed model/linear regression).
- The group files for rare variant analysis have been provided by the central analysis team. Note that (a) they are whole-genome, so are preferably split by chromosome; (b) no groupings are contained for chromosomes X and Y, so these could be dropped from the analysis.
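Splitting a whole-genome group file by chromosome while dropping X and Y could be sketched as follows. This assumes a whitespace-delimited file without a header and with the chromosome in column 2 (an assumption about the layout; adjust the column to the actual files). split_groups and the output naming are hypothetical.

```shell
#!/usr/bin/bash
# split_groups GROUP_FILE OUT_PREFIX
# Writes OUT_PREFIX.chr<N>.txt per chromosome; rows on X or Y are skipped.
# Assumption: chromosome is in column 2, optionally prefixed with "chr".
split_groups() {
  local group_file=$1 prefix=$2
  awk -v prefix="${prefix}" '
    { chr = $2; sub(/^chr/, "", chr) }      # normalise "chr1" to "1"
    chr == "X" || chr == "Y" { next }       # no groupings on X and Y
    { print > (prefix ".chr" chr ".txt") }  # one output file per chromosome
  ' "${group_file}"
}
```

Each per-chromosome file can then be passed to the matching array job.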
File upload
Uploading is a heavy burden for the system, so the files are compressed first with sftp.sb (see addons below).
Plain sftp works, but sftp wrapped by the lftp utility is considerably faster.
export HOST=
export USER=
export PASS=
export RVA=/rds/project/jmmh2/rds-jmmh2-projects/olink_proteomics/scallop/SCALLOP-Seq/rva
lftp -c "open sftp://${USER}:${PASS}@${HOST}:/genetic_data/for_Grace; mirror -c -P=20 -R -v ${RVA}"
File download
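Dropping -R (reverse mirror) from the command above gives the download. Without -R, lftp treats the first mirror argument as the remote source, so the path may need adjusting to the remote layout.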
export HOST=
export USER=
export PASS=
export RVA=/rds/project/jmmh2/rds-jmmh2-projects/olink_proteomics/scallop/SCALLOP-Seq/rva
lftp -c "open sftp://${USER}:${PASS}@${HOST}:/genetic_data/for_Grace; mirror -c -P=20 -v ${RVA}"
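Since upload and download differ only in direction, the mirror command can be generated from one helper. The following is a dry-run sketch that only prints the command; mirror_cmd is a hypothetical name, and it passes source and target explicitly rather than relying on lftp's defaults. The long form --parallel=20 is used in place of -P=20.

```shell
#!/usr/bin/bash
# mirror_cmd upload|download LOCAL_DIR REMOTE_DIR
# Prints an lftp mirror command; it does not connect anywhere.
mirror_cmd() {
  local direction=$1 local_dir=$2 remote_dir=$3
  case "${direction}" in
    # -R reverses the mirror: source is local, target is remote
    upload)   echo "mirror -c --parallel=20 -R -v ${local_dir} ${remote_dir}" ;;
    # default direction: source is remote, target is local
    download) echo "mirror -c --parallel=20 -v ${remote_dir} ${local_dir}" ;;
    *) echo "usage: mirror_cmd upload|download LOCAL_DIR REMOTE_DIR" >&2
       return 1 ;;
  esac
}
```

The output can be spliced into the call, e.g. lftp -c "open sftp://${USER}:${PASS}@${HOST}; $(mirror_cmd download ${RVA} /genetic_data/for_Grace/rva)", where the remote rva subdirectory is an assumption about where the upload landed.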
URLs
The script below gathers the incomplete (corrupted or missing) files into addons.
#!/usr/bin/bash
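# Map each listed file name to <first>-<second>/<file> and copy it into addons,
# e.g. wgs-cvd2_MMP.12__P39900/INTERVAL-wgs-cvd2_MMP.12__P39900-reg_Only-chr21.var.28.gz
cat addons/*txt | \
parallel -C' ' 'echo {} | sed "s/INTERVAL-//;s/-/\t/g" | awk -vf={} "{print \$1\"-\"\$2\"/\" f}"' | \
parallel -C' ' -j10 'cp -r {} addons'
# This directory is copied as a whole
cp -r wgs-cvd2_IgG.Fc.receptor.II.b__P31994 addons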
where the list files contain entries such as
==> addons/INTERVAL.cvd2.corruptedfiles.txt <==
INTERVAL-wgs-cvd2_MMP.12__P39900-reg_Only-chr21.var.28.gz
INTERVAL-wgs-cvd2_MMP.12__P39900-reg_Only-chr21.var.47.gz
==> addons/INTERVAL.cvd3.corruptedfiles.txt <==
INTERVAL-wgs-cvd3_PGLYRP1__O75594-reg_Only-chr10.score.50.gz
INTERVAL-wgs-cvd3_PGLYRP1__O75594-reg_Only-chr10.score.51.gz
==> addons/missing.txt <==
INTERVAL-wgs-cvd2_DCN__P07585-exon_reg-chr4.gz
INTERVAL-wgs-cvd2_CXCL1__P09341-exon_reg-chr5.gz
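The name-to-path mapping performed by the sed/awk step can also be written with plain bash, which makes it easier to check on a single entry. This is a minimal sketch assuming every name starts with INTERVAL- and has at least two further dash-separated fields, as in the listings above; addon_path is a hypothetical helper name.

```shell
#!/usr/bin/bash
# addon_path FILE: print <field1>-<field2>/<FILE>, where the fields are the
# first two dash-separated parts of FILE after the INTERVAL- prefix.
addon_path() {
  local file=$1
  local rest=${file#INTERVAL-}   # drop the cohort prefix
  local first second
  IFS=- read -r first second _ <<< "${rest}"
  printf '%s-%s/%s\n' "${first}" "${second}" "${file}"
}
```

For instance, addon_path INTERVAL-wgs-cvd2_DCN__P07585-exon_reg-chr4.gz gives the directory wgs-cvd2_DCN__P07585 followed by the original file name.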