An overview of pQTLdata
pQTLdata.Rmd
This package gathers information, metadata and relevant scripts for proteogenomic analysis.
1 Collections
Built up over several years of proteomic analysis and intended for future extension, the collections are in two locations:
- data/: R datasets.
- inst/: EndNote/, Olink/, scripts/, pQTLdata.sh and docs.sh, which spread into the package’s root directory after installation.
While library(help=pQTLdata) displays the general information, ?pQTLdata gives a list of data objects in the package.
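As a complement to the help pages, the data objects can also be listed programmatically. The sketch below uses only base R (`requireNamespace()` and `utils::data()`); the helper name `list_data_objects` is hypothetical and not part of pQTLdata, and the output depends on the installed version of the package.

```r
# Hypothetical helper (not part of pQTLdata): names of the datasets a package
# ships under data/, or character(0) if the package is not installed.
list_data_objects <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) return(character(0))
  # data(package = ) returns a "packageIQR" object; its $results matrix has
  # the columns Package, LibPath, Item and Title.
  utils::data(package = pkg)$results[, "Item"]
}

# Prints the object names when pQTLdata is installed, character(0) otherwise.
print(list_data_objects("pQTLdata"))
```

Returning an empty vector rather than stopping keeps the call safe in scripts that may run where the package is absent.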
2 Protein panels
2.1 Olink/Inflammation
The Olink qPCR inflammation panel (inf1) [1] used in the SCALLOP consortium seeds the collection.
gene | prot | uniprot | target | target.short | chr | start | end | ensembl_gene_id | alt_name | ensGene | chromosome | start38 | end38 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ADA | ADA | P00813 | Adenosine Deaminase (ADA) | ADA | 20 | 43248163 | 43280874 | ENSG00000196839 | NA | NA | NA | NA | NA |
ARTN | ARTN | Q5T4W7 | Artemin (ARTN) | ARTN | 1 | 44398992 | 44402913 | ENSG00000117407 | NA | NA | NA | NA | NA |
AXIN1 | AXIN1 | O15169 | Axin-1 (AXIN1) | AXIN1 | 16 | 337440 | 402673 | ENSG00000103126 | NA | NA | NA | NA | NA |
BDNF | BDNF | P23560 | Brain-derived neutrophic factor (BDNF) | BDNF | 11 | 27676440 | 27743605 | ENSG00000176697 | NA | NA | NA | NA | NA |
CASP8 | CASP.8 | Q14790 | Caspase 8 (CASP-8) | CASP-8 | 2 | 202098166 | 202152434 | ENSG00000064012 | NA | NA | NA | NA | NA |
CCL11 | CCL11 | P51671 | Eotaxin-1 (CCL11) | CCL11 | 17 | 32612687 | 32615353 | ENSG00000172156 | CCL11 | NA | NA | NA | NA |
CCL13 | MCP.4 | Q99616 | Monocyte chemotactic protein 4 (MCP-4) | MCP-4 | 17 | 32683471 | 32685629 | ENSG00000181374 | CCL13 | NA | NA | NA | NA |
CCL19 | CCL19 | Q99731 | C-C motif chemokine 19 (CCL19) | CCL19 | 9 | 34689564 | 34691274 | ENSG00000172724 | NA | NA | NA | NA | NA |
CCL2 | MCP.1 | P13500 | Monocyte chemotactic protein 1 (MCP-1) | MCP-1 | 17 | 32582304 | 32584222 | ENSG00000108691 | CCL2 | NA | NA | NA | NA |
CCL20 | CCL20 | P78556 | C-C motif chemokine 20 (CCL20) | CCL20 | 2 | 228678558 | 228682272 | ENSG00000115009 | NA | NA | NA | NA | NA |
CCL23 | CCL23 | P55773 | C-C motif chemokine 23 (CCL23) | CCL23 | 17 | 34340096 | 34345005 | ENSG00000167236 | NA | ENSG00000274736 | 17 | 36013056 | 36017972 |
CCL25 | CCL25 | O15444 | C-C motif chemokine 25 (CCL25) | CCL25 | 19 | 8117651 | 8127534 | ENSG00000131142 | NA | ENSG00000131142 | 19 | 8052318 | 8062660 |
CCL28 | CCL28 | Q9NRJ3 | C-C motif chemokine 28 (CCL28) | CCL28 | 5 | 43376747 | 43412493 | ENSG00000151882 | NA | NA | NA | NA | NA |
CCL3 | MIP.1.alpha | P10147 | Macrophage inflammatory protein 1-alpha (MIP-1 alpha) | MIP-1 alpha | 17 | 34415602 | 34417515 | ENSG00000006075 | NA | ENSG00000277632 | 17 | 36088256 | 36090169 |
CCL4 | CCL4 | P13236 | C-C motif chemokine 4 (CCL4) | CCL4 | 17 | 34430983 | 34433014 | ENSG00000129277 | NA | ENSG00000275302 | 17 | 36103827 | 36105621 |
CCL7 | MCP.3 | P80098 | Monocyte chemotactic protein 3 (MCP-3) | MCP-3 | 17 | 32597240 | 32599261 | ENSG00000108688 | CCL7 | NA | NA | NA | NA |
CCL8 | MCP.2 | P80075 | Monocyte chemotactic protein 2 (MCP-2) | MCP-2 | 17 | 32646055 | 32648421 | ENSG00000108700 | CCL8 | NA | NA | NA | NA |
CD244 | CD244 | Q9BZW8 | Natural killer cell receptor 2B4 (CD244) | CD244 | 1 | 160799950 | 160832692 | ENSG00000122223 | NA | NA | NA | NA | NA |
CD274 | PD.L1 | Q9NZQ7 | Programmed cell death 1 ligand 1 (PD-L1) | PD-L1 | 9 | 5450503 | 5470566 | ENSG00000120217 | NA | NA | NA | NA | NA |
CD40 | CD40 | P25942 | CD40L receptor (CD40) | CD40 | 20 | 44746911 | 44758502 | ENSG00000101017 | NA | NA | NA | NA | NA |
CD5 | CD5 | P06127 | T-cell surface glycoprotein CD5 (CD5) | CD5 | 11 | 60869867 | 60895324 | ENSG00000110448 | NA | NA | NA | NA | NA |
CD6 | CD6 | P30203 | T-cell surface glycoprotein CD6 isoform (CD6) | CD6 | 11 | 60739115 | 60787849 | ENSG00000138675 | NA | ENSG00000013725 | 11 | 60971680 | 61020377 |
CDCP1 | CDCP1 | Q9H5V8 | CUB domain-containing protein 1 (CDCP1) | CDCP1 | 3 | 45123770 | 45187914 | ENSG00000163814 | NA | NA | NA | NA | NA |
CSF1 | CSF.1 | P09603 | Macrophage colony-stimulating factor 1 (CSF-1) | CSF-1 | 1 | 110452864 | 110473614 | ENSG00000184371 | NA | NA | NA | NA | NA |
CST5 | CST5 | P28325 | Cystatin D (CST5) | CST5 | 20 | 23856572 | 23860387 | ENSG00000170367 | NA | NA | NA | NA | NA |
CX3CL1 | CX3CL1 | P78423 | Fractalkine (CX3CL1) | CX3CL1 | 16 | 57406370 | 57418960 | ENSG00000006210 | NA | NA | NA | NA | NA |
CXCL1 | CXCL1 | P09341 | C-X-C motif chemokine 1 (CXCL1) | CXCL1 | 4 | 74735110 | 74736959 | ENSG00000163739 | NA | NA | NA | NA | NA |
CXCL10 | CXCL10 | P02778 | C-X-C motif chemokine 10 (CXCL10) | CXCL10 | 4 | 76942273 | 76944650 | ENSG00000169245 | NA | NA | NA | NA | NA |
CXCL11 | CXCL11 | O14625 | C-X-C motif chemokine 11 (CXCL11) | CXCL11 | 4 | 76954835 | 76962568 | ENSG00000169248 | NA | NA | NA | NA | NA |
CXCL5 | CXCL5 | P42830 | C-X-C motif chemokine 5 (CXCL5) | CXCL5 | 4 | 74861359 | 74864496 | ENSG00000163735 | NA | NA | NA | NA | NA |
CXCL6 | CXCL6 | P80162 | C-X-C motif chemokine 6 (CXCL6) | CXCL6 | 4 | 74702214 | 74714781 | ENSG00000124875 | CXCL6 | NA | NA | NA | NA |
CXCL9 | CXCL9 | Q07325 | C-X-C motif chemokine 9 (CXCL9) | CXCL9 | 4 | 76922428 | 76928641 | ENSG00000138755 | NA | NA | NA | NA | NA |
DNER | DNER | Q8NFT8 | Delta and Notch-like epidermal growth factor related receptor (DNER) | DNER | 2 | 230222345 | 230579274 | ENSG00000187957 | NA | NA | NA | NA | NA |
EIF4EBP1 | 4E.BP1 | Q13541 | Eukaryotic translation initiation factor 4E-binding protein 1 (4EBP1) | 4EBP1 | 8 | 37887859 | 37917883 | ENSG00000187840 | NA | NA | NA | NA | NA |
FGF19 | FGF.19 | O95750 | Fibroblast growth factor 19 (FGF-19) | FGF-19 | 11 | 69513000 | 69519410 | ENSG00000162344 | NA | NA | NA | NA | NA |
FGF21 | FGF.21 | Q9NSA1 | Fibroblast growth factor 21 (FGF-21) | FGF-21 | 19 | 49258816 | 49261587 | ENSG00000105550 | NA | NA | NA | NA | NA |
FGF23 | FGF.23 | Q9GZV9 | Fibroblast growth factor 23 (FGF-23) | FGF-23 | 12 | 4477393 | 4488894 | ENSG00000118972 | NA | NA | NA | NA | NA |
FGF5 | FGF.5 | P12034 | Fibroblast growth factor 5 (FGF-5) | FGF-5 | 4 | 81187753 | 81257834 | ENSG00000013725 | NA | ENSG00000138675 | 4 | 80266639 | 80336680 |
FLT3LG | Flt3L | P49771 | Fms-related tyrosine kinase 3 ligand (FIt3L) | FIt3L | 19 | 49977464 | 49989488 | ENSG00000090554 | NA | NA | NA | NA | NA |
GDNF | GDNF | P39905 | Glial cell line-derived neutrophic factor (hGDNF) | hGDNF | 5 | 37812779 | 37839788 | ENSG00000168621 | NA | NA | NA | NA | NA |
HGF | HGF | P14210 | Hepatocyte growth factor (HGF) | HGF | 7 | 81328322 | 81399754 | ENSG00000019991 | NA | NA | NA | NA | NA |
IFNG | IFN.gamma | P01579 | Interferon gamma (IFN-gamma) | IFN-gamma | 12 | 68548548 | 68553527 | ENSG00000111537 | NA | NA | NA | NA | NA |
IL10 | IL.10 | P22301 | Interleukin-10 (IL-10) | IL-10 | 1 | 206940947 | 206945839 | ENSG00000136634 | NA | NA | NA | NA | NA |
IL10RA | IL.10RA | Q13651 | Interleukin-10 receptor subunit alpha (IL-10RA) | IL-10RA | 11 | 117857063 | 117872196 | ENSG00000110324 | NA | NA | NA | NA | NA |
IL10RB | IL.10RB | Q08334 | Interleukin-10 receptor subunit beta (IL10RB) | IL10RB | 21 | 34638663 | 34669539 | ENSG00000243646 | NA | NA | NA | NA | NA |
IL12B | IL.12B | P29460 | Interleukin-12 subunit beta (IL-12B) | IL-12B | 5 | 158741791 | 158757895 | ENSG00000113302 | NA | NA | NA | NA | NA |
IL13 | IL.13 | P35225 | Interleukin-13 (IL-13) | IL-13 | 5 | 131991955 | 131996802 | ENSG00000169194 | NA | NA | NA | NA | NA |
IL15RA | IL.15RA | Q13261 | Interleukin-15 receptor subunit alpha (IL-15RA) | IL-15RA | 10 | 5990855 | 6020150 | ENSG00000134470 | NA | NA | NA | NA | NA |
IL17A | IL.17A | Q16552 | Interleukin-17A (IL-17A) | IL-17A | 6 | 52051185 | 52055436 | ENSG00000112115 | NA | NA | NA | NA | NA |
IL17C | IL.17C | Q9P0M4 | Interleukin-17C (IL-17C) | IL-17C | 16 | 88704999 | 88706881 | ENSG00000124391 | NA | NA | NA | NA | NA |
IL18 | IL.18 | Q14116 | Interleukin-18 (IL-18) | IL-18 | 11 | 112013974 | 112034840 | ENSG00000150782 | NA | NA | NA | NA | NA |
IL18R1 | IL.18R1 | Q13478 | Interleukin-18 receptor 1 (IL-18R1) | IL-18R1 | 2 | 102927989 | 103015218 | ENSG00000115604 | NA | NA | NA | NA | NA |
IL1A | IL.1.alpha | P01583 | Interleukin-1 alpha (IL-1 alpha) | IL-1 alpha | 2 | 113531492 | 113542167 | ENSG00000115008 | NA | NA | NA | NA | NA |
IL2 | IL.2 | P60568 | Interleukin-2 (IL-2) | IL-2 | 4 | 123372625 | 123377880 | ENSG00000109471 | NA | NA | NA | NA | NA |
IL20 | IL.20 | Q9NYY1 | Interleukin-20 (IL-20) | IL-20 | 1 | 207038699 | 207042568 | ENSG00000162891 | NA | NA | NA | NA | NA |
IL20RA | IL.20RA | Q9UHF4 | Interleukin-20 receptor subunit alpha (IL-20RA) | IL-20RA | 6 | 137321108 | 137366298 | ENSG00000016402 | NA | NA | NA | NA | NA |
IL22RA1 | IL.22.RA1 | Q8N6P7 | Interleukin-22 receptor subunit alpha-1 (IL-22RA1) | IL-22RA1 | 1 | 24446261 | 24469611 | ENSG00000142677 | NA | NA | NA | NA | NA |
IL24 | IL.24 | Q13007 | Interleukin-24 (IL-24) | IL-24 | 1 | 207070788 | 207077484 | ENSG00000162892 | NA | NA | NA | NA | NA |
IL2RB | IL.2RB | P14784 | Interleukin-2 receptor subunit beta (IL-2RB) | IL-2RB | 22 | 37521878 | 37571094 | ENSG00000100385 | NA | NA | NA | NA | NA |
IL33 | IL.33 | O95760 | Interleukin-33 (IL-33) | IL-33 | 9 | 6215805 | 6257983 | ENSG00000137033 | NA | NA | NA | NA | NA |
IL4 | IL.4 | P05112 | Interleukin-4 (IL-4) | IL-4 | 5 | 132009678 | 132018368 | ENSG00000113520 | NA | NA | NA | NA | NA |
IL5 | IL.5 | P05113 | Interleukin-5 (IL-5) | IL-5 | 5 | 131877136 | 131892530 | ENSG00000113525 | NA | NA | NA | NA | NA |
IL6 | IL.6 | P05231 | Interleukin-6 (IL-6) | IL-6 | 7 | 22765503 | 22771621 | ENSG00000136244 | NA | NA | NA | NA | NA |
IL7 | IL.7 | P13232 | Interleukin-7 (IL-7) | IL-7 | 8 | 79587978 | 79717758 | ENSG00000104432 | NA | NA | NA | NA | NA |
IL8 | IL.8 | P10145 | Interleukin-8 (IL-8) | IL-8 | 4 | 74606223 | 74609433 | ENSG00000169429 | NA | NA | NA | NA | NA |
KITLG | SCF | P21583 | Stem cell factor (SCF) | SCF | 12 | 88886570 | 88974628 | ENSG00000049130 | NA | NA | NA | NA | NA |
LIF | LIF | P15018 | Leukemia inhibitory factor (LIF) | LIF | 22 | 30636436 | 30642840 | ENSG00000128342 | NA | NA | NA | NA | NA |
LIFR | LIF.R | P42702 | Leukemia inhibitory factor receptor (LIF-R) | LIF-R | 5 | 38475065 | 38608456 | ENSG00000113594 | NA | NA | NA | NA | NA |
LTA | TNFB | P01374 | TNF-beta (TNFB) | TNFB | 6 | 31539831 | 31542101 | ENSG00000226979 | NA | NA | NA | NA | NA |
MMP1 | MMP.1 | P03956 | Matrix metalloproteinase-1 (MMP-1) | MMP-1 | 11 | 102660651 | 102668891 | ENSG00000196611 | NA | NA | NA | NA | NA |
MMP10 | MMP.10 | P09238 | Matrix metalloproteinase-10 (MMP-10) | MMP-10 | 11 | 102641234 | 102651359 | ENSG00000166670 | NA | NA | NA | NA | NA |
NGF | Beta.NGF | P01138 | Beta-nerve growth factor (Beta-NGF) | Beta-NGF | 1 | 115828539 | 115880857 | ENSG00000134259 | NA | NA | NA | NA | NA |
NRTN | NRTN | Q99748 | Neurturin (NRTN) | NRTN | 19 | 5823813 | 5828335 | ENSG00000171119 | NA | NA | NA | NA | NA |
NTF3 | NT.3 | P20783 | Neurotrophin-3 (NT-3) | NT-3 | 12 | 5541278 | 5630702 | ENSG00000185652 | NA | NA | NA | NA | NA |
OSM | OSM | P13725 | Oncostatin-M (OSM) | OSM | 22 | 30658818 | 30662829 | ENSG00000099985 | NA | NA | NA | NA | NA |
PLAU | uPA | P00749 | Urokinase-type plasminogen activator (uPA) | uPA | 10 | 75668935 | 75677255 | ENSG00000122861 | NA | NA | NA | NA | NA |
S100A12 | EN.RAGE | P80511 | Protein S100-A12 (EN-RAGE) | EN-RAGE | 1 | 153346184 | 153348125 | ENSG00000163221 | NA | NA | NA | NA | NA |
SIRT2 | SIRT2 | Q8IXJ6 | SIR2-like protein 2 (SIRT2) | SIRT2 | 19 | 39369197 | 39390502 | ENSG00000068903 | NA | NA | NA | NA | NA |
SLAMF1 | SLAMF1 | Q13291 | Signaling lymphocytic activation molecule (SLAMF1) | SLAMF1 | 1 | 160577890 | 160617085 | ENSG00000117090 | NA | NA | NA | NA | NA |
STAMBP | STAMPB | O95630 | STAM-binding protein (STAMPB) | STAMPB | 2 | 74056086 | 74100786 | ENSG00000124356 | NA | NA | NA | NA | NA |
SULT1A1 | ST1A1 | P50225 | Sulfotransferase 1A1 (ST1A1) | ST1A1 | 16 | 28616903 | 28634946 | ENSG00000196502 | NA | NA | NA | NA | NA |
TGFA | TGF.alpha | P01135 | Transforming growth factor alpha (TGF-alpha) | TGF-alpha | 2 | 70674412 | 70781325 | ENSG00000163235 | NA | NA | NA | NA | NA |
TGFB1 | LAP.TGF.beta.1 | P01137 | Latency-associated peptide transforming growth factor beta 1 (LAP TGF-beta-1) | LAP TGF-beta-1 | 19 | 41807492 | 41859816 | ENSG00000105329 | NA | NA | NA | NA | NA |
TNF | TNF | P01375 | Tumor necrosis factor (TNF) | TNF | 6 | 31543344 | 31546113 | ENSG00000232810 | NA | NA | NA | NA | NA |
TNFRSF11B | OPG | O00300 | Osteoprotegerin (OPG) | OPG | 8 | 119935796 | 119964439 | ENSG00000164761 | NA | NA | NA | NA | NA |
TNFRSF9 | TNFRSF9 | Q07011 | Tumor necrosis factor receptor superfamily member 9 (TNFRSF9) | TNFRSF9 | 1 | 7979907 | 8000926 | ENSG00000049249 | NA | NA | NA | NA | NA |
TNFSF10 | TRAIL | P50591 | TNF-related apoptosis ligand (TRAIL) | TRAIL | 3 | 172223298 | 172241297 | ENSG00000121858 | NA | NA | NA | NA | NA |
TNFSF11 | TRANCE | O14788 | TNF-related activation cytokine (TRANCE) | TRANCE | 13 | 43136872 | 43182149 | ENSG00000120659 | NA | NA | NA | NA | NA |
TNFSF12 | TWEAK | O43508 | Tumor necrosis factor (Ligand) superfamily member 12 (TWEAK) | TWEAK | 17 | 7452208 | 7464925 | ENSG00000239697 | NA | NA | NA | NA | NA |
TNFSF14 | TNFSF14 | O43557 | Tumor necrosis factor ligand superfamily member 14 (TNFSF14) | TNFSF14 | 19 | 6663148 | 6670599 | ENSG00000125735 | NA | ENSG00000125735 | 19 | 6661253 | 6670588 |
TSLP | TSLP | Q969D9 | Thymic stromal lymphopoietin (TSLP) | TSLP | 5 | 110405760 | 110413722 | ENSG00000145777 | NA | NA | NA | NA | NA |
VEGFA | VEGF.A | P15692 | Vascular endothelial growth factor A (VEGF_A) | VEGF_A | 6 | 43737921 | 43754224 | ENSG00000112715 | NA | NA | NA | NA | NA |
This was followed by the inclusion of Olink_qPCR for all 12 panels, as well as the Olink NGS, Caprion and SWATH-MS panels. SomaScanV4.1 comes directly from SomaLogic.
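A table of this shape lends itself to simple lookups. The sketch below rebuilds three rows of the inflammation panel by hand (values copied from the table above) so that it runs without the package; with pQTLdata installed, the full table would presumably be available as the inf1 object. The names `inf1_demo` and `gene_for` are illustrative only.

```r
# Three rows copied from the table above; a stand-in for the full inf1 object.
inf1_demo <- data.frame(
  gene    = c("CCL23", "CCL25", "IL6"),
  prot    = c("CCL23", "CCL25", "IL.6"),
  uniprot = c("P55773", "O15444", "P05231"),
  chr     = c(17, 19, 7),
  start   = c(34340096, 8117651, 22765503),
  end     = c(34345005, 8127534, 22771621),
  start38 = c(36013056, 8052318, NA),
  end38   = c(36017972, 8062660, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: gene symbol for a UniProt accession (NA if not found).
gene_for <- function(acc, panel) panel$gene[match(acc, panel$uniprot)]
print(gene_for("P05231", inf1_demo))

# Rows for which the start38/end38 columns are filled in.
with38 <- inf1_demo[!is.na(inf1_demo$start38), c("gene", "start38", "end38")]
print(with38)
```

Note that the prot column holds syntactic R names (for example IL.6), whereas gene holds the gene symbol, so lookups should state which identifier they expect.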
2.2 Caprion
As has been the norm, no snapshot was provided upon data release, so the annotation requires substantial effort to reconstruct; the notable cases are highlighted here.
Gene | Gene.orig | Protein | Accession | ensGenes | chr | start | end |
---|---|---|---|---|---|---|---|
AMY1 | AMY1A; AMY1B; AMY1C | AMY1_HUMAN | P04745 | ENSG00000174876, ENSG00000187733, ENSG00000237763 | 1 | 103655760 | 103758726 |
C20orf27 | C20orf27 | CT027_HUMAN | Q9GZN8 | ENSG00000101220 | 20 | 3753508 | 3767781 |
C4B | C4B; C4B_2 | CO4B_HUMAN | P0C0L5 | ENSG00000224389, ENSG00000236625, ENSG00000228267, ENSG00000233312, ENSG00000228454 | 6 | 3262537 | 32035418 |
C7orf26 | C7orf26 | CG026_HUMAN | Q96N11 | ENSG00000146576 | 7 | 6590021 | 6608726 |
CD97 | CD97 | CD97_HUMAN | P48960 | ENSG00000123146 | 19 | 14380501 | 14408725 |
FAM49B | FAM49B | FA49B_HUMAN | Q9NUQ9 | ENSG00000153310 | 8 | 129839593 | 130017504 |
GBA | GBA | GLCM_HUMAN | P04062 | ENSG00000177628 | 1 | 155234452 | 155244699 |
HBA | HBA1; HBA2 | HBA_HUMAN | P69905 | ENSG00000188536, ENSG00000206172 | 16 | 172876 | 177522 |
HIST | HIST1H4A; HIST1H4B; HIST1H4C; HIST1H4D; HIST1H4E; HIST1H4F; HIST1H4H; HIST1H4I; HIST1H4J; HIST1H4K; HIST1H4L; HIST2H4A; HIST2H4B; HIST4H4 | H4_HUMAN | P62805 | ENSG00000277157, ENSG00000275126, ENSG00000274618, ENSG00000197061, ENSG00000158406, ENSG00000197238, ENSG00000276966, ENSG00000197837, ENSG00000276180, ENSG00000278705, ENSG00000273542, ENSG00000278637, ENSG00000270276, ENSG00000270882 | 6, 12, 1 | 14767999 | 149861159 |
SLC9A3R1 | SLC9A3R1 | NHRF1_HUMAN | O14745 | ENSG00000109062 | 17 | 74748628 | 74769353 |
WARS | WARS | SYWC_HUMAN | P23381 | ENSG00000140105 | 14 | 100333790 | 100376805 |
YARS | YARS | SYYC_HUMAN | P54577 | ENSG00000134684 | 1 | 32775237 | 32818031 |
This again is useful for extracting data from GTEx v8.
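Because several entries map one protein to multiple Ensembl genes, a query by gene identifier typically needs one row per identifier. The following is a minimal sketch in base R, using two rows copied from the table above; the object names `caprion_demo` and `long` are illustrative, and the actual extraction from GTEx v8 is not shown.

```r
# Two rows copied from the table above.
caprion_demo <- data.frame(
  Gene     = c("HBA", "WARS"),
  ensGenes = c("ENSG00000188536, ENSG00000206172", "ENSG00000140105"),
  stringsAsFactors = FALSE
)

# Split the comma-separated identifiers and expand to one row per Ensembl gene.
ids  <- strsplit(caprion_demo$ensGenes, ",\\s*")
long <- data.frame(
  Gene    = rep(caprion_demo$Gene, lengths(ids)),
  ensGene = unlist(ids),
  stringsAsFactors = FALSE
)
print(long)
```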
3 Published data
This relates to the analysis of INTERVAL data as reported [1], and includes the SomaScan160410 panel and supplementary tables 4, 6 and 18.
5 Scripts
An analysis involving COVID-19 data is in the Olink/ directory, while the scripts/ directory records data generation, which can potentially be extended.
Specifically, pQTLdata.sh handles the package on the Cambridge Service for Data Driven Discovery (CSD3) system, and docs.sh operates with GitHub.
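Since items under inst/ are installed at the top level of the package, their locations can be resolved with base R's `system.file()`. This is a sketch assuming the directory and file names given in this document; `pkg_item` is a hypothetical helper, and each path resolves only if the installed version ships that item.

```r
# Hypothetical helper: full path to an installed item, or "" if it is absent
# (system.file() also returns "" when the package itself is not installed).
pkg_item <- function(item, pkg = "pQTLdata") {
  system.file(item, package = pkg)
}

for (item in c("scripts", "Olink", "EndNote", "pQTLdata.sh", "docs.sh")) {
  path <- pkg_item(item)
  cat(sprintf("%-12s %s\n", item, if (nzchar(path)) path else "(not found)"))
}
```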
6 URLs
- Cell Carta, https://cellcarta.com/
- EndNote, https://endnote.com/
- ENSEMBL BioMart, https://www.ensembl.org/index.html
- GitHub, https://github.com/
- INTERVAL study, https://www.donorhealth-btru.nihr.ac.uk/studies/interval-study/
- NULISA, https://alamarbio.com/technology/nulisa-platform/
- NCI Proteomic Data Commons, https://pdc.cancer.gov/pdc/browse
- Olink, https://olink.com/ (https://olink.com/resources-support/publications-2/)
- SCALLOP consortium, http://www.scallop-consortium.com/
- Seer, https://seer.bio/
- SomaLogic, https://somalogic.com/
- SWATH-MS, https://www.creative-proteomics.com/ngpro/swath-ms.html
- Thermo Fisher Scientific - LSMS, https://github.com/thermofisherlsms
- UCSC, https://genome.ucsc.edu/
References
The EndNote/ directory includes the references in [1] and [2], formatted in EndNote.