With careful planning and implementation, we were able to furnish a timely analysis of our large project. Our primary interest here is not a detailed account of the hardware and timing for the project but the approach we implemented. We have focused on single-point analysis; the problem can become more complex when multipoint analysis is involved.
Our approach builds on our considerable experience with
computer systems over the years. Given that the SAS system is widely available, we
expect our work to be welcome. We also noted some caveats associated with SAS. While the
system was designed to
The software we developed is largely applicable to those with moderate resources, on both Linux and Windows systems. We have also started, and will continue, our experiments on other software systems, including Stata (http://www.stata.com), R (http://www.r-project.org), standalone programs such as SNPGWA (http://www.phs.wfubmc.edu/web/public_bios/sec_gene/downloads.cfm), and commercial software such as BC/SNPmax (http://www.bcplatforms.com). In general, grid computing using clusters is now the state-of-the-art alternative to supercomputers; it is well documented and will be part of our next experiment. SAS, Stata, and R all have facilities to support multiple processors. They are also generic and not specific to genetic data. An additional feature common to these systems is support for open database connectivity, which allows for synchronous access to databases. A comparison of these systems, however, would be more appropriate in a separate contribution.