A pairwise-gaussian-merging approach: Towards genome segmentation for copy number analysis

Chih Hao Chen, Hsing Chung Lee, Qingdong Ling, Hsiao Jung Chen, Sun Chong Wang, Li Ching Wu, H. C. Lee

Research output: Contribution to journal › Article › peer-review


Segmentation, filtering out of measurement errors and identification of breakpoints are integral parts of any analysis of microarray data for the detection of copy number variation (CNV). Existing algorithms designed for these tasks have had some successes in the past, but they tend to be O(N^2) in computation time, memory requirement, or both, and the rapid advance of microarray resolution has rendered such algorithms practically useless. Here we propose an algorithm, SAD, that is much faster and much less memory-hungry (O(N) in both computation time and memory requirement) and offers higher accuracy. The two key ingredients of SAD are the fundamental assumption in statistics that measurement errors are normally distributed and the mathematical relation that the product of two Gaussians is another Gaussian (function). We have produced a computer program for analyzing CNV based on SAD. In addition to being fast and small, it offers two important features: quantitative statistics for predictions and, with only two user-decided parameters, ease of use. Its speed shows little dependence on genomic profile. Running on an average modern computer, it completes CNV analyses for a 262 thousand-probe array in about 1 second and a 1.8 million-probe array in 9 seconds.
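
The Gaussian-product relation mentioned in the abstract can be checked numerically. The sketch below is not the SAD algorithm itself, whose merging procedure is not described here; it only illustrates the standard identity that the product of two Gaussian densities is a scaled Gaussian. The function names `gaussian_pdf` and `gaussian_product` are hypothetical, chosen for this illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def gaussian_product(mean1, var1, mean2, var2):
    """Return (scale, mean, var) such that, for every x,
    N(x; mean1, var1) * N(x; mean2, var2) == scale * N(x; mean, var).

    Illustrative helper only; not taken from the SAD program.
    """
    # The combined variance is always smaller than either input variance.
    var = var1 * var2 / (var1 + var2)
    # The combined mean is the average of the means, weighted by inverse variance.
    mean = (mean1 * var2 + mean2 * var1) / (var1 + var2)
    # The scale factor shrinks as the two means move apart.
    scale = gaussian_pdf(mean1, mean2, var1 + var2)
    return scale, mean, var


if __name__ == "__main__":
    scale, mean, var = gaussian_product(0.0, 1.0, 2.0, 1.0)
    x = 0.3
    lhs = gaussian_pdf(x, 0.0, 1.0) * gaussian_pdf(x, 2.0, 1.0)
    rhs = scale * gaussian_pdf(x, mean, var)
    print(mean, var, math.isclose(lhs, rhs))
```

Because the result is described by just three numbers, two Gaussian terms can be combined in constant time and memory. That is consistent with, though not a demonstration of, the O(N) cost stated in the abstract.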

Original language: English
Pages (from-to): 58-66
Number of pages: 9
Journal: World Academy of Science, Engineering and Technology
State: Published - Mar 2011


  • Cancer
  • Chromosomal aberration
  • Copy number variation
  • Pathogenesis
  • Segmentation analysis

