Abstract
Current DNA compression algorithms rely on finding repetitions within a DNA sequence so that similar subsequences can be encoded by reference to one another. In this paper, we explore similarities between different chromosomes of the Saccharomyces cerevisiae genome. These similarities are characterized by the existence of similar subsequences shared among different chromosomes: the longer the shared subsequences, the higher the cross-similarity. Our study indicates that these cross-sequence similarities are often significant compared with self-sequence similarities. This implies that it would be advantageous to compress two or more sequences together, so that similar subsequences found across multiple sequences can be encoded together.
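
The contrast between cross-sequence and self-sequence similarity can be illustrated with a minimal sketch. It is not the method used in the paper: it assumes exact matching only (the abstract speaks of "similar" subsequences, which may include approximate matches), and the function names `longest_common_substring` and `longest_repeat` and the toy sequences are hypothetical. The quadratic dynamic program shown here is practical only for short inputs, not whole chromosomes.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Longest contiguous run shared by a and b (cross-sequence similarity)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


def longest_repeat(s: str) -> str:
    """Longest substring occurring at least twice in s (self-sequence similarity).

    Occurrences are allowed to overlap.
    """
    best_len, best_end = 0, 0
    prev = [0] * (len(s) + 1)
    for i in range(1, len(s) + 1):
        curr = [0] * (len(s) + 1)
        # j > i skips the trivial match of every position with itself.
        for j in range(i + 1, len(s) + 1):
            if s[i - 1] == s[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return s[best_end - best_len:best_end]


# Toy sequences standing in for two chromosomes (made-up data, not yeast).
chr_a = "ACGTTGCAAC"
chr_b = "TTTGCAAGGA"

cross = longest_common_substring(chr_a, chr_b)
within = longest_repeat(chr_a)
print(len(cross), len(within))
```

In this toy case the longest run shared between the two sequences is longer than the longest repeat inside the first one, which is the kind of situation where encoding the sequences together could pay off.
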
Original language | English |
---|---|
Title of host publication | Computational Models For Life Sciences (CMLS '07) - 2007 International Symposium |
Pages | 167-176 |
Number of pages | 10 |
Volume | 952 |
Publication status | Published - 1 Dec 2007 |
Event | 2007 International Symposium on Computational Models for Life Sciences, CMLS '07 - Gold Coast, QLD, Australia; Duration: 17 Dec 2007 → 19 Dec 2007 |
Conference
Conference | 2007 International Symposium on Computational Models for Life Sciences, CMLS '07 |
---|---|
Country/Territory | Australia |
City | Gold Coast, QLD |
Period | 17/12/07 → 19/12/07 |
Keywords
- DNA compression
- DNA sequence similarity
ASJC Scopus subject areas
- General Physics and Astronomy