Project Initiation Report: "Research on Computing Models and Architectures for Large-Volume Deep Sequencing Data"
2016-05-30
张佩珩 卜东波 熊劲 谭光明
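Abstract: The development and widespread adoption of next-generation sequencing have made biological sequence data grow far faster than the growth in computer processing power anticipated by Moore's Law. The researchers will analyze in depth the characteristics of various kinds of genomic data, study targeted methods for efficient data compression and transmission, and investigate new data storage system architectures. They will study methods for processing data in the compressed space, considering storage, compression, processing, and application together, and develop search methods suited to ultra-large-scale genomic data. They will also analyze in depth the characteristics of sequencing data and the computing resource requirements of common sequencing data processing tasks, explore new software and hardware models and possible new architectures, and explore the application of new computing service models to the storage, transmission, and processing of sequencing data. The aim is to prepare fully, in terms of computing technology, for the arrival of the individual genome era, while promoting innovation in China's related information technologies and industries.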
Keywords: deep sequencing; big data; computing model; architecture; sequence alignment; sequence assembly; sequence compression
Abstract: With the development of next-generation sequencing, sequence data are growing much faster than the increase in computing power predicted by Moore's Law. In this project we will analyze the characteristics of various kinds of genomic data in depth, investigate data compression and transmission methods, and study new data storage system architectures. We will investigate methods for processing data in the compressed space, considering storage, compression, and processing together, and develop methods for searching large-scale genomic data. We will analyze the characteristics of sequencing data and of sequencing data processing tasks, and explore new computing models and new hardware-software architectures. This work will help us prepare for the arrival of the era of individual genomes, while promoting innovation and development in China's information technology and industry.
Key Words: Deep Sequencing; Big Data; Computing Model; Architecture; Sequence Alignment; Sequence Assembly; Sequence Compression
Full text (real-name registration required): http://www.nstrs.cn/xiangxiBG.aspx?id=50827&flag=1