Big Data: Where Dreams Take Flight

2013-06-06 Chengzhong Xu, Zhibin Yu

ZTE Communications, 2013, Issue 2

▶Chengzhong Xu

Chengzhong Xu received his BSc degree and MSc degree in computer science and engineering from Nanjing University in 1986 and 1989. He received his PhD degree in computer engineering from the University of Hong Kong in 1993. His research interests include computer architecture, distributed systems, virtualization, and cloud computing. Dr. Xu is a professor of electrical and computer engineering at Wayne State University, Detroit, USA. He is also the director of the Cloud and Internet Computer Laboratory at Wayne State University. He is a senior member of the IEEE and member of the ACM.

▶Zhibin Yu

Zhibin Yu received his PhD degree in computer science from Huazhong University of Science and Technology (HUST) in 2008. He spent one year as a visiting scholar at the Laboratory of Computer Architecture, Department of Electrical and Computer Engineering, University of Texas at Austin. He is currently an associate professor at the Shenzhen Institutes of Advanced Technology, China. His research interests include micro-architecture simulation, computer architecture, workload characterization and generation, performance evaluation, multicore architecture, and virtualization technologies. In 2005, he won first prize in the HUST Young Lecturers Teaching Contest. In 2003, he won second prize in the HUST Teaching Quality Assessment. He is a member of the IEEE and ACM.

From academia to industry, big data has become a buzzword in information technology. The US Federal Government is paying much attention to the big-data revolution. In 2012, fourteen US government departments allocated funds to 87 big-data projects [1]. Europe has the second largest amount of data [2], and most universities and research institutes have already established big-data research programs. In Asia, especially in China, central and local governments have been setting aside funds for their own big-data programs. The big-data related 973 Projects in China are good examples of this. Industry players have been following in the footsteps of big-data pioneers such as Google, Facebook, Twitter, and Baidu, and more and more companies are rushing into the big-data business. Companies have been analyzing the purchasing behavior of huge numbers of customers and have been devising more attractive plans and policies. Big data is already an important part of the $64 billion database and data analytics market [3]. Indeed, big data will open up commercial opportunities comparable in scale to those created by enterprise software of the late 1980s, the internet of the 1990s, and the social media explosion today.

However, what is big data? It has been defined in many different ways. We prefer to define big data as data sets that are too big for current information technologies to capture, transmit, store, process, or visualize. Although this definition is simple, it encompasses computing complexity theory, computer architecture, operating systems, programming models, database technologies, algorithms, and applications. People from different fields have dramatically different understandings of big data, which is why there is so much excitement and conjecture surrounding it.

In this special issue, we present papers that discuss big-data technology from different perspectives. These are not only high-level surveys but also reports on initial results from big-data projects. Communication infrastructure is one of the most important aspects of big data. Yi Zhu and Zhengkun Mi from Nanjing University of Posts and Telecommunications discuss content-centric networking, which is seen as a promising approach to big-data distribution. They propose a networking architecture for processing big data, and this architecture is fundamentally different from TCP/IP. Shengmei Luo et al. from the Cloud Computing & IT Institute of ZTE Corporation present a survey of big-data analytics. They analyze challenges related to storage, data-mining algorithms, and programming models for big data. They also predict opportunities in the big-data era. Although there are many potential business opportunities in big data, security is of the utmost importance for users and cannot be overlooked. Ruixuan Li et al. from Huazhong University of Science and Technology provide an overview of data security and privacy preservation for cloud storage. They carefully investigate confidentiality, data integrity, and data availability. They also propose a feasible solution to current security problems. Shigang Chen et al. from the University of Florida delve more deeply into data integrity. They propose a novel authenticated data structure called Cloud Merkle B+ tree (CMBT) that supports dynamic operations such as insertion, deletion, and modification. CMBT lowers overhead from O(n) to O(log n).
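To give a feel for where a logarithmic bound of this kind can come from, here is a minimal sketch of a plain binary Merkle hash tree. It is not the Cloud Merkle B+ tree itself, which is described in the paper by Chen et al.; the function names (`build_levels`, `make_proof`, `verify`), the use of SHA-256, and the rule of duplicating the last node on odd-sized levels are all assumptions made for illustration. The point is only that checking one block needs one sibling hash per tree level instead of a pass over every block.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest (the hash function is an illustrative choice)."""
    return hashlib.sha256(data).digest()


def build_levels(blocks):
    """Return every level of the tree: leaf hashes first, root level last."""
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumed padding rule: duplicate the last node on odd-sized levels.
            level = level + [level[-1]]
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def make_proof(levels, index):
    """Collect one sibling hash per level on the path from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1  # the neighbour that shares the same parent
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(block, proof, root):
    """Recompute the root from one block and its sibling hashes."""
    node = h(block)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


# Hypothetical data: 1024 small blocks standing in for stored file chunks.
blocks = [f"block-{i}".encode() for i in range(1024)]
levels = build_levels(blocks)
root = levels[-1][0]
proof = make_proof(levels, 700)

print(len(proof))                          # 10 sibling hashes for 1024 blocks
print(verify(blocks[700], proof, root))    # the untouched block verifies
print(verify(b"tampered", proof, root))    # a modified block does not
```

In this sketch a client that keeps only `root` can check any single block with about log2(n) hashes. Supporting efficient insertion, deletion, and modification on top of that is the part the CMBT paper addresses and is not shown here.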

Moving to big data applications, algorithms oriented towards a single machine are not necessarily efficient in big-data platforms because many machines need to run concurrently for the same task. Weisong Shi et al. from Wayne State University design a mechanism called SPBD that reduces the response time of big-data systems. This mechanism is very feasible in practice. Zhendong Bei et al. report their experiences with big-data applications that use MapReduce/Hadoop. They confirm that manually tuning up to 190 Hadoop configuration parameters is extremely time consuming, if at all possible. They then propose an automatic performance prediction scheme based on random forest to determine the best configuration parameter combinations. Their experimental results show that their scheme can predict the performance of Hadoop systems very accurately.

Challenges and opportunities exist together in the big-data era. We believe most of these challenges will be overcome and opportunities will be realized. Big data is a field where dreams will take flight.