Implementation and application services of national marine scientific data sharing service system

2023-04-15 WANG Yi, FU Yu, TONG Xin, XU Mogeng

Marine Science Bulletin, 2023, Issue 2

WANG Yi, FU Yu, TONG Xin, XU Mogeng*

1. National Marine Data and Information Service, Tianjin 300171, China;

2. National Marine Scientific Data Center, Tianjin 300171, China

Abstract: Marine scientific data sharing is a key factor in advancing marine science and enhancing the value of data. This paper reviews the development history of the National Marine Scientific Data Center (NMSDC) and summarizes NMSDC's practices and achievements in marine scientific data sharing in terms of data collection and integration, data preservation and management, key technology research and product development, system platform implementation, and the promotion and application of data sharing services. The paper concludes that in the era of big data, data-intensive scientific research has emerged as a new research paradigm, and that the intersection and fusion, mining and analysis, and integrated application of big data in marine science are expected to bring new opportunities and challenges for marine scientific data sharing.

Keywords: marine scientific data, data organization and management, open data sharing, big data applications

1 Introduction

Marine scientific data are a new type of strategic resource for accelerating marine scientific discovery and supporting the construction of a strong maritime nation, and constitute an important basis for understanding, managing and planning the ocean. In China, many departments are involved in marine affairs, and decentralized storage of marine scientific data and "information islands" are widespread. These issues create a major bottleneck that restricts scientific and technological progress and even hampers the construction of a maritime power. Scientific data yield economic and social benefits, and release the dividends of data resources, only when they are shared and circulated. In view of environmental problems such as global climate change and marine ecological damage, and under new circumstances such as overall land-sea integration and marine economic development, implementing a marine scientific data sharing program is urgent. Promoting the convergence and integration of data resources at the national level and tapping the value hidden in the data is a core issue throughout the development of marine science. The NMSDC implementation team has kept pace with the times, focused closely on national strategic demands, and built on the work accumulated in the earlier project to construct the "Digital Ocean" information infrastructure framework for the offshore areas of China. In 2017, it began constructing the National Marine Scientific Data Sharing Service Platform and formally incorporated it into the National Science & Technology Infrastructure, so as to provide extensive data services and information support for marine scientific research, major sea-related engineering construction, services for sea-related enterprises, public science popularization, etc. This has greatly promoted the establishment of a marine scientific data sharing ecosystem and realized the leap-forward transformation of marine scientific data sharing in China "from nonexistence to existence" and then "from existence to excellence".

2 Development history of National Marine Scientific Data Center

The development of NMSDC dates back to the construction of the "Digital Ocean" Project in China. In 2003, the State Council approved the implementation of "China's Comprehensive Offshore Investigation and Assessment Program" (abbreviated as the "908" Program), which incorporated the "Construction of 'Digital Ocean' Information Framework for Offshore Areas of China" into the Program. In 2006, the Implementation Plan for the Construction of Offshore Digital Ocean Information Framework in China was approved by the State Oceanic Administration, marking the official launch of China's "Digital Ocean" Project[1]. With work launched concurrently on all fronts, including the construction of the standard and specification system, key technology research and development, marine data warehouse construction and digital ocean node construction, marine informatization in China got under way[2]. The resulting "Digital Ocean" prototype system and integrated management information system effectively integrated the data resources of the "908" Program survey, marine observation and monitoring, and integrated management operations. They provided marine data retrieval, marine environment statistical analysis, marine integrated management information inquiry and marine thematic application services to marine researchers and to national, coastal provincial and municipal administrators via a dedicated digital ocean network. This enabled one-stop services such as data query and retrieval, downloading and online use, visual expression and interactive node publishing[3]. It promoted the innovative application of information science and technology in marine research and engineering practice, and for a period of time served as a major window for domestic marine data and information sharing services.

In 2003, the Ministry of Science and Technology launched a special project for the construction of the National Science and Technology Infrastructure Platform[4]. The National Marine Data and Information Service, together with the Qingdao Science and Technology Bureau and the three Marine Information Centers of the former State Oceanic Administration, jointly constructed the National Marine Scientific Data Sharing Center. As one of the initial pilot projects for scientific data sharing, the center began exploring key technologies such as a distributed data organization and management system, a shared service model with data security as a prerequisite, and visual representation. In 2016, the Marine Scientific Data Sharing Center passed the performance assessment of the Ministry of Science and Technology and the Ministry of Finance, and in 2017 it was formally included in the National Science & Technology Infrastructure, marking the start of the comprehensive construction phase of the National Marine Scientific Data Sharing Service Platform. The Platform integrated the data management system, expanded and optimized the talent team, redeveloped the sharing service portal system, and developed the co-construction model of "main center + sub-centers + data nodes", relying on nine institutions, including the Institute of Oceanology of the Chinese Academy of Sciences and Dalian Ocean University, to jointly provide marine scientific data sharing services. In June 2019, the Ministry of Science and Technology and the Ministry of Finance released the optimized and adjusted List of National Science & Technology Infrastructure, and the National Marine Data Center was accredited as one of the first 20 national scientific data centers. In August of the same year, the Implementation Plan for the Construction and Operation of the National Marine Scientific Data Center (2020-2025) passed expert review, opening a new chapter in marine scientific data sharing.

NMSDC has consistently adhered to the concept of "Resource Integration, Open Fusion, and In-depth Services", and has carried out construction and operation along the main line of "Establishing Mechanism, Integrating Resources, Addressing Technology, Building Platform, and Promoting Applications". The aim is to create a sustainable data sharing ecosystem that integrates industry, academia, research and application in the field of marine science, and thereby to deliver stable, authoritative and sustainable data sharing services that support scientific and technological innovation and economic development in the marine field.

3 Effectiveness and achievements of the marine scientific data sharing service system

In the more than four years since its establishment, NMSDC has developed rapidly. Under the guidance of the Ministry of Science and Technology and the Ministry of Natural Resources, and with the support of the sponsoring and co-construction institutions, it has achieved excellent results in mechanism and system construction, data collection and preservation, data archiving for science and technology programs, resource mining and key technology R&D, data sharing services, etc., and has gradually developed a new pattern of networked ocean data services for the whole of society. The implementation achievements were included in the "Top 10 Ocean Science and Technology Progress in 2019" and were awarded the Second Prize of the "Ocean Engineering Science and Technology Award" in 2022.

3.1 Data collection and integration

Over the years, building on the data resources acquired by the Ministry of Natural Resources through marine investigation, observation and monitoring, oceanographic scientific investigation, polar expeditions and international cooperation, NMSDC has continued to extend its multi-level data aggregation channels. It has made full use of the disciplinary and regional advantages of the co-construction institutions, integrating characteristic data on satellite oceanography, marine fishery science, estuarine and coastal science, etc. from related marine research institutes, colleges and universities. It has also collected, on a large scale, data resources transferred from national science and technology programs, tracked and collected international public data, and incorporated enterprise data such as CNOOC Offshore Oil and Gas Platform Observation and Ningbo Marine Fishing Vessel Observation into the national data resource system. The "Main Center + Sub-Centers + Data Nodes" co-construction model draws on the strengths of each entity and provides a sustainable way to progressively converge and integrate marine scientific data resources from sea-related departments, research institutes, colleges, universities and enterprises, with a view to establishing a center of marine scientific data resources in China.

Currently, NMSDC has integrated marine environmental data from 10 disciplines, such as marine hydrology, marine meteorology and marine biology; integrated marine management information on sea areas and islands, the marine economy and marine ecology; basic geographic data such as seabed topography and remote sensing of islands and coastal zones; and data resources from 9 national science and technology basic work projects and more than 80 key national research and development projects under the "13th Five-Year Plan". The total data volume is 2.1 PB, with an average annual increment of about 200 TB.

3.2 Data storage and management

The marine scientific data collected and integrated span a wide range of disciplines and offer comprehensive spatial and temporal coverage. Update frequencies range from seconds and minutes to hours, and the vertical coverage runs from high altitude through the sea surface and the water body to the seabed. Geographically, the data primarily cover the sea areas under the jurisdiction of China and are gradually expanding to the deep ocean, the north and south poles, and strategic maritime channels. In terms of data management, NMSDC has established a marine data classification and management model that is organized at the domain level into marine environment, marine comprehensive management, and marine basic geography and remote sensing, and at the data maturity level into raw information, basic data and application products. The model supports extraction and combination at different granularities by discipline, element, thematic feature, etc., and offers high reusability and strong scalability.
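The two-axis classification described above (domain level and data maturity level) can be illustrated with a small sketch. The class names, field names and sample catalog entries below are hypothetical and are not taken from the NMSDC system; the sketch only shows how entries tagged along both axes might be extracted and combined by discipline or element.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List, Optional, Tuple


class Domain(Enum):
    """Domain-level categories named in the text."""
    ENVIRONMENT = "marine environment"
    MANAGEMENT = "marine comprehensive management"
    GEOGRAPHY_RS = "marine basic geography and remote sensing"


class Maturity(Enum):
    """Data maturity levels named in the text."""
    RAW = "raw information"
    BASIC = "basic data"
    PRODUCT = "application products"


@dataclass(frozen=True)
class DatasetEntry:
    # All field names are illustrative assumptions.
    name: str
    domain: Domain
    maturity: Maturity
    discipline: str
    elements: Tuple[str, ...]


def select(
    catalog: Iterable[DatasetEntry],
    *,
    domain: Optional[Domain] = None,
    maturity: Optional[Maturity] = None,
    discipline: Optional[str] = None,
    element: Optional[str] = None,
) -> List[DatasetEntry]:
    """Return entries matching every criterion that is not None."""
    result = []
    for entry in catalog:
        if domain is not None and entry.domain is not domain:
            continue
        if maturity is not None and entry.maturity is not maturity:
            continue
        if discipline is not None and entry.discipline != discipline:
            continue
        if element is not None and element not in entry.elements:
            continue
        result.append(entry)
    return result


# Hypothetical sample catalog, for illustration only.
CATALOG = [
    DatasetEntry("ctd_profiles_raw", Domain.ENVIRONMENT, Maturity.RAW,
                 "marine hydrology", ("temperature", "salinity")),
    DatasetEntry("sst_gridded_product", Domain.ENVIRONMENT, Maturity.PRODUCT,
                 "marine hydrology", ("temperature",)),
    DatasetEntry("seabed_topography_grid", Domain.GEOGRAPHY_RS, Maturity.BASIC,
                 "seabed topography", ("depth",)),
]

temperature_sets = select(CATALOG, element="temperature")
products = select(CATALOG, domain=Domain.ENVIRONMENT, maturity=Maturity.PRODUCT)
```

Because every filter is optional, the same function serves coarse queries (a whole domain) and fine ones (a single element at one maturity level), which is one simple way to obtain the "different granularities" the model is meant to support.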

On this basis, a marine scientific data resource pool has been established. It integrates and loads 240 databases in Chinese, 160 databases in English, over 200 million data lists, and about 2.6 million entries of Chinese and international marine literature data, such as journal papers, project reports, patents, treaties, standards, policies and regulations, science vocabulary lists, and expert and institution records. A series of tools and software has been developed for quality control processing, loading and updating, and visual expression, enabling dynamic updating and controlled release of all entity data and metadata. Furthermore, NMSDC has independently developed CSTR, a marine science and technology resource coding and identification system compatible with DOI (the unique identifier of digital objects), which provides a global, transparent and unique permanent identifier for data[5] and is valuable for tracking, citation, integration and association in future scientific data publication.

3.3 Key technology research and product development

In terms of key technology R&D, research has been carried out on the basis of the international ERDDAP standard and technical framework, integrating protocols and data format standards such as OPeNDAP, KML, NetCDF, CSV, JSON and TXT. On this foundation, a set of data management and sharing tools and software has been independently developed, featuring unified data rules, flexible extraction and aggregation, controlled storage formats, support for a variety of rapid online visualizations, and application service interfaces, thereby enriching and extending the traditional single model in which "data sharing is all about downloading". Blockchain technology has been applied to the data archiving of science and technology projects[6], and a blockchain storage system has been constructed to cover the whole process of data submission and collection, providing new ideas and methods for addressing data validation, version management, credible sharing and traceability verification in data archiving practice.
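The traceability idea behind blockchain-based data archiving can be sketched with a plain hash chain. This is a minimal illustration under stated assumptions, not the system described in [6]: it is a single-process chain with no consensus or distribution, and the record fields (`dataset`, `version`, `sha256`) and function names are hypothetical. Each archive record stores the digest of the submitted file and the hash of the previous record, so altering any earlier submission invalidates every later hash.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def file_digest(data: bytes) -> str:
    """SHA-256 digest of the submitted file content."""
    return hashlib.sha256(data).hexdigest()


def record_hash(payload: dict, prev_hash: str) -> str:
    """Hash of a record: its payload plus the previous record's hash."""
    body = json.dumps({"payload": payload, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(chain: list, payload: dict) -> dict:
    """Append a new submission record linked to the current chain tip."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    record = {
        "payload": payload,
        "prev": prev_hash,
        "hash": record_hash(payload, prev_hash),
    }
    chain.append(record)
    return record


def verify_chain(chain: list) -> bool:
    """Recompute every hash; any edited payload or broken link fails."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != record_hash(record["payload"], prev_hash):
            return False
        prev_hash = record["hash"]
    return True


# Two versions of a hypothetical dataset submitted for archiving.
chain = []
append_record(chain, {"dataset": "demo-ctd", "version": 1,
                      "sha256": file_digest(b"version 1 bytes")})
append_record(chain, {"dataset": "demo-ctd", "version": 2,
                      "sha256": file_digest(b"version 2 bytes")})
intact = verify_chain(chain)

# Tampering with an archived record is detected on verification.
chain[0]["payload"]["version"] = 99
tampered_ok = verify_chain(chain)
```

In this sketch, version management falls out of the ordered records, data validation from comparing a file's digest with the stored `sha256`, and traceability from re-running `verify_chain`.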

In terms of data product development, seven major product categories have been developed independently: environmental data, data for analysis and forecast, basic geography and remote sensing, comprehensive management information, knowledge application, atlases and reports, and thematic systems. These meet the needs of marine environmental protection, marine resource development and utilization, sea area and island supervision, etc., and achieve all-field, all-discipline and all-factor coverage of ocean-related data products. In terms of marine environmental protection, breakthroughs have been made in multi-source remote sensing data fusion and reconstruction technology, and long-time-series, multi-source, multi-element marine environmental remote sensing products covering nearly 40 years have been developed for the marginal waters of the Arctic Ocean, providing data support for research on the ecological environment and climate change in the Arctic Ocean[7]. In terms of the development and utilization of marine resources, frequency detection and adaptive threshold intelligent identification algorithms have been developed based on Sentinel-1 radar remote sensing data, yielding a data product that gives the accurate spatial location and distribution of global offshore wind power with a spatial resolution of 10 meters and an accuracy of 99%. In addition, big data technology has been used to mine and analyze global AIS data and to integrate and access 4,005 ports, more than 41,000 berths and 25,000 ship files worldwide, forming marine shipping big data analysis and prediction products[8]. In terms of sea area and island control, more than 20 small-target radars in offshore areas have been used, fused with AIS data, to produce offshore target trajectory products covering 30 nautical miles.
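The text does not specify how the adaptive threshold identification on Sentinel-1 data works, so the following is only a generic sketch of the underlying idea, not the authors' algorithm. It assumes a simple local-statistics rule: a pixel is flagged as a bright target when its backscatter exceeds the mean of its neighbours by `k` standard deviations. The function name, the window size and the value of `k` are all illustrative assumptions.

```python
from statistics import mean, pstdev


def adaptive_threshold_detect(image, window=1, k=3.0):
    """Flag pixels brighter than local background mean + k * std.

    `image` is a list of equal-length rows of floats. The background of a
    pixel is every other pixel inside its (2*window+1)-sided square
    neighbourhood, clipped at the image border.
    """
    rows, cols = len(image), len(image[0])
    hits = []
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                image[rr][cc]
                for rr in range(max(0, r - window), min(rows, r + window + 1))
                for cc in range(max(0, c - window), min(cols, c + window + 1))
                if (rr, cc) != (r, c)
            ]
            threshold = mean(neighbours) + k * pstdev(neighbours)
            if image[r][c] > threshold:
                hits.append((r, c))
    return hits


# Synthetic 5x5 scene: uniform sea background with one bright target.
scene = [[1.0] * 5 for _ in range(5)]
scene[2][2] = 50.0
targets = adaptive_threshold_detect(scene)
```

Because the threshold is recomputed from each pixel's surroundings, the same rule adapts to calm and rough sea states without a single global cut-off, which is the general motivation for adaptive thresholds in radar target detection.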

3.4 Open data sharing

NMSDC has developed the Marine Scientific Data Sharing Service Platform, which integrates data services, information services and knowledge services for multi-terminal applications at home and abroad, thereby creating the first domestic marine scientific data sharing service system featuring unified standards, convenient services, openness and security. For Internet users, a distributed marine scientific data sharing service portal (https://mds.nmdis.org.cn) has been established, with over 400 datasets published, making various types of data "searchable, visible, available and downloadable". For international users, the English version of the data sharing service system (http://odinwestpac.org.cn) has been established. It integrates and loads a variety of marine data products from countries around the Western Pacific Ocean and from countries and regions along the Maritime Silk Road, providing important support for promoting China's marine data sharing services and technical standards abroad and for building a maritime community with a shared future. A dedicated-network version of the Marine Data Sharing Service System has also been constructed and put into operation. It relies on over 400 unit nodes connected by the dedicated marine communication network to provide real-time distribution and online use of marine investigation, observation and monitoring data in a "Data Mall + Virtual Terminals" mode.

As of June 2023, over 19,000 real-name users from more than 800 organizations had registered, and the System had cumulatively provided data services more than 35 million times, with an average annual growth of 20% in data service volume. Furthermore, through sharing modes such as offline and peer-to-peer transmission, it has provided data services to the military and other users more than 100 times a year on average, with a data volume of about 55 TB.

3.5 Effectiveness of shared services and promotion of applications

After years of construction and operation, the implementation achievements of NMSDC have been widely applied in marine development decision-making and planning, marine-related scientific research and education, major marine-related engineering construction, marine-related enterprise services, public science popularization, etc., and have achieved excellent social and economic benefits.

In terms of supporting marine science and technology innovation, data services have been provided for over 50 projects, including projects of the National Natural Science Foundation of China and the National Science and Technology Key Projects. In terms of marine economic development, marine environment monitoring and forecasting data have been prepared to support the construction of an intelligent marine ranching service platform covering the whole industry chain, which has provided three-dimensional grid monitoring and early warning services for more than 10,000 farmers in Qinzhou, Beihai, etc. Based on marine satellite remote sensing data, fishing forecast products for tuna in the Southern Indian Ocean have been developed, with a quasi-real-time fishing forecast accuracy of 76% or more. In terms of supporting government and enterprise decision-making, over 50 types of data services have been provided through data interfaces for the three-dimensional "one map" of the Ministry of Natural Resources and the basic information platform for territorial space. In terms of supporting local Smart Ocean projects, a total of 93 datasets have been provided to the Zhejiang Provincial Intelligent Ocean Data Center via online data interfaces to explore value-added services for the commercial operation of data. In terms of marine science popularization, data products have been provided for six exhibition areas in three major sectors of the National Maritime Museum of China, namely "Marine Humanities, Nature and Ecology", to support the construction and operation of the exhibition halls, and undersea topography products have been provided for Exploration in Encyclopedia, published by Xinlei Publishing House, to visualize seabed topography. For three consecutive years, support has been extended to over 100 events such as the "Sharing Cup" Student Innovation Project Competition organized by the Ministry of Science and Technology.

To raise the influence and visibility of NMSDC and deepen understanding of marine data sharing in various fields, team members have visited marine colleges, universities and research institutes more than twenty times to demonstrate the achievements of project construction and operation and to provide marine scientific data sharing services for marine researchers and the public. The team has actively participated in marine science and technology exhibitions, distributing project brochures to guests and answering questions on data content, service methods and project cooperation, with more than a thousand users registering on site. NMSDC has participated in over 30 international and domestic conferences on the open sharing of marine data, presenting research results and applications of key technologies in marine informatization and data application services in China and enhancing its international influence. Meanwhile, publicity and promotion have been carried out via various online and offline media platforms to guide the public to care about the ocean, promote a comprehensive understanding and awareness of the ocean, and foster a marine cultural atmosphere.

4 Summary and outlook

Marine science is a comprehensive interdisciplinary field whose progress and development cannot be achieved without data sharing, which is becoming increasingly important in the era of big data and artificial intelligence. In the marine sector, the practitioners centered on NMSDC have explored and established a marine data resource aggregation and management system in line with China's marine management mechanism, made breakthroughs in key technologies such as data fusion and reconstruction, visual analysis and distributed storage, and provided valuable experience in sharing mechanisms, data resource construction, system development and data services for marine scientific data sharing in China.

Given the holistic nature of the ocean and its multi-layered coupling with various interacting natural processes, and with socio-economic development and technological progress, the cross-fertilization and mutual influence of disciplines have become increasingly intensive, making marine scientific research a data-intensive activity. In the future, besides data sharing, it will be necessary to leverage ocean big data to discover, extract and mine the rules, knowledge and new scientific questions hidden in the data, and to produce more refined data products through processing of existing data so as to enhance the value of data resources. However, the diversity of data resources, the complexity of application scenarios, and the integration of various types of natural and social data are expected to pose new challenges to key sharing technologies such as the collection and integration, preservation and management, and fusion processing of existing data.