Environmental Data Initiative (EDI) - streamlining data curation to accelerate scientific inquiry - LTER

Global-scale environmental issues such as food security, the spread of disease, and the availability of clean water emphasize the importance of environmental data that can address specific problems while also providing predictions of future conditions. The increasing availability of large volumes of different kinds of data offers new opportunities to address these issues. This project will provide the environmental research community with efficient and reliable means for data management, storage, and sharing. The facilities developed will allow researchers, policy makers, managers, and other stakeholders to bring relevant data to bear on complex environmental questions. Modern approaches that encourage geographically distributed collaboration will be used to make data curation more efficient than is possible within single projects. The project will provide the training and skills needed to overcome technical and social barriers to collaboration, thereby enhancing infrastructure to address ecological questions over broad spatial and temporal scales.

Research in environmental sciences is often conducted by individual investigators over small spatial and temporal scales under funding models that provide limited capacity for data curation or sharing. Data that are archived in a stable, accessible repository and accompanied by appropriate metadata benefit both data producers and consumers through improved discoverability and reliability. This project builds on expertise available in the Long Term Ecological Research (LTER) community to provide these benefits. The project will leverage the collective experience of the LTER community to improve data management across a broad community through communication and collaboration. Training activities will be developed that range from the basics of metadata creation to the adoption of standardized best practices for specific types of data. The development of templates for describing a data lifecycle will accelerate the availability of data for synthesis. The project will promote shared technology to develop more widely usable and more efficient approaches to data curation workflows. Workshop participants will learn to develop workflow technology, reuse existing workflows, and archive and share what they develop. Through these workshops, together with community-level centers of expertise and individual skill exchanges, the project will increase the volume of data available along with data discoverability and reuse.

The Provenance Aware Synthesis Architecture repository will ensure long-term availability of data and open data access through federations such as DataONE. This repository will be expanded in several ways to accommodate a broader community of data providers and users. Enhancements include a scalable user identity management system, improved data documentation procedures to simplify data submission for non-technical users, and expanded data-quality assurance tools to accommodate a broader range of community practices. These advances will accelerate scientific inquiry through data curation and publication as well as through data discovery and integration.
