Unified data management for distributed experiments: A model for collaborative grassroots scientific networks

Lind, Eric M.
The rapidly growing number of grassroots ecological research networks demonstrates that ecologists have embraced distributed data collection and experimentation as a new tool for addressing global questions. A clear advantage of these networks is the ability to gather data at larger spatial and temporal scales, and at lower cost, than a single research team could typically achieve. However, this structure creates a challenge: the distributed datasets must be merged into a coherent whole. The Nutrient Network, a coordinated distributed experiment entering its tenth year of data collection, has records from over 90 sites worldwide to date. In this paper I present lessons learned about data management from this project, focusing on the standardization, storage, updating, and distribution of data within the network. I provide a relational database schema and associated workflow that could be generalized to many distributed ecological experiments or networked data observatories, especially those that need taxonomic reconciliation of species occurrences. The success of distributed data collection efforts, especially long-term networks, will be proportional to the ability to coordinate and effectively combine project datasets.
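To make the idea of a relational schema with taxonomic reconciliation concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the schema presented in the paper: every table name, column name, site code, and data value below is a hypothetical chosen for illustration. The sketch assumes each site reports species under whatever name it uses locally, and that a synonym table maps reported names to a single accepted name so that records from different sites can be combined.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the schema published in the paper.
SCHEMA = """
CREATE TABLE site (
    site_id   INTEGER PRIMARY KEY,
    site_code TEXT NOT NULL UNIQUE
);
CREATE TABLE taxon (
    taxon_id      INTEGER PRIMARY KEY,
    accepted_name TEXT NOT NULL UNIQUE
);
-- Maps each name as reported by a site to one accepted taxon.
CREATE TABLE taxon_synonym (
    reported_name TEXT PRIMARY KEY,
    taxon_id      INTEGER NOT NULL REFERENCES taxon(taxon_id)
);
-- Raw records keep the name exactly as the site reported it.
CREATE TABLE occurrence (
    occurrence_id INTEGER PRIMARY KEY,
    site_id       INTEGER NOT NULL REFERENCES site(site_id),
    year          INTEGER NOT NULL,
    reported_name TEXT NOT NULL,
    cover         REAL
);
"""


def build_demo_db():
    """Create an in-memory database filled with made-up demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO site VALUES (?, ?)",
        [(1, "site_a"), (2, "site_b")],
    )
    conn.execute("INSERT INTO taxon VALUES (1, 'Schizachyrium scoparium')")
    conn.executemany(
        "INSERT INTO taxon_synonym VALUES (?, ?)",
        [("Schizachyrium scoparium", 1), ("Andropogon scoparius", 1)],
    )
    conn.executemany(
        "INSERT INTO occurrence (site_id, year, reported_name, cover) "
        "VALUES (?, ?, ?, ?)",
        [
            (1, 2015, "Andropogon scoparius", 30.0),
            (2, 2015, "Schizachyrium scoparium", 12.5),
            (2, 2015, "Unknown forb", 1.0),
        ],
    )
    conn.commit()
    return conn


def reconciled_occurrences(conn):
    """Return occurrences with accepted names; unmatched names yield None."""
    return conn.execute(
        """
        SELECT s.site_code, o.year, o.reported_name, t.accepted_name, o.cover
        FROM occurrence AS o
        JOIN site AS s ON s.site_id = o.site_id
        LEFT JOIN taxon_synonym AS ts ON ts.reported_name = o.reported_name
        LEFT JOIN taxon AS t ON t.taxon_id = ts.taxon_id
        ORDER BY o.occurrence_id
        """
    ).fetchall()


if __name__ == "__main__":
    for row in reconciled_occurrences(build_demo_db()):
        print(row)
```

Keeping the reported name in the raw table and resolving it through a join is one possible design choice: a taxonomic revision then requires updating only the synonym table, and names that fail to match surface as missing accepted names rather than being silently dropped.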