At a time when agricultural development, food security and sustainable resource management are becoming increasingly important, access to high-quality agricultural data has become a global concern. However, agricultural data at the subnational level (i.e., for administrative regions below the national level) are often fragmented, difficult to access, or inconsistent in standards. To address this challenge, the HarvestStat Subnational Agricultural Production Data Consortium, a group of nine international scholars including Kyle Frankel Davis, recently published an article in the journal Environmental Research Letters. The paper proposes a new paradigm centred on openness, collaboration and standardisation to solve the problems of data fragmentation, duplicated effort and uneven access that have long plagued food system research.
01 Origin of the project: Why are sub-national agricultural data so important?
Subnational agricultural production statistics underpin a wide range of analyses, assessments and data products related to food security, land use, climate and natural resource management. These statistics are critical for guiding development agendas, food security and sustainable development policies, humanitarian aid response, and public and private investments.
The comprehensiveness, level of detail and accuracy of agricultural statistics directly determine where information blind spots arise, and these are often found in the areas where food security and agricultural development needs are most pressing. As a result, basing interventions on outdated, coarse or inaccurate estimates can lead to ineffective policies and actions, or even undesirable consequences.
02 Current challenges for agricultural data
Despite the undisputed importance of agricultural data at the subnational level, there are many challenges to the collection and sharing of agricultural data globally:
Agricultural statistics across countries face unique challenges in terms of data quality, openness and comprehensiveness. These information gaps are partly attributable to the significant cost of collecting and producing agricultural census and survey statistics in a country, leading to significant differences in data accuracy, reliability and willingness to share across countries.
More worryingly, differences in funding cycles, researcher priorities, institutional silos and incentive structures have led to duplication of data collection efforts within the food systems research community. This not only wastes valuable resources, but also limits the comparability of research findings, as the data itself may vary depending on the source institution and the specific cleaning method.
03 HarvestStat: pioneering a new model of data sharing
HarvestStat is not yet another single database, but an integrated system of open collaborative networks and standardised data workflows. Its key initiatives include:
- HarvestStat Africa: a fully open-source subnational dataset covering 90 crops in 35 African countries, made globally available through GitHub and Harvard Dataverse;
- Open toolchain: integrating open-source tools such as LUCKINet and arealDB to enable spatial and structural standardisation of data from multiple sources;
- Quality control mechanisms: all data follow the FAIR principles (Findable, Accessible, Interoperable, Reusable), with an emphasis on unconditional openness, local expert participation and interdisciplinary collaboration.
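To make "structural standardisation of data from multiple sources" more concrete, the sketch below maps records from two differently formatted sources onto one shared schema. It is only an illustration under stated assumptions: the field names, the alias table, the unit table and the sample values are all hypothetical, and it does not reproduce the actual LUCKINet, arealDB or HarvestStat workflows.

```python
from dataclasses import dataclass

# Hypothetical lookup tables for illustration only; a real workflow would
# rely on maintained crop and administrative-unit vocabularies.
CROP_ALIASES = {"maize": "maize", "corn": "maize", "sorghum": "sorghum"}
UNITS_PER_TONNE = {"t": 1, "kg": 1000}


@dataclass(frozen=True)
class Record:
    """One harmonised subnational production record (assumed schema)."""
    country: str
    admin1: str
    crop: str
    year: int
    production_t: float
    source: str


def harmonise(raw: dict, source: str) -> Record:
    """Normalise names, types and units of one raw record."""
    crop = CROP_ALIASES[raw["crop"].strip().lower()]
    divisor = UNITS_PER_TONNE[raw["unit"]]
    return Record(
        country=raw["country"].strip().upper(),
        admin1=raw["region"].strip().title(),
        crop=crop,
        year=int(raw["year"]),
        production_t=float(raw["production"]) / divisor,
        source=source,
    )


# Made-up sample values, not real statistics.
source_a = {"country": "xx", "region": "region a ", "crop": "Corn",
            "year": "2019", "production": "250000", "unit": "kg"}
source_b = {"country": "XX", "region": "Region A", "crop": "maize",
            "year": 2019, "production": 250.0, "unit": "t"}

rec_a = harmonise(source_a, "survey_a")
rec_b = harmonise(source_b, "survey_b")

# After harmonisation the two records describe the same region, crop and
# year in the same unit, so they can be compared directly.
same_key = (rec_a.country, rec_a.admin1, rec_a.crop, rec_a.year) == \
           (rec_b.country, rec_b.admin1, rec_b.crop, rec_b.year)
```

Keeping the `source` field on every record is one simple way to preserve provenance, which matters when figures for the same region and year differ between institutions.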
04 HarvestStat's core principles
- Unconditional openness: wherever possible, data are shared freely and openly, with transparent documentation and provenance;
- Active inclusion: consortium members actively seek the participation and input of local experts in all countries to ensure accurate, sensitive and effective use of data, as well as the participation of diverse research and policy communities to ensure that the datasets produced meet the needs of potential data users;
- Collaboration: consortium members prioritise the provision of scalable tools, platforms and solutions that are open to the community.
In addition, HarvestStat prioritises standardisation and validation, comprehensiveness, and georeferencing to ensure that data are consistent, have broad coverage and can readily be used in spatial analysis.
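As a rough illustration of what a validation step might involve, the sketch below runs basic consistency checks on a single record. It assumes each record carries harvested area, production and yield; the function name, the checks and the 5% tolerance are hypothetical choices, not HarvestStat's documented procedure.

```python
import math


def check_record(area_ha: float, production_t: float,
                 yield_t_per_ha: float, rel_tol: float = 0.05) -> list[str]:
    """Return a list of consistency problems found in one record."""
    problems = []
    if area_ha < 0 or production_t < 0 or yield_t_per_ha < 0:
        problems.append("negative value")
    if area_ha == 0 and production_t > 0:
        problems.append("production reported with zero area")
    # Yield should roughly equal production divided by harvested area.
    if area_ha > 0 and not math.isclose(
        production_t / area_ha, yield_t_per_ha, rel_tol=rel_tol
    ):
        problems.append("yield inconsistent with production / area")
    return problems


# Made-up values for illustration.
ok = check_record(area_ha=100.0, production_t=250.0, yield_t_per_ha=2.5)
bad = check_record(area_ha=100.0, production_t=250.0, yield_t_per_ha=4.0)
```

Returning a list of problems rather than raising an error lets flagged records be passed on for review, for example by local experts, instead of being dropped silently.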
05 A new model for data commons
HarvestStat's vision is to build a global data commons where agricultural production data becomes a shared resource that is available to all and contributed to by all. This is not just a technical endeavour, but an attempt to reshape the social fabric of the earth system science community.
‘We call on institutions and individuals working in agriculture, environment, policy, and data science around the world to join this initiative.’ The author team ends the article with an open invitation, ‘Only through collective wisdom and sustained collaboration will we be able to crack the blind spots in global agricultural data and lay down solid data for a more just and sustainable future.’
Extended reading: the HarvestStat Africa dataset (https://github.com/HarvestStat/HarvestStat-Africa)
LUCKINet data processing tool (https://www.luckinet.org/)
News content is taken from the following websites and does not represent the position of GoOA Headlines:
https://www.sei.org/publications/harveststat-agricultural-data/