Digital Agriculture: A System Theoretic Viewpoint

By Naira Hovakimyan, Co-Founder and Chief Scientist of Intelinair

The digital transformation of the U.S. economy is beginning to drive a digital agricultural revolution. Its potential consequences for arable agricultural systems and economies are comparable with those of past revolutions, such as mechanization and biotechnology. The main drivers are: i) data that can be collected at arbitrarily high resolution using equipment sensors, satellites, aircraft, drones and multiple other data sources; ii) sensors that can measure local weather events, moisture levels and other important parameters with high accuracy; iii) affordable computation that can run algorithms overnight on terabytes of data; and iv) high-speed wireless networking in rural areas.

However, there are challenges with the processing and interpretation of agricultural data, as observed by Joseph Byrum in a previous article [1]. The application of machine learning techniques to agricultural data is not straightforward. As stated in [1], the problem is that in any farm environment no two cases are identical, which makes the development, testing, validation and successful rollout of such technologies far more challenging than in other industries. The data collected with high-resolution sensors changes rapidly through the season; it is non-stationary, unstructured, heterogeneous and highly sensitive to zone, soil, weather and pests, among many other uncontrollable factors. Also, compared to other industries such as the consumer technology sector, access to useful data in agriculture is usually restricted by privacy concerns and corporate confidentiality, and in some cases the data has simply not been collected. There is a critical need for the development of a new scientific paradigm that would take a different approach to the analysis of emergent issues in farming, and for useful tools that close the loop around the farmer, the agronomist and the data scientist.

In this article, we attempt to take a systems theory approach to understanding the various phenomena involved in predicting crop performance from the farmer’s perspective. The large-scale farmer is anxious to get more production from every acre. Years of experience and first-hand empirical knowledge of a specific environment, together with data from weather forecasts, financial markets, and seed, fertilizer and chemical producers, guide the farmer on when to plant, how to monitor growth, when to apply fertilizers, how to manage weeds, diseases and pests, and when to harvest. In this process, every crop and every field has distinct growth dynamics, which can be affected by various factors, including weather, soil and pests. Digital agriculture companies have focused on developing sophisticated predictive models of field crop growth on the assumption that open-field crop growth and health can be managed in a controlled manner. However, results have not met expectations, probably because of the impact of random events such as weather and pest outbreaks.

In the case of controlled environments (greenhouse horticulture, vertical farming etc.), which are grounded in the idea of controlling such variables, it becomes easier to apply different algorithms and to validate their performance in a straightforward way. Therefore, although these environments are currently a small part of the overall agricultural economy, they form an important testbed for modelling and predicting plant growth.

From the perspective of a systems theorist, there are two fundamental paradigms that seem to compete:


  1. A data-driven approach that would analyze the available data and predict the annual yield from data collected on a daily basis by closely monitoring the output of various sensors; this paradigm is inspired by machine learning solutions, generally born within computer science, that disregard the underlying processes and the interaction of their dynamics.
  2. A system theoretic approach that would account for scientifically and empirically derived crop growth models, including knowledge learned from previous years, and would thus allow some of the variables to be dropped from the analysis in the current year, lightening the load on the analytics engine and enabling a much faster response with improved accuracy.


In the case of the first paradigm, we deal with data-driven machine learning methods that, in addition to requiring a vast amount of annotated data, need to be robust to noise and to the various artifacts and nuisance factors involved in this process. The challenges with this paradigm include data structure and sources, computational requirements for data processing (implying high costs), good architectures for making use of training data, the need for accurate metadata including manual annotations by agronomists, and optimization of various threshold parameters in the convergence process.

In the case of the second paradigm, one needs good and reliable models of the crop growth dynamics that can help to understand how the crop responds to various inputs. In this case, the methods and tools from systems theory can help to choose when and how to react, and to react only to particular events. General models of photosynthesis and crop growth are available from many years of research, but they have not been integrated with processed sensor and image data and predictive methods for individual farmers. System theoretic principles can help to discard some of the sensor data and focus only on important inputs, thus possibly simplifying the application of machine learning in challenging scenarios with noise and other artifacts. By minimizing the need for data collection, the analytics engine can be used more efficiently, reserving computation for delivering actionable insights to the farmer in a timely manner. Being selective with the data to be analyzed can help to find the sweet spot between the two competing notions in information sciences: “data-rich, information-poor” and “data-poor, information-rich.”
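As a rough illustration of reacting only to particular events, the sketch below compares hypothetical daily canopy-index readings against a growth model and keeps only the readings that deviate from it. The logistic curve, its parameters, the tolerance and the readings are all assumptions chosen for illustration; they are not a validated crop model and not a description of any deployed system.

```python
import math


def logistic_growth(day, capacity=1.0, rate=0.12, midpoint=60.0):
    """Expected canopy index on a given day after planting.

    Assumption: growth follows a logistic curve with made-up parameters.
    """
    return capacity / (1.0 + math.exp(-rate * (day - midpoint)))


def flag_events(observations, tolerance=0.1):
    """Keep only readings that deviate from the model by more than `tolerance`.

    Returns a list of (day, observed, expected) tuples.
    """
    events = []
    for day, observed in observations:
        expected = logistic_growth(day)
        if abs(observed - expected) > tolerance:
            events.append((day, observed, expected))
    return events


# Hypothetical readings: (day after planting, measured canopy index).
readings = [(30, 0.03), (45, 0.15), (60, 0.50), (75, 0.55), (90, 0.97)]
events = flag_events(readings)

for day, observed, expected in events:
    print(f"day {day}: observed {observed:.2f}, model expects {expected:.2f}")
```

With these made-up numbers only the reading on day 75 departs from the model, so it is the only one passed on for further analysis; the readings that track the model can be set aside, which is the sense in which a growth model reduces the load on the analytics engine.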

So the challenge reduces to a cost comparison: Will data-driven artificial intelligence methods reach a level where the data can be analyzed and learned from on the go, at a desirable speed and at a cost that farmers can recoup through improved yield? Or could a more fundamental approach achieve the much-desired cost effectiveness by accounting for the crop growth models and helping to react only to particular events, minimizing the load on the analytics engine and the response time? In the first case, it seems that most of the technology already exists and the limitation is data, especially metadata and annotations by agronomists, which has been the driving force behind most of the competing companies in the space of digital agriculture. The main costs are associated with the later steps of intensive daily data collection, storage, ag-fused analytics and optimization. In the second case, a substantial investment is needed to develop the integrated models and methods that, down the road, could reduce the costs of data collection, storage, analytics and optimization. In either case, interdisciplinary collaborations between academia and industry, involving mathematicians, computer scientists, agronomists and system theorists, will be instrumental.

Over the next few years, the surge of activities grounded in these two paradigms will lead to new advances in mathematics and computer science, as well as new technologies that will compete for adoption by agribusiness customers. Which paradigm wins will depend on many factors, among the most important being the farmer’s motivation to cooperate in development and deployment, and the ability to create value in farm gate and processor profitability.

Acknowledgments: The author would like to thank Professor Matthew Hudson of the College of Agriculture of UIUC for his insightful comments and editorial remarks.


[1] Joseph Byrum, “The Challenges for Artificial Intelligence in Agriculture,” AgFunder News, February 20, 2017.
