Bryan Runck, Adam Streed, Diane R Wang, Patrick M Ewing, Michael B Kantar, Barath Raghavan, State spaces for agriculture: A meta-systematic design automation framework, PNAS Nexus, Volume 2, Issue 4, April 2023, pgad084, https://doi.org/10.1093/pnasnexus/pgad084
Abstract
Agriculture is a designed system with the largest areal footprint of any human activity. In some cases, the designs within agriculture emerged over thousands of years, such as the use of rows for the spatial organization of crops. In other cases, designs were deliberately chosen and implemented over decades, as during the Green Revolution. Currently, much work in the agricultural sciences focuses on evaluating designs that could improve agriculture's sustainability. However, approaches to agricultural system design are diverse and fragmented, relying on individual intuition and discipline-specific methods to meet stakeholders' often semi-incompatible goals. This ad-hoc approach presents the risk that agricultural science will overlook nonobvious designs with large societal benefits. Here, we introduce a state space framework, a common approach from computer science, to address the problem of proposing and evaluating agricultural designs computationally. This approach overcomes limitations of current agricultural system design methods by enabling a general set of computational abstractions to explore and select from a very large agricultural design space, which can then be empirically tested.
Over the past half-century, humanity has rapidly innovated to address its evolving needs. Agriculture is a notable example facing the acute and dynamic challenges of a changing climate, urbanization, evolving diets, and global biodiversity loss. We propose a framework that combines computational state space search with agriculturalist intuition such that any potential value proposition can be assessed for its potential to meet societal goals.
Agricultural scientists propose and evaluate agricultural designs under many forms of uncertainty. Climate change has introduced bioclimatic instability. Consumer preferences and markets shift rapidly. Changes in geopolitics affect what inputs are available. Under all of this change, agricultural scientists must propose and evaluate new designs that adapt agriculture to these highly uncertain environments (1–5).
Current methods for agricultural design rely heavily on field experiments or domain-specific models. While such approaches are useful to address certain questions, they are limited in the breadth of designs they can consider. In the case of field experiments, designs can take years to decades to evaluate; in the case of domain-specific models, they are often only applicable to a limited number of species under a limited number of spatial and temporal configurations (3, 6).
Existing approaches from computer science can be adapted to explore and evaluate a substantially larger number of potential designs, as has been achieved in other disciplines such as drug design, aerospace, and land use planning (7, 8). The need for a computer-automated design approach is particularly acute for agricultural systems with extreme complexity and high levels of social, bioclimatic, and technological uncertainty (9). In such a high-dimensional system, the universe of possible agricultural system configurations is impossible to explore exhaustively to identify optimal outcomes, particularly with existing agricultural research techniques.
Here, we describe one approach—state spaces—for describing and searching this universe of possible agricultural system designs, proposing both familiar and yet-unimagined designs, and evaluating their outcomes. This approach, borrowed from the domains of computer science and complexity research, represents agricultural systems as states, inputs, and outputs; agents and forces that act upon these systems; and goals and objectives in a single commensurable way.1 The approach can flexibly be applied across the wide range of agriculturally relevant scales—from genes to ecosystems and days to decades—to address challenges found across most of the disciplines of agricultural science. We anticipate that this framework will aid in proposing designs for empirical research that are resilient under high levels of uncertainty.
In this paper, we describe the state space framework and its components, including state space representation, state transition functions, state transition accounting, and state evaluation, and how these building blocks can be used to propose and evaluate agricultural system designs. Then, we illustrate how the framework can be applied across cultivar development, cropping systems agronomy, and within-season precision agriculture management.
State spaces
A state of a system is a single possible configuration of that system (Box 1). The state space of an agricultural system is then the collection of all possible states of the system.2 A state may represent, for example, a field at a moment in time or a given generation of a population under selection for breeding. In these examples, the state space represents the agricultural system under study, e.g. a cropping system or a breeding program. Critically, the state space includes states beyond those that have been observed and is delimited only by what is biophysically possible.3 All that is required is that a process exists by which states of the system can be encoded, even if hypothetical or unknown. Depending on the application, this can be fine-grained (e.g. including details about individual plant root structures in soil), coarse-grained (e.g. including only a few environmental variables at the decadal scale), or somewhere in between. Importantly, the state space approach is agnostic to how states are represented, requiring only that they are.
Box 1. Understanding state spaces. A) A discrete example of state transitions of a chess board, specifically, the opening sequence known as the Queen's Gambit. This sequence provides a known example of a series of states that provides an advantage to a specific player. This abstraction of states, state spaces, and state transitions that advantage a scenario can be used when exploring any system, such as an agroecological system. B) Identifying a route between two different cities. Tools such as Google Maps leverage the concept of state spaces by having accounting functions for the value (cost) of any given road segment, making use of edge weights and an agent/algorithm to identify a reasonable path between locations, where the agent/algorithm operates on an (internal) summary of the underlying ground truth from Google's mapping infrastructure (e.g. Street View cars) and public data. A reasonable path is typically intuitively defined by a person, but the goal of defining the transitions is to make the implicit assumptions explicit and enable generalizability of both representations and the agents/algorithms that act upon them. In this analogy, cities/locations are states, and the state space includes all cities/locations in the region; the transition functions are road segments from one city to another. It is sometimes the case that a desirable result of state transitions (e.g. arriving at Darwin) will go through undesirable intermediate states (e.g. the Outback).
Consider a specific example of cropping systems design. The goal for a specific study could be to specify the most productive crop rotation for a geography. States can be defined at two different, nested spatial scales. The first is the landscape (Fig. 1A). The second is the field (Fig. 1B). At the landscape scale, we show 5 states that are made up of the crop rotation states at the field scale; this is only a small subset of the possible landscape states. At the field scale, we can look historically and represent the state space as a Markov model, which describes both the possible crop choices and the likelihood of transition from one crop choice to another. The state space transitions in this representation, of each field in this landscape, are the likelihood of transitioning from corn to corn, corn to soybeans, soybeans to wheat, and so forth.

Fig. 1. A) Simulated landscape states. Each year contains 9 landscapes, each of which contains wheat (w), soybean (s), or corn (c); these states change between years. B) Each field's change between years can be represented as a count, a probability, and a weighted edge graph. C) Using accounting hooks, the value of each transition can be accounted for, helping to explain the probability of change. D) A design agent can explore the transition history, identify potential states, and move between states of different probabilities to create new configurations.
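As a minimal sketch of the field-scale Markov representation described above, the following Python snippet derives transition counts and probabilities from crop histories. The field names and crop sequences are invented for illustration and are not data from the study; the resulting dictionary of probabilities can be read as the edge weights of a graph whose nodes are crops.

```python
from collections import Counter, defaultdict

# Hypothetical crop histories, one letter per year:
# w = wheat, s = soybean, c = corn. Invented for illustration only.
field_histories = {
    "field_1": "cscsc",
    "field_2": "ccscw",
    "field_3": "swcsc",
}


def transition_counts(histories):
    """Count year-to-year crop transitions across all fields."""
    counts = Counter()
    for sequence in histories.values():
        for previous, current in zip(sequence, sequence[1:]):
            counts[(previous, current)] += 1
    return counts


def transition_probabilities(counts):
    """Normalize counts so outgoing probabilities from each crop sum to 1."""
    totals = defaultdict(int)
    for (previous, _), n in counts.items():
        totals[previous] += n
    return {edge: n / totals[edge[0]] for edge, n in counts.items()}


counts = transition_counts(field_histories)
probabilities = transition_probabilities(counts)

# Each (previous, current) pair is a weighted edge in the crop graph.
for (previous, current), p in sorted(probabilities.items()):
    n = counts[(previous, current)]
    print(f"{previous} -> {current}: count={n}, probability={p:.2f}")
```

With these invented histories, corn to soybean is the most frequent transition out of corn, which is the kind of pattern a design agent could later exploit or deliberately depart from.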
Transition functions
Given suitably represented states, it becomes possible to define state transitions (i.e. mathematical functions) (Fig. 1C). These are the permissible (e.g. physically possible) transitions from one state to another. To the extent that such transitions are not idiosyncratic but follow predictable patterns, they can be represented as functions that map from an input state to an output state.4 In the language of graph theory, if each state is a node in a graph, then edges represent potential transitions. For an agricultural system, we might have a transition function representing the effects of irrigation (taking a set of drier states to wetter ones), fertilization (taking a set of states with lower nutrients to higher), crop selection, and so on. Transition functions are crucial to understand which state transitions are considered possible and to represent biophysical processes computationally.5 Accounting hooks are computational functions (e.g. simple formulas or even arbitrary pieces of code) that can be adjunct to transition functions, enabling the accounting of the costs and benefits as the agricultural system is transformed through a series of state transitions.
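The relationship between a transition function and an accounting hook can be sketched in a few lines of Python. Everything named here (`FieldState`, `irrigate`, `apply_transition`, `water_cost_hook`, and the price of 0.5 per mm) is a hypothetical illustration rather than part of the framework's published interface.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List


@dataclass(frozen=True)
class FieldState:
    """A deliberately coarse, hypothetical encoding of a field state."""
    soil_water_mm: float
    soil_nitrogen_kg_ha: float


Ledger = Dict[str, float]
Transition = Callable[[FieldState], FieldState]
Hook = Callable[[FieldState, FieldState, Ledger], None]


def irrigate(state: FieldState, amount_mm: float) -> FieldState:
    """Transition function: maps a drier state to a wetter one."""
    return replace(state, soil_water_mm=state.soil_water_mm + amount_mm)


def water_cost_hook(before: FieldState, after: FieldState, ledger: Ledger) -> None:
    """Accounting hook: records the cost of any water added (assumed price)."""
    added_mm = max(0.0, after.soil_water_mm - before.soil_water_mm)
    ledger["water_cost"] = ledger.get("water_cost", 0.0) + 0.5 * added_mm


def apply_transition(state: FieldState, transition: Transition,
                     hooks: List[Hook], ledger: Ledger) -> FieldState:
    """Run a transition, then let every hook account for its costs and benefits."""
    new_state = transition(state)
    for hook in hooks:
        hook(state, new_state, ledger)
    return new_state


ledger: Ledger = {}
start = FieldState(soil_water_mm=100.0, soil_nitrogen_kg_ha=40.0)
end = apply_transition(start, lambda s: irrigate(s, 25.0), [water_cost_hook], ledger)
print(end, ledger)
```

Keeping hooks separate from transitions means the same biophysical transition can be costed differently for different stakeholders without being rewritten.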
Software system modularity
The framework enables the transformation of agricultural system design over an infinite state space into tractable and practical summaries that can be operated upon by well-understood agent-based and algorithmic techniques (Fig. 2). Many of these summarization and transition functions have long been implicit in the agricultural sciences, and through this framework we make them explicit: a summary function could simply encode and combine the human-level understanding of the states of an agricultural system (for example, the spatial arrangement of a farm and the sequence of its crops at a timepoint during a growing season) with a lower-level representation such as a space-time cube of the land and its constituent parts. Such an approach to encoding low-level state representations in summarized forms has been the basis for success in the applications of artificial intelligence to general game playing problems as well as the success of general-purpose language models (7). This is because the output of a summary function is simply another, more coarse-grained state space, upon which transition functions (between these higher-level states) can be defined (Fig. 2). Thus, summary functions enable aggregation toward coarser representations that are more computationally tractable.6

Fig. 2. State summarization for agricultural state spaces. Here, a multistage transformation converts an agricultural state space of infinite complexity into practical summaries that can be operationalized using well-understood agent techniques. Transition functions apply within a single level of representation, describing the pathways for transition from one state to another (and can have associated costs that are accounted for). The data can be of any level of complexity (e.g. mapping layers, images, yield, ecosystem services). Summarization enables agents to act upon solely the components that are relevant for their decision-making, which is described by the user. This flexible framework allows for modularity for use cases that are of interest to any scientist.
There are four benefits to this summarization approach. First, it makes explicit the summarization and categorization of agricultural systems that is and always has been taking place (13). Second, by defining summarization functions and their output state spaces, it produces reusable high-level representations that enable modularity of agents that act upon those representations. Third, higher-level summary representations of state spaces are often more understandable by scientists and practitioners, enabling faster improvement and verification of the results of the design system. Fourth, much of the low-level representation of a state space is likely irrelevant for some specific context and retaining that low-level state representation will result in computational intractability; summarization enables agents to act upon solely the components that are relevant for their decision-making, lowering computational complexity (Fig. 2). As we progress toward more simplified and coarse state spaces, the framework makes it possible for existing techniques in the computer science literature, whether Reinforcement Learning, Random Forest Decision Trees, or deterministic or heuristic techniques, to act upon the state representation to explore the transition between states given some high-level objective. Regardless of technique, these form the basis of state spaces required for design automation.
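A summary function of the kind discussed in this section can be sketched as follows. The example collapses a field-by-field landscape into the share of fields growing each crop; the landscapes and the function name `summarize_crop_shares` are assumptions made for illustration.

```python
from collections import Counter

# Two hypothetical fine-grained landscape states: nine fields each,
# listed row by row (c = corn, s = soybean, w = wheat).
landscape_a = ["c", "s", "c", "w", "s", "c", "c", "s", "w"]
landscape_b = ["w", "w", "s", "s", "s", "c", "c", "c", "c"]


def summarize_crop_shares(fields):
    """Summary function: map a spatial state to the share of fields per crop."""
    counts = Counter(fields)
    total = len(fields)
    return {crop: counts[crop] / total for crop in sorted(counts)}


summary_a = summarize_crop_shares(landscape_a)
summary_b = summarize_crop_shares(landscape_b)

print(summary_a)
# Different spatial arrangements can collapse to the same coarse state.
print(summary_a == summary_b)
```

The mapping is many-to-one: both arrangements yield the same coarse state, so an agent working at this level cannot distinguish them. That loss of detail is the price of tractability, and it is acceptable only when spatial arrangement is irrelevant to the objective at hand.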
Designing agricultural systems with state spaces
The last component required for design automation using state spaces is the creation of a design agent (Fig. 1D). There are many prior definitions of design agent. Here, we define a design agent as any program that has the ability to execute a transition function in order to explore the designs within a state space. There are many ways to operationalize this definition of an agent, from one that executes transitions in a state space at random to one that does so deliberately using utility theory (14). Regardless of the specific approach to building a design agent, the agent will need to be built with significant consultation with local stakeholders to parameterize utility functions (4, 5–7).
For example, an agent may be used to design nitrogen application rates and balance the tradeoff between nitrogen losses and grain yield, where larger nitrogen fertilizer applications increase yield with diminishing returns and rapidly increase nitrogen losses that degrade water quality (18). Here, a state is the field given a specific amount of nitrogen applied, the cost of transition is the cost of increasing or reducing nitrogen application, and the benefit is some weighting between water quality and grain yield outcomes. A utility-based design agent could select new states (e.g. combinations of fertilizer rate, fertilizer form, resulting crop yield, and water quality), estimate the likelihood of reaching these states (e.g. subject to variable weather, application timing, etc.), and the costs of transition and resulting benefits (e.g. cost of new equipment, different form, etc.), ultimately predicting high-value nitrogen rates and outcomes across a landscape. However, this utility agent is only one potential agent. One could instead employ a random design agent to approximate the range of possible nitrogen losses and yields based on simulated nitrogen rate configurations and thus provide a benchmark for the current state of the agricultural system.
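The contrast between a utility-based agent and a random agent in the nitrogen example can be sketched as below. The response curves, weights, and candidate rates are all invented for illustration; a real application would replace them with calibrated models and stakeholder-derived weights.

```python
import math
import random

# All curves and weights below are hypothetical placeholders.


def grain_yield(n_rate):
    """Yield (t/ha) that rises with diminishing returns in N rate (kg/ha)."""
    return 12.0 * (1.0 - math.exp(-0.015 * n_rate))


def nitrogen_loss(n_rate):
    """Nitrogen loss (kg/ha) that grows faster than linearly with N rate."""
    return 0.0005 * n_rate ** 2


def utility(n_rate, loss_weight=0.1, cost_per_kg=0.005):
    """Benefit of yield minus weighted losses and the cost of the transition."""
    return grain_yield(n_rate) - loss_weight * nitrogen_loss(n_rate) - cost_per_kg * n_rate


def utility_agent(candidate_rates):
    """Deliberate agent: pick the candidate state with the highest utility."""
    return max(candidate_rates, key=utility)


def random_agent(n_samples, max_rate=300.0, seed=0):
    """Random agent: sample rates to benchmark the range of possible outcomes."""
    rng = random.Random(seed)
    rates = [rng.uniform(0.0, max_rate) for _ in range(n_samples)]
    yields = [grain_yield(r) for r in rates]
    losses = [nitrogen_loss(r) for r in rates]
    return (min(yields), max(yields)), (min(losses), max(losses))


candidates = range(0, 301, 10)
best_rate = utility_agent(candidates)
yield_range, loss_range = random_agent(1000)

print(f"utility agent selects {best_rate} kg N/ha")
print(f"random agent yield range: {yield_range[0]:.1f} to {yield_range[1]:.1f} t/ha")
print(f"random agent loss range: {loss_range[0]:.1f} to {loss_range[1]:.1f} kg N/ha")
```

The two agents share the same state representation and differ only in how they choose transitions, which is what makes agents interchangeable within the framework.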
Critically, high-accuracy forecasts of agricultural outcomes are not necessary for successful application of the state space framework and design agent. There are two reasons for this. First, it is possible to create agents that generate appealing and useful designs without any notion of how the world works, as is the case across many large language models and game playing agents. Agents could be instead trained by observing humans engage in design or by designing against themselves. For example, generative AI techniques have successfully created new works of art (19) and literature (20). Designing agricultural systems is a natural extension to success on other agricultural challenges such as automatic weeding (21), sorting produce (22), and managing farms (23). The strength of generative AI techniques is that they can efficiently generate plausible and desirable output in high-dimensional spaces, meaning such techniques can be effective for generating designs using our state space framework.
Second, in cases where accurate biophysical prediction matters, the emphasis for a successful agent is primarily on transitions among states. In this second, explainable approach to building a design agent, a major research goal will be developing “world models” to estimate the likelihoods of state outcomes given a transition function. For example, to estimate the likelihood of yield and water quality outcomes, a design agent does not need exact point predictions of a state to evaluate potential new states and transition functions, only accurate likelihood estimates that a state can be reached given a transition function and a value if it were to be reached. Many methods will produce such likelihood estimates. Most rely on Monte Carlo simulation with varying input parameters for models. This model requirement underscores the need for general purpose and scalable models of agricultural systems that are parameterized automatically for different geographies and design objectives. The ability of design agents to handle uncertainty is a major feature of this state space approach. Even if the outcomes of transitions between states are highly uncertain, so long as there is an accurate evaluation of the likelihood of reaching each state and a value of the target state, the state space and design agent will provide robust design recommendations.7
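The Monte Carlo approach mentioned above can be illustrated with a toy world model. The sketch estimates the likelihood that a nitrogen rate reaches a target state (yield above a threshold and losses below a cap) under random rainfall, then weights the value of that state by the likelihood. The rainfall distribution, response equations, thresholds, and state value are all assumptions made for this example.

```python
import random


def simulate_outcome(n_rate, rng):
    """Toy world model: one random draw of yield (t/ha) and N loss (kg/ha)."""
    rainfall_mm = rng.gauss(500.0, 100.0)                    # hypothetical climate
    water_factor = max(0.0, min(1.0, rainfall_mm / 500.0))   # drought penalty only
    grain_yield = 12.0 * water_factor * n_rate / (n_rate + 60.0)
    nitrogen_loss = 0.05 * n_rate * max(0.0, rainfall_mm / 500.0)
    return grain_yield, nitrogen_loss


def estimate_reach_probability(n_rate, yield_target=7.0, loss_cap=10.0,
                               n_draws=5000, seed=0):
    """Fraction of simulated seasons in which the target state is reached."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        grain_yield, nitrogen_loss = simulate_outcome(n_rate, rng)
        if grain_yield >= yield_target and nitrogen_loss <= loss_cap:
            hits += 1
    return hits / n_draws


def expected_value(n_rate, state_value=100.0):
    """Likelihood of reaching the target state times its (assumed) value."""
    return estimate_reach_probability(n_rate) * state_value


for rate in (0, 150, 300):
    p = estimate_reach_probability(rate)
    print(f"N rate {rate:>3} kg/ha: P(target reached) = {p:.2f}, "
          f"expected value = {expected_value(rate):.1f}")
```

In this toy model the lowest rate never meets the yield target and the highest rate cannot satisfy both conditions at once, so only the intermediate rate has a nonzero likelihood. The agent ranks transitions using that likelihood and the value of the target state, without needing an exact point prediction for any single season.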
Example applications
We describe three potential application areas. For each, we first describe how agricultural scientists currently approach the design problem. Then, we outline how the problem area maps into the state space framework. Finally, we describe the insights the state space framework provides.
Breeding
Description of the design problem
Plant breeders make selections on genetic variation for target traits. Consider new cultivar development for biotic or abiotic stress tolerance. First, a source of tolerance must be identified, which can require screening hundreds or thousands of accessions. Next, experimental populations are developed, which result in tens of thousands of progeny. Crosses with tolerant parents may cause a population's performance to fall initially, so that several generations are required to recover good yield and quality characteristics (24, 25). Here, "performance" has a somewhat implicit but well-understood definition that we aim to make explicit. Progress depends on the size of the population, selection intensity, and genetic variability for the target trait. Current strategies for breeding include multienvironment trials to sample target populations of environments (26), genomic prediction modeling (27), and speed breeding (28).
State space description of the problem
The transition function for the state space is one cycle of selection and each generation can be represented as a summarized state of individual plants. Drawing upon concepts from evolutionary biology, each generation can be evaluated by a design agent based on its location on a fitness landscape (28). The ability to transition between states, or traverse the fitness landscape, depends on the genetic features of the trait (trait architecture and heritability), the characteristics of the species (e.g. the mating system: clonal, outcrossing, selfing), and the current state of the population (where it is located on the landscape). As the design agent calls the transition function to move the population through the state space, the ability of the design agent to recover specific desired properties depends on the outputs associated with various state transitions.
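One way to picture a cycle of selection as a transition function is the sketch below. It uses truncation selection and the standard breeder's equation (response equals heritability times selection differential) as a stand-in world model. The `PopulationState` class, the trait values, and the choice to shift all individuals uniformly are simplifying assumptions, not the authors' implementation.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Tuple


@dataclass(frozen=True)
class PopulationState:
    """Summarized state of a breeding population for a single trait."""
    generation: int
    trait_values: Tuple[float, ...]


def selection_cycle(state, selected_fraction, heritability):
    """Transition function: one cycle of truncation selection.

    The next generation's mean shifts by R = h^2 * S (breeder's equation).
    For simplicity, every individual is shifted by R, so variance is unchanged.
    """
    ranked = sorted(state.trait_values, reverse=True)
    n_selected = max(1, round(len(ranked) * selected_fraction))
    parents = ranked[:n_selected]
    selection_differential = mean(parents) - mean(state.trait_values)
    response = heritability * selection_differential
    offspring = tuple(value + response for value in state.trait_values)
    return PopulationState(state.generation + 1, offspring)


founders = PopulationState(0, (1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
next_generation = selection_cycle(founders, selected_fraction=0.2, heritability=0.5)

print(mean(founders.trait_values), "->", mean(next_generation.trait_values))
```

A genomic prediction or physiological model could replace the body of `selection_cycle` without changing how a design agent calls it.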
Novel insights into the problem area from state spaces
Breeders, in many respects, have been using methods that are akin to state spaces to optimize selection of new material (29).8 However, explicitly framing crop breeding using a state space approach has the potential to overcome the longstanding challenge of interoperability with other subdisciplines in the agricultural sciences (e.g. cropping systems design), if they are also framed using state spaces. By taking the state space perspective, this interdisciplinary interfacing can readily make use of emerging discipline-specific approaches (e.g. genomics and physiological models for crop prediction (30, 31)) to operate as transition functions that map one state to another. A major challenge to application of the state space approach in the breeding context is that the design agents' utility function may require observing the process of breeding and selection for multiple traits simultaneously, making gains per cycle small, and could thus slow the generation of the plant material needed in a working breeding program (32). Using more species and broadening the state space search may overcome this challenge [e.g. (33)]. More generally, as a more comprehensive understanding emerges about the underlying processes of crop physiology and genetics, current and future genome to phenome models may be readily swapped in and out of this framework.
Cropping systems
Description of the design problem
A preliminary evaluation of a new cropping system design requires a minimum of three site years; a longer study is necessary to address emerging challenges including resilience to climate variation. An exceptionally large number of potential cropping system designs emerge given species, cultivars, and management choices available in a single environment.
State space description of the problem
A current long-term research program is adapting cropping systems to increasingly erratic weather in the western US corn belt, which is currently dominated by the corn-soybean rotation (34). This experiment can test five annual crop rotations of up to 4 years in length with a subset of seven annual crops using locally common management. However, a total of 721 rotations of up to 4 years are possible. While some of these may be more favorable than the five currently being studied, evaluating them all is infeasible for a long-term experiment that captures sufficient weather variation.
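The text does not state how the figure of 721 rotations was counted. One counting convention that reproduces it, offered here as an assumption, is to take sequences of 2 to 4 years over seven crops, treat cyclic shifts as the same rotation (corn-soybean equals soybean-corn), and exclude sequences that merely repeat a shorter rotation (corn-soybean-corn-soybean). The brute-force sketch below enumerates them.

```python
from itertools import product


def canonical(rotation):
    """Smallest cyclic shift, so all shifts of a rotation share one key."""
    return min(rotation[i:] + rotation[:i] for i in range(len(rotation)))


def is_primitive(rotation):
    """True unless the sequence equals one of its own nontrivial cyclic shifts,
    i.e. unless it is a shorter rotation repeated."""
    return all(rotation[i:] + rotation[:i] != rotation
               for i in range(1, len(rotation)))


def count_rotations(n_crops, min_years, max_years):
    """Count distinct, non-repeating cyclic rotations by brute force."""
    distinct = set()
    for years in range(min_years, max_years + 1):
        for rotation in product(range(n_crops), repeat=years):
            if is_primitive(rotation):
                distinct.add(canonical(rotation))
    return len(distinct)


ordered_sequences = sum(7 ** years for years in range(2, 5))
print(ordered_sequences)          # raw ordered sequences of 2 to 4 years
print(count_rotations(7, 2, 4))   # distinct rotations under this convention
print(count_rotations(7, 1, 4))   # same, also counting 7 monocultures
```

Under this convention the count for 2 to 4 years is 721, and including the seven single-crop monocultures gives 728. Other conventions give different totals, so the match is suggestive rather than conclusive.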
Novel insights into the problem area from state spaces
The state space framework can guide the proposal and evaluation of unstudied rotations. One might combine crop models with soil physical and nutrient models to infer crop and soil outcomes based on weather and known or inferred rotational effects in order to establish the state transition functions (35). Management practices, including planting date, fertilizer inputs, or the addition of cover crops or intercropped forage legumes might also be varied using data from nearby experiments (36). Transition functions would account for estimates of yield, inputs, and changes in soil properties from each transition across single and multiple years in a rotation. Based on these outcomes, the design agent may select favorable (high expected value) transitions for each possible subsequent state based on the current crop state, resulting in the identification of locally adapted n-year rotations without exhaustively modeling each of the 721 possible 1–4-year rotations. Agricultural scientists could then initiate empirical study of the best performing rotations.
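The idea of choosing favorable transitions from the current crop state, rather than scoring every complete rotation, can be sketched with a greedy agent. The transition values below are invented placeholders for what crop and soil models would supply. A greedy choice is only one possible strategy and is not guaranteed to find the best overall rotation.

```python
# Hypothetical expected value of growing `next_crop` after `previous_crop`.
TRANSITION_VALUE = {
    ("corn", "corn"): 0.6, ("corn", "soybean"): 1.0, ("corn", "wheat"): 0.8,
    ("soybean", "corn"): 1.2, ("soybean", "soybean"): 0.5, ("soybean", "wheat"): 0.9,
    ("wheat", "corn"): 1.1, ("wheat", "soybean"): 1.0, ("wheat", "wheat"): 0.4,
}


def greedy_rotation(start_crop, years):
    """Build a rotation by taking the highest-value transition at each step."""
    rotation = [start_crop]
    total_value = 0.0
    for _ in range(years - 1):
        current = rotation[-1]
        options = {next_crop: value
                   for (previous, next_crop), value in TRANSITION_VALUE.items()
                   if previous == current}
        best = max(options, key=options.get)
        rotation.append(best)
        total_value += options[best]
    return rotation, total_value


rotation, total_value = greedy_rotation("corn", years=4)
print(rotation, round(total_value, 2))
```

For a 4-year rotation over three crops, this agent inspects 9 transition values instead of scoring all 27 complete sequences that start from corn; the saving grows quickly with more crops and longer rotations.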
Parallels to tropical multispecies mixtures
A similar approach might be taken to study perennial multi-species mixtures in the tropics, such as intercropped and integrated coconut-cacao-animal systems (37). However, it is unclear which plant and animal combinations would achieve the desired goals of stakeholders. Using world models, the design agent would evaluate the summarized outcomes at the end of establishment, such as biannually for coconut and breadfruit (which can be planted in orchards together), and every few months for chickens; management of both trees and animals might be selected for highly favorable outcomes based on possibilities at the current state. Multiple long-term simulations would allow selection of favorable starting tree configurations, species, or varieties based on management goals and decision horizons.
In both cases, integrating technology into cropping systems agronomy is necessary to aid the data collection efforts necessary to train models. The logistics of such intensive data collection and integration into models are highly non-trivial with standard procedures still emerging within digital agriculture, including the use of synthetic data (38, 39). These practical considerations could slow the ultimate adoption of the state space approach.
In-season management
Description of the design problem
Many decisions alter the growth and development of crops during the season, including planting date, tillage, nutrient inputs, pesticide use, grazing, and harvest date. The combinations of species and management within a season lead to an exceedingly large state space.
State space description of the problem
For example, consider an intercropping agrivoltaic system with a cover crop under the solar panels and vegetable production in the rows between panels. To set up a viable production farm, there may be a need to test 3 cover crops, 2 animal species (e.g. for grazing cover crops), 10 vegetables with 5 cultivars of each vegetable, 2 planting dates, 4 harvest times, and 3 different pest management scenarios, for a total of 7,200 possible states in a single location (40). Empirically testing this many states to identify the optimum is not feasible for even a single location.
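The total of 7,200 follows from multiplying the number of options for each decision. The short sketch below checks the arithmetic and shows how the combinations could be enumerated lazily; the factor names are labels chosen for this example.

```python
from itertools import product
from math import prod

# Number of options per decision, as listed in the text.
options = {
    "cover_crop": 3,
    "animal_species": 2,
    "vegetable": 10,
    "cultivar_per_vegetable": 5,
    "planting_date": 2,
    "harvest_time": 4,
    "pest_management": 3,
}

total_states = prod(options.values())
print(total_states)

# Lazily enumerate every combination as a tuple of option indices.
all_states = product(*(range(n) for n in options.values()))
first_state = next(all_states)
print(dict(zip(options, first_state)))
```

Because `itertools.product` yields states one at a time, an agent can sample or filter this space without holding all combinations in memory.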
Novel insights into the problem area from state spaces
There are only a few scaled agrivoltaic production systems, but there are comparable agroforestry systems. From the perspective of the design agent there may be little difference in the photosynthetic activity from a tree or solar panel canopy, so new systems can be assessed and outcomes inferred without having actually been empirically tested. This demonstrates how abstractions that allow for modular thinking can limit the number of combinations that need to be tested in a given context or to reimagine what combinations can be used. By identifying the management that enables favorable transitions among states (Fig. 1), design agent-selected choices can guide real world testing of solutions that are likely practical and will meet the needs of the researcher.
Conclusion
Agriculture has served as the foundation of human civilization across cultures, resulting in a rich array of system designs, many of which are not in use today. The state space framework outlined here enables the automated design of agricultural systems to explore that full breadth and beyond. Design agents search agricultural state spaces in order to identify systems that can meet the changing demands of an uncertain future. The practical implementation of an automated design system requires modularity in both the conceptualization of agricultural systems (e.g. individual-based models based on biophysical principles) and the software components used to define states. The practical next steps of implementing such a system will require rapid proposal and disposal of many submodules to work toward automated design. Thus, we likely need to move the field toward deliberate consideration of abstractions that compose cleanly and enable modularity, where we can iterate on the individual contributions within subfields without siloing that knowledge. In this way, agricultural scientists may need to think more like computer scientists, holding abstractions and representations more loosely.
“[A]bstraction is a quintessential activity of computer science—the intellectual tool that allows computer scientists to express their understanding of a problem, manage complexity, and select the level of detail and degree of generality they need at the moment. Computer scientists create and discard abstractions as freely as engineers and architects create and discard design sketches” (41).
The result may be that instead of relying on the intuition of individual scientists and long-held implicit abstractions to generate new designs, the design of agricultural systems may be made more resilient in the face of uncertainty through:
formalizing the intuition of experts for what constitutes a resilient agricultural system to establish goals for automated design agents,
facilitating the borrowing and integration of modularized knowledge across disciplines by providing a common language of state spaces, aiding multidisciplinary research, and
accelerating innovation by generating computer-aided design systems that can infer novel agricultural configurations with a high likelihood of societal benefit, allowing us to make the most of scarce time, space, and money in empirically evaluating new agricultural system designs.
In this way, the State Spaces for Agriculture framework is about the formalization of a computational imagination that provides a flexible and general approach to conceptualizing digital agriculture research to motivate and support empirical research and development on the most promising of designs in an uncertain world.
Acknowledgments
We would like to thank the NIFA AG2PI Collaborative: Creating a Shared Vision across Crop and Livestock Communities award number—20217041235233 for the subaward to undertake this work. NSF Grant 2138292—EAGER: Computational Agroecology: A Systems Approach. Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the US Department of Agriculture. The USDA is an equal opportunity provider and employer.
Funding
The authors have received funding from the United States Department of Agriculture, the National Science Foundation, and the Minnesota Discovery, Research, and InnoVation Economy (MnDRIVE) funding program.
Author contributions
B.R. conceptualization, drafting, writing/reviewing; A.S. conceptualization, drafting, writing/reviewing; D.R.W. conceptualization, drafting, writing/reviewing; P.M.E. conceptualization, drafting, writing/reviewing; M.B.K. conceptualization, drafting, writing/reviewing; B.R. conceptualization, drafting, writing/reviewing.
Data availability
All data is included in the manuscript and/or supporting information.
References
Footnotes
A state space can be thought of as being a graph, where states are nodes and edges are transitions between states; transition functions define a set of transitions that can occur. Summary functions perform aggregation of groups of nodes into single nodes in a coarser-grained version of the graph.
For state spaces to be computationally interesting, they must be computationally representable; any computational methods must have state representations to compute over. In most real-world systems, a difficulty exists: complete representation of all world states is typically impossible, and no particular scheme of representation is obviously correct or best.
Transition functions can encode management actions taken on the land, environmental modeling (e.g. plant, weather, or climate models) or, more generally, anything that can cause a change in the state of the system.
The transition functions also implicitly define a reachable set of states of the state space given some starting state; it is important not to overly constrain the transition functions because this will artificially prevent evaluation of and/or traversal through some particularly good or bad states in the state space.
For example, the percentage of fields in a landscape growing each crop may be such a coarser representation. State spaces need not be spatial in nature.
This interplay between a world-model based state space and a design agent can be represented in multiple different ways. One approach is to represent the state space and its transition functions being controlled by a simulator that executes the world model and transitions from one state to another according to expected biophysical reality. The agent would also be capable of acting upon the state space to search for desirable transition pathways through the state space given the world model's behavior. Alternatively, this could be seen as simply the interplay of two agents, the world model and the design agent, acting upon a single state space. The framework admits many possible agents operating within a single state space, enabling both modularity and generality.
Resource allocation (29) within genomic selection provides an explicit example of changing the transition function in the state space model, where the goal is to optimize state transitions to minimize breeding program cost.
Author notes
All authors share first author.
Competing Interest: The authors declare no competing interest.