-
PDF
- Split View
-
Views
-
Cite
Cite
Amélie Mathieu, Tiphaine Vidal, Alexandra Jullien, QiongLi Wu, Camille Chambon, Benoit Bayol, Paul-Henry Cournède, A new methodology based on sensitivity analysis to simplify the recalibration of functional–structural plant models in new conditions, Annals of Botany, Volume 122, Issue 3, 1 September 2018, Pages 397–408, https://doi.org/10.1093/aob/mcy080
- Share Icon Share
Abstract
Functional–structural plant models (FSPMs) describe explicitly the interactions between plants and their environment at organ to plant scale. However, the high level of description of the structure or model mechanisms makes this type of model very complex and hard to calibrate. A two-step methodology to facilitate the calibration process is proposed here.
First, a global sensitivity analysis method was applied to the calibration loss function. It provided first-order and total-order sensitivity indexes that allow parameters to be ranked by importance in order to select the most influential ones. Second, the Akaike information criterion (AIC) was used to quantify the model’s quality of fit after calibration with different combinations of selected parameters. The model with the lowest AIC gives the best combination of parameters to select. This methodology was validated by calibrating the model on an independent data set (same cultivar, another year) with the parameters selected in the second step. All the parameters were set to their nominal value; only the most influential ones were re-estimated.
Sensitivity analysis applied to the calibration loss function is a relevant method to underline the most significant parameters in the estimation process. For the studied winter oilseed rape model, 11 out of 26 estimated parameters were selected. Then, the model could be recalibrated for a different data set by re-estimating only three parameters selected with the model selection method.
Fitting only a small number of parameters dramatically increases the efficiency of recalibration, increases the robustness of the model and helps identify the principal sources of variation in varying environmental conditions. This innovative method still needs to be more widely validated but already gives interesting avenues to improve the calibration of FSPMs.
INTRODUCTION
A main issue in agronomy is the understanding of the plasticity of a plant, which can be achieved by describing the interactions between a plant genotype and its environment. Functional–structural plant models (FSPMs) make explicit architectural components of the plant and their interactions with their environment. They scale up processes modelled at the organ scale to the plant level. Thus, they make it possible to simulate plant behaviours and quantify architectural plasticity (Sievanen et al., 2014). An ideal objective pursued by such models is that parameters should be environmental invariants, i.e. they should be directly related to genotypes (Bertin et al., 2010). For this reason, parameter estimation is of particular importance in enabling comparison of genotypes: which parameters have a different value between two genotypes and could explain the observed differences in the phenotypes?
Accurate determination of model parameters values is critical to ensure that model predictions reproduce as faithfully as possible the observations of real phenomena. The difficulty of calibrating a model increases with the number of parameters. Indeed, a model with a large number of parameters faces an identifiability problem. If model identification is theoretically possible, it is much more difficult in practice and highly dependent on the quality of the data (Walter and Pronzato, 1996). Compensation between correlated parameters can give solutions acceptable from a mathematical point of view but aberrant from a biological one. The tendency to refine the scales at which physiological processes are described [see Barillot et al. (2016) or Pallas et al. (2016) for recent examples] makes this issue particularly critical. A difficulty induced by this complexification is the larger number of parameters to be estimated, although the biological significance of some of them helps in their determination. This induces a heavy calibration cost and increases uncertainty about parameter values. Thus, adapting the model to a new genotype and/or environment might become very tedious with such complex models.
We are thus facing a general and crucial problem in plant growth modelling, that of calibration. Generally speaking, in the calibration stage parameters can be separated into two groups: parameters that can be directly measured experimentally (even if this often leads to a significant experimental cost for data acquisition) and those that are not easily available to measurement and should be estimated from indirect observations on the plant. Parameter estimation can be done either manually or automatically. Manual calibration is subjective, usually takes time (in particular regarding non-measurable parameters) and its success depends on the experience of the modeller (Muleta and Nicklow, 2005). Automatic calibration consists in the use of a search algorithm to determine optimal parameters, by minimizing a given loss criterion; it usually relies on a statistical model, the loss criterion being a measure of the distance between the model and the observed data set. The criterion function is increasingly difficult to minimize as the number of parameters increases, which may necessitate the use of costly algorithms to determine the exact solution (Wallach et al., 2001).
The objective of this paper is to propose a methodology to facilitate calibration of complex models by reducing the number of parameters using a priori a sensitivity analysis method, which helps to simplify the loss criterion to minimize. Sensitivity analysis methods are frequently applied to model outputs and used to rank parameters according to their impact on these outputs. For example, several studies used sensitivity analyses to select key parameters of process-based models, which helps to identify key genetic parameters and adapt the model easily to a new variety (Quilot-Turion et al., 2012; Martre et al., 2015). Cerasuolo et al. (2016) applied global sensitivity analyses to identify the key parameters for yield formation, for different varieties and sites. They concluded that such a model, with a reduced number of key parameters, could be used to predict site-specific yields of different genotypes. Likewise, Makowski et al. (2006) used a global sensitivity analysis method to quantify the impact of genetic parameters on crop yield and grain quality. Similar work has been carried out on FSPMs. For example, Luquet et al. (2006) used model sensitivity analysis to show that organ-based demand is crucial when considering plant growth and development. Perez et al. (2018) performed sensitivity analyses to highlight the key architectural traits that govern light interception. Muleta and Nicklow (2005) and Ruget et al. (2002) identified the most important parameters for the key processes using sensitivity analysis methods and they concentrated their calibration efforts on these parameters to reduce model uncertainty. However, such work is still rare for FSPMs, mainly because their computational time is too long and sensitivity analysis requires running the model numerous times.
The novelty of the work described in the present article is that sensitivity analysis was used to simplify the loss criterion rather than to quantify the impact of model parameters on model outputs. Hence, the global sensitivity analysis was directly applied to the criterion function used in the parameterization process, which has not been done previously. Indeed, a classical method in parameter estimation based on sensitivity analysis is to select parameters with the highest total sensitivity indices for the outputs of interest, as suggested by Campolongo and Saltelli (1997) and Post et al. (2008). However, for FSPMs the output vector is of large dimension since all individual organs are taken into account. Such methods thus require a choice among model outputs. Conducting the sensitivity analysis on the criterion function makes it possible to quantify more directly the parameters that have the greatest impact in the parameter identification process, considering all the outputs in a unique and simple criterion. These parameters might be different from those to which the model outputs are very sensitive according to theweights of the different observations. Our hypothesis was that a parameter with a low sensitivity index could be fixed to a nominal value while a parameter with a high sensitivity index needs to be estimated for each new situation.
This paper presents a two-step methodology. First, sensitivity analysis was carried out on the generalized least-squares criterion computed for a given experimental condition. It allowed the identification of the most influential parameters among all the estimated ones. Second, a model selection method based on the stepwise Akaike information criterion (AIC) was used to further reduce the number of parameters. This methodology was applied to the winter oilseed rape model based on the GreenLab framework (Mathieu et al., 2016) and detailed in the Materials and methods section. This methodology was then evaluated by estimating the parameter values for a new condition (same variety, another year).
MATERIALS AND METHODS
Winter oilseed rape
Winter oilseed rape (WOSR) is an annual plant. In France, WOSR plants are sown in September and harvested in July. The crop cycle lasts ~270 d from germination to yield. The cycle can be divided into three phases. First, from October to February, plant aerial parts are mainly composed of leaves; this is the rosette stage (stage BBCH 11–19). Then, the stem elongates (BBCH 30–39) and branches appear (BBCH 21–29). Depending on the cultivar, the average number of primary branches on the main stem varies between five and 20 and secondary branches may develop. In this article, the cultivar measured has on average eight primary branches. Flowers appear at the top of each stem in April (BBCH 57–65). One stem can bear from 20 to 100 flowers. A flower either aborts or becomes a pod (BBCH 69–88), which contains several seeds that will be used to produce oil.
Our article is based on two experimental data sets constructed from experimental measurements of WOSR plants of variety Pollen grown in Thiverval-Grignon, France (48° 51′20″ N, 1°56′25″ E).
Data set used to compute the loss criterion.
This data set has already been used to calibrate the GreenLab model and is fully described in Jullien et al. (2011). The crop was sown on 4 September 2007 at a density of 50 seeds m−2. The final crop density at flowering was 20 plants m−2. Nitrogen fertilization was carried out twice: 40 kg ha−1 on 4 March 2008 and 100 kg ha−1 on 18 March 2008. Meteorological data came from the climatic station at Thiverval-Grignon.
The number of leaves on the main stem and branches was recorded weekly from emergence to yield on several plants, in order to determine the rhythm of organ appearance, also called the phyllochron. Destructive measurements were carried out at eight dates between 12 November 2007 and 19 June 2008 (days from emergence: 46, 62, 120, 155, 174, 197, 237, 266). At each date, leaves and internodes of the main stem were measured, oven-dried and weighed. For lateral axes, organs were gathered according to type and axis: internodes, leaves, and pods.
Data set used to test the method.
Seeds were sown in individual pots on 30 August 2012 and then planted into six containers of the same dimensions (120 × 120 × 60 cm) ~2 weeks later. The density was 42 plants m−2. Container monitoring ensured water and nitrogen supply according to plant demand. Nitrogen fertilization was carried out twice: 50 kg ha−1 was applied in October with a uniform distribution of 20 points per container and a second amount of 120 kg ha−1 was applied in March. The total amount of nitrogen provided was consistent with agricultural practices in this area. The number of leaves was recorded weekly. Destructive measurements were carried out at three dates (181, 223 and 255 d after emergence). At each date, leaves of the main stem were measured, oven-dried and weighed. Internodes and pods were pooled by types of organ, and their dry weights were measured.
Individual plant model
The FSPM used in this work was derived from the GreenLab model (Yan et al., 2004; Cournede et al., 2006), which has been shown to provide a relevant framework to estimate and describe source and sink dynamics during plant growth in order to better understand how biomass allocation affects yield formation. A key feature is the estimation of hidden model parameters by model inversion (Cournede et al., 2011), making it possible to unravel the unobserved production and allocation patterns. It has been successfully calibrated in several species (Letort et al., 2008; Ma et al., 2008), including WOSR (Jullien et al., 2011).
The model simulates the interactions between plant growth and architecture from seedling emergence to harvest, with a daily time step. The model is built from several modules, which are computed each day: (1) biomass production from organ photosynthetic areas; (2) biomass allocation according to a proportional model in which each organ has a given sink; and (3) morphological development, computing the organ dimensions used to deduce the plant’s photosynthetic surface (Cournede et al., 2006).
The meteorological data provide each day d are the temperature T(d) and photosynthetically active radiation [PAR(d)]. Temperatures above a given value, denoted Tbase, are cumulated each day in order to compute the number of new phytomers. A new phytomer appears on a given axis when the number of degree days from the appearance of the previous phytomer exceeds the phyllochron value estimated for this type of axis. In the model, phyllochron values and numbers of different kind of organs on each type of axis are considered as fixed parameters. Phyllochrons and Tbase were estimated independently of the GreenLab model by a statistical model in which the number of leaves is a linear function of thermal time. This is a simplification as more complex models could have been used [see Parent and Tardieu (2012) for examples]. Each day d, the amount Q(d) of net biomass produced by the plant is computed with an empirical equation [eqn (1)] based on the Monteith model (Monteith and Moss, 1977), where kext is the extinction coefficient, S(d) the sum of the plant leaf surface area and pod surface area, D the crop density and PAR(d) the photosynthetically active radiation at day d.
To take into account the effect of temperature on photosynthesis efficiency (Justes et al., 2000), the radiation use efficiency μ(d) is a function of the daily temperature T(d), with two parameters, MRUE and kRUE. When the daily temperature is below the base temperature of WOSR, the radiation use efficiency is zero. This base temperature is also a parameter of the model denoted by TRUE, but is set to 4.5 °C, often considered to be the base temperature for rapeseed (Gabrielle et al., 1998a). The slope kRUE is chosen so that the radiation use efficiency reaches its maximum MRUE when the daily temperature is close to 8 °C.
otherwise μ(d) = 0.
This biomass Q(d) is distributed among plant organs with a proportional model. Each organ has a sink value sO,k(d) according to its type O, its age d (in days) and its physiological age k (Yan et al., 2004). For the WOSR model, five physiological ages were considered: phytomers with short and long internodes on the main stem are of physiological ages 1 and 2, respectively, reproductive organs on the main stem are of physiological age 4, and vegetative and reproductive phytomers of branches are physiological ages 3 and 5, respectively (Jullien et al., 2011). In eqns (3) and (4), O denotes I for internodes, B for leaves and P for pods. Variation in sink with organ age is modelled with a β function (Yin et al., 2003) with four estimated parameters: two coefficients determine the form of the curves (aO, bO), one parameter gives the expansion duration (TO in degree days) and the last one, SO,k, is the sink parameter corresponding to the maximum value of the function [eqns (3) and (4)]. N is a normalization coefficient and AT(d) is the organ’s age in growing degree days since initiation (Bonhomme, 2000):
Estimating both parameters aO and bO together might be difficult due to the correlations between them. Hence, bO is often set to a constant value in GreenLab. Thus, parameter bO was set to 3 and only aO was estimated. However, some authors chose other constraints such as aO+ bO = 5 (Guo et al., 2006; Dong et al., 2008). The parameter TO is the duration between organ appearance and the end of expansion. In this article, the end of organ expansion was computed from several of our experimental data sets (Mathieu et al., 2011) as the date at which organ mass had not increased since the previous measurement date.
The sum of all organ sinks is the plant demand. During its growth period, an organ receives each day a fraction of biomass equal to its sink value divided by the plant demand. Hence, the model is a proportional one, which requires the choice of a reference value. The chosen reference is the maximum leaf sink value at rosette stage, SB,1, taken as equal to 1. In this article, roots receive a constant proportion of the daily biomass production, which can be handled by considering a constant shoot:root ratio.
The photosynthetic surface area used in eqn (1) to compute biomass production is deduced from organ biomass. Leaf surface area is computed as the leaf biomass divided by its leaf mass per area. Pods and leaves have a functioning time during which they remain photosynthetic. Beyond this duration, they do not produce any more biomass, which corresponds to the senescence phase.
Jullien et al. (2009) linked the changes in leaf mass area in the plant to ontogenic stages and demonstrated that variations in leaf mass per area were related to variations in plant demand. To integrate this statement in the model, leaf mass per unit area depends on leaf position l, with a relationship with two parameters ea and eb. It increases from leaf 1 to leaf 12 and then decreases from leaf 12 to leaf 30, if the leaves are number from bottom to top.
otherwise e(l) = ebl + 12(ea − eb) + 0.005.
Likewise, the leaf functioning time and the internode expansion time depend on the phytomer position. These developmental submodels were calibrated on several experimental data sets (Mathieu et al., 2011). The first leaves have a constant functioning time denoted by F0. The upper leaf of rank l has a functioning time F(l) increasing from rank Ft with rank number according to the following equation:
Pods are also considered as photosynthetic organs with a constant functioning time FP.
Lastly, the internode expansion time increases linearly from TI,1°Cd for the first internode to TI,2°Cd for the 14th internode (end of the rosette stage), and then remains constant for the upper ones (internodes of bolting stage). For the other types of organ, their expansion time does not depend on their position in the plant.
When an internode has finished its growth, its biomass can be remobilized to fill pods. The amount of biomass remobilized each day is proportional to the internode current mass, with a constant proportional coefficient denoted QR. Remobilization stops when the internode mass reaches a threshold value, its structural mass.
Generalized least squares criterion
The estimation method used to calibrate the model was generalized least squares (GLS), which corresponds to the maximum likelihood estimation for a multivariate Gaussian model of observation errors. It has been described in detail for the GreenLab model in Cournede et al. (2011). This GLS criterion reaches its minimum value with a set of parameters for which model outputs are as close as possible to the experimental data with respect to weights inversely proportional to the variance of the residual errors for each category of experimental data. Let Y be the vector of outputs of the model under the form of a vector; Yk is the kth component of this vector, and is compared with an observation Ykobs.
The GLS estimator of the model parameters θ = (θ1,…,θp) is the sum of weighted squares of residuals given by eqn (7). In this equation, ωj denotes the weight of the observations computed for each type j of the observed variable as the inverse of the variance of the residuals.
Yobs is the vector of data composed of the following variables, measured on the data set of 2007–08 at each of the eight measurement dates, I stands for the number of each observation of a given type, and j represents the following categories of experimental data:
(1) mass (in g) of (i) internode compartment, (ii) leaf compartment and (iii) pod compartment;
(2) individual organ mass (in g) for (iv) main stem leaves and (v) main stem internodes;
(3) mass (in g) of organs of the ramifications pooled by type of organ (leaf, internode, pod).
Sensitivity analysis strategy
Mathieu et al. (2016) showed that the model was highly non-linear at two periods of the growth cycle (BBCH 13–16 and BBCH 31–37). As the model cannot be considered to be linear, the Sobol method (Sobol, 2001), a global sensitivity method, was applied despite its high computational time (Cariboni et al., 2007). Sensitivity analyses were performed for the GLS criterion as defined in the previous paragraph. The analysis was performed on 26 parameters (Table 1). The bO parameters were fixed in the model as they are highly correlated to parameters aO. The sensitivity analyses process was decomposed into the following four steps (Hamby, 1994).
(1) The first step was to determine for each parameter a nominal value and a probability distribution that reflects the uncertainty of each parameter (Cariboni et al., 2007). Each parameter is indeed considered as a random variable with a probability density function and a range of variation computed based on literature values, results of previous model calibration on WOSR or any other annual species. These choices have a great impact on the results. Helton (1993) suggests that the input range of the parameters is more influential than the probability density function Extensive data sets are required to determine the probability density functions of the 26 parameters of the model. This information was unavailable in our case so we chose to implement a uniform distribution between the bounds of the intervals (Monod et al., 2006).
(2) A large number N of samples of random parameter values were randomly drawn from the defined distribution.
(3) Simulations of the model were performed for each of the N sets of sampled parameters.
(4) The sensitivity indexes were computed for each parameter. The higher the index, the more important the parameter.
Sobol method
The Sobol method is a variance-based method of sensitivity analysis. The sensitivity index of a parameter quantifies its contribution to the variance of the output.
Let Y denote the model and (Xi) the input factors, thus Y = f(X1,X2,…,Xn). The basic idea of the Sobol method is to decompose the function according to the number of interacting factors (Sobol, 1993).
If the input factors are independent, there is a unique decomposition of eqn (4) such that all the summands are orthogonal. The variance of the output variable can thus be decomposed into
where Vi, Vij,V1,2,…,k are the respective variances of the functions fi, fij,f1,2,…,k.
Equation (10) can be divided by V(Y), which gives:
where Si denotes the first-order sensitivity index of input factor Xi and Si,…,j are higher-order indices.
The first-order sensitivity index gives the main effect of the factor to the model output, i.e. without interactions with the other factors, whereas the total sensitivity index stands for the total effect to the model output including all interactions with other factors. Computation of the total sensitivity index allowed quantification of the global importance of each parameter and then ranking parameters. The linearity index, i.e. the sum of all the first-order indices, is an indicator of the linearity of the model. This sum is equal to 1 when there is no interaction between parameters, i.e. when the output of a model varies similarly according to an input, regardless of the values of the other inputs.
The specific estimators for Sobol indices proposed by Wu et al. (2012) and implemented in the PYGMALION modelling platform were used for this study (Cournède et al., 2013). Each index computation relied on 10 000 samples of parameters. Tests with different numbers of samples revealed that parameter ranking was stabilized at 8000 samples.
Model selection method
The second step of our method was to use a statistical selection model. Models could be compared using the AIC (Bodzogan, 1987). This indicator quantifies the relative quality of a statistical model. It depends on the number p of parameters and the distance between model and observation [eqn (12)], given by the maximum of the likelihood function L:
The AIC is a very popular criterion for model selection. It is computed for a given data set and measures the relative quality of statistical models, the model with the minimal AIC being considered as the most probable. It increases with the number of parameters but decreases with the difference between model simulations and observations. Thus it is a compromise between the number of parameters and the quality of the fit, thus promoting model robustness.
The model was first run with only one of the main parameters. At each step, parametric estimation was carried out for individual parameters one by one, and the AIC criterion was computed. Then, the model with the lowest AIC was selected, corresponding to a given set of selected parameters. The same operation was repeated, keeping the parameters already selected and adding each of the other parameters one by one. Again, the model with the lowest AIC was selected. The process stopped when the AIC started increasing with an additional parameter, meaning that the optimal number of parameters had been reached.
When comparing several models indexed by i according to their AICi, it is interesting to note that pi given in eqn (13) is the probability that model i minimizes the loss of information compared with the model with the lowest AIC, denoted AICmin.
Strategy of evaluation
To assess the value of the methodology described above, its calibration results were compared with two usual methods: full calibration, i.e. simultaneous estimation of all the parameters, and step calibration, i.e. estimation of parameters one by one. We tested the first one but did not try the second. Indeed, this method is time-consuming and its empirical nature makes it a little random, as the final set of estimated parameters is highly dependent on the order in which the parameters to be estimated are chosen due to local minima where the estimation algorithm may remain blocked.
Due to the limits of these two methods, the proposed methodology was evaluated by calibration of the model on a new data set used to test the method, i.e. measurements made on the same cultivar but in another year (2012–13) and under other environmental conditions. This will be its typical use case. Measured parameters were adapted to this condition. It concerned the phyllochron value and crop density. For the rosette stage, the phyllochron was set to 35 °Cd, and phyllochron at the bolting stage was set to 13 °Cd, whereas they were 28.53 and 11.75 °Cd, respectively, for Pollen grown in 2007–08.
RESULTS
Parameter nominal values
The first step of the sensitivity analysis method consists in determining the nominal values and ranges of model parameters. As the GreenLab model has been parameterized on several species, reading several publications helped to find the adequate range of variation for each parameter (Table 1).
Values in Table 1 reveal a global robustness of the values of aO and bO parameters, which control the dynamics of the sink demand across species (Table 1). These curves are characterized by their sharpness (correlated to the values of aO and bO) and by their symmetry, correlated to aO/(aO + bO), which is the mode of the function. Even if the analysis was biased by the fact that some parameters were forced to 1, the shapes of the curve give indications about the sink dynamics of each organ for the different species. Figure 1 shows the normalized variations of the sink functions for the different organs of different species. If we excluded Capsicum, for which aO and bO were forced to 1, for dicotyledonous species aO had an average of 3.10 ± 0.46 for leaves and 3.12 ± 0.43 for fruits. For maize, which is a monocotyledonous species, aO was lower (1.55 for leaves and 2.95 for fruits). Values obtained for bO were more variable (2.52 ± 1.74 for leaves and 2.85 ± 1.49 for fruits).
Parameter . | Parameter description . | Nominal value . | Range (minimum – maximum) . | References . | |||||
---|---|---|---|---|---|---|---|---|---|
Equations 1 and 2 (biomass production) | |||||||||
kext | Extinction coefficient | 0.7 | 0.5–1 | WOSR (Gabrielle et al., 1998; Dreccer et al., 2000) | |||||
MRUE | Maximal value of radiation use efficiency | 4.6 g MJ−1 | 2–6 | The range is based on values of radiation use efficiency found in the literature and the nominal based on previous calibration | |||||
kRUE | Related to biomass production | 1 | 0.5–2 | Justes et al., 2000; Vidal, 2013 | |||||
Equations 3 and 4 (biomass allocation) | |||||||||
WOSR (Jullien et al. 2011) | Arabidopsis (Christophe et al., 2008) | Tomato (Dong et al., 2008) | Capsicum (Ma et al., 2011) | Maize (Guo et al., 2006) | Beetroot (Lemaire et al., 2008) | ||||
aB | Shape of the leaf sink function | 1.9 | 1.5–4.5 | 2.43 | 3.1 | 3.3 | 1 | 1.55 | 3.56 |
bB | Shape of the leaf sink function | 3 | Constant | 1.2 | 5.6 | 1.7 | 1 | 3.45 | 2.22 |
aI | Shape of the internode sink function | 3 | 1–4 | 21.7 | 2.6 | 3.5 | 1 | 3.3 | 2.56 |
bI | Shape of the internode sink function | 3 | Constant | 13.1 | 3 | 1.5 | 1 | 1.7 | 1.57 |
afruit | Shape of the fruit sink function | 3 | 2–6 | 2.76 | 3.6 | 3 | 2.4–3.6 | 2.95 | 3.13 (root) |
bfruit | Shape of the fruit sink function | 3 | Constant | 4 | 2.7 | 2 | 3.8–6.6 | 2.05 | 1.15 (root) |
SB,2 | Maximum of the leaf sink, bolting stage | 1 | 0.7–1.3 | (Jullien et al. 2011) | |||||
SB,3 | Maximum of the leaf sink, branch | 0.5 | 0.35–0.75 | WOSR (Jullien et al., 2011), Arabidopsis (Christophe et al., 2008) | |||||
SI,1 | Maximum of the internode sink, rosette stage | 1 | 0.4–1.2 | WOSR (Jullien et al., 2011; Mathieu et al., 2016, and unpublished calibration work) | |||||
SI,2 | Maximum of the internode sink, bolting stage | 2 | 0.5–2.5 | ||||||
SI,3 | Maximum of the internode sink, branch | 2 | 0.5–2.5 | ||||||
SI,4 | Maximum of the internode sink, inflorescence | 3 | 2–4 | ||||||
SP,4 | Maximum of the pod sink, main stem | 1.75 | 1–3 | Mathieu et al., 2016, and unpublished calibration work | |||||
SP,5 | Maximum of the pod sink, branch | 4.70 | 3–10 | ||||||
TB | Leaf expansion time | 180 | 80–260 | Mathieu et al., 2011 | |||||
TI,1 | Internode expansion time, rosette | 100 | 80–260 | Little information available | |||||
TI,2 | Internode expansion time, bolting | 250 | 160–520 | Little information available | |||||
TP | Pod expansion time | 350 | 120–450 | Experimental data (Jullien et al., 2011) | |||||
Equation 5 (leaf thickness) | |||||||||
ea | Slope for leaf thickness, rosette stage | 0.00013 | 0.0001–0.0004 | Jullien et al., 2009; Vidal, 2013 | |||||
eb | Slope at bolting stage | −0.00051 | Constant | ||||||
Equation 6 (leaf functioning time) | |||||||||
F0 | Minimal functioning time | 250 | 200–300 | Mathieu et al., 2011; Vidal, 2013 | |||||
Ft | Change in functioning time | 15 | 10–20 | ||||||
Fs | Slope of functioning time | 17.5 | 12–22 | ||||||
Other parameters | |||||||||
FP | Pod functioning time | 600 | 500–1000 | Experimental data (Jullien et al., 2011) | |||||
Q0 | Initial biomass value | 0.008 | 0.0056–0.0104 | Based on experimental data | |||||
QR | Coefficient of remobilized biomass | 0.0021 | 0.05–0.5 | Based on model calibration |
Parameter . | Parameter description . | Nominal value . | Range (minimum – maximum) . | References . | |||||
---|---|---|---|---|---|---|---|---|---|
Equations 1 and 2 (biomass production) | |||||||||
kext | Extinction coefficient | 0.7 | 0.5–1 | WOSR (Gabrielle et al., 1998; Dreccer et al., 2000) | |||||
MRUE | Maximal value of radiation use efficiency | 4.6 g MJ−1 | 2–6 | The range is based on values of radiation use efficiency found in the literature and the nominal based on previous calibration | |||||
kRUE | Related to biomass production | 1 | 0.5–2 | Justes et al., 2000; Vidal, 2013 | |||||
Equations 3 and 4 (biomass allocation) | |||||||||
WOSR (Jullien et al. 2011) | Arabidopsis (Christophe et al., 2008) | Tomato (Dong et al., 2008) | Capsicum (Ma et al., 2011) | Maize (Guo et al., 2006) | Beetroot (Lemaire et al., 2008) | ||||
aB | Shape of the leaf sink function | 1.9 | 1.5–4.5 | 2.43 | 3.1 | 3.3 | 1 | 1.55 | 3.56 |
bB | Shape of the leaf sink function | 3 | Constant | 1.2 | 5.6 | 1.7 | 1 | 3.45 | 2.22 |
aI | Shape of the internode sink function | 3 | 1–4 | 21.7 | 2.6 | 3.5 | 1 | 3.3 | 2.56 |
bI | Shape of the internode sink function | 3 | Constant | 13.1 | 3 | 1.5 | 1 | 1.7 | 1.57 |
afruit | Shape of the fruit sink function | 3 | 2–6 | 2.76 | 3.6 | 3 | 2.4–3.6 | 2.95 | 3.13 (root) |
bfruit | Shape of the fruit sink function | 3 | Constant | 4 | 2.7 | 2 | 3.8–6.6 | 2.05 | 1.15 (root) |
SB,2 | Maximum of the leaf sink, bolting stage | 1 | 0.7–1.3 | (Jullien et al. 2011) | |||||
SB,3 | Maximum of the leaf sink, branch | 0.5 | 0.35–0.75 | WOSR (Jullien et al., 2011), Arabidopsis (Christophe et al., 2008) | |||||
SI,1 | Maximum of the internode sink, rosette stage | 1 | 0.4–1.2 | WOSR (Jullien et al., 2011; Mathieu et al., 2016, and unpublished calibration work) | |||||
SI,2 | Maximum of the internode sink, bolting stage | 2 | 0.5–2.5 | ||||||
SI,3 | Maximum of the internode sink, branch | 2 | 0.5–2.5 | ||||||
SI,4 | Maximum of the internode sink, inflorescence | 3 | 2–4 | ||||||
SP,4 | Maximum of the pod sink, main stem | 1.75 | 1–3 | Mathieu et al., 2016, and unpublished calibration work | |||||
SP,5 | Maximum of the pod sink, branch | 4.70 | 3–10 | ||||||
TB | Leaf expansion time | 180 | 80–260 | Mathieu et al., 2011 | |||||
TI,1 | Internode expansion time, rosette | 100 | 80–260 | Little information available | |||||
TI,2 | Internode expansion time, bolting | 250 | 160–520 | Little information available | |||||
TP | Pod expansion time | 350 | 120–450 | Experimental data (Jullien et al., 2011) | |||||
Equation 5 (leaf thickness) | |||||||||
ea | Slope for leaf thickness, rosette stage | 0.00013 | 0.0001–0.0004 | Jullien et al., 2009; Vidal, 2013 | |||||
eb | Slope at bolting stage | −0.00051 | Constant | ||||||
Equation 6 (leaf functioning time) | |||||||||
F0 | Minimal functioning time | 250 | 200–300 | Mathieu et al., 2011; Vidal, 2013 | |||||
Ft | Change in functioning time | 15 | 10–20 | ||||||
Fs | Slope of functioning time | 17.5 | 12–22 | ||||||
Other parameters | |||||||||
FP | Pod functioning time | 600 | 500–1000 | Experimental data (Jullien et al., 2011) | |||||
Q0 | Initial biomass value | 0.008 | 0.0056–0.0104 | Based on experimental data | |||||
QR | Coefficient of remobilized biomass | 0.0021 | 0.05–0.5 | Based on model calibration |
Parameter . | Parameter description . | Nominal value . | Range (minimum – maximum) . | References . | |||||
---|---|---|---|---|---|---|---|---|---|
Equations 1 and 2 (biomass production) | |||||||||
kext | Extinction coefficient | 0.7 | 0.5–1 | WOSR (Gabrielle et al., 1998; Dreccer et al., 2000) | |||||
MRUE | Maximal value of radiation use efficiency | 4.6 g MJ−1 | 2–6 | The range is based on values of radiation use efficiency found in the literature and the nominal based on previous calibration | |||||
kRUE | Related to biomass production | 1 | 0.5–2 | Justes et al., 2000; Vidal, 2013 | |||||
Equations 3 and 4 (biomass allocation) | |||||||||
WOSR (Jullien et al. 2011) | Arabidopsis (Christophe et al., 2008) | Tomato (Dong et al., 2008) | Capsicum (Ma et al., 2011) | Maize (Guo et al., 2006) | Beetroot (Lemaire et al., 2008) | ||||
aB | Shape of the leaf sink function | 1.9 | 1.5–4.5 | 2.43 | 3.1 | 3.3 | 1 | 1.55 | 3.56 |
bB | Shape of the leaf sink function | 3 | Constant | 1.2 | 5.6 | 1.7 | 1 | 3.45 | 2.22 |
aI | Shape of the internode sink function | 3 | 1–4 | 21.7 | 2.6 | 3.5 | 1 | 3.3 | 2.56 |
bI | Shape of the internode sink function | 3 | Constant | 13.1 | 3 | 1.5 | 1 | 1.7 | 1.57 |
afruit | Shape of the fruit sink function | 3 | 2–6 | 2.76 | 3.6 | 3 | 2.4–3.6 | 2.95 | 3.13 (root) |
bfruit | Shape of the fruit sink function | 3 | Constant | 4 | 2.7 | 2 | 3.8–6.6 | 2.05 | 1.15 (root) |
SB,2 | Maximum of the leaf sink, bolting stage | 1 | 0.7–1.3 | (Jullien et al. 2011) | |||||
SB,3 | Maximum of the leaf sink, branch | 0.5 | 0.35–0.75 | WOSR (Jullien et al., 2011), Arabidopsis (Christophe et al., 2008) | |||||
SI,1 | Maximum of the internode sink, rosette stage | 1 | 0.4–1.2 | WOSR (Jullien et al., 2011; Mathieu et al., 2016, and unpublished calibration work) | |||||
SI,2 | Maximum of the internode sink, bolting stage | 2 | 0.5–2.5 | ||||||
SI,3 | Maximum of the internode sink, branch | 2 | 0.5–2.5 | ||||||
SI,4 | Maximum of the internode sink, inflorescence | 3 | 2–4 | ||||||
SP,4 | Maximum of the pod sink, main stem | 1.75 | 1–3 | Mathieu et al., 2016, and unpublished calibration work | |||||
SP,5 | Maximum of the pod sink, branch | 4.70 | 3–10 | ||||||
TB | Leaf expansion time | 180 | 80–260 | Mathieu et al., 2011 | |||||
TI,1 | Internode expansion time, rosette | 100 | 80–260 | Little information available | |||||
TI,2 | Internode expansion time, bolting | 250 | 160–520 | Little information available | |||||
TP | Pod expansion time | 350 | 120–450 | Experimental data (Jullien et al., 2011) | |||||
Equation 5 (leaf thickness) | |||||||||
ea | Slope for leaf thickness, rosette stage | 0.00013 | 0.0001–0.0004 | Jullien et al., 2009; Vidal, 2013 | |||||
eb | Slope at bolting stage | −0.00051 | Constant | ||||||
Equation 6 (leaf functioning time) | |||||||||
F0 | Minimal functioning time | 250 | 200–300 | Mathieu et al., 2011; Vidal, 2013 | |||||
Ft | Change in functioning time | 15 | 10–20 | ||||||
Fs | Slope of functioning time | 17.5 | 12–22 | ||||||
Other parameters | |||||||||
FP | Pod functioning time | 600 | 500–1000 | Experimental data (Jullien et al., 2011) | |||||
Q0 | Initial biomass value | 0.008 | 0.0056–0.0104 | Based on experimental data | |||||
QR | Coefficient of remobilized biomass | 0.0021 | 0.05–0.5 | Based on model calibration |
Parameter . | Parameter description . | Nominal value . | Range (minimum – maximum) . | References . | |||||
---|---|---|---|---|---|---|---|---|---|
Equations 1 and 2 (biomass production) | |||||||||
kext | Extinction coefficient | 0.7 | 0.5–1 | WOSR (Gabrielle et al., 1998; Dreccer et al., 2000) | |||||
MRUE | Maximal value of radiation use efficiency | 4.6 g MJ−1 | 2–6 | The range is based on values of radiation use efficiency found in the literature and the nominal based on previous calibration | |||||
kRUE | Related to biomass production | 1 | 0.5–2 | Justes et al., 2000; Vidal, 2013 | |||||
Equations 3 and 4 (biomass allocation) | |||||||||
WOSR (Jullien et al. 2011) | Arabidopsis (Christophe et al., 2008) | Tomato (Dong et al., 2008) | Capsicum (Ma et al., 2011) | Maize (Guo et al., 2006) | Beetroot (Lemaire et al., 2008) | ||||
aB | Shape of the leaf sink function | 1.9 | 1.5–4.5 | 2.43 | 3.1 | 3.3 | 1 | 1.55 | 3.56 |
bB | Shape of the leaf sink function | 3 | Constant | 1.2 | 5.6 | 1.7 | 1 | 3.45 | 2.22 |
aI | Shape of the internode sink function | 3 | 1–4 | 21.7 | 2.6 | 3.5 | 1 | 3.3 | 2.56 |
bI | Shape of the internode sink function | 3 | Constant | 13.1 | 3 | 1.5 | 1 | 1.7 | 1.57 |
afruit | Shape of the fruit sink function | 3 | 2–6 | 2.76 | 3.6 | 3 | 2.4–3.6 | 2.95 | 3.13 (root) |
bfruit | Shape of the fruit sink function | 3 | Constant | 4 | 2.7 | 2 | 3.8–6.6 | 2.05 | 1.15 (root) |
SB,2 | Maximum of the leaf sink, bolting stage | 1 | 0.7–1.3 | (Jullien et al. 2011) | |||||
SB,3 | Maximum of the leaf sink, branch | 0.5 | 0.35–0.75 | WOSR (Jullien et al., 2011), Arabidopsis (Christophe et al., 2008) | |||||
SI,1 | Maximum of the internode sink, rosette stage | 1 | 0.4–1.2 | WOSR (Jullien et al., 2011; Mathieu et al., 2016, and unpublished calibration work) | |||||
SI,2 | Maximum of the internode sink, bolting stage | 2 | 0.5–2.5 | ||||||
SI,3 | Maximum of the internode sink, branch | 2 | 0.5–2.5 | ||||||
SI,4 | Maximum of the internode sink, inflorescence | 3 | 2–4 | ||||||
SP,4 | Maximum of the pod sink, main stem | 1.75 | 1–3 | Mathieu et al., 2016, and unpublished calibration work | |||||
SP,5 | Maximum of the pod sink, branch | 4.70 | 3–10 | ||||||
TB | Leaf expansion time | 180 | 80–260 | Mathieu et al., 2011 | |||||
TI,1 | Internode expansion time, rosette | 100 | 80–260 | Little information available | |||||
TI,2 | Internode expansion time, bolting | 250 | 160–520 | Little information available | |||||
TP | Pod expansion time | 350 | 120–450 | Experimental data (Jullien et al., 2011) | |||||
Equation 5 (leaf thickness) | |||||||||
ea | Slope for leaf thickness, rosette stage | 0.00013 | 0.0001–0.0004 | Jullien et al., 2009; Vidal, 2013 | |||||
eb | Slope at bolting stage | −0.00051 | Constant | ||||||
Equation 6 (leaf functioning time) | |||||||||
F0 | Minimal functioning time | 250 | 200–300 | Mathieu et al., 2011; Vidal, 2013 | |||||
Ft | Change in functioning time | 15 | 10–20 | ||||||
Fs | Slope of functioning time | 17.5 | 12–22 | ||||||
Other parameters | |||||||||
FP | Pod functioning time | 600 | 500–1000 | Experimental data (Jullien et al., 2011) | |||||
Q0 | Initial biomass value | 0.008 | 0.0056–0.0104 | Based on experimental data | |||||
QR | Coefficient of remobilized biomass | 0.0021 | 0.05–0.5 | Based on model calibration |

Normalized sink variation functions as calibrated by the GreenLab model for three organ types (leaves, fruit, internodes) for different species, with parameters given in Table 1. The shape of the curve depends on the two parameters a and b. The expansion duration is normalized by dividing by the duration of the expansion for each species and organ.
Parameter values of the internode sink functions were different for WOSR compared with the other dicotyledonous species (aO, 7.59 ± 4.79 with WOSR included and 2.89 ± 0.53 with WOSR excluded; bO, 2.89 ± 0.53 with WOSR included and 2.02 ± 0.85 with WOSR excluded). In WOSR, these parameters have been fitted on long internodes of the main stem and ramifications, for which elongation duration is very fast and concentrated in a short period (~1 month for the main stem and 2 weeks for all the ramifications), whereas the vegetative phase lasts ~6 months. Hence, the tight internode sink variation function observed in Fig. 1 could be due either to the abruptness of bolting and ramification for this species or to the difficulty of estimating the expansion duration TI.
Based on these observations, nominal values were set to 3 for bO and aO except for leaves, for which aO was set to 1.9 (Table 1). Nominal values of the other parameters are given in Table 1. Most of them were computed from our experimental observations on WOSL or previous calibration of our model (; Mathieu et al., 2011) – maximum values for the sink functions, organ expansion and functioning times. As the maximum values of organ sink depend on organ physiological ages, their values could not be compared with those of the other plants calibrated with the GreenLab model, most of which were monoculm. As far as kext, MRUE and kRUE are concerned, their nominal values were found in the literature independently of the GreenLab model (Gabrielle et al., 1998b; Justes et al., 2000)
Sensitivity analysis on GLS criterion
The linearity index quantifies the linearity of the GLS with respect to the inputs. With 26 input parameters for sensitivity analysis, the linearity index is 0.12. This low value suggests that there are strong interactions between the input parameters. The linearity index is also very sensitive to the number of parameters. The weakness of this value indicates that the GLS criterion cannot be linearized and that its minimization is thus difficult. The result of the parameter estimation could be very sensitive to initial parameter values.
The Sobol method provides a total sensitivity index and first-order sensitivity index (Table 2). Comparisons between these two values revealed a high level of interactions between parameters, which is coherent with a low linearity index. The number of parameters to select depends on the threshold below which indices are considered to be negligible. In our analysis, some indices were slightly negative, which resulted from the Monte Carlo numerical estimation error and gave an indication of the numerical accuracy of the indices. Indeed, as indices cannot be negative, these slightly negative values could be assimilated to a null value. Hence slightly positive values of the same order could also be assimilated to a null value. Therefore, total sensitivity indices were considered to be significantly positive when their positive value was greater than the absolute value of the most negative index (light grey boxes in Table 2).
First-order and total sensitivity indexes for the main model parameters. Parameters are ranked with respect to their total indexes. Parameters with significant total-order indexes are highlighted in light grey, and among them, those also having significant first-order indexes are highlighted in dark grey
Parameter . | First-order index . | Total order index . |
---|---|---|
MRUE | 0.0550 | 0.74091 |
aI | 0.00376 | 0.55338 |
aB | 0.03986 | 0.39250 |
kext | 0.01269 | 0.33163 |
kRUE | −0.00346 | 0.13254 |
SI,1 | 0.00033 | 0.09984 |
TB | −0.00158 | 0.07686 |
SI,2 | −0.00081 | 0.06493 |
F0 | 0.00127 | 0.05529 |
TI,1 | 0.00251 | 0.04797 |
ea | 0.00184 | 0.02570 |
Ft | 0.00166 | 0.01035 |
SB,2 | 0.00011 | 0.00816 |
TI,2 | 0.00136 | 0.00772 |
Q0 | 0.00007 | 0.00739 |
Fs | 0.00050 | 0.00684 |
SP,5 | 0.00000 | −0.00541 |
TP | 0.00000 | −0.00578 |
aP | 0.00000 | −0.00579 |
SI, 3 | 0.00000 | −0.00591 |
SP,4 | 0.00000 | −0.00612 |
SI,4 | 0.00000 | −0.00612 |
SB,3 | 0.00000 | −0.00630 |
FP | 0.00000 | −0.00631 |
SI,5 | 0.0000 | −0.00641 |
QR | 0.00018 | −0.00836 |
Parameter . | First-order index . | Total order index . |
---|---|---|
MRUE | 0.0550 | 0.74091 |
aI | 0.00376 | 0.55338 |
aB | 0.03986 | 0.39250 |
kext | 0.01269 | 0.33163 |
kRUE | −0.00346 | 0.13254 |
SI,1 | 0.00033 | 0.09984 |
TB | −0.00158 | 0.07686 |
SI,2 | −0.00081 | 0.06493 |
F0 | 0.00127 | 0.05529 |
TI,1 | 0.00251 | 0.04797 |
ea | 0.00184 | 0.02570 |
Ft | 0.00166 | 0.01035 |
SB,2 | 0.00011 | 0.00816 |
TI,2 | 0.00136 | 0.00772 |
Q0 | 0.00007 | 0.00739 |
Fs | 0.00050 | 0.00684 |
SP,5 | 0.00000 | −0.00541 |
TP | 0.00000 | −0.00578 |
aP | 0.00000 | −0.00579 |
SI, 3 | 0.00000 | −0.00591 |
SP,4 | 0.00000 | −0.00612 |
SI,4 | 0.00000 | −0.00612 |
SB,3 | 0.00000 | −0.00630 |
FP | 0.00000 | −0.00631 |
SI,5 | 0.0000 | −0.00641 |
QR | 0.00018 | −0.00836 |
First-order and total sensitivity indexes for the main model parameters. Parameters are ranked with respect to their total indexes. Parameters with significant total-order indexes are highlighted in light grey, and among them, those also having significant first-order indexes are highlighted in dark grey
Parameter . | First-order index . | Total order index . |
---|---|---|
MRUE | 0.0550 | 0.74091 |
aI | 0.00376 | 0.55338 |
aB | 0.03986 | 0.39250 |
kext | 0.01269 | 0.33163 |
kRUE | −0.00346 | 0.13254 |
SI,1 | 0.00033 | 0.09984 |
TB | −0.00158 | 0.07686 |
SI,2 | −0.00081 | 0.06493 |
F0 | 0.00127 | 0.05529 |
TI,1 | 0.00251 | 0.04797 |
ea | 0.00184 | 0.02570 |
Ft | 0.00166 | 0.01035 |
SB,2 | 0.00011 | 0.00816 |
TI,2 | 0.00136 | 0.00772 |
Q0 | 0.00007 | 0.00739 |
Fs | 0.00050 | 0.00684 |
SP,5 | 0.00000 | −0.00541 |
TP | 0.00000 | −0.00578 |
aP | 0.00000 | −0.00579 |
SI, 3 | 0.00000 | −0.00591 |
SP,4 | 0.00000 | −0.00612 |
SI,4 | 0.00000 | −0.00612 |
SB,3 | 0.00000 | −0.00630 |
FP | 0.00000 | −0.00631 |
SI,5 | 0.0000 | −0.00641 |
QR | 0.00018 | −0.00836 |
Parameter . | First-order index . | Total order index . |
---|---|---|
MRUE | 0.0550 | 0.74091 |
aI | 0.00376 | 0.55338 |
aB | 0.03986 | 0.39250 |
kext | 0.01269 | 0.33163 |
kRUE | −0.00346 | 0.13254 |
SI,1 | 0.00033 | 0.09984 |
TB | −0.00158 | 0.07686 |
SI,2 | −0.00081 | 0.06493 |
F0 | 0.00127 | 0.05529 |
TI,1 | 0.00251 | 0.04797 |
ea | 0.00184 | 0.02570 |
Ft | 0.00166 | 0.01035 |
SB,2 | 0.00011 | 0.00816 |
TI,2 | 0.00136 | 0.00772 |
Q0 | 0.00007 | 0.00739 |
Fs | 0.00050 | 0.00684 |
SP,5 | 0.00000 | −0.00541 |
TP | 0.00000 | −0.00578 |
aP | 0.00000 | −0.00579 |
SI, 3 | 0.00000 | −0.00591 |
SP,4 | 0.00000 | −0.00612 |
SI,4 | 0.00000 | −0.00612 |
SB,3 | 0.00000 | −0.00630 |
FP | 0.00000 | −0.00631 |
SI,5 | 0.0000 | −0.00641 |
QR | 0.00018 | −0.00836 |
Three parameters had a positive first-order index and a high total index (dark grey boxes in Table 2): MRUE, kext and aB. MRUE and kext are parameters of eqns (1) and (2), which compute the daily amount of biomass produced by the plant, and aB drives the shape of the sink function for leaves [eqn (3)]. Five other important parameters were related to biomass allocation [eqn (3)]: aI drives the shape of the curve of the sink function according to time for internodes, and TI,1 and TB are the expansion durations for the initial set of internodes and leaves, respectively. As the parameters aO drive the shape of the sink variation functions, the relative dynamics of biomass allocation within organs appeared to be more important in the model than the maximum values of the sink (parameters SO,k for an organ of type O and physiological age k). Internode sink value parameters (SI,1, the sink of internodes appearing at the rosette stage and SI,2 for those appearing at the bolting stage) showed important total order indices, but with low first indices.
It is noticeable that the variations of these parameters that drive the allocation between organs during the rosette phase also control indirectly the dynamics of the leaf surface and thus biomass production. All the parameters related to the bolting and reproductive phases (physiological ages 3–5) had a lower sensitivity index.
Sensitivity analysis applied to the GLS loss criterion allowed us to select 11 parameters out of 26. However, this is still a high number of parameters for model calibration. Hence, in the next section we refine this selection by sensitivity analysis using a second step relying on the AIC.
Model selection with the AIC
Different combinations of parameters of different sizes were chosen to be estimated, while the others were fixed to their nominal values. The AIC was then computed for each scenario. Note that in our situation with a large number of observations, there is little difference between the original AIC and its corrected version introduced for small sample sizes (Burnham and Anderson, 2004). The AIC was first computed for models with only one parameter, for each of the 11 parameters (the other parameters being fixed to their nominal values). The lowest AIC was reached with the parameter of sink value for internodes appearing at rosette stage SI,1. Secondly, parameter estimation was run with SI,1 and any of the other parameters. The AIC was lowest for SI,1 and the extinction coefficient kext, and so on (Table 3). Finally, the best AIC was obtained with three parameters: SI,1, kext and MRUE. This step made it possible to reduce dramatically the number of parameters from 11 to three.
Best combination of parameters with the lowest values of AIC for every number of selected parameters from 0 to 6. To increase the number of parameters, the previously selected parameters were kept, and all the others were tested one by one. The lowest AIC was reached with three parameters
Number of parameters . | AIC . | Selected parameters . | |||||
---|---|---|---|---|---|---|---|
0 | 420.43 | ||||||
1 | 406.4 | SI,1 | |||||
2 | 398.634 | SI,1 | kext | ||||
3 | 398.634 | SI,1 | kext | MRUE | |||
4 | 399.895 | SI,1 | kext | MRUE | ea | ||
5 | 400.733 | SI,1 | kext | MRUE | ea | aB | |
6 | 402.499 | SI,1 | kext | MRUE | ea | aB | aI |
Number of parameters . | AIC . | Selected parameters . | |||||
---|---|---|---|---|---|---|---|
0 | 420.43 | ||||||
1 | 406.4 | SI,1 | |||||
2 | 398.634 | SI,1 | kext | ||||
3 | 398.634 | SI,1 | kext | MRUE | |||
4 | 399.895 | SI,1 | kext | MRUE | ea | ||
5 | 400.733 | SI,1 | kext | MRUE | ea | aB | |
6 | 402.499 | SI,1 | kext | MRUE | ea | aB | aI |
Best combination of parameters with the lowest values of AIC for every number of selected parameters from 0 to 6. To increase the number of parameters, the previously selected parameters were kept, and all the others were tested one by one. The lowest AIC was reached with three parameters
Number of parameters . | AIC . | Selected parameters . | |||||
---|---|---|---|---|---|---|---|
0 | 420.43 | ||||||
1 | 406.4 | SI,1 | |||||
2 | 398.634 | SI,1 | kext | ||||
3 | 398.634 | SI,1 | kext | MRUE | |||
4 | 399.895 | SI,1 | kext | MRUE | ea | ||
5 | 400.733 | SI,1 | kext | MRUE | ea | aB | |
6 | 402.499 | SI,1 | kext | MRUE | ea | aB | aI |
Number of parameters . | AIC . | Selected parameters . | |||||
---|---|---|---|---|---|---|---|
0 | 420.43 | ||||||
1 | 406.4 | SI,1 | |||||
2 | 398.634 | SI,1 | kext | ||||
3 | 398.634 | SI,1 | kext | MRUE | |||
4 | 399.895 | SI,1 | kext | MRUE | ea | ||
5 | 400.733 | SI,1 | kext | MRUE | ea | aB | |
6 | 402.499 | SI,1 | kext | MRUE | ea | aB | aI |
Model recalibration in another condition
To see whether these three parameters were sufficient, the model was calibrated on data measured on the Pollen variety in 2012–13.
First, all the 26 initial parameters were estimated simultaneously. The GLS criterion algorithm did not work with the 26 parameters, as it could not invert a matrix. The fact that the GLS algorithm is less efficient with a larger number of parameters is explained by the existence of more local minima for larger numbers of parameters, causing some numerical difficulties for the estimation procedure.
The mode was calibrated with the three parameters selected in the Sensitivity analysis on GLS criterion section. The GLS criterion was significantly decreased from 207.1 to 67.1 with an AIC of 73.1 as a result of this recalibration step, with the three estimated parameters. The initial parameter values were MRUE = 4.6, kext = 0.7 and SI,1 = 1 and the newly estimated ones were MRUE = 3.1, kext = 0.67 and SI,1 = 0.45. These values are in the proposed range but MRUE and SI,1 are quite different from their nominal values. Moreover, it is important to note that the amount of available data was lower for 2012–13 (only three dates compared with eight in 2007–08 and no measurements during the rosette stage).
The calibration results were quite satisfying at the compartment scale (Fig. 2A) except for the final plant biomass, which was underestimated by the model. But it was not very good at the organ scale (Fig. 2B). Hence, we tested to estimate in addition the parameter ea, which drives leaf thickness, and is the fourth parameter that could be selected by this method of model selection. From eqn (13), the probability that the model with these four parameters and AIC equal to 399.895 (Table 3) minimizes the loss of information is 0.53 times the probability that this is the model with three parameters and AIC equal to 398.634. This probability seemed high enough to allow us to test the four-parameter model. Indeed, the parameter ea decreased to 10−6, which induced a decreased in the GLS criterion to 58.3 with an AIC of 60.3 and a better correlation between observed leaf mass and the simulated one. The root mean square error decreased from 657 to 184. Figure 3 presents the results of this last calibration.

Simulated values versus observed values for the 2012–13 data set. In (A) red symbols represent plant whole biomass, green symbols the mass of internodes, black symbols the mass of leaves and blue symbols the mass of pods. In (B) main symbols represent the individual leaf masses of the main stem. Circles represent day 181 after emergence, + represents day 223 after emergence and squares represent day 255. The continuous line is the first bisector.

Simulated and observed leaf mass according to phytomer rank for the 2012–13 data set, with a recalibration step including four parameters. Lines represent the model simulations at the three dates (continuous lines for day 181 after emergence, dashed lines for day 223 and dotted lines for day 255) and symbols represent the experimental data (circles for day 181 after emergence, + for day 223 and squares for day 255).
DISCUSSION
Model calibration is a critical phase in model definition (Janssen and Heuberger, 1995). This paper proposes a two-step methodology to reduce the number of parameters to estimate: a sensitivity analysis on the GLS criterion followed by a reduction in the number of parameters by model selection. Sensitivity analysis is usually carried out on model outputs (Makowski et al., 2006; Perez et al., 2018). The originality of this paper is the use of sensitivity analysis to guide parametric estimation and thus to apply it directly to the criterion used in parameter estimation. This makes it possible to identify parameters that can improve model consistency with data and that should be included within the calibration as a priority. The methodology allowed us to reduce the number of parameters from 26 to three. It increased the efficiency of model recalibration on another set of experimental data. It saved time in the calibration process, which is an advantage in calibrating the model in different situations (Razavi et al., 2010). It provided key information to help parameterization and to help in the systematic recalibration of complex models such as FSPMs.
Another advantage of decreasing the number of parameters was to reduce their uncertainty; indeed, ‘estimating too many parameters can decrease model predictive quality’ (Wallach et al., 2001). Furthermore, the question of model identifiability is crucial with complex models with a large number of parameters (Walter and Pronzato, 1996). In addition to numerical problems often related to the inversion of a large matrix, the criterion function is more likely to reach a local minimum in the minimization algorithm. Because of correlations and compensations between parameters, the criterion function may reach the same values with different sets of parameters, some of which may be aberrant. This is confirmed by the high non-linearity of the GLS criterion.
The sensitivity analysis method was based on several hypotheses. The first one is the definition of the range and nominal values for each parameter. This is a crucial point as it has a great impact on the results of the methods (Saltelli et al., 2006). In our study, the search for these values was carried out through an analysis of previous calibration work or experimental work. Regarding the sink variation functions [eqn (3)], the nominal values were determined by analysis of these parameter values across the species for which GreenLab had been calibrated. This result gave an interesting insight into the variability and robustness of parameters and thus into the dynamics between the relative demands of leaves, internodes and fruits. It was noticeable that values were comparable for the different species, with a difference for the aO parameter between dicotyledonous and the monocotyledonous species. This comparison opens interesting avenues in ecophysiological studies on the interactions between the dynamics of sink growth and biomass allocation. Regarding the calibration process, this interspecific comparison makes it possible to provide nominal values for other case studies of sensitivity analysis.
In sensitivity analysis, the question of how many parameters should be selected by the sensitivity analysis method is decided by comprehensively considering the index values. To our knowledge, there is no exact decision rule or threshold to determine whether a sensitivity index is sufficient to select a given parameter. Similarly, the initial choice of parameters is a matter for debate. As an example, the base temperature was set to a constant value in this study, whereas the genotypic variation of base temperature has been studied for several species, such as Miscanthus (Farrell et al., 2006) and wheat (Slafer and Rawson, 1995). In these studies, however, the base temperature was a statistical parameter estimated from a linear model between the sum of daily temperatures and the number of organs. Recent studies have considered non-linear models that reduced the genotypic variability of base temperature between crop cultivars (Parent and Tardieu, 2012).
An interesting perspective of this work concerns the use of plant models to study genotype × environment interactions, with the objective of characterizing each genotype by a unique set of parameter values (Tardieu, 2003; Hammer et al., 2006). A current bottleneck in this regard resides in the difficulty of estimating parameters precisely enough to be able to reach clear conclusions in terms of parameter variability. The proposed sensitivity analysis based on estimation could potentially play a key role in helping to reduce uncertainty, even though further work is necessary. Parameters with higher sensitivity indexes do not necessarily correspond to parameters with larger genotypic variance. What is true, however, is that it is more difficult to identify inter-genotypic variability for parameters with lower sensitivity indexes, not because it does not exist but for statistical reasons, as has been shown by Baey et al. (2018). On the contrary, it can be expected that parameters with a known inter-genotype variance are characterized by high sensitivity indices and should be among the parameters to estimate as a priority.
Our results were guided by the choice of the GLS criterion and hence the choice of the observed data with which model outputs were compared. In our case, the GLS criterion was the sum of weighted squares of differences between measured and simulated biomasses of different organs. Furthermore, by analysing the influential parameters, to the first order, the quality of the model was found to depend mostly on the carbon input. All the parameters with high sensitivity indices were related directly or indirectly to light interception and conversion into biomass. A previous study quantified the impact of model parameters on model outputs (Mathieu et al., 2016). The hierarchy of the parameters was slightly different. For example, if the model output is the whole-plant biomass, the parameter with the highest total index is aB, whereas this parameter has the third highest total index with the GLS criterion. Conversely, the parameter kRUE is the seventh parameter when considering the whole-plant biomass, whereas it was the fifth one in this study. When considering the internode biomass as model output, the internode sink SI,2 has a higher impact compared with the current study. Anyway, globally speaking, parameters selected in this paper were similar to those to which model outputs are sensitive, which strengthened the need to estimate them accurately and reinforced the importance of carrying out the sensitivity analyses on one criterion.
The evaluation of our methodology involves comparison with reference methods to calibrate the model. In the situation where one has to estimate a large number of parameters, there is no reference method other than an empirical trial-and-error approach, starting from reasonable values and trying to improve the fit little by little. Calibration is classically done by estimating the parameters one by one. In our situation, such a method is time-consuming and may lead to very different results according to the order in which the parameters are chosen for the successive calibrations. Even if by expertise or luck we manage to find the proper parameter order with which to implement this strategy, there is little chance that the final optimal set of parameters obtained will actually be a global optimum; this is a well-known problem in numerical optimization, which led to the theory of stochastic gradient descent, for example (Robbins and Monro, 1951).
The other solution is to make an attempt at the full calibration, that is to say all parameters simultaneously, which, similarly, rarely works unless a large quantity of data is available. In our situation, we were confronted with numerical problems related to practical non-identifiability issues. In some situations, the minimization algorithm converged to local minima, which is a common problem in the calibration of complex models. Heuristic methods can be used to avoid this problem of local minima. However, in practice, the algorithms often abut on the limits of the field of exploration, at least for certain parameters, which, in the case of the FSPM, can give non-biological values, such as a negative sink value.
Regarding the estimation of the three parameters selected by the proposed methodology, the results were satisfactory at the compartment scale but not at the individual organ scale.
To overcome this problem, the error model on which the GLS method relies could be improved, probably with a little underestimation of the variance terms related to the compartment model (and thus an overweighting of the likelihood to the detriment of the fit at the organ scale).
However, in this study, the model was deliberately simplified at the organ scale to be able, in the medium term, to simulate the growth of several plants and integrate competition functions between them. A possibility for a user who needs a more detailed description at the organ level can be to increase the model complexity by modelling more detailed physiological processes. Even then, keeping the model as it is, the proposed methodology could provide an initial point at which the user could carry on the estimation of the sink function parameters that play a role in biomass allocation among organs and thus in the determination of individual organ biomasses. Having an initial solution close enough to the observed data is an improvement in itself.
The model calibration on the test data set (2012–13) gave estimated parameter values in the proposed range but MRUE and SI,1 were quite different from their nominal values. MRUE decreased by a third in the model recalibration. This can be explained by a higher density in 2012–13 (42 versus 20 plants m−2 in 2007–08) or by differences in phenology due to differences in temperature. The year 2007–08 was colder than 2012–13, but the cumulated photosynthetically active radiation during the whole crop cycle was higher than in 2012–13. Hence, the number of leaves increased slowly, and each leaf had more time to develop without concurrence of the younger one, which can explain the higher leaf surface area (Mathieu et al., 2016). An equivalent total biomass is produced for a higher quantity of radiation intercepted during a longer period of time; it is thus consistent with a lower efficiency of radiation conversion and also with the smaller internode sink at the rosette stage for 2012–13. It is important to note that the amount of available data was lower for 2012–13 (only three dates compared with eight in 2007–08 and no measurements during the rosette stage), hence calibration might be less precise.
The model selection method concludes by selecting three parameters. However, the differences in the AIC between the models with three and four parameters were quite low (Table 3). The probability that the model with four parameters minimizes the GLS was only 0.53 times that of the model with three parameters. This is why the calibration with four parameters was tested, which increased the quality of the results at the organ scale. It is indeed interesting to note this probabilistic interpretation of the AIC criterion when recalibrating the model in another condition. Other statistical criteria could have given different results (Konishi and Kitagawa, 1996; Kadane and Lazar, 2012). However, in the context of model regression, the AIC criterion is generally preferred to other criteria since it is asymptotically optimal (Yang, 2005) for the least mean squared error. Moreover, its interpretation in terms of information theory is usually helpful in concrete applications (the optimal design of experiments, for example).
Conclusions
With the proposed parameterization methodology, the number of parameters used in the recalibration process of plant growth models was dramatically reduced, which increased its efficiency. Moreover, by estimating only a limited number of parameters, the estimation uncertainty is automatically reduced, thus inducing a gain in model robustness. We believe this methodology opens avenues for the key issue of the application of complex and highly parameterized FSPMs to real data. With increased recalibration efficiency in new situations, a straightforward consequence is an increased capacity of FSPMs to analyse and compare the plasticity of genotypes. Although it has been tested on only one case study, the method could be easily further tested with other complex models.
A perspective of the proposed strategy is the possibility of reducing experimental data collection in the recalibration process by reducing the number of parameters to estimate, which is of course of interest if our objective is to apply the model to a large number of genotype × environment situations.
ACKNOWLEDGEMENTS
We thank Marc Bidon, Alain Fortineau, Christelle Franchet and Julie Rodrigues, among others, for technical experimental support. The study was supported by National Natural Science Foundation of China (NSFC grant No. 31600290).