A method for using monthly average temperatures in phenology models for grapevine ( Vitis vinifera L.)

In recent years, there have been increasing efforts to link phenology models with seasonal climate predictions in so-called Decision Support Systems (DSS) to tailor crop management strategies. However, temporal discrepancies between phenology models with temperature data gathered on a daily basis and seasonal forecasting systems providing predictability on monthly scales have limited their use. In this work, we present a novel methodology to use monthly average temperature data in phenology models. Briefly stated, we modelled the timing of the appearance of specific grapevine phenological phases using monthly average temperatures. To do so, we computed the cumulative thermal time (S f ) and the number of effective days per month ( eff d ). The eff d is the number of days in a month on which temperatures would be above the minimum value for development ( T b ). The calculation of eff d is obtained from a normal probability distribution function derived from historical weather records. We tested the methodology on four experimental plots located in different European countries with contrasting weather conditions and for four different grapevine cultivars. The root mean square deviation (RMSD) ranged from 4 to 7 days for all the phenological phases considered, at all the different sites, and for all the cultivars. Furthermore, the bias of observed vs predicted comparisons was not significantly different when using either monthly mean or daily temperature values to model phenology. This new methodology, therefore, provides an easy and robust way to incorporate monthly temperature data into grapevine phenology models.


INTRODUCTION
"Phenological development is the most important attribute of crop adaptation" (Sadras et al., 2009). Knowing the phenology of a plant helps to understand the climatic limitations and yield potential of a crop (Jones and Davis, 2000;Ritchie and Nesmith, 1991). Moreover, from a crop management perspective, knowledge of plant phenology helps to plan field tasks like pesticide applications, pruning or irrigation based on plant water demands (Jones and Davis, 2000). It is not, therefore, surprising that historically speaking, there has been huge interest in modelling plant phenology Wang, 1960).
In addition, phenological modelling is a very useful tool for studying the response of plant development to climate variability. It is well known that plant development is almost proportional to temperature (Ritchie and Nesmith, 1991). In grapevines, several approaches have been proposed to model phenology based on the calculation of thermal time (Parker et al., 2013;Parker et al., 2011). Thermal time computes the time of the year when a phenological phase takes place based on the total accumulated temperature above a threshold. The thresholds refer to a temperature below which development ceases, the base temperature (T b ) (Parker et al., 2011;Ritchie and Nesmith, 1991).
The main advances in phenology modelling took place in the last part of the 20th century . Interest was stimulated by increasing concerns about global warming . Nowadays, the negative effects of global warming are more evident than ever. For instance, some of the premium wine regions in Europe have already passed their optimum temperature for growing (Jones et al., 2005), and this trend is expected to continue in the future. IPCC projections present a global scenario in which temperatures will probably rise by more than the projected 1.5 o C (RCP4.5, RCP6.0, and RCP8.5) threshold by the end of the present century. This will also be accompanied by an increase in the frequency of extreme weather events (IPCC, 2014;IPCC, 2018). These changes have created a high degree of uncertainty in crop management as yields and berry quality may be compromised (Jones and Davis, 2000;Jones et al., 2005). Faced with such uncertainty, the thermal time approach has been introduced, in combination with weather forecasting, to predict short-term phenology.
Over longer time scales, seasonal predictions have been highlighted as a potential planning tool for the agriculture sector (Hansen et al., 2011) and phenology in particular (Ceglar and Toreti, 2021). Seasonal predictions are forecasts of climate conditions for periods ranging from one month to one year into the future. These forecasts are possible due to the existence of slowly evolving systems (such as oceans, snow cover, arctic ice, etc.) that interact with the atmosphere and favour certain states as opposed to others (see Hansen et al. (2011) or Doblas-Reyes et al. (2013). Within this framework, the capacity to use seasonal forecasts to anticipate temperature behaviour several months into the future could potentially place phenological predictions at the forefront of efforts to mitigate the negative effects of climate change in agriculture. Examples of the use of seasonal forecasts include adjusting sowing dates or predicting the appearance of pests and diseases and can be found in the existing literature (Hansen and Indeje, 2004;Régnière and Bolstad, 1994;Sporleder et al., 2008). In the case of grapevines, Santos et al. (2020) used seasonal forecasts to estimate yields in the Douro Region of Portugal. However, to the best of our knowledge, and as also highlighted by Santos et al. (2020), there has been no published information about the use of seasonal forecasts to predict grapevine phenology. Probably, the lack of adoption of seasonal forecasts to model phenology has been due to both the intrinsic uncertainty of this information, highly dependent on region, variable and season (Weisheimer and Palmer, 2014), as well as the temporal mismatch between the predictability provided by seasonal forecasts and the data required by grapevine phenology models. Although seasonal forecast systems can provide daily data, predictability on seasonal scales is centred on monthly to multi-month scales, so the sign is displayed in these aggregations (Doblas-Reyes et al., 2013). However, seasonal forecasting systems suffer uncertainties and biases in their predictions; these need to be corrected before they can be used (Doblas-Reyes et al., 2006). The existence of this temporal gap between phenology models, which requires daily or sub-daily temperature data, and seasonal forecasts presents a hurdle to providing seasonal phenology forecasts that can be used as a decision-making tool for grapevine management.
The objective of the present paper is to compare the performance of a phenology model when using either daily average temperatures or to use a novel methodology to digest monthly average temperatures. The robustness of the proposed approach is tested using a multiple-year phenological dataset collected from four grapevine cultivars grown in four different areas of Europe.

Phenology forecasts
To test the performance of the methodology proposed, we used the Spring Warming (SW) model (Hunter and Lechowicz, 1992). The SW model has been widely used in grapevines (Costa et al., 2019;Parker et al., 2011;Zapata et al., 2017). The model calculates the accumulated thermal time (S f , ºC d) for a given period as the sum of the development rate ) computed from daily average temperatures ( ). The model requires three parameters: base temperature (T b ), thermal time or forcing units required to reach a phenology stage (F * ), and starting date to accumulate temperature (t 0 ).
where y is the date when the target phenological event occurs.

Estimation of phenology forecast from monthly mean temperatures
As previously indicated, the thermal time model requires daily values of mean air temperature to compute ), but the objective of our research is to model phenology using average monthly temperatures ( ). To address this, we propose the monthly downscaling approach (DMA). The DMA calculates the S f from the rate of development computed from and the number of effective days per month (eff d ). We define the eff d as the number of days in a particular month that had a daily temperature above T b . (3) where n is the number of months, m is the starting month and ) is the rate of development of the m th month using a monthly mean temperature. The eff d was calculated as follows: first, we used historical weather records to obtain the standard deviation of the monthly temperature distributions (σ Tm ). Then, we computed the cumulative distribution function (cdf) for each month, from , σ Tm and T b . The eff d was obtained by multiplying the number of days of that month (n days ) by 1 minus cdf.

Site description
The analysis was performed at four experimental plots located in three different countries. The experimental plots were: Vilariça (Portugal), Raïmat (Spain), La Orden (Spain), and Mirabella Eclano (Italy) ( Figure 1). The model was validated for the following four cultivars of Vitis vinifera L.: 'Touriga Nacional', 'Tempranillo', 'Aglianico', and 'Chardonnay'. At each experimental plot, the phenology was recorded every week. The phenology was tracked according to the scale proposed by Baggiolini (1952) for bud break, bloom, fruit set, and veraison. Weather data were retrieved from nearby weather stations; these either belonged to the winery itself or were public. At each site, the plants were managed according to standard local winery practices. Table 1 summarises the information obtained at the different sites.

Calibration of the SW model
We carried out a specific calibration of the SW model for each cultivar and site. For each phenological phase and year, we calculated S f from Equation 1 by computing ) from t 0 until the recorded date of the phenological phase in the corresponding year. The F * for the corresponding phenology phase was calculated by averaging all the S f . T b was from Moncur et al. (1989), who set a T b of 3.5 ºC for bud break and 7.1 ºC for the rest of the phenological phases. The t 0 was set for January 1st. The F * values obtained from the SW calibration process were the used in the DMA approach.

Statistical analysis of model performance
We explored the performance of the DMA by computing the root mean square deviation (RMSD). The RMSD (Equation 6) represents the mean deviation of the predicted values (P) to the observed ones (O) (Kobayashi and Salam, 2000).

Testing the model compatibility
We computed the phenology using both the SW model ( ) and the DMA ( ). In both models, we applied the F * values derived from the SW calibration (Section 4). Temperature values were obtained from the selected weather stations at each location. The were calculated by averaging for the corresponding month. A regression analysis was performed comparing phenology forecasts from the SW model and from the DMA. In the regression analysis, we tested for slope = 1 and intercept = 0. If the null hypothesis for the slope is rejected, the DMA will have no consistency with the SW model. If the null hypothesis for the intercept FIGURE 1. Location of the study sites.
is rejected, then the DMA is biased. Finally, if both null hypotheses are not rejected, the bias observed is due to the unexplained variance (Piñeiro et al., 2008). The significance level for the test was 0.05. Moreover, we evaluated DMA's performance using the Willmot index of agreement (d) and Theil's partial inequality coefficients (U bias , U slope , and U error ) (Piñeiro et al., 2008;Willmott, 1984). The d evaluates the model's performance based on a ranking that varies from 0 (no agreement) to 1 (perfect fit), using both predicted (P) and observed (O) values (Equation 7). For the purpose of the performance test, P values corresponded to the phenology forecasts from the DMA and O values from the SW model.
Theil's partial inequality coefficients separate the total prediction error into three different proportions to assess the performance of the regression. These proportions are: the bias of the prediction (U bias ); the slope of the fitted model against the identity line (U slope ), and the unexplained variance (U error ) (Paruelo et al., 1998).
where n is the number of values, obs and pre are the observed and predicted values, and are the averages of the observed and predicted values, and est are the values estimated from the fitted regression model.
The compatibility of SW and the DMA was also tested by applying a two-sample t-test to the residuals of each calibrated model (Kalpić et al., 2011). These residuals were obtained from comparisons between the observations and predictions (from SW or DMA) at the different sites. The analysis was also applied to the four cultivars. Considering that the number of phenological records was rather limited (see Table 1), testing was conducted on a cultivar-wise basis. Our objective was to demonstrate that the two approaches, SW and DMA, were compatible. With this aim, the null hypothesis (H 0 ) was that the mean difference of residuals between the two models would be 0 (or compatible with 0). Conversely, the alternative hypothesis (H a ) was that the DMA and SW would significantly differ from 0 and, consequently, would not be compatible.

Sensitivity analysis of the model
We performed an analysis to explore the effect of the number of years for the historical records on the calculation of eff d (Figure 6). The analysis focused on the sites with the largest data set, Raïmat, Vilariça and Mirabela Eclano. At each site, we changed the number of years to compute eff d, starting with three years and subsequently increasing the number of years by two. For each number of years, we forecast the period of phenological records (Table 1) 100 times. In each forecast, the combination of years for eff d calculation was changed. For instance, in Raïmat, when the number of years for eff d calculation was three, the phenology was predicted from [(1997, 2005, 2004); (1999,2010,2000);…; (2002,2011,2014)]. Figure 2 presents the frequent temperature distribution obtained from historical weather records (Table 1). They all show bi-modal behaviour, with a primary probability maximum at around 11 ºC and a secondary maximum at around 21 ºC. The amplitude of these maximums and the frequencies of the extreme values are site-dependent. La Orden was the least extreme of the four, with its highest probabilities corresponding to the central temperature range. Vilariça, on the other hand, had the warmest extremes and was the site where bi-modal behaviour was least prominent. Raïmat was the coldest of the four sites, with its highest probabilities being in the lower temperature range. Finally, Mirabella Eclano was similar to La Orden but slightly cooler. These very different characteristics made the sites interesting to explore the functionality of the proposed approach under different temperature accumulation regimes. La Orden had the lowest variability in , presenting an average difference between the maximum and minimum of 7.4 ºC. This difference contrasts with the 11.9 ºC in Vilariça or the 12.5 ºC in Mirabela Eclano. The mean is above T b before bud break at all the sites, with Raïmat having the closest average to T b before bud break.   Table 2 reports the results of the calibration procedure performed at the different sites. 'Chardonnay' presented the lowest F * values at bud break, whereas 'Aglianico' had the highest. We obtained different F * values in the two sites for 'Tempranillo'. La Orden had a higher F * than Raïmat in all the phenology phases except for veraison. When we summed the F * for all the phenology phases, the largest value was for Aglianico and the lowest for Chardonnay (Table 2).

Model performance at different locations
The DMA performed well for all the cultivars and locations (Figure 4). The RMSDs were between 4 and 6 days and depended on the cultivar and location (Figure 4). However, the model's performance changed from year to year and for FIGURE 2. The probability distribution for the average daily temperature at the four experimental sites.

FIGURE 3.
The daily evolution of average, minimum, and maximum mean air temperatures at the four study sites from the historical records is presented in Table 1. Horizontal lines indicate the base temperature (T b ) used before bud break (solid) and after bud break (dashed).
the different phenology events (Figure 4). For instance, in La Orden, the RMSDs ranged from 3.3 to 8.4 days. In 2017, in La Orden, we observed an unusual deviation of the prediction at fruitset and veraison (21 and 17 days difference from the observation). However, the same bias was observed when we used the SW model instead of the DMA. The statistical regression analysis confirms the null hypothesis (p-Intercept and p-Slope > 0.05). The DMA predicts phenology with the same variance as the SW model. Most of the observed variance can be attributed to unexplained errors (U error = 0.78). Furthermore, all the RMSD comparisons for the different phenological stages were very similar.

Model comparison and compatibility analysis
Modelling phenology with the DMA provided slightly better results than the SW model, except for bud break (Table 3). When using the DMA, the highest RMSD was obtained at bud break, whereas with the SW model, the highest RMSD was at veraison. The lowest RMSDs were observed at bloom in both models.   The significance level established for testing the model's compatibility was 0.05 (5 %) (Fisher and Yates, 1938). Since the test is two-sided, as the difference can be either positive or negative, the p val needed to reject the H 0 is 0.025. Thus, the next step was to compute the different parameters of the test, the standard error, degrees of freedom, mean residual value, the t-statistic, and the p val corresponding to this statistic. Table 4 summarises these outcomes separately for each cultivar. Although the mean residuals were higher for the DMA, we could not reject the H 0 for any of the models because the p val was always above the level of significance (0.025). Consequently, both models were compatible for all of the situations studied.
We conducted a sensitivity analysis to test the effect of varying the number of years of historical weather records on the models' performance. We decided to exclude La Orden from the analysis because the number of years was low.
However, we found no differences in RMSD when we compared the predictions with the observations (Figure 6). Changing the number of years of the historical records from 3 to more than 30 (Mirabella Eclano) produced a slight fluctuation in the RMSD but no apparent trend.

DISCUSSION
We present a novel approach to incorporate monthly mean air temperatures into phenology models, opening the way for more straightforward use of seasonal forecasts in the prediction of grapevine phenology and its implementation into DSSs. Several examples of the use of seasonal forecasts to schedule agronomic tasks can already be found in the literature. For instance, seasonal precipitation forecasts can be used to optimise the date at which maize is sown (Hansen and Indeje, 2004). The incorporation of temperature and precipitation   forecasts into pest and disease models can be used to tailor crop protection strategies (Régnière and Bolstad, 1994;Sporleder et al., 2008). In the case of grapevines, seasonal weather forecasts have proven to be useful for estimating wine production in the Douro region (Santos et al., 2020). However, the use of seasonal data to forecast grapevine phenology is still lacking (Santos et al., 2020). One of the complexities in the use of seasonal predictions to forecast phenology is that it relies on conserving the monthly signal when introducing daily data into phenological models. We have presented a downscaling approach that will conserve the signal as well as any hypothetical bias correction that may need to be applied to monthly data.
The two phenological models, based on and , achieved similar results. Indeed, the residual comparison of the two approaches using real observations showed that they could not be regarded as being incompatible at the 5 % confidence level (Table 4). This means that we can use both approaches to compute phenological stages and expect to obtain similar performances (similar RMSDs) for different grape varieties and across different phenological stages and sites (Table 3). The regression analysis confirms this result ( Figure 5). The statistical indexes for slope and intercept and Theil's test confirmed that the bias observed in Figure 5 was a consequence of unexplained variance and not related to poor performance of the DMA (Piñeiro et al., 2008). It is important to bear in mind that we used the same F * and T b in both approaches. The F * was derived from inverse modelling of the SW model using . According to the results in Figure 5 and   a useful alternative when missing data are large enough to prevent any interpolation and gaps are distributed in a way that will not affect the monthly mean value.
The use of a statistical approach to perform a site-specific calibration requires a larger dataset than the ones available at the study sites . However, our main goal was not to test the accuracy of the DMA but to compare it with a well-established model for grapevine phenology forecasting. To overcome data limitations, we decided to use a T b from the literature. The T b selected was proposed by Moncur et al. (1989). To the best of our knowledge, this is the only work that has used an experimental approach to directly obtain T b from the linear regression between temperature and phase duration for a wide range of grapevine varieties. The use of a proper T b through a site-specific calibration would certainly improve DMA performance.
Equation 3 presents the general calculation approach. However, in only two cases was below T b , Raïmat 2017 and 2019.We acknowledge that this is a limited number of cases, and there is a need to explore more situations in which is below T b . Especially at sites with cooler climates than the ones used in the present study. Perhaps, at cooler sites with sudden temperature changes, the model will tend to underestimate S f during the winter season. The magnitude of the bias in phenology forecasting associated with S f underestimation will depend on the T b selected (Parker et al., 2011). However, our study sites provided a good representation of temperate dry-summer growing regions according to the Köppen-Geiger climate classification (Peel et al., 2007). At other sites with similar climates, the model should perform as well as at the study sites.
The methodology proposed in the present paper is based on probability distribution functions for monthly temperatures obtained from meteorological records. Climate change can alter monthly temperature distributions, but we think that the rate of change will be slow enough not to compromise the DMA performance. The updated definition of 'present-day' climate provided by the World Meteorological Organization (WMO, 2017) is formally represented by the meteorological statistics of the period 1991-2020 (Hulme, 2020). In two of the four sites studied (Raïmat and La Orden), the historical weather records are within the 'present-day' definition (Table 2). Interestingly, the RMSDs remained in the same range in the other two sites (Mirabella Eclano and Vilariça). We found no significant trends in the monthly average temperature in the historical weather records presented in Table 2. Possibly, alteration in monthly temperature distribution occurs at a lower rate than the frequency of extremes. Figure 6 supports this hypothesis. We found no differences in model performance when varying the number of years used for eff d calculations. However, we recommend keeping the historical weather records for eff d calculations within the 'present-day' definition by the WMO to track climate change alterations of monthly mean temperatures.

CONCLUSIONS
We have presented a downscaling monthly approach (DMA) to replace by with further modification in F * . The DMA, tested at different study sites and varieties, was able to forecast various phenology phases with an adequate level of error. The DMA was also able to maintain the same bias as the reference spring warming model (SW). Therefore, one could use the DMA and expect the same prediction accuracy as the SW model. The next step will be to examine the utility of the present methodology when seasonal forecasts are used.