Assessing grapevine phenological models under Chinese climatic conditions

The objective of this work was to carry out a preliminary assessment of the performance of different models for the simulation of three main phenology stages (budburst, flowering and veraison) of grapevine in China. This work utilised observations from five representative wine regions (Changli, Laixi, Shangri-La, Xiaxian, and Yanqi) and four widely cultivated grape cultivars (Cabernet-Sauvignon, Cabernet franc, Merlot, and Chardonnay) in China. The corresponding daily temperature data were used to simulate the timing of grape phenology stages using the different phenological models. The simulation dates and the actual dates were compared and the performance of the models was assessed for the different cultivars and wine regions. The GDD 10 model exhibited the best performance for budburst simulation in soil-burying regions, irrespective of the cultivar and location. For flowering and veraison, the optimal model varied in performance between cultivars and locations, and non-linear models exhibited better performance than linear models. In general, the performance of these models was better for the latter two stages than for budburst. The models with relatively good performance were selected for further calibration using the limited Chinese observations. The impact of soil-burying management on budburst simulation was also discussed. These results highlight the strengths of some phenological models for use in China. This study also reiterates the strong need for the establishment of a grapevine phenology observation network in China to obtain more comprehensive data.


INTRODUCTION
Phenological models are important tools with a wide range of applications in grapevine cultivation. Models can be applied in the short-term planning of viticultural practices, with a focus on the timing of treatments for different grape growth and development stages, like pest management (Galvan et al., 2009) and irrigation (Basile et al., 2012). Phenological models have also become useful tools for projecting the impacts of future climate change on viticulture (Duchêne et al., 2010). Obvious phenological changes have been observed in different wine regions and varieties (Bock et al., 2011; Duchêne and Schneider, 2005; García de Cortázar-Atauri et al., 2017; Lisek, 2008; Urhausen et al., 2011; Webb et al., 2011). Rising temperature, combined with advancing phenological phases, will profoundly affect many different processes during plant growth, altering both overall quality and yield (van Leeuwen and Darriet, 2016; Webb et al., 2011). Given that grape phenology is a good indicator of a changing climate, phenological models can also be used for climate reconstruction to identify past climate change (Chuine et al., 2004; García de Cortázar-Atauri et al., 2010b; Yiou et al., 2012).
These general applications highlight the importance of phenological models to properly describe grapevine growth and development. Numerous mechanistic models, or process-based models, have been recently developed to study grapevine phenology with the assumption that temperature is the main regulator of phenological development (Duchêne et al., 2010; García de Cortázar-Atauri et al., 2009; Nendel, 2010; Parker et al., 2011). These models are driven by chilling (for autumn and winter conditions) and forcing units (for spring and summer conditions) using various time-steps (e.g., daily or hourly temperature). These units are accumulated from a starting date (usually a phenological stage), and when they reach a critical threshold, the phenological stage (often judged at 50 % level of appearance) occurs (Parker et al., 2011).
The grapevine developmental cycle is typically described in the three main phenological stages of budburst, flowering, and veraison (Duchêne et al., 2010). "Maturity" is usually not considered a phenological stage because of the difficulty in accurately defining the time of maturity (Duchêne et al., 2010). Although sugar content is considered to be a good indicator of maturity, it can be influenced by many other factors in addition to climate (Jackson and Lombard, 1993).
The dormancy period is classically described by the three main phases of paradormancy, endodormancy, and ecodormancy (Lang, 1987; Sarvas, 1974). In process-based models, the first phase (paradormancy) is not usually included, and the models mostly start on a fixed date (e.g., 1st August in García de Cortázar-Atauri et al., 2009). Endodormancy and ecodormancy periods are classically described based on chilling and forcing temperatures respectively. Previous reports have used one of two types of models to simulate budburst. The first does not include the endodormancy phase and only takes into account a forcing model to calculate budburst (Duchêne et al., 2010; Nendel, 2010). These models start from a predefined fixed date (which differs for each model), and they calculate budburst using a linear (e.g., growing degree-days) or nonlinear (e.g., sigmoid) function. The second type of model describes both endodormancy and ecodormancy phases, using chilling and forcing functions that can be combined using different hypotheses (Chuine et al., 2013). For example, the BRIN model (García de Cortázar-Atauri et al., 2009) sequentially takes endodormancy and ecodormancy into account, while Caffarra's model (Caffarra and Eccel, 2010) takes into account the possibility of an interaction between phases.
For flowering and veraison simulation, only heat units are taken into account, but models vary according to the starting date of the calculation. Models can start with a fixed date, like the Grapevine Flowering Veraison model (GFV) (Parker et al., 2013), or they can start from the previous phenological date, like the Growing Degree-days model (García de Cortázar-Atauri et al., 2009), the curvilinear WE model (Wang and Engel, 1998), and the Sigmoidal model (Caffarra and Eccel, 2010).
In general, the applicability of these models needs to be adapted and evaluated under local conditions. In recent years, some process-based phenological models have been parameterised and validated based on a large phenological database that includes measurements from different vineyards worldwide (Cuccia et al., 2014; Duchêne et al., 2010; García de Cortázar-Atauri et al., 2009; García de Cortázar-Atauri et al., 2010a; Parker et al., 2011). This validation is required for further use of these models to accurately predict the timing of grapevine development under a changing climate.
An increasing number of wineries have sprung up all over China, presenting a new framework in the Chinese wine industry (Banks and Overton, 2010). Wine grapes have been widely cultivated in 179 counties in China, with a total area of 163,200 ha in 2016 (Wang et al., 2018). Despite this widespread cultivation, to our knowledge, there has been no report of the use of phenological models to study grapevine cultivation in China. Unlike most classical wine regions in the world, which are dominated by a Mediterranean climate or an oceanic climate, China has a totally different climate regime, defined as typical continental monsoon climate characterised by hot and rainy summers, and cold and dry winters (Li et al., 2011). Approximately 90 % of the vineyards in China have to be covered with soil to ensure plant survival in winter (Li and Wang, 2015). Therefore, it is important to assess the performance of phenological models under Chinese climatic conditions.
The main aim of this study was to assess the ability of a set of phenological process-based models to simulate the main grape phenological stages of four cultivars under Chinese climatic conditions. This work is the first study performed using observed phenological data collected from Chinese vineyards.

MATERIALS AND METHODS
The different data and models used herein are described in this section.

Observed phenological data
The observed phenological data of the three main stages (budburst, flowering, and veraison) were obtained for four cultivars (Cabernet-Sauvignon, Cabernet franc, Merlot, and Chardonnay) grown in five wine regions in China. Each phenological stage was judged at 50 % level of appearance. The locations of these five regions are shown in Figure 1. The basic information of the phenological data for each wine region is summarised in Table 1. For Changli, the data were from two wineries covering two different time periods. For Laixi, the data were from one winery; Cabernet-Sauvignon was grown in two different plots during the same time period, therefore the average values of the two plots were used. For Shangri-La, data were from one winery at an altitude of 2000 to 2200 m. Because of the severe winter, grapevines in 90 % of vineyards in China need to be covered with soil to varying degrees in order to successfully overwinter (Li and Wang, 2015). Shangri-La is the only non-soil-burying region included in this study. The five study locations can be categorised into three different climatic types. The dates of the phenological stages are expressed as day of year (DOY).

Meteorological data
To simulate the timing of different phenological stages during the same period, daily temperature observations were used as input variables.
As no meteorological data have been collected in vineyards, meteorological data from nearby weather stations were used in the study. These data were obtained from the China Meteorological Administration, along with daily observation records of temperature (mean, maximum, and minimum) and geographical information. The climate characteristics of each wine region, the location of weather stations, and the distance from the weather station to the corresponding vineyard are described in Table 1.

FIGURE 1. The locations of the five wine regions in China.

Phenological models
Several candidate models were selected, because they are simple enough to ensure parsimony of input parameters, which makes it possible to test these models with limited data.Additionally, the parameter values of these models for the selected cultivars are available from previous studies.Table 2 provides an overview of the candidate models for each phenological phase, as well as the parameter values for available cultivars and the related original references.As it was not possible to obtain the parameter values of every model for all four varieties from literature, we have only listed the available cultivars for each model.

Growing Degree Day model
The Growing Degree Day model (GDD) is based on the classical thermal time concept (Bonhomme, 2000). This model calculates the cumulative daily temperatures above a temperature threshold (base temperature), usually starting from a given date (Equation 1). In this model, a phenological stage occurs when the sum of forcing units reaches a critical state F crit :

$$\sum_{n=t_0}^{N} \max\bigl(T(n) - T_b,\ 0\bigr) \geq F_{crit} \qquad (1)$$

where t 0 is the starting date in Day of Year (DOY) format, N is the date of a phenological stage (here budburst, flowering, or veraison) in DOY format, T(n) is the temperature for day n, and T b is the base temperature above which the thermal summation is calculated.
For the budburst calculation, the t 0 value was set as 1st January, T(n) was the daily mean temperature, F crit varied between cultivars, and T b was fixed at 5 °C and 10 °C for the GDD5 and GDD10 models respectively (García de Cortázar-Atauri et al., 2009).
For the flowering and veraison calculation, we used two models: the GFV (Grapevine Flowering Veraison) model and the GDD10 model. The GFV model (Parker et al., 2011) is a version of the GDD model for which the parameters t 0 , T b and F crit have been fitted based on a large database of observations predominantly from vineyards in Western Europe (France, Italy, Switzerland and Germany). This model uses daily mean temperatures, and the parameters t 0 = 60 and T b = 0 °C are the same for all cultivars, while F crit differs between cultivars (Parker et al., 2013). In the GDD10 model, the previous stage is used as the starting date, with the mean daily temperature as input and T b = 10 °C (García de Cortázar-Atauri, 2006).
We also explored the linear model proposed by Duchêne et al. (2010) (GDD Duchêne ).This model is similar to the original GDD, but it calculates the accumulation using daily maximum temperatures with different T b values for the different phases.The parameter values were fixed according to those obtained by Duchêne et al. (2010), but were only available for Cabernet-Sauvignon in this study.
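The thermal summation described for the GDD family can be illustrated with a minimal Python sketch. The function name `gdd_stage_doy`, the toy temperature series and the F crit value of 50 are illustrative assumptions and are not taken from this study, in which all simulations were run with the PMP software.

```python
from typing import Optional, Sequence


def gdd_stage_doy(temps: Sequence[float], t0: int, t_base: float,
                  f_crit: float) -> Optional[int]:
    """Return the first day of year (DOY) on which the accumulated
    degree-days reach f_crit, or None if the threshold is never reached.

    temps[i] holds the daily temperature for DOY i + 1; t0 is the DOY on
    which the accumulation starts (t0 itself is included in the sum).
    """
    total = 0.0
    for doy in range(t0, len(temps) + 1):
        # Only the part of the temperature above the base contributes.
        total += max(temps[doy - 1] - t_base, 0.0)
        if total >= f_crit:
            return doy
    return None


# Hypothetical series: 100 days at 8 °C followed by 100 days at 15 °C.
toy_temps = [8.0] * 100 + [15.0] * 100

print(gdd_stage_doy(toy_temps, t0=1, t_base=10.0, f_crit=50.0))  # 110
print(gdd_stage_doy(toy_temps, t0=1, t_base=5.0, f_crit=50.0))   # 17
```

With the same threshold, the lower base temperature starts accumulating during the cool period and reaches F crit much earlier. Under the same assumptions, supplying daily maximum temperatures instead of daily means would correspond to the GDD Duchêne variant, and setting `t0` to the previously simulated stage would correspond to the GDD10 calculation for flowering and veraison.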

Caffarra's model
This model combines several sub-models which have been selected and simplified by Caffarra and Eccel (2010) to simulate the three main phenological stages for the Chardonnay cultivar.
For budburst, the calculation was based on a parallel two-phase model (Chuine, 2000).The accumulation of chilling temperatures (using a normal function) started from 1st September (Equation 2) and the Dormancy Break (DB) stage is calculated once the chilling threshold has been reached (C crit ).The accumulation of forcing units then starts (using a sigmoid function given as Equation 3) until the threshold (F crit ) is reached, which is calculated using the C crit value (Equation 4).
where t 0 is the starting date, N is the date of a certain phenological stage (dormancy break or budburst), T m is the mean daily temperature, a and c are parameters to describe the temperature response to calculate chilling units, co1 and co2 are parameters describing the relationship between the critical chilling (C crit ) and the forcing units required (F crit ) to reach the budburst stage.
For flowering and veraison, the calculation only takes into account forcing units, using a sigmoid function (Equation 5), and daily mean temperatures. For these stages, the starting point (t 0 ) is the previously calculated stage.

BRIN model
The BRIN model takes into account the dormancy period. This model combines two original models (García de Cortázar-Atauri et al., 2009): Bidabe's Cold Action model (Bidabe, 1965a, b) to calculate the dormancy period, and the Richardson model (Richardson, 1974; Richardson et al., 1975) to calculate the post-dormancy period.
Cold action is based on the Q 10 concept, where an arithmetic progression of 10 °C in temperature causes an action with a geometric regression of ratio Q 10 . Dormancy break (DB) occurs when the amount of daily chilling units (C U ) reaches a critical state (C crit ) (Equation 6),
where t 0 is the starting date, N is the date of dormancy break, T x(n) is the maximum temperature for day n, and T n(n) is the minimum temperature for day n.
The accumulation of C U starts on 1st August of the previous year, and parameter Q 10 was fixed at 2.17 for all cultivars (García de Cortázar-Atauri et al., 2009).
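As an illustration of the cold action calculation, the sketch below assumes the form commonly given for Bidabe's function, in which each day contributes Q 10 ^(-T x /10) + Q 10 ^(-T n /10) chilling units. This form is an assumption made for the example, and Equation 6 remains the reference for the expression used in this study. The function names and the C crit value of 10 are hypothetical.

```python
from typing import Optional, Sequence


def cold_action_units(t_max: float, t_min: float, q10: float = 2.17) -> float:
    """Daily chilling units from the daily maximum and minimum temperature.

    Assumed form: Q10 ** (-Tx / 10) + Q10 ** (-Tn / 10), so that colder
    days contribute more units than warmer days.
    """
    return q10 ** (-t_max / 10.0) + q10 ** (-t_min / 10.0)


def dormancy_break_index(tx: Sequence[float], tn: Sequence[float],
                         c_crit: float, q10: float = 2.17) -> Optional[int]:
    """Index (0 = first day of accumulation, e.g. 1st August) of the day on
    which the summed chilling units reach c_crit, or None if never reached."""
    total = 0.0
    for day, (t_max, t_min) in enumerate(zip(tx, tn)):
        total += cold_action_units(t_max, t_min, q10)
        if total >= c_crit:
            return day
    return None


# Hypothetical example: ten days with Tx = Tn = 0 °C give 2 units per day.
print(dormancy_break_index([0.0] * 10, [0.0] * 10, c_crit=10.0))  # 4
```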
After calculation of the Dormancy Break, the model starts to calculate the growing degree hours using Richardson's model (Equation 7). The budburst date (N BB ) is reached when the sum of growing degree hours reaches a critical amount (G c ) (Equation 7).
The hourly temperature of day n [T (h, n)] can be calculated by linear interpolation between the maximal and minimal temperatures of day n and day n+1, and by assuming a day length of 12 h (as shown in Figure 2) (Equation 8).
The linear response is limited by two temperatures: T low , which is the minimal threshold of the plant response, and T high , above which the plant response stays at its maximum value (Equation 9). The T low and T high parameters are fixed at 5 °C and 25 °C, respectively (García de Cortázar-Atauri et al., 2009).
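The post-dormancy calculation can be sketched as follows. The interpolation scheme is an assumption made for the example: temperature is taken to rise linearly from the minimum of day n to its maximum over the first 12 h, then to fall linearly to the minimum of day n+1 over the following 12 h. Equations 7 to 9 and Figure 2 remain the reference for the scheme actually used. All function names and the G c value of 600 are hypothetical.

```python
from typing import List, Optional, Sequence


def hourly_temps(tn_today: float, tx_today: float, tn_next: float) -> List[float]:
    """24 hourly temperatures for one day (assumed 12 h rise, 12 h fall)."""
    temps = []
    for h in range(1, 25):
        if h <= 12:
            temps.append(tn_today + h * (tx_today - tn_today) / 12.0)
        else:
            temps.append(tx_today - (h - 12) * (tx_today - tn_next) / 12.0)
    return temps


def growing_degree_hours(temps: Sequence[float], t_low: float = 5.0,
                         t_high: float = 25.0) -> float:
    """Sum of hourly responses: zero below t_low, linear between t_low and
    t_high, and capped at (t_high - t_low) above t_high."""
    return sum(min(max(t - t_low, 0.0), t_high - t_low) for t in temps)


def budburst_index(tx: Sequence[float], tn: Sequence[float], start: int,
                   g_crit: float) -> Optional[int]:
    """Index of the day on which accumulated growing degree hours, summed
    from the dormancy-break index `start`, reach g_crit."""
    total = 0.0
    # The last day is skipped because it has no following minimum.
    for day in range(start, len(tx) - 1):
        total += growing_degree_hours(hourly_temps(tn[day], tx[day], tn[day + 1]))
        if total >= g_crit:
            return day
    return None


# Hypothetical example: constant 10 °C gives 24 h x 5 °C = 120 units per day.
print(budburst_index([10.0] * 10, [10.0] * 10, start=0, g_crit=600.0))  # 4
```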

Wang and Engel's model
A non-linear model was proposed by Wang and Engel (1998) (WE; García de Cortázar-Atauri et al., 2010a). The curvilinear structure of this model can take into account negative effects of high temperatures on grapevine development (Equation 10), where F(T) is the daily rate of thermal summation within the range from 0 to 1, t 0 is the starting date of the forcing, N is the date of a phenological stage when F crit is reached, and T min , T opt , and T max refer to minimum, optimum, and maximum temperature thresholds respectively. Temperatures below the minimum or above the maximum threshold are considered to have no effect on the plants.
Temperatures T min and T max were fixed at 0 °C and 40 °C respectively; the remaining parameter values, including T opt , are given in Table 2. Table 2 provides an overview of the parameter values used in the aforementioned models for different phases and cultivars, as well as the corresponding references.
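The curvilinear response can be illustrated with the form of the Wang and Engel function that is commonly cited in the literature; it is given here as an assumption, and Equation 10 remains the reference for this study. The value T opt = 28 °C, the F crit value and the function names are hypothetical and chosen only for the example.

```python
import math
from typing import Optional, Sequence


def wang_engel_rate(t: float, t_min: float, t_opt: float, t_max: float) -> float:
    """Daily development rate between 0 and 1.

    The rate is 0 at or outside [t_min, t_max], rises to 1 at t_opt and
    falls back to 0 at t_max. Requires t_min < t_opt < t_max.
    """
    if t <= t_min or t >= t_max:
        return 0.0
    alpha = math.log(2.0) / math.log((t_max - t_min) / (t_opt - t_min))
    x = (t - t_min) ** alpha
    x_opt = (t_opt - t_min) ** alpha
    return (2.0 * x * x_opt - x * x) / (x_opt * x_opt)


def we_stage_index(temps: Sequence[float], start: int, f_crit: float,
                   t_min: float, t_opt: float, t_max: float) -> Optional[int]:
    """Index of the day on which the summed daily rates reach f_crit."""
    total = 0.0
    for day in range(start, len(temps)):
        total += wang_engel_rate(temps[day], t_min, t_opt, t_max)
        if total >= f_crit:
            return day
    return None


# Hypothetical optimum of 28 °C with the fixed 0 °C and 40 °C thresholds.
for temp in (10.0, 28.0, 35.0, 42.0):
    print(temp, round(wang_engel_rate(temp, 0.0, 28.0, 40.0), 3))
```

Unlike a degree-day sum, the rate at 35 °C in this example is lower than the rate at 28 °C, which is how the model penalises temperatures above the optimum.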

Statistical tests
Three statistical criteria were selected to evaluate the performance of different phenological models: the root mean square error (RMSE), the efficiency of the model (EF), and the mean bias error (MBE) (Caffarra and Eccel, 2010; Parker et al., 2011).
The RMSE provides information about the mean error of the prediction of the model (Equation 11):

$$RMSE = \sqrt{\frac{\sum_{i=1}^{N} (S_i - O_i)^2}{N}} \qquad (11)$$

where S i is the simulated date, O i is the observed date, and N is the number of observations.
The efficiency of the model (EF) provides an estimation of the variance of the observations explained by the model (Equation 12). If EF is 0 or less, the model does not explain any of the variation; the maximum value of 1 corresponds to a perfect model.

$$EF = 1 - \frac{\sum_{i=1}^{N} (S_i - O_i)^2}{\sum_{i=1}^{N} (O_i - \bar{O})^2} \qquad (12)$$

where S i is the simulated date, O i is the observed date, N is the number of observations, and Ō is the mean value of the observations.
The MBE is the average prediction error, representing the systematic error of the model (under- or over-prediction):

$$MBE = \frac{\sum_{i=1}^{N} (S_i - O_i)}{N}$$

where S i is the simulated date, O i is the observed date, and N is the number of observations. A positive value of MBE indicates a systematic overestimation by the model (simulated dates later than observed), and a negative value of MBE indicates a systematic underestimation of the actual observations.
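The three criteria can be computed with a few lines of Python. The sketch below uses the standard definitions of RMSE, model efficiency and mean bias error, with simulated minus observed dates so that a positive MBE means simulations later than observations; the function names and the example dates are hypothetical.

```python
import math
from typing import Sequence


def rmse(sim: Sequence[float], obs: Sequence[float]) -> float:
    """Root mean square error between simulated and observed dates (days)."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))


def efficiency(sim: Sequence[float], obs: Sequence[float]) -> float:
    """Model efficiency: 1 for a perfect model, 0 or less when the model
    does no better than the mean of the observations.

    Undefined (division by zero) when all observations are identical.
    """
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((s - o) ** 2 for s, o in zip(sim, obs))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def mbe(sim: Sequence[float], obs: Sequence[float]) -> float:
    """Mean bias error: positive when simulated dates are later on average."""
    return sum(s - o for s, o in zip(sim, obs)) / len(obs)


# Hypothetical dates (DOY): every simulation is two days late.
observed = [100, 110, 120]
simulated = [102, 112, 122]
print(rmse(simulated, observed))                  # 2.0
print(mbe(simulated, observed))                   # 2.0
print(round(efficiency(simulated, observed), 2))  # 0.94
```

Because EF divides by the spread of the observations, it becomes unstable or undefined when only a handful of similar dates are available for a site.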

Simulation and calibration
To calculate the timing of different phenological stages, all these models were run using the Phenological Modelling Platform (PMP) software, version 5.5 (Chuine et al., 2013). The performances of these models were compared between different cultivars and sites based on the above-mentioned statistical criteria.
The models that gave the best results were optimised using PMP 5.5 for their parameter F crit with the above-mentioned Chinese phenological data for each cultivar.PMP 5.5 uses the simulated annealing algorithm of Metropolis (Metropolis et al., 1953) to optimise the parameters of different functions.
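The idea of recalibrating a single parameter can be illustrated as follows. PMP relies on simulated annealing; the sketch below deliberately swaps in an exhaustive grid search over candidate F crit values, which is sufficient for a one-parameter GDD model and is not the procedure used in this study. The temperature series, observed dates and function names are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple


def gdd_doy(temps: Sequence[float], t0: int, t_base: float,
            f_crit: float) -> Optional[int]:
    """First DOY on which degree-days accumulated from t0 reach f_crit."""
    total = 0.0
    for doy in range(t0, len(temps) + 1):
        total += max(temps[doy - 1] - t_base, 0.0)
        if total >= f_crit:
            return doy
    return None


def calibrate_f_crit(series: Sequence[Sequence[float]], observed: Sequence[int],
                     t0: int, t_base: float,
                     candidates: Sequence[float]) -> Optional[Tuple[float, float]]:
    """Return (f_crit, rmse) for the candidate with the lowest RMSE.

    series[k] is the daily temperature series matching observed[k].
    Candidates that fail to reach the stage in any series are skipped;
    ties are resolved in favour of the first candidate tried.
    """
    best = None
    for f_crit in candidates:
        sims = [gdd_doy(temps, t0, t_base, f_crit) for temps in series]
        if any(s is None for s in sims):
            continue
        err = math.sqrt(sum((s - o) ** 2 for s, o in zip(sims, observed))
                        / len(observed))
        if best is None or err < best[1]:
            best = (f_crit, err)
    return best


# Two hypothetical site-years: a cool one (12 °C) and a warm one (15 °C).
series = [[12.0] * 200, [15.0] * 200]
observed = [50, 20]
candidates = [float(f) for f in range(50, 151)]
print(calibrate_f_crit(series, observed, t0=1, t_base=10.0,
                       candidates=candidates))  # (99.0, 0.0)
```

With so few observations, several neighbouring F crit values can fit equally well, which is one reason why a calibration on a small dataset should be interpreted with caution.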

RESULTS

General performance of phenological models in the simulation of budburst, flowering and veraison in China
Shangri-La is the only site in which vines do not require soil-burying, and only data for Cabernet-Sauvignon were accessible for this location. Thus, to better compare the performance of models between cultivars, the data from this region were excluded in the analysis presented in Table 3, Table 4, and Table 5.
Only the BRIN model was tested for Cabernet franc, giving the worst result for all the tested cultivars. Although the performances of these models differed for the different cultivars, there were obvious differences between models. Except for GDD 10 , all the models revealed very poor performance (Table 3), with high RMSE and negative efficiency for all the tested cultivars. The three statistical criteria gave consistent results: the higher the RMSE, the lower the EF and the higher the absolute value of MBE. The GDD 10 model is the only model that performed well for all available cultivars. Except for the Caffarra model, all models showed overestimation with MBE > 0.
Figure 3 compares all the available observed budburst dates and the corresponding simulated dates. The three statistical criteria and Pearson's correlation coefficient (r) are also shown.
Except for the Caffarra model, all the models showed a good correlation between the simulated data and observed data. Both the observed and simulated data show that the budburst date is mostly related to the place where the grapevines are grown. For the five regions, the earliest budburst dates occurred in Shangri-La, followed by Xiaxian. The later budburst dates occurred in Yanqi, Changli, and Laixi. Changli and Laixi revealed a similar time range, which was longer than that of Yanqi. Most of the simulations of the BRIN and GDD 5 models gave results that were more than 10 days later than the observed dates, with high MBE, high RMSE and negative EF. The later the observed budburst, the bigger the difference between the observed and simulated dates for the BRIN model. Different phenological behaviour was observed in the analysis of the only non-soil-burying region, Shangri-La, with a simulated budburst date very similar to, or earlier than, the observed budburst date for the BRIN and GDD 5 models. For the Caffarra model, all the simulations were earlier than the observations, and showed the worst performance.
For the GDD Duchêne model, 45.5 % of the data points fell within the range of y = x ± 10. For the GDD 10 model, 82.5 % of the points were located within y = x ± 10, showing the best performance, but one of the Shangri-La data points is obviously isolated from the others.
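The share of points falling within y = x ± 10 is simply the percentage of simulations that differ from the observation by at most ten days. A minimal sketch, with a hypothetical function name and made-up dates:

```python
from typing import Sequence


def share_within(sim: Sequence[float], obs: Sequence[float],
                 tol: float = 10.0) -> float:
    """Percentage of simulated dates within +/- tol days of the observed date."""
    hits = sum(1 for s, o in zip(sim, obs) if abs(s - o) <= tol)
    return 100.0 * hits / len(obs)


# Hypothetical dates (DOY): three of four simulations are within ten days.
simulated = [100, 115, 130, 90]
observed = [105, 100, 125, 95]
print(share_within(simulated, observed))  # 75.0
```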
In general, the performance of the models was good when simulating flowering. Except for the GFV model, which did not simulate the timing of flowering well (RMSE > 10 and EF < 0) for all cultivars, all models performed relatively well with certain differences between cultivars. The GDD 10 model gave the lowest RMSE (4.4) for Cabernet franc and the highest RMSE (8.1) with a negative EF value for Merlot. The WE, GDD Duchêne , and Caffarra models were only available for one or two cultivars, but they all showed relatively good performance, with the WE model being the best for its two available cultivars (Cabernet-Sauvignon and Chardonnay). There are only two models (GFV and GDD 10 ) available for Merlot and Cabernet franc. Neither model performed reasonably for Merlot, but Cabernet franc was well simulated by GDD 10 .
In Figure 4, the observed and simulated flowering dates are compared. All models showed a good correlation between observations and simulations according to the correlation coefficient (r).
Flowering was earliest in Xiaxian, followed by Shangri-La. Changli and Laixi showed similar flowering times. The flowering dates were overestimated in all tested models. For the GFV model, 90.5 % of the simulated dates were more than ten days later than the observed dates, showing the worst performance. For the WE, GDD Duchêne and GDD 10 models, 87.2 %, 81.8 % and 81.4 % respectively of the simulated dates were less than ten days later than the observed dates. For the Caffarra model, almost all the simulations were less than ten days later than the observations. Except for GFV, all models showed a relatively good performance. For all available models, two Shangri-La data points were isolated from the other data.

Veraison
Although the GDD 10 and GDD Duchêne models performed well in the flowering simulation, they showed poor performance with high RMSE and negative EF for all available cultivars in the veraison simulation. The GFV model showed a better performance at this stage with positive EF values for all cultivars, except for Cabernet franc.
The WE, GDD Duchêne , and Caffarra models could only be applied to one or two cultivars, but they all performed relatively well. No one model was best for all cultivars. The GFV model performed best for Cabernet-Sauvignon, while the Caffarra model was best for Chardonnay. We only tested two models for Merlot, out of which the GFV model performed better. Three models (GFV, GDD 10 , WE with constant t 0 ) were tested for Cabernet franc, but only the WE model with constant t 0 gave acceptable results.
Figure 5 compares the simulated and observed veraison dates. In contrast to other phenological stages, the veraison dates were obviously underestimated by several models. In particular, the GDD Duchêne model gave the highest RMSE, the highest negative MBE and a negative EF, with 90.5 % of the simulated dates more than ten days earlier than the observed dates. The GDD 10 model also performed poorly, with 72.2 % of the simulated veraison dates earlier than observations by more than ten days. For the GFV model and the WE model with constant t 0 , 75 % and 65.7 % of the data were located within y = x ± 10, of which 29.6 % and 39.1 % showed simulated dates earlier than observed dates. These two models showed an obvious trend in which the later the observed date, the higher the possibility of underestimation. More than half of the dates predicted by the Caffarra and WE models were earlier than the observed dates; however, most were within 10 d, thus showing relatively good performance. For most applied models, there were still two Shangri-La data points showing a different pattern.
In order to explore the impact of local climatic conditions on phenology simulation, the performance of each model available for Cabernet-Sauvignon was compared for different wine regions. Models available for Cabernet-Sauvignon were used for this comparison, because only data for Cabernet-Sauvignon were available for all five regions. There are only 19 observations in total across all of the regions, and only two are available for Yanqi, which sometimes makes the calculation of EF impossible; therefore, only the RMSE was used to make a simple comparison.
In the simulation of budburst, although the models performed differently for the different wine regions, the GDD 10 model showed the best performance for all regions except Shangri-La, with RMSE lower than 10 d (Figure 6a). GDD 5 performed best for Shangri-La. Laixi was quite well simulated by all the models.
In the simulation of flowering, the GDD Duchêne model performed relatively consistently for the five wine regions, and performed best in the Xiaxian region (Figure 6b). In contrast, the GFV model showed the worst performance for all the wine regions. The performance of the WE model differed for the different wine regions, showing its worst performance for Shangri-La and its best for Changli. The GDD 10 model gave the most similar results, but with higher RMSE values. All models performed least well for Yanqi and Shangri-La.
In the simulation of veraison, the performance of the different models differed greatly for a single location, especially for Shangri-La, with an RMSE value of 1.16 d for the GDD Duchêne model, and 27.1 d for the GFV model (Figure 6c). The opposite result was observed in Laixi, with an RMSE value of 26.7 d for the GDD Duchêne model, and 10.1 d for the GFV model. The GDD Duchêne , GFV, and WE models with constant t 0 only performed well for one or two regions, while the WE model showed relatively good performance for all regions with an average RMSE of 6.98 d.

Calibrating the models with Chinese-observed phenological data
Some models with good performance were selected in this part, and we tried to calibrate these models using limited Chinese phenological data to obtain new values for the parameter F crit . The performance of the models using previous F crit and new F crit are shown in Table 6.
In the simulation of budburst, the new F crit value for each cultivar was smaller than the previous value, but the ranking of the F crit within these varieties remained the same. The performance of the GDD 10 model was improved by using the new F crit value, and this was most obvious for Chardonnay.
Several models performed similarly in predicting the timing of flowering and veraison for Cabernet-Sauvignon and Chardonnay. In order to avoid arbitrary judgment, three models were selected and separately calibrated for each cultivar for flowering and veraison.
In the simulation of flowering, the new F crit values were also smaller than previous values for each model. The ranking of the new F crit for GDD 10 within these varieties also differed from that of the previous F crit . Almost all the models showed very good fitting results, especially for Chardonnay, with all models giving excellent accuracy with EF > 0.9. The WE model still showed a slight advantage in terms of accuracy for both Cabernet-Sauvignon and Chardonnay. Among all varieties, the analysis of Cabernet franc with the GDD 10 model gave the lowest EF.
In the simulation of veraison, although the fitting result of the models was not as good as in flowering, there was still an improvement for all cultivars, with the best analysis being for Chardonnay.
For both Cabernet-Sauvignon and Chardonnay, the curvilinear models performed better than linear models, and the WE model gave the best results for the two cultivars.

DISCUSSION
This is the first assessment of grape phenological models in China, which utilised limited phenological data. This study evaluated the performance of a set of models to simulate phenological stages for grapes grown in China. These models were previously validated under European conditions. There was certain variation in the accuracy of these models, with both variety-dependent and site-dependent differences.

The most promising phenological models
While the Caffarra model contains the most parameters and was predicted to better explain the budburst process, it was the worst model for growth data for grapes in China. However, it performed well for growth data in northern Italy, with an average EF value of 0.33 and MBE value of 5.1 when using an external dataset (Caffarra and Eccel, 2010). The GDD5, GDD Duchêne , and BRIN models also performed much worse for data from China than for data from France (Duchêne et al., 2010; García de Cortázar-Atauri et al., 2009). The GDD10 is the only model that performed well in most regions and for four cultivars. This indicates that it is the best model for simulating budburst in China when considering that the dormancy period does not increase the accuracy of the result under current climate conditions (Costa et al., 2019; García de Cortázar-Atauri et al., 2009), and the BRIN and Caffarra models gave the worst results in this study. The models that take into account low and winter temperatures (GDD5, BRIN, and Caffarra) did not simulate phenology correctly, but the models that take into account forcing temperatures (GDD10 and GDD Duchêne ) performed better, albeit with region-specific variability. Future work should continue to explore the impact of dormancy conditions when simulating budburst, particularly that related to climate change scenarios (Chuine et al., 2016).
There was an overall improvement of the performance of models for veraison and flowering, which is consistent with the study by Costa et al. (2019). van Leeuwen et al. (2008) also reported better accuracy for simulation of veraison and especially for flowering for different vintages and locations compared to budburst. The mechanisms of budburst may be more complex.
In accordance with previous studies (Caffarra and Eccel, 2010; Parker et al., 2011), the GFV and Caffarra models performed better for veraison than for flowering. Non-linear models performed better than linear models, especially the WE model. This might be because the high temperature between flowering and veraison can exceed the optimum temperature, especially in July, which is the warmest month in China (He and Zhou, 2012). The non-linear models can perform better by taking into account the negative impact of high temperature in warmer regions or under a changing climate situation (Cuccia et al., 2014).

Insufficient observations
We tried to obtain the largest possible dataset of phenological stages from different varieties and regions of China, but the amount of data was quite limited. Most wineries in China do not record observations of grapevine phenology. We did not have sufficient data for both calibration and validation, so we relied on parameters calibrated in previous studies.
Most wine regions in China are in semi-arid and arid areas, therefore irrigation is necessary. In western Europe, where the models were originally tested and calibrated, irrigation is not allowed or very strictly used in most vineyards. Different water conditions may change the crop phenology to some degree (Degueldre et al., 2011; Shellie et al., 2018). Additional sources of variation include soil texture (Trought et al., 2008), soil temperature (Kliewer, 1975), pruning dates (Dunn and Martin, 2000; Gatti et al., 2016), and rootstock (Downton and Crompton, 1979; Menora et al., 2015). Difficulties in reaching a consensus on the timing of a phenological stage can also lead to measurement errors of several days (García de Cortázar-Atauri et al., 2009). For these reasons, and to reduce the impact of individual factors, it is necessary to obtain more data and establish a more comprehensive phenological database for grapes grown in China.

Soil-burying
Soil-burying is indispensable in most Chinese wine regions where extreme minimum temperatures fall below -15 °C (Li, 2008; Wang et al., 2018). This vineyard management practice must be conducted before the soil freezes, and grapevines should be out of the soil for a few days before budburst to avoid any physical damage to the buds during the unearthing process and any damage by spring frost (Li, 2008). During the soil-burying period, soil temperature, rather than air temperature, directly impacts grapevine development.
Figure S1 illustrates the difference between air temperature and soil temperature at several depths. Soil temperature data were not available for the regions studied here, so data from Fangshan, another winegrowing site with a temperate monsoon climate, were used instead. Air temperature was lower than soil temperature from October to February, but from March to September, air temperature was usually higher than soil temperature at greater depths and lower than soil temperature at shallow depths. Thus, the relationship between air temperature and soil temperature changes with the season and with depth, which directly leads to the inaccuracy of models that only take into account air temperature during the soil-burying period.
The actual accumulation of temperature by the GDD 5 and GDD 10 models, taking into account the process of soil-burying, is also illustrated for Cabernet-Sauvignon in Fangshan (Figure S2). The observed budburst date is for grapevines covered with about 30 cm of soil. Both GDD 5 and GDD 10 showed overestimation. More importantly, the budburst simulated using air temperature was earlier than that simulated using actual temperatures; furthermore, the deeper the soil, the later the simulated budburst. Temperature accumulation by GDD 5 starts earlier than by GDD 10. This difference generates more accumulated temperature during the soil-burying period, which directly increases the uncertainty caused by the difference between soil temperature and air temperature. Therefore, soil-burying had less effect on the accuracy of GDD 10. However, the accuracy of the results may be affected by the depth of soil-burying and the time of soil-uncovering, which can vary between regions.
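
The effect described above can be sketched with a simple degree-day accumulation. This is a minimal illustration under stated assumptions, not the procedure used to produce Figure S2: the function names, the unearthing day, the F crit threshold and the synthetic temperature series are all hypothetical, and the vine is assumed to experience soil temperature while buried and air temperature after unearthing.

```python
def simulate_gdd_date(daily_temps, t_base, f_crit, start_day=1):
    """Return the first day (1-based) on which growing degree-days
    accumulated from start_day reach f_crit, or None if never reached."""
    total = 0.0
    for day, temp in enumerate(daily_temps, start=1):
        if day < start_day:
            continue
        total += max(0.0, temp - t_base)
        if total >= f_crit:
            return day
    return None


def experienced_temps(air_temps, soil_temps, unearthing_day):
    """Temperature assumed to be felt by the vine: soil temperature while
    buried (days before unearthing_day), air temperature afterwards."""
    return [
        soil if day < unearthing_day else air
        for day, (air, soil) in enumerate(zip(air_temps, soil_temps), start=1)
    ]


if __name__ == "__main__":
    # Synthetic spring series (deg C); hypothetical values for illustration.
    air = [8.0 + 0.1 * d for d in range(120)]
    soil = [6.0 + 0.1 * d for d in range(120)]  # cooler than air at depth
    felt = experienced_temps(air, soil, unearthing_day=90)
    for t_base in (5.0, 10.0):  # GDD 5 and GDD 10 base temperatures
        print(
            t_base,
            simulate_gdd_date(air, t_base, f_crit=100.0),
            simulate_gdd_date(felt, t_base, f_crit=100.0),
        )
```

Comparing the two dates printed for each base temperature shows how much of the simulated shift comes from degree-days accumulated while the vine is still buried, which is where the choice of base temperature matters.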

Meteorological data
The climate data used for each vineyard were obtained from nearby meteorological stations. Verdugo-Vásquez et al. (2016) found that even the microclimate at the field scale can lead to a variation in phenology of a few days. Three of the wine regions (Yanqi, Laixi and Shangri-La) are quite far (> 20 km) from the meteorological station, so the collected meteorological data may not truly represent the local climate. This is particularly the case for Shangri-La, which is located in a mountainous region with complex geographical features and changeable regional climate conditions, yet the data were obtained from a meteorological station 24.1 km away from the vineyard. For budburst, flowering, and veraison, some models were inaccurate for Shangri-La, possibly due to the inaccuracy of the temperature data for this region.

Perspectives
According to a study by Guo et al. (2015), spring phenology in northern China is almost exclusively determined by forcing temperature, because the severe winter meets the chilling requirement. However, in non-soil-burying regions or in the context of future climate change, insufficient chilling may offset the advance, or even cause a delay, of spring events (Chuine et al., 2016; Guo et al., 2013; Webb et al., 2007). It would be informative to test a model fitted under warmer conditions.
Because budburst is the beginning of the grapevine growth cycle, the accuracy of its simulation directly determines the accuracy of the simulation of subsequent phases. Therefore, site-specific estimation of model parameters would be required to increase the accuracy of these models for budburst (Nendel, 2010). More experiments should be conducted to understand the phenological process physiologically and genetically, especially for budburst.
According to Parker et al. (2013), calibration for each variety provides relatively accurate simulation of flowering and veraison with a minimum of 20 observations from three sites. Sufficient phenological data combined with associated weather data would allow us to evaluate the performance of current models, as well as to calibrate and validate new models for more cultivars under additional climate conditions, thus facilitating modelling in China. A more complete phenological observation network for grapevine is therefore required for China, similar to those in other countries, such as PEP725 (http://www.pep725.eu/), TEMPO (https://tempo.pheno.fr/), and USANPN (https://www.usanpn.org/usa-national-phenology-network). In addition, growth-room experiments could be an alternative method for quickly calibrating models (Fila et al., 2012).

CONCLUSIONS
This study assessed the performance of different phenological models to simulate the different grape phenological phases in five grape-growing regions in China.
For budburst, most models exhibited poor performance. GDD 10 was the only model to perform well in the soil-burying zones, irrespective of the cultivar and location.
For flowering and veraison, most models provided relatively good performance, which varied between cultivars and regions. In general, non-linear models performed better than linear models, especially for veraison, but not all models can be applied to all varieties.
The parameters of the models with relatively good results were optimised using these limited Chinese observations. The contribution of the difference between air temperature (calculated temperature) and soil temperature (actual temperature) during the soil-burying period to the inaccuracy of models in the budburst simulation was also discussed. As this study was based only on limited observed data, the establishment of a grapevine phenology observation network to obtain more data would facilitate region-specific modelling and allow model application to more varieties. Our results illustrate the potential for the use of models to simulate grape growth in China, which can facilitate the development of improved cultivation strategies.
F crit differs for each phenological phase and for each cultivar (García de Cortázar-Atauri et al., 2010a). Two versions of this model were tested in this study: one starts at a fixed date (t 0 = March 15th) and can directly calculate veraison (García de Cortázar-Atauri et al., 2010b), and the other allows flowering and veraison to be calculated starting from the previous phenological stage (García de Cortázar-Atauri et al., 2010a).

Table 5 illustrates the performance of six different models (GFV, GDD 10, WE, WE with constant t 0, GDD Duchêne, and Caffarra) in the simulation of veraison.

FIGURE 5. Observed versus simulated veraison date (DOY) of six different models (Caffarra, GDD 10, GDD Duchêne, GFV, WE, and WE with constant t 0). The dashed lines represent y = x ± 10 and the full line represents y = x. MBE: mean bias error between simulations and observations.

2. Performance of phenological models in different wine regions

FIGURE 6. Comparison of the RMSE of different models in five different wine regions for Cabernet-Sauvignon: a. budburst, b. flowering and c. veraison.

TABLE 1. Summary of the data used in this study.

TABLE 2. The parameter values of the different models used in this paper.

TABLE 3. The performance of different models in the simulation of budburst for different cultivars. The values in bold indicate the best result for each variety.

TABLE 4. The performance of different models in the simulation of flowering for different cultivars. The values in bold indicate the best result for each variety.

TABLE 5. The performance of different models in the simulation of veraison for different cultivars. The values in bold indicate the best result for each variety.

TABLE 6. The comparison of F crit values and the performance of selected models before and after calibration in the simulation of three main phenological phases for different cultivars.

OENO One 2020, 3, 637-656 Xueqiu Wang et al.