Overcoming the challenge of bunch occlusion by leaves for vineyard yield estimation using image analysis

Accurate yield estimation is of utmost importance for the entire grape and wine production chain, yet it remains an extremely challenging process due to high spatial and temporal variability in vineyards. Recent research has focused on using image analysis for vineyard yield estimation, with one of the major obstacles being the high degree of occlusion of bunches by leaves. This work uses canopy features obtained from 2D images (canopy porosity and visible bunch area) as proxies for estimating the proportion of occluded bunches by leaves to enable automatic yield estimation on non-disturbed canopies. Data was collected from three grapevine varieties, and images were captured from 1 m segments at two phenological stages (veraison and full maturation) in non-defoliated and partially defoliated vines. Visible bunches (bunch exposure; BE) varied between 16 and 64 %. This percentage was estimated using a multiple regression model that includes canopy porosity and visible bunch area as predictors, yielding a R2 between 0.70 and 0.84 on a training set composed of 70 % of all data, showing an explanatory power 10 to 43 % higher than when using the predictors individually. A model based on the combined data set (all varieties and phenological stages) was selected for BE estimation, achieving a R2 = 0.80 on the validation set. This model did not show validation metrics differences when applied on data collected at veraison or full maturation, suggesting that BE can be accurately estimated at any stage. Bunch exposure was then used to estimate total bunch area (tBA), showing low errors (< 10 %) except for the variety Arinto, which presents specific morphological traits such as large leaves and bunches. Finally, yield estimation computed from estimated tBA presented a very low error (0.2 %) on the validation data set with pooled data. However, when performed on every single variety, the simplified approach of area-to-mass conversion was less accurate for the variety Syrah. The method demonstrated in this work is an important step towards a fully automated non-invasive yield estimation approach, as it offers a solution to estimate bunches that are not visible to imaging sensors.


INTRODUCTION
Accurate yield estimation is of utmost importance for many commercial crops. In viticulture, such information is helpful to address challenges of logistics at harvest, helping define workforce and machinery required, as well as cellar needs. Furthermore, early yield forecasts can help planning bunch thinning to manage wine stock, grape prices and marketing strategies (Dunn and Martin, 2004). However, vineyard yield estimations can be hard to perform as the amount of fruit can vary highly from season to season as well as spatially across a vineyard (Bramley et al., 2011;Victorino et al., 2017;Jasse et al., 2021). Vineyard yield depends on several factors such as soil and climate conditions, grapevine variety, biotic and abiotic stresses, vineyard management practices (Bramley and Hamilton, 2004;Clingeleffer et al., 2001;Taylor et al., 2018).
One of the most common yield estimation methods involves manually collecting, counting and weighing grape bunch samples at the berry lag phase (BBCH 79, Lorenz et al., 1995). The information is then extrapolated to the whole vineyard or plot, which can cause inaccurate estimates if sampling is not representative enough. Furthermore, it is a destructive and laborious technique (Dunn and Martin, 2004).
There are other methods to estimate vineyard yields, such as pollen-based predictive models (Besselat, 1987;Cunha et al., 2016), static devices that measure wire tension (Blom and Tarara, 2009), or climate data-based models (Fraga and Santos, 2017). However, most of these methods are limited either by the need for high maintenance fixed devices or the scanning of large areas or regions. Recently, machine vision has become a sought-after research topic, as images can be collected massively across a whole vineyard, and data can be georeferenced, helping to understand its spatial variability. Along with quick image collection devices, resulting images can be automatically processed by means of machine learning techniques (Jensen, 1996;Cristianini and Shawe-Taylor, 2000;Lecun et al., 2015), which lead to high throughput systems that can provide real-time data based on image analysis.
In field conditions, image analysis for bunch and berry recognition has been proved to be effective using colour and light thresholding, geometric segmentation and Bayesian approaches (Aquino et al., 2018b;Abdelghafour et al., 2019;Liu et al., 2020), classification algorithms such as Random Forests and Support Vector Machines (Nuske et al., 2014b;Pérez-Zavala et al., 2018) and artificial neural networks (Behroozi-Khazaei and Maleki, 2017;Rudolph et al., 2018;Milella et al., 2019), among others. Three-dimensional imaging has also been explored to compute bunch volume for bunch weight estimation (Font et al., 2015;Ivorra et al., 2015). However, besides being a more complex computer task, Hacking et al. (2019) has shown that such technology can yield worse results than 2D imaging, in field conditions, due to the randomness of vine structures and bunch occlusions. Other yield components from earlier phenological stages have also been segmented.
Shoots have been automatically identified using unsupervised classification algorithms such as K-means and agglomerative hierarchical clustering . Flowers and inflorescences have been classified using artificial neural networks (Rudolph et al., 2018) and geometric segmentation inside a Bayesian framework (Abdelghafour et al., 2019). Today, it is generally accepted that these yield components can be accurately identified, and the challenge of analysing vine images for yield component recognition has been overcome (Seng et al., 2018). However, in natural conditions, in general, yield components are not all visible to the scanning sensors, requiring manual intervention (e.g., defoliation) or, alternatively, modelling approaches to estimate non-visible fruits. Regarding the most used yield components for yield estimations with image analysis, there are two main trends that have been explored in previous research: one uses the estimated number of visible berries in an image (Nuske et al., 2014b;Aquino et al., 2018a;Milella et al., 2019;Liu et al., 2020), the other uses bunch projected area or bunch pixels (Liu et al., 2013;Lopes et al., 2017;Hacking et al., 2020;Victorino et al., 2020).
Bunch occlusion by leaves is common in grapevines, especially in dense canopies. It prevents bunches from being overly exposed to sun radiation which, if in excess, can cause losses in berry quality and sunburn damage (Gambetta et al., 2021). From an image analysis perspective, however, leaf coverage is a limitation as it causes a large percentage of bunches to be occluded from the imaging devices. As it has been pointed out by Nuske et al. (2014a), berries can be occluded by leaves, wooden material and neighbouring berries. The same authors name these occlusions as self-occlusions (berry by berry), cluster-occlusions (berry by bunches) and vine-occlusions (berry by leaves and shoots). In the present work, the authors opted to address yield estimation by exploring the vine bunch projected area; however, occlusions can occur regardless of the considered yield component. Vine-occlusions appear to be the main challenge (Victorino et al., 2019;Íñiguez et al., 2021), particularly leaf-occlusions, especially in dense canopies that present numerous leaf layers (Smart and Robinson, 1991).
To overcome the challenge of fruit bunch occlusion, previous works have performed yield estimation using image analysis on vines that were defoliated prior to image collection (e.g. Dunn and Martin, 2004;Pérez-Zavala et al., 2018;Milella et al., 2019), with the exception of works developed at very early phenological stages . Grapevine basal leaf removal is a canopy management practice mainly used in viticulture regions where a higher bunch sun exposure is desired to reduce canopy density and improve fruiting area ventilation and luminous microclimate to reduce fungi pressure and improve polyphenolic composition in red grapes . However, in warm climate viticulture, such practices are not as common, considering the increasing instability caused by climate change. The higher sun exposure caused by defoliation can lead to increased berry temperature, negatively impacting berry composition and, ultimately, to berry sunburn, which can lead to great yield losses (Martínez-Lüscher et al., 2020). Furthermore, leaf removal is an expensive and time-consuming management practice, and yield estimation methods cannot be dependent on it. Nuske et al. (2011a) estimated an occlusion ratio either by sampling a small number of vines at the time of image collection or using occlusion ratios estimated from previous harvests. However, this methodology is not fully automatic as it still depends on manual sampling. Riggio et al. (2018) used another approach consisting of an imaging system to collect images from under the canopy, in an upwards perspective. This is a promising methodology to overcome leaf occlusion, yet it was developed on a very particular training system (Geneva double curtain) where the location of the bunches in relation to the vegetation is much different (downward shoot positioning) than in the most common training systems such as the vertical shoot positioning. Recently, Kierdorf et al. (2021) developed an algorithm to estimate hidden berries based on artificially generated images of occluded fruits, reducing the need for natural images and the labour associated with their collection and labelling. The algorithm showed promising yet inconclusive results when applied to non-synthetic data (real vine images). Another technology was tested by Parr et al. (2021), where ultrasonic phased arrays were used to overcome the problem of occluded bunches, showing results that appear promising in controlled conditions. However, despite all these research efforts, the challenge of bunch occlusion in vineyard conditions remains unsolved.
Grapevine canopy porosity (POR) is defined as the proportion of gaps in the canopy where no plant material is present (Smart, 1987;Diago et al., 2019). This POR, when determined at the fruit zone, shows a positive correlation with fruit exposure (Smart, 1988), indicating that the more gaps the canopy has, the more fruit will be potentially visible. Canopy porosity can be influenced by several factors like shoot density, shoot leaf area water availability, biotic and other abiotic stresses, among others and has been shown to influence fruit exposure to sunlight and, hence, berry health and composition (Smart, 1987;Austin et al., 2011). In warm climates, a lower POR can decrease the risk of berry sunburn, as more leaves cover bunches from direct sunlight (Krasnow et al., 2010). Canopy porosity is commonly manually assessed using the Point Quadrat method (Smart and Robinson, 1991), but recently it has been proven that it can be estimated with accuracy in a more automatic and objective method via image analysis (De Bei et al., 2016;Diago et al., 2019).
Based on the above-stated assumption that a relationship between canopy porosity and exposed bunches exists, the present work aims at testing the use of canopy traits (porosity and visible bunch area) as proxies for estimating the proportion of occluded bunches by leaves and, hence, overcoming the main difficulty faced by proximal remote sensing devices used for estimating vineyard yield.

Plant material and growth conditions
The experiment was carried out in two fully grown experimental vineyards in the Lisbon winegrowing region, Portugal ( Figure 1). Data were collected at Instituto Superior de Agronomia (ISA) vineyard, located at Tapada da Ajuda (38°42'24.61" N 9°11'05.53" W). The climate is of Mediterranean type with an Atlantic influence. This vineyard includes two different plots with spur pruned vines trained on a vertical shoot positioning trellis system with two pairs of movable wires. The first plot includes two drip-irrigated white varieties ('Encruzado' and 'Arinto') grafted onto 1103 Paulsen rootstock, planted side by side in 2006 and spaced 1.0 m within and 2.5 m between north-south oriented rows. Water was supplied with a drip irrigation system, and irrigation was managed using a soil moisture sensor. Readily available water was maintained until veraison, at which point stress was applied by replacing only 50 % of crop evapotranspiration. The second plot is rainfed and consists of the 'Syrah' variety (grafted onto 140 Ruggery rootstock) planted in 1999 and spaced 1.2 m within and 2.5 m between north-south oriented rows. In both plots, the soil is a clay loam with approximately 1.6 % organic matter and a pH of 7.8 (Teixeira et al., 2018). Data were collected from both plots in 2018 (except 'Arinto') and 2019. On the two seasons, both plots were subject to similar standard cultural practices during the growing cycle, including shoot trimming at 1.2 m above the cordon, water shoot removal, shoot positioning and fertilization.
Data were collected during the 2018 ('Encruzado' and 'Syrah' varieties) and 2019 ('Encruzado', 'Syrah' and 'Arinto' varieties) seasons. A total of 140 one-meter canopy segments (corresponding to the row space, 1 vine) were imaged and assessed in both seasons (70 at veraison and 70 at harvest), distributed along the vineyard to encompass as much spatial variability as possible.
The 2018 season was characterised by a wet spring (~300 mm of precipitation from March to June) and a warm and dry summer (~15 mm of precipitation and average mean temperature of 22.1° C, from June to September), whereas the 2019 season presented a drier spring (~119 mm of precipitation from March to June) and a dry and warm summer (~17 mm of precipitation and average mean temperature of 20.6 °C, from June to September).

Image acquisition
Grapevine images were collected in a lateral perspective using a commercial camera (Nikon D5200) equipped with a Sigma 50 mm F2.8 macro (Sigma corp., Kanagawa, Japan), set to auto-mode. RGB images were saved at a resolution of 24 Mpx (6000 × 4000 pixels), 8 bits per channel and with uncontrolled illumination. The camera was mounted on a tripod approximately 2 m away from the row axis and 1 m above the ground (Figure 2). A blue background was placed behind the canopy for each imaged canopy segment.
Images were collected in 2018 and 2019 at two phenological stages: veraison (BBCH 81) and harvest (BBCH 89). These phenological stages corresponded approximately to 1 month and 1 week prior to harvest, respectively. For more details, see Victorino et al. (2020).
Images were firstly collected on undisturbed canopies, followed by partial defoliation at the fruit zone to simulate a broader range of porosity situations. Afterwards, vines were fully defoliated to uncover all bunches occluded by leaves ( Figure 2D-F). A total of 270 images were collected in analysed in this work. After image collection, all bunches were picked and weighed per canopy segment, and fruit mass was determined at each analysed phenological stage. To avoid direct light incidence and consequent poor-quality imaging, images were collected after midday, taking advantage of the north-south row orientation and the shade provided by the background positioned on the opposite side of the canopy combined with the fully developed canopy.

Image processing
A region of interest of 50 cm above the cordon was defined for each image for detailed analysis ( Figure 3A). On each image, pixels of leaves, bunches, canopy gaps and wooden organs could be found. Bunch segmentation was performed manually for each individual image to obtain the true bunch projected area in the image for each canopy segment ( Figure  3B and C). Segmentation was performed using the software ImageJ ® (v1.52e, National Institutes of Health, EUA). Canopy gap pixels were classified using a static colour threshold after converting the original image from RGB to HSV (Hue-Saturation-Value) colour space for improved invariance to illumination conditions. The algorithm was developed using OpenCV in PyCharm®, an open-source Python-focused IDE, for automation purposes.

Data analysis
Data were divided into two sets: the training set, which included 70 % of all non-defoliated and partially defoliated vine images and the validation set, which included the remaining 30 %. In both the training and validation sets, data were organized according to variety and phenological stage to further explore the analysis. Several variables were addressed as follows: visible bunch area in cm 2 /m (vBA) and in proportion to the whole image (vBA p ), described as the projected area of non-occluded bunches ( Figure 3C in yellow); total bunch area (tBA), in cm 2 /m, defined as the projected area of non-occluded bunches, assessed after full defoliation; bunch exposure (BE), in percentage, described as the ratio between vBA and tBA; canopy porosity (POR), in percentage, described as the percentage of canopy gaps at the fruiting zone (Eq. 1; Figure 3C in red); fruit mass (FM), in kg/m, described as the actual bunch mass assessed for the corresponding image.
Descriptive statistics were computed for both data sets together (Table 1). Total bunch area (tBA) corresponds to the visible bunch area (vBA) of the fully defoliated vine segment. Canopy porosity and visible bunch area were selected as the main variables to estimate bunch exposure (BE). Canopy porosity (POR) was estimated according to Equation 1, in which the number of total pixels corresponds to the entire image (including canopy gaps, grapes, leaves, shoots and old wood). As this work focuses on occlusions caused by leaves, other types of occlusions, such as bunch occlusions by other bunches or shoots, were not considered.
On the training set, for each variety and phenological stage, a linear model was fitted between POR, vBA p (predictor variables) and BE (response variable), only on data from images of non-defoliated and partially defoliated vines. Then, a multiple linear regression between the same predictors and response variables was fitted. Models were computed, and variables were transformed according to maximum likelihood analysis performed in SAS ® . For validation, to evaluate the final model goodness of fit to the observed data, the following deviance measures were used: bias (Eq. 2), mean absolute percentage error (MA%E, Eq. 3), root mean square error (RMSE, Eq. 4) and modelling efficiency (EF, Eq. 5). A regression analysis of actual vs. estimated data and the F-test for slope = 1 and for intercept = 0 were performed for model evaluation. where represents observed values, estimated values, average observed value and n the number of pairs. Estimated tBA was calculated using Equation 6. Fruit mass was estimated based on the relationship between observed tBA and observed fruit mass (from the training set) at vine level, an approach similar to the one mentioned in Lopes et al. (2016) and Victorino et al. (2020). Both variables (estimated tBA and estimated fruit mass) were evaluated by analysing the relative mean error (Eq. 7).

Estimation of occluded bunch area
In fully defoliated vines, tBA presented a Pearson correlation coefficient of 0.78 with fruit mass. In the proposed methodology, the authors estimate fruit mass by first estimating tBA through BE (Eq. 6). It is important to reference again that the BE does not consider bunch occlusion by other bunches, but only bunch occlusion by leaves. Figure 4 shows the relationship between POR and vBA p with BE on images taken on non-defoliated and partially defoliated vines of the training data set. Visual observation shows that there is a linear trend between POR and BE ( Figure 4A). A similar trend is also observed in Figure 4B between vBA p and BE but with variability that increases with the increase in the magnitude of BE. Regression models were computed using POR and vBA p as predictors of BE (Table 2). For both relationships, a maximum likelihood analysis suggested that a root transformation of the response variable (BE) would be more appropriate than the original scale in some data sets. The resulting models, after converting the rooted response variable back to the original scale, corroborate the visual appraisal of the scatter plot. Indeed, both regression models present a high and significant R 2 (p < 0.001) for every single variety and the combined data. For the POR predictor, the Syrah dataset presented the highest R 2 of all varieties, followed by Encruzado and Arinto. When comparing both phenological stages, a higher R 2 was observed at harvest. Regarding the vBA p predictor, the Syrah dataset also presented the highest R 2 between all varieties, followed by Encruzado and Arinto, which presented very similar values among them. As opposed to the POR relationship, a higher R 2 was observed at the veraison phenological stage than near harvest. In general, the highest R 2 was observed in the models with vBA p as a predictor, with models based on the combined data set having a R 2 of similar magnitude as the individual cases. Figure 4C presents a scatterplot where the two predictors (POR and vBA p ) are plotted with BE. Visual observation shows that the combined variables tend to explain BE better than each one, individually. Most Syrah dataset observations are plotted in the upper left side of the plot, indicating a possible variety effect. No phenological stage effect is apparent. As in the previous chapter, for all data sets (except Syrah), a maximum likelihood analysis suggested that a square root transformation of the response variable (BE) would be more appropriate than the original scale. The resulting models, after converting the square rooted response variable back to the original scale, are presented in Table 2, along with their respective R 2 . All models present a significantly high R 2 , corroborating the visual appraisal, with the highest value being observed for the Syrah dataset, followed by Arinto and Encruzado. Models from the two phenological stages presented very similar results. The model based on the pooled data from all data sets (Table 2, nº 18) presents a R 2 of similar magnitude to all other cases.
Furthermore, model 18 was computed with a larger number of images, thus possibly making it more robust to different stages or varieties. Even though there are slight differences between the variety Syrah and the remaining ones, the authors still opted for this model as it provides a generalized option that goes in accordance with the objectives of this work. This generalisation hypothesis is tested and discussed further ahead.

Model validation
To test the potential of the model based on pooled data (model 18, Table 2), a validation was performed on a data set which, such as the training set, was separated by variety, stage and with all cases combined ( Figure 5 and Table  3). Visual observation shows an overall good agreement between observed and actual values for all varieties and stages, especially at low values of BE, but also some variability in cases of medium BE magnitude (between 40 % and 70 % of BE). The statistical measures of validation corroborate the good fit (

TABLE 2. Fitted models from the fitted regressions between bunch exposure (BE) over canopy porosity (POR) and percent visible bunch area (vBAp).
All R 2 are statistically significant at p < 0.001. Gonçalo Victorino et al. Arinto model showed the highest absolute Bias, followed by Syrah, and an EF which was higher than that of Syrah but lower than that of Encruzado. Regarding the models fitted for each phenological stage, all statistic measures of validation presented similar magnitudes. Overall, the selected model showed good performance on all data sets, presenting an average overestimation only on the Arinto data set and an average underestimation on the Syrah data set of similar magnitude. When the complete validation data set was considered (combined), statistical indicators were not different from most single data sets, highlighting the fact that it presented a Bias very close to 0.

Fruit mass estimation
To perform final fruit mass estimations based on the estimated BE obtained from the above-described model, two other steps were required. First, BE was converted to estimated tBA using Equation 6. Table 4 shows the estimated summed values of tBA (sum of all considered 1 m vine segments) and respective percentage errors for the validation set, considering all dataset cases, in this case, separated by phenological stage, inside each variety. When comparing the errors, the Syrah harvest dataset presented the largest absolute error, followed by Arinto harvest and then Arinto veraison data sets. The variety Syrah was the only one with overall overestimations of tBA. The Encruzado veraison data set showed the lowest absolute tBA estimation error, followed by Syrah veraison and then Encruzado harvest. Regarding the comparison among the two phenological stages (with combined varieties), both cases presented low errors, with the veraison data set having a slightly lower absolute error than the harvest data set. Considering the combined data set for both variety and phenological stage, the summed estimated tBA presented a low error of estimation, below 5 %.
After estimating tBA, the next step towards fruit mass estimation lies in the conversion of tBA into weight. Figure  6 shows the relationship between actual tBA and fruit mass at the analysed phenological stage. Visual observation shows that there is a linear trend between the two variables; however, the disposition of the Syrah observations (in red) hints at a possible variety dependency, while there also seems to be a trend that separates observations at veraison from harvest. Because of this, using the training set, individual regression models between actual tBA (predictor variable) and actual fruit mass (response variable) were fitted for each data set and are presented in Table 4 (area-to-mass equation). For simplification reasons, the regression intercept was set to zero.
The estimated values for the summed fruit mass on the validation set and respective relative mean error, separated by variety, stage and all data combined, are also shown on Table 4. The Encruzado veraison data set presented, by far, the lowest absolute error, followed by Encruzado harvest and then Arinto veraison. The highest absolute errors were shown by the Syrah harvest data set, followed by Arinto harvest. Syrah is also the only variable data set with an overestimation of fruit mass. Regarding the comparison between the two phenological stages for combined varieties, the data set at veraison presented a very slight overestimation of fruit mass, while at harvest, there was a low underestimation. The combined data set presented an error that was very close to zero.  FIGURE 6. Scatterplot between actual vine fruit mass over actual total bunch area (tBA) for the three studied varieties at two phenological stages.

DISCUSSION
In the present work, canopy traits such as porosity and visible bunch area are used as proxies for estimating the proportion of occluded bunches in vine images. The linear trend presented by the point cloud between POR and BE ( Figure 4A) hints that bunch occlusion can be indirectly estimated from imagebased canopy traits. This relationship was first demonstrated by the works of Richard Smart based on the use of the Point Quadrat method to evaluate canopy porosity (Smart, 1987). In recent work, we have applied this idea with success for the varieties Encruzado and Syrah (Victorino et al., 2019), and the topic has been generally discussed in Victorino et al. (2020). However, even though the models presented in Table 2 (ID 1 to 6) present medium R 2 , the percentage of bunch exposure variability explained by POR was still too low to enable an accurate estimation of BE, as it was preliminarily explored in (Victorino et al., 2019). Furthermore, POR itself, when calculated using image analysis (Diago et al., 2019) or even LiDAR technology (Pfeiffer et al., 2018), has a chance to be misestimated if canopy gaps are occupied by bunches. This can be increasingly problematic in cases of low canopy porosity and high yield, where BE would likely be underestimated. So far, research on this subject sidestepped the challenge of bunch occlusion by leaves by fully defoliating the fruit zone and focusing exclusively on the automatic bunch segmentation issue (Dunn and Martin, 2004;Nuske et al., 2011b;Riggio et al., 2018;Pérez-Zavala et al., 2018;Milella et al., 2019). If bunch pixels in the image are also considered, some of these limitations can potentially be overcome. Aquino et al. (2018b) mention that the randomness of bunch exposure causes vBA to be considered an unreliable predictor of yield. In (Íñiguez et al., 2021), a similar trend was observed for the variety Tempranillo, in Spain, with vBA in non-defoliated vines having a low correlation with yield. However, this may only be true if the variable is considered individually. By itself, the relationship between vBA p and BE showed high R 2 values (Table 2, ID 7 to 12), yet there is an increase in variability at higher magnitudes ( Figure 4B). This variability increase was slightly less visible with the variety Syrah. However, we suspect that the cause for a better agreement in its regression model with BE is not caused by the variety difference but the fact that in our work, the variety Syrah is rainfed and presents less vegetative vigour. Indeed, it is plausible to suspect that the challenge of estimating occluded bunches would be easier to overcome in less vigorous conditions, as a smaller percentage of bunches are occluded (Keller et al., 2016) The Syrah model was the only case where a maximum likelihood analysis did not suggest a square root transformation to the response variable in a multi-variable scenario ( Table 2). As mentioned above, we suspect this behaviour might be due to water stress and higher bunch exposure and not a varietal difference. Apart from this case, no other individual variety or stage fitted model presented a higher R 2 than when all data were combined, suggesting the viability of model 18 (Table 2) to be used across different varieties and stages.
When validating model 18, the variety Arinto stood out as being the data set on which the model presented the highest overestimation of BE. In fact, almost all Arinto observations are below the 1:1 line, regardless of the phenological stage ( Figure 5). Such a trend can possibly be explained by the impact of this variety's large leaves and large bunches (OIV, 2009) on bunch occlusion. Indeed, larger leaf surfaces have a higher chance to cover bunches and decrease POR on non-disturbed canopies (e.g., Table 1). Furthermore, in a 2D image, large bunches are more prone to be occluded by other bunches (especially when there is a high density of bunches in the fruiting zone) and to have more berries occluded by other berries inside the same bunch (berry by berry occlusion) as shown in Table 1 (lower observed BE, before defoliation at all stages). However, even with these differences, the variety Arinto still showed a vBA p higher than other varieties at harvest before defoliation. The lowest bias and highest EF presented by the variety Encruzado data set seems to corroborate this hypothesis. Compared to the other varieties, Encruzado presents average sized bunches and leaves (OIV, 2009) and showed medium values of POR and vBA p (Table 1). This suggests that model 18 can be particularly effective in varieties with morphologic traits similar to the ones presented by Encruzado in our study. Furthermore, the data set of the variety Syrah was the only one on which the model underestimated BE, which is mostly due to data collected at harvest (Table 3 and Figure 5). This suggests that this variety's image features are different from the other cases, especially at harvest, which can have two potential reasons. The first reason is the higher observed POR, indicating a sparser canopy (Table 1). In fact, at harvest, the variety Syrah shows 5 to 7 % higher POR than the other two varieties, either before or after partial defoliation. Moreover, actual BE values of this variety, at harvest, were almost 20 % higher than the other varieties before defoliation. At the same time, before defoliation, vBA p was very similar to other varieties. After partial defoliation, Syrah presented the lowest vBA p while both POR and BE remained higher than the remaining varieties. From non-defoliation to partial defoliation, POR and BE increased at the same rate for Syrah as with other varieties, while vBA p increased at a lower rate. This means that, for the variety Syrah, although only a low amount of bunches became exposed with the partial defoliation, these bunches represented a higher portion of the total amount of bunches. This leads to the second reason related to the actual yield. The variety Syrah presented the lowest yield, more than two times lower than the remaining ones, and consequently, the amount of visible bunch pixels represent very different proportions of total bunches. However, it was still possible to achieve accurate BE estimation results for Syrah and Arinto, data hints that context information could be a helpful addition to the fruit mass estimation algorithm. In this case, other context information could be considered, such as data regarding plant water status during the growth cycle as it would possibly indicate different levels of canopy density; vigour data, which in many cases is also possible, today, to be obtained automatically (Gatti et al., 2016;del-Moral-Martínez et al., 2016;Sanz et al., 2018); historical yield data (Riggio et al., 2018;Araya-Alman et al., 2019) as it would lower or increase the expectations of bunch occlusion by leaves in vine images.
The fact that there were no significative differences in the model's statistical measures when applied on the veraison and harvest data sets hints that when attempting to estimate BE, it is possible to collect data, at least, between veraison and harvest, without sacrificing the model's accuracy, which goes in accordance with what was previously observed in Victorino et al. (2020). The timing of data collection for yield estimation using image analysis should only be defined by other strategic reasons, such as bunch thinning planning or harvest logistics.
Finally, the validation of model 18 on the combined data set yielded statistical measures that rival the best individual models, especially regarding the bias, which was very close to 0 %. The validated model can be seen as an estimator of the occlusion ratio, as it was previously contemplated by (Nuske et al., 2011a) and can possibly be considered an improvement to that work as it can estimate bunch occlusion by leaves in a fully automatic way. Furthermore, our methodology can be used on all the training systems with vertical plants and does not depend on a specialized imaging system (Riggio et al., 2018). The challenge of bunch occlusion was also very recently studied by Kierdorf et al. (2021), who have used a methodology based on generating synthetic images with different cases of bunch occlusion. This approach is a perfect example of how today's machine learning technologies can help face challenges such as the one discussed in our research. However, its application in images collected from non-defoliated vines in field conditions is not yet fully developed. In the present work, instead of machine learning, we focused on image-obtained canopy parameters that are commonly used by growers. In the future, the possibility of combining both strategies can be a very promising approach.
The possible variety dependency observed in Figure 6, where Syrah observations are mainly positioned in the lower side of the point cloud, can also be seen in Table 1, where the tBA (vBA in total defoliated vines) of the variety Syrah at harvest, for example, presents values that are approximately 20 to 50 % lower than the other varieties. At the same time, actual fruit mass differences are much higher, suggesting that the relationship between tBA and fruit mass varies from case to case. This is possibly caused by two reasons. The first reason is, again, related to differences in bunch morphological traits (OIV, 2009). An example of such a trait is bunch compactness, which has been mentioned in several works as a possible cause of error in bunch weight estimation using image analysis (Aquino et al., 2018b;Hacking et al., 2020). Although this trait changes with different varieties, it can also change with different management practices, flowering and fruit-set conditions and even water availability (Tello and Ibáñez, 2018). The second possible reason is related to the vines lower overall yield and the fact that there is a smaller chance for bunch occlusion by other bunches to occur, a challenge that is not addressed by our model, which by encompassing cases of other varieties, where this bunch occlusion can be higher, creates some misestimations within the Syrah variety. Bunch by bunch occlusion can attain significant magnitudes (Nuske et al., 2014b;Victorino et al., 2019) and still requires further research to be automatically obtained, as it remains hard to predict based only on visible image features. A possible factor that might impact bunch by bunch occlusion is how densely positioned bunches are in the canopy, in short, how narrow the fruit zone amplitude is. By knowing the fruit zone amplitude, which can potentially be estimated through image analysis, along with historical yield data, a promising strategy to estimate bunch by bunch occlusion could be computed, which should then be added to the results presented by the approach described in this work.
The multiplying factors to convert tBA into fruit mass (Table 4) are a simplified approach for the area-to-mass conversion. As they are based on the relationship between observed tBA and fruit mass, they encompass several aspects in the same equation, such as bunch occlusion by shoots and by other bunches. As it is also a variety-specific approach, its use should be done with caution when applied to other varieties. Indeed, further research is needed to obtain a robust method to be applied as a general model, possibly by considering other image features and complementary vineyard data. Furthermore, to perform yield estimations with our results, fruit mass estimated at veraison would still be required to be multiplied by a factor that considers berry growth from veraison until harvest. Such an approach, based on historical data, is also used by other methods that attempt to estimate yield at this stage (e.g., Liu et al., 2020;Clingeleffer et al., 2001). Although the main relationship explored in our work (POR + vBA p vs. BE) was independent of the phenological stage, the area-to-mass conversion is not, as it can be observed by the different slopes between phenological stages (Table 4), most likely caused by the increase in size that generally occurs from veraison until harvest (Coombe and Mccarthy, 2000). Nevertheless, for simplification reasons, and because this challenge goes beyond the scope of the present work, this simplified approach was the option used to obtain the final estimations of fruit mass, which is similar to the approaches presented in, e.g., Lopes et al. (2016) and Hacking et al. (2019). Even with a simplified area-to-mass conversion, fruit mass estimation results are very promising, with most data sets presenting an absolute error below the strictest estimation error threshold of 5 % mentioned in  Table 4). In fact, only the data sets of the varieties Syrah (harvest) and Arinto (both stages) showed absolute errors above 10 %. Most fruit mass estimation errors are similar to the tBA estimation errors, as expected since the conversion factor was linear. However, the variety Syrah (both stages) presented much larger fruit mass estimation errors than tBA errors, which reinforces the idea that there is a need to improve the area-to-mass conversion methodology using extra image features or context data. Regardless, the fruit mass estimation of the combined data set presented an error extremely close to zero, caused by the compensations between under and overestimations of the different varieties and phenological stages. This error is of similar magnitude and, in some cases outperforms, the ones reported by Liu et al. (2020). Furthermore, Liu et al. (2020) approach was dependent on individual bunch images taken in field conditions as well as shoot counts, which would require a higher number of steps and images than a vine level imaging strategy like the one proposed in our work.
The approach described in this work seems to be a viable method to estimate bunches hidden by leaves; however, it achieves acceptable accuracy in cases where a considerable portion of bunches are visible. High-density canopies with very low-to-none visible bunches remain a challenge that may only fully be surpassed with advanced sensing such as ultrasounds (Parr et al., 2021) or other devices that can see through leaves, but which are not yet readily available or practical to use in the field conditions.

CONCLUSION
In this study, canopy porosity and visible bunch area in the image were used as proxies for estimating bunch exposure to solve one of the most relevant challenges faced when assessing vineyard yield via image analysis: bunch occlusion by leaves. Our results show that the percentage of occluded bunch area can be estimated using a multiple regression model that includes the variables: canopy porosity and visible bunch area. The obtained pooled model was accurate on data from three varieties at two phenological stages (veraison and full maturation). Final fruit mass estimations presented low errors, similar to state-of-the-art approaches, with the advantage of less effort during image collection. The methodology presented in this work is an important step towards a fully automated non-invasive yield estimation approach, as it improves the estimation of bunches that are occluded by leaves and not visible to scanning devices. Further work is needed to test this methodology in other varieties and sites and to improve area-to-mass conversion models.