Automated yield prediction in vineyard using RGB images acquired by a UAV prototype platform
Abstract
In viticulture, accurate yield estimation is crucial for enhancing vineyard management and improving the quality of commercial grapes. However, traditional monitoring methods, which rely on visual assessments or destructive sampling, face significant limitations, including subjectivity and lengthy processing times. Additionally, these methods often fail to capture the spatial variability within vineyards, potentially leading to unrepresentative observations. This article describes the development of a fast and automated method based on aerial digital image analysis for yield quantification in vineyards with different slopes and scenarios. In this work, digital images of vines were acquired by unmanned aerial vehicles (UAVs), and coloured tags of known size were placed in the vineyards to automate image processing. To properly extract quantitative data from the images, the geometric distortion between the aerial image and the real world was first corrected using a control points tool. For grape detection, colour thresholds and image filtering were applied and tested in different scenarios. Then, the number of grape pixels was converted into yield (kg/vine) using a linear regression model relating the bunch surface derived from image analysis to the bunch weight observed by ground measurements on representative vines. The yield estimated from UAV images was validated against ground truth measurements, and the R2 in the two seasons was equal to 0.85 and 0.89, respectively. Future prospects are to improve image processing by exploiting the identification of objects already present in the environment, thus avoiding the installation of tags in the field, to extend the developed approach to other agricultural contexts, and to create easy-to-use tools that do not require any specific image analysis skills from the agronomist.
Introduction
Yield estimation is important for improving the fruit ripening process and harvest management, and for scheduling wine production operations (Jackson & Lombard, 1993). Conventionally, the evaluation of yield is performed manually and is regularly assessed using destructive or visual means. For instance, the average number of clusters per vine and the average weight per cluster are determined by weighing the bunches; these data are then extrapolated to the entire vineyard based on the number of vines per hectare (OIV, 2007). This methodology presents some drawbacks, mainly ascribable to the relatively long times required for the analysis and to the variability of the results provided by different operators, which makes the method inherently subjective. In addition, the limited number of analysed samples implies the risk of a poorly representative sampling (Pothen & Nuske, 2016). Furthermore, with traditional methods it can be difficult to perform the measurements due to the vineyard’s location, often on a slope and on ploughed soil, and to weather conditions, such as high temperature and solar radiation. In this context, the availability of proper systems to perform a fast analysis of vineyards might constitute a very useful tool for winegrowers. Digital image processing is well suited to screening the external attributes of agri-food products, such as defects, colour, size, and shape, in items including fruits, crops, vegetables, drinks, sauces, fish, and meat (Quevedo et al., 2010; Khojastehnazh et al., 2010; Foca et al., 2011; Fernandez-Vazquez et al., 2011; Ulrici et al., 2012; Dutta et al., 2015; Orlandi et al., 2018a; Bruno et al., 2022). In viticulture, many research works have been reported in the literature where morphological, textural, and colour features extracted from RGB images were used for grape berry recognition and grape bunch detection (Pérez-Zavala et al., 2018). In general, these methodologies rely on the manual acquisition of pictures (Hacking et al., 2019; Íñiguez et al., 2021; Menozzi et al., 2023) or on images captured by sensors mounted on moving human-driven or robotic vehicles (Nuske et al., 2014; Aquino et al., 2018; Victorino et al., 2020; Khokher et al., 2023). Accurate yield prediction is difficult with these manual or ground-vehicle-based methods, and their limitations are exacerbated by factors such as slopes, ploughed or wet soil, and the presence of cover crops within the vineyard. On-the-go proximal approaches for yield estimation not only require leaf removal but are often conducted at night (using artificial lighting to improve grape segmentation performance), which involves difficult driving conditions and operator risk. Moreover, vineyard-wide mapping from ground platforms is time-consuming, generates an enormous dataset that is difficult to manage, and causes soil compaction. Attempting to pair data collection with traditional vineyard management operations to optimise time can compromise image quality due to vibrations during field movement, especially on slopes. Likewise, even with a representative-zone monitoring approach to reduce dataset size, a ground-based approach would still require time to reach those zones.
Unmanned Aerial Vehicles (UAVs) are becoming more prominent in the domain of reliable agricultural monitoring due to their capacity to conduct low-cost and expeditious analyses, hence surpassing the constraints of ground measurements (Gago et al., 2015; Pôças et al., 2015; Bellvert et al., 2016; Poblete-Echeverría et al., 2017; Romboli et al., 2017; Santesteban et al., 2017; Comba et al., 2018; Caruso et al., 2023; Di Gennaro et al., 2023). The potential of UAV platforms for yield estimation in vineyards is a cutting-edge and challenging topic in precision viticulture, for which little research is available in the literature. The first work was published by Di Gennaro et al. (2019), which presented an operational protocol of low-altitude UAV flights for the acquisition of RGB images with an inclined camera in areas representative of the spatial vegetative variability in red grape variety vineyards. The images were processed with an approach that envisaged a manual segmentation of the fruiting band, followed by an unsupervised segmentation of the bunches and the estimation of the weight per plant starting from 2D data of the projected surface of the bunches. A second work was published by Torres-Sánchez et al. (2021), suggesting a 3D yield estimation approach based on low-cost tools, such as a UAV equipped with an inexpensive RGB sensor and free software for image analysis. In detail, the work described an unsupervised and automated workflow for the detection of grape clusters in red grapevine varieties from UAV photogrammetric point clouds, using RGB information through a colour-filtering process for cluster detection.
As regards the analysis of aerial digital images for quantitative measurements, two important issues have to be considered: the geometric image correction due to lens distortion, and the colour segmentation needed to detect the different objects in the images. Geometric errors in aerial images are inevitable and can be attributed to a variety of sources. The two most commonly observed distortions in aerial imagery are the barrel effect caused by wide-angle lenses and the perspective effect caused by inconsistencies in the aircraft’s attitude. Lens distortion specifically induces the deflection of straight lines and the displacement of points from their intended positions (Richards & Jia, 2006). Due to the significant deformation between the uncalibrated raw aerial image and the corresponding real-world scene, it is not possible to utilise an unaltered aerial image directly. To extract precise quantitative information from the images, geometric correction is an essential pre-processing step. Geometric correction permits the elimination of lens distortion effects and the orthorectification of pictures through the use of ground control points whose locations are known (Liew et al., 2012). Considering object segmentation, as summarised in Hamuda et al. (2016), various approaches have been developed for plant detection, in particular to segment image pixels into plant and background (soil and residues). The common segmentation techniques used for this purpose are colour index-based segmentation (López-García et al., 2022; Castillo-Martínez et al., 2020), threshold-based segmentation (Lu et al., 2022; Liao et al., 2020) and learning-based segmentation approaches (Wani et al., 2022; Shoaib et al., 2022).
In this context, this study describes an upgrade of the previous work by Di Gennaro et al. (2019), aimed at developing an automated system for the prediction of grape yield based on the analysis of RGB images acquired by a UAV platform. In particular, the present research focused on: i) the perspective correction of the images to minimise geometric deformations caused by lens distortion and by differences in UAV positioning in terms of distance and inclination of the camera along the row; ii) the automated identification of the plants of interest and of the grape bunches in various scenarios (i.e., different levels of land slope, row orientation, exposure, and low and bright light during cloudy and sunny weather); iii) the evaluation of yield estimation using the information contained in the images acquired with the UAV platform.
To this aim, RGB images of parcels of different vineyards were acquired by the UAV platform, and coloured tags of known size were placed in the field as control points to register the images and identify the vines. The identification of the colour thresholds for the segmentation of the coloured tags and the grape bunches was performed using the colourgrams approach (Antonelli et al., 2004). In brief, this method involves transforming the RGB image into a signal, namely a colourgram, which encodes the original image’s colour information (Giraudo et al., 2018; Orlandi et al., 2018b; Lopes et al., 2020). By employing the colourgrams method, it was feasible to concurrently examine the frequency distribution curves of multiple colour parameters computed for every image in the dataset. This enabled the identification of the colour parameter resulting in the most efficient segmentation of the elements required for each step of image processing. Finally, the number of segmented grape pixels was converted into a yield estimate using a linear regression model calculated between the bunch surface derived from image analysis and the bunch weight obtained by ground measurements on representative vines. Then, the yield from remote sensing data was validated against ground truth measurements on test parcels.
Materials and methods
1. Experimental design
The research was carried out during the 2021 and 2022 vegetative seasons in four Sangiovese cv. vineyards located near Siena, Italy (Figure 1).
Table 1 summarises the characteristics of the vineyards monitored in the research.
Vineyard | Coordinates | Surface (m2) | Row orientation | Exposure | Planting date | Training system | Vine spacing (m) | Slope (%)
V_a | 43°26'25.23"N 11°23'35.49"E | 8,000 | N-S | S | 2009 | Guyot | 2.40 × 0.80 | 9
V_b | 43°26'17.79"N 11°23'33.90"E | 6,200 | N-S | N | 2008 | Guyot | 2.40 × 0.80 | 8
V_c | 43°25'45.30"N 11°17'17.92"E | 12,000 | NW-SE | S-E | 2008 | Spur-pruned single cordon | 2.20 × 0.75 | 10
V_d | 43°27'44.22"N 11°16'55.92"E | 20,000 | E-W | S | 2010 | Spur-pruned single cordon | 2.50 × 0.80 | 0 (terracing)
Figure 2 shows an operational scheme of the experimental activity, divided into two main steps: a survey with a UAV equipped with a multispectral sensor to characterise the spatial variability and define a representative experimental design; and, subsequently, pre-harvest low-altitude surveys with a UAV equipped with an RGB sensor to acquire images of the bunches, which were processed to apply a yield estimation model.
Considering the strong effect of the vigour factor on vine productive response, the experimental design was planned based on vegetative spatial variability within each vineyard. A series of preliminary multispectral surveys was conducted at the end of July 2021 using a DJI Phantom 4 Multispectral (P4M) (SZ DJI Technology Co, Ltd, Shenzhen, China) equipped with a six-optics camera including one RGB sensor and five monochrome 2 MP sensors for spectral response measurement in the blue (450 nm ± 16 nm), green (560 nm ± 16 nm), red (650 nm ± 16 nm), red-edge (730 nm ± 16 nm) and near-infrared (840 nm ± 26 nm) bands. The UAV flights were conducted in clear sky conditions from 11:30 to 14:00. The radiometric correction of the multispectral images was performed through a vicarious calibration based on the absolute radiance method, using five homogeneous Lambertian panels of known reflectance (reflectance factors of 0.02, 0.09, 0.25, 0.50, and 0.88). Multispectral images were then processed using DJI Terra software (SZ DJI Technology Co, Ltd, Shenzhen, China), which allowed the radiometric correction of the images, the generation of the orthomosaics, and finally the calculation of the NDVI (Normalized Difference Vegetation Index). The NDVI map allowed for easy identification of representative vigour zones for grape detection, ensuring correct experimental design planning (Romboli et al., 2017; Di Gennaro et al., 2019). In detail, within each vineyard, four parcels were selected: two parcels located in a high vigour (HV) area and two parcels located in a low vigour (LV) area (Figure 3).
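The NDVI is the standard normalised ratio of near-infrared and red reflectance, NDVI = (NIR − Red)/(NIR + Red). As a minimal MATLAB sketch (not the DJI Terra pipeline actually used in this work), the index can be computed from two co-registered band rasters; the file names are hypothetical placeholders:

```matlab
% Minimal NDVI sketch from two co-registered band rasters.
% 'band_red.tif' and 'band_nir.tif' are hypothetical file names;
% the processing in this work was performed in DJI Terra.
red  = im2double(imread('band_red.tif'));   % red band (650 nm)
nir  = im2double(imread('band_nir.tif'));   % near-infrared band (840 nm)
ndvi = (nir - red) ./ (nir + red + eps);    % eps avoids division by zero
imagesc(ndvi, [-1 1]); colorbar; title('NDVI');
```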
Each parcel was composed of nine consecutive vines and was delimited by four cyan tags (12 cm × 10 cm) placed at the corners of a rectangle. Moreover, red tags (10 cm × 10 cm) were placed on the trunks of vines 1, 4, 7, and 10. The cyan and red tags are necessary for the image processing procedure described in Section 3, which includes geometric distortion correction, area measurement, and cropping of vines. During the 2022 season, the developed approach was tested in three vineyards (V_b, V_c, V_d), which are representative of different slopes, row orientations, and exposure conditions. To calculate the calibration model, 24 representative bunches (12 in HV + 12 in LV) were marked on vines adjacent to the experimental parcels in two vineyards (V_c and V_b) in the 2021 season. These bunches were used as the training set, and the linear regression model was validated on the three test datasets, as described in Section 4. At the end of each productive season (2021/2022), the yield per sampled vine was calculated by cutting each single bunch and weighing the total. At harvest time in the 2021 and 2022 seasons, imagery from the free Sentinel-2 (S2) satellite of the ESA Copernicus Programme was used to calculate the average NDVI of each plot and confirm that the experimental plots were correctly identified with regard to vegetative spatial variability.
2. Fruiting zone image acquisition
This study employed two distinct UAV platforms. Initially, multispectral monitoring was conducted to assess spatial variability through the calculation of vegetation indices. Subsequently, a series of flight campaigns was performed using a UAV equipped with an RGB camera to capture images of the fruiting zone. For the yield estimation task, the multirotor prototype Helyx-Zero was developed ad hoc (Sigma Ingegneria, Lucca, Italy) to acquire RGB images of the bunches (Figure 4a). The UAV was equipped with an additional board, a Raspberry Pi Zero 2 W, which mainly served the purpose of image storage and control of the RGB camera tilt angle. The frame was produced with a 3D printing technique using a highly resistant but extremely lightweight plastic material (Figure 4b).
The UAV monitoring system was powered by a 3000 mAh 3S LiPo battery, which provides more than 15 minutes of operational activity. RGB images were acquired by setting the flight altitude at 5 m above the ground and tilting the sensor at 45 degrees with respect to the vertical. In the first season, two flight campaigns were performed on September 2/3, 2021 (first measurement session, T1_2021) and September 22/23, 2021 (second measurement session, T2_2021), while in the second season, one flight campaign was performed on September 9, 2022.
3. Image processing
3.1. Image processing workflow structure
The sequential progression of the RGB image processing workflow is shown in Figure 5 and may be briefly outlined in three stages: i) geometric distortion correction, ii) Region Of Interest (ROI) cropping, and iii) grape detection. Each of the three stages is based on image segmentation conducted using the colour characteristics of the scene's components, which in this case are grape berries and cyan and red tags. Image segmentation was performed through the colourgram method (Antonelli et al., 2004), which condenses the colour-related data of each image into its respective colourgram. The colourgram of a given image is obtained by sequentially combining the frequency distribution curves of the R, G, and B channels, as well as of other variables directly derived from the R, G, and B values. The analysed variables include several key metrics. Brightness is determined by summing the values of the RGB channels. The relative colours, namely relative red, relative green, and relative blue, are calculated by dividing each channel by the brightness. Additionally, hue, saturation, and value (HSV) are derived by converting the RGB coordinates into the HSV colour space. Finally, nine score vectors (three for each model) are obtained by applying a Principal Component Analysis (PCA) approach to the raw, mean-centred, and autoscaled unfolded image data. In this manner, it is possible to analyse a dataset of RGB images in a completely automated way and without any a priori assumption. Furthermore, although colourgrams are generated by sequentially merging the frequency distribution curves of various colour parameters, it is still possible to revert to the pixel level in the original image domain to reconstruct the features of interest. Colourgram calculation and analysis were performed using a Graphical User Interface, Colourgrams GUI, developed in the MATLAB environment and freely downloadable from https://www.chimslab.unimore.it/downloads/ (Calvini et al., 2020). The Colourgrams GUI enabled the simultaneous calculation of colourgrams for the RGB pictures captured by the UAV platform and provided a comprehensive overview of the colour attributes of the sample images. In this manner, it was possible to identify the optimal thresholds, valid for all the images, to perform the segmentation of each element needed for image processing (cyan tags for geometric distortion correction, red tags for plant cropping, and grape bunches for yield prediction).
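To fix ideas, the sketch below illustrates how a colourgram-like signal can be assembled in MATLAB for a single image by concatenating the frequency distribution curves of the RGB channels, the relative colours, and the HSV coordinates. The PCA score vectors are omitted for brevity, the file name is a hypothetical placeholder, and the reference implementation remains the Colourgrams GUI (Calvini et al., 2020).

```matlab
% Illustrative colourgram-like signal for one image (PCA terms omitted).
img = im2double(imread('parcel.jpg'));           % hypothetical file name
R = img(:,:,1); G = img(:,:,2); B = img(:,:,3);
brightness = R + G + B + eps;                    % brightness = R + G + B
rr = R./brightness; rg = G./brightness; rb = B./brightness; % relative colours
hsv = rgb2hsv(img);                              % HSV coordinates
edges = 0:1/256:1;                               % 256-bin frequency curves
fdc = @(x) histcounts(x(:), edges, 'Normalization', 'probability');
colourgram = [fdc(R), fdc(G), fdc(B), ...
              fdc(rr), fdc(rg), fdc(rb), ...
              fdc(hsv(:,:,1)), fdc(hsv(:,:,2)), fdc(hsv(:,:,3))];
plot(colourgram); xlabel('Variable index'); ylabel('Frequency');
```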
3.2 Geometric distortion correction
To correct the image perspective due to lens distortion and differences in UAV positioning in terms of distance and inclination of the camera along the row, the RGB images were first registered using a control points selection tool (Figure 5a). To this aim, the cyan tags delimiting the four corners of the parcel were used. The first step required the automated detection and segmentation of the cyan tags in the acquired pictures. To this end, the colourgrams of the images were examined. The image reconstruction functionality of the Colourgrams GUI enabled the identification of relative red as the colour parameter yielding optimal results for the segmentation of cyan tags across all the images. A threshold value equal to 0.15 of relative red was applied: pixels with relative red values higher than the threshold are ascribable to the background, while pixels with relative red values lower than the threshold are ascribable to the cyan tags. For further details about the Colourgrams GUI and the image reconstruction procedure, the reader is referred to Calvini et al. (2020). Based on these results, the images were converted into binary images by setting to one all the pixels with a relative red value lower than 0.15 and setting to zero all the remaining pixels. Then, two different filtering operations were applied to improve the segmentation (a code sketch of these steps is given after the list):
- 1) all the objects in the binary images composed of fewer than 1300 pixels were eliminated, i.e., their value was set to zero;
- 2) the holes present in the segmented objects were filled; in this way, only the areas of the cyan tags of the parcel under investigation are considered.
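A minimal MATLAB sketch of the tag segmentation described above, using Image Processing Toolbox functions and a hypothetical input file name, could read:

```matlab
% Cyan tag segmentation: relative red threshold plus the two filters above.
img    = im2double(imread('parcel.jpg'));   % hypothetical file name
relRed = img(:,:,1) ./ (sum(img, 3) + eps); % relative red = R/(R+G+B)
BW = relRed < 0.15;                         % cyan tags: relative red < 0.15
BW = bwareaopen(BW, 1300);                  % 1) drop objects < 1300 pixels
BW = imfill(BW, 'holes');                   % 2) fill holes in the tags
imshowpair(img, BW, 'montage');             % visual check of the result
```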
The centre points of each cyan tag were computed and used as moving points, while the corners of a rectangle of known dimensions were used as fixed points. By using these pairs of control points (fixed points and moving points), a geometric transformation of the specified type was fitted. Since the visual scene appears slanted, a "projective" transformation was used in this instance, in which straight lines stay straight while parallel lines converge toward a vanishing point. The geometric transformation was then applied to the original picture. Moreover, the cyan tags in the corrected images were used to measure the image area (Figure 3a). In particular, the distance between the cyan tags from right to left (cmx) and from top to bottom (cmy) was measured for each parcel in the considered vineyards and expressed in centimetres. Then, the cmx and cmy values were divided by the corresponding number of image pixels, obtaining the resolution of each pixel expressed in cm/pixel for both the x dimension (Kx) and the y dimension (Ky). Then, for each image, the conversion factor Kn was calculated as

Kn = Kx × Ky

where Kn (cm2/pixel) is the area expressed in cm2 covered by each pixel of the n-th image, while Kx and Ky are the cm/pixel resolutions between the cyan tags from right to left (x-axis) and from top to bottom (y-axis) in the considered image.
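The following sketch outlines this correction, assuming BW holds the four segmented cyan tags from the previous snippet; the fixed-point layout, the field distances cmx and cmy, and the correspondence ordering of the points are hypothetical and must match the actual parcel.

```matlab
% Projective registration on the tag centres, then the Kn conversion factor.
stats  = regionprops(BW, 'Centroid');
moving = cat(1, stats.Centroid);            % tag centres = moving points
% Fixed points: corners of the target rectangle in pixel units; the row
% order must correspond to the moving points (hypothetical layout).
fixed  = [100 100; 1100 100; 100 600; 1100 600];
tform  = fitgeotrans(moving, fixed, 'projective');
corrected = imwarp(img, tform);             % apply the transformation
cmx = 480; cmy = 230;                       % field-measured tag distances (cm)
Kx = cmx / (1100 - 100);                    % cm/pixel along the x-axis
Ky = cmy / (600 - 100);                     % cm/pixel along the y-axis
Kn = Kx * Ky;                               % cm^2 covered by each pixel
```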
3.3 ROI cropping
Each corrected image of the parcel was cropped into three parts based on the red tags included in the image scene (Figure 5b). In this manner, three ROIs were obtained, each containing three different grapevines: ROI 1 (grapevines 1, 2, 3), ROI 2 (grapevines 4, 5, 6), and ROI 3 (grapevines 7, 8, 9). The same methodology outlined for the segmentation of the cyan tags was used for the segmentation of the red tags (see Section 3.2). The optimal colour parameter for segmentation in this instance was determined to be relative green, with a threshold value of 0.25. Pixels exhibiting relative green values greater than 0.25 were classified as background, whereas pixels with relative green values lower than 0.25 were classified as red tags. Subsequently, every image was converted into a binary image by setting to one all the pixels with relative green values below 0.25 and setting the remaining pixels to zero. Additionally, the segmentation was optimised by eliminating small objects composed of fewer than 500 pixels and then filling the holes in the detected objects. In total, in the 2021 season, two image datasets were obtained (T1_2021 for the first and T2_2021 for the second measurement session), each composed of 96 ROI images (4 vineyards × 4 parcels × 2 acquisitions × 3 ROIs). The vines in two parcels of the V_d vineyard were affected by grey mould disease, caused by the fungus Botrytis cinerea, prior to harvest. Consequently, the images from these two parcels (one in a high-vigour and the other in a low-vigour area) were excluded from the dataset, resulting in a final total of 84 images. In the 2022 season, one image dataset (named 2022) was obtained, composed of 36 ROI images (3 vineyards × 4 parcels × 3 ROIs).
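Under the same assumptions as the previous snippets, the red-tag segmentation and centroid-based cropping could be sketched as follows; splitting the image at the tag x-coordinates is one plausible implementation, not necessarily the exact routine used.

```matlab
% ROI cropping: segment red tags (relative green < 0.25), then cut the
% corrected image between consecutive tag centres (three vines per ROI).
relGrn = corrected(:,:,2) ./ (sum(corrected, 3) + eps);
BWtag  = imfill(bwareaopen(relGrn < 0.25, 500), 'holes');
stats  = regionprops(BWtag, 'Centroid');
xc     = sort(arrayfun(@(s) s.Centroid(1), stats)); % tag x-positions
rois   = cell(1, numel(xc) - 1);
for k = 1:numel(xc) - 1
    rois{k} = corrected(:, round(xc(k)):round(xc(k+1)), :);
end
```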
3.4 Grape detection
Among all the colour parameters analysed in the colourgrams, relative blue provided the most effective results in terms of red grape recognition and background pixel elimination (leaves, trunk, soil, etc.). A threshold value of 0.40 for relative blue was determined by a more thorough examination of the colourgrams. Pixels with relative blue values above the threshold are indicative of the background, whereas pixels with relative blue values below the threshold are indicative of ripe red grapes. As a result, the ROI images were transformed into binary images by assigning the value one to the pixels whose relative blue value was below 0.40, and zero to the remaining pixels. Following this, three distinct filters were applied to enhance the grape segmentation:
- 1) morphological closing of the objects on the binary image;
- 2) removal of the holes in the detected objects;
- 3) elimination of small objects composed of fewer than 100 pixels.
Thus, only the pixels associated with the clusters were taken into account (Figure 5c). The total number of segmented grape pixels was then calculated for each ROI image and multiplied by the Kn value of the considered image to obtain the total bunch surface (Bs) of the ROI, as sketched below. All the image processing steps were performed using both ad-hoc routines written in the MATLAB ver. 9.11 environment and functions available in the MATLAB Image Processing Toolbox ver. 11.4 (The MathWorks Inc, Natick, MA, USA).
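A sketch of this final stage, continuing from the previous snippets (the structuring element radius is an assumption, as the paper does not specify it):

```matlab
% Grape detection on one ROI: relative blue threshold, the three filters,
% and conversion of the pixel count into bunch surface (cm^2).
roi     = rois{1};
relBlue = roi(:,:,3) ./ (sum(roi, 3) + eps);
BWg = relBlue < 0.40;                 % ripe red grapes: relative blue < 0.40
BWg = imclose(BWg, strel('disk', 3)); % 1) morphological closing (radius assumed)
BWg = imfill(BWg, 'holes');           % 2) fill holes in detected objects
BWg = bwareaopen(BWg, 100);           % 3) drop objects < 100 pixels
Bs  = nnz(BWg) * Kn;                  % total bunch surface of the ROI (cm^2)
```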
4. Statistical Analysis
The yield was estimated from the correlation between the yield per vine weighed by conventional ground sampling and the bunch surface derived from image analysis (refer to Section 3.4). The training and validation phases were conducted on separate datasets. For the training phase, a linear regression model was computed using a set of 24 bunches extracted from images acquired outside the experimental parcels, while validation was performed on images collected from the experimental parcels in four vineyards over three monitoring campaigns conducted across the two years of research activity. In detail, a linear regression model was first constructed on the calibration set images, relating the bunch surface area (cm2) derived from image pixel counts to the bunch weight (g) measured on representative vines encompassing both high and low vigour parcels. Following this, the bunch weight of the test set images was estimated by applying the regression model to the bunch surface derived from the remote sensing data.
Subsequently, the yields per vine estimated from the images acquired in the experimental parcels during two surveys in 2021 (referred to as 'T1_2021' and 'T2_2021') and one survey in 2022 (referred to as 'T_2022') were validated against the yields of each individual vine per parcel, measured through the traditional destructive method at harvest time of each production season. The performance of the models was evaluated in terms of the coefficient of determination (R2) and the root mean square error (RMSE).
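As an illustration of these steps, a least-squares fit and the two figures of merit can be computed as follows; bs_cal, w_cal, bs_test, and w_test are hypothetical vectors of bunch surfaces (cm2) and bunch weights (g).

```matlab
% Calibration: linear model between bunch surface and bunch weight;
% validation: R2 and RMSE of the predictions on the test set.
p     = polyfit(bs_cal, w_cal, 1);    % model: weight = p(1)*surface + p(2)
w_hat = polyval(p, bs_test);          % predicted bunch weights
res   = w_test - w_hat;               % residuals
RMSE  = sqrt(mean(res.^2));
R2    = 1 - sum(res.^2) / sum((w_test - mean(w_test)).^2);
```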
Results
Calculation of the NDVI values allowed the identification of two representative zones accounting for vigour heterogeneity in each of the four vineyards: one representative of the high vigour (HV) zone and the other of the low vigour (LV) zone. Remote sensing data related to vigour variability within the vineyard were confirmed by ground measurements of yield, acquired by sampling 9 vines in each zone identified by the UAV survey (Table 2). The relationship between the bunch surface derived from image analysis and the bunch weight observed by traditional ground sampling shows a good correlation, with an R2 equal to 0.85 (Figure 6a); the resulting equation (y = 2.9279x – 50.578, where x is the bunch surface and y is the bunch weight) was then used for the prediction of the yield from remote sensing data.
Vineyard | Vigour | Yield kg/vine (mean ± st. dev.) | NDVI S2 (mean ± st. dev.)
V_a | HV | 1.97 ± 0.75 | 0.29 ± 0.03
V_a | LV | 1.33 ± 0.88 | 0.23 ± 0.01
V_b | HV | 2.36 ± 0.83 | 0.32 ± 0.01
V_b | LV | 0.52 ± 0.25 | 0.20 ± 0.02
V_c | HV | 2.04 ± 0.81 | 0.39 ± 0.01
V_c | LV | 0.49 ± 0.33 | 0.26 ± 0.01
V_d | HV | 2.80 ± 1.58 | 0.39 ± 0.01
V_d | LV | 1.52 ± 0.61 | 0.26 ± 0.04
The results of the comparison between yield measurements in the field and yield estimation from the UAV image analysis approach for the test set images are reported in Figure 6. For both the test set images acquired at T1 (Figure 6b) and those acquired at T2 (Figure 6c) in the 2021 season, satisfactory results were obtained, with R2 values equal to 0.85 and RMSE values equal to 1.29 kg/3 vines for T1_2021 and 1.24 kg/3 vines for T2_2021. The results obtained at T1_2021 and T2_2021 are comparable; therefore, a good yield estimate can be obtained several weeks before harvest. The yield is estimated as the sum of the production of the three vines included in each ROI; therefore, considering an average per single vine (calculated by averaging the two RMSE values and dividing by 3), the RMSE value is equal to 0.42 kg/vine. Furthermore, it has to be underlined that the yield estimates were calculated considering the two repeated images acquired for each parcel as separate objects, which made it possible to evaluate the repeatability of the procedure through the estimate of the within-sample variability. Finally, the approach developed with the samples acquired in the 2021 season was tested on the images acquired in the 2022 season. The correlation between measured and estimated yield shows satisfactory results, with an R2 value equal to 0.89 and an RMSE value equal to 1.21 kg/3 vines (Figure 6d). This result is comparable to that obtained in the previous season and confirms that the model can also be applied to different seasons.
Discussion
The present work is an upgrade of previous research published by Di Gennaro et al. (2019), which focused on the analysis of aerial digital images to predict the yield in a single vineyard and required a preliminary supervised step to crop the ROI zones.
In this advancement, two important issues are discussed and addressed. The first concerns the geometric image correction needed to reduce the deformation, caused by lens distortion, between the real world and the aerial image acquired at a greater distance from the target vines; this is necessary to extract accurate quantitative information from the images. The study then focuses on colour segmentation to detect the different objects depicted in the images and automate the image analysis process. Indeed, the identification of optimal threshold values for the proper colour parameters, according to the specific objects to be identified, allowed the automation of the different image analysis steps. Segmentation of the cyan tags based on the relative red colour parameter was used for geometric distortion correction and for the calculation, for each image, of the conversion factor (Kn) accounting for its spatial resolution, expressed as cm2/pixel. Subsequently, segmenting the red tags using the relative green colour parameter allowed the identification of the vines in the images, while a thresholding procedure based on the relative blue parameter was employed to segment the grape bunches and calculate their corresponding surface area by multiplying the pixel count by the Kn conversion factor of the respective image. The UAV yield estimation is then obtained by applying the model built between the bunch dimensions extracted by pixel counts from the images and the bunch weights observed by ground measurements on representative vines. The improved approach presented in this work allows a completely unsupervised grape segmentation workflow, which represents a strong upgrade of the previous methodology described by the authors.
The image dataset used in this work, collected in four different vineyard scenarios and encompassing two different vigour zones, accounts for several variability factors to validate our approach. The variations in the scenarios include different levels of land slope, row orientation, and exposure, whereas the variations in illumination include low and bright light during cloudy and sunny weather, respectively. These variations make the detection problem challenging. Nevertheless, thanks to the colourgrams approach, which allowed us to evaluate all the images simultaneously, it was possible to identify colour thresholds applicable to all the images acquired in different situations, thus automating the image processing phase.
The results obtained in the 2021 season show that it is possible to estimate the yield with an R2 equal to 0.85 several weeks before the harvest; furthermore, the approach can also be reproduced in different years. To further improve the performance of image analysis and yield estimation, it could also be possible in the future to identify more specific segmentation thresholds tailored to the light conditions of specific wine terroirs and/or to specific weather conditions.
As discussed in the previous work (Di Gennaro et al., 2019), the proposed approach is a good alternative to traditional measurement for several reasons, such as overcoming ground limits, low cost, accuracy, and relative speed. The studies demonstrated its potential in terms of recognising grape bunches and quickly estimating ripe yield at the time of flight; however, the approach suffers from two unavoidable physical limits, represented by green bunches and leaf cover. For these reasons, our study focused on red ripe grapes and employed a partial defoliation treatment on the morning side (northeast) of the canopy. This approach aimed to prevent sunburn and leverage the positive effects of canopy management, such as improving air circulation, reducing the risk of fungal attacks on grape clusters, decreasing humidity, and facilitating the penetration of fungicide sprays (Pieri & Fermaud, 2005; Sabbatini & Howell, 2010; Noyce et al., 2016). Obviously, in the case of complete defoliation, there would be greater visibility of the grape clusters and minimal shading, which would enhance segmentation performance. The potential of UAVs for large-scale and fast image acquisition can also be transferred to other agricultural contexts. To this aim, we are developing a similar approach for apple production. In this context, image processing is facilitated by environmental conditions; for example, apple trees are often cultivated on flat land, and the fruits are not significantly obscured by leaves, which aids in their identification.
In recent years, several innovative solutions based on artificial intelligence applied to 2D image analysis have been proposed to assess bunch compactness directly in the field and to estimate several OIV descriptors (Hacking et al., 2020; Lopes & Cadima, 2021). Khokher et al. (2023) studied an approach based on deep learning and computer vision techniques to automatically count inflorescences at the clearly visible inflorescences (E-L 12) phenological stage using RGB video data, and demonstrated that the counts correlate with the measured yield within a 4 % to 11 % error. The video capture was performed in an "on-the-go" manner with a ground vehicle driving at speeds of 3 to 4 km/h. Although these proximal sensing methods are precise, they are weak in terms of timing, which plays a key role in agricultural management. The advantage of the UAV approach is that working at a greater distance from the target vines allows up to 10 plants to be acquired in one image, while at the same time providing enough resolution to correctly discriminate the clusters within the canopy. In this regard, technological advancements have enabled high-resolution RGB cameras (20 MP) to be mounted even on entry-level commercial drones (around € 1,000 or less), allowing for sub-centimetre detail even at a 10 m flight altitude with minimal economic investment.
The results obtained with the present approach provided an average R2 of 0.85 and an RMSE of 0.42 kg/vine in 2021, and an R2 of 0.89 and an RMSE of 0.40 kg/vine in 2022. In the literature, there are no studies on unsupervised UAV-based yield estimation per vine reporting comparable RMSE indicators. For example, Torres-Sánchez et al. (2021) proposed an unsupervised and automated workflow for grape cluster detection using UAV photogrammetric point clouds and colour indices, achieving an R2 value of 0.82 between harvest weight and the projected area of points classified as grapes. Considering the study by Hacking et al. (2019) on yield estimation per vine from 2D ground-based images, the authors achieved an R2 of 0.88 and an RMSE of 0.44 kg/vine on a weight range similar to that presented in this work. This confirms that the results obtained here are comparable to, and slightly better than, those in the literature. Victorino et al. (2022) compared the accuracy of an image-based yield estimation approach with a manual method. They selected the most appropriate set of image variables to predict the yield, bypassing the occlusions caused by leaves. The image-based yield estimates outperformed the manual ones, achieving an R2 equal to 0.86. The result is satisfactory but presents some limits related to fully automated field applicability. In particular, the images were acquired by an "on-the-go" vehicle with a blue background placed behind each vine to estimate canopy porosity, and each vine needed a static plastic scale under the cordon to identify the segment and to provide scaling information for image analysis.
To streamline the image processing procedures, a specialised user-friendly graphical interface was also created in this study. This tool enables the upload of photos obtained from the UAV platform, processes each image using the proposed method, and presents the segmented grapes along with the estimated yield per plant. Tags of various colours were strategically positioned in the field to streamline the correction of distorted images, the measurement of areas, and the identification of vines. Nevertheless, if the vine spacing is already known, it is feasible to automate ROI extraction and circumvent the need for installing red tags in the field. A future aim of this project is to enhance the automatic correction of image distortion by using machine learning to identify objects in the surroundings, such as vineyard poles.
Conclusion
In the present paper, we proposed an approach based on aerial image analysis to estimate yield in vineyards. The method consists of acquiring RGB images using a UAV platform, processing the images to reduce perspective differences through a control point selection tool, and segmenting the grape pixels, which are then converted into yield per plant. The regression model calculated on a calibration set of images, relating the yield from remote sensing data to ground truth measurements, was then tested on a validation set, showing satisfactory results with R2 values equal to 0.85 and 0.89 for the 2021 and 2022 seasons, respectively. Further improvements can reasonably be gained by validating the image analysis workflow on other cultivars, while at the same time exploring different approaches, such as machine/deep learning tools for bunch pixel recognition and segmentation. Based on the obtained results, future perspectives include extending the method to other agricultural contexts and integrating UAV capabilities for large-scale analysis with other technologies, such as smartphones. This approach aims to create an easy-to-use tool enabling more targeted analyses, improved outcomes, and greater field applicability, while also reducing the need for defoliation treatments.
Acknowledgements
The authors are grateful to the Marchesi Mazzei, Agricola Cennino, and Castello di Ama farms for hosting the research, funded by the DIGIVIT Project PSR 2014-2020 Regione Toscana. The authors also gratefully acknowledge Sigma Ingegneria for developing the prototype Helyx-Zero.
References
- Antonelli, A., Cocchi, M., Fava, P., Foca, G., Franchini, G. C., Manzini, D., & Ulrici, A. (2004). Automated evaluation of food colour by means of multivariate image analysis coupled to a wavelet-based classification algorithm. Analytica Chimica Acta, 515(1), 3-13. https://doi.org/10.1016/j.aca.2004.01.005
- Aquino, A., Millan, B., Diago, M. P., & Tardaguila, J. (2018). Automated early yield prediction in vineyards from on-the-go image acquisition. Computers and Electronics in Agriculture, 144, 26–36. https://doi.org/10.1016/j.compag.2017.11.026
- Bellvert, J., Marsal, J., Girona, J., Gonzalez-Dugo, V., Fereres, E., Ustin, S. L., & Zarco-Tejada, P. J. (2016). Airborne thermal imagery to detect the seasonal evolution of crop water status in peach, nectarine and saturn peach orchards. Remote Sensing, 8, 1–17. https://doi.org/10.3390/rs8010039
- Bruno, A., Moroni, D., Dainelli, R., Rocchi, L., Morelli, S., Ferrari, E., Toscano, P., & Martinelli, M. (2022). Improving plant disease classification by adaptive minimal ensembling. Frontiers in Artificial Intelligence, 5, 868926. https://doi.org/10.3389/frai.2022.868926
- Calvini, R., Orlandi, G., Foca, G., & Ulrici, A. (2020). Colourgrams GUI: A graphical user-friendly interface for the analysis of large datasets of RGB images. Chemometrics and Intelligent Laboratory Systems, 196, 103915. https://doi.org/10.1016/j.chemolab.2019.103915.
- Caruso, G., Palai, G., Tozzini, L., D'Onofrio, C., & Gucci, R. (2023). The role of LAI and leaf chlorophyll on NDVI estimated by UAV in grapevine canopies. Scientia Horticulturae, 322, 112398. https://doi.org/10.1016/j.scienta.2023.112398
- Castillo-Martínez, M. Á., Gallegos-Funes, F. J., Carvajal-Gámez, B. E., Urriolagoitia-Sosa, G., & Rosales-Silva, A. J. (2020). Colour index based thresholding method for background and foreground segmentation of plant images. Computers and Electronics in Agriculture, 178, 105783. https://doi.org/10.1016/j.compag.2020.105783.
- Comba, L., Biglia, A., Aimonino, D. R., & Gay, P. (2018). Unsupervised detection of vineyards by 3D point-cloud UAV photogrammetry for precision agriculture. Computers and electronics in agriculture, 155, 84-95. https://doi.org/10.1016/j.compag.2018.10.005
- Di Gennaro, S. F., Toscano, P., Cinat, P., Berton, A., & Matese, A. (2019). A Low-Cost and Unsupervised Image Recognition Methodology for Yield Estimation in a Vineyard. Frontiers in Plant Science, 10, 559. https://doi.org/10.3389/fpls.2019.00559
- Di Gennaro, S.F., Vannini, G.L., Berton, A., Dainelli, R., Toscano, P., Matese, A. (2023). Missing Plant Detection in Vineyards Using UAV Angled RGB Imagery Acquired in Dormant Period. Drones. 7, 349. https://doi.org/10.3390/drones7060349
- Dutta, M. K., Singh, A., & Ghosal, S. (2015). A computer vision based technique for identification of acrylamide in potato chips. Computers and Electronics in Agriculture, 119, 40-50. https://doi.org/10.1016/j.compag.2015.10.007
- Fernandez-Vazquez, R., Stinco, C. M., Melendez-Martinez, A. J., Heredia, F. J., & Vicario, I. M. (2011). Visual and instrumental evaluation of orange juice colour: a consumers’ preference study, J. Sensory Studies, 26, 436-444. https://doi.org/10.1111/j.1745-459X.2011.00360.x
- Foca, G., Masino, F., Antonelli, A., & Ulrici, A. (2011). Prediction of compositional and sensory characteristics using RGB digital images and multivariate calibration techniques. Analytica Chimica Acta, 706(2), 238-245. https://doi.org/10.1016/j.aca.2011.08.046
- Gago, J., Douthe, C., Coopman, R. E., Gallego, P. P., Ribas-Carbo, M., Flexas, J., Escalona, J., & Medrano, H. (2015). UAVs challenge to assess water stress for sustainable agriculture. Agricultural Water Management, 153, 9–19. https://doi.org/10.1016/j.agwat.2015.01.020
- Giraudo, A., Calvini, R., Orlandi, G., Ulrici, A., Geobaldo, F., & Savorani, F. (2018). Development of an automated method for the identification of defective hazelnuts based on RGB image analysis and colourgrams. Food Control, 94, 233-240. https://doi.org/10.1016/j.foodcont.2018.07.018.
- Hacking, C., Poona, N., & Poblete-Echeverría, C. (2020). Vineyard yield estimation using 2-D proximal sensing: A multitemporal approach. Oeno One, 54(4), 793–812. https://doi.org/10.20870/oeno-one.2020.54.4.3361
- Hacking, C., Poona, N., Manzan, N., & Poblete-Echeverría, C. (2019). Investigating 2-D and 3-D Proximal Remote Sensing Techniques for Vineyard Yield Estimation. Sensors, 19, 3652. https://doi.org/10.3390/s19173652
- Hamuda, E., Glavin, M., & Jones, E. (2016). A survey of image processing techniques for plant extraction and segmentation in the field. Computers and Electronics in Agriculture, 125, 184-199. https://doi.org/10.1016/j.compag.2016.04.024
- Íñiguez, R., Palacios, F., Barrio, I., Hernández, I., Gutiérrez, S., & Tardaguila, J. (2021). Impact of Leaf Occlusions on Yield Assessment by Computer Vision in Commercial Vineyards. Agronomy, 11, 1003. https://doi.org/10.3390/agronomy11051003
- Jackson, D.I., & Lombard, P.B. (1993). Environmental and Management Practices Affecting Grape Composition and Wine Quality-A Review. American Journal of Enology and Viticulture, 44, 409–430.
- Khojastehnazh, M., Omid, M., & Tabatabaeefar, A. (2010). Development of a lemon sorting system based on colour and size. African Journal of Plant Science, 4(4), 122-127.
- Khokher, M. R., Liao, Q., Smith, A. L., Sun, C., Mackenzie, D., Thomas, M. R., Wang, D., & Edwards, E. J. (2023). Early Yield Estimation in Viticulture based on Grapevine Inflorescence Detection and Counting in Videos. IEEE Access, 11, 37790–37808. https://doi.org/10.1109/ACCESS.2023.3263238
- Liao, J., Wang, Y., Zhu, D., Zou, Y., Zhang, S., & Zhou, H. (2020). Automatic segmentation of crop/background based on luminance partition correction and adaptive threshold. IEEE Access, 8, 202611-202622. https://doi.org/10.1109/ACCESS.2020.3036278
- Liew, L.H., Wang, Y.C., & Cheah, W.S. (2012). Evaluation of control points distribution on distortions and geometric transformations for aerial images rectification. Procedia Engineering, 41, 1002 – 1008. https://doi.org/10.1016/j.proeng.2012.07.275
- Lopes, C. M., & Cadima, J. (2021). Grapevine bunch weight estimation using image-based features: comparing the predictive performance of number of visible berries and bunch area. Oeno One. https://doi.org/10.20870/oeno-one.2021.55.4.4741
- Lopes, J. F., Barbon, A. P. A. C., Orlandi, G., Calvini, R., Lo Fiego, D. P., Ulrici, A., & Barbon Jr., S. (2020). Dual Stage Image Analysis for a complex pattern classification task: Ham veining defect detection. Biosystems Engineering, 129-144. https://doi.org/10.1016/j.biosystemseng.2020.01.008
- López-García, P., Ortega, J. F., Pérez-Álvarez, E. P., Moreno, M. A., Ramírez, J. M., Intrigliolo, D. S., & Ballesteros, R. (2022). Yield estimations in a vineyard based on high-resolution spatial imagery acquired by a UAV. Biosystems Engineering, 224, 227-245. https://doi.org/10.1016/j.biosystemseng.2022.10.015
- Lu, Y., Young, S., Wang, H., & Wijewardane, N. (2022). Robust plant segmentation of colour images based on image contrast optimization. Computers and Electronics in Agriculture, 193, 106711. https://doi.org/10.1016/j.compag.2022.106711
- Menozzi, C., Calvini, R., Nigro, G., Tessarin, P., Bossio, D., Calderisi, M., Ferrari, V., Foca, G., & Ulrici, A. (2023). Design and application of a smartphone-based device for in vineyard determination of anthocyanins content in red grapes. Microchemical Journal, 191, 108811. https://doi.org/10.1016/j.microc.2023.108811
- Noyce, P., Steel, C., Harper, J., & Wood, M. R. (2016). The basis of defoliation effects on reproductive parameters in Vitis vinifera L. cv. Chardonnay lies in the latent bud. American Journal of Enology and Viticulture, 67, 199–205. https://doi.org/10.5344/ajev.2015.14051
- Nuske, S., Wilshusen, K., Achar, S., Yoder, L., Narasimhan, S., & Singh, S. (2014). Automated Visual Yield Estimation in Vineyards. Journal of Field Robotics, 31, 837–860. https://doi.org/10.1002/rob.21541
- OIV. (2007). World Vitivinicultural Statistics 2007 – Structure of the World Vitivinicultural Industry 2007. Available at: http://news.reseauconcept.net/images/oiv_uk/Client/Statistiques_commentaires_annexes_2007_EN.pdf (accessed May 15, 2018).
- Orlandi, G., Calvini, R., Foca, G., & Ulrici, A. (2018a). Automated quantification of defective maize kernels by means of multivariate image analysis. Food Control, 85, 259-268. https://doi.org/10.1016/j.foodcont.2017.10.008
- Orlandi, G., Calvini, R., Pigani, L., Foca, G., Vasile Simone, G., Antonelli, A., & Ulrici, A. (2018b). Electronic eye for the prediction of parameters related to grape ripening. Talanta, 186, 381-388. https://doi.org/10.1016/j.talanta.2018.04.076
- Pérez-Zavala, R., Torres-Torriti, M., Cheein, F. A., & Troni, G. (2018). A Pattern Recognition Strategy for Visual Grape Bunch Detection in Vineyards. Computers and Electronics in Agriculture, 151, 136–149. https://doi.org/10.1016/j.compag.2018.05.019
- Pieri, P., & Fermaud, M. (2005). Effects of defoliation on temperature and wetness of grapevine berries. Acta Horticulturae, 689, 109-116. https://doi.org/10.17660/ActaHortic.2005.689.9
- Poblete-Echeverría, C., Olmedo, G. F., Ingram, B., & Bardeen, M. (2017). Detection and segmentation of vine canopy in ultra-high spatial resolution RGB imagery obtained from unmanned aerial vehicle (UAV): a case study in a commercial vineyard. Remote Sensing, 9, 268. https://doi.org/10.3390/rs9030268
- Pôças, I., Paço, T. A., Paredes, P., Cunha, M., & Pereira, L. S. (2015). Estimation of actual crop coefficients using remotely sensed vegetation indices and soil water balance modelled data. Remote Sensing, 7, 2373–2400. https://doi.org/10.3390/rs70302373
- Pothen, Z., & Nuske, S. T. (2016). Automated assessment and mapping of grape quality through image-based colour analysis. IFAC-PapersOnLine, 49, 72–78. https://doi.org/10.1016/j.ifacol.2016.10.014
- Quevedo, R. A., Aguilera, J. M., & Pedreschi, F. (2010). Colour of salmon fillets by computer vision and sensory panel. Food and Bioprocess Technology, 3, 637-643. https://doi.org/10.1007/s11947-008-0106-6
- Richards, J. A., & Jia, X. (2006). Remote sensing digital image analysis - An introduction, Fourth Edition, Springer-Verlag Berlin Heidelberg, Germany.
- Romboli, Y., Di Gennaro, S. F., Mangani, S., Buscioni, G., Matese, A., Genesio, L., & Vincenzini M. (2017). Vine vigour modulates bunch microclimate and affects the composition of grape and wine flavonoids: an unmanned aerial vehicle approach in a Sangiovese vineyard in Tuscany. Australian Journal of Grape and Wine Research, 23, 368–377. https://doi.org/10.1111/ajgw.12293
- Sabbatini, P., & Howell, G. S. (2010). Effects of early defoliation on yield, fruit composition, and harvest season cluster rot complex of grapevines. Horticultural Science, 45, 1804–1808. https://doi.org/10.21273/HORTSCI.45.12.1804
- Santesteban, L. G., Di Gennaro, S. F., Herrero-Langreo, A., Miranda, C., Royo, J. B., & Matese, A. (2017). High-resolution UAV-based thermal imaging to estimate the instantaneous and seasonal variability of plant water status within a vineyard. Agricultural Water Management, 183, 49–59. https://doi.org/10.1016/j.agwat.2016.08.026.
- Shoaib, M., Hussain, T., Shah, B., Ullah, I., Shah, S. M., Ali, F., & Park, S. H. (2022). Deep learning-based segmentation and classification of leaf images for detection of tomato plant disease. Frontiers in Plant Science, 13. https://doi.org/10.3389/fpls.2022.1031748
- Torres-Sánchez, J., Mesas-Carrascosa, F. J., Santesteban, L.-G., Jiménez-Brenes, F. M., Oneka, O., Villa-Llop, A., Loidi, M., & López-Granados, F. (2021). Grape Cluster Detection Using UAV Photogrammetric Point Clouds as a Low-Cost Tool for Yield Forecasting in Vineyards. Sensors, 21, 3083. https://doi.org/10.3390/s21093083
- Ulrici, A., Foca, G., Ielo, M. C., Volpelli, L. A., & Lo Fiego, D. P. (2012). Automated identification and visualization of food defects using RGB imaging: Application to the detection of red skin defect of raw ham. Innovative Food Science & Emerging Technologies, 16, 417-426. https://doi.org/10.1016/j.ifset.2012.09.008
- Victorino, G., Braga, R. P., Santos-Victor, J., & Lopes, C. M. (2022). Comparing a New Non-Invasive Vineyard Yield Estimation Approach Based on Image Analysis with Manual Sample-Based Methods. Agronomy, 12(6), 1464. https://doi.org/10.3390/agronomy12061464
- Victorino, G., Braga, R., Santos-Victor, J., & Lopes, C. M. (2020). Yield components detection and image-based indicators for non-invasive grapevine yield prediction at different phenological phases. Oeno One.
- Wani, J. A., Sharma, S., Muzamil, M., Ahmed, S., Sharma, S., & Singh, S. (2022). Machine learning and deep learning based computational techniques in automatic agricultural diseases detection: Methodologies, applications, and challenges. Archives of Computational Methods in Engineering, 29(1), 641-677. https://doi.org/10.1007/s11831-021-09588-5