Wine aging authentication through near infrared spectroscopy: a feasibility study on chips and barrel-aged wines

Aim: This research primarily focuses on exploring the suitability of near infrared (NIR) spectroscopy with multivariate data analysis as a tool to classify commercial wines depending on the aging process. It is aimed at discriminating between wines aged in barrels and those obtained using alternative products (chips). Methods and results: Around 75 commercial barrel-aged red wines issued from the appellation “Valpolicella” (Italy) were analyzed. Moreover, 15 wines were aged at the experimental winery of the Research Centre of Viticulture and Enology in Asti using different types of commercial oak chips. Wines were analyzed in transmittance using NIR regions of the electromagnetic spectrum. Principal component analysis (PCA) and partial least squares (PLS) analyses were used to classify wines: a preliminary step was carried out using PCA that showed interesting groups in the whole data set. Next, in order to test if combined explanatory variables made it possible to discriminate treatments and how they are useful to predict which group a new observation will belong to, an orthogonal partial least squares discriminant analysis (OPLS-DA) was carried out. Several wine groups were considered, defined by factors including the aging process, the type of oak used for aging (wood barriques, barrels or chips) and the wine typologies (differing for some enological parameters). Conclusions: Overall, OPLS-DA models correctly classified >90 % of the wines. These results demonstrate the potential of combining spectroscopy with chemometric data analysis as a rapid method to classify wines according to their aging process. Nevertheless, the development of a mathematical model for predictive purposes is a complex task: indeed, large databases for different wines should be constructed, and other spectral IR zones might be evaluated for improving the method performance in determining wine aging process. 
Significance and impact of the study: This study contributes to the development of an easy-to-use and easily applicable NIR method for correlating the infrared “fingerprint” spectrum with the aging process in wines, aimed at implementing a technique able to discriminate wines aged with different wood types, that can be progressively used in the laboratory for routine fraud inspection.


INTRODUCTION
Wine is generally sold with labeling information regarding its characteristics (including the grape variety used, the appellation or grape-growing region, and the vintage or age), and price and consumer expectations are often determined based on these data. This information often implies facts about the aging process that the wine underwent, either implicit (as contained in the appellation regulations and vintage indication) or explicit (description of the aging process). Some recent and high-profile examples highlight the need for improvements in our ability to authenticate the labeling information for any given bottle of wine (Takeoka and Ebeler, 2011).
Indeed, the different frauds threatening the wine market include the replacement of the wine declared on the label with another of lesser value. Failure to fulfill the wine appellation regulations, which allows a less valuable wine to be put on the market in place of a quality one, is one of these cases (Takeoka and Ebeler, 2011). For many wines, these rules also impose a specific aging time in wood and a specific aging mode (barriques, barrels, etc.), which are not necessarily respected. This happens because the traditional aging process is associated with high financial costs, due to the price of the containers and the long refinement time (Pérez-Coello and Díaz-Maroto, 2009). In this context, the use of oak alternatives provides the winemaker with a way to lower the costs related to traditional refinement in barrels while still adding a woody touch to the wine. Without proper regulation, this could lead to fraud if such wine is offered as barrel-aged wine (Cano-Lopez et al., 2008; Ortega-Heras et al., 2010), as false use of quality indications by unauthorized parties is detrimental to consumers and legitimate producers.
In Europe, the use of oak fragments to macerate wines is an alternative to oak barreling, and was regulated for the first time in 2005 after a long debate. This enological practice is approved by EU regulations (CE) No 2165/2005 and (CE) No 1507/2006, which define the terms of use of oak fragments in wine, and the current wine Regulation 606/2009 (Appendix IX and XVI) also encompasses the use of chips in winemaking. Nevertheless, European regulations on wine do protect specific labeling (for example, protected origin or "reserve") for wines that have been obtained exclusively through aging in barrels. In addition, the OIV (International Organisation of Vine and Wine) resolutions on this matter explicitly forbid wines with particular indications to be treated with wood fragments, as the enological CODEX clearly differentiates between the two methods of "aging in small capacity wooden containers (OENO 8/01)" and "usage of pieces of oak wood in winemaking (OENO 9/01)" (OIV, 2006). Moreover, each member state has the right to restrict EU authorizations on its own national territory. For example, in Italy the use of wood alternatives such as chips and staves is currently only allowed for common wines and IGT (Indicazione Geografica Tipica) wines, but not for DOP (Denominazione di Origine Protetta) wines with appellation categories DOCG or DOC (Denominazione di Origine Controllata e Garantita and Denominazione di Origine Controllata), equivalent to the EU Protected Designation of Origin (PDO).
Beyond regulations on compliance by wine producers, an open question is the reliability of a method for distinguishing barrel-aged from chips-refined wines that could be affordably applied in routine fraud control. To date, as far as the authors know, there is still a lack of quick and inexpensive screening methods able to assess the veracity of statements regarding the duration or manner of aging (Versari et al., 2014). Therefore, cost-effective, fast and non-invasive analytical tools should be implemented in order to distinguish between these two types of treatments and avoid possible frauds. The investigation presented herein was undertaken with the aim of exploring the suitability of near infrared spectroscopy (NIR), coupled with multivariate analysis, as a rapid, simple and economical method for the discrimination between wines aged with different methods.
Among other analytical techniques, infrared spectroscopy coupled with multivariate data analysis was chosen as an interesting tool because it has been used for the determination of oak volatile compounds in wine (Garde-Cerdán et al., 2010; Zhang et al., 2010) and for classifying barrels (Michel et al., 2013) and tannins (Ricci et al., 2016), and it has recently been proposed for discriminating wines aged in different types of wood containers and for different time periods (Basalekou et al., 2017; Ferreiro-González et al., 2019; Sanchez-Gomez et al., 2019). Nevertheless, an application for discriminating between traditional wood aging and alternatives is still to be studied. In this regard, it is necessary to identify the discriminating algorithms and create robust databases (Sanchez-Gomez et al., 2019) in order to have a method that can be used routinely in the laboratory and can quickly provide reliable information about the type of wood used for the aging of wines. For this purpose, a research project was jointly carried out by the CREA Viticulture and Enology Research Centre and the Italian Central Inspectorate for Quality Protection and Fraud Prevention (ICQRF).
The present study was carried out in order to complete and contextualize, for the Italian wine scene, the results obtained by foreign research groups on similar topics (Basalekou et al., 2017; Ferreiro-González et al., 2019; Sanchez-Gomez et al., 2019). Around 90 wines belonging to the DOC Valpolicella (Regione Veneto, Disciplinari vini DOCG, DOC e IGT, 2013) were tested by NIR. Full spectra were acquired, and a PLS algorithm was trained as a screening method in order to obtain a regression model that would enable the correlation of spectral profiles with the aging process that a wine underwent.

MATERIALS AND METHODS
A total of 73 commercial barrel-aged wines were analyzed, and 15 wines were aged at laboratory scale using different types of commercial oak chips on a Valpolicella base wine. Aging experiments were carried out at the experimental winery of the Research Centre for Viticulture and Enology in Asti, and all the analyses were carried out at the Central Inspectorate for Quality Protection and Fraud Prevention of Conegliano (TV).

Commercial wine samples
In total, 88 wines were analyzed; 40 wines were collected by sampling directly at the wineries, from the aging containers or from bottles under storage at the cellars. One single appellation, Valpolicella, was chosen for this trial, in order to analyze wines that shared the same grape varieties but differed in vinification techniques, aging period and analytical parameters (see Table 1) according to the wine category (Amarone della Valpolicella DOCG, Ripasso Valpolicella DOC, Valpolicella DOC). All these wines were obtained from producers whose refining methods were ascertained and verified. Another 33 wines were purchased on the market. The aging procedure of these wines is declared by the producers and guaranteed by the DOCG regulation verified by the consortium (Regione Veneto, Disciplinari vini DOCG, DOC e IGT, 2013). Finally, 15 wines were artificially refined with chips at the Research Centre experimental winery, as detailed in the next paragraph; artificial treatment was necessary because oak chips are not allowed by DOC regulations. Table 1 lists the distribution and characteristics of the 88 wines tested.

Oak chips used and protocol employed for the aging trial
Different chips from three companies were procured, encompassing a significant number of alternative products with different source wood (French oak or American oak), granulometry (7-15 mm pieces or 2-7 mm granules), and different toasting methods (toasted and untoasted; different roasting degrees within the toasted chips were not considered because the classification of the declared toasting level is not homogeneous among different producers). In detail, 15 samples of chips were used, and the characteristics thereof are listed in Table 2.
For aging assays on finished wines, tanks of 5 L capacity were filled with one Valpolicella DOC wine and the different types of oak fragments were added at a dose of 3 g/L. The wines and fragments were in contact for 45 days (6 weeks) in the cellar, at 18 °C, after which the oak fragments were removed by racking and the wine was bottled in 0.75 L bottles for analysis.
Table 1 footnotes: *Aging in wood containers (barriques or larger barrels), according to the appellation regulation; **samples artificially refined with chips at the experimental winery (use of chips is not permitted for DOC commercial wines).

Near infrared scanning
Near infrared (NIR) spectra were recorded in transmittance mode on an MPA Bruker near infrared spectrometer (Bruker Optik GmbH, Germany) equipped with a TE-InGaAs detector, in the range 11500-4000 cm⁻¹ at a temperature of 30 °C, using clear glass vials of 1 mL volume and 6 mm internal pathlength, sealed with polyethylene snap caps. For each sample, 32 scans were recorded with a spectral resolution of 4 cm⁻¹ and then averaged. A preliminary analysis of the spectra using the instrument software (OPUS 6.5, Bruker Optik GmbH, Germany) allowed us to identify the ranges that were useful for further chemometric analysis. The ranges 11255-7183 cm⁻¹ and 6372-5410 cm⁻¹ were chosen because they were not saturated by water absorbance.
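The restriction of the spectra to the two retained ranges can be illustrated with a minimal Python sketch. This is not the OPUS procedure used in the study: the function name select_windows, the 4 cm⁻¹ point spacing and the placeholder absorbance values are assumptions made only for illustration (the text reports the spectral resolution, not the point spacing).

```python
# Windows retained in the text (cm^-1), written as (high, low) bounds.
KEPT_WINDOWS = [(11255.0, 7183.0), (6372.0, 5410.0)]


def select_windows(wavenumbers, absorbances, windows=KEPT_WINDOWS):
    """Keep only the points whose wavenumber falls inside one of the windows."""
    kept_w, kept_a = [], []
    for w, a in zip(wavenumbers, absorbances):
        if any(low <= w <= high for high, low in windows):
            kept_w.append(w)
            kept_a.append(a)
    return kept_w, kept_a


# Hypothetical spectrum: one point every 4 cm^-1 from 11500 down to 4000 cm^-1.
# The real point spacing of the instrument is not stated in the text.
wavenumbers = [11500.0 - 4.0 * i for i in range(1876)]
absorbances = [0.0] * len(wavenumbers)  # placeholder values, not measured data

kept_w, kept_a = select_windows(wavenumbers, absorbances)
print(len(wavenumbers), "points recorded,", len(kept_w), "points kept")
```

Only the variables inside the two windows would then enter the X matrix used for the chemometric models.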

Statistical analysis
Statistical treatments were carried out using SIMCA 15.0.2 software (Umetrics, Sartorius, Sweden); other analyses and the related graphic representations were produced using the free statistical software PAST (Hammer, Harper, and Ryan, 2001).
The chemometric analysis was first based on a principal component analysis (PCA) to investigate interrelationships between samples and to unveil patterns and trends. Next, an orthogonal partial least squares (OPLS) regression (Wold, 1966; Barker and Rayens, 2003) was applied to derive the models to be used in a discriminant analysis (OPLS-DA), as previously proposed by other authors in the same field of investigation (Garde-Cerdán et al., 2010; Tao et al., 2012). The PCA method extracts the most relevant information from an X dataset by projecting it onto latent variables (LVs), which are linear combinations of the original variables, thus reducing the dimensionality of high-dimensional and strongly correlated original datasets. Outliers were identified using the automated SIMCA algorithm based on the residuals (the deviations between the data and the PC model), named DModX (Bylesjö et al., 2006).
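The projection onto latent variables and the residual-based outlier check can be sketched as follows. This is a simplified illustration under stated assumptions, not the SIMCA implementation: the data are synthetic, the function names are hypothetical, and the residual distance is a plain root-mean-square residual per sample, without the degrees-of-freedom normalization that SIMCA applies to DModX.

```python
import numpy as np


def pca_fit(X, n_components):
    """Mean-centre X and return (mean, loadings, explained variance ratios)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T  # loadings, shape (variables, components)
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return mean, P, explained


def residual_distance(X, mean, P):
    """RMS residual of each sample after projection onto the PC model."""
    Xc = X - mean
    E = Xc - (Xc @ P) @ P.T  # part of each spectrum not described by the model
    return np.sqrt(np.mean(E ** 2, axis=1))


# Synthetic data (assumption): 30 samples, 200 variables, two latent variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 200))
X += 0.01 * rng.normal(size=X.shape)
X[5] += rng.normal(size=200)  # one sample deliberately made atypical

mean, P, explained = pca_fit(X, n_components=2)
d = residual_distance(X, mean, P)
suspect = int(np.argmax(d))
print("cumulative explained variance:", round(float(explained.sum()), 3))
print("largest residual at sample index:", suspect)
```

A sample whose residual distance is far above that of the others is poorly described by the model and is a candidate outlier.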
This led to three different datasets fed to the OPLS analysis: 1) an ensemble dataset with one outlier removed; 2) a subset of Valpolicella wines with one outlier removed; 3) a dataset divided into three groups with one outlier removed. Hence, in the performed analysis, the total number of samples varied between 89 and 30, with nearly two-thirds used for modeling (calibration) purposes and one-third for validation.
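The two-thirds/one-third partition described above can be sketched as a simple random split. The text does not state how samples were allocated (randomly, stratified by class, or otherwise), so the random allocation, the seed and the function name below are assumptions; the 30 sample identifiers stand in for the smallest dataset mentioned.

```python
import random


def split_calibration_validation(sample_ids, validation_fraction=1 / 3, seed=0):
    """Randomly assign about one-third of the samples to validation."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n_val = round(len(ids) * validation_fraction)
    return sorted(ids[n_val:]), sorted(ids[:n_val])


# Hypothetical identifiers for a 30-sample dataset.
calibration, validation = split_calibration_validation(range(30))
print(len(calibration), "calibration samples,", len(validation), "validation samples")
```

With unbalanced classes, a split stratified by group would be a natural refinement, so that each class is represented in both subsets.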

RESULTS AND DISCUSSION
The work was based on the collection of a number of wine samples whose origin was certain, including any information regarding the type of refinement the wine had undergone. The aim was to obtain a robust database that would allow the development of a reliable discriminating statistical method for the different types of wood used.
Commercial wines aged in barriques, barrels and in steel tanks were recovered (Table 1) and compared with wines aged ad hoc with 15 different types of oak chips (Table 2).

Wines differences and aging: the dataset at a glance
PCA was used to provide an overview of a data table to reveal dominating variables, trends and to find outliers. Therefore, a PCA-X analysis was carried out, regarding the independent variables (X) for the identification of dependable trends and possible X outliers. The main goal of this analysis was to indicate if any of the principal components describing the whole dataset variance were possibly related to the type of refinement that wines underwent.
The first PCA was performed on the X dataset (wavelength values) with no outliers removed. The first four principal components (or LVs) explained 99.9 % of the X dataset variance. Furthermore, this PCA made it possible to recognize patterns, mainly associated with component 2, and to identify one outlier (data not shown), which was subsequently removed. A second PCA was then performed on the X dataset with the outlier removed (Figure 1); the first four principal components (or LVs) still explained 99.7 % of the X dataset variance. This analysis also made it possible to identify different clusters, as depicted in Figure 1, where component two is plotted against component four (together accounting for more than 40 % of the total variation).
It is worth noting that wines artificially refined with chips were mostly found in a different sector from wines refined in barriques and larger barrels (Amarone and, to some extent, also Ripasso), and also from unoaked Valpolicella wines. This last finding was particularly promising, as all the Valpolicella wines (treated or not with oak chips) shared the main chemical characteristics (see Table 1) and enological production practices: these wines differed only in the impact of the oak chip refinement, which was therefore captured by the chemometric method under test. Another general finding arising from the data overview was that different chips, although giving different results, led to the production of wines that were still grouped together.

Wine differences and aging: discriminant analysis
The orthogonal partial least squares discriminant analysis (OPLS-DA) was carried out with two main objectives. The first objective was descriptive: to find new discriminating variables obtained from the combination of original variables, so that the projections of predetermined observation classes are well separated in the new variable space. The second objective was decision-oriented: to define a rule for assigning new individuals (measured in the same way as the sample or training set) to one of the predetermined classes (Genisheva et al., 2018).
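The decision-oriented step can be illustrated with a minimal sketch of how a PLS-type discriminant model is commonly turned into a class assignment: class membership is coded as a dummy (0/1) response matrix, and a new observation is assigned to the class with the largest predicted response. This rule is an assumption for illustration; the text does not state the exact assignment rule applied by SIMCA, and the predicted values below are hypothetical.

```python
# Class names follow the three groups considered in the study.
CLASSES = ["unaged", "barrel", "chips"]


def dummy_encode(labels, classes=CLASSES):
    """Code each class label as a 0/1 row, one column per class."""
    return [[1.0 if label == c else 0.0 for c in classes] for label in labels]


def assign_class(predicted_row, classes=CLASSES):
    """Assign the class whose predicted dummy response is the largest."""
    best = max(range(len(classes)), key=lambda j: predicted_row[j])
    return classes[best]


Y = dummy_encode(["chips", "unaged"])
print(Y)

# Hypothetical predicted responses for one new wine (not measured data).
y_hat = [0.12, 0.21, 0.71]
print(assign_class(y_hat))
```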

Focus on chips-refined wines
As a first step, an OPLS-DA was carried out on data referring to the Valpolicella wines only: unaged wines and experimental wines treated with chips (Figure 2). Figure 2 shows how the new discriminating variables, obtained from the orthogonal projections of combined original variables, clearly separate the observations, so that the predetermined classes are well separated in the new variable space. This result, although obtained on a smaller and simplified data set, was encouraging for the application of the method in further analyses, as the impact of 15 different chips was clearly detectable by the NIR method and distinguishable from non-refined wines. This supports and extends the results obtained in a previous work based on infrared spectroscopy (Basalekou et al., 2017), in which FT-IR was applied to four different Greek wines and chip refinement was detected (although a single oak chip product was applied to each wine).

Analysis on all wines
As a further step, the OPLS-DA analysis was carried out on the whole data set, in order to explore the potential of this chemometric approach to discriminate chips-refined wines not only from untreated wines, but also from barrel-aged wines.
Three sets of wine data were then identified: a training set consisting of 55 samples to estimate model parameters; an internal validation set of 29 samples to assess the predictive ability of the obtained models; and an external, independent test set of 33 samples to test the model predictive ability. All the wines belonging to the training and validation sets had verified wood aging information (sampled at wineries or processed at the Research Centre experimental winery), whereas the external test set consisted of commercial wines whose aging process was known through the producers' declaration. The training set samples were then divided into three different predetermined groups: non-aged wines; wines aged in barrique or larger barrels; and wines refined with chips.
Choice of the explanatory variables to be adopted for the realization of the OPLS-DA model was automatically implemented through a programmed stepwise forward procedure by the SIMCA software. R2X(cum) of the model was assessed at 0.999, and R2(cum) of the two first predictive projections was 0.748 (Eriksson, 1999; Bylesjö et al., 2006). Figure 3 shows how the graphic representation of different types of treatment makes it possible to distinguish between traditionally refined wines (barrique/larger wood barrels) and those aged with chips or unaged.
FIGURE 2. OPLS-DA analysis on Valpolicella wines. Score plot showing two subgroups of samples according to the aging processes: 1) green, no wood aging; 2) blue, refined with chips.
The overall performance of the OPLS-DA can be appreciated by observing the confusion matrices. The cross-validation matrix is presented in Table 3, where the diagonal (bold) indicates the correctly identified samples. According to the purpose of the study, Amarone and Ripasso wines were grouped together in the same category, as they were aged in barriques or larger barrels. The cross-validation was performed to assess the accuracy of the classification and its predictive ability, considering 29 random samples within the training set. Cross-validation verifies the placement within groups when, one at a time, individual elements are evaluated with the classification algorithm obtained by excluding them from the prediction sample. The confusion matrix summarizes the classification of the cross-validation samples by the classification algorithm and makes it possible to quickly appreciate the percentage of well-classified observations: 96.55 %.
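The one-at-a-time cross-validation described above can be sketched as a leave-one-out loop that fills a confusion matrix. To keep the example self-contained, a nearest-centroid classifier is used as a stand-in for the OPLS-DA model, and the two-dimensional scores are invented toy values; neither is taken from the study.

```python
def nearest_centroid_fit(X, y):
    """Stand-in classifier (not OPLS-DA): one mean vector per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def nearest_centroid_predict(centroids, row):
    """Return the class whose centroid is closest to the sample."""
    def sq_dist(centre):
        return sum((a - b) ** 2 for a, b in zip(row, centre))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def leave_one_out(X, y):
    """Refit without sample i, predict sample i, tally true vs predicted class."""
    labels = sorted(set(y))
    matrix = {t: {p: 0 for p in labels} for t in labels}
    for i in range(len(X)):
        model = nearest_centroid_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        matrix[y[i]][nearest_centroid_predict(model, X[i])] += 1
    correct = sum(matrix[c][c] for c in labels)
    return matrix, correct / len(X)


# Toy scores (invented): three well-separated groups of three samples each.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
     [3.0, 3.1], [3.2, 2.9], [2.9, 3.0],
     [0.1, 3.0], [0.0, 3.2], [0.2, 2.9]]
y = ["unaged"] * 3 + ["barrel"] * 3 + ["chips"] * 3

matrix, accuracy = leave_one_out(X, y)
for true_label, row in matrix.items():
    print(true_label, row)
print("fraction correctly classified:", accuracy)
```

The diagonal of the resulting matrix counts the correctly classified samples, and its sum divided by the number of samples gives the percentage reported for the cross-validation.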
The OPLS discriminant analysis was finally tested on an independent data set, in order to assign samples purchased on the market to a predicted category (barrel-aged or unrefined wines) and to compare the result with the information declared by the producers and guaranteed by the Consorzio Vini Valpolicella. The analysis was performed on a group of commercially available DOCG and DOC wines made up of 19 Amarone, four Ripasso and ten Valpolicella, and produced the following results: 90.9 % of the samples (30 of 33) were assigned to the correct classes as declared: 22/23 wines were assigned to the barrique/barrel category (as provided by the production regulations) while eight of the ten wines that did not undergo aging (as provided by the production regulations) were correctly classified as unoaked.
FIGURE 3 (caption fragment). 3) red, aged in barriques or larger barrels.
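As a check on the arithmetic, the per-class counts reported for the external test set (22 of 23 and 8 of 10) can be combined into per-class and overall rates of correct classification. The sketch below uses only those counts; the dictionary keys and the function name are illustrative.

```python
# (correctly classified, total) per declared class, as reported in the text.
reported = {
    "barrique/barrel (Amarone + Ripasso)": (22, 23),
    "unoaked (Valpolicella)": (8, 10),
}


def summarize(counts):
    """Return per-class and overall percentages of correct classification."""
    per_class = {name: 100.0 * ok / n for name, (ok, n) in counts.items()}
    total_ok = sum(ok for ok, _ in counts.values())
    total_n = sum(n for _, n in counts.values())
    return per_class, 100.0 * total_ok / total_n


per_class, overall = summarize(reported)
for name, pct in per_class.items():
    print(f"{name}: {pct:.1f} %")
print(f"overall: {overall:.1f} %")
```

The overall rate corresponds to 30 correctly assigned wines out of 33.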
In this context, it is worth noting that this is the first study in which such a technique has been used to analyze several real wines, coming from different wineries and collected at different refinement stages (from cellar samples to commercial bottles) for discrimination purposes: indeed, previous works either compared differently aged samples issued from the same wine masses (Basalekou et al., 2017; Sanchez-Gomez et al., 2019) or wines from only one winery (Ferreiro-González et al., 2019). Certainly, further studies would be needed to validate the potential of the technique in order to improve this tool for wine authentication, enlarging both the array of wine types (e.g. testing different appellations and grape varieties) and the range of oak alternatives to be tested (including staves, cubes, blocks and other product sizes), possibly in the presence of micro-oxygenation (Rubio-Bretón et al., 2018; Sáenz-Navajas et al., 2018; del Alamo-Sanza et al., 2019).

CONCLUSIONS
Considering the data as a whole, it is possible to draw some preliminary conclusions on the relationship between the NIR-based spectroscopic and chemometric analysis and the aging process that some Italian wines underwent. Indeed, the present study was carried out in the frame of a project aimed at developing a suitable, easy-to-use and easily applicable NIR method for correlating the infrared "fingerprint" spectrum with the aging process in wines. This feasibility study was the first step in implementing a technique for discriminating wines aged with different wood types that can be progressively used in laboratories for routine fraud inspection. Although this approach can be further improved in the future, promising perspectives arise from a preliminary OPLS-DA application in classifying wines. In this regard, >96.5 % of correct answers in the internal validation check and >90 % in an external test check were obtained.
The significance of the study is still introductory because of the restricted data set (only one wine appellation was chosen for the feasibility study) and the sub-optimal IR spectral range analyzed: some IR zones close to mid-infrared, previously correlated to xylovolatiles and other aging-derived compounds (Garde-Cerdán et al., 2010;Ricci et al., 2016;Basalekou et al., 2017) were not measured due to instrument limitations. However, the promising results show the potential of IR spectroscopy and chemometric analysis for discriminating wines issued from different aging processes. This is consistent with previous research that, although performed on wines from a different geographic origin, also suggested similar conclusions (Basalekou et al., 2017;Ferreiro-González et al., 2019;Sanchez-Gomez et al., 2019).
In the future, there may be a need for widening the spectral range of infrared spectroscopy (through acquisition of a new optical model for the instrument) for acquiring the data on spectral zones related to xylovolatiles concentrations in wines (Garde-Cerdán et al., 2010;Basalekou et al., 2017), in order to implement a robust method discriminating wines treated or not with chips. Indeed, in the frame of the same project, GC-MS analyses were also performed on a subset of 30 wines analyzed in this work (Valpolicella wines, treated and untreated with chips, as described in Petrozziello et al., article submitted to OenoOne, same special issue), to investigate a possible improvement of the method based on coupling GC and NIR results.
In conclusion, this feasibility study showed that the described tool based on IR spectroscopy and chemometric analysis can be useful as a preliminary step for orienting institutional inspection activities in fraud control.