Stepwise linear discriminant analysis to differentiate Spanish red wines by their Protected Designation of Origin or category using physico-chemical parameters

Aim: The aim of this work was to determine the physico-chemical variables that differentiate red wines from the Spanish region of “Castilla y León” by their Protected Designation of Origin (PDO) and wine category (“young”, “oak”, “crianza”, or “reserve”).
Methods and results: A total of 135 commercial red wines from four Spanish PDOs in the region of Castilla y León were analyzed. Forty physico-chemical parameters, related to classical enological parameters, phenolic and polysaccharidic composition, and content of higher alcohols, were evaluated. Differences in physico-chemical composition were found in red wines from different PDOs and different categories. Stepwise linear discriminant analysis (SLDA) was applied to find a linear combination of the physico-chemical variables that separates and classifies the red wines according to PDO or category. One SLDA model selected 15 physico-chemical variables that allowed for good discrimination and classification of the wines from different PDOs. A second SLDA model selected seven variables for wine category differentiation, but only allowed for good discrimination between young wines and aged wines (“crianza” and “reserve”).
Conclusions: The variables that contributed most to the separation of Tempranillo red wines were total polyphenols, total tannins, and absorbance values at 230 nm and 280 nm. The polysaccharides with an average molecular weight of 10 kDa, flavanols, stilbenes and 2-methyl-1-butanol were those most associated with the differentiation of the wines made with the Mencía grape variety. The percentage of polymeric anthocyanins and absorbance at 230 nm could be proposed as good indicators for aged wines, and total tannins for young wines.
Significance and impact of the study: This study provides improved knowledge of the physico-chemical variables that could be used as markers of wine origin and/or grape variety (Tempranillo and Mencía) and that allow young wines to be differentiated from those aged for a long time.


INTRODUCTION
Wine is widely known as a complex matrix composed mainly of water and ethanol and, to a lesser extent, a large number of chemical compounds such as phenolic compounds, polysaccharides, non-fermentable sugars, organic acids, glycerol and volatile compounds. The content of these compounds has an important role in the quality of wines, and it can vary depending on the grape variety, the enological technology used in the winemaking process, and several environmental aspects such as soil, geographical location and weather conditions (Riu-Aumatell et al., 2002; Monagas et al., 2005a; Robinson et al., 2012; Serrano-Lourido et al., 2012).
Wine phenolic compounds play an important role in the sensory properties and have been proposed as chemical markers of the geographical origin of grapes (Makris et al., 2006). Anthocyanins are directly responsible for color in grapes and young wines (Glories, 1984). Monomeric anthocyanins are progressively transformed into more stable oligomeric and polymeric pigments during wine aging (Monagas et al., 2005b) through several chemical reactions that change the color of the wine. Other phenolic compounds, such as flavanols, flavonols, and hydroxycinnamic and hydroxybenzoic acid derivatives, can modify the sensory properties of red wines, such as astringency, bitterness, structure and color (Gawel, 1998; Schwarz et al., 2005; Hufnagel and Hofmann, 2008; Sáenz-Navajas et al., 2010).
Wine polysaccharides come from the grapes and from the yeasts involved in the fermentation process, and they can have a positive effect on the technological and sensory characteristics of wines. They can interact with phenolic compounds, reducing wine astringency and bitterness (del Barrio-Galán et al., 2011; González-Royo et al., 2013; del Barrio-Galán et al., 2015), and improving the mouthfeel (Vidal et al., 2004) and stabilization (Poncet-Legrand et al., 2007) of wines. However, some types of wine polysaccharides can also have negative effects on technological steps of winemaking, such as filtration (Belleville et al., 1991).
Other wine compounds, such as organic acids, ethanol, glycerol and higher alcohols, can make an important contribution to wine characteristics. The main organic acids present in red wines are tartaric acid and lactic acid, which are mainly responsible for acidity (Mato et al., 2005) and have an important role in the physico-chemical stabilization, balance and sensory perception of wines (Silva et al., 2015). The ethanol content depends on the sugar content of grapes and influences the sensory characteristics of wines, increasing bitterness and reducing astringency (Fontoin et al., 2008; Rinaldi et al., 2012). Glycerol is mainly produced by yeast during alcoholic fermentation (Remize et al., 2001) and can also influence several sensory properties of wines, such as sweetness and body (Noble and Bursick, 1984). Finally, higher alcohols are formed during fermentation and could have a positive or negative effect on wine sensory properties depending on their concentration (de la Fuente-Blanco et al., 2017).
The identification of a wine's geographical origin has an important commercial role in the wine industry (Urvieta et al., 2018). Origin is an important factor that consumers consider (Famularo et al., 2010), and many consumers are strongly oriented towards high-quality wines (Urvieta et al., 2018). Wine category (young or aged) also influences consumer choice; during oak barrel and bottle aging, the structure of the phenolic compounds changes, and the resulting changes in the physico-chemical and sensory properties of the wines can modify their quality.
Various studies have focused on differentiating wines by geographical region and country according to their physico-chemical and sensory parameters, using multivariate statistical tools (Pérez-Magariño and González-San José, 2001; Cliff et al., 2007; Riovanto et al., 2011; Serrano-Lourido et al., 2012). Other studies have focused on monitoring the aging time of red wines using different physico-chemical parameters (Agazzi et al., 2018; Astray et al., 2019). However, to our knowledge this is the first study that uses wines from several important Spanish Protected Designations of Origin (PDOs) that are very close geographically. Therefore, the aim of this work was to determine the physico-chemical parameters that differentiate the red wines from four PDOs in Castilla y León, and those PDO red wines by their category ("young", "oak", "crianza" and "reserve").

Wines
A total of 135 commercial red wines from four Spanish PDOs in the region of Castilla y León (northern Spain) were analyzed: 76 wines from Ribera del Duero (RD), 21 wines from Bierzo (BI), 22 wines from Toro (TO) and 16 wines from Cigales (CI). These wines were from different vintages (2003 to 2016).
The Regulatory Council of each PDO sets certain specifications. The wines from RD and TO must be made with at least 75 % Tempranillo grape variety; the wines from RD can be blended with other varieties, such as Cabernet-Sauvignon, red Grenache, Malbec, Merlot and Albillo, and those from TO can only be blended with the red Grenache grape variety. The wines from CI must be made with at least 50 % Tempranillo and/or red and gray Grenache and can then be blended with other authorized grape varieties (Cabernet-Sauvignon, Merlot and Syrah). Finally, the wines from BI must be made with at least 70 % Mencía grape variety and can be blended with Tintorera Grenache.
The wines from each PDO were classified into four categories: 1) "young": not aged in oak barrels; 2) "oak": aged in oak barrels for more than 3 months; 3) "crianza": a minimum aging period of 24 months, with at least 12 of these months in oak barrels; 4) "reserve": a minimum aging period of 36 months, with at least 12 of these months in oak barrels. The remaining aging time for "crianza" and "reserve" wines must take place in the bottle. Table 1 shows the number of wines in each category for each PDO.
Major volatile compound standards were supplied by Fluka, Sigma-Aldrich and Alfa Aesar (Lancashire, UK). Organic acid standards were supplied by Panreac (Madrid, Spain). Glucose, fructose and glycerol standards were supplied by Sigma-Aldrich.
The ethanol for high performance liquid chromatography (HPLC) analyses was provided by Panreac, and the 96 % ethanol was from Labkem (Spain). Acetonitrile and methanol for HPLC analyses were supplied by Carlo Erba (Sabadell, Spain) and the remaining reagents by Panreac. Milli-Q water was obtained through a Millipore system (Bedford, MA).

Analytical methods
Ethanol, total and free SO2, total acidity (TA) and pH were determined according to the official methods of OIV (2015).
The color intensity, tonality and percentage of blue tones (% blue) were evaluated as indicated in Glories (1984). The total polyphenols (TP) (expressed in mg/L of gallic acid) and total anthocyanins (expressed in mg/L of malvidin-3-glucoside) were analyzed according to the methods described in Pérez-Magariño et al. (2009). Total tannins (TT) (expressed in mg/L of catechin) were analyzed according to the methods in Mercurio et al. (2007), and polymeric anthocyanins (polymeric ACY) (expressed in percentage) according to Levengood and Boulton (2004). Absorbances at 230 nm and 280 nm (A230 and A280) were also measured because of their correlation with the phenolic content of the wines. These absorbances were measured with a quartz cuvette (1 cm path length) after 1:400 dilution of the sample with Milli-Q water. These physico-chemical parameters were all measured using a UV/Vis Agilent Cary 60 spectrophotometer (Santa Clara, California, USA).
Low molecular weight phenolic compounds were analyzed using high performance liquid chromatography coupled to a diode array detector (Agilent Technologies 1100 Series, HPLC-DAD system). The samples were directly injected following the chromatographic conditions described in Pérez-Magariño et al. (2008). The extraction and quantification of soluble polysaccharides (expressed in mg/L of dextrans) were carried out following the methodology described by Guadalupe et al. (2012). These compounds were analyzed using high performance size exclusion chromatography coupled to a refractive index detector (Agilent Technologies 1100 Series, HPSEC-RID system), using two Shodex columns in series (OHpak SB-803 HQ and OHpak SB-804 HQ; 300 mm × 8 mm i.d.; Showa Denko, Tokyo, Japan). Seven analytical standards of dextran from Leuconostoc mesenteroides were used for the molecular weight calibration. Dextran with a 270 kDa molecular weight and one pectin (esterified potassium salt from citrus fruit) were used as external standards for quantification. This methodology makes it possible to separate four polysaccharide fractions according to their molecular weight: PS1 (polysaccharides with a molecular weight average of 150 kDa); PS2 (polysaccharides with a molecular weight average of 60 kDa); PS3 (polysaccharides with a molecular weight average of 10 kDa); and PS4 (polysaccharides with a molecular weight average of 7 kDa).
Organic acids (tartaric, malic, lactic and acetic acids), glucose, fructose and glycerol were analyzed according to the methodology described in Monteiro-Coelho et al. (2018), using HPLC coupled to a DAD and a RID, with some modifications. Briefly, 2 mL of wine was filtered through a 0.45 μm PVDF filter and a volume of 20 μL was injected. The column was an AMINEX HPX-87H (300 × 7.8 mm) with internal particles of 9.0 μm (Bio Rad, California, USA). The flow rate applied was 0.6 mL/min using 4.0 mM H2SO4 as the mobile phase. The temperature of the column oven was maintained at 65 °C and the RID flow cell was kept at 30 °C. The quantification was carried out at 205 nm for organic acids and by RID for glucose, fructose and glycerol.
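As an illustration of how a molecular weight calibration with seven standards can be used, the following minimal sketch fits a straight line of log10(molecular weight) against retention time and then converts a retention time into an estimated molecular weight. The linear form, the retention times and the coefficients are assumptions chosen only for illustration; they are not the calibration data of this study, and the text does not state which calibration model was fitted.

```python
import numpy as np

# Hypothetical calibration line: log10(MW in kDa) = a + b * retention time (min).
# These coefficients and retention times are made up for illustration only.
a_true, b_true = 6.0, -0.25
std_times = np.linspace(14.0, 22.0, 7)      # seven standards (invented times)
std_log_mw = a_true + b_true * std_times    # noise-free values for the sketch

# Least-squares straight line through the standards (slope first, then intercept).
b_fit, a_fit = np.polyfit(std_times, std_log_mw, 1)


def estimate_mw_kda(retention_time):
    """Convert a retention time into a molecular weight using the fitted line."""
    return 10 ** (a_fit + b_fit * retention_time)


print(estimate_mw_kda(20.0))
```

With real standards the points would scatter around the line, and the fit would only be trusted within the retention-time range covered by the standards.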

Statistical analyses
Stepwise linear discriminant analysis (SLDA) was applied to find a linear combination of the variables that characterizes or separates two or more classes of objects. In this study, the forward method was used to select the variables most useful for differentiating the wines according to PDO or wine category. This procedure begins with no variables in the model and adds, at each step, the variable with the highest discriminant power. Variables were selected using the F statistic (minimum F value = 4). The prediction capacity of the SLDA discriminant models was evaluated by four-step cross-validation, excluding 25 % of the cases in each step. An analysis of variance (ANOVA) and a least significant difference (LSD) test at a significance level of p < 0.05 were performed on the physico-chemical variables selected by the SLDA models, in order to identify significant differences between the wines according to PDO and category.
The statistical analyses were carried out with standardized data, using the Statgraphics Centurion XVII statistical package.
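The forward selection and four-step cross-validation described above can be sketched as follows. This is a minimal illustration, not the Statgraphics procedure used in the study: it assumes that the F value of a candidate variable is the partial F derived from the change in Wilks' lambda, that the classifier uses a pooled covariance matrix with priors proportional to group size, and that the folds are formed by a random split. The data are synthetic; only the group sizes (76, 21, 22 and 16) are taken from the text, and all function names are hypothetical.

```python
import numpy as np


def sscp(Z):
    """Sums of squares and cross-products about the column means."""
    Zc = Z - Z.mean(axis=0)
    return Zc.T @ Zc


def wilks_lambda(X, y, cols):
    """Wilks' lambda = det(within SSCP) / det(total SSCP) for columns `cols`."""
    Z = X[:, cols]
    within = sum(sscp(Z[y == g]) for g in np.unique(y))
    return np.linalg.det(within) / np.linalg.det(sscp(Z))


def forward_select(X, y, f_enter=4.0):
    """Add, one at a time, the variable with the largest partial F >= f_enter."""
    n, n_vars = X.shape
    g = len(np.unique(y))
    selected, lam = [], 1.0
    while len(selected) < n_vars:
        p = len(selected)
        best = None  # (F, column, lambda) of the best candidate so far
        for j in (c for c in range(n_vars) if c not in selected):
            lam_new = wilks_lambda(X, y, selected + [j])
            ratio = lam_new / lam
            f = (n - g - p) / (g - 1) * (1.0 - ratio) / ratio
            if best is None or f > best[0]:
                best = (f, j, lam_new)
        if best[0] < f_enter:
            break
        selected.append(best[1])
        lam = best[2]
    return selected


def lda_fit_predict(X_train, y_train, X_test):
    """Linear discriminant scores with pooled covariance and size-based priors."""
    classes = np.unique(y_train)
    means = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    within = sum(sscp(X_train[y_train == c]) for c in classes)
    inv = np.linalg.inv(within / (len(X_train) - len(classes)))
    priors = np.array([np.mean(y_train == c) for c in classes])
    scores = (X_test @ inv @ means.T
              - 0.5 * np.sum(means @ inv * means, axis=1) + np.log(priors))
    return classes[np.argmax(scores, axis=1)]


def four_fold_accuracy(X, y, rng, n_folds=4):
    """Leave out about 25 % of the cases in each step and pool the hits."""
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    correct = 0
    for k in range(n_folds):
        train = np.concatenate([folds[i] for i in range(n_folds) if i != k])
        pred = lda_fit_predict(X[train], y[train], X[folds[k]])
        correct += np.sum(pred == y[folds[k]])
    return correct / len(y)


rng = np.random.default_rng(0)
sizes = [76, 21, 22, 16]                    # wines per PDO, from the text
y = np.repeat(np.arange(4), sizes)
X = rng.normal(size=(y.size, 6))            # synthetic values, not study data
X[:, 0] += 4.0 * np.isin(y, [1, 3])         # column 0 shifts classes 1 and 3
X[:, 1] += 4.0 * np.isin(y, [2, 3])         # column 1 shifts classes 2 and 3
X = (X - X.mean(axis=0)) / X.std(axis=0)    # standardize each variable

selected = forward_select(X, y)
accuracy = four_fold_accuracy(X[:, selected], y, rng)
print(selected, round(accuracy, 3))
```

In this sketch the variables are selected once on all cases and then cross-validated, which tends to give a somewhat optimistic accuracy; the text does not state whether the selection was repeated within each cross-validation step.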

RESULTS AND DISCUSSION
Forty physico-chemical variables were used in the SLDA to determine which ones allow the red wines to be separated according to their PDO and their category. The variables included were as follows: three color parameters (color intensity, tonality and percentage of blue tones); four phenolic groups (TP, TT, total anthocyanins, polymeric ACY); seven groups of low molecular weight phenolic compounds (hydroxybenzoic and hydroxycinnamic acids, tartaric esters of hydroxycinnamic acids, flavanols, flavonols and derivatives, phenolic alcohols and stilbenes); two […]
Abbreviations: TP, total polyphenols; TT, total tannins; PS3, polysaccharides with a molecular weight average of 10 kDa; 2-met-1-but, 2-methyl-1-butanol; 3-met-1-but, 3-methyl-1-butanol.
The best model obtained by the SLDA selected the 15 physico-chemical variables that most contributed to the differentiation of wines from different PDOs. The variables selected were as follows (from highest to lowest discriminant power according to the F statistic values): flavonols and derivatives, flavanols, PS3, ethyl acetate, lactic acid, stilbenes, acetic acid, 3-methyl-1-butanol, A230, total tannins, A280, total polyphenols, 1-propanol, diacetyl, and 2-methyl-1-butanol.
The high discriminant power of flavonols and flavanols is in agreement with the results of Makris et al. (2006), who reported these compounds as good indicators for differentiating red wines from Greece according to their geographical origin.
The SLDA selected three discriminant functions that explained the total variance. By using the three discriminant functions it was possible to separate the wines according to their PDO. Figures 1a and 1b show the scores and loadings represented in the plane of the first two discriminant functions, which explained 89.2 % of the total variance. As can be seen in Figure 1a, the first discriminant function, which explained 55.4 % of the total variance, allowed a separation between the wines from RD and BI (located on the left of the plane) and the wines from TO and CI (located on the right of the plane). Thus, the physico-chemical variables on the left of the plane were more associated with the separation of the wines from RD and BI, with A230 contributing most to this separation (Figure 1b). The ANOVA results show that the value of A230 was significantly higher in the wines from RD than in the others (Table 2). The variables on the right of the plane (A280, TP, TT, 3-methyl-1-butanol and acetic acid) were more associated with the separation of the wines from TO and CI. The A230 and A280 measurements have been commonly used for a rapid determination of the total phenolic content in wines (Kennedy et al., 2006; Boulet et al., 2016) due to their high correlation with these wine compounds. Nevertheless, the relationship between A230 and the contents of tannins, other polyphenols, or other wine compounds is not yet clear (Boulet et al., 2016).
The second discriminant function, which explained 33.8 % of the total variance, made it possible to distinguish the wines from BI from those of the other PDOs. The physico-chemical parameters that most contributed to the separation of these wines were PS3, TT, flavanols, 2-methyl-1-butanol, A230 and stilbenes. This separation could be associated with the grape variety used, because the red wines from BI were made with cv. Mencía and those from the other PDOs with cv. Tempranillo. The content of PS3 was significantly higher in the wines from BI than in the others, with the exception of wines from TO, where the PS3 content was similar. Conversely, the BI wines showed the highest content of flavanols and stilbenes. Only small differences were found in the other variables.
Figures 2a and 2b show the scores and loadings in the plane defined by the first and third discriminant functions, which together explained 66.2 % of the total variance. The third discriminant function explained 10.8 % of the total variance and allowed a good separation between the wines from TO and CI. The wines from CI were located at the top of the plane, and the physico-chemical variables that most contributed to their separation, from highest to lowest weight of their loadings, were A230, A280, 1-propanol, 2-methyl-1-butanol and TP.
The rest of the physico-chemical variables were most associated with the separation of the wines from TO, which were located at the bottom of the plane, and the variables with the highest loading weights were flavonols and their derivatives, ethyl acetate, diacetyl, flavanols and PS3.
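The percentages of variance quoted for the three PDO discriminant functions can be cross-checked with simple arithmetic. The short sketch below uses only the values reported in the text (55.4 %, 33.8 % and 10.8 %) and confirms that they add up to the totals given for each plane and to the total variance; the variable names are hypothetical.

```python
# Percent of total variance explained by each PDO discriminant function,
# as reported in the text.
explained = {1: 55.4, 2: 33.8, 3: 10.8}

plane_1_2 = round(explained[1] + explained[2], 1)   # plane of Figure 1
plane_1_3 = round(explained[1] + explained[3], 1)   # plane of Figure 2
total = round(sum(explained.values()), 1)           # all three functions

print(plane_1_2, plane_1_3, total)  # 89.2 66.2 100.0
```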
The classification matrix of the model indicated that, in total, 97.8 % of the wines studied were correctly classified using the variables selected by the SLDA model: 98.7 % of the wines from RD, 95.2 % from BI, 95.5 % from TO and 100 % from CI. The misclassification of some wines might be due to the geographical proximity between PDOs, as described in a study by Pérez-Magariño and González-San José (2001) with wines from different Spanish PDOs. The validation of the model using four cross-validation steps showed that 91.7 % of the wines were correctly classified. Therefore, the model can be considered good and stable.
The results obtained in our study are in agreement with other studies carried out with wines from different regions, some of which were geographically close and others not, with different environmental conditions and different grape varieties. Phenolic composition is the most useful for characterizing and discriminating wines from different regions. However, the phenolic compounds selected for discriminating the wines were different in each study. Thus, Buscema and Boulton (2015) reported good differentiation in Malbec wines from four Mendoza regions, selecting different phenolic compounds that included anthocyanins and non-anthocyanin compounds. Another study carried out by Rodríguez-Delgado et al. (2002) showed good differentiation of wines from different production areas in the Canary Islands (Spain) using five non-anthocyanin phenolic compounds. Peña-Neira et al. (2000) discriminated wines from four Spanish PDOs (La Mancha, Valdepeñas, Rioja and Cariñena) using their phenolic composition, mainly related to some hydroxybenzoic (gallic acid) and hydroxycinnamic (caffeic acid) acids and phenolic alcohols (tyrosol). Di Paola-Naranjo et al. (2011) obtained good classifications by variety and origin of wines from three provinces of Argentina, using the phenolic profile and other compounds. They concluded that trans-resveratrol was the phenolic compound that discriminated between these three regions. Another study showed the important role of phenolic compounds (anthocyanins and non-anthocyanins) in the classification of wines from three Spanish PDOs (Penedes, Rioja and Ribera del Duero) (Serrano-Lourido et al., 2012). However, the authors concluded that the discrimination of wines from Rioja and Ribera del Duero was more difficult, probably because these production areas are geographically quite close. In another study carried out by Pérez-Magariño and González-San José (2001), the wines from five different Spanish PDOs (Ribera del Duero, Rioja, Valdepeñas, La Mancha and Madrid) were correctly differentiated, with anthocyanic pigments as the most discriminant variables.
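The overall classification rate can be related to the per-PDO rates with a short calculation. The sketch below combines the group sizes given in the Wines section with the reported percentages; it assumes that each percentage is a rounded ratio of correctly classified wines to group size, so the reconstructed counts are an inference and not values reported in the study.

```python
# Group sizes from the Wines section and reported classification rates (%).
n_wines = {"RD": 76, "BI": 21, "TO": 22, "CI": 16}
pct_correct = {"RD": 98.7, "BI": 95.2, "TO": 95.5, "CI": 100.0}

# Assumption: each rate is (correctly classified wines / group size), rounded.
n_correct = {pdo: round(n_wines[pdo] * pct_correct[pdo] / 100) for pdo in n_wines}

overall = 100 * sum(n_correct.values()) / sum(n_wines.values())
print(n_correct, round(overall, 1))
```

Under this assumption, one wine is misclassified in each of RD, BI and TO, and the weighted overall rate reproduces the reported 97.8 %.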
As mentioned above, the content of low molecular weight polysaccharides (PS3) also contributed to the separation of the wines according to the PDO. Cejudo-Bastante et al. (2018) also showed an effect of the polysaccharide composition in Carignan red wines from six different areas in Chile: they found that polysaccharide fractions with a high molecular weight had a greater influence on differentiating the wines from these areas than polysaccharide fractions with a low molecular weight.
In our study, several higher alcohols contributed to the separation of the wines by PDO. Sagratini et al. (2012) found that 3-methyl-1-butanol (an isoamyl alcohol) was one of the most discriminant compounds (together with ethyl decanoate and ethyl octanoate) for separating Montepulciano red wines from two different Italian regions.
Another SLDA was carried out to discriminate the wines according to their category, using the same 40 physico-chemical variables. In this case, the best model selected the seven physico-chemical variables that most contributed to differentiating the wines according to category. The variables selected were as follows, from highest to lowest discriminant power: polymeric ACY, glycerol, flavanols, flavonols and their derivatives, A230, TP and TT. The final model selected three discriminant functions that explained 83.1 % of the total variance: the first discriminant function explained 40.7 %, the second 31.6 %, and the third 10.8 %.
Figures 3a and 3b show the scores and loadings in the plane of the first two discriminant functions. A good separation was observed only between "young" and "reserve" wines, mainly associated with the second discriminant function. In general, the "reserve" wines were located at the top of the plane and the "young" wines at the bottom.
According to the weight of the physico-chemical variables in this discriminant function, polymeric ACY, A230 and flavanols were positively correlated with the separation of "reserve" wines.
Table 3 shows that the percentage of polymeric ACY increased with aging of the wines: the longer the aging of the wines, the higher the content of polymeric ACY. This result was expected and is in agreement with other studies reported in the literature (Burin et al., 2011; Chira et al., 2011; del Barrio-Galán et al., 2011; McRae et al., 2012; Dipalmo et al., 2016; Agazzi et al., 2018). These compounds are mainly formed during the aging of wines (in oak barrel and/or bottle) through chemical reactions between the monomeric anthocyanins and other phenolic compounds and metabolites (Monagas et al., 2005b; De Rosso et al., 2009), and they have an important role in the long-term color stability of aged red wines (Boulton, 2001). The value of A230 was significantly higher in the aged wines than in the "young" wines, and could be proposed as an indicator of wines that have experienced some type of aging.
Conversely, the variables at the bottom of the plane (TT, glycerol, flavonols and their derivatives, and TP) allowed the separation of "young" wines, with TT and flavonols and their derivatives being the most significant variables. The "young" wines had a significantly lower content of TT and TP than the aged wines, possibly because, in general, the winemakers selected wines with a high phenolic content to be aged for a longer time. However, the "young" and the "oak" wines presented a higher content of flavanols and of flavonols and their derivatives than the wines aged for a longer time.
The classification matrix reported by the SLDA model correctly classified 80 % of the wines according to their category. As previously presented, the best separation was found between "reserve" and "young" wines, as these wines showed the best classification percentages (90 % for "reserve" and 87.5 % for "young"). Next, 80.6 % of "crianza" wines were classified correctly. Finally, "oak" wines presented the lowest classification percentage at 64.9 %, as these wines were confused with "young" or "crianza" wines. This result was expected because the aging requirements for "oak" wines are less stringent than those for "crianza" and "reserve" wines, which must comply with a minimum aging time.
According to the cross-validation, the percentage of wines classified correctly was lower but acceptable for "reserve" and "young" wines (83 % and 81 %, respectively). However, the results for "crianza" and "oak" wines were poorer (70.5 % and 55 %, respectively).
Various studies have evaluated the discriminant power of several physico-chemical variables between young and aged wines. Agazzi et al. (2018) discriminated Malbec red wines from Mendoza and California at the beginning of aging and after five years. They observed that aging time significantly affects the content of total polyphenols, total monomeric anthocyanins, flavanols, flavonols and hydroxycinnamic acids. The aged wines presented higher concentrations of p-coumaric acid and lower concentrations of monomeric anthocyanins than young wines. Dipalmo et al. (2016) observed that red wines from the Primitivo grape variety aged for two years showed an increase in polymeric anthocyanins (pyranoanthocyanins). Finally, Astray et al. (2019) determined the aging time of red wines from PDO Toro using classical enological parameters and total polyphenol index.

CONCLUSIONS
Differences in physico-chemical composition were found in red wines from different Spanish PDOs and different categories. Phenolic composition and absorbance values associated with total phenolic content allowed good discrimination and classification of wines from different PDOs.
It was possible to discriminate between the wines mainly made with the Tempranillo grape variety (RD, TO and CI) and those made with the Mencía grape variety (BI); the absorbance values and phenolic groups such as TP and TT were the most important variables for this differentiation.
Other variables, such as polysaccharides with low molecular weight, isoamyl alcohols and low molecular weight phenolic compounds (mainly flavanols and flavonols), were relatively important in this differentiation and were mainly associated with the wines made with the Mencía grape variety.
Although there are clear differences between the wines made with different grape varieties, the winemaking techniques used in each region, and even in each winery, could also have an effect.
The discrimination of wines according to their category was more difficult, and only an acceptable classification of "young" wines and those with longer aging periods ("crianza" and "reserve") was achieved. The percentage of polymeric ACY and A230 could be proposed as good indicators for aged wines, and TT for young wines.
Further studies should analyze other physico-chemical variables, such as volatile compounds, as well as sensory attributes, that could differentiate the red wines by their origin and/or category.

FIGURE 1. Plot of scores (a) and loadings (b) for PDO wine differentiation using discriminant functions 1 and 2.

FIGURE 2. Plot of scores (a) and loadings (b) for PDO wine differentiation using discriminant functions 1 and 3.

TABLE 2. Mean values ± standard deviation of the variables selected by the SLDA for wine PDO discrimination.
a-c Superscript letters for each compound or parameter indicate statistically significant differences at p < 0.05. * PS3: polysaccharides with a molecular weight average of 10 kDa.

TABLE 3. Mean values ± standard deviation of the variables selected by the SLDA for wine category discrimination.