Development of UV-vis and FTIR Partial Least Squares models: comparison and combination of two spectroscopy techniques with chemometrics for polyphenols quantification in red wine

Polyphenolic compounds are considered to have a major impact on the quality of red wines. Sensory perceptions, such as astringency and bitterness, are mainly related to condensed tannins, while colour intensity and evolution are due to anthocyanin composition. Therefore, the rapid analytical measurement of phenolic compounds is a real challenge for wine monitoring. Fourier transform infrared (FTIR) and ultraviolet-visible (UV-vis) spectroscopy with chemometrics are good candidates for predicting polyphenolic contents in wines, but they have not yet been compared in terms of the efficiency of each spectral area, nor has the possibility of combining the two areas been investigated. This work sought to determine the tannin and anthocyanin content of ninety-two wines. The wine selection covered different vintages, varieties and regions. Tannin concentration was analysed by precipitation with protein and polysaccharide and by the Bate-Smith assay. Free anthocyanin concentration was analysed by bisulfite bleaching and the monomers/polymers ratio was analysed using the Adams-Harbertson method. Molecular anthocyanin concentration was also obtained by HPLC/UV-vis. Two spectra were collected using UV-vis and FTIR devices. The data collected were statistically analysed using the partial least squares (PLS) regression method. The correlations obtained were relevant for both of the spectral areas studied, with a coefficient of determination for cross-validation larger than 0.7 for most parameters studied. While the two spectroscopic methods gave almost identical results, FTIR indicated higher robustness for the prediction of tannin concentration. Conversely, UV-vis appeared to be more relevant when determining anthocyanin concentration and evolution. Finally, the models obtained when combining the two spectral areas gave slightly better results. When a selection of different visible wavelengths was added to the FTIR spectrum, the prediction of anthocyanin parameters improved considerably, thus highlighting the importance of the visible area when estimating these compounds.


INTRODUCTION
Polyphenolic compounds are present in high concentrations in red wine. In particular, the flavonoid family can significantly impact the quality of wines, as well as their aging (Cheynier et al., 2006). Sensory perceptions, such as astringency and bitterness, are mainly related to tannin concentration and composition (Noble, 1998; Vidal et al., 2003), while colour intensity and evolution are due to anthocyanin composition (Mazza and Francis, 1995; Somers, 1971). During aging, tannin and anthocyanin molecules evolve; they degrade and polymerise, thus impacting the organoleptic properties of wine. In addition, a significant supply of oxygen can impact these different parameters (Iacobucci and Sweeny, 1983; Petrozziello et al., 2018). Measuring these molecules in wines is therefore important.
Several methods have been developed to analyse polyphenols in wines; however, most of these methods require time, laboratory equipment and knowledge, making them unsuitable for a quick analysis. In order to facilitate the analysis of these compounds, new methods can be used, such as spectroscopy analysis coupled with chemometrics. While the information found in different spectral areas can be useful for measuring polyphenols, it can sometimes be difficult to extract: wine is a complex matrix in which many compounds absorb at the same wavelengths as polyphenols, so concentrations cannot be read directly from the spectrum. Therefore, it is necessary to couple chemometric analysis with spectral analysis in the design of prediction models (Cozzolino et al., 2011a). Different statistical analyses have given conclusive results, but Partial Least Squares (PLS) regression appears to be the most effective in producing robust prediction models (Haaland and Thomas, 1988; Wold et al., 2001).
Several spectral areas coupled with chemometrics have already demonstrated their potential for the analysis of polyphenols; ultraviolet-visible (UV-vis) appears to be one such area that can be used for analysing fermenting wines and finished wines. Several studies have been conducted in order to predict polyphenolic concentration in these samples. A first study conducted in 2007 highlighted the potential of UV-vis to predict anthocyanins, polymeric pigments and tannins on a large dataset of fermenting samples (Skogerson et al., 2007). Another study enlarged the spectral reading to near infrared when predicting the concentration of catechin, epicatechin and malvidin-3-O-glucoside in thirty-nine wines from Spain (Martelo-Vidal and Vázquez, 2014). The results showed a lack of precision, but they demonstrated the ability of the technique to predict specific concentrations. Two different studies have reported good predictions for precipitable tannins. The first, conducted by Dambergs et al. (2012), focused on methylcellulose precipitation with the UV area and demonstrated that the calibration could be transferred to another laboratory. The other, conducted by Aleixandre-Tudo et al. (2015), used methylcellulose and bovine serum albumin precipitation to investigate the differences in ability to predict different parameters in fermenting wine samples. A more recent study conducted on a large panel of fermenting samples and wines from South Africa has demonstrated that UV-vis alongside chemometrics can be considered as a suitable method for predicting specific polyphenolic compounds (Aleixandre-Tudo et al., 2018a). Because of the ability of UV-vis to detect molecules with carbon-carbon double bonds and pi bonds, this technique is useful for the detection of polyphenols and avoids the absorbance of the most predominant compounds of the wine matrix.
Another spectroscopic technique which gives predictive results is Fourier transform infrared (FTIR). The absorption bands contain much more information than those of UV-vis, and the advantage is that several instruments already use this technology to measure various compounds in wine, such as sugar, alcohol, or different acids (Bauer et al., 2008; Moreira and Santos, 2005; Pizarro et al., 2011). The development of infrared prediction models can therefore be directly applicable to these instruments, allowing the information provided on the wines to be enriched. Cozzolino et al. (2004) demonstrated the ability of infrared with chemometrics to predict the concentration of several phenolic compounds (such as malvidin-3-O-glucoside, pigmented polymers and tannins) during wine fermentation. Anthocyanin concentration in wines has also been investigated in order to distinguish the different monomeric anthocyanins (Soriano et al., 2007); the results showed a good correlation for the anthocyanin level, but a lack of precision for the less concentrated monomers. Another study conducted by Di Egidio et al. (2010) showed a robust prediction for total phenolics, total flavonoids and total anthocyanins for fermentation samples, using near-infrared and mid-infrared techniques. However, the use of 280 nm and 540 nm absorbance readings without pretreatment to calculate the concentration of flavonoids and anthocyanins respectively is a matter for discussion. More recently, Aleixandre-Tudo et al. (2018b) investigated the difference between Fourier transform infrared (FTIR), Fourier transform near infrared (FT-NIR) and attenuated total reflectance mid infrared (ATR-MIR) to analyse the phenolic composition of fermenting samples and red wine. The results obtained were comparable and predictive for each technique, and showed that FT-NIR appears to be the most accurate technique.
The UV-vis and infrared areas give efficient results for polyphenol prediction in red wine, but the comparison of these two techniques has not yet been investigated.
The aim of this study was to develop new tools for the wine industry to measure polyphenols in finished wines. To do this, we looked at two different spectral zones: UV-vis (200-700 nm) and FTIR (925-5011 cm-1). In order to obtain the most robust results possible, the two absorption areas were compared to determine their effectiveness in the design of a prediction model for measuring different polyphenolic parameters in wine. In addition, the complementarity of these two methods was studied in order to investigate the possibility of obtaining even more robust results by combining the two spectral zones. Because the wine matrix can be impacted by many parameters (Geladi, 2003), this study focused on ninety-two wines from the widest possible range of grape varieties and vintages, as well as different geographical regions (which impact the method of winemaking). In order to measure polyphenols in these samples, different reference methods were applied. The tannins were quantified according to the Bate-Smith method and by precipitation with methylcellulose and bovine serum albumin. Anthocyanins were quantified via bisulfite bleaching and by high-performance liquid chromatography (HPLC/UV-vis). In order to develop prediction models for the parameters studied, partial least squares (PLS) regression was applied to find correlations between the different spectra obtained and the reference analysis. The robustness of the resulting models was investigated via cross-validation.

Spectra measurement
All measurements were carried out in triplicate.

FTIR measurement
Samples were scanned on a Winescan Flex (FOSS, Hillerød, Denmark) at a 3.858 cm-1 interval over the wavenumber range 925-5011 cm-1, with water as the reference blank. Spectra were registered in transmittance and converted into absorbance values.
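The conversion from transmittance to absorbance mentioned above follows the usual relation A = -log10(T). The sketch below is illustrative only: it assumes transmittance is expressed as a fraction between 0 and 1 (percent transmittance would first be divided by 100), and the function name and sample values are hypothetical rather than taken from the instrument software.

```python
import math

def transmittance_to_absorbance(transmittance):
    """Convert fractional transmittance values (0 < T <= 1) to absorbance, A = -log10(T)."""
    return [-math.log10(t) for t in transmittance]

# Hypothetical transmittance readings at three wavenumbers
example_T = [1.0, 0.5, 0.1]
example_A = transmittance_to_absorbance(example_T)
```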

UV-vis measurement
Samples were scanned after a 1/100 water dilution adjusted to pH 3.3 with hydrochloric acid on a Jasco V-630 UV-VIS Spectrophotometer (JASCO, Japan) at a 1 nm interval over the wavelength range 200-700 nm, with water as the reference blank and a 10 mm path length in a quartz cuvette. The spectra were registered in absorbance.

Quantification of free anthocyanins by bisulphite bleaching
The concentration of free anthocyanins in the different samples was estimated using a method based on the ability of bisulphite to bleach these compounds (Ribéreau-Gayon and Stonestreet, 1965). Two tubes were prepared, one containing 1 mL of wine solution (250 μL of wine, 250 μL of ethanol with 0.1 % hydrochloric acid v/v, and 5 mL of water with 2 % hydrochloric acid v/v) and 400 μL of water, and another containing 1 mL of wine solution and 400 μL of bisulphite solution (15 % bisulphite v/v in water). After 20 min, the difference in absorbance at 520 nm (10 mm path length with water as a blank) between the two tubes was recorded, and the free anthocyanin concentration, expressed as malvidin-3-O-glucoside equivalent, was determined by reference to a calibration curve established by Ribéreau-Gayon and Stonestreet.

Determination of the ratio of monomeric to polymeric pigments and tannins concentration based on the Adams-Harbertson assay
The method used is based on anthocyanin metabisulphite bleaching and on the ability of polymeric pigments and tannins to precipitate with protein (Harbertson et al., 2003). 125 μL of wine was diluted in 375 μL of a model wine buffer containing 12 % ethanol v/v and 5 g/L tartaric acid, adjusted to pH 3.3 with sodium hydroxide. After several tests, this dilution was chosen to obtain an absorbance below 1.2 during the whole analytical process. In a first 1.5-mL microfuge tube, 500 μL of diluted wine was mixed with 1 mL of acetic acid-NaCl buffer (200 mM acetic acid and 170 mM NaCl, adjusted to pH 4.9 with sodium hydroxide). The absorbance at 520 nm (10 mm path length with water as a blank) of 1 mL of the mixture was read (the A value), then 80 μL of a 0.36 M sodium metabisulphite solution was added. After 10 min, the absorbance at 520 nm was read again (the B value). In a second microfuge tube, 500 μL of diluted wine was mixed with 1 mL of acetic acid-NaCl buffer containing bovine serum albumin at 1 g/L. After 20 min, the tube was centrifuged for 5 min at 13,500 g. One mL of the supernatant was mixed with 80 μL of a 0.36 M sodium metabisulphite solution. After 10 min, the absorbance at 520 nm was read (the C value).
The absorbance due to monomeric pigments (MP) is calculated as the difference A - B, the absorbance due to small polymeric pigments (SPP) is C, and the absorbance due to large polymeric pigments (LPP) is calculated as the difference B - C. Total polymeric pigments (PP), which is the sum of the small polymeric pigments and the large polymeric pigments, was added as another parameter.
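The pigment calculations described above can be sketched as follows. The function name, argument names (a, b and c for the A, B and C readings) and the example absorbances are hypothetical and serve only to illustrate the arithmetic.

```python
def pigment_fractions(a, b, c):
    """Split the 520 nm absorbance into pigment classes.

    a: reading before bleaching (A value)
    b: reading after metabisulphite bleaching (B value)
    c: reading of the bleached supernatant after BSA precipitation (C value)
    """
    mp = a - b        # monomeric pigments
    spp = c           # small polymeric pigments
    lpp = b - c       # large polymeric pigments
    pp = spp + lpp    # total polymeric pigments
    return {"MP": mp, "SPP": spp, "LPP": lpp, "PP": pp}

# Hypothetical absorbance readings for one wine
example = pigment_fractions(a=0.80, b=0.35, c=0.20)
```

By construction, the total polymeric pigment value equals the B reading, since SPP + LPP = C + (B - C).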
In the second microfuge tube, the remaining liquid was discarded and the pellet was washed with 250 μL of acetic acid-NaCl buffer. The tube was centrifuged for 1 min at 13,500 g and the supernatant was discarded. The pellet was dissolved in 875 μL of a buffer containing 5 % w/v triethylamine and 5 % w/v sodium dodecyl sulfate adjusted to pH 9.4 with sodium hydroxide. The background absorbance was measured at 510 nm (10 mm path length with water as a blank), and 125 μL of a ferric chloride solution was added (10 mM ferric chloride and 10 mM hydrochloric acid in water). After 20 min at room temperature, the reaction absorbance was measured at 510 nm. Tannin absorbance is calculated as Δ (Background-Reaction/0.875) and reported on a catechin calibration curve to be expressed in catechin equivalent.

Determination of tannins concentration by methylcellulose precipitation
The method used is based on the ability of tannins to precipitate with a polysaccharide such as methylcellulose (Sarneckis et al., 2006). Tannins react with methylcellulose to form an insoluble tannin-polymer complex, which is isolated by centrifugation. In a first 5 mL microfuge tube, 50 μL of wine was mixed with 800 μL of a saturated ammonium sulfate solution, 1950 μL of water and 1200 μL of a methylcellulose solution (0.04 % w/v in water, viscosity: 1,500 cP, 2 % in water at 20 °C). In a second microfuge tube, 50 μL of wine was mixed with 800 μL of a saturated ammonium sulfate solution and 3150 μL of water. After homogenisation and 10 min at room temperature, both tubes were centrifuged at 10,000 g for 5 min. The absorbance of the supernatants was measured at 280 nm (10 mm path length with water as a blank in a quartz cuvette). Tannin absorbance was calculated as the difference between the two tubes and reported on an epicatechin calibration curve to be expressed in epicatechin equivalent.

Total tannins by Bate-Smith assay
This method was developed by Bate-Smith and is based on the transformation of proanthocyanidins into anthocyanidins by heating in an acid environment (Ribéreau-Gayon and Stonestreet, 1966). In each of two closed hydrolysis tubes, 4 mL of wine diluted 50 times, 2 mL of water and 6 mL of 37 % hydrochloric acid were added. The first tube was placed in an ice bath at 0 °C for 30 min. The second tube was placed in a water bath at 100 °C for 30 min. Total tannin absorbance was measured at 550 nm (10 mm path length with water as a blank) as the difference between the two tubes and reported on a calibration curve established by Ribéreau-Gayon and Stonestreet, using the following formula: C (mg/L) = 19,330 × Δabsorbance.
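As an illustration of the formula above, the sketch below computes the total tannin concentration from the two absorbance readings. It assumes that 19,330 is to be read as nineteen thousand three hundred and thirty, consistent with the thousands separators used elsewhere in the text; the function name and the example readings are hypothetical.

```python
BATE_SMITH_FACTOR = 19330  # mg/L per absorbance unit, read from the formula in the text (assumption)

def total_tannins_mg_per_l(abs_heated, abs_ice):
    """Total tannins (mg/L) from the 550 nm readings of the heated and ice-bath tubes."""
    delta_absorbance = abs_heated - abs_ice
    return BATE_SMITH_FACTOR * delta_absorbance

# Hypothetical readings: heated tube 0.250, ice-bath tube 0.050
example_concentration = total_tannins_mg_per_l(0.250, 0.050)
```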

Principal component analysis
Principal component analysis (PCA) was performed on reference data, FTIR spectra and UV-vis spectra before the construction of prediction models, using RStudio with the FactoMineR and Factoshiny packages. This preliminary analysis of the data allows variations in the dataset, correlations between individuals or variables, and outliers to be identified.

Partial least squares regression
Partial least squares (PLS) regression was performed using Matlab 2017 (MathWorks, Natick, MA, USA) coupled with PLS_Toolbox by Eigenvector (Manson, WA, USA). Before PLS regression was carried out, autoscale preprocessing was applied to the dataset. The automatic variable selection (VIP or sRatio) proposed by PLS_Toolbox was used to refine the spectral wavelengths selected to build the PLS regression. After this pretreatment, calibration models were developed using PLS regression with leave-p-out cross-validation. For each model created, there were 10 cross-validation subgroups. The cross-validation result is a good indicator of the model's robustness and of its ability to predict values for external samples.
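To make the workflow above more concrete, the following is a minimal, dependency-free sketch of autoscaling, PLS regression for a single response (the NIPALS PLS1 algorithm) and cross-validation with 10 subgroups. It is not the PLS_Toolbox implementation: the interleaved assignment of samples to subgroups, the function names and the synthetic data are assumptions made for illustration, and the VIP or sRatio variable selection step is not included.

```python
import statistics

def autoscale_fit(rows):
    """Column means and standard deviations for autoscaling (mean-centre, unit variance)."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) or 1.0 for c in cols]  # constant columns are left unscaled
    return means, sds

def autoscale_apply(rows, means, sds):
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def pls1_fit(X, y, n_components):
    """NIPALS PLS1 on autoscaled X and mean-centred y; one (w, p, q) per latent variable."""
    X = [row[:] for row in X]
    y = list(y)
    n_vars = len(X[0])
    components = []
    for _ in range(n_components):
        w = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(n_vars)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]                                   # weights
        t = [sum(xj * wj for xj, wj in zip(row, w)) for row in X]   # scores
        tt = sum(v * v for v in t)
        p = [sum(row[j] * ti for row, ti in zip(X, t)) / tt for j in range(n_vars)]  # loadings
        q = sum(yi * ti for yi, ti in zip(y, t)) / tt
        X = [[xj - ti * pj for xj, pj in zip(row, p)] for row, ti in zip(X, t)]  # deflate X
        y = [yi - q * ti for yi, ti in zip(y, t)]                                # deflate y
        components.append((w, p, q))
    return components

def pls1_predict(x, components):
    """Predict the centred response for one autoscaled sample."""
    x = list(x)
    y_hat = 0.0
    for w, p, q in components:
        t = sum(xj * wj for xj, wj in zip(x, w))
        y_hat += q * t
        x = [xj - t * pj for xj, pj in zip(x, p)]
    return y_hat

def cross_validated_predictions(X, y, n_components, n_splits=10):
    """Predict each sample from a model built without its subgroup (interleaved folds)."""
    predictions = [None] * len(y)
    for fold in range(n_splits):
        train = [i for i in range(len(y)) if i % n_splits != fold]
        test = [i for i in range(len(y)) if i % n_splits == fold]
        means, sds = autoscale_fit([X[i] for i in train])
        y_mean = statistics.mean([y[i] for i in train])
        model = pls1_fit(autoscale_apply([X[i] for i in train], means, sds),
                         [y[i] - y_mean for i in train], n_components)
        for i in test:
            x_scaled = autoscale_apply([X[i]], means, sds)[0]
            predictions[i] = y_mean + pls1_predict(x_scaled, model)
    return predictions

# Synthetic, noise-free data (hypothetical): two variables, exactly linear response
X_demo = [[float(i), float((i * i) % 7)] for i in range(20)]
y_demo = [2.0 * a - b + 3.0 for a, b in X_demo]
cv_pred = cross_validated_predictions(X_demo, y_demo, n_components=2)
```

Scaling parameters are recomputed inside each subgroup so that the left-out samples do not influence the model used to predict them. With real spectra, the number of latent variables would be chosen by comparing the cross-validation error across candidate values.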

Dataset analysis
In order to fully understand the results obtained by the prediction models (such as their error or their robustness) it is necessary to be familiar with the dataset that the design of these models was based on. Therefore, the first step in this work was to analyse the dataset obtained by the analysis of wine samples.
First, the UV-vis and FTIR spectral areas were redefined to match the absorbance limit of the instruments used, as well as the response of polyphenols in these spectral zones. Thus, it was decided that for UV-vis spectra, absorbances greater than 1.2, and therefore above the spectrophotometer detection limit, would not be considered. The spectrum retained ranged from 250 to 700 nm. While some references demonstrated areas of interest for polyphenols below 250 nm, it was decided not to dilute the sample further so as to keep the visible part of the response clear enough for the analysis of the concentration of both anthocyanins and tannins via a single sample reading. Regarding FTIR, Jensen et al. (2008) demonstrated that the spectral response of wine polyphenols lies between 3000 cm-1 and 925 cm-1. Only this area of the infrared spectrum was kept.
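Restricting a spectrum to the retained range amounts to a simple filter on the spectral axis. The sketch below is illustrative only; the function name and the placeholder absorbance values are hypothetical.

```python
def trim_spectrum(axis, values, lower, upper):
    """Keep only the points whose axis value (nm or cm-1) lies within [lower, upper]."""
    kept = [(x, v) for x, v in zip(axis, values) if lower <= x <= upper]
    return [x for x, _ in kept], [v for _, v in kept]

# UV-vis example: readings every 1 nm from 200 to 700 nm, trimmed to 250-700 nm
wavelengths = list(range(200, 701))
absorbances = [0.0] * len(wavelengths)  # placeholder values
kept_wavelengths, kept_absorbances = trim_spectrum(wavelengths, absorbances, 250, 700)
```

The same function applies to the FTIR spectra with limits of 925 and 3000 cm-1.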
The reference data obtained were investigated. Mean, minimum (min), maximum (max), standard deviation (SD), coefficient of variation (CV) and standard deviation of handling (SD handling) were calculated and reported in Table 1. Total glucoside was calculated as the sum of monoglucoside anthocyanins, total acetyl as the sum of monoglucoside acetyl anthocyanins, total coumaroyl as the sum of monoglucoside coumaroyl anthocyanins and total anthocyanins as the sum of all anthocyanins quantified.
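The descriptive statistics listed above can be computed as in the following sketch, which assumes the sample standard deviation (n - 1 denominator) and a coefficient of variation expressed as a percentage of the mean. The function name and the example values are hypothetical, and the standard deviation of handling is not covered here.

```python
import statistics

def describe(values):
    """Mean, min, max, sample SD and coefficient of variation (%) of one reference parameter."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return {
        "mean": mean,
        "min": min(values),
        "max": max(values),
        "sd": sd,
        "cv_percent": 100 * sd / mean,
    }

# Hypothetical concentrations (mg/L) for three wines
summary = describe([2.0, 4.0, 6.0])
```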
These data show a very low content of most molecular anthocyanins. Apart from Mlv-3-O-glc accounting for about 50 % of total anthocyanins content, all other anthocyanins have an average value of less than 10 mg/L. Considering the difficulty in predicting very low concentrations using FTIR or UV-vis, it was decided not to consider each molecular anthocyanin, but rather the total sum of these anthocyanins. For deeper analysis, the sum of anthocyanin monoglucoside, monoglucoside acetyl and monoglucoside coumaroyl was also considered. Some grape varieties, such as pinot noir, show deficiencies in acetylated anthocyanins, which is why it may be interesting to investigate these parameters (Dimitrovska et al., 2011).
Regarding coefficients of variation for condensed tannins, a noticeable variation was observed. Total tannins measured using the Bate-Smith method showed the lowest coefficient of variation (36 %). This can be explained by the fact that, in this method, condensed tannins were measured independently of their reactivity. On the other hand, tannins measured via the bovine serum albumin assay and the methylcellulose assay show a higher coefficient of variation (49 % and 40 % respectively). While the methylcellulose assay showed a variation similar to that of the Bate-Smith assay, the bovine serum albumin measurement showed a much greater variation. This could be explained by the very particular reactivity of tannins with proteins, and the fact that astringency is better represented by the concentration of tannin (Boulet et al., 2016).
It is possible to apply the same reasoning to three close parameters which measure anthocyanins: total anthocyanins, free anthocyanins and monomeric pigments via the Adams-Harbertson assay. The variation observed here, however, is much larger, with 112 % for total anthocyanins, 76 % for free anthocyanins and 43 % for monomeric pigments. Although these parameters provide similar information, they can be considered as independent from each other, each providing interpretable data.
Overall, the error due to manipulation was low compared to the data analyses. The highest error value was recorded for the methylcellulose assay, corresponding to an error of 8 % compared to the average value.
For further analysis of the database, PCA was applied to the reference analyses (individuals and variables, with the vintage as an external variable), UV-vis spectra and FTIR spectra, as shown in Figure 1. This allowed us to highlight any grouping of samples or variables, the proper dispersion of data, and any outliers.
For the three PCAs, we looked at the first two dimensions, which explain much of the variation in the datasets, accounting for 78 % of the variation in the dataset for reference data, 91 % for UV-vis spectra and 86 % for FTIR spectra. The graphs of the individuals from reference data show a relatively homogeneous distribution of wines, with, however, several outliers. Indeed, wines 7, 8, 77, 84 and 87 seem detached from the main cloud of points. If we compare this to the graphs of individuals from spectral data, we find that for the UV-vis spectra there are three outliers in common: wines 77, 84 and 87. Regarding the FTIR spectra, these three wines also seem to be detached from the main point cloud, along with an additional outlier: wine 75. Three of these outliers (wines 75, 84 and 87) are distinguished by a concentration of condensed tannins far above average, and wine 77 by a very low concentration. Wines 7 and 8 stand out for their high concentration of molecular anthocyanin.
The variable graph shows a strong distinction between variables, which are split into two groups. The first group is composed of total anthocyanins, free anthocyanin, monomeric pigments, total coumaroyl, total glucoside and total acetyl; it is highly correlated with the first dimension and explains the variation in anthocyanin concentration. In this group, the variables seem to be extremely correlated with each other, except for monomeric pigments. The second group is composed of MCP, BSA, total tannins via the Bate-Smith assay, and all the different parameters of polymeric pigments. It is highly correlated with the second dimension, and explains the variation in concentration of condensed tannins, as well as the variation of polymerised pigments. The vintage was added as an external variable, and we find a correlation similar to that of the parameters defining anthocyanins, as explained by the first dimension.
In order to identify possible groupings of individuals in relation to their vintage and grape variety per polyphenol composition, a graph of individuals was created which highlights these subgroups (Figure 2).
In the first graph of individuals classified according to vintage, several groups can be observed. Indeed, the 2017 vintage, the youngest vintage to be analysed, stands out the most, followed by a grouping for the 2016, 2015 and 2014 vintages. Older vintages do not differ enough to be grouped together. As this separation can be made along the axis of dimension one, it is possible to consider the different parameters of anthocyanins as a marker for the age of a wine.
The graph of individuals classified by grape variety shows no real differentiation, with extensive overlap of the confidence ellipses. Therefore, even if the composition of polyphenols varies from one grape variety to another, there does not seem to be enough variation to differentiate the grape varieties. However, this is only applicable to this dataset, which has a high number of grape varieties relative to the number of individuals.
Overall, the database shows a lot of diversity, very few groupings, and no outliers that cannot be explained by their parameters. The graph of individuals based on reference analyses, UV-vis spectra and FTIR spectra shows a homogeneous point cloud. This dataset therefore seems ideal for applying PLS regression in order to construct models for predicting the composition of polyphenol representative of a finished wine.

FIGURE 2.
Reference analysis PCA grouping per vintage and variety, with confidence ellipses.

Comparison between UV-vis and FTIR models
In order to compare the usefulness of the two spectral zones studied for predicting the polyphenol composition of wines, prediction models were constructed by PLS regression for each reference parameter studied. The results obtained for the UV-vis spectrum are shown in Table 2, and those for the FTIR spectra in Table 3.
Several parameters were reported: coefficient of determination for calibration (R2Cal), coefficient of determination for cross-validation (R2CV), root mean square error of calibration (RMSECal), root mean square error of cross-validation (RMSECV), relative percentage difference of cross-validation (RPD CV) and number of latent variables used to build the model.
The various parameters studied determine whether the prediction models are robust enough to assess the level of polyphenols in the wines in the database. R2Cal describes how well the data fit the calibration line using all samples. While this first parameter alone is not enough to assess the robustness of a model, a low value already indicates the difficulty in organising the data to calibrate a model. R2CV describes how well randomly removed data fit the calibration line. This is an effective indicator of the ability of the model to predict external samples. A model with a high R2Cal and a low R2CV depends on each sample in its design and is unable to correctly predict external samples. To complete these two parameters, RMSECal and RMSECV were added to visualise the margin of error induced by the model. The last important parameter is the RPD CV, calculated using the following formula: RPD CV = SD/RMSECV. According to the literature, an RPD under 1.4 indicates a model that is not reliable for prediction; such a model must only be used as an indicator. When the RPD is greater than 1.4 and less than 2, the model starts being reliable enough to be used for prediction. When the RPD is above 2, the model starts to be considered as good, and when it is above 3, it is considered as excellent (Cozzolino et al., 2011b; Ferrer-Gallego et al., 2011; Martelo-Vidal and Vázquez, 2014).
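The indicators described above can be computed from reference values and predictions as in the sketch below. It assumes that the coefficient of determination is defined as 1 - SSres/SStot and that SD is the sample standard deviation of the reference values; the software used in this work may use slightly different definitions. The function names, the category labels and the handling of values exactly equal to 1.4, 2 or 3 are illustrative choices.

```python
import statistics

def model_metrics(reference, predicted):
    """R2, RMSE and RPD (SD / RMSE) for calibration or cross-validation predictions."""
    n = len(reference)
    mean_ref = statistics.mean(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    rmse = (ss_res / n) ** 0.5
    return {
        "R2": 1 - ss_res / ss_tot,
        "RMSE": rmse,
        "RPD": statistics.stdev(reference) / rmse,
    }

def rpd_category(rpd):
    """Map an RPD value to the reliability classes quoted in the text."""
    if rpd < 1.4:
        return "indicator only"
    if rpd < 2:
        return "usable for prediction"
    if rpd <= 3:
        return "good"
    return "excellent"

# Hypothetical reference values and cross-validation predictions
metrics = model_metrics([1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 1.9, 3.2, 3.8, 5.0])
```

Passing calibration predictions gives R2Cal and RMSECal, while passing cross-validation predictions gives R2CV, RMSECV and RPD CV.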
From the results obtained from the analysis of molecular anthocyanins by HPLC/UV for the two spectral areas in question, it is possible to see the difficulty in predicting these parameters. Compared to other results, none of them show an R2CV above 0.8 or an RPD CV above 2.
Comparing FTIR and UV-vis, we find a slightly better prediction for UV-vis spectra. Only the parameter of total acetyl is better predicted by FTIR spectra. However, the low concentration of total acetyl (mean concentration of 8.33 mg/L) in the studied wines and the high number of latent variables (19) used to obtain the best performing model indicate that this result may be due to overfitting. Moreover, the result is at odds with a previous study which found that the total acetyl parameter was the least well predicted in a dataset of fermentation wines (Miramont et al., 2019). Only the total anthocyanins parameter can be used as an indicator, with RPD CV values for the UV-vis spectra and FTIR spectra of 1.96 and 1.69 respectively.
Regarding the prediction of the Adams-Harbertson method parameters, the comparison of models for UV-vis and FTIR spectra gave heterogeneous results. This assay has the advantage of being able to compare ratios of monomeric pigments and different polymeric pigments, and thus to evaluate the evolution of wine anthocyanins. Prediction models obtained by UV-vis spectra appear to be well adapted to this assessment. With R2CV for MP, SPP, LPP and PP equal to 0.81, 0.88, 0.93 and 0.97 respectively, UV-vis models show strong correlations. On the other hand, with R2CV for MP, SPP, LPP and PP equal to 0.55, 0.71, 0.78 and 0.83 respectively, FTIR models clearly show that they lack precision and cannot be used for these parameters.
For free anthocyanins, the last parameter studied with regard to wine colour, the previously obtained trend remains unchanged. The UV-vis spectra model gives predictive results, with R2CV and RPD CV equal to 0.80 and 2.26 respectively, while the FTIR spectra model is lower in precision, with R2CV and RPD CV equal to 0.71 and 1.86 respectively.
Overall, the comparison of the models obtained with UV-vis and FTIR spectra for anthocyanins shows that UV-vis spectra are superior in their ability to give a good prediction. This difference can be explained by the propensity of anthocyanins to absorb in the visible region and their ability to impact the colour depending on their degree of polymerisation. In addition, the results obtained with the FTIR spectra appear to be weaker than those obtained in similar studies. This can be explained by the high variability of the wines used to build the database. The absorption due to anthocyanins could thus be impacted by the strong background noise caused by the many compounds that absorb in this spectral area. The lack of precision is also increased by the low concentration of anthocyanin compounds in the dataset.
When focusing on the analysis of condensed tannins, we found that the previously observed trend was less clear. For the two parameters studied, BSA and total tannins, the results for UV-vis and FTIR spectra were very close, with a slightly higher accuracy for the infrared spectra.
With an RPD CV value of 2.69 and 2.51 for BSA and total tannins respectively for the FTIR models, this spectral area shows a real potential for condensed tannins prediction, despite a strong variation in the database induced by the diversity of grape varieties, vintages, and origin of the wines. While the values for UV-vis spectra were lower, with an RPD CV value of 2.48 and 2.24 for BSA and total tannins respectively, they also indicate that UV-vis spectra is a reliable method for condensed tannins analysis.
Unlike the two methods presented above, the last parameter, MCP, showed very heterogeneous results for the two spectral areas studied. While the prediction with UV-vis spectra remained consistent with previous results (RPD CV = 2.38 and R2CV = 0.83), the use of FTIR spectra showed a very sharp drop in prediction (RPD CV = 1.70 and R2CV = 0.66). Any analytical bias can be excluded, because the UV-vis results remain predictive; it is therefore necessary to look for the explanation for this difference in the analysis itself. Methylcellulose polymer interacts indiscriminately with condensed tannins and polymerised pigments, and the reading difference at 280 nm does not, unlike the BSA analysis, separate them. We can therefore hypothesise that the diversity of the precipitated compounds is too great for the FTIR reading.
In order to test the complementarity of spectral areas, prediction models were constructed using a combination of FTIR and UV-vis spectra and reported in Table 4. Autoscale pre-processing of the spectra used avoids over-exploitation of one spectral area over the other in the model design.
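The role of autoscaling when two spectral blocks are combined can be illustrated with the sketch below, in which each variable is mean-centred and scaled to unit variance before the UV-vis and FTIR variables are placed side by side. The function names and the small example matrices are hypothetical, and this is only one plausible way of assembling the combined matrix.

```python
import statistics

def autoscale_columns(rows):
    """Mean-centre each column and scale it to unit variance (sample SD)."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [[(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, sds)]
            for row in rows]

def combine_blocks(uvvis, ftir):
    """Autoscale both blocks and concatenate them sample by sample."""
    scaled_uv = autoscale_columns(uvvis)
    scaled_ir = autoscale_columns(ftir)
    return [u + f for u, f in zip(scaled_uv, scaled_ir)]

# Hypothetical data: three wines, two UV-vis variables and two FTIR variables
uvvis_block = [[0.10, 0.80], [0.20, 0.60], [0.30, 0.70]]
ftir_block = [[120.0, 15.0], [100.0, 18.0], [110.0, 12.0]]
combined = combine_blocks(uvvis_block, ftir_block)
```

After scaling, every variable has the same variance, so neither block dominates the model simply because of its measurement scale or its number of large-valued variables.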
An overall improvement in the results can be seen when combining UV-vis and FTIR spectra compared to those obtained with just UV-vis or just FTIR spectra. However, they are still relatively similar to the best result obtained via each of the two spectra. It is difficult to attribute the increase in robustness to a complementarity of both spectra. One hypothesis is that it is due to a decrease in noise from other wine compounds, which could also explain the overall decrease in latent variables needed to design the best prediction model. Overall, the prediction gain would be negligible compared to instruments which would be required to perform the analysis, and stand-alone UV-vis and FTIR analyses would be preferable.
We also find that the parameters which do not show improvement are related to anthocyanins, demonstrating the large involvement of the visible area in the design of predictive models. The addition of the FTIR area adds more uncertainty than robustness for these compounds.
Since the visible area greatly improves the prediction of anthocyanins, specific visible wavelengths could be integrated to increase the robustness of the FTIR models. The wavelengths 420 nm, 520 nm and 620 nm were chosen because they are already used as colour parameters in the wine industry (to calculate colour intensity and modified colouring intensity) and because some FTIR devices used in wine analysis offer these parameters. Therefore, the models could be directly applied to these devices, should this combination increase the robustness of the prediction.
Prediction models using the combination of FTIR spectra and wavelengths 420 nm, 520 nm and 620 nm are shown in Table 5.
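Building the augmented predictor matrix can be sketched as below. This is only an illustration under stated assumptions: the UV-vis spectra are assumed to be stored as a matrix with a known wavelength axis, the three absorbances are taken at the nearest available wavelengths, and they are appended as extra columns to the FTIR matrix. The function names, array sizes and synthetic data are hypothetical, and any pre-processing applied in the study is omitted.

```python
import numpy as np


def pick_wavelengths(wavelengths_nm, spectra, targets_nm=(420.0, 520.0, 620.0)):
    """Return the absorbance columns closest to the requested wavelengths."""
    wavelengths_nm = np.asarray(wavelengths_nm, dtype=float)
    spectra = np.asarray(spectra, dtype=float)
    idx = [int(np.argmin(np.abs(wavelengths_nm - t))) for t in targets_nm]
    return spectra[:, idx]


def augment_ftir(ftir, visible_points):
    """Append the selected visible absorbances as extra FTIR variables."""
    return np.hstack([np.asarray(ftir, dtype=float), visible_points])


# Synthetic example: 6 wines, UV-vis sampled every 2 nm from 200 to 800 nm.
rng = np.random.default_rng(0)
wavelengths_nm = np.arange(200, 801, 2)
uvvis = rng.random((6, wavelengths_nm.size))
ftir = rng.random((6, 50))

vis3 = pick_wavelengths(wavelengths_nm, uvvis)
X_aug = augment_ftir(ftir, vis3)
print(X_aug.shape)
```

The resulting matrix has three more columns than the FTIR block and could then be passed to the same PLS procedure as the FTIR spectra alone.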
Compared to the previous results obtained for FTIR spectra alone, an overall increase in accuracy for all anthocyanin parameters can be observed. An overall decrease in the number of latent variables used to design the most predictive model also suggests that FTIR spectra, combined with the three visible wavelengths, can explain the dataset more easily.
The parameters obtained by HPLC/UV analysis show that, even if the robustness of the models has increased, their accuracy is still insufficient. In order to be usable, the precision of very specific parameters (e.g., anthocyanin monoglucoside, monoglucoside acetyl and monoglucoside coumaroyl, with RPD CV values of 1.70, 1.79 and 1.59 respectively) would need to be increased. The sum of the monomeric anthocyanins, with an RPD value of 1.80, shows too much uncertainty to be used as a prediction of concentration, but it can be considered as an indicator of the level of nonpolymerised anthocyanins. The same observation can be made for free anthocyanins measured by bisulfite bleaching; with an RPD CV value of 1.90, this parameter can be considered as an indicator and not as a precise concentration reading.
The results obtained using the Adams-Harbertson method show a considerable increase in performance. The RPD CV values for MP, SPP, LPP and PP increase from 1.45, 1.84, 2.06 and 2.37 respectively for FTIR spectra alone to 1.83, 2.28, 2.41 and 3.15 respectively for FTIR spectra with visible wavelengths. Thus, the accuracy of these parameters, which could be considered too weak to be exploitable with FTIR alone, is now high enough to be considered. The addition of these wavelengths will allow an FTIR device to assess the level of evolution of anthocyanins in an analysed wine through the ratio between the monomeric pigments and the different polymerised pigments.

CONCLUSION
This study aimed to compare and identify the main qualities of two spectroscopic areas, UV-vis and FTIR, coupled with chemometrics, used for predicting the concentrations of the main polyphenols in wine. The sample set was chosen for its high variability, in order to avoid any bias related to the vintage, grape variety or winemaking method.
The results showed that, overall, the analysis of UV-vis spectra seemed more appropriate for the prediction of wine polyphenols, with anthocyanin predictions well above those obtained from FTIR spectra and condensed tannin predictions slightly below. However, FTIR models showed very conclusive results in the analysis of condensed tannins, and the versatility of this method (which can also measure other important parameters of wine, such as acids, ethanol or sugars) suggests that it would be preferable to the UV-vis method.
In order to test the complementarity of the two spectral areas, prediction models were constructed with the combination of FTIR and UV-vis spectra. Although most of the studied parameters showed a slight increase in prediction, it was not enough to indicate a real complementarity. However, this study has shown the value of the visible area in the prediction of anthocyanins. To improve FTIR prediction models for these compounds, new models were developed by adding three specific visible wavelengths (420 nm, 520 nm and 620 nm) to the FTIR spectrum. These new models demonstrate that the addition of such wavelengths could compensate for the lack of predictive efficiency of FTIR spectra in the quantification of anthocyanins.
To complete this study, new wine samples could be used to validate the effectiveness of the built models and the differences between the two spectral areas. In addition, variability could be investigated by adding fermenting wines to look for the impact of variation in compounds, such as ethanol or sugar, on the robustness of models.