Development of a Hierarchical Rate-All-That-Apply (HRATA) methodology for the aromatic characterisation of wine

Wine aromatic characterisation is generally a complex task, even for well-trained assessors. To facilitate such characterisation, aroma terms are typically arranged in some sort of hierarchical structure, such as aroma wheels. However, information about this structure is lost with existing data acquisition and treatment methods. To fill this gap, we propose a new approach, Hierarchical-Rate-All-That-Apply (HRATA), for the characterisation of products. It combines the Rate-All-That-Apply (RATA) methodology with a hierarchical structuring of general and specific attributes. The aim is first to facilitate data acquisition and, secondly, to account for the hierarchical links among attributes during data analysis. We applied an HRATA approach to the characterisation of five rosé wines by 66 subjects based on 118 hierarchically structured aromatic attributes. Using monadic evaluation, assessors were asked to select all the attributes that characterised each wine and to rate their intensity on a three-point scale. For the data analysis, an initial coding step was carried out to represent the hierarchical structure of the attributes, which also made it possible to manage a large amount of non-evaluated data. After that, statistical tests and multivariate analyses were tailored for both the identification of discriminating attributes and the determination of a product map. Finally, the characterisation obtained with HRATA was compared to the results obtained from a descriptive analysis (DA) conducted by a trained panel. HRATA represents an interesting alternative for obtaining aromatic characterisation using a panel of subjects without collective common training or with diverse skill sets.


INTRODUCTION
Quantitative Descriptive Analysis (QDA) by a trained panel is often considered the reference method for assessing the sensory characteristics associated with a product (Sidel, 2004). However, this method is both time-consuming and expensive due to the amount of training it requires. There is thus an ongoing search for alternative methods of sensory analysis that are easy, fast, and cost-effective.
In recent years, sensory characterisation has become increasingly oriented toward consumers (Ares and Varela, 2017), and several methods have been developed for use with non-specialist panels, such as Check-All-That-Apply (CATA) or Rate-All-That-Apply (RATA) (Ares et al., 2014; Danner et al., 2017; Meyners et al., 2016; Vidal et al., 2018; Copper et al., 2019; Mezei et al., 2021; Pineau et al., 2022). While these methods are not as accurate as descriptive analysis, Danner et al. (2017) reported that when the RATA method was applied to wines with a panel of untrained consumers, it generated results that were, to a large extent, similar to those obtained from a descriptive analysis carried out with a trained panel.
With regard to the wine sector, sensory methods have also been adapted for use with panels composed of professionals without collective training (Campo et al., 2010; Coulon-Leroy et al., 2017; Lawrence et al., 2013; Longo et al., 2020; Perrin et al., 2008). Other proposed methods of sensory characterisation of wine are based on free description, such as free profile methods (Perrin et al., 2008; Varela and Ares, 2014) or mixed profile methods (Coulon-Leroy et al., 2017). In these, each subject freely generates their own attributes to describe wine, with or without a predefined list. Free comment methodology has been successfully used for the description of wine by both consumers and professionals (Lawrence et al., 2013; Mahieu et al., 2020). The advantage of these methods is that they are more flexible, offering greater freedom to the subjects. However, they also generate a high number of individual attributes, which makes data aggregation difficult and can hinder the interpretation of the results.
Wine is aromatically complex (Spence and Wang, 2018), and to simplify the wide spectrum of aroma terms, these are often arranged hierarchically or grouped so that a large number of attributes can be structured and managed. Categories and aromatic terms can be represented as an aroma wheel, as proposed by Noble et al. (1984), Noble et al. (1987) and Sáenz-Navajas et al. (2021), or arranged in a table, as in Caillé et al. (2017), Campo et al. (2008), Campo et al. (2010), or Bindon et al. (2014). With these approaches, terms are structured in such a way as to create nested groups of general and specific attributes. For example, the group 'Fruity' includes the sub-group 'Citrus', which, in turn, includes the attributes 'Lemon' and 'Orange'.
Initially, the purpose of Noble's wheel was to align the various terminologies proposed to describe the aromatic characteristics of wines, to facilitate communication between winemakers, researchers, marketers and consumers (Lawless and Civille, 2013; Noble et al., 1984). Some authors have used it as a basis for training panellists, as in the work of Caillé et al. (2017) or Bindon et al. (2014). The main benefits of nesting are that it reduces the number of attributes to be considered and can facilitate the standardisation of terms among studies or panels. For this reason, the nesting of attributes is a common step in the training session of a panel. Nowadays, using hierarchical links between attributes to simplify sensory description is common practice in sensory analysis (Bindon et al., 2014; Caillé et al., 2017; Larssen et al., 2018; Villière et al., 2018). However, to the best of our knowledge, information on the hierarchical structure of attributes (general and specific) is lost in the process of data acquisition or data analysis. In Campo et al. (2008), the authors proposed a sensory descriptive analysis based on citation frequencies, which involved a large number of terms structured in categories. However, the evaluation was mainly focused on specific attributes considered as independent. Even though they considered the hierarchical nature of the terms when they built their contingency table, no strategy was proposed to carry the information from specific attributes up to the categories they belonged to, or to account for the proximity between attributes.
This study aimed to go further by incorporating the hierarchical structure of a wine-odour lexicon into the analysis of a sensory descriptive task, and not just by using the structure during the characterisation by the panel. Indeed, it is not easy to assess all attributes quantitatively (as in quantitative descriptive analysis) due to the high aromatic complexity of wines. Nevertheless, we hypothesised that a hierarchical method could allow aromatic intensity to be assessed and ultimately increase discriminating power, because only the most salient attributes would be expected to be selected (Vidal et al., 2018); we therefore chose to base our approach on the RATA methodology. In comparison to a classical RATA analysis, which includes between 15 and a maximum of 50 attributes (e.g., Ares et al., 2018), we used a total of 118 attributes. We therefore propose here a Hierarchical-Rate-All-That-Apply method (HRATA), including techniques for data acquisition and data treatment, for use in generating reliable aromatic characterisations of wine in a professional context (wine competitions, sensory controls as part of the regulated quality schemes, sensory benchmarking by companies or Geographical Indications unions).
The study is divided into two parts:
- Firstly, we will present the implementation of the HRATA approach. Particular attention will be paid to the statistical treatments used for HRATA data and the adaptations of the RATA method of data treatment (Meyners et al., 2016) to the specificities of the HRATA dataset.
- Secondly, we will present sensory descriptions obtained using the HRATA method for characterisation of the aromatic profiles of rosé wines. Finally, the results obtained using the HRATA approach will be compared to those of a descriptive analysis performed on the same set of products.

Samples and methods of presenting wines
Five French rosé wines were selected from the 2018 and 2019 vintages based on the expertise of the research group GRAPPE (Angers, Loire Valley area, France) and the 'Rosé Wine Experimentation and Research Centre' (Vidauban, Provence area, France), with the goal of presenting the panel with a wide range of aromatic notes. The selection comprised three wines from Provence, two wines from Middle Loire Valley and one wine from the Rhône Valley area, from various terroirs with a diversity of winegrowing (soil and climate characteristics, variety, yield) and winemaking practices (maceration times, fermenting temperature, ageing). Of these five wines, one was replicated among the samples presented to assess the reproducibility of the method. In this study, the five wines are designated A to E, and the duplicated samples are indicated by E and E*.
For each method (HRATA and DA), samples were presented in a monadic sequence according to a Williams Latin square design. Wines were served at 13 °C (using ice bags) in black glasses (4 cL per glass) so that colour did not influence the subjects' aromatic perception (Coulon-Leroy et al., 2018). Wines were labelled with three-digit random codes. In both cases, data collection took place in standard sensory booths (ISO, 2007) under white illumination (at 'l'Ecole Supérieure des Agricultures', Angers, France, and at the 'Rosé Wine Experimentation and Research Centre', Vidauban, France). Data collection was automated using FIZZ software (Biosystèmes®, 1990).
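As an illustration of the presentation design, the following minimal sketch builds a Williams Latin square for an even number of samples, in which every sample appears once in each serving position and each sample immediately follows every other sample exactly once. The function name `williams_square` and the cyclic construction are assumptions made for illustration; the source does not describe how the design was generated.

```python
def williams_square(n):
    """Return an n x n Williams Latin square (rows = serving orders).

    Hypothetical helper: this sketch only covers an even number of samples.
    """
    if n % 2 != 0:
        raise ValueError("this sketch only covers an even number of samples")
    # First row interleaves low and high indices: 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n - 1
    while low <= high:
        first.append(low)
        if low != high:
            first.append(high)
        low += 1
        high -= 1
    # Each further row shifts every entry of the first row by one (mod n).
    return [[(s + shift) % n for s in first] for shift in range(n)]


# Illustrative labels for the six samples (five wines plus the replicate).
labels = ["A", "B", "C", "D", "E", "E*"]
orders = [[labels[i] for i in row] for row in williams_square(len(labels))]
```

With six serving orders, a subject with index `j` could for instance be given `orders[j % 6]`; this allocation is again only an assumption.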

Descriptive analysis by trained subjects (DA)
A descriptive analysis was carried out by a trained panel from the 'Rosé Wine Experimentation and Research Centre'. This panel was composed of 13 trained judges (six women and seven men, between 35 and 65 years old) who were specialised in the description of rosé wines. They had signed an employment contract with the 'Rosé Wine Experimentation and Research Centre' and received a salary. All judges had received regular training for at least three years on the 15 aromatic attributes presented on the right side of Table 2. These aromatic attributes were not generated specifically for the chosen five wines but were selected according to the expertise of the 'Rosé Wine Experimentation and Research Centre'. They are typically used for the characterisation of rosé wines, and the order of the attributes was the same for each judge and identical to that used in training (firstly, the judges evaluated the odours by orthonasal olfaction and secondly, the aromas by retronasal olfaction). In addition to this general training and the selection of the attributes by the panel leader, judges also participated in two specific training sessions (on two separate days), which lasted 1.45 hr and were dedicated to the evaluation of the rosé wines used in the present study. All the wines (A, B, C, D, E, and the replicate E*) were evaluated during the same session (on another day). Attributes were assessed using a continuous scale, which was then transformed into a rating ranging from 0 to 10.

Hierarchical-Rate-All-That-Apply task (HRATA) methodology
The main goal of HRATA is to propose a hierarchical structure of attributes that depicts the set of samples involved in the evaluation. Subjects can freely select as many attributes (family, category, and/or terms) as they wish, depending on their individual appreciation (Figure 1).
The family name and category names were also available for selection if judges could not select specific attributes under each. Each subject is then asked to rate (three-point scale) the perceived intensity of each selected attribute.
As the objective was firstly to develop the method rather than to characterise the wines exhaustively, and secondly to avoid tiring the panel, we chose not to collect a complete description of the wines but to focus only on the lexicon of odours. Judges were asked to focus only on orthonasal aromatic description.

FIGURE 1.
A screenshot showing how the 118 attributes could be checked off and evaluated on a 3-point intensity scale (FIZZ software, Biosystèmes®, 1990).

Subjects
Sixty-six subjects (41 wine consumers, 13 students in viticulture and oenology, and 12 professionals from the wine sector) were recruited for the HRATA experiment. All subjects possessed a minimum level of wine knowledge, which was evaluated using a questionnaire similar to that presented by Koenig et al. (2020). Subjects with an objective knowledge score greater than or equal to 8/14 were retained for the experiment. The type or level of expertise was not considered during the data analysis. As the objective is to use HRATA in a professional context and without consuming too much time, subjects did not receive any collective or specific training in the description of rosé wine before the HRATA task.
Table 1 depicts the socio-demographic characteristics of the subjects involved in the experiment.

Hierarchical structuration
Attributes used for the description of wines were arranged according to the hierarchical structure presented on the left-hand side of Table 2. These HRATA attributes were selected and hierarchically structured according to the results of Koenig et al. (2021). The hierarchical structure was intended to be general for wine description and not specific to the wines from the study. Here, the term 'families' is used for more general attributes (e.g., 'fruit'), 'categories' refers to intermediate attributes (e.g., 'red fruit' or 'white/yellow fruit'), and 'terms' are the more specific attributes (e.g., 'blackcurrant', 'cherry', or 'apricot'). Thus, if a subject specifically recognised the odour of 'lemon', they could select it on the software interface. If subjects only perceived a fruity odour, without a specific identity, they could select the attribute 'fruit' or 'citrus'.

Application of HRATA to wine characterisation
The six samples (five wines plus a duplicated sample for one) were characterised according to the HRATA methodology. From the table of 118 attributes, the 66 subjects were instructed to select all attributes that corresponded to odours perceived in the wine. As our aim was not to characterise products exhaustively but to test the HRATA approach, we focused on orthonasal aromatic description, which is a less-demanding sensory task.
A full hierarchical structure of the attributes was presented to the subjects on a paper sheet so that they had an overall view of the attributes and their structure. The main objective of HRATA is to propose a hierarchical structure of attributes representing all the samples concerned by the evaluation. At this stage, this structure is materialised using an exploration interface in which each family is identified by a tab grouping all the attributes belonging to it. In the same way, the subject can interactively choose an attribute to visualise the nested terms associated with it. During the evaluation phase, the subject can thus select, without constraint, as many attributes (family, category and/or terms) as they wish according to their individual sensitivity (Figure 1). For each attribute selected (family name, category name, or, at the most specific level, a term), subjects were then asked to assign an intensity score on a three-point scale ('low', 'medium', and 'intense'). This prototype interface was designed in FIZZ software (Biosystèmes®, 1990) to respond specifically to the desired structuring of HRATA. Families and categories were presented as tabs, and the attributes belonging to each family (or category) were listed in the corresponding tab. It was easy to switch from one tab to another to find an attribute.

Data coding of the raw HRATA evaluations
As the HRATA subjects could rate the set of products with different attributes and, most importantly, at different levels of generality, we designed a dedicated statistical processing strategy for HRATA evaluation. Compared to conventional RATA data, our dataset was composed of a larger number of attributes (118, comprising 89 terms, 20 categories, and 9 families), of which only a few were likely to be selected. Consequently, our HRATA dataset contained a much larger proportion of non-evaluated data than what is usually encountered with RATA. To address this, an initial coding step was performed and the statistical techniques usually applied to RATA data (see Meyners et al., 2016) were adapted to this specific data configuration.
The HRATA data analysis process, including the coding pre-treatment, is illustrated in Figure 3 and detailed below.

Step 1: Data coding with the integration of the hierarchy
HRATA data were collected in a product × attribute data table for each subject, with intensity scores ranging from 1 (low) to 3 (high) (Figure 3(a)). If an attribute was not selected by a subject for a given product, then the corresponding cell was left empty (as for the 'Fruit' attribute for Wine A in the example of Figure 3(a)). At this stage, the data array included all the attributes selected, regardless of their nature, i.e., without taking into account the hierarchical structure.
To incorporate the hierarchical structure of the attributes, a first coding step was used to aggregate intensity ratings from the lower hierarchical levels to the higher ones, as explained below (Figure 3(b)). A second coding step was then used to address non-evaluated data (Figure 3(c)).
The hierarchical structure of the data was accounted for using the following rules:
(1) If a subject rates a category, then the rating provided to the category is imputed to the family to which it belongs.
(2) If a subject rates a term, then the rating provided to the term is imputed to both the category and the family to which it belongs (following rule (1)).
In practice, if several terms were evaluated within a category (or family), an aggregation rule was applied to combine the various scores at the lower level into a single value at the higher one. More specifically, the maximum aggregation rule was adopted:
(3) If several terms are evaluated within a single category, then the category is assigned the highest value given to any term nested within it. The same reasoning applies to the family level: if several categories (or terms nested within them) are evaluated within a single family, then the same rule is applied and the highest value of the different scores is assigned to the family.
In this way, a product × attribute data table was computed for each subject that represented the hierarchical structure of the odour terms used in the evaluation.

TABLE 2.
On the left are the attributes used for HRATA analysis; on the right are the attributes used for descriptive analysis (DA). To better compare the two methods, the rows indicate the likely equivalence between DA and HRATA attributes; the DA attributes were fitted into the same hierarchical system for interpretation but not for data capturing or analysis (in italics, attributes fitting several categories).
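The three aggregation rules of this coding step can be sketched as follows for one subject and one wine. This is a minimal illustration, not the authors' in-house script: the `PARENT` mapping is a hypothetical fragment of the lexicon, and the sketch assumes that a category or family rated directly by the subject is also combined with propagated ratings through the maximum.

```python
# Hypothetical fragment of the lexicon: term -> category, category -> family.
PARENT = {
    "lemon": "citrus",
    "orange": "citrus",
    "cherry": "red fruit",
    "citrus": "fruit",
    "red fruit": "fruit",
}


def integrate_hierarchy(ratings, parent=PARENT):
    """Propagate each rating upwards, keeping the maximum at every level.

    ratings: dict attribute -> intensity (1..3) for one subject and one wine;
             attributes that were not selected are simply absent.
    """
    coded = dict(ratings)
    for attribute, score in ratings.items():
        node = parent.get(attribute)
        while node is not None:
            # Maximum aggregation rule (assumed to include direct ratings).
            coded[node] = max(coded.get(node, 0), score)
            node = parent.get(node)
    return coded
```

For example, a subject who rated only 'lemon' at 2 and 'cherry' at 3 would, under these assumptions, end up with 'citrus' at 2, 'red fruit' at 3 and 'fruit' at 3.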

Step 2: Rules for non-evaluated data and attributes
At this stage, each table contained numerous non-evaluated attributes. In a typical RATA data treatment, the non-evaluated attributes are commonly imputed with a value of 0 (Ares et al., 2018; Danner et al., 2017; Meyners et al., 2016; Oppermann et al., 2017; Vidal et al., 2018). However, this can have a significant impact on the results when the number of non-evaluated attributes is large. Moreover, one can argue that if an attribute was not evaluated by a subject, this does not necessarily mean that the subject considered this attribute to have an intensity of zero; instead, it could simply mean that they did not take it into account, given the large number of attributes available.
A high frequency of zero values can bias usual indicators of central tendency and dispersion. This potential for bias was amplified here because of the larger number of attributes considered; typical RATA datasets generally comprise between 15 and 50 attributes (e.g., Ares et al., 2018), while our dataset contained 118 attributes (9 families, 20 categories, and 89 terms). The number of attributes available for selection was, therefore, much greater than the number of attributes evaluated by each subject for each wine.
To limit this bias, an additional coding rule was defined as follows:
(1) If an attribute was never selected by a subject for any product, then the attribute qualifies as non-considered (NC) (as for 'mango' in Figure 3(c)).
(2) If an attribute was selected by the subject for at least one product, then the non-evaluated data for the other products are considered to be of zero intensity and imputed as 0.
This imputation rule aimed to differentiate among attributes that, on the one hand, were simply not perceived by a subject in a given product and, therefore, should have an intensity of zero and, on the other hand, those that were never taken into consideration for any product and should therefore correspond to NC.
This rule made it possible to filter the initial table by only retrieving the attributes selected by a subject, i.e., those with at least one assigned value for one product. As a result, the number of attributes considered varied among subjects. Finally, the retained attributes were transformed from three-point intensity variables to four-point intensity variables {0,1,2,3} to take into account the imputation rule.
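The imputation rule for non-evaluated data can be sketched as below for a single subject. The function name `apply_nc_rule` and the use of `None` to stand for NC are illustrative assumptions, not part of the original data treatment.

```python
def apply_nc_rule(subject_table, attributes):
    """Code non-evaluated cells as 0 or NC for one subject.

    subject_table: dict wine -> dict attribute -> intensity (1..3), as obtained
                   after hierarchical integration; unselected attributes are absent.
    attributes:    full list of attributes in the lexicon.
    Returns dict wine -> dict attribute -> 0..3, or None when the attribute is
    non-considered (NC), i.e. never selected by this subject for any wine.
    """
    considered = {a for scores in subject_table.values() for a in scores}
    coded = {}
    for wine, scores in subject_table.items():
        coded[wine] = {
            # Selected at least once: a missing rating means zero intensity.
            # Never selected: the attribute stays NC (None) for every wine.
            a: (scores.get(a, 0) if a in considered else None)
            for a in attributes
        }
    return coded
```

Filtering out the `None` entries then yields the subject-specific set of retained attributes on the four-point scale {0,1,2,3}.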

Statistical data analysis of recorded HRATA data
The data obtained after these coding steps were then analysed to identify the attributes that could significantly differentiate among the various products and to obtain a product map.

Determination of discriminant terms, categories, and families
When dealing with RATA data, in most cases, the failure to meet assumptions of data normality does not prevent the use of analysis of variance in testing whether or not an attribute is discriminant. Indeed, Meyners et al. (2016) showed, based on the comparative analysis of two consumer studies using the RATA question, that this type of test is robust for analysing RATA data. Tests for product difference usually gave the same results when RATA data were analysed with ANOVA and Cochran's Q test. However, due to the large number of attributes proposed in our HRATA approach, a very pronounced asymmetry was found in the distribution of the scores for a large majority of low-level attributes (terms), with a high frequency of zeros; this made ANOVA an inappropriate choice for our analysis.

FIGURE 2.
Sheets available for subjects to help them visualise all the attributes and their hierarchical organisation during the HRATA experiment, as the software interface was still at a prototype stage.

Léa Koenig et al.
As an alternative, we decided to assess the frequency of use. For a given attribute, the question was whether its frequency of selection differed significantly among products and among subjects. Attributes were, therefore, transformed from four-point intensity variables to binary variables (0 for zero intensity and 1 otherwise, i.e., with an intensity value of 1, 2, or 3). A logistic regression was then performed for each attribute, with subject and product as independent factors (Dobson and Barnett, 2008). As mentioned above, subjects who did not consider an attribute were excluded from the analysis of that attribute, so that the number of subjects varied from one attribute to another. From the logistic regression outputs, discriminant attributes were identified using a level of risk, α, of 5 %.
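The data preparation that precedes the per-attribute logistic regression can be sketched as follows: intensities are binarised and subjects who never considered the attribute are dropped. The function names and the `None`-for-NC convention are illustrative assumptions; the model fit itself is not shown (the study used the R `glm` function, presumably with a binomial family and subject and product as factors).

```python
def selection_records(coded_tables, attribute):
    """Build (subject, wine, selected) records for one attribute.

    coded_tables: dict subject -> dict wine -> dict attribute -> 0..3 or None,
                  where None stands for a non-considered (NC) attribute.
    """
    records = []
    for subject, wines in coded_tables.items():
        values = {wine: scores.get(attribute) for wine, scores in wines.items()}
        if all(v is None for v in values.values()):
            continue  # subject never considered this attribute: excluded
        for wine, v in values.items():
            records.append((subject, wine, 1 if v else 0))  # binarisation
    return records


def selection_rate(records):
    """Proportion of retained subjects who selected the attribute, per wine."""
    counts, totals = {}, {}
    for _, wine, selected in records:
        counts[wine] = counts.get(wine, 0) + selected
        totals[wine] = totals.get(wine, 0) + 1
    return {wine: counts[wine] / totals[wine] for wine in totals}
```

The resulting records form the long-format table (one row per subject and product) on which a binary response model can be fitted.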

Multidimensional descriptive analysis
A multidimensional descriptive analysis allows products to be described based on a set of attributes. To this end, data were aggregated for each attribute across all subjects who scored that attribute. Dravnieks' score was used to take into account both the frequency of selection of the attribute and the intensity values assigned (Dravnieks, 1982; Vidal et al., 2018). For the calculation of this score, the frequency of attribute selection and the sum of intensity values were expressed as a percentage of the maximum possible values for each of the respective criteria (frequency and intensity).
The geometric mean of these two percentages was then used to obtain the percentage of applicability of the attribute.
Equation 1:
$$D_{ik} = 100 \times \sqrt{\frac{p_{ik}}{p_{i}} \times \frac{x_{ik}}{x_{i}}}$$
$D_{ik}$: Dravnieks score for the attribute $i$ and the product $k$
$p_{ik}$: number of subjects who rated the attribute $i$ for the product $k$
$p_{i}$: number of subjects who rated the attribute $i$ for at least one product
$x_{ik}$: sum of intensity scores for the attribute $i$ and the product $k$
$x_{i}$: sum of the maximal possible values for the attribute $i$ over the subjects who rated it for at least one product

FIGURE 3.
Schematic representation of the HRATA data analysis, including a coding step designed to (b) integrate the hierarchical structure and (c) apply a rule for non-evaluated data. (d) Data were then subjected to logistic regression for each attribute, involving subsets of subjects and computation of the Dravnieks scores for PCA. For illustrative purposes, the evaluation of six attributes for three wines is depicted. Family-level attributes are underlined and in bold; category-level attributes are in bold.

A Principal Component Analysis (PCA) was performed on the Dravnieks score matrix (Figure 3(d)), with the 20 categories as active variables. The variables were centred but not standardised (cov-PCA) as they were derived from the same Dravnieks scoring procedure. The terms and families were introduced as supplementary variables. This emphasis on category attributes derives from the fact that, in the hierarchical wine lexicon, categories correspond to the attributes commonly involved in the sensory description of wines (e.g., King et al., 2013; Green et al., 2011; Danner et al., 2017). Most of the commonly used attributes (at a 'category level') were present in the list of DA attributes used by the 'Rosé Wine Experimentation and Research Centre' (Table 2). Therefore, we considered the category level to be the baseline of our hierarchy. Using supplementary variables then made it possible to be more precise (with the terms) or more global (with the families).
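The Dravnieks score used as input to the PCA, i.e. the geometric mean of the selection frequency and the summed intensity, each expressed as a percentage of its maximum, can be sketched for one attribute as follows. The function name and input layout are illustrative, and the sketch assumes that the maximum possible intensity sum equals the top of the scale (3) multiplied by the number of subjects who considered the attribute.

```python
from math import sqrt


def dravnieks_scores(ratings, max_intensity=3):
    """Dravnieks score (in %) per wine for ONE attribute.

    ratings: dict subject -> dict wine -> intensity 0..3, restricted to the
             subjects who rated the attribute for at least one wine
             (must not be empty).
    """
    p_i = len(ratings)                     # subjects who considered the attribute
    x_i = max_intensity * p_i              # assumed maximum possible intensity sum
    wines = sorted({w for per_wine in ratings.values() for w in per_wine})
    scores = {}
    for wine in wines:
        values = [per_wine.get(wine, 0) for per_wine in ratings.values()]
        p_ik = sum(1 for v in values if v > 0)   # subjects who rated it for this wine
        x_ik = sum(values)                       # summed intensity for this wine
        scores[wine] = 100 * sqrt((p_ik / p_i) * (x_ik / x_i))
    return scores


# Hypothetical data: six subjects considered the attribute.
example = {
    "s1": {"A": 2, "B": 0}, "s2": {"A": 2, "B": 0}, "s3": {"A": 1, "B": 3},
    "s4": {"A": 1, "B": 0}, "s5": {"A": 0, "B": 0}, "s6": {"A": 0, "B": 0},
}
scores = dravnieks_scores(example)
```

In this made-up example, four of six subjects select the attribute for wine A with a summed intensity of 6 out of 18, giving 100 × √(4/6 × 6/18) ≈ 47.14; one subject rating wine B at 3 gives 100 × √(1/6 × 3/18) ≈ 16.67.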

Comparison between HRATA and Descriptive Analysis (DA)
The characterisation provided by the HRATA method was compared to the results obtained from DA. A two-way analysis of variance with interaction (product as a fixed effect, subject as a random effect) was performed on the DA data to identify the attributes that discriminated among products (p-value < 0.05). For each attribute, the average score across all trained judges was computed, and the averaged data matrix was submitted to a PCA. As for the HRATA analyses, the variables were centred but not standardised (cov-PCA), as the scale is the same for all descriptors and the judges (expert panel) were trained together in the use of the scales.
The overall similarity between the product configurations obtained by HRATA and DA was assessed using the RV coefficient (Robert and Escoufier, 1976).
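For reference, the RV coefficient between two product configurations X and Y (same products in the same row order, columns centred) is tr(XXᵀYYᵀ) / √(tr((XXᵀ)²) · tr((YYᵀ)²)). The pure-Python sketch below is illustrative only; the function name and the small example configurations are assumptions, not the study's data.

```python
def rv_coefficient(x, y):
    """RV coefficient between two configurations given as lists of rows.

    x, y: one row per product (same products, same order), e.g. PCA coordinates.
    """
    def centre(m):
        cols = list(zip(*m))
        means = [sum(c) / len(c) for c in cols]
        return [[v - mu for v, mu in zip(row, means)] for row in m]

    def cross_product(m):
        # n x n matrix of scalar products between rows (M M^T).
        return [[sum(a * b for a, b in zip(r1, r2)) for r2 in m] for r1 in m]

    def trace_of_product(a, b):
        # trace(A B) for two n x n matrices.
        n = len(a)
        return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))

    sx, sy = cross_product(centre(x)), cross_product(centre(y))
    denom = (trace_of_product(sx, sx) * trace_of_product(sy, sy)) ** 0.5
    return trace_of_product(sx, sy) / denom


# Hypothetical configurations: y is x rotated by 90 degrees.
x = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
y = [[0.0, 1.0], [-1.0, 0.0], [1.0, -1.0]]
rv = rv_coefficient(x, y)
```

Because the coefficient depends only on the scalar products between products, it is unchanged by rotation or uniform scaling of either configuration, so the rotated example above yields a value of 1 up to rounding.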
All statistical analyses were performed using in-house scripts written in R (version 3.6.0). Logistic regression was performed using the glm function. PCAs were performed using the PCA function of the FactoMineR package (Husson et al., 2018).

Sensory characterisation by HRATA
Integration of the hierarchy
Data were aggregated from lower levels to higher ones, as illustrated in Figure 3(b). Figure 4 depicts the number of attributes (categories and families) with non-zero intensity values before and after hierarchical integration, demonstrating how this step reduces the number of missing scores. For example, hierarchical integration increased the frequency of scores for the 'fruit' family attribute from 27 to 302. Indeed, the 'fruit' family itself originally received very few scores since the majority of subjects described their sensory perceptions using finer-scale (category- or term-level) attributes within this family; the difference was particularly notable in this case since the family 'fruit' contains 5 categories and 31 terms. Another interesting example was the 'chemical' attribute, which was only rarely selected as a family or category, while the terms 'alcohol' and 'rubber' within those groups were chosen more often.

Determination of discriminant terms, categories, and families
Logistic regression was performed to determine the discriminant attributes, i.e., those for which the probability of selection differed significantly among products. Table 3 shows the 5 family-level attributes, 10 category-level attributes, and 23 terms for which a significant product effect was identified (p-value < 0.05) (from a total of 9 families, 20 categories, and 89 terms). Half of the categories discriminated among products, in the sense that they were more frequently cited for certain products than for others. Within these categories, some terms stood out. For example, the 'lactic' category had a very low p-value (Table 3), and within this category, the term 'butter' significantly distinguished certain products from others, while the terms 'yeast' and 'bread' did not. In the case of the 'earthy' category (p-value for the product effect of less than 5 %), though, none of the terms within the category were found to be discriminant; it seems that only the combination of terms in the category made it possible to differentiate among products. On the other hand, some categories were not associated with any significant discriminating ability (e.g., 'chemical_c'), while certain terms belonging to them were (e.g., 'alcohol' and 'rubber').

Multidimensional data analysis
Data were analysed using a non-standardised PCA on the matrix of Dravnieks scores. All category-level attributes were used as active variables, while family-level attributes and terms were included as supplementary variables.
The results of the PCA with only the category-level attributes are displayed in Figure 5. For the sake of clarity, the discriminant categories (listed in Table 3) are shown in black in the panel, with all other categories in grey (Figure 5). Figure 6 shows the discriminant category-level attributes (in black) together with all discriminant terms and family-level attributes (Table 3) projected as supplementary variables onto the first two principal component axes.
The projection of specific terms onto the PCA plot in Figure 6 made it possible to obtain more detailed information about the specific odour notes of the wines. We can illustrate this by focusing on two examples, namely the 'chemical' category and the 'floral' category.
The 'chemical_c' category was composed of six attributes ('alcohol', 'rubber', 'nail polish remover', 'oil', 'sulfur', and 'vinegar'), of which only two were significant: 'rubber' and 'alcohol'. The term 'rubber' was selected by six subjects and had a Dravnieks score of 47.14 for wine A (Table 3). In the PCA plot (Figure 6), this attribute was highly correlated with the first dimension, while the term 'alcohol' was negatively correlated with both the first and the second dimensions. Thus, wine A was alone in being characterised by a 'rubber' note.
For the category 'floral_c', the Dravnieks scores for the different wines were between 24.15 and 43.09 (Table 3) based on evaluations by 54 subjects. This category, which these scores show to be used quite often, was composed of six attributes ('acacia', 'orange blossom', 'jasmine', 'lilac', 'rose', and 'violet'). Interestingly, within this category, the term 'violet' (evaluated by 18 subjects) was more closely associated with wines C, E, and E*.
From a sensory point of view, these two examples show how the decomposition of categories into their component attributes can provide additional information on wine characterisation. In classical sensory evaluation, such as DA description, wines are more likely to be described using category-level attributes. Here, however, HRATA evaluation enabled a deeper assessment of more specific characteristics of the wine, for example, 'rubber' and 'violet', even though the use of those terms differed from that of their corresponding categories.

Determination of discriminant terms, categories, and families
A two-way analysis of variance was performed on DA data to identify the attributes that discriminate among products; we considered odours and aromas. The expert panel used 29 attributes (14 odours and corresponding aromas and the attribute 'faults') to define the sensory properties of the studied rosé wines. Values of the Fisher statistics of the analysis of variance (ANOVA) and p-values are given in Table 4. The effect of the wine factor was found to be significant (p-value < 0.05) for 4 odours, 4 aromas, and the attribute 'faults'.
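The Fisher statistic for the wine effect can be sketched as follows for a single attribute. The sketch assumes that the two factors are wine and judge, that the design is balanced with one score per judge and wine, and that no interaction term is fitted; the exact model used for Table 4 is not restated here, so this is an illustration only. The function name and the scores are hypothetical.

```python
def two_way_anova_f(scores):
    """scores[j][p]: score given by judge j to product p (balanced, no replicates).

    Returns the F statistic of the product effect with its degrees of freedom,
    for the additive model score = judge effect + product effect + error.
    """
    n_judges, n_products = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n_judges * n_products)
    judge_means = [sum(row) / n_products for row in scores]
    product_means = [sum(row[p] for row in scores) / n_judges
                     for p in range(n_products)]
    ss_product = n_judges * sum((m - grand) ** 2 for m in product_means)
    ss_error = sum(
        (scores[j][p] - judge_means[j] - product_means[p] + grand) ** 2
        for j in range(n_judges) for p in range(n_products)
    )
    df_product = n_products - 1
    df_error = (n_judges - 1) * (n_products - 1)
    f_stat = (ss_product / df_product) / (ss_error / df_error)
    return f_stat, df_product, df_error

# Invented scores: 3 judges (rows) x 3 products (columns).
f_stat, df1, df2 = two_way_anova_f([[1, 2, 6], [2, 3, 5], [3, 4, 7]])
print(f_stat, df1, df2)
```

The p-value is then the upper tail of the F distribution with these degrees of freedom, which can be obtained from a statistics library such as SciPy (`scipy.stats.f.sf`).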

Multidimensional data analysis
The data obtained from the Descriptive Analysis (DA) conducted by the trained panel were submitted to a PCA. For consistency with the HRATA experiment, only the odour attributes that were discriminant at a p-value < 0.20 were used as active variables; aroma attributes were used as illustrative variables (Figure 7). Wine A was mainly characterised by an 'animal' odour. Wines B, C, and E were characterised by 'citrus', 'flowers', and 'sweets' odours. Finally, wines E and E* were characterised by 'ripe, compote, confit fruit' odours.
These results were largely consistent with the aromatic characterisation produced using the HRATA method, especially with respect to the 'red fruit' note of wines E and E* (if we also considered the aroma attribute) and the 'animal' and 'faults' notes of wine A (Figure 7).
Moreover, when we compared the configuration of the wines obtained from the PCA (first two dimensions) of the HRATA data with that of the DA results, we found an acceptable degree of similarity, with an RV coefficient of 84 %.
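For readers unfamiliar with the RV coefficient, the following sketch computes it between two product configurations (rows are wines, columns are coordinates on the retained dimensions) using the standard definition RV = tr(S_X S_Y) / sqrt(tr(S_X S_X) tr(S_Y S_Y)), where S_X = XX' and S_Y = YY' after column centring. The function name and the coordinates are invented for the example.

```python
def rv_coefficient(x, y):
    """RV coefficient between two configurations with the same rows (products)."""
    def center(m):
        means = [sum(col) / len(col) for col in zip(*m)]
        return [[v - mu for v, mu in zip(row, means)] for row in m]

    def cross_product(m):
        # Products-by-products matrix M M'
        return [[sum(a * b for a, b in zip(r1, r2)) for r2 in m] for r1 in m]

    def trace_of_product(a, b):
        # For symmetric matrices, trace(A B) is the sum of elementwise products.
        return sum(u * v for ra, rb in zip(a, b) for u, v in zip(ra, rb))

    sx, sy = cross_product(center(x)), cross_product(center(y))
    return trace_of_product(sx, sy) / (
        trace_of_product(sx, sx) * trace_of_product(sy, sy)) ** 0.5

# Invented coordinates of three products on two dimensions.
config_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
config_b = [[0.0, 2.0], [-2.0, 0.0], [2.0, -2.0]]  # config_a rotated and rescaled
print(rv_coefficient(config_a, config_b))
```

Because the coefficient depends only on the products-by-products cross-product matrices, it is unaffected by rotation or uniform rescaling of either configuration, which makes it suitable for comparing maps obtained from different methods.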
We noted differences in the use of the attributes 'dried fruit/flowers/vegetables' and 'ripe fruit/compote', which did not stand out as much in the description obtained from HRATA. Generally speaking, though, the description generated by HRATA was similar to that obtained by descriptive analysis.

DISCUSSION
Here, the HRATA approach enabled us to obtain results for wine aromatic characterisation that were comparable to the results of a DA performed by a trained panel. Similar comparisons have been performed between the RATA method and DA (Ares et al., 2018; Danner et al., 2017; Oppermann et al., 2017), and while the maps produced by the two methods are generally similar, Ares et al. (2018) reported that RATA is less discriminating than DA when describing complex products (such as wine). This was consistent with [...]. One of the main reasons for this discrepancy was likely the fact that not all aromatic notes had the same weight and meaning in the two lexicons. Some aromatic notes were not clearly represented in the HRATA lexicon compared to that used for DA: the DA attribute 'ripe/compote/confit fruits' was present in the HRATA lexicon only as the attribute 'ripe fruit' in the 'red fruits' category. Conversely, the attribute 'dried fruits/flowers/vegetables' was divided into three different categories in HRATA. These differences were due to the fact that the DA was based on the typical lexicon used by the trained panel of the 'Rosé Wine Experimentation and Research Centre' instead of on the hierarchical lexicon generated and used for the HRATA analysis.
Moreover, judges performing the DA may have focused less on the odours, as they knew that they would also evaluate the aromas. To investigate this further, it would be interesting to perform a DA by selecting only the category-level attributes of HRATA and comparing characterisations based on odour and aroma evaluations.
Regarding the HRATA approach, we chose to use subjects who possessed at least a minimal knowledge of wine but who were not specifically trained for the characterisation of rosé wine. The subjects, therefore, represented a wide range of wine expertise, from consumers to students in viticulture and oenology and professionals from the wine sector. Consumers were overrepresented in the HRATA evaluation.
We compared, on the one hand, a classical DA evaluation with a trained panel and, on the other hand, the HRATA evaluation with a mixed panel without common training, as the aim was to propose an alternative approach. We also preferred to use two independent panels rather than compare the methods with the same group. We have shown that the characterisation provided by the HRATA method was in agreement with that of a classical DA.
From the taster's point of view, this approach permits greater freedom in evaluation, as the hierarchical structure makes it possible to assess general and/or more specific attributes without difficulty, depending on the sensitivity of the individual. A new user-friendly and ergonomic interface has been created since the end of this study (Figure 8).
From the experimenter's point of view, incorporating the lexicon structure should increase the accuracy of product characterisation via the introduction of a greater number of specific attributes.
Furthermore, the proposed strategy for data analysis is easy to implement and provides results that fully take into account the hierarchical structure for the purpose of interpretation.
In designing this study, we made several choices regarding the methodology and statistical approach to be used. First of all, to aggregate information from one hierarchical level to the next, we chose to use a maximum rating aggregation rule, which seemed to us to be the most logical choice compared to summation or averaging of intensity ratings, for example. Nevertheless, other aggregation rules are possible, for example, those proposed within the framework of multi-criteria aggregation in fuzzy logic (Grabisch and Perny, 2003). Alternatively, a weighting aggregation rule could have been investigated instead of our practice of weighting all attributes equally regardless of the rating selected.
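A minimal sketch of a maximum rating aggregation rule is given below for one subject and one wine. It assumes that the hierarchy is stored as a child-to-parent mapping and that a parent receives the highest rating found among itself and its descendants. The attribute excerpt, the `_f` suffix for families, and the function name are hypothetical and are only meant to show the mechanics.

```python
# Hypothetical excerpt of a three-level lexicon: term -> category -> family.
PARENT_OF = {
    "violet": "floral_c",
    "rose": "floral_c",
    "floral_c": "floral_f",
    "rubber": "chemical_c",
    "chemical_c": "chemical_f",
}

def aggregate_max(ratings, parent_of):
    """Propagate ratings upward so each parent is at least its highest-rated child."""
    result = dict(ratings)
    for attribute, score in ratings.items():
        node = parent_of.get(attribute)
        while node is not None:  # climb to the category, then to the family
            result[node] = max(result.get(node, 0), score)
            node = parent_of.get(node)
    return result

# One subject, one wine: two terms and their category were rated directly.
ratings = {"violet": 2, "rose": 1, "floral_c": 1}
print(aggregate_max(ratings, PARENT_OF))
```

Here the category 'floral_c' is raised from 1 to 2 because 'violet' was rated 2, and the family receives 2 even though it was never rated directly. Replacing `max` with a sum or a mean would give the alternative rules mentioned above.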
This first coding step allowed information to be imputed from the lowest to the highest level in the hierarchy. However, the number of non-evaluated attributes remained very large. For this reason, our analysis distinguished between two situations: attributes that were not considered at all by the subject and attributes that were given a score of zero intensity for some products. In classical RATA data processing, this distinction is not made; an attribute that is not evaluated has a score of 0. However, we found it preferable to limit the number of cases of automatic imputation to 0 as much as possible.
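One possible reading of this distinction can be sketched as follows: for a given subject, an attribute that was never selected for any product is coded as missing, whereas an attribute selected for at least one product is coded 0 for the products where it was not selected. This rule is an assumption made for the example, as are the function name and the data; `None` stands for a non-evaluated score.

```python
def code_subject(selections, products, attributes):
    """selections: {product: {attribute: intensity}} for a single subject.

    Returns {product: {attribute: intensity, 0, or None}} where None marks an
    attribute that the subject never used for any product.
    """
    used = {a for p in products for a in selections.get(p, {})}
    coded = {}
    for p in products:
        chosen = selections.get(p, {})
        coded[p] = {a: (chosen.get(a, 0) if a in used else None)
                    for a in attributes}
    return coded

# Invented example: the subject used 'rubber' for wine A only, never 'violet'.
coded = code_subject({"A": {"rubber": 2}, "B": {}},
                     products=["A", "B"],
                     attributes=["rubber", "violet"])
print(coded)
```

In this example 'rubber' is scored 2 for wine A and 0 for wine B, while 'violet' stays missing for both wines instead of being imputed to 0.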
By using logistic regression, we could directly address whether an attribute was chosen to characterise one product more often than others. Although this approach has the disadvantage of losing quantitative information about intensity scores, it was the method best adapted for dealing with the high number of non-selected attributes. Indeed, the very high frequency of 0 in the dataset strongly skewed the distribution of the response. This further demonstrated that ANOVA would be an inappropriate statistical analysis for this type of dataset. For reasons of robustness, the probability of citation, as calculated in this study, was a more appropriate statistical treatment given the data distribution.
As explained above, only attributes at the category level were used as active variables in the PCA. This choice enabled us to carry out a standard statistical treatment that is easy to implement and simple to interpret in the context of sensory description. By using the other significant attributes as supplementary variables, we could consider the lexicon's hierarchical structure and provide clarification in terms of interpretation.
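To make the idea of testing citation probabilities concrete, the sketch below computes a likelihood-ratio statistic for one attribute. It assumes a logistic regression with product as the only categorical predictor; in that special case the fitted probabilities are the observed citation proportions, so the test against the intercept-only model reduces to the statistic below. The model actually fitted in the study may include other effects, and the function name and counts are hypothetical.

```python
from math import log

def citation_lr_statistic(cited, evaluated):
    """Likelihood-ratio statistic for equal citation probability across products.

    cited[k]: number of subjects who selected the attribute for product k.
    evaluated[k]: number of subjects with a non-missing score for product k.
    """
    pooled = sum(cited) / sum(evaluated)  # citation probability under the null

    def term(count, expected):
        # Convention: 0 * log(0) = 0
        return count * log(count / expected) if count > 0 else 0.0

    return 2.0 * sum(
        term(c, n * pooled) + term(n - c, n * (1.0 - pooled))
        for c, n in zip(cited, evaluated)
    )

# Invented counts for two products, each evaluated by 10 subjects.
print(citation_lr_statistic(cited=[8, 2], evaluated=[10, 10]))
```

Under the null hypothesis, the statistic is compared with a chi-square distribution whose degrees of freedom equal the number of products minus one. Only the citation counts enter the calculation, which is why the intensity information is lost.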
Within the hierarchical presentation of the attributes used in this study, we further incorporated an intensity rating scale, with which the subjects were asked to record the intensity of each attribute. Without this rating step, though, this method easily becomes a Hierarchical-Check-All-That-Apply (HCATA) approach. On this point, Meyners et al. (2016) pointed out that RATA may be more prone to bias than CATA with respect to response strategies (Sudman and Bradburn, 1986). RATA requires more cognitive input: subjects need to think more about the attributes when scoring the intensity of each selected attribute, and they may decide to simplify the task by selecting fewer attributes than they otherwise would. Pineau et al. (2022) investigated faster alternatives to sensory profiling (including CATA and RATA) and highlighted that RATA characterisation was closer to sensory profiling. Further investigation should be considered to better evaluate the differences between different rating scales (or no rating scale for HCATA) in terms of both the aromatic characterisation of the products and the user experience. A large number of attributes is considered in our HRATA methodology compared to RATA experiments (e.g., Copper et al., 2019; Mezei et al., 2021). Having a large number of attributes allows the wines to be characterised precisely, as in free profiling, and avoids the need to select words by consensus, as is classically done to evaluate and describe the sensory space of wines (Barbe et al., 2021).
HRATA requires more data pre-treatment before univariate and multivariate analyses than RATA, which may lengthen the data analysis; however, these steps can be easily automated. This disadvantage is offset by the ability of HRATA to provide more informative data than RATA or DA.

CONCLUSION
The HRATA approach presented in this study represents an interesting alternative for obtaining aromatic characterisation using a panel of subjects with diverse skill sets (in our case: consumers, professionals, and students), without collective training or the generation and selection of descriptors. Our objective was to propose an easy-to-implement approach as well as a simple and relevant strategy for data analysis. Compared to classical descriptive sensory analysis, HRATA provides an effective solution for describing complex products like wine. We considered a hierarchical structuring of sensory attributes with three levels (families, categories and terms). The category level, classically used in a descriptive sensory analysis with a pre-established list of descriptors, can be the baseline of the hierarchy during data analysis. Terms allow for precise characterisation, and families allow one to summarise the characterisation.
More generally, the guidelines provided in this paper are general enough to be applied to the aromatic evaluation of any complex product whose attributes can be represented hierarchically. Our data processing approach can be used for other sensory or hierarchically designed data sets. Of course, the hierarchical structure and the number of levels may vary depending on the product under consideration.
FIGURE 8. New graphical interface of Fizz software using the HRATA method with an aroma wheel (Biosystèmes®, 1990).

FIGURE 4. Number of scores per attribute (families and categories belonging to each family), with non-zero intensity values, before and after integration of the hierarchy. Family levels are underlined.
Wine A stands out particularly on the first dimension, which accounts for more than 60 % of the inertia. This wine was characterised by 'animal' and 'earthy' odour notes. The other wines are separated along the second dimension of the PCA based on notes of 'tropical fruits' and 'white and yellow fruits'; wines B, C, and D had 'white and yellow fruit' aromas, and wine D was characterised by 'tropical fruit'. Wines E and E* (corresponding to the replicated wine) were characterised by notes of 'red fruits'. The proximity of the two replicates (E and E*) with respect to the first two principal components demonstrates the consistency of the orthonasal aromatic evaluation of the subjects.

FIGURE 5. PCA plots based on Dravnieks' scores for category-level attributes. (a) Configuration of products and (b) configuration of variables (attributes in grey were found to be not discriminant, at a 5 % level of risk).

FIGURE 7. PCA plots based on DA data from trained subjects. (a) Configuration of the products and (b) configuration of the variables (in black: odour attributes evaluated by orthonasal olfaction, in grey: aroma attributes evaluated by retronasal olfaction).

TABLE 1. Frequency table of gender, age, and occupation within the panel (%, n = 66).

TABLE 3. Dravnieks' score and logistic regression results for wines based on the frequency of use for each family, category, and term.