Discrimination and authentication of geographical origin of Turkish Taşköprü garlic by investigating volatile organosulfur compound profiles and multivariate analyses

Nowadays, counterfeiting and adulteration of foods take place around the world in a variety of ways. Identifying and authenticating the geographical origin of agricultural products is of great importance not only for food safety but also for the protection of registrations. This study aimed to discriminate Turkish Taşköprü garlic, which has a protected geographical indication (GI) in Turkey and a GI registration from the European Union, from other samples. For this purpose, headspace-gas chromatography-mass spectrometry (HS-GC-MS) analysis of the volatile organosulfur compounds (VOSCs) was combined with two multivariate analysis techniques, namely hierarchical cluster analysis (HCA) and principal component analysis (PCA), to classify the garlic samples on the basis of their geographical origin. Taşköprü garlic was discriminated from the other samples, including a suspicious sample imported from China, by applying two-dimensional and three-dimensional PCA to the relative amounts of VOSCs and also to the raw chromatogram data.

opportunity in Turkey is Taşköprü garlic. Another city that stands out in garlic production is Gaziantep, with its well-known "Araban garlic", which is also registered with a geographical indication by the Turkish Patent and Trademark Office. However, Araban garlic can be sold under the name Taşköprü in the open market. Some consumers prefer garlic imported from China because of its lower price. However, some vendors, taking advantage of high market prices, sell Chinese garlic at the same price as Taşköprü garlic. Regardless of its price, Taşköprü garlic is generally preferred in Turkey owing to its taste and strong aroma. It is also seen in the Turkish market that some garlic labeled as domestic production is actually imported from abroad, especially from China. Therefore, the regional and national origin of garlic samples must be authenticated, because consumers are concerned about the originality of garlic. In this way, the quality of the product can be sustained and economic fraud can be prevented by protecting the GI registration.
In the literature, very recent studies on the discrimination of Allium species and garlic samples originating from different geographical areas across the globe have examined their elemental profiles, physicochemical properties, isotopic profiles, metabolomics, and volatile profiles by means of various analytical techniques and statistical methods. Fourteen selected studies from the literature are summarized in Table 1, together with the target analyses, instruments, and statistical methods employed.

Table 1 (excerpt). Selected studies with target analyses, instruments, and statistical methods.
No. | Target analysis | Instrument | Statistical method | Reference
1 | … | … | … | [12]
2 | Multielemental analysis, volatile compound analysis, metabolomics analysis | ICP-MS, HS-SPME-GC-MS, UHPLC-Q-TOF/MS | PCA, PLS-DA | [13]
3 | Organosulfur volatile profiling | HS-SPME-GC-MS | PCA, PLS-DA | [14]
4 | Spectral data analysis | HRMAS-NMR | PCA | [15]

Two of the studies presented in Table 1 discriminated garlic samples from different regions by analyzing volatile compounds in combination with chemometric methods [13,14]. Mi et al. classified garlic samples collected from the four major production regions of China by applying PCA and PLS-DA to 55 elements, 68 volatiles (15 alkanes, 9 aldehydes, 3 alcohols, 3 acids, 3 ketones, 2 esters, and 33 other volatile compounds), and 854 metabolites, which had previously been quantified by ICP-MS, HS-SPME-GC-MS, and UHPLC-Q-TOF/MS, respectively. According to the statistical results, 10 chemical elements, 6 volatiles, and 225 metabolites were suggested as candidate markers for discriminating the geographical origins of the garlic samples [13]. In another study, Biancolillo et al. combined HS-SPME/GC-MS analysis of volatile organosulfur compounds (VOSCs) with PLS-DA to classify red garlic (Allium sativum L.) samples grown in four different regions of Italy. The relative (%) peak areas of the 13 identified VOSCs were used in the chemometric analysis [14].
On the other hand, NMR and FTIR spectral data have also been statistically analyzed with various chemometric tools to discriminate the geographical origins of garlic samples. Jo et al. succeeded in classifying garlic and onion samples obtained in Korea and China by applying PCA to NMR spectral data [15]. Ritota et al. successfully discriminated red and white garlic samples, collected from four different areas in Italy, by analyzing NMR data with PLS-DA [16]. Biancolillo et al. presented a nondestructive approach based on ATR-FTIR spectroscopy combined with chemometrics (PLS-DA, SO-CovSel-LDA, and SO-PLS-LDA) to distinguish garlic samples cultivated in four different regions of Italy [17].
In this study, HCA and PCA were used to statistically elucidate the similarities and dissimilarities among the VOSC profiles and the raw chromatogram data, which consisted of thousands of values, obtained from the HS-GC-MS analyses. The relative contents (%) of the 17 identified VOSCs were analyzed by HCA and PCA, while the raw chromatogram data were processed with a PCA application designed for spectral data analysis. The novelty of this article is that a PCA method intended for spectroscopy was successfully adapted to the chromatograms of garlic samples cultivated in different regions. Besides, the proposed approach did not require the examination of other substances, such as other volatiles extracted by SPME, multielemental profiles, or isotopic ratios, to discriminate the target geographical origins. Moreover, because no additional extraction step was required, the method offered the following advantages: I) it reduced sample preparation time and costs, and II) toxic reagents and solvents were neither used during sample preparation nor released to the environment.
Consequently, for the first time in the literature, this chemometric approach discriminated not only Taşköprü garlic from garlic samples collected from different provinces of Turkey, but also a suspicious sample labeled as domestic production and unknown samples, which is important for traceability.

Instrumentation
Separation and determination of VOSCs in the garlic samples were carried out using a PerkinElmer (PE) Clarus 500 GC-MS equipped with a Turbomatrix HS40 headspace autosampler. A free fatty acid phase (FFAP) GC column (PE) with dimensions of 30 m length, 0.25 mm i.d., and 0.5 μm film thickness (df) was used for the chromatographic separation of VOSCs. Helium was used as the carrier gas at a constant flow rate of 1 mL/min throughout the HS-GC-MS system, with an HS pressure of 30 psi. All HS-GC-MS conditions are summarized in Table 2. TurboMass software was used to control all parameters of the GC-MS system and to acquire the data. Compounds were identified by mass spectral matching against the NIST library.

Samples
The garlic samples from Kahramanmaraş Afşin Koçovası (KMR) and Gaziantep Araban (GTP), both of which also have GI registrations, and from Hatay (HTY) were obtained from different markets in İstanbul, while six different Kastamonu Taşköprü (KT) garlic samples were supplied from the Taşköprü district of Kastamonu Province. In addition, two samples with no labels and one sample labeled as domestic production, which the seller admitted was Chinese garlic, were purchased from local markets and coded as suspected samples (SS).

Sample preparation
For each sample, randomly selected garlic cloves were peeled, and 10-g portions were then homogenized by grinding. After 1 g of the homogenized sample was weighed directly into a 20-mL HS vial, the vial was tightly closed with a gas-tight Teflon/silicone cap before being loaded into the HS autosampler.

Data analysis
The statistical analyses were conducted with the PCA and HCA applications installed in OriginPro software (OriginLab, version 9.6.5.169) after downloading the applications from the official website of the company [26][27][28]. PCA was chosen because it is one of the well-established techniques for reducing the dimensionality of, and interpreting, large multivariate data sets with underlying linear structures. PCA is a multivariate analysis technique whose purpose is to extract the fundamental or important information from the input data into a set of new orthogonal variables called principal components [29]. PC1 captures the largest share of the variability in the data, PC2 captures the next largest share, and the remaining components follow in this order. PCA therefore allows previously unsuspected relationships among the samples to be discovered.
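The PCA step described above can be illustrated with a minimal sketch. It is not the OriginPro implementation: it assumes mean-centered (unscaled) variables, a sample covariance matrix, and power iteration with deflation to obtain the leading components. The function names and the four-sample, three-variable data set are hypothetical and serve only to show how scores, loadings, and explained-variance ratios are obtained.

```python
import math
import random


def center_columns(rows):
    """Subtract each column mean so every variable has zero mean."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance_matrix(centered):
    """Sample covariance matrix (variables x variables) of centered data."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]


def leading_eigenpair(matrix, iterations=1000, seed=0):
    """Power iteration: largest eigenvalue and its unit eigenvector."""
    rng = random.Random(seed)
    vec = [rng.uniform(-1.0, 1.0) for _ in matrix]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        nxt = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    # Rayleigh quotient of the unit vector gives the eigenvalue estimate.
    value = sum(v * sum(m * w for m, w in zip(row, vec))
                for v, row in zip(vec, matrix))
    return value, vec


def pca(rows, n_components=2):
    """Return sample scores, loading vectors, and explained-variance ratios."""
    centered = center_columns(rows)
    cov = covariance_matrix(centered)
    size = len(cov)
    total_variance = sum(cov[i][i] for i in range(size))
    components, ratios = [], []
    for _ in range(n_components):
        value, vec = leading_eigenpair(cov)
        components.append(vec)
        ratios.append(value / total_variance)
        # Deflation: remove the component just found from the matrix.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(size)]
               for i in range(size)]
    scores = [[sum(x * c for x, c in zip(row, comp)) for comp in components]
              for row in centered]
    return scores, components, ratios


# Hypothetical relative contents (%) of three compounds in four samples.
toy_profiles = [
    [30.0, 25.0, 10.0],
    [32.0, 22.0, 11.0],
    [60.0, 2.0, 1.0],
    [28.0, 40.0, 5.0],
]
scores, components, ratios = pca(toy_profiles, n_components=2)
for label, ratio in zip(("PC1", "PC2"), ratios):
    print(f"{label} explains {100 * ratio:.1f}% of the total variance")
```

Each row of `scores` gives the coordinates of one sample in a 2D score plot, and each vector in `components` gives the loadings of the variables on the corresponding principal component.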
In this study, PCA was applied in two ways. The relative contents of the VOSCs were processed with the PCA application, while the raw chromatogram data were processed with an easy-to-use application of OriginPro software, namely PCA for spectroscopy. Even though the latter application was developed for spectra (IR, fluorescence, UV-Vis, Raman, etc.), it also discriminated the samples successfully when applied to the chromatograms. In addition, the heat map with dendrogram (HMD) application was used to perform HCA along the columns and rows of the relative contents (%), corresponding to the VOSCs and the samples, respectively, and to plot the two-way HMD.
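The clustering behind a dendrogram can be sketched as follows. The distance measure and linkage rule used by the HMD application are not stated here, so this sketch assumes Euclidean distance and average linkage; the function names, sample labels, and profile values are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hierarchical_clustering(profiles, labels):
    """Agglomerative clustering with average linkage.

    Returns the merge history as (members_a, members_b, distance) tuples,
    which is the information a dendrogram displays.
    """
    clusters = [[i] for i in range(len(profiles))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                dists = [euclidean(profiles[i], profiles[j])
                         for i in clusters[a] for j in clusters[b]]
                avg = sum(dists) / len(dists)
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        avg, a, b = best
        merges.append(([labels[i] for i in clusters[a]],
                       [labels[i] for i in clusters[b]],
                       avg))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


# Hypothetical relative contents (%) for four samples; not measured values.
labels = ["A1", "A2", "B1", "outlier"]
profiles = [
    [30.0, 25.0, 10.0],
    [31.0, 24.0, 10.0],
    [35.0, 30.0, 5.0],
    [63.0, 0.0, 1.0],
]
merges = hierarchical_clustering(profiles, labels)
for left, right, distance in merges:
    print(f"merge {left} + {right} at distance {distance:.2f}")
```

In this toy example the two most similar profiles merge first and the dissimilar profile joins last, which is the pattern a dendrogram shows when one sample forms a cluster of its own.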

HS-GC-MS analysis
As can be seen in Table 3, the chromatographic profiling of the garlic samples was carried out by investigating 17 VOSCs, which were found to be the major compounds under the optimized HS-GC-MS conditions. The chromatograms of the garlic samples are shown in Figure 1. Although all analyses were performed under the same chromatographic conditions, the chromatogram of the SS1 sample differed noticeably, showing fewer and smaller peaks. Other compounds such as terpenes, aldehydes, alcohols, and other minor compounds were not detected by static HS-GC-MS. The relative contents (%) and molecular structures of the VOSCs examined in this study are presented in Table 3. Among the VOSCs, the three most abundant compounds in the Turkish garlic samples were diallyl disulfide, di-2-propenyl trisulfide, and methyl 2-propenyl trisulfide. The relative amount of diallyl disulfide in SS1 was 63.44%, approximately twice the average value of the other samples, whereas di-2-propenyl trisulfide and methyl 2-propenyl trisulfide amounted to 0% and 0.5%, respectively. The highest content of di-2-propenyl trisulfide, 42.10%, was observed in the GTP sample. Naturally, it is almost impossible to reach a conclusion by manually examining and filtering such numerous data. In addition, if an official expert report is to be prepared, results based on objective values should be presented. Therefore, the present study aimed to make use of statistical analyses.
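How relative contents (%) can be derived from integrated peak areas is sketched below. The sketch assumes simple area normalization over the identified peaks, which is a common convention but is not spelled out in the text; the function name and the peak areas are hypothetical.

```python
def relative_contents(peak_areas):
    """Convert integrated peak areas to relative contents (%).

    Assumes area normalization: each area is divided by the sum of the
    areas of all identified peaks in the same chromatogram.
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Hypothetical peak areas (arbitrary units) for three compounds.
toy_areas = {
    "diallyl disulfide": 600.0,
    "di-2-propenyl trisulfide": 300.0,
    "methyl 2-propenyl trisulfide": 100.0,
}
toy_percent = relative_contents(toy_areas)
for name, percent in toy_percent.items():
    print(f"{name}: {percent:.2f}%")
```

Because the values in each profile sum to 100%, an increase in one compound necessarily lowers the share of the others, which is worth keeping in mind when reading loadings of opposite sign.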

Statistical analysis

HCA and PCA for relative contents of the samples
To statistically determine the differences among the 204 data points (17 × 12 matrix) given in Table 3, HCA and PCA were performed in the present study. Both are unsupervised methods and, unlike supervised methods, do not require any labeled or reference data. Compared to Table 3, clustering by HCA made the differences among the garlic samples much easier to see. In the two-way HMD presented in Figure 2, the horizontal and vertical clusters belong to the garlic samples and the VOSCs, respectively. It is clear from the clustergram that the first cluster consisted solely of SS1, while the second cluster consisted of subclusters of the Turkish garlic samples from different regions. Since SS2 and SS3, whose origins were unknown, were found between GTP and KMR, it was concluded that these suspected samples had been grown in Turkey. On the other hand, the third subcluster comprised only the six KT samples.

Figure 3 shows the resulting score plot and the corresponding loadings of the first two PCs, which together explained 96.8% of the total variance. The loading plots were used to identify the compounds responsible for the clustering in PCA. Figures 2 and 3 illustrate, through the first cluster of the HMD and the vectors in the loading plot of the PCA of the relative contents (%), that the dominant compounds distinguishing the SS1, GTP, and KT samples from the other samples were diallyl disulfide, di-2-propenyl trisulfide, and methyl 2-propenyl trisulfide, respectively. Also, methyl 2-propenyl disulfide was the compound that differentiated SS1 from the closest sample, GTP. All samples contained diallyl disulfide in different percentages, but only SS1 lacked di-2-propenyl trisulfide and methyl 2-propenyl trisulfide, which was another decisive difference. The PCA of the relative contents showed that the KT, GTP, HTY, and KMR samples were distributed in the top-left, bottom-left, top-right, and bottom-right quadrants, respectively.
As in the third subcluster of the HCA, the KT samples were grouped in a separate quadrant when PCA was applied to the relative contents. Moreover, KMR was the sample closest to SS2 and SS3. On the other hand, according to Figures 2 and 3, KT was distinctly separated from the other samples grown in different regions of Turkey by the following compounds: diallyl disulfide for HTY, di-2-propenyl trisulfide (the major one) and 1-allyl-2-(prop-1-en-1-yl)disulfane for GTP, and 1-allyl-2-(prop-1-en-1-yl)disulfane for KMR.

PCA for chromatograms
The raw chromatogram data exported from the TurboMass software on the HS-GC-MS computer were imported such that the chromatogram signals were assigned as Y data and time as X data. The resulting data set of one X column and 12 Y columns, comprising 57,200 data points (13 × 4400), was processed by PCA in a few seconds.
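The layout of this data set can be sketched as follows, assuming that all chromatograms share the same time axis and the same number of points. The function name, sample names, time step, and zero-valued signals are hypothetical placeholders; only the dimensions (one time column, 12 signal columns, 4400 rows) follow the text.

```python
def build_matrix(times, chromatograms):
    """Arrange chromatograms as rows of [time, signal_1, ..., signal_n].

    Every chromatogram must have one intensity value per time point.
    """
    for name, signal in chromatograms.items():
        if len(signal) != len(times):
            raise ValueError(f"{name} has {len(signal)} points, "
                             f"expected {len(times)}")
    names = list(chromatograms)
    rows = [[t] + [chromatograms[name][i] for name in names]
            for i, t in enumerate(times)]
    return names, rows


# Placeholder data: 4400 time points and 12 samples with zero signal.
n_points = 4400
times = [i * 0.01 for i in range(n_points)]  # hypothetical time step (min)
chromatograms = {f"sample_{k}": [0.0] * n_points for k in range(1, 13)}

names, rows = build_matrix(times, chromatograms)
n_values = len(rows) * len(rows[0])
print(f"{len(rows[0])} columns x {len(rows)} rows = {n_values} values")
```

In a PCA of such a matrix, each chromatogram is treated as one observation and each time point as one variable, so the time column is used only to align the signals.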
According to the score plot of the PCA of the chromatograms given in Figure 4, among the Turkish garlic samples, KT mostly extended into the bottom-right quadrant, unlike in the PCA of the relative contents of the VOSCs (see Figure 3). The KT samples did not lie in the same quadrants as GTP, HTY, and KMR. Moreover, another remarkable point of the PCA results is that the garlics grown in Hatay and Gaziantep, two neighboring cities, were clustered in the upper-right quadrant. On the other hand, the unknown samples were again clustered in the same region as the sample grown in Kahramanmaraş, in agreement with the aforementioned results.
In the plot of PC1 versus PC2, which explained 54.8% and 29.2% of the total variance, respectively, five of the six KT samples lay in the bottom-right quadrant, whereas the KT2 sample lay in the bottom-left quadrant. Therefore, a 3D PCA was constructed, which reached a cumulative explained variance of 93.4%. According to the 3D PCA of the raw chromatographic data given in Figure 5, it was concluded that the KT samples were completely distinguished from the other Turkish garlic samples. Furthermore, in the 2D PCA shown in Figure 6, where PC1 and PC2 accounted for 48.7% and 24.6% of the total variance, SS1 fell within the 95% confidence region of the Turkish samples; that is, the SS1 sample could not be separated in this way. However, the cumulative explained variance reached 92.3% in the 3D PCA (see Figure 7), which also used the third principal component, explaining 19.0% of the total variance. Thus, SS1 garlic was discriminated from the Turkish samples.
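The arithmetic behind these cumulative percentages can be checked with a short sketch. It assumes that each 3D cumulative value is the sum of the first three explained-variance percentages; under that assumption, the share of PC3 in the first model, which is not reported directly, is implied by the other figures. The function and variable names are hypothetical.

```python
from itertools import accumulate


def cumulative_percent(explained):
    """Running total of explained-variance percentages, rounded to 0.1."""
    return [round(value, 1) for value in accumulate(explained)]


# Model with Turkish samples only: PC1 and PC2 as reported in the text.
first_model = cumulative_percent([54.8, 29.2])
# Implied PC3 share, assuming 93.4% is the sum of PC1 to PC3.
implied_pc3 = round(93.4 - first_model[-1], 1)

# Model including SS1: PC1, PC2, and PC3 as reported in the text.
second_model = cumulative_percent([48.7, 24.6, 19.0])

print("first model, 2D cumulative:", first_model[-1])
print("first model, implied PC3:", implied_pc3)
print("second model, cumulative:", second_model)
```

In the second model the third component carries a larger share of the variance than the implied share in the first model, which is consistent with the observation that SS1 only separates once PC3 is included.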

Discussion
The overall results revealed that VOSC profiling in combination with multivariate analyses is a suitable approach for authenticating the geographical origin of garlic. Considering all the results, it can be concluded that the SS1 sample is definitely an imported product, as the seller admitted, whereas SS2 and SS3 were Turkish garlic samples, and it can even be said that they had most probably been produced in Kahramanmaraş. It is worth noting that the separation of garlic samples by geographical origin, especially for the world-famous "Taşköprü garlic", was achieved by applying PCA and HCA to the HS-GC-MS results for the VOSCs. This methodology, thus, does not require any additional examination of VOCs by means of further sample preparation procedures such as SPME. In the literature, Mi et al. and Biancolillo et al. classified garlic samples by PLS-DA after exploring the data with PCA [13,14]. Both studies used HS-SPME-GC-MS systems to analyze the VOCs. Mi et al. reported that, among the 68 VOCs, 1,2-dimethoxybenzene, 1-(2-methyl-1-cyclopenten-1-yl)-ethanone, mequinol, 2-methoxyphenol, 3,4-dimethylthiophene, and 1-allyl-2-(prop-1-en-1-yl)disulfane were the VOCs responsible for separating the four groups of garlic cultivated in different regions of China [13]. In addition, according to the work of Biancolillo et al., although the relative (%) contents of some VOSCs were characteristic of several categories, diallyl disulfide and methyl 2-propenyl disulfide (allyl methyl disulfide) were reported as the major compounds among the 13 VOSCs, and an inverse relationship was found between the contents of these two compounds in the garlic classes [14].
Furthermore, the relative (%) content of methyl 2-propenyl trisulfide (allyl methyl trisulfide) was the cornerstone for discriminating KT from the other garlics in this article, whereas this compound had been found in the garlics from three different regions in Italy [14]. As in the other studies [13,14], diallyl disulfide was determined to be the major compound, and its content contributed to discriminating the HTY samples from the other samples in Turkey, while its content was found to be significant in two of the four regions in Italy [14]. In addition, di-2-propenyl trisulfide and 1-allyl-2-(prop-1-en-1-yl)disulfane were the determinative compounds, especially for the GTP sample, whereas the first compound was found in all four regions in Italy [14]. According to Table 3, di-2-propenyl trisulfide was not detected in SS1. Similarly, Mi et al. did not report any result for di-2-propenyl trisulfide in Chinese garlics [13].
On the other hand, it can also be said that chemometric analysis of spectral data obtained with the noninvasive ATR-FTIR method is more advantageous than sophisticated instrumental techniques in terms of ease of sample preparation, low operating cost, and fast operation [17]. In this respect, the proposed approach may seem disadvantageous, but gas chromatographic systems allow each volatile compound responsible for the discrimination to be identified individually. Moreover, even though HS-GC-MS is an invasive method, this is negligible for garlic samples, which are usually easily accessible, unless only a very small amount of sample is sent for examination to forensic or food control laboratories.
As a suggestion, in order to protect the rights of GI registrants, different techniques should be developed for different types of samples, in addition to the analytical method and statistical approaches used in this study. Determining geographical origin by evaluating a single result obtained from routine analyses is very difficult. With databases created by the relevant state authorities, it may be possible to prepare reports based on objective results, using similar approaches and even artificial intelligence, for a single sample examined in laboratories (forensic, food control, institutes, etc.) that act as experts. There is also no doubt that the approach proposed in this study can be used to discriminate garlic, as well as other species of different genera of the same family, from each other. In addition, from the perspective of a forensic chemist, if the sample is a forensic finding, determining its origin may also enable it to be presented as important evidence in a case.

Conclusion
In the present study, the combination of HS-GC-MS and chemometrics was investigated for discriminating not only Taşköprü garlic from other garlic samples cultivated in different cities in Turkey (Kahramanmaraş, Gaziantep, and Hatay), but also garlic samples from Turkey and China. To the best of the author's knowledge, this is the first study to investigate the discrimination of the geographical origin of the well-known Taşköprü garlic by applying HCA and PCA to the relative contents of the VOSCs and to raw chromatogram data. The noteworthy advantages of the methodologies described in this article are as follows: a simple and fast sample preparation procedure; a cheap and green examination, since no pretreatments such as liquid-liquid or liquid-solid extractions are applied; rapid statistical analysis of either the relative contents of VOSCs or the raw chromatogram data directly, which consist of thousands of values; and the ability to predict geographical origin. Consequently, when this study is considered holistically, the proposed methods have potential for traceability and can provide reliable supporting results for evaluating the quality of garlic samples and detecting fraud.

Conflict of interest
The author declares that he has no conflicts of interest.