Wine Dataset for Machine Learning Classification Experiment
Description
Dataset Description
These data are the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wines.
Context and methodology
This dataset was created for research in machine learning classification and chemometrics, specifically to test and compare different supervised learning algorithms.
The dataset provides chemical analysis measurements of 178 wines from three cultivars grown in the same Italian region. It is used to evaluate how well classification algorithms (Multi-Layer Perceptron, Decision Tree, Gaussian Process Classifier) predict the class of each wine. It supports testing models on a well-posed classification problem and running reproducibility studies.
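A comparison of the three named classifiers could be sketched as follows. This is only an illustration, not the configuration used in the experiment: it assumes scikit-learn is installed, uses the copy of the wine data bundled with scikit-learn (where classes are labelled 0 to 2 rather than 1 to 3), and the 5-fold cross-validation, seed, and hyperparameters are assumptions chosen for the example.

```python
from sklearn.datasets import load_wine
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(cv=5, seed=0):
    """Return mean cross-validated accuracy for each classifier (illustrative settings)."""
    # scikit-learn's bundled wine data: 178 samples, 13 features, labels 0-2.
    X, y = load_wine(return_X_y=True)

    models = {
        # Scaling is wrapped in the pipeline so it is fitted on training folds only.
        "Multi-Layer Perceptron": make_pipeline(
            StandardScaler(), MLPClassifier(max_iter=2000, random_state=seed)
        ),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Gaussian Process Classifier": make_pipeline(
            StandardScaler(), GaussianProcessClassifier(random_state=seed)
        ),
    }
    return {
        name: float(cross_val_score(model, X, y, cv=cv).mean())
        for name, model in models.items()
    }


if __name__ == "__main__":
    for name, accuracy in compare_classifiers().items():
        print(f"{name}: mean accuracy {accuracy:.3f}")
```

Fixing the seed keeps the run repeatable, which matters for the reproducibility use case described above.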
Data were obtained by laboratory chemical analysis of wines, measuring 13 constituents such as Alcohol, Malic acid, Ash, Magnesium, Total phenols, Flavanoids, and Proline. The original dataset was donated by Riccardo Leardi (University of Genoa) and hosted in the UCI Machine Learning Repository.
Technical details
The dataset is a single file (wine.data) with 178 rows (instances) and 14 columns (13 features + 1 target class).
Column names follow a consistent naming convention corresponding to chemical constituents (e.g., Alcohol, Malic_Acid, Ash, …, Proline, class).
It can be opened programmatically using Python (pandas, numpy), R, or similar tools.
The ucimlrepo Python package can fetch the dataset programmatically.
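The dataset is ready for machine learning analysis after standard preprocessing. Feature scaling is recommended for scale-sensitive classifiers such as the Multi-Layer Perceptron and Gaussian Process Classifier, because the features span very different numeric ranges.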
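The loading and scaling steps above can be sketched with the Python standard library alone. The sketch assumes the layout described here (comma-separated rows with no header line, class label first, features after it); the function names are hypothetical, and the three-feature sample rows are made-up values used only to keep the example self-contained. With pandas, `pd.read_csv("wine.data", header=None)` would be the more common way to read the file.

```python
import csv
import io
import statistics


def parse_wine_rows(text_stream):
    """Parse headerless comma-separated rows: class label first, then features."""
    labels, features = [], []
    for row in csv.reader(text_stream):
        if not row:  # skip blank lines
            continue
        labels.append(int(row[0]))
        features.append([float(value) for value in row[1:]])
    return labels, features


def standardize(features):
    """Scale each feature column to zero mean and unit population standard deviation."""
    columns = list(zip(*features))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return [
        # A constant column has zero spread; map it to 0.0 instead of dividing by zero.
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
        for row in features
    ]


if __name__ == "__main__":
    # Made-up sample rows (not real measurements), in the assumed layout.
    sample = io.StringIO("1,14.0,2.0,1000\n2,12.0,1.0,500\n3,13.0,3.0,600\n")
    labels, features = parse_wine_rows(sample)
    print(labels)
    print(standardize(features))
    # For the real file: with open("wine.data", newline="") as f: parse_wine_rows(f)
```

When evaluating a model, the means and standard deviations should be computed on the training split only and then reused for the test split.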
Further details
The dataset contains no missing values.
The first column (class, column 0) is the target variable, with values 1–3.
All features are numeric (continuous or integer).
Recommended for testing classification algorithms and reproducibility studies.
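The properties listed above (no missing values, class values 1 to 3, numeric features, 13 features per row) can be checked after loading. The following is a minimal sketch with a hypothetical helper, `validate_rows`, that takes labels and feature rows already parsed into Python lists; the parameter names and defaults are assumptions for illustration.

```python
import math


def validate_rows(labels, features, n_features=13,
                  allowed_classes=(1, 2, 3), expected_rows=None):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if len(labels) != len(features):
        problems.append(f"{len(labels)} labels but {len(features)} feature rows")
    if expected_rows is not None and len(features) != expected_rows:
        problems.append(f"expected {expected_rows} rows, found {len(features)}")

    for i, (label, row) in enumerate(zip(labels, features)):
        if label not in allowed_classes:
            problems.append(f"row {i}: unexpected class {label!r}")
        if len(row) != n_features:
            problems.append(f"row {i}: {len(row)} features, expected {n_features}")
        # Treat None, strings, and NaN as missing or non-numeric.
        if any(not isinstance(v, (int, float)) or math.isnan(v) for v in row):
            problems.append(f"row {i}: missing or non-numeric value")
    return problems


if __name__ == "__main__":
    # Placeholder values only; pass expected_rows=178 when checking the full file.
    print(validate_rows([1, 3], [[1.0] * 13, [2.5] * 13]))
```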
Licensed under CC-BY 4.0, so reuse and redistribution are allowed with proper attribution.
Additional details
Identifiers
- DOI: 10.24432/C5PC7J
Related works
- Is derived from: Dataset, 10.24432/C5PC7J (DOI)
Dates
- Submitted: 2025-11-30 (date of deposit to test repository)
References
- Aeberhard, S., & Forina, M. (1992). Wine [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5PC7J