The Iris dataset is one of the best-known datasets in machine learning and statistics and is widely used for classification tasks. The British statistician and biologist Sir Ronald A. Fisher introduced it in his 1936 paper, "The Use of Multiple Measurements in Taxonomic Problems." It contains 150 samples from three species of iris: Iris setosa, Iris versicolor, and Iris virginica. Each species contributes 50 samples, and each sample records four features: sepal length, sepal width, petal length, and petal width, all in centimeters.
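The layout described above can be written down as a small record type plus a sanity check. This is a minimal sketch using only the Python standard library. The names `IrisSample`, `SPECIES`, `FEATURES`, `SAMPLES_PER_SPECIES`, and `has_expected_layout` are hypothetical, chosen here for illustration, and no actual measurements are included.

```python
from collections import Counter
from dataclasses import dataclass

# Counts and names taken from the description above; the identifiers
# themselves are illustrative assumptions, not part of any library.
SPECIES = ("setosa", "versicolor", "virginica")
SAMPLES_PER_SPECIES = 50
FEATURES = ("sepal_length", "sepal_width", "petal_length", "petal_width")


@dataclass(frozen=True)
class IrisSample:
    """One flower: four measurements in centimeters plus a species label."""
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float
    species: str


def has_expected_layout(samples):
    """Return True if there are exactly 50 samples for each of the three species."""
    counts = Counter(sample.species for sample in samples)
    expected = {name: SAMPLES_PER_SPECIES for name in SPECIES}
    return dict(counts) == expected
```

A copy of the data loaded from any source (for example, scikit-learn's `load_iris`) could be converted into `IrisSample` records and passed to `has_expected_layout` to confirm that it has the 3 × 50 = 150 structure.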
Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7: 179-188. https://doi.org/10.1111/j.1469-1809.1936.tb02137.x