Iris Dataset Classification in Python | Machine Learning

Dec 17, 2022 | Coding, Machine Learning

Introduction to the Project

In this coding tutorial, we will classify the Iris dataset in Python. The machine learning model predicts the species of an iris flower, assigning it to one of three categories: Setosa, Versicolor, or Virginica. The classification is based on four features, each measured in centimeters: sepal length, sepal width, petal length, and petal width. The dataset is built into Python's sklearn module, so we can load it directly from there.
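
As a quick check of what the built-in dataset contains, the following minimal sketch loads it with scikit-learn's `load_iris` and prints its shape, feature names, and class names. The variable name `iris` is only an illustrative choice.

```python
from sklearn.datasets import load_iris

# Load the Iris dataset bundled with scikit-learn.
iris = load_iris()

# 150 flowers, each described by 4 measurements in centimeters.
print(iris.data.shape)
print(iris.feature_names)

# The three species, encoded as class labels 0, 1, and 2.
print(list(iris.target_names))
```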

 

Objectives

1. To build a model that classifies an iris flower into one of three species:

  • Iris setosa
  • Iris versicolor
  • Iris virginica

2. To predict the species from four features: sepal length, sepal width, petal length, and petal width, all in centimeters.

3. To provide a model that research groups working in botany, such as plant scientists, can use for research purposes.

Requirements

The source code below uses OpenCV (cv2), scikit-learn, and Matplotlib.

Source Code

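# Import the libraries used in this tutorial.
import numpy as np
import cv2
import matplotlib.pyplot as plt
from sklearn import datasets, metrics, model_selection

# Load the Iris dataset that ships with scikit-learn.
# OpenCV's ml module expects 32-bit floats for features and labels.
iris = datasets.load_iris()
data = iris.data.astype(np.float32)
target = iris.target.astype(np.float32)

# Using Matplotlib, we create a scatter plot where the color of
# each data point corresponds to the class label.
# To make plotting easier, we limit ourselves to the first two
# features (iris.feature_names[0] being the sepal length and
# iris.feature_names[1] being the sepal width).
# We can see a nice separation of classes in the following figure:
plt.figure(figsize=(10, 6))
plt.scatter(data[:, 0], data[:, 1], c=target, cmap=plt.cm.Paired, s=100)
plt.xlabel(iris.feature_names[0])
plt.ylabel(iris.feature_names[1])
plt.show()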
# Splitting our data set into training and testing sets.
# test_size=0.1 holds out 10 percent of the samples for testing
# and keeps the remaining 90 percent for training.
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    data, target, test_size=0.1, random_state=42
)
print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)
# training the classifier
lr = cv2.ml.LogisticRegression_create()
# We then have to specify the desired training method.
# Here, we can choose cv2.ml.LogisticRegression_BATCH or
# cv2.ml.LogisticRegression_MINI_BATCH. For now, all we need to
# know is that we want to update the model after every data point,
# which can be achieved with the following code:
lr.setTrainMethod(cv2.ml.LogisticRegression_MINI_BATCH)
lr.setMiniBatchSize(1)
lr.setIterations(100)
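lr.train(X_train, cv2.ml.ROW_SAMPLE, y_train)
# Inspect the learned weights.
print(lr.get_learnt_thetas())
# Accuracy on the training set
ret, y_pred = lr.predict(X_train)
print(metrics.accuracy_score(y_train, y_pred))
# Accuracy on the held-out test set
ret, y_pred = lr.predict(X_test)
print(metrics.accuracy_score(y_test, y_pred))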
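
To see what `test_size=0.1` means in practice, the split can be reproduced on its own. This is a small sketch that assumes the full 150-sample dataset is passed to `train_test_split`, which puts 15 samples in the test set and 135 in the training set.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()

# test_size=0.1 reserves 10 percent of the samples for testing.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.1, random_state=42
)

print(X_train.shape, y_train.shape)  # (135, 4) (135,)
print(X_test.shape, y_test.shape)    # (15, 4) (15,)
```

With only 15 test samples, a single misclassified flower changes the test accuracy by almost 7 percentage points, so the score should be read with that in mind.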

Explanation of the Code

1. First, we load the dataset from Python's sklearn module.

2. The dataset contains four features: sepal length, sepal width, petal length, and petal width. Our machine learning model predicts the species of the iris flower from these features.

3. We then split the data into training and test sets and train our classifier using logistic regression, which, despite its name, is an algorithm for solving classification problems. Finally, we measure the accuracy on both sets.
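
For comparison, the same steps can be sketched with scikit-learn's own `LogisticRegression` instead of the OpenCV classifier used in the tutorial. This is a substitute technique, not the tutorial's method. The `max_iter=1000` setting and the variable names are assumptions chosen for illustration. The sketch also shows why a "regression" method can classify: the model produces one probability per species, and the prediction is the most probable one.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.1, random_state=42
)

# scikit-learn's logistic regression (not the OpenCV one from the tutorial).
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# One probability per species for the first test flower;
# the predicted class is the one with the highest probability.
probabilities = clf.predict_proba(X_test[:1])
predicted = clf.predict(X_test[:1])
print(probabilities.round(3))
print(iris.target_names[predicted[0]])

# Fraction of test flowers whose species was predicted correctly.
test_accuracy = accuracy_score(y_test, clf.predict(X_test))
print(test_accuracy)
```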

Output

The following figure plots sepal length against sepal width, with each point colored by its class label, and shows a clear separation of classes.

[Figure: scatter plot of sepal length vs. sepal width, colored by class label]
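
One rough way to check how the classes differ on these two features without plotting is to compare their per-species means. The sketch below assumes all three species are included. The name `class_means` is illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
sepal = iris.data[:, :2]  # columns 0 and 1: sepal length and sepal width

# Mean sepal length and width per species (rows follow iris.target_names).
class_means = np.array(
    [sepal[iris.target == k].mean(axis=0) for k in range(3)]
)

for name, (length, width) in zip(iris.target_names, class_means):
    print(f"{name}: sepal length {length:.2f} cm, sepal width {width:.2f} cm")
```

Means only summarize the center of each cluster, so they complement the scatter plot rather than replace it.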

Conclusion

We have built a machine learning model for Iris dataset classification in Python that predicts the species of an iris flower from its sepal and petal measurements. A project like this can support people working in the domain of botany.

 
