Introduction of the Project
In this coding tutorial, we will classify the Iris dataset in Python. The machine learning model predicts the species of an iris flower, assigning it to one of three categories: Setosa, Versicolor, or Virginica. The classification is based on four features: sepal length, sepal width, petal length, and petal width, all measured in centimeters. The dataset is built into the scikit-learn library (the sklearn module), so we can load it directly from there.
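As a quick illustration of what "built into sklearn" means, the following minimal sketch loads the dataset and prints its basic properties. It assumes scikit-learn is installed; the variable names iris, data, and target are chosen here for illustration.

```python
from sklearn import datasets

# Load the Iris dataset bundled with scikit-learn (no download needed).
iris = datasets.load_iris()

data = iris.data      # one row per flower, one column per feature
target = iris.target  # integer class label for each flower

print(iris.feature_names)        # names of the four measured features
print(iris.target_names)         # names of the three species
print(data.shape, target.shape)  # number of samples and features
```

Inspecting the feature and target names first makes it easier to read the plots and accuracy scores later in the tutorial.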
1. The objective of this model is to classify iris flowers into three species.
2. The model predicts the species from four features: sepal length, sepal width, petal length, and petal width, all in centimeters.
3. The model can be useful to research departments working in the field of botany. Plant scientists can also use it for research purposes.
To follow along, you will need:
- Python installed with the necessary libraries: scikit-learn, OpenCV (the cv2 module), and Matplotlib
- Jupyter Notebook or Visual Studio Code
import numpy as np
import cv2
import matplotlib.pyplot as plt
from sklearn import datasets, metrics, model_selection

# Load the Iris dataset that ships with scikit-learn.
# The original snippet does not show how data and target are defined;
# we assume they are loaded like this. OpenCV's logistic regression
# expects 32-bit floats for both the features and the labels.
iris = datasets.load_iris()
data = iris.data.astype(np.float32)
target = iris.target.astype(np.float32)

# Using Matplotlib, we create a scatter plot where the color of
# each data point corresponds to the class label.
# To make plotting easier, we limit ourselves to the first two
# features (iris.feature_names[0] being the sepal length and
# iris.feature_names[1] being the sepal width).
# The resulting figure shows how the classes separate:
plt.figure(figsize=(10, 6))
plt.scatter(data[:, 0], data[:, 1], c=target, cmap=plt.cm.Paired, s=100)
plt.xlabel(iris.feature_names[0])
plt.ylabel(iris.feature_names[1])

# Splitting our data set into training and testing sets,
# with test_size=0.1 (10 percent for testing
# and the rest for training)
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    data, target, test_size=0.1, random_state=42
)
X_train.shape, y_train.shape
X_test.shape, y_test.shape

# training the classifier
lr = cv2.ml.LogisticRegression_create()

# We then have to specify the desired training method.
# Here, we can choose cv2.ml.LogisticRegression_BATCH or
# cv2.ml.LogisticRegression_MINI_BATCH. For now, all we need to
# know is that we want to update the model after every data point,
# which can be achieved with the following code:
lr.setTrainMethod(cv2.ml.LogisticRegression_MINI_BATCH)
lr.setMiniBatchSize(1)
lr.setIterations(100)
lr.train(X_train, cv2.ml.ROW_SAMPLE, y_train)
lr.get_learnt_thetas()

# accuracy on the training set
ret, y_pred = lr.predict(X_train)
metrics.accuracy_score(y_train, y_pred)

# accuracy on the test set
ret, y_pred = lr.predict(X_test)
metrics.accuracy_score(y_test, y_pred)
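The accuracy scores computed at the end of the code are simply the fraction of predictions that match the true labels. The sketch below illustrates that idea in plain Python. The function name accuracy and the sample labels are made up for illustration; this is not scikit-learn's own implementation.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that equal the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels: 0 = Setosa, 1 = Versicolor, 2 = Virginica
y_true = [0, 1, 2, 2]
y_pred = [0, 1, 1, 2]

score = accuracy(y_true, y_pred)
print(score)  # 3 of 4 predictions are correct, so this prints 0.75
```

Comparing this score on the training set and on the held-out test set gives a rough idea of how well the model generalizes to flowers it has not seen.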
Explanation of the Code
1. First, we imported the dataset from the sklearn module of Python.
2. The dataset contains four features: sepal length, sepal width, petal length, and petal width. On the basis of these features, our machine learning model predicts the species of the iris flower.
3. Then we trained our classifier using logistic regression, which, despite its name, is an algorithm for solving classification problems. Here we use the implementation provided by OpenCV (cv2.ml).
The scatter plot produced by the code plots sepal length against sepal width and colors each point by species, which makes the separation between the classes easy to see.
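To give a feel for what "updating the model after every data point" means in the logistic regression step above, here is a minimal sketch in plain Python. It is not OpenCV's implementation: it handles only two classes and one feature, and the data, function names, learning rate, and number of epochs are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Map a real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.1, epochs=50):
    """Fit a weight w and bias b, updating them after every single sample."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)  # predicted probability of class 1
            error = y - p           # how far the prediction is from the label
            w += learning_rate * error * x
            b += learning_rate * error
    return w, b


def predict(xs, w, b):
    """Assign class 1 when the predicted probability is at least 0.5."""
    return [1 if sigmoid(w * x + b) >= 0.5 else 0 for x in xs]


# Made-up one-feature data: negative values are class 0, positive are class 1.
xs = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(xs, ys)
print(predict(xs, w, b))
```

Updating after each sample corresponds to a mini-batch size of 1, the setting used in the tutorial code. Larger batches average the error over several samples before each update.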
We have now built a machine learning model for Iris dataset classification in Python that predicts the species of an iris flower from the given features. A project like this can be a helpful starting point for people working in the field of botany.
Cisco Ramon is an American software engineer with experience in several popular and commercially successful programming languages and development tools. He has been writing content for the last 5 years. He is a Senior Manager at Rude Labs Pvt. Ltd.