Why Is It Important To Tune The Parameters Of An SVM Model

protonAutoML · Jun 29, 2021

The goal of the SVM algorithm is to use a training set of labeled samples to find a separating hyperplane in the feature space, ideally the one with the widest margin between classes. In this article, we will talk about the parameters of the SVM model to be considered while handling data, chiefly the regularization strength C, the kernel, and the kernel coefficient gamma.

SVMs are supervised learning models that can be used to classify different types of data. They cope well with high-dimensional input features and large datasets, making them a preferred method for many classification problems. An SVM maps numeric attributes (features) into a, possibly higher-dimensional, space via a kernel function, and then classifies new samples using the learned discriminant function in the form of a hyperplane.
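To make this concrete, here is a minimal sketch using scikit-learn's SVC (the choice of scikit-learn is my assumption; the article only mentions Python and R). The arguments C, kernel, and gamma are exactly the tunable parameters this article is concerned with:

```python
# Minimal SVM classification sketch, assuming scikit-learn is installed.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic two-class data stands in for a real dataset.
X, y = make_classification(n_samples=200, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# C trades margin width against misclassification; gamma shapes the RBF kernel.
clf = SVC(C=1.0, kernel="rbf", gamma="scale")
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```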

It is also important to note that SVMs can be trained either to maximize the margin between classes or to minimize the number of misclassifications. In the soft-margin formulation these two goals are combined into a single objective, roughly minimizing ½‖w‖² + C Σᵢ ξᵢ, where the regularization parameter C sets the trade-off: a large C penalizes misclassifications heavily, while a small C favors a wider margin. There are scenarios where leaning toward one goal works better than the other, so it is important to understand this trade-off when implementing an SVM for our problem. Let's consider three situations:

Scenario 1: If we have a binary classification problem, the SVM is typically tuned to minimize the number of misclassifications (a larger C).

Scenario 2: If our classification task is a multi-class problem (more than two classes), accepting a smaller margin between classes may be necessary. Let's take an example and look at it in detail:

In this scenario (imagine points of three classes A, B, and C scattered in the feature space), if we try to maximize the margin between A and B (maximize γ), we won't achieve a clean split; instead, records belonging to C will fall into the same region of the output space, which severely distorts the result since C is not even being considered by that boundary. By accepting a much smaller margin (minimize γ), we are able to achieve a proper separation between our points A, B, and C. In practice, this is exactly the trade-off the C parameter controls.

Scenario 3: If our classifier has many classes (for example, four), it may not be possible to draw boundaries such that all of the classes are properly separated from each other, as in the example above. Practical multi-class SVMs therefore decompose the problem into binary subproblems (one-vs-rest or one-vs-one).
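As a hedged sketch of how this is handled in practice (again assuming scikit-learn): SVC decomposes a multi-class problem into one-vs-one binary subproblems internally, so each pairwise boundary is its own maximum-margin hyperplane; decision_function_shape only changes how the scores are reported:

```python
# Multi-class SVM via internal one-vs-one decomposition, assuming scikit-learn.
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # three classes

clf = SVC(kernel="linear", decision_function_shape="ovr")
clf.fit(X, y)
print(clf.predict(X[:5]))
```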

Another classification problem where SVMs perform extremely well is separating documents into groups. Let's consider the following example:

In this scenario, if we try to divide documents based on their characteristics into different groups, it is very difficult to separate documents belonging to different groups, since there is a lot of overlap between them. However, an SVM trained on labeled documents can learn boundaries that tell these groups apart, because the high-dimensional, sparse features typical of text (such as TF-IDF vectors) suit SVMs well.
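Here is a small, hedged sketch of that idea, assuming scikit-learn; the documents and category labels are invented for illustration:

```python
# Document classification with a linear SVM over TF-IDF features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

docs = [
    "stock markets rallied on earnings news",
    "the striker scored twice in the final",
    "central bank raises interest rates",
    "the goalkeeper saved a late penalty",
]
labels = ["finance", "sports", "finance", "sports"]

# TF-IDF yields high-dimensional sparse features, where linear SVMs shine.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(docs, labels)
print(model.predict(["quarterly profits beat expectations"]))
```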

Another common use case that SVMs support is dimensionality reduction, in the sense of feature selection: we transform high-dimensional feature vectors into low-dimensional ones, the main aim being to reduce overfitting (which may occur when working with higher numbers of dimensions) by decreasing the number of features while preserving, or even improving, generalizability. A linear SVM with L1 regularization, whose learned weights zero out uninformative features, is one concrete way to do this. An example of this kind of application:

“The numerical attributes are reduced to a binary decision by using a one-vs.-rest scheme: the decision boundary separates the majority class (here, positive sentiment) from the rest of the classes. In such binary classification problems, we can usually achieve a better result by using SVMs rather than OLS regression.”
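One concrete, hedged way to realize the dimensionality-reduction idea above, assuming scikit-learn, is to let an L1-penalized linear SVM drive most feature weights to zero and keep only the survivors:

```python
# Feature selection with an L1-penalized linear SVM, assuming scikit-learn.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=300, n_features=50,
                           n_informative=5, random_state=0)

# L1 regularization drives most coefficients to exactly zero.
svm = LinearSVC(C=0.05, penalty="l1", dual=False, max_iter=5000)
selector = SelectFromModel(svm).fit(X, y)
X_reduced = selector.transform(X)
print(X.shape, "->", X_reduced.shape)  # e.g. (300, 50) -> (300, ~5-10)
```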

We benefit from SVMs almost every day while browsing the web or checking our emails, even if most of us never realize that such algorithms sit behind these simple tasks. They play crucial roles in fields like finance and economics, image processing, medicine, and science and technology, apart from being used in more advanced solutions as well. This versatility is why the SVM is sometimes described as machine learning's Swiss Army knife.

SVMs have also been used in large-scale production systems, classic examples being email spam filtering and other text classification pipelines at major web companies.

A common caveat for the basic method is that the input data should be linearly separable. Let's understand this term: in the example above, if a margin can be placed between the points of classes A and B with no overlapping area, they can be split by a flat hyperplane; they are linearly separable. When the data is not linearly separable, the soft margin (controlled by C) tolerates some overlap, and kernel functions (such as RBF or polynomial) map the data into a space where a separating hyperplane does exist. Therefore it is very important to understand this term before implementing an SVM in Python or R, and to tune C, the kernel, and gamma accordingly.
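Putting the tuning theme together, here is a hedged sketch of searching over C, gamma, and the kernel with cross-validation, assuming scikit-learn:

```python
# Tuning SVM hyperparameters with cross-validated grid search.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, random_state=1)

param_grid = {
    "kernel": ["linear", "rbf"],
    "C": [0.1, 1, 10],
    "gamma": ["scale", 0.01, 0.1],  # ignored by the linear kernel
}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)
print("best params:", search.best_params_)
print("best CV accuracy:", round(search.best_score_, 3))
```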

Originally published at https://protonautoml.com on June 29, 2021.
