Support Vector Machines (SVM): Basic Concepts and Algorithm
The Support Vector Machine is one of the most powerful, though mathematically complex, supervised learning algorithms, used for both regression and classification. It is based on the concept of decision planes (most commonly called hyperplanes) that define decision boundaries for classification. A decision plane is one that separates a set of data points having different class memberships.
SVM performs classification by finding the optimal hyperplane that maximizes the margin between the two classes, with the help of support vectors.
Hyperplanes and Support Vectors
Hyperplanes are the decision boundaries that divide the data points into different classes. Data points falling on either side of the hyperplane can be attributed to different classes. The dimension of the hyperplane depends upon the number of features or attributes. If the number of input features is 2, the hyperplane is simply a line, which means the data can be separated linearly. If the number of input features is 3, the hyperplane becomes a two-dimensional plane. As the number of input features increases beyond 3, the hyperplane becomes hard to visualize.
Support vectors and Hyperplanes
Support vectors are the data points that lie closest to the hyperplane. They affect the position and orientation of the hyperplane. With the help of these support vectors, we maximize the margin of the classifier. Any change in the support vectors will change the position of the hyperplane. These vital points are what help us build our SVM classifier.
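As a minimal sketch of this idea (assuming scikit-learn and NumPy are available; the toy points below are made up purely for illustration), we can fit a linear SVM and see exactly which training points end up as support vectors:

```python
# Minimal sketch: fit a linear SVM on toy 2-D data and inspect its support vectors.
# Assumes scikit-learn and NumPy are installed; the data points are illustrative only.
import numpy as np
from sklearn.svm import SVC

# Two small, linearly separable clusters (class 0 on the left, class 1 on the right)
X = np.array([[1.0, 2.0], [2.0, 1.5], [1.5, 1.0],
              [5.0, 5.5], [6.0, 5.0], [5.5, 6.5]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# The points closest to the separating hyperplane are the support vectors
print("Support vectors:\n", clf.support_vectors_)
print("Indices of support vectors:", clf.support_)
print("Prediction for a new point:", clf.predict([[3.0, 3.0]]))
```

Only the few points nearest the boundary show up in `support_vectors_`; moving any of them would shift the hyperplane, while moving the other points (within reason) would not.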
Classification using SVM / Finding the right hyperplane
When data are linearly separable:
For linearly separable data, classification is done by finding an optimal hyperplane between the classes. The distance between the hyperplane and the nearest data point (support vector) from either set is known as the margin. The goal is to choose a hyperplane with the greatest possible margin between the hyperplane and any point within the training set, giving a greater chance of new data being classified correctly.
Linearly separable data (SVM)
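Formally, for labels yi in {−1, +1}, the separating hyperplane can be written as w·x + b = 0 and the margin is 2/‖w‖, so maximizing the margin is equivalent to the following optimization problem (a standard hard-margin sketch, not specific to any particular library):

```latex
% Standard hard-margin SVM formulation (sketch):
% maximizing the margin 2/||w|| is the same as minimizing ||w||^2
% subject to every training point being on the correct side with margin at least 1.
\[
\min_{w,\,b}\; \tfrac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \,(w \cdot x_i + b) \ge 1, \qquad i = 1, \dots, n
\]
```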
When data are linearly inseparable:
For non-linearly separable data, kernel functions are used. Kernels can be defined as functions that take non-linear data as input and transform it into the required (i.e. separable) form. There are several kernel functions, such as the linear kernel, polynomial kernel, RBF kernel, and sigmoid kernel.
Linearly inseparable data to linearly separable data
Kernel SVM
In the SVM algorithm, a kernel SVM takes a kernel function that maps the data to a higher dimension in which it can be separated.
Some of the most common types of kernel function are:
Linear kernel: K(Xi, Xj) = Xi·Xj
Polynomial kernel: K(Xi, Xj) = (γ Xi·Xj + C)^d, where d is the degree of the polynomial and must be specified.
RBF kernel: K(Xi, Xj) = exp(−γ ‖Xi − Xj‖^2); it is used for non-linearly separable variables, with the squared Euclidean distance as the distance metric.
Sigmoid kernel: K(Xi, Xj) = tanh(γ Xi·Xj + C); it is similar to logistic regression and is used for binary classification.
The kernel trick uses the kernel function to transform the data into a higher-dimensional feature space, making it possible to perform a linear separation for classification.
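To make the kernel trick concrete, here is a small illustrative sketch (the numbers are arbitrary, and the feature map shown is the standard one for a degree-2 polynomial kernel with γ = 1 and C = 1 on 2-D inputs): the kernel value computed directly in the input space matches an ordinary dot product taken after an explicit mapping into the higher-dimensional space.

```python
# Illustrative sketch of the kernel trick (toy numbers only):
# a degree-2 polynomial kernel equals an ordinary dot product in an
# explicitly expanded feature space, without ever computing that expansion.
import numpy as np

def poly_kernel(x, z, degree=2, gamma=1.0, c=1.0):
    # K(x, z) = (gamma * x.z + c)^degree, computed entirely in the input space
    return (gamma * np.dot(x, z) + c) ** degree

def explicit_map(x):
    # Feature map corresponding to the degree-2 polynomial kernel on 2-D inputs:
    # phi(x) = (1, sqrt(2)*x1, sqrt(2)*x2, x1^2, x2^2, sqrt(2)*x1*x2)
    x1, x2 = x
    return np.array([1.0,
                     np.sqrt(2) * x1, np.sqrt(2) * x2,
                     x1 ** 2, x2 ** 2,
                     np.sqrt(2) * x1 * x2])

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

print(poly_kernel(x, z))                         # kernel value in the input space
print(np.dot(explicit_map(x), explicit_map(z)))  # same value in the mapped space
```

Both print statements give the same number (25.0 for these inputs), which is the whole point: the kernel lets the SVM work as if the data had been mapped to the higher-dimensional space, without paying the cost of the mapping.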
So it is better to use linear SVMs for linear problems, and non-linear kernels such as the sigmoid kernel or the Radial Basis Function (RBF) kernel for non-linear problems.
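As a rough sketch of that advice (assuming scikit-learn; the concentric-circles dataset is synthetic and generated only for illustration), a linear kernel struggles on data that is not linearly separable, while an RBF kernel handles it comfortably:

```python
# Sketch: compare a linear SVM and an RBF-kernel SVM on data that is not
# linearly separable (synthetic concentric circles; exact scores vary by split).
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_circles(n_samples=400, factor=0.3, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

linear_svm = SVC(kernel="linear").fit(X_train, y_train)
rbf_svm = SVC(kernel="rbf", gamma="scale").fit(X_train, y_train)

print("Linear kernel accuracy:", linear_svm.score(X_test, y_test))
print("RBF kernel accuracy:   ", rbf_svm.score(X_test, y_test))
```

On this kind of data the linear kernel hovers near chance level, while the RBF kernel separates the two rings almost perfectly, which is exactly the situation the non-linear kernels are meant for.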