Linear SVM (Support Vector Machine)
This is legacy Apache Ignite documentation. The new documentation is hosted here: https://ignite.apache.org/docs/latest/
Support Vector Machines (SVMs) are supervised learning models with associated learning algorithms that analyze data used for classification and regression analysis.
Given a set of training examples, each marked as belonging to one or the other of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier.
The Apache Ignite Machine Learning module supports only linear SVM. For more information, see the Wikipedia article on SVM.
Model
An SVM model is represented by the class SVMLinearClassificationModel. It makes a prediction for a given vector of features in the following way:
SVMLinearClassificationModel model = ...;
double prediction = model.predict(observation);

Presently, Ignite supports a few parameters for SVMLinearClassificationModel:
isKeepingRawLabels - controls the output label format: if false, the model returns the labels -1 and +1; if true, it returns the raw distance from the separating hyperplane (default value: false)
threshold - the observation is assigned the +1 label if its raw value is greater than this threshold (default value: 0.0)
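As a rough illustration of how these two parameters interact, the decision rule described above can be sketched in plain Java. The class LinearSvmDecisionSketch and its fields are hypothetical and are not part of Ignite. The sketch assumes that the raw value is the dot product of the weights and the features plus an intercept, and it follows the -1/+1 convention stated in this section rather than any particular Ignite release.

```java
// Illustrative sketch only: a plain-Java decision rule, not Ignite's implementation.
final class LinearSvmDecisionSketch {
    private final double[] weights;      // hypothetical learned weight vector
    private final double intercept;      // hypothetical learned bias term
    private final boolean keepRawLabels; // mirrors the isKeepingRawLabels parameter
    private final double threshold;      // mirrors the threshold parameter

    LinearSvmDecisionSketch(double[] weights, double intercept,
                            boolean keepRawLabels, double threshold) {
        this.weights = weights.clone();
        this.intercept = intercept;
        this.keepRawLabels = keepRawLabels;
        this.threshold = threshold;
    }

    /** Signed score w.x + b; its sign tells on which side of the hyperplane x lies. */
    double rawScore(double[] features) {
        if (features.length != weights.length)
            throw new IllegalArgumentException("Feature vector has the wrong size");

        double sum = intercept;
        for (int i = 0; i < weights.length; i++)
            sum += weights[i] * features[i];
        return sum;
    }

    /** Returns the raw score, or +1 / -1 depending on whether the score exceeds the threshold. */
    double predict(double[] features) {
        double raw = rawScore(features);
        if (keepRawLabels)
            return raw;
        return raw > threshold ? 1.0 : -1.0;
    }
}
```

In this sketch the threshold matters only when raw labels are switched off: with raw labels enabled, the score is returned unchanged and no comparison takes place.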
SVMLinearClassificationModel model = ...;
double prediction = model
    .withRawLabels(true)
    .withThreshold(5)
    .predict(observation);

Trainer
Base class for a soft-margin linear SVM classification trainer based on the communication-efficient distributed dual coordinate ascent algorithm (CoCoA) with a hinge-loss function. The trainer takes a labeled dataset with -1 and +1 labels for the two classes as input and performs binary classification.
The paper describing this algorithm can be found at https://arxiv.org/abs/1409.1458.
Presently, Ignite supports the following parameters for SVMLinearClassificationTrainer:
amountOfIterations - number of outer SDCA algorithm iterations (default value: 200)
amountOfLocIterations - number of local SDCA algorithm iterations (default value: 100)
lambda - regularization parameter (default value: 0.4)
seed - an initialization parameter that makes models reproducible (the trainer uses a random number generator to choose the observation for each local iteration of the weight vector update)
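To give a feel for what a local SDCA iteration does, here is a minimal single-node sketch of stochastic dual coordinate ascent for a hinge-loss linear SVM. It is not Ignite's trainer: the class SdcaHingeSketch and its method names are hypothetical, there is no intercept term, and the distributed CoCoA step that combines updates from several partitions is left out. The lambda and seed arguments are assumed to play roughly the same roles as the parameters listed above.

```java
import java.util.Random;

// Illustrative sketch only: single-node SDCA for a hinge-loss linear SVM.
// It is not Ignite's trainer and omits the distributed (CoCoA) aggregation step.
final class SdcaHingeSketch {
    /**
     * Runs the given number of dual coordinate ascent steps and returns the primal weights.
     * Labels must be -1 or +1; this sketch has no intercept term.
     */
    static double[] train(double[][] x, double[] y, double lambda, int iterations, long seed) {
        int n = x.length;
        int dim = x[0].length;
        double[] alpha = new double[n]; // dual variables, kept in [0, 1]
        double[] w = new double[dim];   // w = (1 / (lambda * n)) * sum_i alpha_i * y_i * x_i
        Random rnd = new Random(seed);  // a fixed seed makes the sampling order reproducible

        for (int it = 0; it < iterations; it++) {
            int i = rnd.nextInt(n);     // choose one observation at random
            double sqNorm = dot(x[i], x[i]);
            if (sqNorm == 0.0)
                continue;               // a zero vector cannot change w

            // Closed-form maximiser of the dual objective along coordinate i, clipped to [0, 1].
            double margin = y[i] * dot(w, x[i]);
            double candidate = alpha[i] + (1.0 - margin) * lambda * n / sqNorm;
            double updated = Math.max(0.0, Math.min(1.0, candidate));
            double delta = updated - alpha[i];

            // Keep w consistent with the changed dual variable.
            alpha[i] = updated;
            for (int j = 0; j < dim; j++)
                w[j] += delta * y[i] * x[i][j] / (lambda * n);
        }
        return w;
    }

    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++)
            sum += a[i] * b[i];
        return sum;
    }
}
```

A larger lambda shrinks each step and pulls the weights toward zero, while the seed only changes the order in which observations are visited, which is why fixing it reproduces the same model.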
// Set up the trainer
SVMLinearClassificationTrainer trainer = new SVMLinearClassificationTrainer()
    .withAmountOfIterations(AMOUNT_OF_ITERATIONS)
    .withAmountOfLocIterations(AMOUNT_OF_LOC_ITERATIONS)
    .withLambda(LAMBDA);

// Build the model
SVMLinearClassificationModel mdl = trainer.fit(
    ignite,
    dataCache,
    vectorizer
);

Example
To see how the linear SVM classifier can be used in practice, try this example, which is available on GitHub and delivered with every Apache Ignite distribution.
The training dataset is a subset of the Iris dataset (the classes with labels 1 and 2, which form a linearly separable two-class dataset). It can be loaded from the UCI Machine Learning Repository.
