MACHINE LEARNING
Cheat Sheet
PART 1
What is Machine learning?
Machine learning is a branch of computer science that deals with
programming systems so that they automatically learn and improve
with experience. For example, robots are programmed to perform
tasks based on the data they gather from sensors. The system
automatically learns programs from data.
What is the difference between Data Mining
and Machine learning?
Machine learning relates to the study, design and development of
algorithms that give computers the capability to learn without
being explicitly programmed. Data mining, in contrast, can be defined
as the process of extracting knowledge or previously unknown
interesting patterns from unstructured data. Machine learning
algorithms are used during this process.
What is ‘Overfitting’ in Machine learning?
In machine learning, ‘overfitting’ occurs when a statistical model
describes random error or noise instead of the underlying
relationship. Overfitting is normally observed when a model is
excessively complex, for example because it has too many parameters
relative to the number of training observations. A model that has
been overfit exhibits poor predictive performance.
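A minimal sketch of this effect follows. The toy data (noisy samples of y = 2x), the `memorizer` model and every other name are illustrative assumptions, not part of the original sheet. A model that simply memorizes noisy training points fits them perfectly, yet makes errors on fresh points drawn from the same relationship.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def make_data(n):
    """Noisy samples of an assumed underlying relationship y = 2x."""
    data = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        data.append((x, 2 * x + rng.gauss(0, 1)))
    return data

train = make_data(20)
test = make_data(20)

def memorizer(x):
    """Overly flexible model: copy the target of the closest training point."""
    _, y = min(train, key=lambda point: abs(point[0] - x))
    return y

def fit_slope(data):
    """Simple model: least-squares line through the origin, y = slope * x."""
    return sum(x * y for x, y in data) / sum(x * x for x, _ in data)

def mean_squared_error(model, data):
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

slope = fit_slope(train)

# Zero as long as the training x values are distinct: the noise was memorized.
memorizer_train_error = mean_squared_error(memorizer, train)
# Errors on unseen data show what the memorized noise costs.
memorizer_test_error = mean_squared_error(memorizer, test)
line_test_error = mean_squared_error(lambda x: slope * x, test)

print(memorizer_train_error, memorizer_test_error, line_test_error)
```

Comparing the printed training and test errors of the memorizer shows the gap that signals overfitting. The one-parameter line is included only as a simpler point of comparison.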
Why does overfitting happen?
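The possibility of overfitting exists because the criterion used for
training the model is not the same as the criterion used to judge its
efficacy: training optimizes the fit to the training data, whereas
efficacy is judged on data the model has not seen.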
How can you avoid overfitting?
Overfitting can be avoided by using a lot of data; it tends to happen
when you have a small dataset and try to learn from it. If you have
only a small dataset and are forced to build a model from it, you can
use a technique known as cross validation. In this method the dataset
is split into two sections, a training dataset and a testing dataset:
the datapoints in the training dataset are used to build the model,
while the testing dataset is used only to test it.
In this technique, a model is usually given a dataset of known data
on which training is run (the training dataset) and a dataset of
unknown data against which the model is tested. The idea of cross
validation is to define a dataset to “test” the model in the training
phase.
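The sketch below illustrates one common variant, k-fold cross validation, in which the split into training and testing data is repeated so that every datapoint is held out exactly once. The function names, the mean-predicting toy model and the data are assumptions made for illustration only.

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k non-overlapping folds (interleaved)."""
    return [list(range(i, n, k)) for i in range(k)]

def cross_validate(xs, ys, k, fit, predict):
    """Average held-out squared error over k folds."""
    n = len(xs)
    errors = []
    for fold in k_fold_indices(n, k):
        held_out = set(fold)
        # Train only on the datapoints outside the current fold.
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        model = fit(train_x, train_y)
        # Test only on the held-out fold.
        fold_error = sum((predict(model, xs[i]) - ys[i]) ** 2 for i in fold) / len(fold)
        errors.append(fold_error)
    return sum(errors) / k

# Hypothetical toy model: always predict the mean of the training targets.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)

def predict_mean(model, x):
    return model

xs = list(range(10))
ys = [2.0 * x for x in xs]
folds = k_fold_indices(len(xs), 5)
score = cross_validate(xs, ys, k=5, fit=fit_mean, predict=predict_mean)
print(folds, score)
```

Because each score is computed on data the model did not train on, the average gives a more honest estimate of performance than the training error, which is useful when the dataset is small.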
What is inductive machine learning?
Inductive machine learning is the process of learning by examples,
in which a system tries to induce a general rule from a set of
observed instances.
What are the five popular algorithms of
Machine Learning?
* Decision Trees
* Neural Networks (back propagation)
* Probabilistic networks
* Nearest Neighbor
* Support vector machines
What are the different Algorithm techniques in
Machine Learning?
The different types of techniques in Machine Learning are:
* Supervised Learning
* Unsupervised Learning
* Semi-supervised Learning
* Reinforcement Learning
* Transduction
* Learning to Learn
What are the three stages to build the
hypotheses or model in machine learning?
* Model building
* Model testing
* Applying the model
What is the standard approach to supervised
learning?
The standard approach to supervised learning is to split the set of
examples into a training set and a test set.
What is ‘Training set’ and ‘Test set’?
In various areas of information science like machine learning, a set of
data used to discover a potentially predictive relationship is known
as the ‘Training Set’. The training set is the set of examples given to
the learner, while the test set is used to test the accuracy of the
hypotheses generated by the learner and is the set of examples held
back from the learner. The training set is distinct from the test set.
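A minimal sketch of such a split is shown below. The function name `train_test_split`, the 25% test fraction and the toy examples are assumptions chosen for illustration.

```python
import random

def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the examples and hold back a fraction as the test set."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    # The first n_test shuffled examples are held back from the learner.
    return shuffled[n_test:], shuffled[:n_test]

# Toy examples: (feature, label) pairs.
examples = [(x, x % 2) for x in range(20)]
train_set, test_set = train_test_split(examples)
print(len(train_set), len(test_set))
```

Shuffling before splitting guards against any ordering in the original data leaking into one of the two sets, and the two returned lists never share an example.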
What are the various approaches to machine
learning?
The different approaches in Machine Learning are:
* Concept Vs Classification Learning
* Symbolic Vs Statistical Learning
* Inductive Vs Analytical Learning
What is not Machine Learning?
* Artificial Intelligence
* Rule based inference
What is the function of ‘Unsupervised
Learning’?
* Find clusters of the data
* Find interesting directions in data
* Find interesting coordinates and correlations
* Find novel observations / database cleaning
* Find low-dimensional representations of the data
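As an illustration of the first item in the list above, finding clusters, here is a minimal sketch of k-means on one-dimensional points. The function name, the starting centers and the data are assumptions made for this example. Note that no labels are supplied; the grouping is discovered from the data alone.

```python
def k_means_1d(points, centers, iterations=10):
    """Plain k-means on 1-D points, starting from the given centers."""
    centers = list(centers)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        # Assignment step: each point joins the cluster of its nearest center.
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
centers, clusters = k_means_1d(points, centers=[0.0, 5.0])
print(centers, clusters)
```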
What is the function of ‘Supervised
Learning’?
* Classification
* Speech recognition
* Regression
* Predict time series
* Annotate strings
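Regression, one of the tasks listed above, can be sketched as fitting a straight line to labeled examples by ordinary least squares. The function name `fit_line` and the toy data are assumptions for illustration; the labels are what make this supervised.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # labels generated by y = 2x + 1
slope, intercept = fit_line(xs, ys)
prediction = slope * 4.0 + intercept  # predict the label of an unseen input
print(slope, intercept, prediction)
```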
What is algorithm independent machine
learning?
Machine learning whose mathematical foundations are independent
of any particular classifier or learning algorithm is referred to as
algorithm-independent machine learning.
What is the difference between artificial
intelligence and machine learning?
Designing and developing algorithms that adapt their behaviour
based on empirical data is known as Machine Learning. Artificial
intelligence, in addition to machine learning, also covers
other aspects like knowledge representation, natural language
processing, planning, robotics etc.
What is classifier in machine learning?
A classifier in machine learning is a system that inputs a vector of
discrete or continuous feature values and outputs a single discrete
value, the class.
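The definition above can be made concrete with a minimal sketch of a nearest-neighbor classifier. The function names, the training examples and the class labels "small" and "large" are assumptions made for illustration. The input is a vector of continuous feature values and the output is a single discrete class.

```python
def nearest_neighbor_classifier(training_examples):
    """Build a function that maps a feature vector to the class of its closest training example."""
    def classify(features):
        def squared_distance(example):
            vector, _ = example
            return sum((a - b) ** 2 for a, b in zip(vector, features))
        _, label = min(training_examples, key=squared_distance)
        return label
    return classify

# Each training example pairs a feature vector with its class.
training_examples = [
    ((1.0, 1.0), "small"),
    ((1.5, 0.5), "small"),
    ((8.0, 9.0), "large"),
    ((9.0, 8.5), "large"),
]
classify = nearest_neighbor_classifier(training_examples)
print(classify((2.0, 1.0)), classify((8.5, 8.0)))
```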
What are the advantages of Naive Bayes?
A Naive Bayes classifier will converge quicker than discriminative
models like logistic regression, so you need less training data. The
main disadvantage is that it can’t learn interactions between features.
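The point about feature interactions can be illustrated with a minimal sketch of a categorical Naive Bayes classifier trained on XOR data, where the class depends only on how the two features combine. The function names and the add-one smoothing are assumptions chosen for this example. Because each feature on its own carries no information about the class, both classes receive the same score.

```python
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """Count class frequencies and per-feature value frequencies within each class."""
    class_counts = Counter(label for _, label in examples)
    value_counts = defaultdict(Counter)  # (class, feature index) -> Counter of values
    for features, label in examples:
        for i, value in enumerate(features):
            value_counts[(label, i)][value] += 1
    return class_counts, value_counts

def class_scores(model, features, n_values=2):
    """Unnormalized P(class) * product of P(feature_i | class), with add-one smoothing."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total
        # Features are treated as independent given the class (the "naive" assumption).
        for i, value in enumerate(features):
            score *= (value_counts[(label, i)][value] + 1) / (count + n_values)
        scores[label] = score
    return scores

# XOR: the class is 1 exactly when the two features differ.
xor_examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
model = train_naive_bayes(xor_examples)
scores = class_scores(model, (0, 1))
print(scores)
```

The tie means the classifier cannot separate the two classes on this data, however many XOR examples it is given.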
In what areas is Pattern Recognition used?
Pattern Recognition can be used in:
* Computer Vision
* Speech Recognition
* Data Mining
* Statistics
* Information Retrieval
* Bio-Informatics
What is Genetic Programming?
Genetic programming is an evolutionary technique used in machine
learning. It works by repeatedly testing a set of candidate programs
and selecting the best choices among the results.
What is Inductive Logic Programming in
Machine Learning?
Inductive Logic Programming (ILP) is a subfield of machine learning
which uses logic programming to represent background knowledge
and examples.
What is Model Selection in Machine Learning?
Model Selection is the process of choosing among different
mathematical models that are used to describe the same data set.
Model selection is applied in the fields of statistics,
machine learning and data mining.
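One simple way to carry this out is sketched below: fit each candidate model on training data and keep the one with the lowest error on separate validation data. The two candidates (a constant and a straight line), the function names and the data are assumptions for illustration; other selection criteria exist.

```python
def fit_constant(xs, ys):
    """Candidate 1: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_linear(xs, ys):
    """Candidate 2: least-squares straight line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return lambda x: mean_y + slope * (x - mean_x)

def validation_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def select_model(candidates, train, validation):
    """Fit every candidate on the training data; keep the lowest validation error."""
    errors = {name: validation_error(fit(*train), *validation)
              for name, fit in candidates.items()}
    return min(errors, key=errors.get), errors

train = ([0.0, 1.0, 2.0, 3.0], [0.1, 2.9, 6.2, 8.8])
validation = ([4.0, 5.0], [12.1, 14.9])
best, errors = select_model({"constant": fit_constant, "linear": fit_linear}, train, validation)
print(best, errors)
```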