Introduction
Definition: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E. We will call a program (or the component of a program) that learns a learner.
Finally, the training experience should be similar to the real-world experience over which the performance P is measured; in other words, the training experience should represent the real world. We say the distribution of training examples should follow a distribution similar to that of the test (real-world) examples. Tic Tac Toe is a very simple example, so in our scenario this should not be a difficult problem. But if we have a more complicated learning task (like the checkers task described in chapter 1 of [1]), this can be a serious issue. If a checkers learner just plays against itself in the training phase, it might never encounter some important board states which it would need in the real world. In such a case we say the distribution of training examples is not fully representative of the distribution of the real-world examples. In practice, it is often necessary to learn from a distribution of examples that is not fully representative. It is important to understand that mastering one distribution does not necessarily lead to good performance over some other distribution. It is also important to know that most of modern machine learning theory is based on the assumption that the distribution of the training examples is similar to the distribution of the test examples. This assumption is necessary for the learner's ability to learn, but we have to keep in mind that in practice it is often violated.

For our learner we have decided that our system will train by playing games against itself. So now we have to define what type of knowledge will be learned. Our Tic Tac Toe system can generate every legal move from any board state, so our system has to learn how to choose the best move from among the legal moves. These legal moves represent a search space, and we need the best search strategy. We call a function that chooses the best move for a given board state the target function. For Tic Tac Toe we define the target function this way: V : B -> R, where B is the set of legal board states and V maps each board state to a numeric score in R. Better board states get a higher score, worse board states a lower score. So in our scenario, our learner has to learn this target function. To select the best move from a board state, the learner has to generate all possible successor board states and use V to choose the best board state (and thereby the best move); a sketch of this selection step is given below.

Most real-world problems are too complex to learn V exactly. In general we are looking for an approximation of the target function, which we call V̂. There are many options for V̂. For the Tic Tac Toe system we choose a linear combination:

V̂(b) = w0 + w1 x1 + w2 x2 + w3 x3 + w4 x4 + w5 x5 + w6 x6   (1)

where w0 through w6 are weights and the xi are so-called features:

x1: number of blue marks
x2: number of red marks
x3: number of two in one row/column/diagonal (blue)
x4: number of two in one row/column/diagonal (red)
x5: number of blue in winning position
x6: number of red in winning position

With this target function our learner just has to adjust the weights; this is our whole learning task. The weights determine the importance of each feature. So we complete our definition of the Tic Tac Toe learning task:

Definition:
Task T: playing Tic Tac Toe
Performance P: percent of games won against opponents
Training experience E: playing practice games against itself
Target function: V : B -> R
Target function representation: V̂(b) = w0 + w1 x1 + w2 x2 + w3 x3 + w4 x4 + w5 x5 + w6 x6   (2)
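To make this representation concrete, here is a minimal sketch of the features, of V̂, and of the move-selection step in Python. It is not the implementation from [3]: the board encoding (a 3x3 list holding 'B' for blue, 'R' for red, or None for empty), the assumption that the learner plays blue, and all function names are illustrative assumptions.

LINES = [
    [(0, 0), (0, 1), (0, 2)], [(1, 0), (1, 1), (1, 2)], [(2, 0), (2, 1), (2, 2)],  # rows
    [(0, 0), (1, 0), (2, 0)], [(0, 1), (1, 1), (2, 1)], [(0, 2), (1, 2), (2, 2)],  # columns
    [(0, 0), (1, 1), (2, 2)], [(0, 2), (1, 1), (2, 0)],                            # diagonals
]

def features(board):
    """Return [x1, ..., x6] for a board state b."""
    marks = [board[r][c] for r in range(3) for c in range(3)]
    x1 = marks.count('B')                                  # x1: number of blue marks
    x2 = marks.count('R')                                  # x2: number of red marks
    x3 = x4 = x5 = x6 = 0
    for line in LINES:
        vals = [board[r][c] for r, c in line]
        if vals.count('B') == 2 and vals.count(None) == 1:
            x3 += 1                                        # x3: two blue in a row/column/diagonal
        if vals.count('R') == 2 and vals.count(None) == 1:
            x4 += 1                                        # x4: two red in a row/column/diagonal
        if vals.count('B') == 3:
            x5 += 1                                        # x5: blue in winning position
        if vals.count('R') == 3:
            x6 += 1                                        # x6: red in winning position
    return [x1, x2, x3, x4, x5, x6]

def v_hat(board, weights):
    """Linear approximation V̂(b) = w0 + w1*x1 + ... + w6*x6, with weights = [w0, ..., w6]."""
    xs = features(board)
    return weights[0] + sum(w * x for w, x in zip(weights[1:], xs))

def best_move(board, weights, mark='B'):
    """Generate all successor board states and return the (move, board) pair with the highest V̂."""
    successors = []
    for r in range(3):
        for c in range(3):
            if board[r][c] is None:
                nxt = [row[:] for row in board]            # copy the board and place the mark
                nxt[r][c] = mark
                successors.append(((r, c), nxt))
    return max(successors, key=lambda move_and_board: v_hat(move_and_board[1], weights))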
Note: With this definition we have reduced the learning of a Tic Tac Toe strategy to the adjustment of the weights (w0 through w6) in the target function representation. We have decided that our learner will train by playing against itself, so the only information the learner is able to access is whether a game was won or lost. We have said that the learner has to learn to choose the best move. Therefore the learner has to store every board state of a game and assign a score to each board state; the board with the best score represents the best move. It is very simple to assign a score to the last board state (the board at the end of the game): if the game was won, we assign +100; if it was lost, we assign -100. So our next challenge is to assign a score to the intermediate board states. If a game was lost, it does not mean that every intermediate board state was bad. It is, for example, possible that a game was played perfectly and just the last move was fatally bad. In [1] a very surprising solution for this problem is presented:

Vtrain(b) ← V̂(Successor(b))   (3)

Here V̂ is the current approximation of V, Successor(b) denotes the next board state following b, and Vtrain(b) is the training value of the board b. So, to summarize, we use the current estimate of the successor board state of b to calculate the training score of the board state b.
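As a small illustration of equation (3) — again a hedged sketch rather than the code from [3] — the training values of one finished game could be computed as follows, where history is assumed to be the list of board states the learner saw (in order), outcome is +100, -100 or 0, and v_hat is the sketch from above.

def training_values(history, outcome, weights):
    """Assign a training value Vtrain to every board state of one finished game."""
    targets = []
    for i, board in enumerate(history):
        if i == len(history) - 1:
            targets.append(outcome)                        # final board: the true game outcome
        else:
            # Vtrain(b) <- V̂(Successor(b)): score b with the current estimate of its successor
            targets.append(v_hat(history[i + 1], weights))
    return targets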
The last thing we need is a learning algorithm to adjust the weights. We decide to use the least mean squares (LMS) training rule. The LMS training rule helps us to minimize the error between the training values and the values of the current approximation: it adjusts the weights by a small amount in the direction that reduces the error. Note that there are other algorithms, but for our problem this one is sufficient. Now we can design the Tic Tac Toe system. The learner

- plays against itself,
- calculates the features xi of every board state,
- calculates the score of every board state using the features,
- uses the current weights to choose the current best move,
- calculates the training scores for the boards (using the successor board state); if the game was won, it sets the last training score to +100, if the game was lost to -100, and if the game was a tie to 0,
- for each training score adjusts the weights using:

wi ← wi + η (Vtrain(b) - V̂(b)) xi   (4)
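In code, the update of equation (4) could look as follows; this is a sketch under the same assumptions as before, with eta standing for the constant η explained next.

def lms_update(weights, board, v_train, eta=0.1):
    """One LMS step: wi <- wi + eta * (Vtrain(b) - V̂(b)) * xi."""
    xs = [1.0] + features(board)               # x0 = 1 so that w0 is updated like the other weights
    error = v_train - v_hat(board, weights)
    for i in range(len(weights)):
        weights[i] += eta * error * xs[i]      # a small step in the direction that reduces the error
    return weights

# One training round over a finished self-play game could then be:
#   for board, target in zip(history, training_values(history, outcome, weights)):
#       lms_update(weights, board, target)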
In equation (4), η is a small constant (e.g. 0.1), and Vtrain(b) - V̂(b) is the error; we can see that we change the weights so as to reduce the error on the training examples. The implementation of the Tic Tac Toe system contains three classes:

1. Game: represents one game
2. TicTacToeSimpleOpponent: a Tic Tac Toe player who makes his moves randomly
3. TicTacToeLearner: uses the LMS training rule to learn to play Tic Tac Toe (the training partner is TicTacToeSimpleOpponent)

After some experiments with the parameters (number of training loops, ...) I got the following results: if two TicTacToeSimpleOpponents play against each other, the TicTacToeSimpleOpponent who starts wins around 59% of the games and loses 29%. A TicTacToeLearner that trains for 70 rounds against a TicTacToeSimpleOpponent wins about 70% of the following games against TicTacToeSimpleOpponents. If we increase the number of training rounds to 500, it wins approximately 86% of the games and loses only 6% of them. We can increase the training rounds to 1500, but the result is only a marginal increase in the number of games won.
3 Exercises
Our LMS training rule is a stochastic gradient-descent search. We will now try to prove this (though I am not sure whether the proof is correct). We have to show that the LMS training rule alters the weights in proportion to -∂E/∂wi, where the squared error E is

E = (Vtrain(b) - V̂(b))²

With the linear representation

V̂(b) = w0 + w1 x1 + ... + w6 x6

and the LMS rule

wi ← wi + η (Vtrain(b) - V̂(b)) xi

the partial derivative of E with respect to wi is

∂E/∂wi = 2 (Vtrain(b) - V̂(b)) · ∂/∂wi (Vtrain(b) - w0 - w1 x1 - ... - w6 x6)
       = -2 (Vtrain(b) - V̂(b)) xi   (with x0 = 1 for the constant weight w0)
tbc ...
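As a quick numerical sanity check of this claim, one can compare the analytic gradient -2 (Vtrain(b) - V̂(b)) xi with a finite-difference estimate of ∂E/∂wi. The sketch below reuses the hypothetical features/v_hat helpers from above, so it is an illustration rather than part of the original code.

def error_gradient(weights, board, v_train):
    """Analytic gradient dE/dwi = -2 * (Vtrain(b) - V̂(b)) * xi, with x0 = 1."""
    xs = [1.0] + features(board)
    err = v_train - v_hat(board, weights)
    return [-2.0 * err * x for x in xs]

def error_gradient_numeric(weights, board, v_train, h=1e-6):
    """Central finite-difference estimate of dE/dwi, for comparison."""
    def squared_error(w):
        return (v_train - v_hat(board, w)) ** 2
    grad = []
    for i in range(len(weights)):
        w_plus, w_minus = list(weights), list(weights)
        w_plus[i] += h
        w_minus[i] -= h
        grad.append((squared_error(w_plus) - squared_error(w_minus)) / (2 * h))
    return grad

Because the LMS step η (Vtrain(b) - V̂(b)) xi equals -(η/2) ∂E/∂wi, each update moves the weights a small step against the gradient of the squared error.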
References
[1] Mitchell, Tom M. (1997). Machine Learning. ISBN 0-07-115467-1.
[2] Tic Tac Toe, http://en.wikipedia.org/wiki/Tic_tac_toe
[3] Source code, http://code.google.com/p/mindthegap/