1
+ <!DOCTYPE html>
2
+ < html >
3
+ < head >
4
+ < meta http-equiv ="content-type " content ="text/html;charset=utf-8 "/>
5
+ < meta name ="viewport " content ="width=device-width, initial-scale=1.0 "/>
6
+ < meta name ="description " content =""/>
7
+
8
+ < meta name ="twitter:card " content ="summary "/>
9
+ < meta name ="twitter:image:src " content ="https://avatars1.githubusercontent.com/u/64068543?s=400&v=4 "/>
10
+ < meta name ="twitter:title " content ="Distilling the Knowledge in a Neural Network "/>
11
+ < meta name ="twitter:description " content =""/>
12
+ < meta name ="twitter:site " content ="@labmlai "/>
13
+ < meta name ="twitter:creator " content ="@labmlai "/>
14
+
15
+ < meta property ="og:url " content ="https://nn.labml.ai/distillation/readme.html "/>
16
+ < meta property ="og:title " content ="Distilling the Knowledge in a Neural Network "/>
17
+ < meta property ="og:image " content ="https://avatars1.githubusercontent.com/u/64068543?s=400&v=4 "/>
18
+ < meta property ="og:site_name " content ="LabML Neural Networks "/>
19
+ < meta property ="og:type " content ="object "/>
20
+ < meta property ="og:title " content ="Distilling the Knowledge in a Neural Network "/>
21
+ < meta property ="og:description " content =""/>
22
+
23
+ < title > Distilling the Knowledge in a Neural Network</ title >
24
+ < link rel ="shortcut icon " href ="/icon.png "/>
25
+ < link rel ="stylesheet " href ="../pylit.css ">
26
+ < link rel ="canonical " href ="https://nn.labml.ai/distillation/readme.html "/>
27
+ <!-- Global site tag (gtag.js) - Google Analytics -->
28
+ < script async src ="https://www.googletagmanager.com/gtag/js?id=G-4V3HC8HBLH "> </ script >
29
+ < script >
30
+ window . dataLayer = window . dataLayer || [ ] ;
31
+
32
+ function gtag ( ) {
33
+ dataLayer . push ( arguments ) ;
34
+ }
35
+
36
+ gtag ( 'js' , new Date ( ) ) ;
37
+
38
+ gtag ( 'config' , 'G-4V3HC8HBLH' ) ;
39
+ </ script >
40
+ </ head >
41
+ < body >
42
+ < div id ='container '>
43
+ < div id ="background "> </ div >
44
+ < div class ='section '>
45
+ < div class ='docs '>
46
+ < p >
47
+ < a class ="parent " href ="/ "> home</ a >
48
+ < a class ="parent " href ="index.html "> distillation</ a >
49
+ </ p >
50
+ < p >
51
+
52
+ < a href ="https://github.com/lab-ml/labml_nn/tree/master/labml_nn/distillation/readme.md ">
53
+ < img alt ="Github "
54
+ src ="https://img.shields.io/github/stars/lab-ml/nn?style=social "
55
+ style ="max-width:100%; "/> </ a >
56
+ < a href ="https://twitter.com/labmlai "
57
+ rel ="nofollow ">
58
+ < img alt ="Twitter "
59
+ src ="https://img.shields.io/twitter/follow/labmlai?style=social "
60
+ style ="max-width:100%; "/> </ a >
61
+ </ p >
62
+ </ div >
63
+ </ div >
64
+ < div class ='section ' id ='section-0 '>
65
+ < div class ='docs '>
66
+ < div class ='section-link '>
67
+ < a href ='#section-0 '> #</ a >
68
+ </ div >
69
+ < h1 > < a href ="https://nn.labml.ai/distillation/index.html "> Distilling the Knowledge in a Neural Network</ a > </ h1 >
70
+ < p > This is a < a href ="https://pytorch.org "> PyTorch</ a > implementation/tutorial of the paper
71
+ < a href ="https://papers.labml.ai/paper/1503.02531 "> Distilling the Knowledge in a Neural Network</ a > .</ p >
72
+ < p > Distillation trains a small network using the knowledge in a trained larger network;
73
+ i.e., it distills the knowledge of the large network into the small one.</ p >
74
+ < p > A large model with regularization or an ensemble of models (using dropout) generalizes
75
+ better than a small model when trained directly on the data and labels.
76
+ However, a small model can be trained to generalize better with the help of a large model.
77
+ Smaller models are preferable in production: they are faster and need less compute and memory.</ p >
78
+ < p > The output probabilities of a trained model give more information than the labels
79
+ because the model assigns non-zero probabilities to incorrect classes as well.
80
+ These probabilities tell us that a sample has a chance of belonging to certain classes.
81
+ For instance, when classifying digits, given an image of the digit < em > 7</ em > ,
82
+ a model that generalizes well will give a high probability to 7 and a small but non-zero
83
+ probability to 2, while assigning almost zero probability to other digits.
84
+ Distillation uses this information to train a small model better.</ p >
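+ < p > The idea above can be illustrated with a minimal, framework-free sketch. It is not the implementation from this repository:
+ the logits are made-up values for an image of the digit 7, and the helper names < code > softmax</ code > and
+ < code > soft_target_loss</ code > are hypothetical. It assumes the temperature-softened softmax described in the paper,
+ where a higher temperature spreads probability onto similar classes such as 2.</ p >

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_target_loss(student_logits, teacher_logits, temperature):
    """Cross-entropy between the teacher's and the student's softened probabilities."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(teacher_probs, student_probs))


# Hypothetical teacher logits for an image of the digit 7 (classes 0..9);
# class 2 gets the second-largest logit because it looks similar to 7.
teacher_logits = [0.0, 1.0, 4.0, 0.5, 0.0, 0.0, 0.0, 9.0, 0.5, 1.0]

hard = softmax(teacher_logits, temperature=1.0)  # almost one-hot on class 7
soft = softmax(teacher_logits, temperature=5.0)  # class 2 becomes clearly visible

# An untrained student that predicts the same logit for every class.
uniform_student_logits = [0.0] * 10
loss = soft_target_loss(uniform_student_logits, teacher_logits, temperature=5.0)
```

+ < p > Training the student to reduce this loss pushes its softened output toward the teacher's,
+ so it also learns which incorrect classes the teacher considers plausible.</ p >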
85
+ < p > < a href ="https://app.labml.ai/run/d6182e2adaf011eb927c91a2a1710932 "> < img alt ="View Run " src ="https://img.shields.io/badge/labml-experiment-brightgreen " /> </ a > </ p >
86
+ </ div >
87
+ < div class ='code '>
88
+
89
+ </ div >
90
+ </ div >
91
+ </ div >
92
+ </ div >
93
+ < script src ="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.4/MathJax.js?config=TeX-AMS_HTML ">
94
+ </ script >
95
+ <!-- MathJax configuration -->
96
+ < script type ="text/x-mathjax-config ">
97
+ MathJax . Hub . Config ( {
98
+ tex2jax : {
99
+ inlineMath : [ [ '$' , '$' ] ] ,
100
+ displayMath : [ [ '$$' , '$$' ] ] ,
101
+ processEscapes : true ,
102
+ processEnvironments : true
103
+ } ,
104
+ // Center justify equations in code and markdown cells. Elsewhere
105
+ // we use CSS to left justify single line equations in code cells.
106
+ displayAlign : 'center' ,
107
+ "HTML-CSS" : { fonts : [ "TeX" ] }
108
+ } ) ;
109
+ </ script >
110
+ < script >
111
+ function handleImages ( ) {
112
+ var images = document . querySelectorAll ( 'p>img' )
113
+
114
+ console . log ( images ) ;
115
+ for ( var i = 0 ; i < images . length ; ++ i ) {
116
+ handleImage ( images [ i ] )
117
+ }
118
+ }
119
+
120
+ function handleImage ( img ) {
121
+ img . parentElement . style . textAlign = 'center'
122
+
123
+ var modal = document . createElement ( 'div' )
124
+ modal . id = 'modal'
125
+
126
+ var modalContent = document . createElement ( 'div' )
127
+ modal . appendChild ( modalContent )
128
+
129
+ var modalImage = document . createElement ( 'img' )
130
+ modalContent . appendChild ( modalImage )
131
+
132
+ var span = document . createElement ( 'span' )
133
+ span . classList . add ( 'close' )
134
+ span . textContent = 'x'
135
+ modal . appendChild ( span )
136
+
137
+ img . onclick = function ( ) {
138
+ console . log ( 'clicked' )
139
+ document . body . appendChild ( modal )
140
+ modalImage . src = img . src
141
+ }
142
+
143
+ span . onclick = function ( ) {
144
+ document . body . removeChild ( modal )
145
+ }
146
+ }
147
+
148
+ handleImages ( )
149
+ </ script >
150
+ </ body >
151
+ </ html >