Deep Learning




Contents

Website  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   vii
Acknowledgments  . . . . . . . . . . . . . . . . . . . . . . . . . . .   viii
Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   xi

1  Introduction  . . . . . . . . . . . . . . . . . . . . . . . . . . . .    1
   1.1  Who Should Read This Book? . . . . . . . . . . . . . . . . . .     8
   1.2  Historical Trends in Deep Learning . . . . . . . . . . . . . .    11

I  Applied Math and Machine Learning Basics  . . . . . . . . . . . . .    29

2  Linear Algebra  . . . . . . . . . . . . . . . . . . . . . . . . . .    31
   2.1  Scalars, Vectors, Matrices and Tensors . . . . . . . . . . . .    31
   2.2  Multiplying Matrices and Vectors . . . . . . . . . . . . . . .    34
   2.3  Identity and Inverse Matrices  . . . . . . . . . . . . . . . .    36
   2.4  Linear Dependence and Span . . . . . . . . . . . . . . . . . .    37
   2.5  Norms  . . . . . . . . . . . . . . . . . . . . . . . . . . . .    39
   2.6  Special Kinds of Matrices and Vectors  . . . . . . . . . . . .    40
   2.7  Eigendecomposition . . . . . . . . . . . . . . . . . . . . . .    42
   2.8  Singular Value Decomposition . . . . . . . . . . . . . . . . .    44
   2.9  The Moore-Penrose Pseudoinverse  . . . . . . . . . . . . . . .    45
   2.10 The Trace Operator  . . . . . . . . . . . . . . . . . . . . . .   46
   2.11 The Determinant . . . . . . . . . . . . . . . . . . . . . . . .   47
   2.12 Example: Principal Components Analysis  . . . . . . . . . . . .   48

3  Probability and Information Theory  . . . . . . . . . . . . . . . .    53
   3.1  Why Probability?  . . . . . . . . . . . . . . . . . . . . . . .   54