Deep Learning.pdf



CONTENTS

    9.10 The Neuroscientific Basis for Convolutional Networks . . . . 365
    9.11 Convolutional Networks and the History of Deep Learning . . 372

10  Sequence Modeling: Recurrent and Recursive Nets . . . . . . . . . 374
    10.1 Unfolding Computational Graphs . . . . . . . . . . . . . . . 376
    10.2 Recurrent Neural Networks . . . . . . . . . . . . . . . . . 379
    10.3 Bidirectional RNNs . . . . . . . . . . . . . . . . . . . . . 396
    10.4 Encoder-Decoder Sequence-to-Sequence Architectures . . . . . 397
    10.5 Deep Recurrent Networks . . . . . . . . . . . . . . . . . . 399
    10.6 Recursive Neural Networks . . . . . . . . . . . . . . . . . 401
    10.7 The Challenge of Long-Term Dependencies . . . . . . . . . . 403
    10.8 Echo State Networks . . . . . . . . . . . . . . . . . . . . 406
    10.9 Leaky Units and Other Strategies for Multiple Time Scales . 409
    10.10 The Long Short-Term Memory and Other Gated RNNs . . . . . . 411
    10.11 Optimization for Long-Term Dependencies . . . . . . . . . . 415
    10.12 Explicit Memory . . . . . . . . . . . . . . . . . . . . . . 419

11  Practical Methodology . . . . . . . . . . . . . . . . . . . . . . 424
    11.1 Performance Metrics . . . . . . . . . . . . . . . . . . . . 425
    11.2 Default Baseline Models . . . . . . . . . . . . . . . . . . 428
    11.3 Determining Whether to Gather More Data . . . . . . . . . . 429
    11.4 Selecting Hyperparameters . . . . . . . . . . . . . . . . . 430
    11.5 Debugging Strategies . . . . . . . . . . . . . . . . . . . . 439
    11.6 Example: Multi-Digit Number Recognition . . . . . . . . . . 443

12  Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 446
    12.1 Large Scale Deep Learning . . . . . . . . . . . . . . . . . 446
    12.2 Computer Vision . . . . . . . . . . . . . . . . . . . . . . 455
    12.3 Speech Recognition . . . . . . . . . . . . . . . . . . . . . 461
    12.4 Natural Language Processing . . . . . . . . . . . . . . . . 464
    12.5 Other Applications . . . . . . . . . . . . . . . . . . . . . 480

III Deep Learning Research . . . . . . . . . . . . . . . . . . . . . 489

13  Linear Factor Models . . . . . . . . . . . . . . . . . . . . . . 492
    13.1 Probabilistic PCA and Factor Analysis . . . . . . . . . . . 493
    13.2 Independent Component Analysis (ICA) . . . . . . . . . . . . 494
    13.3 Slow Feature Analysis . . . . . . . . . . . . . . . . . . . 496
    13.4 Sparse Coding . . . . . . . . . . . . . . . . . . . . . . . 499