Outline of CNNs
- Spatial locality & invariance
- Convolution and filters
- Max Pooling
- Example architecture
- Illustrations: what do CNNs not learn?
ConvNets: high-level illustration: build structure into the map.
map $x$ to $h_L \circ h_{L-1} \circ \cdots \circ h_1(x)$
$h_i(z)=\sigma(w_iz+b_i).$
Note that here $w_i$ cannot be an arbitrary matrix as in an FFNN; it must be a matrix with a special structure.
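A minimal NumPy sketch of the composition above (illustrative only; the choice of ReLU as $\sigma$ and the layer sizes are assumptions, not from the lecture):

```python
import numpy as np

def sigma(z):
    # elementwise nonlinearity (ReLU chosen here as an illustrative sigma)
    return np.maximum(0.0, z)

def layer(W, b):
    # returns the function h_i(z) = sigma(W z + b)
    return lambda z: sigma(W @ z + b)

rng = np.random.default_rng(0)
x = rng.standard_normal(4)
h1 = layer(rng.standard_normal((3, 4)), np.zeros(3))
h2 = layer(rng.standard_normal((2, 3)), np.zeros(2))
y = h2(h1(x))   # h_2(h_1(x)) -- composition of layers
print(y.shape)  # (2,)
```

In a CNN the matrices `W` would be constrained to the special (shared, sparse) structure discussed below, rather than dense random matrices.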
How to input an image into a neural network?
Flatten the image into a vector, or keep it as a matrix/tensor.
Why not use a fully connected FNN for image classification?
Image has locality and translation invariance.
Main ideas of CNNs
Convolution: local detectors exploit spatial locality
Weight sharing: apply the same detector to all image patches
- efficiency (far fewer parameters!)
- translation invariance
Pooling
- abstracts away exact location
Convolutional layer: 1D example
Filter: detects a local signal/pattern in the input
- Stride: the step size by which the filter moves.
- Weight sharing: the same filter is applied across the entire vector.
- Because of weight sharing, a CNN needs far fewer unique weights than an FCNN.
Padding
Padding keeps the output from shrinking layer after layer.
Pad with what? Typically zeros (zero padding).
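A minimal NumPy sketch of the 1D convolutional layer described above (stride, weight sharing, zero padding); the filter `[1, 0, -1]` is just an illustrative edge-detector, not from the lecture:

```python
import numpy as np

def conv1d(x, w, stride=1, pad=0):
    # 1D convolution (cross-correlation): the SAME filter w is slid
    # across x at the given stride -- this is weight sharing.
    x = np.pad(x, pad)                 # zero padding on both ends
    k = len(w)
    out_len = (len(x) - k) // stride + 1
    return np.array([x[i*stride : i*stride + k] @ w
                     for i in range(out_len)])

x = np.array([1., 2., 3., 4., 5.])
w = np.array([1., 0., -1.])            # simple edge-detector filter
print(conv1d(x, w))                    # [-2. -2. -2.]  (output shrinks: 5 -> 3)
print(conv1d(x, w, pad=1))             # padding keeps the output length at 5
```

Note the whole layer uses only `len(w)` unique weights, versus a full weight matrix in an FCNN.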
Key point
Convolution is a linear operation: we slide the same window of weights over all patches and compute linear combinations.
That is, convolution is "linear" in the mathematical sense, because it only computes $\sum_{ij} w_{ij} x_{ij}$.
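The linearity claim can be checked directly: a 1D convolution is equivalent to multiplying by a banded matrix whose rows are shifted copies of the filter (a sketch, using the same illustrative filter as before):

```python
import numpy as np

def conv_matrix(w, n):
    # Build the (n-k+1) x n matrix whose rows are shifted copies of w,
    # so that convolution becomes a matrix-vector product: conv(x) = M @ x.
    k = len(w)
    M = np.zeros((n - k + 1, n))
    for i in range(n - k + 1):
        M[i, i:i+k] = w
    return M

x = np.array([1., 2., 3., 4., 5.])
w = np.array([1., 0., -1.])
M = conv_matrix(w, len(x))
print(M @ x)   # [-2. -2. -2.] -- same as sliding the filter over x
```

This `M` is exactly the "matrix with special structure" that replaces the arbitrary FFNN weight matrix.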
Max Pooling
- abstracts away exact locations, keeping the strongest local response.
- The convolution layers do the feature engineering.
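A minimal sketch of 1D max pooling (window size and stride of 2 are illustrative defaults, not from the lecture):

```python
import numpy as np

def maxpool1d(x, size=2, stride=2):
    # Keep only the strongest response in each local window:
    # this "abstracts away" the exact position of a detected feature.
    out_len = (len(x) - size) // stride + 1
    return np.array([x[i*stride : i*stride + size].max()
                     for i in range(out_len)])

h = np.array([0.1, 0.9, 0.2, 0.8, 0.3, 0.7])
print(maxpool1d(h))   # [0.9 0.8 0.7]
```

Note that pooling has no weights at all; it only summarizes the features the convolution layer produced.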
The CNN idea
Keep learning more and more complex features,
until those features are informative enough to classify with confidence.
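The stacking idea can be sketched end to end: alternate convolution, nonlinearity, and pooling so that each stage summarizes the previous one (a toy example with hand-picked filters, purely illustrative):

```python
import numpy as np

def conv1d(x, w):
    k = len(w)
    return np.array([x[i:i+k] @ w for i in range(len(x) - k + 1)])

def maxpool1d(x, size=2):
    return np.array([x[i:i+size].max()
                     for i in range(0, len(x) - size + 1, size)])

relu = lambda z: np.maximum(0.0, z)

x = np.array([0., 0., 1., 1., 1., 0., 0., 1., 0., 0.])
h = maxpool1d(relu(conv1d(x, np.array([1., -1.]))))  # layer 1: edge features
h = maxpool1d(relu(conv1d(h, np.array([1., 1.]))))   # layer 2: coarser features
print(h)   # a short vector of increasingly abstract features
```

Each stage shrinks the representation while (in a trained network) making it more task-relevant; a final fully connected layer would map `h` to class scores.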
What affects CNNs?
- Input rotations?
- Deep CNNs?
Finally, a figure to end on a happy note: