A feature map is the output of
one filter applied to the previous layer. That may sound abstract, so let me
put it more simply.
In a convolutional neural network (CNN), you look at an image through a small
window and slide that window to the right and down. That way you can find
features in that window, for example a horizontal line, a vertical line, or a
curve. What exactly the network considers an important feature is learned
during training.
Wherever you find those features, you report that in the feature maps. A certain combination of features in a certain area can signal that a larger, more complex feature exists there.
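To make the sliding window concrete, here is a minimal sketch in plain Python. The function name `feature_map`, the tiny 5 × 5 image, and the hand-written 3 × 3 horizontal-line filter are all assumptions made up for illustration; in a real CNN the filter values are learned, not written by hand.

```python
def feature_map(image, kernel):
    """Slide a square kernel over the image (stride 1, no padding)
    and return the grid of responses, i.e. the feature map."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    result = []
    for top in range(out_h):          # move the window down
        row = []
        for left in range(out_w):     # move the window to the right
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        result.append(row)
    return result


# Illustrative 5 x 5 image with one bright horizontal line in the middle row.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

# Hypothetical hand-written filter that responds to horizontal lines.
kernel = [
    [-1, -1, -1],
    [ 2,  2,  2],
    [-1, -1, -1],
]

fmap = feature_map(image, kernel)
for row in fmap:
    print(row)
```

The middle row of the resulting 3 × 3 feature map has the largest values, because that is where the window is centred on the line.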
If that still sounds confusing, here is a concrete example. In a 32 × 32 image, sliding a 5 × 5 receptive field across the input with a stride of 1 produces a feature map of 28 × 28 output values, since (32 − 5 + 1) × (32 − 5 + 1) = 28 × 28, or 784 distinct activations per image. Hope this answers your question.
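The arithmetic in that example can be checked with a short sketch. The helper name `feature_map_size` is made up for illustration, and it assumes a square input, a square receptive field, and no padding.

```python
def feature_map_size(input_size, field_size, stride=1):
    """Side length of the feature map for a square input and a square
    receptive field, assuming no padding."""
    return (input_size - field_size) // stride + 1


side = feature_map_size(32, 5, stride=1)
activations = side * side
print(side, activations)  # 28 784
```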