A Convolutional Neural Network (CNN) is a type of deep learning algorithm designed to process structured grids of data, such as images. Unlike standard fully connected neural networks, CNNs automatically learn to detect features (like edges, shapes, and objects) directly from raw pixels, without manual feature engineering.
A. Input Layer
- Takes raw image data as input.
- Represented as a 3D matrix: (Height × Width × Channels), e.g., 3 channels for an RGB image.
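As a minimal sketch of that representation (using NumPy, with random values standing in for real pixel data):

```python
import numpy as np

# A 32x32 RGB image as a 3D array: (Height, Width, Channels).
# The values here are random stand-ins for pixel intensities (0-255).
image = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)

print(image.shape)  # (32, 32, 3)
print(image.ndim)   # 3 axes: height, width, channels
```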
B. Convolutional Layer (The Brain)
- Purpose: To extract features from the input image.
- Operation: A small matrix called a Filter (or Kernel) slides (convolves) over the image. It performs element-wise multiplication and sums up the results to create a Feature Map.
- Terminology:
- Stride: The number of pixels the filter moves at each step.
- Padding: Adding zeros around the image border, typically to keep the output the same size as the input.
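The sliding-window operation above can be sketched in plain NumPy. (Note: like most deep learning libraries, this computes cross-correlation, which is what "convolution" means in practice in CNNs; the kernel values are an arbitrary toy filter.)

```python
import numpy as np

def convolve2d(image, kernel, stride=1, padding=0):
    """Slide a kernel over a 2D image: element-wise multiply, then sum."""
    if padding > 0:
        image = np.pad(image, padding)  # zero-pad the border
    kh, kw = kernel.shape
    ih, iw = image.shape
    out_h = (ih - kh) // stride + 1
    out_w = (iw - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)  # one feature-map value
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1., 0.],
                   [0., -1.]])  # toy diagonal-difference filter
fmap = convolve2d(image, kernel)
print(fmap.shape)  # (3, 3): a 2x2 filter over a 4x4 image, stride 1, no padding
```

Increasing the stride shrinks the feature map; adding padding grows it back, which is how "same" padding preserves the input size.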
C. Activation Layer (ReLU)
- Purpose: To introduce non-linearity into the network (since real-world data is non-linear).
- Function: The most common is ReLU (Rectified Linear Unit), f(x) = max(0, x). It converts all negative values to zero and leaves positive values unchanged.
D. Pooling Layer (Downsampling)
- Purpose: To reduce the dimensionality (size) of the feature maps while keeping the most important information. This lowers the computational cost and helps reduce overfitting.
- Types:
- Max Pooling: Picks the maximum value from each window of the feature map (most common).
- Average Pooling: Calculates the average value of each window.
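A minimal max-pooling sketch, using the usual 2×2 window with stride 2 (which halves each spatial dimension):

```python
import numpy as np

def max_pool(fmap, size=2, stride=2):
    """Keep only the maximum value in each size x size window."""
    h, w = fmap.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = fmap[i*stride:i*stride+size, j*stride:j*stride+size]
            out[i, j] = window.max()
    return out

fmap = np.array([[1., 3., 2., 4.],
                 [5., 6., 1., 2.],
                 [7., 2., 9., 0.],
                 [1., 8., 3., 4.]])
print(max_pool(fmap))  # [[6. 4.] [8. 9.]]
```

Swapping `window.max()` for `window.mean()` turns this into average pooling.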
E. Flattening
- Purpose: Converts the final stack of 2D feature maps into a single 1D vector. This vector is then fed into a standard fully connected network.
F. Fully Connected (FC) & Output Layer
- FC Layer: Connects every neuron in one layer to every neuron in the next. It performs the final classification based on the features extracted earlier.
- Softmax/Sigmoid: Activation functions used in the final layer to produce probability scores (e.g., a 90% chance the image is a "Cat").
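Flattening, the FC layer, and softmax can be sketched together. The weights below are random stand-ins; in a trained network they would be learned:

```python
import numpy as np

def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

# Suppose pooling left a 2x2 feature map; flatten it into a 1D vector.
fmap = np.array([[6., 4.],
                 [8., 9.]])
x = fmap.flatten()                 # shape (4,)

# An FC layer is a weight matrix applied to that vector, plus a bias.
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 4))        # 2 output classes (e.g., cat vs. dog)
b = np.zeros(2)
scores = W @ x + b

probs = softmax(scores)
print(probs.sum())  # 1.0 -- a probability distribution over the classes
```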
How a CNN Works
- Forward Propagation: The image passes through the Convolution, ReLU, and Pooling layers. Deeper layers learn progressively more complex features (e.g., early layers detect edges; deep layers detect faces).
- Loss Function: The model compares its prediction to the actual label and measures the error (e.g., it predicted "Dog" but the label was "Cat").
- Backpropagation: The error is sent back through the network to adjust the weights of the filters.
- Optimization: The process repeats for thousands of iterations until the error (loss) is minimized.
Input → [Conv + ReLU] → [Pooling] → [Flatten] → [Fully Connected] → [Output]
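The forward → loss → backpropagation → update cycle above can be sketched on a single dense layer with softmax cross-entropy (a deliberately simplified stand-in for a full CNN; the 4-dimensional inputs play the role of flattened features, and the labels are synthetic):

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy data: 100 samples of 4 "flattened features", 2 classes.
X = rng.normal(size=(100, 4))
true_W = np.array([[1., -2., 0.5, 0.],
                   [-1., 2., 0., 0.5]])
y = np.argmax(X @ true_W.T, axis=1)          # synthetic labels

W = np.zeros((2, 4))
lr = 0.5
for step in range(200):
    # Forward propagation: raw scores -> softmax probabilities
    scores = X @ W.T
    scores -= scores.max(axis=1, keepdims=True)
    probs = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)

    # Loss: cross-entropy between prediction and the true label
    loss = -np.log(probs[np.arange(len(y)), y]).mean()

    # Backpropagation: gradient of the loss w.r.t. the weights
    grad = probs.copy()
    grad[np.arange(len(y)), y] -= 1
    dW = grad.T @ X / len(y)

    # Optimization: nudge the weights against the gradient
    W -= lr * dW

accuracy = (np.argmax(X @ W.T, axis=1) == y).mean()
print(f"final loss {loss:.3f}, accuracy {accuracy:.2f}")
```

In a real CNN the same cycle runs, but the gradients flow back through the pooling, ReLU, and convolution layers as well, updating every filter.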