This is part of the CNN Architectures series by Dimitris Katsios. Find all CNN Architectures online:
- Notebooks: MLT GitHub
- Video tutorials: YouTube
SqueezeNet
We will use the tensorflow.keras Functional API to build SqueezeNet from the original paper: “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size” by Forrest N. Iandola, Song Han, Matthew W. Moskewicz, Khalid Ashraf, William J. Dally, Kurt Keutzer.
In the paper we can read:
[i] “[…] we implement our expand layer with two separate convolution layers: a layer with 1×1 filters, and a layer with 3×3 filters. Then, we concatenate the outputs of these layers together in the channel dimension.”
We will also make use of Table [ii], as well as Diagrams [iii] and [iv], from the paper.
Network architecture
Based on [ii], the network:
- starts with a Convolution-MaxPool block
- continues with a series of Fire blocks separated by MaxPool layers
- finishes with Convolution and Average Pool layers.
Notice that there is no Fully Connected layer in the model, which means that the network can, in principle, process different image sizes (the code below fixes the input to 224x224x3).
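To illustrate why the missing Fully Connected layer matters, here is a small sketch in plain Python. The helpers `conv_params` and `dense_params` are made up for this illustration and are not Keras functions, and the feature-map sizes in the loop are assumed values. The sketch compares a 1×1 convolution head, whose weight count does not depend on the feature-map size, with a hypothetical dense head, whose weight count does.

```python
def conv_params(kernel_size, in_channels, out_channels):
    """Weights plus biases of a 2D convolution (independent of image size)."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


def dense_params(height, width, channels, units):
    """Weights plus biases of a dense layer applied on a flattened feature map."""
    return height * width * channels * units + units


# Assumed feature-map sizes, each with 512 channels.
for size in (13, 14, 20):
    conv_head = conv_params(1, 512, 1000)
    dense_head = dense_params(size, size, 512, 1000)
    print(f"{size}x{size}: conv head {conv_head:,} vs dense head {dense_head:,}")
```

The convolution head keeps the same number of weights for every size, while the dense head would have to be rebuilt whenever the input size changes.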
Fire block
The Fire block is depicted at [iii] and consists of:
- a 1×1 Convolution layer that outputs the squeezed tensor
- a 1×1 Convolution layer and a 3×3 Convolution layer applied on the squeezed tensor, the outputs of which are then concatenated as described in [i]
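As a rough sketch of what this structure costs, the parameters of one Fire block can be counted by hand. The helper names below (`conv_params`, `fire_params`, `fire_out_channels`) are hypothetical and only serve this illustration. The example uses the fire2 configuration (16 squeeze and 64 expand filters) on a 96-channel input, which is what the first Convolution-MaxPool block produces.

```python
def conv_params(kernel_size, in_channels, out_channels):
    """Weights plus biases of a 2D convolution."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


def fire_params(in_channels, squeeze_filters, expand_filters):
    """Parameters of the squeeze layer plus both expand layers."""
    squeeze = conv_params(1, in_channels, squeeze_filters)
    expand_1x1 = conv_params(1, squeeze_filters, expand_filters)
    expand_3x3 = conv_params(3, squeeze_filters, expand_filters)
    return squeeze + expand_1x1 + expand_3x3


def fire_out_channels(expand_filters):
    """Both expand outputs are concatenated along the channel axis."""
    return 2 * expand_filters


print(fire_params(96, 16, 64))   # 11920
print(fire_out_channels(64))     # 128
```

Because the 3×3 layer only sees the 16 squeezed channels instead of all 96, it stays cheap, which is the point of the squeeze step.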
Workflow
We will:
- import the necessary layers
- write a helper function for the Fire block ([iii])
- write the stem of the model
- use the helper function to write the main part of the model
- write the last part of the model
1. Imports
Code:
from tensorflow.keras.layers import Input, Conv2D, Concatenate, \
    MaxPool2D, GlobalAvgPool2D, Activation
2. Fire block
Next, we will write the Fire block function.
This function will:
- take as inputs:
  - a tensor (x)
  - the filters of the 1st 1×1 Convolution layer (squeeze_filters)
  - the filters of the 2nd 1×1 Convolution and the 3×3 Convolution layers (expand_filters)
- run:
  - apply a 1×1 conv operation on x to get the squeezed tensor
  - apply a 1×1 conv and a 3×3 conv operation on squeezed
  - Concatenate these two tensors
- return the concatenated tensor
Code:
def fire_block(x, squeeze_filters, expand_filters):
    squeezed = Conv2D(filters=squeeze_filters,
                      kernel_size=1,
                      activation='relu')(x)
    expanded_1x1 = Conv2D(filters=expand_filters,
                          kernel_size=1,
                          activation='relu')(squeezed)
    expanded_3x3 = Conv2D(filters=expand_filters,
                          kernel_size=3,
                          padding='same',
                          activation='relu')(squeezed)
    output = Concatenate()([expanded_1x1, expanded_3x3])
    return output
3. Model stem
Based on [ii]:
layer name/type | output size | filter size / stride |
---|---|---|
input image | 224x224x3 | |
conv1 | 111x111x96 | 7×7/2 (x96) |
maxpool1 | 55x55x96 | 3×3/2 |
the model starts with:
- a Convolution layer with 96 filters, kernel size 7×7 and stride 2, applied on a 224x224x3 input image
- a MaxPool layer with pool size 3×3 and stride 2
Code:
input = Input([224, 224, 3])
x = Conv2D(96, 7, strides=2, padding='same', activation='relu')(input)
x = MaxPool2D(3, strides=2, padding='same')(x)
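Note that the table lists 111x111 and 55x55 as output sizes, while the code above uses padding='same'. The sketch below works through the output-size arithmetic with the usual TensorFlow rules for 'same' and 'valid' padding. The helper names `same_out` and `valid_out` are made up for this illustration.

```python
import math


def same_out(size, stride):
    """Output size with padding='same': ceil(size / stride)."""
    return math.ceil(size / stride)


def valid_out(size, kernel, stride):
    """Output size with padding='valid': floor((size - kernel) / stride) + 1."""
    return (size - kernel) // stride + 1


conv1 = same_out(224, stride=2)        # 7x7 convolution, stride 2, 'same'
maxpool1 = same_out(conv1, stride=2)   # 3x3 max pooling, stride 2, 'same'
print(conv1, maxpool1)                 # 112 56

# For comparison: the same convolution without padding.
print(valid_out(224, kernel=7, stride=2))  # 109
```

With padding='same' the stem therefore produces 112x112 and 56x56 feature maps rather than the 111x111 and 55x55 of the table. Only the spatial sizes differ; the filter counts, and hence the number of parameters, are unaffected.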
4. Main part
Based on [ii]:
layer name/type | filter size / stride | s1x1 (#1×1 squeeze) | e1x1 (#1×1 expand) | e3x3 (#3×3 expand) |
---|---|---|---|---|
fire2 | | 16 | 64 | 64 |
fire3 | | 16 | 64 | 64 |
fire4 | | 32 | 128 | 128 |
maxpool4 | 3×3/2 | | | |
fire5 | | 32 | 128 | 128 |
fire6 | | 48 | 192 | 192 |
fire7 | | 48 | 192 | 192 |
fire8 | | 64 | 256 | 256 |
maxpool8 | 3×3/2 | | | |
fire9 | | 64 | 256 | 256 |
the model continues with:
- Fire block (fire2) with 16 squeeze and 64 expand filters
- Fire block (fire3) with 16 squeeze and 64 expand filters
- Fire block (fire4) with 32 squeeze and 128 expand filters
- a MaxPool layer (maxpool4) with pool size 3×3 and stride 2
- Fire block (fire5) with 32 squeeze and 128 expand filters
- Fire block (fire6) with 48 squeeze and 192 expand filters
- Fire block (fire7) with 48 squeeze and 192 expand filters
- Fire block (fire8) with 64 squeeze and 256 expand filters
- a MaxPool layer (maxpool8) with pool size 3×3 and stride 2
- Fire block (fire9) with 64 squeeze and 256 expand filters
Code:
x = fire_block(x, squeeze_filters=16, expand_filters=64)
x = fire_block(x, squeeze_filters=16, expand_filters=64)
x = fire_block(x, squeeze_filters=32, expand_filters=128)
x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)

x = fire_block(x, squeeze_filters=32, expand_filters=128)
x = fire_block(x, squeeze_filters=48, expand_filters=192)
x = fire_block(x, squeeze_filters=48, expand_filters=192)
x = fire_block(x, squeeze_filters=64, expand_filters=256)
x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)

x = fire_block(x, squeeze_filters=64, expand_filters=256)
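To see how the tensor shape evolves through this part, the following plain-Python sketch traces the spatial size and channel count layer by layer. It assumes a 56x56x96 input (what the stem yields with padding='same') and that every MaxPool halves the size, rounding up. The `MAIN_PART` list and the `trace_shapes` helper are hypothetical names introduced only for this illustration.

```python
import math

# (layer name, squeeze filters, expand filters); None marks a MaxPool layer.
MAIN_PART = [
    ("fire2", 16, 64), ("fire3", 16, 64), ("fire4", 32, 128),
    ("maxpool4", None, None),
    ("fire5", 32, 128), ("fire6", 48, 192),
    ("fire7", 48, 192), ("fire8", 64, 256),
    ("maxpool8", None, None),
    ("fire9", 64, 256),
]


def trace_shapes(size, channels, layers):
    """Return (name, spatial size, channels) after each layer."""
    shapes = []
    for name, squeeze, expand in layers:
        if squeeze is None:
            size = math.ceil(size / 2)   # MaxPool, stride 2, padding='same'
        else:
            channels = 2 * expand        # two expand outputs concatenated
        shapes.append((name, size, channels))
    return shapes


for name, size, channels in trace_shapes(56, 96, MAIN_PART):
    print(f"{name:9s} {size}x{size}x{channels}")
```

Under these assumptions the last Fire block outputs a 14x14x512 tensor, which is what the final Convolution layer receives.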
5. Last part
Based on [ii]:
layer name/type | filter size / stride |
---|---|
conv10 | 1×1/1 (x1000) |
avgpool10 | 13×13/1 |
the model ends with:
- a Convolution layer with 1000 filters and kernel size 1×1
- an Average Pool layer, which based on [iv] is global (it averages each channel over the whole feature map)
- a Softmax activation applied on the resulting vector of 1000 values ([iv])
Code:
x = Conv2D(filters=1000, kernel_size=1)(x)
x = GlobalAvgPool2D()(x)
output = Activation('softmax')(x)

from tensorflow.keras import Model
model = Model(input, output)
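What the last two steps compute can be sketched in plain Python on a toy tensor. This is only an illustration of the arithmetic, not how Keras implements it; the functions `global_avg_pool` and `softmax` and the 2x2x3 `toy` feature map are made up for the example.

```python
import math


def global_avg_pool(feature_map):
    """feature_map[row][col][channel] -> one mean value per channel."""
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    return [
        sum(feature_map[i][j][c] for i in range(height) for j in range(width))
        / (height * width)
        for c in range(channels)
    ]


def softmax(values):
    """Turn scores into probabilities that sum to 1."""
    shift = max(values)                       # for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 2x2 feature map with 3 channels (one channel per class).
toy = [[[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]],
       [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]]

pooled = global_avg_pool(toy)
probs = softmax(pooled)
print(pooled)   # [2.0, 0.0, 2.0]
print(probs)
```

Since the pooling averages over whatever spatial size it receives, the output always has one value per filter of the last Convolution layer.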
Check number of parameters
We can also check the total number of parameters of the model by calling count_params() on each element of model.trainable_weights and summing the results.
According to [ii] (col: #parameter before pruning), the SqueezeNet model has 1,248,424 parameters in total.
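As a cross-check that does not need TensorFlow, this total can be reproduced by hand from the filter counts in [ii]. The sketch below assumes every convolution has a bias term, as Keras does by default; `conv_params`, `FIRE_BLOCKS` and `squeezenet_params` are hypothetical names used only here.

```python
def conv_params(kernel_size, in_channels, out_channels):
    """Weights plus biases of a 2D convolution."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


# (squeeze filters, expand filters) for fire2 ... fire9.
FIRE_BLOCKS = [(16, 64), (16, 64), (32, 128), (32, 128),
               (48, 192), (48, 192), (64, 256), (64, 256)]


def squeezenet_params(num_classes=1000):
    total = conv_params(7, 3, 96)            # conv1 on an RGB image
    channels = 96
    for squeeze, expand in FIRE_BLOCKS:
        total += conv_params(1, channels, squeeze)   # squeeze 1x1
        total += conv_params(1, squeeze, expand)     # expand 1x1
        total += conv_params(3, squeeze, expand)     # expand 3x3
        channels = 2 * expand                        # concatenated output
    total += conv_params(1, channels, num_classes)   # conv10
    return total


print(squeezenet_params())   # 1248424
```

The pooling, concatenation and softmax layers have no weights, so they do not contribute to the sum.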
Code:
>>> import numpy as np
>>> import tensorflow.keras.backend as K
>>> int(np.sum([K.count_params(p) for p in model.trainable_weights]))
1248424
Final code
Code:
from tensorflow.keras import Model
from tensorflow.keras.layers import Input, Conv2D, Concatenate, \
    MaxPool2D, GlobalAvgPool2D, Activation


def fire_block(x, squeeze_filters, expand_filters):
    squeezed = Conv2D(filters=squeeze_filters,
                      kernel_size=1,
                      activation='relu')(x)
    expanded_1x1 = Conv2D(filters=expand_filters,
                          kernel_size=1,
                          activation='relu')(squeezed)
    expanded_3x3 = Conv2D(filters=expand_filters,
                          kernel_size=3,
                          padding='same',
                          activation='relu')(squeezed)
    output = Concatenate()([expanded_1x1, expanded_3x3])
    return output


input = Input([224, 224, 3])
x = Conv2D(96, 7, strides=2, padding='same', activation='relu')(input)
x = MaxPool2D(3, strides=2, padding='same')(x)

x = fire_block(x, squeeze_filters=16, expand_filters=64)
x = fire_block(x, squeeze_filters=16, expand_filters=64)
x = fire_block(x, squeeze_filters=32, expand_filters=128)
x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)

x = fire_block(x, squeeze_filters=32, expand_filters=128)
x = fire_block(x, squeeze_filters=48, expand_filters=192)
x = fire_block(x, squeeze_filters=48, expand_filters=192)
x = fire_block(x, squeeze_filters=64, expand_filters=256)
x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)

x = fire_block(x, squeeze_filters=64, expand_filters=256)

x = Conv2D(filters=1000, kernel_size=1)(x)
x = GlobalAvgPool2D()(x)
output = Activation('softmax')(x)

model = Model(input, output)