This is part of the CNN Architectures series by Dimitris Katsios. Find all CNN Architectures online:
- Notebooks: MLT GitHub
- Video tutorials: YouTube
- Support MLT on Patreon
XCEPTION
We will use the tensorflow.keras Functional API to build Xception from the original paper: “Xception: Deep Learning with Depthwise Separable Convolutions” by François Chollet. [paper]
In the paper we can read:
[i] “all Convolution and SeparableConvolution layers are followed by batch normalization [7] (not included in the diagram).”
[ii] “All SeparableConvolution layers use a depth multiplier of 1 (no depth expansion).”
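To make [ii] concrete, here is a minimal sketch in plain Python (no TensorFlow needed) comparing the weight count of a regular convolution with that of a depthwise separable one. The function names are illustrative and not part of Keras, and biases are left out.

```python
def conv_weights(kernel_size, in_channels, out_channels):
    """Weights of a regular convolution (no bias)."""
    return kernel_size * kernel_size * in_channels * out_channels


def separable_conv_weights(kernel_size, in_channels, out_channels,
                           depth_multiplier=1):
    """Weights of a depthwise separable convolution (no bias)."""
    # Depthwise step: one k x k filter per input channel (times the multiplier)
    depthwise_channels = in_channels * depth_multiplier
    depthwise = kernel_size * kernel_size * depthwise_channels
    # Pointwise step: a 1x1 convolution that mixes the channels
    pointwise = depthwise_channels * out_channels
    return depthwise + pointwise


regular = conv_weights(3, 728, 728)
separable = separable_conv_weights(3, 728, 728)
print(regular, separable)  # 4769856 536536
```

With a depth multiplier of 1, a 3x3 layer with 728 input and output channels needs roughly nine times fewer weights than its regular counterpart.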
We will also use the following Diagram [iii]:

as well as the following Table [iv], which we use to check the total number of parameters:

Network architecture
The model is separated into 3 flows, as depicted in [iii]:
- Entry flow
- Middle flow with 8 repetitions of the same block
- Exit flow
According to [i] all Convolution and Separable Convolution layers are followed by batch normalization.
Workflow
We will:
- import the necessary layers
- write one helper function for the Conv-BatchNorm block and one for the SeparableConv-BatchNorm block according to [i]
- write one function for each one of the 3 flows according to [iii]
- use these helper functions to build the model.
1. Imports
Code:
from tensorflow.keras.layers import Input, Conv2D, SeparableConv2D, \
    Add, Dense, BatchNormalization, ReLU, MaxPool2D, GlobalAvgPool2D
2.1. Conv-BatchNorm block
The Conv-BatchNorm block will:
- take as inputs:
  - a tensor (x)
  - the number of filters of the Convolution layer (filters)
  - the kernel size of the Convolution layer (kernel_size)
  - the strides of the Convolution layer (strides)
- run:
  - apply a Convolution layer to x
  - apply a Batch Normalization layer to this tensor
- return the tensor
Code:
def conv_bn(x, filters, kernel_size, strides=1):
    x = Conv2D(filters=filters,
               kernel_size=kernel_size,
               strides=strides,
               padding='same',
               use_bias=False)(x)
    x = BatchNormalization()(x)
    return x
Note: We set use_bias=False so that the final number of parameters matches the one reported in [iv].
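As a rough check of the note above, the trainable parameters of one Conv-BatchNorm block can be counted by hand. This sketch assumes the Keras defaults for BatchNormalization (a trainable scale and offset per channel, with the moving statistics not trainable); the function name is hypothetical.

```python
def conv_bn_trainable_params(kernel_size, in_channels, out_channels,
                             use_bias=False):
    """Trainable parameters of a Conv2D followed by BatchNormalization."""
    conv = kernel_size * kernel_size * in_channels * out_channels
    if use_bias:
        conv += out_channels  # one bias per filter
    batch_norm = 2 * out_channels  # gamma (scale) and beta (offset)
    return conv + batch_norm


# First layer of the network: 3x3 convolution, 3 input channels, 32 filters
first_layer = conv_bn_trainable_params(3, 3, 32)
with_bias = conv_bn_trainable_params(3, 3, 32, use_bias=True)
print(first_layer, with_bias)  # 928 960
```

A bias would add one parameter per filter, while the offset of the Batch Normalization layer that follows already plays that role.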
2.2. SeparableConv-BatchNorm
The SeparableConv-BatchNorm block has a similar structure to the Conv-BatchNorm one.
Code:
def sep_bn(x, filters, kernel_size, strides=1):
    x = SeparableConv2D(filters=filters,
                        kernel_size=kernel_size,
                        strides=strides,
                        padding='same',
                        use_bias=False)(x)
    x = BatchNormalization()(x)
    return x
3.1. Entry flow

Code:
def entry_flow(x):
    x = conv_bn(x, filters=32, kernel_size=3, strides=2)
    x = ReLU()(x)
    x = conv_bn(x, filters=64, kernel_size=3)
    tensor = ReLU()(x)

    # Block 1: two separable convolutions plus a strided 1x1 shortcut
    x = sep_bn(tensor, filters=128, kernel_size=3)
    x = ReLU()(x)
    x = sep_bn(x, filters=128, kernel_size=3)
    x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)

    tensor = conv_bn(tensor, filters=128, kernel_size=1, strides=2)
    # Keep the sum in `tensor` so the next shortcut branches from it, as in [iii]
    tensor = Add()([tensor, x])

    # Block 2
    x = ReLU()(tensor)
    x = sep_bn(x, filters=256, kernel_size=3)
    x = ReLU()(x)
    x = sep_bn(x, filters=256, kernel_size=3)
    x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)

    tensor = conv_bn(tensor, filters=256, kernel_size=1, strides=2)
    tensor = Add()([tensor, x])

    # Block 3
    x = ReLU()(tensor)
    x = sep_bn(x, filters=728, kernel_size=3)
    x = ReLU()(x)
    x = sep_bn(x, filters=728, kernel_size=3)
    x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)

    tensor = conv_bn(tensor, filters=728, kernel_size=1, strides=2)
    x = Add()([tensor, x])
    return x
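To follow the spatial size of the tensor through the entry flow, a small sketch can apply the rule for 'same' padding, which all layers in this tutorial use: a layer with stride s maps a spatial size n to ceil(n / s). The helper name below is illustrative.

```python
import math


def same_padding_size(size, stride):
    """Spatial output size of a layer with padding='same'."""
    return math.ceil(size / stride)


size = 299
trace = [size]
# Layers of the entry flow that use stride 2:
# the first convolution, then the three max-pooling layers
for stride in (2, 2, 2, 2):
    size = same_padding_size(size, stride)
    trace.append(size)
print(trace)  # [299, 150, 75, 38, 19]
```

The 1x1 shortcut convolutions also use stride 2 with 'same' padding, so both inputs of each Add layer have the same shape, and the entry flow hands a 19x19x728 tensor to the middle flow.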
3.2. Middle flow

Code:
def middle_flow(tensor):
    # 8 identical residual blocks of three separable convolutions each
    for _ in range(8):
        x = ReLU()(tensor)
        x = sep_bn(x, filters=728, kernel_size=3)
        x = ReLU()(x)
        x = sep_bn(x, filters=728, kernel_size=3)
        x = ReLU()(x)
        x = sep_bn(x, filters=728, kernel_size=3)
        tensor = Add()([tensor, x])
    return tensor
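A quick hand count shows how much of the model sits in the middle flow. The sketch below assumes a depth multiplier of 1, no bias, and two trainable Batch Normalization parameters per channel, as in the helper functions of this tutorial; the function name is hypothetical.

```python
def sep_bn_trainable_params(kernel_size, in_channels, out_channels):
    """Trainable parameters of SeparableConv2D + BatchNormalization."""
    depthwise = kernel_size * kernel_size * in_channels  # depth multiplier 1
    pointwise = in_channels * out_channels               # 1x1 convolution
    batch_norm = 2 * out_channels                        # gamma and beta
    return depthwise + pointwise + batch_norm


per_layer = sep_bn_trainable_params(3, 728, 728)
middle_flow_total = 8 * 3 * per_layer  # 8 blocks, 3 layers per block
print(per_layer, middle_flow_total)  # 537992 12911808
```

That is more than half of the 22,855,952 trainable parameters listed in [iv].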
3.3. Exit flow

Code:
def exit_flow(tensor):
    x = ReLU()(tensor)
    x = sep_bn(x, filters=728, kernel_size=3)
    x = ReLU()(x)
    x = sep_bn(x, filters=1024, kernel_size=3)
    x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)

    tensor = conv_bn(tensor, filters=1024, kernel_size=1, strides=2)
    x = Add()([tensor, x])

    x = sep_bn(x, filters=1536, kernel_size=3)
    x = ReLU()(x)
    x = sep_bn(x, filters=2048, kernel_size=3)
    x = ReLU()(x)

    # Classifier head
    x = GlobalAvgPool2D()(x)
    x = Dense(units=1000, activation='softmax')(x)
    return x
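The exit flow ends by collapsing the spatial dimensions before the classifier. As an illustration only (plain Python on nested lists, not the Keras implementation), global average pooling reduces an H x W x C feature map to C numbers, one mean per channel.

```python
def global_avg_pool(feature_map):
    """Average an H x W x C nested list over H and W, returning C values."""
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    totals = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                totals[c] += value
    return [total / (height * width) for total in totals]


# A toy 2 x 2 feature map with 3 channels
feature_map = [
    [[1.0, 10.0, 0.0], [3.0, 20.0, 0.0]],
    [[5.0, 30.0, 0.0], [7.0, 40.0, 4.0]],
]
print(global_avg_pool(feature_map))  # [4.0, 25.0, 1.0]
```

In the exit flow this turns the final feature map with 2048 channels into a vector of 2048 values, so the Dense layer holds 2048 * 1000 weights plus 1000 biases, that is 2,049,000 parameters.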
4. Model code
Code:
from tensorflow.keras import Model

input = Input(shape=[299, 299, 3])

x = entry_flow(input)
x = middle_flow(x)
output = exit_flow(x)

model = Model(input, output)
Check number of parameters
We can also check the total number of trainable parameters of the model by calling count_params() on each element of model.trainable_weights.
According to [iv] there are 22,855,952 trainable parameters in the Xception model.
Code:
>>> import numpy as np
>>> import tensorflow.keras.backend as K
>>> np.sum([K.count_params(p) for p in model.trainable_weights])
22855952
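The same number can be reproduced by hand, without building the model, by summing the trainable parameters layer by layer. This is a sketch under the assumptions used throughout this tutorial (no bias in the convolutions, depth multiplier 1, two trainable Batch Normalization parameters per channel); the helper names are hypothetical.

```python
def conv_bn_params(k, c_in, c_out):
    """Trainable parameters of Conv2D (no bias) + BatchNormalization."""
    return k * k * c_in * c_out + 2 * c_out


def sep_bn_params(k, c_in, c_out):
    """Trainable parameters of SeparableConv2D (no bias) + BatchNormalization."""
    return k * k * c_in + c_in * c_out + 2 * c_out


entry = (
    conv_bn_params(3, 3, 32) + conv_bn_params(3, 32, 64)
    + sep_bn_params(3, 64, 128) + sep_bn_params(3, 128, 128)
    + conv_bn_params(1, 64, 128)      # shortcut of block 1
    + sep_bn_params(3, 128, 256) + sep_bn_params(3, 256, 256)
    + conv_bn_params(1, 128, 256)     # shortcut of block 2
    + sep_bn_params(3, 256, 728) + sep_bn_params(3, 728, 728)
    + conv_bn_params(1, 256, 728)     # shortcut of block 3
)
middle = 8 * 3 * sep_bn_params(3, 728, 728)
exit_ = (
    sep_bn_params(3, 728, 728) + sep_bn_params(3, 728, 1024)
    + conv_bn_params(1, 728, 1024)    # shortcut of the exit block
    + sep_bn_params(3, 1024, 1536) + sep_bn_params(3, 1536, 2048)
    + 2048 * 1000 + 1000              # Dense layer: weights and biases
)
total = entry + middle + exit_
print(entry, middle, exit_, total)  # 1106760 12911808 8837384 22855952
```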
Final code
Code:
from tensorflow.keras import Model
from tensorflow.keras.layers import Input, Conv2D, SeparableConv2D, \
    Add, Dense, BatchNormalization, ReLU, MaxPool2D, GlobalAvgPool2D


def conv_bn(x, filters, kernel_size, strides=1):
    x = Conv2D(filters=filters,
               kernel_size=kernel_size,
               strides=strides,
               padding='same',
               use_bias=False)(x)
    x = BatchNormalization()(x)
    return x


def sep_bn(x, filters, kernel_size, strides=1):
    x = SeparableConv2D(filters=filters,
                        kernel_size=kernel_size,
                        strides=strides,
                        padding='same',
                        use_bias=False)(x)
    x = BatchNormalization()(x)
    return x


def entry_flow(x):
    x = conv_bn(x, filters=32, kernel_size=3, strides=2)
    x = ReLU()(x)
    x = conv_bn(x, filters=64, kernel_size=3)
    tensor = ReLU()(x)

    # Block 1: two separable convolutions plus a strided 1x1 shortcut
    x = sep_bn(tensor, filters=128, kernel_size=3)
    x = ReLU()(x)
    x = sep_bn(x, filters=128, kernel_size=3)
    x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)

    tensor = conv_bn(tensor, filters=128, kernel_size=1, strides=2)
    # Keep the sum in `tensor` so the next shortcut branches from it, as in [iii]
    tensor = Add()([tensor, x])

    # Block 2
    x = ReLU()(tensor)
    x = sep_bn(x, filters=256, kernel_size=3)
    x = ReLU()(x)
    x = sep_bn(x, filters=256, kernel_size=3)
    x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)

    tensor = conv_bn(tensor, filters=256, kernel_size=1, strides=2)
    tensor = Add()([tensor, x])

    # Block 3
    x = ReLU()(tensor)
    x = sep_bn(x, filters=728, kernel_size=3)
    x = ReLU()(x)
    x = sep_bn(x, filters=728, kernel_size=3)
    x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)

    tensor = conv_bn(tensor, filters=728, kernel_size=1, strides=2)
    x = Add()([tensor, x])
    return x


def middle_flow(tensor):
    # 8 identical residual blocks of three separable convolutions each
    for _ in range(8):
        x = ReLU()(tensor)
        x = sep_bn(x, filters=728, kernel_size=3)
        x = ReLU()(x)
        x = sep_bn(x, filters=728, kernel_size=3)
        x = ReLU()(x)
        x = sep_bn(x, filters=728, kernel_size=3)
        tensor = Add()([tensor, x])
    return tensor


def exit_flow(tensor):
    x = ReLU()(tensor)
    x = sep_bn(x, filters=728, kernel_size=3)
    x = ReLU()(x)
    x = sep_bn(x, filters=1024, kernel_size=3)
    x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)

    tensor = conv_bn(tensor, filters=1024, kernel_size=1, strides=2)
    x = Add()([tensor, x])

    x = sep_bn(x, filters=1536, kernel_size=3)
    x = ReLU()(x)
    x = sep_bn(x, filters=2048, kernel_size=3)
    x = ReLU()(x)

    # Classifier head
    x = GlobalAvgPool2D()(x)
    x = Dense(units=1000, activation='softmax')(x)
    return x


input = Input(shape=[299, 299, 3])

x = entry_flow(input)
x = middle_flow(x)
output = exit_flow(x)

model = Model(input, output)