This is part of the CNN Architectures series by Dimitris Katsios. Find all CNN Architectures online:

- Notebooks: MLT GitHub
- Video tutorials: YouTube
- Support MLT on Patreon

## DenseNet

We will use the tensorflow.keras Functional API to build **DenseNet** from the original paper: “Densely Connected Convolutional Networks” by Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger.

In the paper we can read:

[i]“Note that each “conv” layer shown in the table corresponds [to] the sequence BN-ReLU-Conv.”

[ii]“[…] we combine features by concatenating them. Hence, the $\ell^{th}$ layer has $\ell$ inputs, consisting of the feature-maps of all preceding convolutional blocks.”

[iii]“If each function $H_\ell$ produces $k$ feature-maps, it follows that the $\ell^{th}$ layer has $k_0 + k \times (\ell - 1)$ input feature-maps, where $k_0$ is the number of channels in the input layer.”

[iv]“The initial convolution layer comprises 2k convolutions of size 7×7 with stride 2”

[v]“In our experiments, we let each 1×1 convolution produce 4k feature-maps.”

[vi]“If a dense block contains $m$ feature-maps, we let the following transition layer generate $\lfloor \theta m \rfloor$ output feature-maps, where $0 < \theta \le 1$ is referred to as the compression factor. […] we set $\theta = 0.5$ in our experiment.”

We will also make use of Table **[vii]** and Diagram **[viii]** from the paper.

## Network architecture

We will implement the DenseNet-121 ($k = 32$) version of the model (marked with red in **[vii]**).

The model:

- starts with a Convolution-Pooling block
- continues with a series of:
  - *Dense block*
  - *Transition layer*
- closes with a *Global Average Pool* and a *Fully-connected* block.

In every Dense block the input tensor passes through a series of *conv* operations with a fixed number of filters (*k*), and the result of each one is concatenated to the original tensor (**[ii]**). Thus the number of feature maps of the input tensor grows arithmetically at every internal stage of the Dense block, by *k* feature maps per stage (**[iii]**).
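To make the arithmetic of **[iii]** concrete, here is a quick sketch (the numbers assume the DenseNet-121 stem, which outputs $k_0 = 64$ feature maps, and its first Dense block with 6 repetitions):

```python
k0, k = 64, 32         # channels entering the block, growth rate
for l in range(1, 7):  # the 6 conv pairs of the first dense block
    print(f"stage {l}: {k0 + k * (l - 1)} input feature maps")
# after the block the tensor carries k0 + 6*k = 256 feature maps
```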

To keep the size of the tensor manageable, the model makes use of *Transition layers*.

At each *Transition layer* the number of feature maps of the input tensor is halved (multiplied by $\theta = 0.5$) (**[vi]**).

The spatial dimensions of the input tensor are also halved, by an *Average Pool* layer (**[vii]**). For example, a 56×56×256 tensor leaves the transition layer as a 28×28×128 tensor.

### Dense block

Each Dense block consists of repetitions of the following pair of conv blocks:

- 1×1 conv with $4\cdot k$ filters
- 3×3 conv with $k$ filters

As it is written in **[i]**:

each “conv” layer corresponds [to] the sequence BN-ReLU-Conv

## Workflow

We will:

- import the necessary layers
- write the *BN-ReLU-Conv* function (**[i]**)
- write the *dense_block()* function
- write the *transition_layer()* function
- use the functions to build the model

### 1. Imports

**Code:**

```python
import tensorflow
from tensorflow.keras.layers import Input, BatchNormalization, ReLU, \
    Conv2D, Dense, MaxPool2D, AvgPool2D, GlobalAvgPool2D, Concatenate
```

### 2. BN-ReLU-Conv function

The *BN-ReLU-Conv* function will:

- take as inputs:
  - a tensor (`x`)
  - the number of filters for the *Convolution layer* (`filters`)
  - the kernel size of the *Convolution layer* (`kernel_size`)
- run:
  - apply *Batch Normalization* to `x`
  - apply *ReLU* to this tensor
  - apply a *Convolution* operation to this tensor
- return the final tensor

**Code:**

```python
def bn_rl_conv(x, filters, kernel_size):
    # the BN-ReLU-Conv sequence of [i]
    x = BatchNormalization()(x)
    x = ReLU()(x)
    x = Conv2D(filters=filters, kernel_size=kernel_size, padding='same')(x)
    return x
```
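A quick shape check (a minimal sketch using the imports from step 1; the 56×56×64 input shape is an assumption mimicking the tensor after the stem of the network):

```python
inp = Input((56, 56, 64))
out = bn_rl_conv(inp, filters=128, kernel_size=3)
print(out.shape)  # (None, 56, 56, 128): padding='same' keeps the spatial size
```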

### 3. Dense block

We can use this function to write the *Dense block* function.

This function will:

- take as inputs:
  - a tensor (`tensor`)
  - the number of filters of the conv operations (`k`)
  - how many times the conv operations will be applied (`reps`)
- run `reps` times:
  - apply the 1×1 conv operation with $4\cdot k$ filters (**[v]**)
  - apply the 3×3 conv operation with $k$ filters (**[iii]**)
  - *Concatenate* this tensor with the input `tensor`
- return as output the final tensor

**Code:**

```python
def dense_block(tensor, k, reps):
    for _ in range(reps):
        x = bn_rl_conv(tensor, filters=4*k, kernel_size=1)  # bottleneck layer [v]
        x = bn_rl_conv(x, filters=k, kernel_size=3)
        tensor = Concatenate()([tensor, x])                 # dense connectivity [ii]
    return tensor
```
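We can verify the arithmetic growth of **[iii]** on a dummy input (a sketch; the 56×56×64 input shape is assumed, as above):

```python
inp = Input((56, 56, 64))
out = dense_block(inp, k=32, reps=6)
print(out.shape)  # (None, 56, 56, 256): 64 + 6*32 feature maps
```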

### 4. Transition layer

Next, we will write a function for the transition layer.

This function will:

- take as inputs:
  - a tensor (`x`)
  - the compression factor (`theta`)
- run:
  - apply the 1×1 conv operation with `theta` times the existing number of filters (**[vi]**)
  - apply an *Average Pool* layer with pool size 2 and stride 2 (**[vii]**)
- return as output the final tensor

Since the number of filters of the input tensor is not known a priori (without computations or hard-coded numbers), we can get this number using the `tensorflow.keras.backend.int_shape()` function, which returns the shape of a tensor as a tuple of integers.

In our case we are interested in the number of feature maps/filters, i.e. the last element (`[-1]`, channels-last format).
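For example (a minimal illustration; the shape values are assumptions, not numbers from the paper):

```python
x = Input((56, 56, 256))
print(tensorflow.keras.backend.int_shape(x))      # (None, 56, 56, 256)
print(tensorflow.keras.backend.int_shape(x)[-1])  # 256: the number of feature maps
```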

**Code:**

```python
def transition_layer(x, theta):
    # compress the number of feature maps by theta [vi] ...
    f = int(tensorflow.keras.backend.int_shape(x)[-1] * theta)
    x = bn_rl_conv(x, filters=f, kernel_size=1)
    # ... and halve the spatial dimensions [vii]
    x = AvgPool2D(pool_size=2, strides=2, padding='same')(x)
    return x
```
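A quick sanity check (a sketch; the input shape is again an assumption): for $\theta = 0.5$ both the channels and the spatial dimensions should be halved:

```python
inp = Input((56, 56, 256))
out = transition_layer(inp, theta=0.5)
print(out.shape)  # (None, 28, 28, 128)
```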

### 5. Model code

Now that we have defined our helper functions, we can write the code of the model.

The model starts with:

- a Convolution layer with $2\cdot k$ filters, 7×7 kernel size and stride 2 (**[iv]**)
- a 3×3 Max Pool layer with stride 2 (**[vii]**)

and closes with:

- a *Global Average Pool* layer
- a *Dense* layer with 1000 units and *softmax* activation (**[vii]**)

Notice that after the last *Dense block* there is no *Transition layer*. For this reason we use different variables (`d`, `x`) in the `for` loop, so that in the end we can take the output of the last *Dense block*.

**Code:**

```python
from tensorflow.keras import Model

IMG_SHAPE = 224, 224, 3
k = 32                       # growth rate
theta = 0.5                  # compression factor [vi]
repetitions = 6, 12, 24, 16  # DenseNet-121 [vii]

input = Input(IMG_SHAPE)

x = Conv2D(2*k, 7, strides=2, padding='same')(input)  # stem convolution [iv]
x = MaxPool2D(3, strides=2, padding='same')(x)

for reps in repetitions:
    d = dense_block(x, k, reps)
    x = transition_layer(d, theta)

x = GlobalAvgPool2D()(d)  # d: the output of the last dense block
output = Dense(1000, activation='softmax')(x)

model = Model(input, output)
```
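As a minimal sanity check (not part of the original code), we can confirm the output shape and inspect the layers:

```python
print(model.output_shape)  # (None, 1000): one probability per ImageNet class
model.summary()            # layer-by-layer output shapes and parameter counts
```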

## Final code

**Code:**

```python
import tensorflow
from tensorflow.keras import Model
from tensorflow.keras.layers import Input, BatchNormalization, ReLU, \
    Conv2D, Dense, MaxPool2D, AvgPool2D, GlobalAvgPool2D, Concatenate


def bn_rl_conv(x, filters, kernel_size):
    x = BatchNormalization()(x)
    x = ReLU()(x)
    x = Conv2D(filters=filters, kernel_size=kernel_size, padding='same')(x)
    return x


def dense_block(tensor, k, reps):
    for _ in range(reps):
        x = bn_rl_conv(tensor, filters=4*k, kernel_size=1)
        x = bn_rl_conv(x, filters=k, kernel_size=3)
        tensor = Concatenate()([tensor, x])
    return tensor


def transition_layer(x, theta):
    f = int(tensorflow.keras.backend.int_shape(x)[-1] * theta)
    x = bn_rl_conv(x, filters=f, kernel_size=1)
    x = AvgPool2D(pool_size=2, strides=2, padding='same')(x)
    return x


IMG_SHAPE = 224, 224, 3
k = 32
theta = 0.5
repetitions = 6, 12, 24, 16

input = Input(IMG_SHAPE)

x = Conv2D(2*k, 7, strides=2, padding='same')(input)
x = MaxPool2D(3, strides=2, padding='same')(x)

for reps in repetitions:
    d = dense_block(x, k, reps)
    x = transition_layer(d, theta)

x = GlobalAvgPool2D()(d)
output = Dense(1000, activation='softmax')(x)

model = Model(input, output)
```