What are we doing?

Sometimes we want to see the inputs and outputs of PyTorch layers to build an intuition of what they do. If I've read the docs and put a few tensors through the layer while checking the input and output shapes, generally that's enough.

But sometimes there are weird parameters that I can't get my head around, or I just want to see the layer working, so building interactive widgets helps me grow my understanding.

So in this post I'll show you how I built an interactive widget to explore PyTorch's ConvTranspose1d, while explaining a bit about the layer itself. We'll use Anaconda's HoloViz tools (HoloViews, Panel and Bokeh) for the plotting and interactivity.

The end goal is an interactive plot for playing with ConvTranspose1d's parameters and seeing the output, like the one in this tweet.

Introduction to Transposed Convolutions

Before learning about Transposed Convolutions, it's best to learn about Convolutions first. CS231n is a great resource for learning about them.

As you may know, Convolutions are often used to efficiently reduce the dimensions of the input in neural networks. In image classification tasks, for example, they are used to reduce an input image to a single class score.

Transposed Convolutions are useful when you want to grow your network in a certain dimension. For example, say you have an image segmentation task in which you want a class prediction per pixel: you can use strided Convolutions to reduce the dimensions and then grow the dimensions back to their original size with Transposed Convolutions. This is done in U-Net style architectures.

Conveniently, PyTorch has implemented ConvTranspose1d such that if you give it the same parameters as a Conv1d and pass a tensor through both in sequence, the output tensor will be the same shape as the original input tensor (provided you set output_padding correctly).
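Here's a quick sanity check of that round trip (my own minimal sketch, not a cell from the original post; the channel counts and kernel size are just illustrative):

import torch
import torch.nn as nn

channels, kernel_size, stride, padding = 6, 3, 2, 1
conv = nn.Conv1d(channels, channels, kernel_size, stride=stride, padding=padding)
# output_padding=1 recovers the length that the strided Conv1d collapsed
conv_t = nn.ConvTranspose1d(channels, channels, kernel_size, stride=stride,
                            padding=padding, output_padding=1)

x = torch.randn(1, channels, 10)                     # (batch, channels, seq_len)
y = conv_t(conv(x))
print(x.shape[-1], conv(x).shape[-1], y.shape[-1])   # seq_len: 10 -> 5 -> 10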

Imports

#collapse-hide
import torch
import torch.nn as nn
from panel.interact import interact
from panel import widgets
import panel as pn
from IPython.display import display
import holoviews as hv
from holoviews import opts
import numpy as np
hv.extension('bokeh', logo=False)

Create an image from a PyTorch Tensor

Firstly, to create an image from a 2d numpy array, we'll use the HoloViews library, imported as hv here.

There are a few little bits here to make it nicer, but hv.Image(out) alone would have worked fine.

You can skip this if you're not interested in the visualisation details. We set the bounds so we have correct axes. We use the * operator to overlay the image with hv.Labels(image) so that the values are printed on top of each pixel. We set the vdims to a fixed range so that the colours don't change between updates. We set the width to change depending on the number of pixels so it's easier to watch it grow. Lastly, we return the image in a HoloViews pane with linked_axes=False so that each plot gets its own axes.

#collapse-show
hv.extension('bokeh', logo=False)
output_dim = 4
def image(out, feature_dim=output_dim, title='', xlabel='Sequence Dimension', ylabel='Feature Dimension'):
    output_image = hv.Image(out, 
                            vdims=hv.Dimension('z', range=(0, 100)), 
                            bounds=(0, 0, out.shape[-1], feature_dim))
    layout = output_image * hv.Labels(output_image)
    layout.opts(
        hv.opts.Image(cmap='PiYG', 
                      xlabel=xlabel,
                      ylabel=ylabel,
                      title=title,
                      width=50*out.shape[-1])
    )
    return pn.pane.HoloViews(layout, linked_axes=False)

Create Predictable Input Data

Then we create some synthetic input data. I could create random data, but instead I create predictable data so it's easier to think about.

#collapse-show
seq_len = 5
input_dim = 6
input_data = torch.tensor([list(range(1, seq_len+1))]*input_dim).double()
image(input_data.detach().numpy(), feature_dim=input_dim)

Input Data

Another thing I sometimes do is set the weights of the layer itself when doing these visualisations. These would be randomly initialised and then learned by the network in practice.

#collapse-show
kernel_size = 7
weights = torch.tensor([[list(range(i, i+kernel_size)) for i in range(output_dim)]]*input_dim).double()
assert weights.shape == (input_dim, output_dim, kernel_size)
print('Weights Shape [in,out,k]: ', list(weights.shape))
bias = torch.tensor(list(range(output_dim))).double()
print('Bias: ', bias)
Weights Shape [in,out,k]:  [6, 4, 7]
Bias:  tensor([0., 1., 2., 3.], dtype=torch.float64)

For each of the input channels, we have a learned block of filters of shape Output Channels * Kernel Size. Here's one of them:

#collapse-show
image(weights[0].detach().numpy(), xlabel='Kernel Size', ylabel='Output Channels')

 Weights

 Weights Shape
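As an aside (my own quick check, not a cell from the original post): ConvTranspose1d stores its weight as (in_channels, out_channels, kernel_size), the transpose of Conv1d's (out_channels, in_channels, kernel_size) layout, which is why the assert above checks (input_dim, output_dim, kernel_size).

conv = nn.Conv1d(input_dim, output_dim, kernel_size)
conv_t = nn.ConvTranspose1d(input_dim, output_dim, kernel_size)
print('Conv1d weight shape [out,in,k]:          ', list(conv.weight.shape))    # [4, 6, 7]
print('ConvTranspose1d weight shape [in,out,k]: ', list(conv_t.weight.shape))  # [6, 4, 7]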

Interactive Sliders

To have sliders dynamically update the plot, we'll use widgets and interact from panel.

 @interact and widgets

The @interact decorator allows you to create widgets and the visualisation that depends on them at the same time. So in this case, we want widgets to control the different parameters of ConvTranspose1d. As the widgets change, conv_transpose_out is called and returns the image we defined before.

We use widgets.IntSlider to explicitly create the widgets for each parameter in the conv_transpose_out function. One interesting thing to note is that use_bias=True automatically creates a checkbox for us. Finally, we return a pn.Column to compose the input and output images.

#collapse-show
hv.extension('bokeh', logo=False)
input_dim = 6
output_dim = 4
kernel_size = 7
seq_len = 5
stride = 2
dilation = 1
use_bias = True

@interact(padding=        widgets.IntSlider(name='Padding',               start=0,end=5, step=1, value=1), 
          output_padding= widgets.IntSlider(name='Output Padding',        start=0,end=3, step=1, value=0), 
          seq_len=        widgets.IntSlider(name='Input Sequence Length', start=1,end=10,step=1, value=seq_len),
          kernel_size=    widgets.IntSlider(name='Kernel Size',           start=3,end=7, step=1, value=kernel_size),
          stride=         widgets.IntSlider(name='Stride',                start=1,end=5, step=1, value=stride),
          use_bias=True
         )
def conv_transpose_out(padding, output_padding, seq_len, kernel_size, stride, use_bias):    
    if output_padding >= stride:
        return 'Output Padding needs to be smaller than Stride'
    conv_t = nn.ConvTranspose1d(input_dim, output_dim, 
                                kernel_size=kernel_size, 
                                stride=stride, 
                                padding=padding, 
                                output_padding=output_padding, 
                                bias=use_bias)
    input_data = torch.tensor([list(range(1, seq_len+1))]*input_dim).double()
    conv_t.weight.data = weights[:,:,:kernel_size]
    if use_bias: 
        conv_t.bias.data = bias
    in_tensor = input_data[None,:,:seq_len]
    in_seq_len = in_tensor.shape[-1]
    out = conv_t(in_tensor).squeeze(0).detach().numpy()
    in_image =  image(input_data.detach().numpy(), input_data.shape[0], 'Input')
    out_image = image(out, out.shape[0], 'ConvTranspose1d Output')
    return pn.Column(in_image, out_image)
                    

If you run this code yourself, you can interact with all the parameters together. To be able to show this on a static HTML GitHub Pages blog, I needed to reduce the range of the sliders, so here's how padding and output padding affect the size of the output.

 Padding and Output Padding

You can see the padding reduces the size of the sequence dimension. As described in the PyTorch documentation:

Note: The padding argument effectively adds "dilation * (kernel_size - 1) - padding" amount of zero padding to both sides of the input

The output_padding argument adds extra size to one side of the output. From the PyTorch documentation:

Note: When stride > 1, Conv1d maps multiple input shapes to the same output shape. output_padding is provided to resolve this ambiguity by effectively increasing the calculated output shape on one side.
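To make that ambiguity concrete, here's a small check (my own sketch, not a cell from the original post): with stride=2, a Conv1d maps input lengths 7 and 8 to the same output length, and output_padding chooses which of the two the ConvTranspose1d gives back.

conv = nn.Conv1d(1, 1, kernel_size=3, stride=2, padding=1)
print(conv(torch.randn(1, 1, 7)).shape[-1])  # 4
print(conv(torch.randn(1, 1, 8)).shape[-1])  # 4 - same output length as the length-7 input
for output_padding in (0, 1):
    conv_t = nn.ConvTranspose1d(1, 1, kernel_size=3, stride=2, padding=1,
                                output_padding=output_padding)
    print(conv_t(torch.randn(1, 1, 4)).shape[-1])  # 7, then 8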

Kernel Size and Stride

And here's how the kernel size and stride affect the sequence dimension. You can see that a bigger stride increases the size of the sequence dimension, as opposed to decreasing it like with regular convolutions.
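If you'd rather predict the output length than eyeball the plot, the formula from the PyTorch docs is L_out = (L_in - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + output_padding + 1. A quick check against the defaults used above (my own sketch):

def conv_transpose1d_out_len(l_in, kernel_size, stride=1, padding=0, output_padding=0, dilation=1):
    # Output length formula from the ConvTranspose1d docs
    return (l_in - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + output_padding + 1

conv_t = nn.ConvTranspose1d(input_dim, output_dim, kernel_size=7, stride=2, padding=1)
print(conv_t(torch.randn(1, input_dim, 5)).shape[-1])                    # 13
print(conv_transpose1d_out_len(5, kernel_size=7, stride=2, padding=1))   # 13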

Play with it yourself!

If you'd like to run this yourself or create your own visualisations with different layers, all the code from this post is available on GitHub.

I personally love learning about new layers in PyTorch and finding ways to interact with them visually. What do you think about this style of visualisation? Did you learn a bit about Transposed Convolutions or about creating interactive visualisations in Python by reading this article? If so, feel free to share it, and you're also more than welcome to contact me (via Twitter) if you have any questions, comments, or feedback.

Thanks for reading! :rocket:

Follow me on Twitter here for more stuff like this.