Conv3d
- class continual.Conv3d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None, temporal_fill='zeros')
Continual 3D convolution over a spatio-temporal input signal.
Continual Convolutions were proposed by Hedegaard et al.: “Continual 3D Convolutional Neural Networks for Real-time Processing of Videos”, in ECCV (2022), https://arxiv.org/pdf/2106.00050.pdf (paper) https://www.youtube.com/watch?v=Jm2A7dVEaF4 (video).
Assuming an input of shape (B, C, T, H, W), it computes the convolution over one temporal instant t at a time where t ∈ range(T), and keeps an internal state. Two forward modes are supported here.
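The step-wise computation described above can be illustrated without the library. The following is a minimal pure-Python sketch, restricted to a single channel and to the temporal axis only (no batch or spatial dimensions). `full_temporal_conv` and `StepwiseTemporalConv` are hypothetical names that are not part of `continual`, and the buffer here stores past inputs for simplicity, which is an assumption of this sketch rather than how the module stores its state. It only shows that consuming one time step at a time with an internal state can reproduce the outputs of a regular convolution over the whole sequence.

```python
from collections import deque


def full_temporal_conv(xs, weights):
    """Regular (valid) 1D cross-correlation over the whole sequence at once."""
    k = len(weights)
    return [
        sum(w * x for w, x in zip(weights, xs[t:t + k]))
        for t in range(len(xs) - k + 1)
    ]


class StepwiseTemporalConv:
    """Hypothetical step-wise counterpart keeping the last k - 1 inputs as state."""

    def __init__(self, weights):
        self.weights = list(weights)
        # Only the k - 1 most recent time steps are needed for the next output.
        self.state = deque(maxlen=len(self.weights) - 1)

    def step(self, x):
        """Consume one time step; return an output once k steps have been seen."""
        window = list(self.state) + [x]
        self.state.append(x)
        if len(window) < len(self.weights):
            return None  # still warming up, no output yet
        return sum(w * v for w, v in zip(self.weights, window))


xs = [1.0, 2.0, 4.0, 8.0, 16.0]   # toy signal with T = 5 time steps
weights = [0.5, 0.25, 0.25]       # temporal kernel of size 3

conv = StepwiseTemporalConv(weights)
stepwise = [y for y in (conv.step(x) for x in xs) if y is not None]

# Both ways of computing yield the same T - k + 1 outputs.
assert stepwise == full_temporal_conv(xs, weights)
```

In this sketch the first `k - 1` steps produce no output because no temporal padding is applied; each later step costs one kernel-sized dot product instead of a pass over the whole sequence.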
- Parameters:
in_channels (int) – Number of channels in the input image
out_channels (int) – Number of channels produced by the convolution
stride (int or tuple, optional) – Stride of the convolution. NB: stride > 1 over the first (temporal) dimension is not supported. Default: 1
padding (int or tuple, optional) – Zero-padding added to both sides of the input along each dimension. NB: padding over the first (temporal) dimension is not supported. Default: 0
dilation (int or tuple, optional) – Spacing between kernel elements. NB: dilation > 1 over the first (temporal) dimension is not supported. Default: 1
groups (int, optional) – Number of blocked connections from input channels to output channels. Default: 1
bias (bool, optional) – If True, adds a learnable bias to the output. Default: True
temporal_fill (string, optional) – 'zeros' or 'replicate' (= “boring video”). temporal_fill determines how state is initialised and which padding is applied during forward_steps along the temporal dimension. Default: 'zeros'
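To make the two temporal_fill options concrete, here is a small sketch of one plausible reading of the description: with 'zeros' the frames assumed to precede the first real frame are all zero, while with 'replicate' they are copies of the first frame, as if the video had been static ("boring") before it started. The helper `initial_padding` is hypothetical and not part of `continual`; the module's actual state initialisation may differ in detail.

```python
def initial_padding(first_frame, kernel_t, temporal_fill):
    """Return the kernel_t - 1 frames assumed to precede the first real frame.

    Hypothetical helper for illustration only; frames are flat lists of floats.
    """
    if temporal_fill == "zeros":
        fill = [0.0] * len(first_frame)
    elif temporal_fill == "replicate":
        fill = list(first_frame)
    else:
        raise ValueError(f"unknown temporal_fill: {temporal_fill!r}")
    # One independent copy per padded time step.
    return [list(fill) for _ in range(kernel_t - 1)]


first_frame = [3.0, 5.0]  # a flattened toy "frame" with two pixels

zeros_pad = initial_padding(first_frame, kernel_t=3, temporal_fill="zeros")
replicate_pad = initial_padding(first_frame, kernel_t=3, temporal_fill="replicate")
```

With a temporal kernel size of 3, both variants yield two padding frames; only their contents differ.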
- Variables:
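weight (Tensor) – the learnable weights of the module of shape (out_channels, in_channels / groups, kernel_size[0], kernel_size[1], kernel_size[2]). The values of these weights are sampled from $\mathcal{U}(-\sqrt{k}, \sqrt{k})$ where $k = \frac{\text{groups}}{C_\text{in} \cdot \prod_{i=0}^{2} \text{kernel\_size}[i]}$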
bias (Tensor) – the learnable bias of the module of shape (out_channels). If bias is True, then the values of these weights are sampled from $\mathcal{U}(-\sqrt{k}, \sqrt{k})$ where $k = \frac{\text{groups}}{C_\text{in} \cdot \prod_{i=0}^{2} \text{kernel\_size}[i]}$
state (List[Tensor]) – a running buffer of partial computations from previous frames which are used for the calculation of subsequent outputs.
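As a worked example of the weight shape and initialisation range, the sketch below computes both for a given configuration. It assumes the conventions documented for `torch.nn.Conv3d`, namely a weight of shape (out_channels, in_channels / groups, kernel_size[0], kernel_size[1], kernel_size[2]) and uniform sampling in (-sqrt(k), sqrt(k)) with k = groups / (in_channels * prod(kernel_size)), and that this module follows them. The function names are hypothetical helpers, not library API.

```python
import math


def conv3d_weight_shape(in_channels, out_channels, kernel_size, groups=1):
    """Weight shape under the torch.nn.Conv3d convention (assumed to apply here)."""
    return (out_channels, in_channels // groups, *kernel_size)


def conv3d_init_bound(in_channels, kernel_size, groups=1):
    """Return sqrt(k), the half-width of the uniform initialisation range."""
    k = groups / (in_channels * math.prod(kernel_size))
    return math.sqrt(k)


# Example: 2 input channels, 8 output channels, a 2x2x2 kernel, no grouping.
shape = conv3d_weight_shape(in_channels=2, out_channels=8, kernel_size=(2, 2, 2))
bound = conv3d_init_bound(in_channels=2, kernel_size=(2, 2, 2))
# Here k = 1 / (2 * 8) = 1/16, so weights and bias would lie in (-0.25, 0.25).
```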