RecyclingPositionalEncoding
- class continual.RecyclingPositionalEncoding(embed_dim, num_embeds, learned=True, forward_update_index_steps=1)
Recycling Positional Encoding with learned or static weights.
Recycling Positional Encoding was proposed by Hedegaard et al. in “Continual Transformers: Redundancy-Free Attention for Online Inference” (paper: https://arxiv.org/abs/2201.06268, video: https://www.youtube.com/watch?v=gy802Tlp-eQ).
When the static encoding is selected (learned=False), the module employs the “Cyclic Positional Encoding” proposed by Ma et al. in “Learning to Iteratively Solve Routing Problems with Dual-Aspect Collaborative Transformer” (https://arxiv.org/abs/2110.02544).
- Parameters:
embed_dim (int) – dimensionality of positional embeddings.
num_embeds (int) – number of embeddings to recycle among.
learned (bool) – whether embeddings should be learned or static sinusoidal.
forward_update_index_steps (int) – the number of index steps by which to offset the encoding query each time forward is called. This ensures that positional encodings have a new starting position at each call (see the sketch after this list).
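A minimal sketch (not part of the original docstring) of the effect of forward_update_index_steps; the parameter values and the printed comparison are illustrative assumptions based on the description above:

    import torch
    from continual import RecyclingPositionalEncoding

    # With forward_update_index_steps=1, the starting index of the recycled
    # encoding advances by one at every call to forward.
    pe = RecyclingPositionalEncoding(
        embed_dim=10,
        num_embeds=16 * 2 - 1,
        forward_update_index_steps=1,
    )

    x = torch.zeros((1, 10, 16))  # (B, C, T)
    o_first = pe.forward(x)
    o_second = pe.forward(x)

    # The same input receives a shifted positional encoding on the second call,
    # so the two outputs generally differ (with the default learned embeddings).
    print(torch.equal(o_first, o_second))  # expected: False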
Examples:
    import torch
    from continual import RecyclingPositionalEncoding

    pe = RecyclingPositionalEncoding(
        embed_dim=10,
        num_embeds=16 * 2 - 1,
        forward_update_index_steps=0,
    )

    x = torch.zeros((1, 10, 16))  # (B, C, T)

    # Regular forward over the full sequence
    o_forward = pe.forward(x)

    # Step-wise processing: all but the last step, then the last step alone
    o_forward_steps = pe.forward_steps(x[:, :, :-1])
    o_forward_step = pe.forward_step(x[:, :, -1])

    assert torch.equal(o_forward[:, :, :-1], o_forward_steps)
    assert torch.equal(o_forward[:, :, -1], o_forward_step)
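As a complementary sketch (not part of the original example), the static variant can be selected with learned=False; per the note above, this uses the Cyclic Positional Encoding of Ma et al. instead of learned embeddings, while the interface stays the same. The parameter values here are illustrative assumptions:

    import torch
    from continual import RecyclingPositionalEncoding

    # Static (non-learned) sinusoidal variant.
    pe_static = RecyclingPositionalEncoding(
        embed_dim=10,
        num_embeds=16 * 2 - 1,
        learned=False,
        forward_update_index_steps=0,
    )

    x = torch.zeros((1, 10, 16))  # (B, C, T)
    o = pe_static.forward(x)
    print(o.shape)  # output keeps the input shape: torch.Size([1, 10, 16])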