
Transformer-Based Image Compression Without Positional Encoding
Bouzid Arezki 1, Anissa Mokraoui 1, Fangchen Feng 1
1 : Laboratoire de Traitement et Transport de l'Information, Université Sorbonne Paris Nord

In this paper, we address the image compression problem and introduce the Swin Non-Positional Encoding (SwinNPE) transformer. SwinNPE improves the efficiency of the Swin Transformer (SwinT) while reducing the number of model parameters. We generalize the Swin cell and propose the Swin convolutional block, which better handles the local correlation between image patches. Additionally, the Swin convolutional block captures the local context between tokens without relying on positional encoding, thereby reducing model complexity. Preliminary results on the Kodak dataset show that SwinNPE outperforms state-of-the-art CNN-based architectures in terms of the rate-distortion trade-off, achieving results comparable to SwinT with 16% less computational complexity.
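
The abstract describes an attention block in which positional encoding is replaced by a convolutional mechanism that supplies local context between tokens. The PyTorch sketch below illustrates this general idea only: the class name SwinConvBlock, the use of a depthwise 3x3 convolution as the local-context branch, and all hyper-parameters are illustrative assumptions, not the authors' implementation.

    # Minimal sketch of window attention without positional encoding,
    # where a depthwise convolution provides local (positional) context.
    # All names and hyper-parameters are illustrative assumptions.
    import torch
    import torch.nn as nn


    class SwinConvBlock(nn.Module):
        def __init__(self, dim=96, window_size=8, num_heads=3):
            super().__init__()
            self.window_size = window_size
            self.norm1 = nn.LayerNorm(dim)
            # Plain multi-head attention: no relative position bias table.
            self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            # Depthwise 3x3 convolution injecting the local neighbourhood
            # information that positional encoding would otherwise provide.
            self.local_conv = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)
            self.norm2 = nn.LayerNorm(dim)
            self.mlp = nn.Sequential(
                nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

        def forward(self, x):            # x: (B, C, H, W), H and W divisible by window_size
            B, C, H, W = x.shape
            w = self.window_size
            # Local-context branch (replaces positional encoding).
            x = x + self.local_conv(x)
            # Partition into non-overlapping w x w windows -> (B*nW, w*w, C).
            t = x.view(B, C, H // w, w, W // w, w)
            t = t.permute(0, 2, 4, 3, 5, 1).reshape(-1, w * w, C)
            # Window self-attention without any positional bias.
            y = self.norm1(t)
            y, _ = self.attn(y, y, y)
            t = t + y
            t = t + self.mlp(self.norm2(t))
            # Reverse the window partition back to (B, C, H, W).
            t = t.view(B, H // w, W // w, w, w, C)
            return t.permute(0, 5, 1, 3, 2, 4).reshape(B, C, H, W)


    if __name__ == "__main__":
        block = SwinConvBlock()
        out = block(torch.randn(1, 96, 32, 32))
        print(out.shape)  # torch.Size([1, 96, 32, 32])

Dropping the positional bias removes its learned parameters, while the depthwise convolution keeps the parameter overhead small since it acts per channel, which is consistent with the complexity reduction the abstract reports.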

