COmpression et REprésentation des Signaux Audiovisuels

sciencesconf.org:coresa2023:469346

2D versus 3D Convolutional Spiking Neural Networks Trained with Unsupervised STDP for Human Action Recognition

Mireille El Assal 1, Ioan Marius Bilasco 1, Pierre Tirilly 1

1 : Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189

Université de Lille, Centrale Lille, Centre National de la Recherche Scientifique (UMR 9189)

Spiking neural networks (SNNs) are third-generation, biologically plausible models that process information in the form of spikes. Unsupervised learning with SNNs using the spike-timing-dependent plasticity (STDP) rule has the potential to overcome some bottlenecks of conventional artificial neural networks, but STDP-based SNNs are still immature. In this work, we study the performance of SNNs on the task of human action recognition. We introduce a multi-layered 3D convolutional SNN model trained with unsupervised STDP. We show that STDP-based convolutional SNNs can learn motion patterns using 3D kernels, thus enabling motion-based recognition from videos. We also compare the performance of this model with that of a 2D STDP-based SNN on the KTH and Weizmann datasets. Finally, we give evidence that 3D convolution is superior to 2D convolution in STDP-based SNNs.
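
To make the 2D versus 3D distinction concrete, here is a minimal NumPy sketch. It is not the authors' model: there are no spiking neurons and no STDP learning, and the kernels are hand-built for illustration. The helper names (`make_clip`, `response_3d`, `response_2d`) and the toy clip of a single moving pixel are assumptions introduced here. The sketch only shows why a kernel with temporal extent can tell motion direction apart, while a purely spatial kernel applied frame by frame cannot.

```python
import numpy as np

def make_clip(direction, frames=3, size=5):
    """Toy video (hypothetical): one bright pixel moving one step per frame along row 2."""
    clip = np.zeros((frames, size, size))
    for t in range(frames):
        x = 1 + t if direction == "right" else 3 - t
        clip[t, 2, x] = 1.0
    return clip

def response_3d(clip, kernel):
    """Maximum of a 'valid' 3D cross-correlation over all window positions."""
    T, H, W = clip.shape
    kt, kh, kw = kernel.shape
    best = -np.inf
    for t in range(T - kt + 1):
        for y in range(H - kh + 1):
            for x in range(W - kw + 1):
                window = clip[t:t + kt, y:y + kh, x:x + kw]
                best = max(best, float(np.sum(window * kernel)))
    return best

def response_2d(clip, kernel):
    """Apply a 2D kernel to each frame independently and sum the per-frame maxima."""
    return sum(response_3d(frame[None], kernel[None]) for frame in clip)

# Hand-built 3D kernel (an assumption, not a learned filter):
# a template for a pixel drifting rightwards over three frames.
k3 = np.zeros((3, 3, 3))
for t in range(3):
    k3[t, 1, t] = 1.0

# Hand-built 2D kernel: responds to a bright pixel at the window center.
k2 = np.zeros((3, 3))
k2[1, 1] = 1.0

right, left = make_clip("right"), make_clip("left")

r3_right, r3_left = response_3d(right, k3), response_3d(left, k3)
r2_right, r2_left = response_2d(right, k2), response_2d(left, k2)

print("3D kernel:", r3_right, r3_left)  # responses differ with motion direction
print("2D kernel:", r2_right, r2_left)  # responses are identical
```

In this toy setting the 3D kernel responds more strongly to the rightward clip than to the leftward one, whereas the per-frame 2D responses are the same for both, since each frame on its own contains no direction information.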

Type: oral
Topics: Image, video, and geometry
