
Plenary sessions

Plenary 1: The next generation of JPEG standards for human and machine consumption

Artificial Intelligence (AI) has emerged as a key technology in a wide range of applications, including those in visual information processing.
Computer vision based on AI has shown outstanding results in pattern recognition and content understanding, and AI-based image processing has demonstrated unparalleled performance in cognitive enhancement of visual content, as witnessed by advances in super-resolution and denoising operations that take advantage of deep learning techniques. More recent efforts in AI-based compression have shown performance superior to the most advanced state of the art in image coding.
Based on the above observations, the JPEG standardization committee has started new projects on visual information coding for next-generation applications, where visual content should be efficiently represented for both human and machine consumption. The first project in this direction, which is near completion, is referred to as JPEG AI. The scope of JPEG AI is defined as a learning-based image coding standard that offers a single-stream, compact, compressed-domain representation, targeting both human visualization, with significant compression efficiency improvement over current image coding standards, and effective performance for image processing and computer vision tasks. Other similar projects have been launched or are under investigation, such as JPEG Pleno learning-based point cloud compression and JPEG XE, aimed at the standardization of event-based imaging.
In this talk, we will provide an overview of JPEG AI, its current status and its performance compared to the state of the art. We will then discuss the roadmap and status of other JPEG standardization efforts motivated by recent progress in AI-centric imaging.
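As a rough illustration of the single-stream idea described above, the toy PyTorch sketch below encodes an image into one compact latent that is both decoded for human viewing and consumed directly by a recognition head, without going back to pixels. It is a minimal sketch under assumed choices (layer sizes, module names, and straight-through rounding as a crude stand-in for quantization and entropy coding), not the JPEG AI reference model.

import torch
import torch.nn as nn

class ToySingleStreamCodec(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Analysis transform: image -> compact latent (16x downsampled here).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 5, stride=4, padding=2), nn.GELU(),
            nn.Conv2d(64, 192, 5, stride=4, padding=2),
        )
        # Synthesis transform: latent -> reconstructed image (human consumption).
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(192, 64, 5, stride=4, padding=2, output_padding=3), nn.GELU(),
            nn.ConvTranspose2d(64, 3, 5, stride=4, padding=2, output_padding=3),
        )
        # Task head operating on the latent itself (machine consumption).
        self.task_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(192, num_classes),
        )

    def forward(self, x):
        y = self.encoder(x)  # the single compressed-domain representation
        # Straight-through rounding: a crude stand-in for quantization and entropy coding.
        y_hat = y + (torch.round(y) - y).detach()
        return self.decoder(y_hat), self.task_head(y_hat)

model = ToySingleStreamCodec()
reconstruction, logits = model(torch.randn(1, 3, 256, 256))  # one latent, two consumers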
 

Touradj Ebrahimi

Touradj Ebrahimi is professor of image processing at Ecole Polytechnique Fédérale de Lausanne (EPFL). He is active in teaching and research in multimedia signal processing and heads the Multimedia Signal Processing Group at EPFL. Since 2014, he has been the Convenor (Chairman) of the JPEG standardization committee, which has produced a family of standards that have revolutionised the world of imaging. He represents Switzerland as head of its delegation to JTC1 (in charge of standardization of information technology in ISO and IEC) and SC29 (the body overseeing MPEG and JPEG standardization), and is a member of ITU representing EPFL. He was previously chairman of the SC29 Advisory Group on Management, until 2014. Prof. Ebrahimi has been involved in Ecma International as a member of its ExeCom since 2020; he serves as a consultant, evaluator and expert for the European Commission and other governmental funding agencies in Europe, and advises a number of venture capital companies in Switzerland in their scientific and technical audits. He has founded several startup and spin-off companies over the past two decades, the most recent being RayShaper SA, a research company based in Crans-Montana involved in AI-powered multimedia. Prof. Ebrahimi is author or co-author of over 300 scientific publications and holds a dozen invention patents. His areas of interest include image and video compression, media security, quality of experience in multimedia, and AI-based image and video processing and analysis. Prof. Ebrahimi is a Fellow of the IEEE, SPIE and EURASIP and has received several awards and distinctions, including a Primetime Emmy Award for JPEG, the IEEE Star Innovator Award in Multimedia and the SMPTE Progress Medal.

 

Plenary 2: Integrating contextual knowledge of urban environments into the training of deep models to improve their performance on visual tasks

Integrating contextual knowledge into the development of computer vision systems is attracting more and more researchers, particularly for approaches based on learned models. In this talk, we look at how to integrate contextual knowledge (geometric, semantic, …) into the training process of deep models in order to improve their performance on visual tasks such as depth estimation from monocular images or panoptic segmentation, in urban driving environments. For the first task (depth estimation), we propose to assist the training of the deep model by introducing monocular cues as inputs to the model. These cues are extracted from a semantic segmentation map using ontological reasoning. For the second visual task (panoptic segmentation), we propose to include knowledge about the spatial relations of the objects perceived in a scene in the model's loss function. Experimental results are analysed and compared on several public datasets of urban driving scenes.
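To make the second idea concrete, the sketch below adds a differentiable penalty to a standard segmentation loss that discourages "sky" from being predicted below "road" within an image column. The chosen relation, its encoding through a cumulative maximum, the 0.1 weight and the class indices are all illustrative assumptions, not the formulation presented in the talk; the first idea (cue-assisted depth estimation) would instead act on the input side, e.g. by concatenating cue maps to the RGB channels before the backbone.

import torch
import torch.nn.functional as F

def spatial_relation_penalty(logits, sky_idx, road_idx):
    # Penalize 'sky' predicted below 'road' within the same image column.
    probs = logits.softmax(dim=1)                    # (B, C, H, W)
    p_sky, p_road = probs[:, sky_idx], probs[:, road_idx]
    # Strongest 'road' evidence seen at or above each row, per column.
    road_above = torch.cummax(p_road, dim=1).values  # cumulative max along the height axis
    return (p_sky * road_above).mean()

# Toy usage: combine with the usual cross-entropy segmentation loss.
logits = torch.randn(2, 19, 64, 128, requires_grad=True)  # e.g. the 19 Cityscapes classes
target = torch.randint(0, 19, (2, 64, 128))
loss = F.cross_entropy(logits, target) + 0.1 * spatial_relation_penalty(logits, sky_idx=10, road_idx=0)
loss.backward()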

 

Yassine Ruichek

Yassine Ruichek is a professor at the Université de Technologie de Belfort-Montbéliard (UTBM), affiliated with the CIAD laboratory (Connaissances et Intelligence Artificielle Distribuées). He received his PhD in automatic control and computer engineering and his Habilitation à Diriger des Recherches in physical science from the Université de Lille in 1997 and 2005, respectively. His research interests concern multi-sensor perception and localization, including computer vision, pattern recognition and classification, machine learning and data fusion, with intelligent transportation and video surveillance as application fields.
