Doctoral thesis (Dissertations and theses)
Towards Reliable Explanations of Deep Neural Networks - Explainable Artificial Intelligence for Vision and Vision-Language Models
Stassin, Sédrick
2025
 

Files


Full Text
Thèse___Manuscrit (1).pdf
Author postprint (197.15 MB) Creative Commons License - Attribution, ShareAlike

Details



Keywords :
Artificial Intelligence; AI; XAI; Explainability; Deep Learning; Evaluation; Convolutional Neural Networks; CNN; Vision Transformers; ViT; Bias
Abstract :
[en] Today, we are fortunate to live in an unprecedented era. Day by day, humanity develops a technology that surpasses itself and advances at an extraordinary pace: artificial intelligence (AI), challenging our ability to regulate and comprehend it. This rapid progress has been largely fueled by deep neural networks (DNNs), whose increasingly complex architectures have significantly improved performance and precision. Additionally, the rise of transformers and their various adaptations, such as Vision Transformers (ViTs), multimodal models like Vision-Language Transformers (VL), and large language models (LLMs), represents a new frontier of complexity. However, these advances have come at the cost of reduced interpretability, raising critical concerns about transparency and trust. Amidst this rapid evolution, ensuring the transparency and reliability of AI systems has become a pressing priority, particularly in high-stakes applications. This thesis addresses this challenge through the lens of explainable AI (XAI), a field dedicated to elucidating the decision-making processes of AI systems to ensure that they are interpretable and trustworthy. The central aim of this work is to develop more reliable explanations for neural networks, and to guide their selection, with a particular focus on XAI for vision tasks. Through this approach, this research seeks to lay the foundation for building more robust and trustworthy AI systems in the future.

This thesis is structured around three major contributions:
1. Bias Detection and Mitigation: The thesis investigates how biases in convolutional neural network (CNN) models can be identified using XAI methods, and then explores strategies to mitigate them. A case study on two datasets (X-ray lung images; biased colored digits) illustrates how dataset bias impacts model understanding and offers pathways for improvement.
2. Evaluation of Explainability Methods: A comprehensive framework is proposed for selecting and evaluating XAI methods using explainability metrics. This framework is applied to convolutional and transformer-based neural networks, providing insights into their interpretability and correlations, and exposing limitations in current evaluation practices.
3. Advancing Explainability for Novel Architectures: A novel explainability method is introduced for Vision Transformers, more reliable than its perturbation-based counterparts, along with an adaptable extension for multimodal models handling diverse data types such as images and text. Our results stand out consistently by securing top rankings across various state-of-the-art metrics.

Together, these contributions advance the field of AI explainability, offering practical tools and insights for developing transparent and reliable AI systems. By addressing critical challenges in bias mitigation, method evaluation, and architectural adaptability, this work aims to ensure that AI technologies can be safely and confidently integrated into society, fostering trust in their potential.
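As an illustrative sketch only, not taken from the thesis, the Python snippet below shows the kind of perturbation-based explanation and deletion-style faithfulness evaluation referred to in contributions 2 and 3: an occlusion saliency map is computed for an image classifier and then scored by progressively removing the most salient patches. The model, patch size, baseline value, and deletion schedule are assumptions chosen for the example.

import torch
import torch.nn.functional as F

def occlusion_saliency(model, image, target, patch=16, baseline=0.0):
    # Score each patch by the drop in target-class probability when it is occluded.
    # image: (C, H, W) tensor; H and W are assumed to be multiples of `patch`.
    model.eval()
    with torch.no_grad():
        base_prob = F.softmax(model(image.unsqueeze(0)), dim=1)[0, target]
        _, h, w = image.shape
        saliency = torch.zeros(h // patch, w // patch)
        for i in range(0, h, patch):
            for j in range(0, w, patch):
                occluded = image.clone()
                occluded[:, i:i + patch, j:j + patch] = baseline
                prob = F.softmax(model(occluded.unsqueeze(0)), dim=1)[0, target]
                saliency[i // patch, j // patch] = base_prob - prob
    return saliency

def deletion_score(model, image, target, saliency, patch=16, steps=10):
    # Deletion metric: occlude the most salient patches first and track the
    # target-class probability; a lower average indicates a more faithful map.
    model.eval()
    order = torch.argsort(saliency.flatten(), descending=True)
    occluded = image.clone()
    probs = []
    per_step = max(1, len(order) // steps)
    with torch.no_grad():
        for k in range(0, len(order), per_step):
            for idx in order[k:k + per_step]:
                r = (int(idx) // saliency.shape[1]) * patch
                c = (int(idx) % saliency.shape[1]) * patch
                occluded[:, r:r + patch, c:c + patch] = 0.0
            probs.append(F.softmax(model(occluded.unsqueeze(0)), dim=1)[0, target].item())
    return sum(probs) / len(probs)

For example, given a pretrained classifier `model` and a preprocessed image tensor `image`, calling deletion_score(model, image, target, occlusion_saliency(model, image, target)) yields a single faithfulness score that can be compared across explanation methods, which is the spirit of the metric-based evaluation framework described above.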
Disciplines :
Computer science
Author, co-author :
Stassin, Sédrick  ;  Université de Mons - UMONS > Faculté Polytechnique > Service Informatique, Logiciel et Intelligence artificielle
Language :
English
Title :
Towards Reliable Explanations of Deep Neural Networks - Explainable Artificial Intelligence for Vision and Vision-Language Models
Alternative titles :
[fr] Vers des Explications Fiables des Réseaux Neuronaux Profonds - Intelligence Artificielle Explicable pour les Modèles de Vision et de Vision-Language
Defense date :
29 January 2025
Number of pages :
203 + 18
Institution :
UMONS - University of Mons [Polytechnique], Mons, Belgium
Degree :
Doctorat en Sciences de l'Ingénieur et Technologie
Promotor :
Mahmoudi, Sidi  ;  Université de Mons - UMONS > Faculté Polytechnique > Service Informatique, Logiciel et Intelligence artificielle
Siebert, Xavier  ;  Université de Mons - UMONS > Faculté Polytechnique > Service de Mathématique et Recherche opérationnelle
President :
Lecron, Fabian ;  Université de Mons - UMONS > Faculté Polytechnique > Service de Management de l'Innovation Technologique
Secretary :
Mancas, Matei  ;  Université de Mons - UMONS > Faculté Polytechnique > Service Information, Signal et Intelligence artificielle
Jury member :
Bontempi, Gianluca;  ULB - Université Libre de Bruxelles > Computer Science > Machine Learning Group
De Vleeschouwer, Christophe;  UCL - Catholic University of Louvain > Information and Communication Technologies, Electronics and Applied Mathematics (ICTEAM)
Research unit :
F114 - Informatique, Logiciel et Intelligence artificielle
Research institute :
Infortech
Numediart
Available on ORBi UMONS :
since 29 April 2025
