Multimodal Learning for Anomaly Detection in Cyber-Physical Smart Infrastructure: A Systematic Survey
Keywords:
Multimodal learning, anomaly detection, cyber-physical systems, smart infrastructure, sensor fusion, deep learning, graph neural networks, transformer models, industrial IoT, intrusion detection, time-series analysis, federated learning
Abstract
Cyber-physical smart infrastructures (CPSIs) comprise a growing range of networked systems, including smart electrical grids, intelligent transportation systems, industrial automation systems, and smart health systems. These systems generate continuous, high-velocity, multidimensional data from a wide variety of sensing, communication, and control components. Identifying deviations in such environments, whether caused by equipment malfunctions, operational irregularities, or intentional cyber-attacks, is critical to safety and uninterrupted service delivery. Conventional single-modality methods process only one type of data at a time, which severely limits their ability to detect the cross-modal signatures that characterize anomalies in real-world CPSIs. Multimodal learning addresses this weakness by jointly modeling complementary information across heterogeneous data sources, such as sensor readings, network traffic logs, surveillance video streams, text messages, and contextual metadata. This survey presents a structured and extensive analysis of multimodal learning techniques for detecting anomalies in CPSIs. We establish a taxonomy of fusion strategies and architectural paradigms, examine in depth 75 peer-reviewed publications released between 2018 and 2024, and analyze how they are applied in five key CPSI domains. We further discuss benchmark datasets, evaluation practices, and a set of clearly identified open challenges covering data heterogeneity, label scarcity, real-time constraints, adversarial threats, and explainability requirements. The survey concludes with concrete research directions that reflect the practical demands of deploying multimodal anomaly detection in real infrastructure environments.
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
You are free to share and adapt the material, but only for non-commercial purposes. You must give appropriate credit to the author(s).

