Advancing Transparency and Responsibility in Machine Learning: The Critical Role of FAIR Principles - A Comprehensive Review
Swapna Krishnakumar Radha, Andrey Kuehlkamp, Jarek Nabrzyski
- Year: 2025
- Citations: 4
Abstract
Recent advances in machine learning (ML) across various fields have been remarkable. The integration of ML in healthcare has marked a new era, enabling significant improvements such as enhanced diagnostic accuracy, improved treatment plans, and better patient outcomes. In the financial industry, ML algorithms have proved effective at detecting fraud by analyzing transaction patterns. In the retail industry, ML improves customer experience through personalized recommendations. Additionally, ML is critical for improving natural language processing, robotics, and other automation processes. However, the rapid evolution of ML makes it necessary to ensure responsible data use, highlighting the importance of user data privacy, fairness, and transparency. It is crucial to balance the breakthroughs made by ML with ethical considerations. Implementing the FAIR principles (Findable, Accessible, Interoperable, Reusable) in ML is vital under these circumstances. These principles guide the ethical and responsible use of not only data but also ML models, thereby enhancing trust and aligning with societal values. This article presents a systematic literature review of the current state of the application of FAIR principles in ML. We aim to understand the existing systems used in the ML domain to manage the data and model lifecycle while adhering to FAIR principles. We also present an overview of various proposals and ongoing initiatives that follow the FAIR movement to enable long-term preservation of ML models and data, which in turn enables future advances by leveraging existing models and datasets.
Additionally, we discuss our ongoing and future work, which explores how technologies such as blockchain and self-sovereign identity can be used to verify data veracity, transparently create and manage data lineage, and provide access control mechanisms that ensure responsible data usage while respecting the privacy of users and the confidentiality of their data used in the design, development, training, and testing of ML models.