Reduction of imbalanced data to improve the accuracy of deep learning algorithms for federated learning techniques

Momina Shaheen

Files

D-SOFT~1.PDF (4.73 MB)

Date

2025

Authors

Momina Shaheen

Publisher

UMT, Lahore

Abstract

Federated learning (FL) is a leading machine learning paradigm that enables collaborative model training across decentralized nodes while preserving data privacy and security. In edge computing environments, imbalanced training data is a critical challenge because client data are not independent and identically distributed (non-IID) and vary in size. This research examines how global data imbalance affects FL model accuracy and shows why its negative effects are difficult to mitigate. Empirical analysis and theoretical investigation yield new insights into the mechanisms that degrade FL accuracy and motivate a novel method tailored to FL networks. The proposed framework employs two strategies: data augmentation and synthesis based on the global distribution to rebalance training data, and client rescheduling by mediators to achieve partial equilibrium among edge devices. The study's main contributions are its analysis of how imbalanced training data harms FL model accuracy and its development of effective strategies to mitigate this effect. By integrating AI techniques such as data augmentation and class estimation into the FL framework, the approach improves accuracy with minimal computational overhead and makes FL systems in edge computing more robust. Experimental validation on two datasets, Fashion-MNIST and a stock dataset, shows that the method achieves nearly 92% accuracy on both, highlighting its effectiveness for FL in edge computing environments. Across these two distinct tasks, image classification and financial predictive analytics, the method significantly improves FL model accuracy, underscoring its potential to revolutionize FL methodologies and foster resilient machine learning (ML) systems in edge computing.
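
The abstract does not spell out how mediators reschedule clients, so the following is only a minimal sketch of one plausible reading: clients are greedily grouped under mediators so that each mediator's pooled label distribution is as close to uniform as possible, measured by KL divergence. The function names, the greedy rule, and the `max_clients_per_mediator` parameter are illustrative assumptions, not the method described in the thesis.

```python
import math
from collections import Counter


def kl_to_uniform(counts, num_classes):
    """KL divergence from the label distribution in `counts` to uniform."""
    total = sum(counts.values())
    if total == 0:
        return float("inf")
    kl = 0.0
    for label in range(num_classes):
        p = counts.get(label, 0) / total
        if p > 0:
            # p * log(p / (1 / num_classes))
            kl += p * math.log(p * num_classes)
    return kl


def assign_to_mediators(client_counts, num_classes, max_clients_per_mediator):
    """Greedy grouping (an assumed heuristic): each mediator repeatedly adds
    the unassigned client that brings its pooled labels closest to uniform."""
    unassigned = dict(client_counts)  # client id -> Counter of label counts
    mediators = []
    while unassigned:
        pooled = Counter()
        members = []
        while unassigned and len(members) < max_clients_per_mediator:
            best = min(
                unassigned,
                key=lambda cid: kl_to_uniform(pooled + unassigned[cid], num_classes),
            )
            pooled += unassigned.pop(best)
            members.append(best)
        mediators.append((members, kl_to_uniform(pooled, num_classes)))
    return mediators
```

Given hypothetical clients whose two-class label skews mirror each other (for example 90/10 against 10/90), this heuristic tends to pair complementary clients under the same mediator, so that each mediator trains on a roughly balanced pool even though no individual client is balanced.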

URI

https://escholar.umt.edu.pk/handle/123456789/18036

Collections

2025
