An E-Commerce Based Loan Prediction Through User Profiling
Date
2022
Publisher
UMT, Lahore
Abstract
Online shopping has become popular and convenient in recent years, and online business is developing rapidly in the retail industry. Because the user must pay before trying the product, without knowing whether it meets their satisfaction criteria, customer churn can result. Therefore, some eCommerce stores offer a “try before you buy and pay over time” facility to mitigate the churn rate. In such cases, authenticating the credibility of the customer is crucial, particularly when no personal information about the customer is available: no credit history, zip code, salary, or bank details, and not even label information about whether the customer can pay back. Hence, the retail industry is looking for ways to automate the prediction of customer credibility and make it more efficient. Credit scoring has been shown to be an effective technique for eCommerce companies to identify prospective churn customers and default debtors. The purpose of this research is to combine unsupervised and supervised techniques with analytics to obtain the most accurate possible results. In this thesis, we introduce two risk-scoring ensemble prediction models that combine different algorithms to analyze various hypotheses and form a new hypothesis for credit assessment. First, the model predicts the retention score of customers using the TabNet classification model and then uses these probability scores to predict the customers' credit scores through user profiling. Customers with a low predicted probability are likely to be unsatisfied and to have a low level of credibility. To predict the credibility of the user, we use the unsupervised GraphSAGE DBSCAN embedding model, map these embeddings into a graph network, and find demographic-based similarities between users to segment them.
Six popular evaluation metrics (accuracy, area under the ROC curve (ROC-AUC), F1 score, precision, recall, and the KS statistic) are employed to evaluate the performance of the churn prediction model, which achieves 96% accuracy on the test set. The Silhouette Coefficient, Calinski-Harabasz Index, and Davies-Bouldin Index are used to evaluate the proposed unsupervised clustering approach, and the results can be reviewed by human analysts. This research examines consumer purchasing, churning, and credibility patterns using graph-based embedding techniques. The study then analyses the trends behind the factors that contributed to the decline in consumer validation in the retail industry by comparing them against the different eCommerce datasets in place.
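
Of the classification metrics listed above, the KS statistic and ROC-AUC are the two computed directly from predicted probabilities rather than from hard labels. The following is a minimal sketch of how they might be calculated, not the thesis's own implementation. The function names `ks_statistic` and `roc_auc` and the toy labels and scores are assumptions made for illustration only.

```python
from bisect import bisect_right


def ks_statistic(y_true, y_score):
    """Largest gap between the empirical score CDFs of the two classes.

    Hypothetical helper: y_true holds 0/1 labels, y_score holds the
    predicted probabilities of the positive class.
    """
    pos = sorted(s for y, s in zip(y_true, y_score) if y == 1)
    neg = sorted(s for y, s in zip(y_true, y_score) if y == 0)
    if not pos or not neg:
        raise ValueError("both classes must be present")
    best = 0.0
    # The gap can only change at observed score values, so check each one.
    for threshold in sorted(set(y_score)):
        cdf_pos = bisect_right(pos, threshold) / len(pos)
        cdf_neg = bisect_right(neg, threshold) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


def roc_auc(y_true, y_score):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is quadratic in the
    sample size, which is fine for a small illustration.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up): 1 = retained customer, 0 = churned customer.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print("KS statistic:", ks_statistic(labels, scores))
print("ROC-AUC:", roc_auc(labels, scores))
```

A KS statistic near 1 means the score distributions of the two classes barely overlap, while a value near 0 means the scores do little to separate them. In credit scoring it is commonly reported alongside ROC-AUC for that reason.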